* DataParallel fixes:
1. switched to a more precise check
- if self.args.n_gpu > 1:
+ if isinstance(model, nn.DataParallel):
2. fix tests - require the same fixup under DataParallel as the training module
* another fix
* Kill model archive maps
* Fixup
* Also kill model_archive_map for MaskedBertPreTrainedModel
* Unhook config_archive_map
* Tokenizers: align with model id changes
* make style && make quality
* Fix CI
* Created using Colaboratory
* [examples] reorganize files
* remove run_tpu_glue.py as superseded by TPU support in Trainer
* Bugfix: int, not tuple
* move files around