* Kill model archive maps
* Fixup
* Also kill model_archive_map for MaskedBertPreTrainedModel
* Unhook config_archive_map
* Tokenizers: align with model id changes
* make style && make quality
* Fix CI
There's an inconsistency right now where:
- we load some models into CACHE_DIR
- and some models into the default cache
- and often the same models into both
When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth.
I'd rather always use the default cache.
* remove output_past from pt
* make style
* add optional input length for gpt2
* add use cache to prepare input
* save memory in gpt2
* correct gpt2 test inputs
* make past input optional for gpt2
* finish use_cache for all models
* make style
* delete modeling_gpt2 change in test file
* correct docstring
* correct is true statements for gpt2
* add some t5 integration tests
* finish summarization and translation integration tests for T5 - results look good
* add tf test
* fix == vs is bug
* fix tf beam search error and make tf t5 tests pass
* make decoder input ids optional for t5 training
* lm_labels should not be shifted in t5
* add tests
* finish shift right functionality for PT T5
* move shift right to correct class
* cleaner code
* replace -100 values with pad token id
* add assert statement
* remove unnecessary for loop
* make style
* fix conflicts
* update bart max length test
* correct spelling mistakes
* implemented model specific encode function
* fix merge conflicts
* better naming
* save intermediate state -> need to rethink structure a bit
* leave tf problem as it is for now
* current version
* add layers.pop
* remove ipdb
* make style
* clean return cut decoding
* remove ipdbs
* Fix restoring layers in the decoders that don't exist.
* push good intermediate solution for now
* fix conflicts
* always good to refuse to merge conflicts when rebasing
* fix small bug
* improve function calls
* remove unused file
* add correct scope behavior for t5_generate
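
The shift-right and "replace -100 values with pad token id" items above can be illustrated with a minimal sketch. It is not the code from these commits: it uses plain Python lists instead of tensors, and the names shift_right, decoder_start_token_id and pad_token_id are assumptions chosen for illustration.

```python
from typing import List

IGNORE_INDEX = -100  # label value conventionally ignored by the loss


def shift_right(
    labels: List[List[int]],
    decoder_start_token_id: int,
    pad_token_id: int,
) -> List[List[int]]:
    """Build decoder inputs from labels (hypothetical helper).

    Each row is shifted one position to the right, the start token is
    placed first, and any remaining -100 entries are replaced by the
    pad token so that the result contains only valid token ids.
    """
    shifted = []
    for row in labels:
        # Drop the last label and prepend the decoder start token.
        new_row = [decoder_start_token_id] + list(row[:-1])
        # -100 is only meaningful for the loss, never as an input id.
        new_row = [pad_token_id if t == IGNORE_INDEX else t for t in new_row]
        # Mirrors the "add assert statement" item: inputs must be valid ids.
        assert all(t >= 0 for t in new_row), "decoder inputs must be non-negative"
        shifted.append(new_row)
    return shifted
```

With this sketch the labels themselves stay unshifted and are passed to the loss as-is; only the decoder inputs derived from them are shifted.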
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
I suspect the wrapper classes were created in order to prevent the
abstract base class (TF)CommonModelTester from being included in test
discovery and running, because that would fail.
I solved this by replacing the abstract base class with a mixin.
Code changes are just de-indenting and automatic reformattings
performed by black to use the extra line space.
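
A minimal sketch of the mixin approach described above, assuming hypothetical names (CommonModelTesterMixin, FooModelTest, run_suite) that are not taken from the repository. Because the mixin does not inherit from unittest.TestCase, test discovery does not collect it on its own; its tests only run through concrete subclasses.

```python
import unittest


class CommonModelTesterMixin:
    """Shared tests. Not a TestCase, so discovery ignores it by itself."""

    model_name = None  # concrete test classes are expected to override this

    def test_model_name_is_set(self):
        # Would fail if this ran on the base class, where model_name is None.
        self.assertIsNotNone(self.model_name)


class FooModelTest(CommonModelTesterMixin, unittest.TestCase):
    """Concrete test class: gets the shared tests via the mixin."""

    model_name = "foo"


def run_suite() -> unittest.TestResult:
    # Load and run only the concrete class, as discovery would.
    suite = unittest.TestLoader().loadTestsFromTestCase(FooModelTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Listing the mixin before unittest.TestCase in the bases keeps the shared methods first in the method resolution order, and no wrapper class or extra indentation is needed to hide the base from the test runner.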
This construct isn't used anymore these days.
Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.
Use python -m unittest tests/test_foo.py instead.