Commit Graph

21 Commits

Author SHA1 Message Date
Amil Khare
c852036b4a
[cleanup] Hoist ModelTester objects to top level (#4939)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-06-16 08:03:43 -04:00
Sylvain Gugger
f1fe18465d
Use labels to remove deprecation warnings (#4807) 2020-06-05 16:41:46 -04:00
Julien Chaumond
d4c2cb402d
Kill model archive maps (#4636)
* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI
2020-06-02 09:39:33 -04:00
Patrick von Platen
aa925a52fa
[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch (#4468)
* fix gpu slow tests in pytorch

* change model to device syntax
2020-05-19 21:35:04 +02:00
Patrick von Platen
026a5d0888
[T5 fp16] Fix fp16 in T5 (#4436)
* fix fp16 in t5

* make style

* refactor invert_attention_mask fn

* fix typo
2020-05-18 17:25:58 +02:00
Julien Chaumond
f54dc3f4d5 [ci] Load pretrained models into the default (long-lived) cache
There's an inconsistency right now where:
- we load some models into CACHE_DIR
- and some models in the default cache
- and often, in both for the same models

When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth.

I'd rather always use the default cache
2020-04-30 22:30:15 -04:00
Patrick von Platen
d13eca11e2
Higher tolerance for past testing in T5 (#3843) 2020-04-17 11:25:14 -04:00
Patrick von Platen
38f7461df3
[TFT5, Cache] Add cache to TFT5 (#3772)
* correct gpt2 test inputs

* make style

* delete modeling_gpt2 change in test file

* translate from pytorch

* correct tests

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* make tensorflow t5 caching work

* make style

* clean reorder cache

* remove unnecessary spaces

* fix test
2020-04-16 16:14:52 +02:00
Patrick von Platen
01c37dcdb5
[Config, Caching] Remove output_past everywhere and replace by use_cache argument (#3734)
* remove output_past from pt

* make style

* add optional input length for gpt2

* add use cache to prepare input

* save memory in gpt2

* correct gpt2 test inputs

* make past input optional for gpt2

* finish use_cache for all models

* make style

* delete modeling_gpt2 change in test file

* correct docstring

* correct is true statements for gpt2
2020-04-14 14:40:28 -04:00
Patrick von Platen
ce2298fb5f
[T5, generation] Add decoder caching for T5 (#3682)
* initial commit to add decoder caching for T5

* better naming for caching

* finish T5 decoder caching

* correct test

* added extensive past testing for T5

* clean files

* make tests cleaner

* improve docstring

* improve docstring

* better reorder cache

* make style

* Update src/transformers/modeling_t5.py

Co-Authored-By: Yacine Jernite <yjernite@users.noreply.github.com>

* make set output past work for all layers

* improve docstring

* improve docstring

Co-authored-by: Yacine Jernite <yjernite@users.noreply.github.com>
2020-04-10 01:02:50 +02:00
Patrick von Platen
b815edf69f
[T5, Testst] Add extensive hard-coded integration tests and make sure PT and TF give equal results (#3550)
* add some t5 integration tests

* finish summarization and translation integration tests for T5 - results loook good

* add tf test

* fix == vs is bug

* fix tf beam search error and make tf t5 tests pass
2020-04-01 18:01:33 +02:00
Patrick von Platen
75ec6c9e3a
[T5] make decoder input ids optional for t5 training (#3521)
* make decoder input ids optional for t5 training

* lm_lables should not be shifted in t5

* add tests

* finish shift right functionality for PT T5

* move shift right to correct class

* cleaner code

* replace -100 values with pad token id

* add assert statement

* remove unnecessary for loop

* make style
2020-03-30 13:45:26 +02:00
Patrick von Platen
bbf26c4e61
Support T5 Generation (#3228)
* fix conflicts

* update bart max length test

* correct spelling mistakes

* implemented model specific encode function

* fix merge conflicts

* better naming

* save intermediate state -> need to rethink strucuture a bit

* leave tf problem as it is for now

* current version

* add layers.pop

* remove ipdb

* make style

* clean return cut decoding

* remove ipdbs

* Fix restoring layers in the decoders that doesnt exists.

* push good intermediate solution for now

* fix conflicts

* always good to refuse to merge conflicts when rebasing

* fix small bug

* improve function calls

* remove unused file

* add correct scope behavior for t5_generate

Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
2020-03-19 23:18:23 +01:00
Julien Chaumond
9cda3620b6
Fix (non-slow) tests on GPU (torch) (#3024)
* Fix tests on GPU (torch)

* Fix bart slow tests

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-02-26 11:59:25 -05:00
Sam Shleifer
53ce3854a1
New BartModel (#2745)
* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs
2020-02-20 18:11:13 -05:00
alberduris
81d6841b4b GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
Aymeric Augustin
c824d15aa1 Remove __future__ imports. 2019-12-22 17:47:54 +01:00
Aymeric Augustin
345c23a60f Replace (TF)CommonTestCases for modeling with a mixin.
I suspect the wrapper classes were created in order to prevent the
abstract base class (TF)CommonModelTester from being included in test
discovery and running, because that would fail.

I solved this by replacing the abstract base class with a mixin.

Code changes are just de-indenting and automatic reformattings
performed by black to use the extra line space.
2019-12-22 15:35:18 +01:00
Aymeric Augustin
7e98e211f0 Remove unittest.main() in test modules.
This construct isn't used anymore these days.

Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.

Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin
ced0a94204 Switch test files to the standard test_*.py scheme. 2019-12-22 14:15:13 +01:00