Commit Graph

66 Commits

Author SHA1 Message Date
Sam Shleifer
28a690a80e
[mBART] skip broken forward pass test, stronger integration test (#5327) 2020-06-28 15:08:28 -04:00
Sam Shleifer
393b8dc09a
examples/seq2seq/run_eval.py fixes and docs (#5322) 2020-06-26 19:20:43 -04:00
Thomas Wolf
601d4d699c
[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308)
* remove references to old API in docstring - update data processors

* style

* fix tests - better type checking error messages

* better type checking

* include awesome fix by @LysandreJik for #5310

* updated doc and examples
2020-06-26 19:48:14 +02:00
Patrick von Platen
c2a26ec8a6
[Use cache] Align logic of use_cache with output_attentions and output_hidden_states (#5194)
* fix use cache

* add bart use cache

* fix bart

* finish bart
2020-06-24 16:09:17 +02:00
Sam Shleifer
84be482f66
AutoTokenizer supports mbart-large-en-ro (#5121) 2020-06-18 20:47:37 -04:00
Patrick von Platen
ebba39e4e1
[Bart] Question Answering Model is added to tests (#5024)
* fix test

* Update tests/test_modeling_common.py

* Update tests/test_modeling_common.py
2020-06-15 22:50:09 +02:00
Sam Shleifer
a9f1fc6c94
Add bart-base (#5014) 2020-06-15 13:29:26 -04:00
Suraj Patil
e93ccb3290
BartForQuestionAnswering (#4908) 2020-06-12 15:47:57 -04:00
Sam Shleifer
5620033115
[mbart] Fix fp16 testing logic (#4949) 2020-06-11 22:11:34 -04:00
Sam Shleifer
08b59d10e5
MBartTokenizer: add language codes (#3776) 2020-06-11 13:02:33 -04:00
Sylvain Gugger
f1fe18465d
Use labels to remove deprecation warnings (#4807) 2020-06-05 16:41:46 -04:00
Julien Chaumond
b42586ea56
Fix CI after killing archive maps (#4724)
* 🐛 Fix model ids for BART and Flaubert
2020-06-02 10:21:09 -04:00
Julien Chaumond
d4c2cb402d
Kill model archive maps (#4636)
* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI
2020-06-02 09:39:33 -04:00
Sam Shleifer
b86e42e0ac
[ci] fix 3 remaining slow GPU failures (#4584) 2020-05-25 19:20:50 -04:00
Sam Shleifer
956c4c4eb4
[gpu slow tests] fix mbart-large-enro gpu tests (#4472) 2020-05-19 19:45:31 -04:00
Julien Chaumond
4bf5042240
Fix BART tests on GPU (#4298) 2020-05-12 09:11:50 -04:00
Sam Shleifer
18db92dd9a
[testing] add timeout_decorator (#3543) 2020-05-01 09:05:47 -04:00
Julien Chaumond
f54dc3f4d5 [ci] Load pretrained models into the default (long-lived) cache
There's an inconsistency right now where:
- we load some models into CACHE_DIR
- and some models in the default cache
- and often, in both for the same models

When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth.

I'd rather always use the default cache
2020-04-30 22:30:15 -04:00
Sam Shleifer
847e7f3379
MarianMTModel.from_pretrained('Helsinki-NLP/opus-marian-en-de') (#3908)
Co-Authored-By: Stefan Schweter <stefan@schweter.it>
2020-04-28 18:22:37 -04:00
Sam Shleifer
7a7fdf71f8
Multilingual BART - (#3602)
- support mbart-en-ro weights
- add MBartTokenizer
2020-04-10 11:25:39 -04:00
Sam Shleifer
715aa5b135
[Bart] Replace config.output_past with use_cache kwarg (#3632) 2020-04-07 19:08:26 -04:00
Sam Shleifer
8deff3acf2
[bart-tiny-random] Put a 5MB model on S3 to allow faster exampl… (#3488) 2020-03-30 12:28:27 -04:00
Patrick von Platen
75ec6c9e3a
[T5] make decoder input ids optional for t5 training (#3521)
* make decoder input ids optional for t5 training

* lm_labels should not be shifted in t5

* add tests

* finish shift right functionality for PT T5

* move shift right to correct class

* cleaner code

* replace -100 values with pad token id

* add assert statement

* remove unnecessary for loop

* make style
2020-03-30 13:45:26 +02:00
Sam Shleifer
f6a23d1911
[BART] add bart-large-xsum weights (#3422) 2020-03-29 10:51:13 -04:00
Sam Shleifer
3ee431dd4c
[Bart/Memory] Two separate, smaller decoder attention masks (#3371) 2020-03-26 21:34:15 -04:00
Sam Shleifer
39371ee454
[Bart/Memory] don't create lm_head (#3323)
* delete lm_head, skips weight tying
* Fixed s3
2020-03-26 18:40:39 -04:00
Patrick von Platen
95e00d0808
Clean special token init in modeling_....py (#3264)
* make style

* fix conflicts
2020-03-20 21:41:04 +01:00
Patrick von Platen
bbf26c4e61
Support T5 Generation (#3228)
* fix conflicts

* update bart max length test

* correct spelling mistakes

* implemented model specific encode function

* fix merge conflicts

* better naming

* save intermediate state -> need to rethink structure a bit

* leave tf problem as it is for now

* current version

* add layers.pop

* remove ipdb

* make style

* clean return cut decoding

* remove ipdbs

* Fix restoring layers in the decoders that don't exist.

* push good intermediate solution for now

* fix conflicts

* always good to refuse to merge conflicts when rebasing

* fix small bug

* improve function calls

* remove unused file

* add correct scope behavior for t5_generate

Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
2020-03-19 23:18:23 +01:00
Sam Shleifer
ad7233fc01
[BART] cleanup: remove redundant kwargs, improve docstrings (#3319) 2020-03-19 11:16:51 -04:00
Patrick von Platen
e8f44af5bf
[generate] do_sample default back to False (#3298)
* change do_samples back

* None better default as boolean

* adapt do_sample to True in test example

* make style
2020-03-17 10:52:37 -04:00
Sam Shleifer
b2c1a447fe
[BART] Delete redundant unit test (#3302) 2020-03-16 23:09:10 -04:00
Sam Shleifer
5ea8ba67b4
[BART] Remove unused kwargs (#3279)
* Remove unused kwargs
* dont call forward in tests
2020-03-15 23:00:44 -04:00
Thomas Wolf
3814e167d9
Merge pull request #3225 from patrickvonplaten/finalize_merge_bart_generate_into_default_generate
Complete merge Seq-2-Seq generation into default generation
2020-03-14 15:08:59 +01:00
Sam Shleifer
2bd79e23de
[BART] FP16 testing fixes (#3266) 2020-03-13 19:48:26 -04:00
Patrick von Platen
6a82f774f2 fix typo 2020-03-12 21:10:51 +01:00
Patrick von Platen
f1c71da115 fix eos_token_ids in test 2020-03-12 21:00:54 +01:00
Patrick von Platen
6047f46b19 re-add eos token to get good bart results 2020-03-12 20:17:50 +01:00
Patrick von Platen
ac303eae46 fix problem with half 2020-03-11 12:24:30 +01:00
Patrick von Platen
bc9d5d917c make all tensors half precision 2020-03-11 12:15:38 +01:00
Patrick von Platen
a332cc9f7f finalize generation merge 2020-03-11 11:53:36 +01:00
Patrick von Platen
7351a8dbaf re-add scoring filtering 2020-03-11 11:06:56 +01:00
Patrick von Platen
374deef48d fixed typo 2020-03-11 11:06:56 +01:00
patrickvonplaten
41b437ea3a add draft version of proposed changes for ROUGE score 2020-03-11 11:06:56 +01:00
patrickvonplaten
a5751f7578 fix bug with attention_mask as optional input argument 2020-03-11 11:06:56 +01:00
patrickvonplaten
d880a5fbde finalized PR 2020-03-11 11:06:56 +01:00
patrickvonplaten
2acfe63964 best current version and make style 2020-03-11 11:06:56 +01:00
patrickvonplaten
c62444da39 fix conflicts 2020-03-11 11:06:56 +01:00
Patrick von Platen
77e6775065 add current changes 2020-03-11 11:06:56 +01:00
Patrick von Platen
421216997b comment out stuff 2020-03-11 11:06:56 +01:00
Patrick von Platen
7a11e925cf work in progress 2020-03-11 11:06:56 +01:00