Patrick von Platen
ab5d06a094
[T5, examples] replace heavy t5 models with tiny random models ( #3556 )
...
* replace heavy t5 models with tiny random models as was done by sshleifer
* fix isort
2020-04-02 12:34:05 +02:00
Patrick von Platen
a4ee4da18a
[T5, TF 2.2] change tf t5 argument naming ( #3547 )
...
* change tf t5 argument naming for TF 2.2
* correct bug in testing
2020-04-01 22:04:20 +02:00
Patrick von Platen
06dd597552
fix bug in warnings T5 pipelines ( #3545 )
2020-04-01 21:59:12 +02:00
Anirudh Srinivasan
9de9ceb6c5
Correct output shape for Bert NSP models in docs ( #3482 )
2020-04-01 15:04:38 -04:00
Patrick von Platen
b815edf69f
[T5, Tests] Add extensive hard-coded integration tests and make sure PT and TF give equal results ( #3550 )
...
* add some t5 integration tests
* finish summarization and translation integration tests for T5 - results look good
* add tf test
* fix == vs is bug
* fix tf beam search error and make tf t5 tests pass
2020-04-01 18:01:33 +02:00
HUSEIN ZOLKEPLI
8538ce9044
Add tiny-bert-bahasa-cased model card ( #3567 )
...
* add bert bahasa readme
* update readme
* update readme
* added xlnet
* added tiny-bert and fix xlnet readme
2020-04-01 07:15:00 -04:00
Manuel Romero
c1a6252be1
Create model card ( #3557 )
...
Create model card for: distilbert-multi-finetuned-for-xqua-on-tydiqa
2020-04-01 07:14:23 -04:00
Julien Chaumond
50e15c825c
Tokenizers: Start cleaning examples a little ( #3455 )
...
* Start cleaning examples
* Fixup
2020-04-01 07:13:40 -04:00
Patrick von Platen
b38d552a92
[Generate] Add bad words list argument to the generate function ( #3367 )
...
* add bad words list
* make style
* add bad_words_tokens
* make style
* better naming
* make style
* fix typo
2020-03-31 18:42:31 +02:00
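A minimal sketch of what a bad-words list can mean during generation, written in plain Python rather than taken from the library. The function name `banned_next_tokens` and the list-of-token-id-lists layout are assumptions for illustration; the idea is that a token is banned at the next step if emitting it would complete one of the banned sequences.

```python
from typing import List, Set


def banned_next_tokens(prev_tokens: List[int], bad_words_ids: List[List[int]]) -> Set[int]:
    """Return the token ids that would complete a banned sequence at the next step.

    Hypothetical helper, not the library's implementation.
    """
    banned: Set[int] = set()
    for bad in bad_words_ids:
        if not bad:
            continue  # ignore empty entries
        prefix, last = bad[:-1], bad[-1]
        # The banned sequence is about to be completed when its prefix
        # matches the tail of what has been generated so far.
        if len(prefix) <= len(prev_tokens) and prev_tokens[len(prev_tokens) - len(prefix):] == prefix:
            banned.add(last)
    return banned
```

In a decoding loop, the scores of the returned ids would typically be set to negative infinity before sampling or beam selection.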
Patrick von Platen
ae6834e028
[Examples] Clean summarization and translation example testing files for T5 and Bart ( #3514 )
...
* fix conflicts
* add model size argument to summarization
* correct wrong import
* fix isort
* correct imports
* other isort make style
* make style
2020-03-31 17:54:13 +02:00
Manuel Romero
0373b60c4c
Update README.md ( #3552 )
...
- Show that the last uploaded version was trained on more data (custom_license files)
2020-03-31 10:40:34 -04:00
Patrick von Platen
83d1fbcff6
[Docs] Add usage examples for translation and summarization ( #3538 )
2020-03-31 09:36:03 -04:00
Patrick von Platen
55bcae7f25
remove useless and confusing lm_labels line ( #3531 )
2020-03-31 09:32:25 -04:00
Patrick von Platen
42e1e3c67f
Update usage doc regarding generate fn ( #3504 )
2020-03-31 09:31:46 -04:00
Patrick von Platen
57b0fab692
Add better explanation to check docs locally. ( #3459 )
2020-03-31 09:30:17 -04:00
Manuel Romero
a8d4dff0a1
Update README.md ( #3470 )
...
Fix typo
2020-03-31 08:01:09 -04:00
Manuel Romero
4a5663568f
Create card for the model: GPT-2-finetuned-covid-bio-medrxiv ( #3453 )
2020-03-31 08:01:03 -04:00
Branden Chan
bbedb59675
Create README.md ( #3393 )
...
* Create README.md
* Update README.md
2020-03-31 08:00:35 -04:00
Manuel Romero
c2cf192943
Add link to 16 POS tags model ( #3465 )
2020-03-31 08:00:00 -04:00
Gabriele Sarti
c82ef72158
Added CovidBERT-NLI model card ( #3477 )
2020-03-31 07:59:49 -04:00
Manuel Romero
b48a1f08c1
Add text shown in example of usage ( #3464 )
2020-03-31 07:59:36 -04:00
Manuel Romero
99833a9cbf
Create model card ( #3487 )
2020-03-31 07:59:22 -04:00
Sho Arora
ebceeeacda
Add electra and alectra model cards ( #3524 )
2020-03-31 07:58:48 -04:00
Leandro von Werra
a6c4ee27fd
Add model cards ( #3537 )
...
* feat: add model card bert-imdb
* feat: add model card gpt2-imdb-pos
* feat: add model card gpt2-imdb
2020-03-31 07:54:45 -04:00
Ethan Perez
e5c393dceb
[Bug fix] Using loaded checkpoint with --do_predict (instead of… ( #3437 )
...
* Using loaded checkpoint with --do_predict
Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs per validation run. I think this happens since the `model` variable isn't set with the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).
* Update checkpoint loading
* Fixing model loading
2020-03-30 17:06:08 -04:00
Sam Shleifer
8deff3acf2
[bart-tiny-random] Put a 5MB model on S3 to allow faster exampl… ( #3488 )
2020-03-30 12:28:27 -04:00
dougian
1f72865726
[BART] Update encoder and decoder on set_input_embedding ( #3501 )
...
Co-authored-by: Ioannis Douratsos <ioannisd@amazon.com>
2020-03-30 12:20:37 -04:00
Julien Chaumond
cc598b312b
[InputExample] Unfreeze for now, cf. #3423
2020-03-30 10:41:49 -04:00
Julien Plu
d38bbb225f
Update the NER TF script ( #3511 )
...
* Update the NER TF script to remove the softmax and set the pad token label id to -1
* Reformat the quality and style
Co-authored-by: Julien Plu <julien.plu@adevinta.com>
2020-03-30 09:50:12 -04:00
LysandreJik
eff757f2e3
Re-pin isort version
2020-03-30 09:00:47 -04:00
LysandreJik
a009d751c2
Un-pin isort for v2.7.0 pypi
2020-03-30 08:55:10 -04:00
LysandreJik
6f5a12a583
Release: v2.7.0
2020-03-30 08:49:24 -04:00
Patrick von Platen
296252c49e
fix lm labels in docstring ( #3529 )
2020-03-30 14:26:24 +02:00
Patrick von Platen
75ec6c9e3a
[T5] make decoder input ids optional for t5 training ( #3521 )
...
* make decoder input ids optional for t5 training
* lm_labels should not be shifted in t5
* add tests
* finish shift right functionality for PT T5
* move shift right to correct class
* cleaner code
* replace -100 values with pad token id
* add assert statement
* remove unnecessary for loop
* make style
2020-03-30 13:45:26 +02:00
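The "shift right" and "replace -100 values with pad token id" steps listed above can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the code from the pull request: the function name and signature are hypothetical, labels are a flat list of token ids, and -100 is assumed to be the value marking positions the loss ignores.

```python
from typing import List


def shift_right(labels: List[int], decoder_start_token_id: int, pad_token_id: int) -> List[int]:
    """Build decoder input ids from unshifted labels (hypothetical sketch)."""
    # Prepend the start token and drop the last label so lengths match.
    shifted = [decoder_start_token_id] + labels[:-1]
    # -100 only has meaning for the loss; as an input id it must be a real token.
    return [pad_token_id if tok == -100 else tok for tok in shifted]
```

With this kind of helper, a caller passes only the labels and the decoder inputs are derived from them, which is what makes the decoder input ids optional during training.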
Patrick von Platen
5b44e0a31b
[T5] Add training documentation ( #3507 )
...
* Add clear description of how to train T5
* correct docstring in T5
* correct typo
* correct docstring format
* update t5 model docs
* implement collins feedback
* fix typo and add more explanation for sentinel tokens
* delete unnecessary todos
2020-03-30 13:35:53 +02:00
Sam Shleifer
33ef7002e1
[Docs] examples/summarization/bart: Simplify CNN/DM preprocessi… ( #3516 )
2020-03-29 13:25:42 -04:00
Sam Shleifer
f6a23d1911
[BART] add bart-large-xsum weights ( #3422 )
2020-03-29 10:51:13 -04:00
Stefan Schweter
601ac5b1dc
[model_cards]: use MIT license for all dbmdz models
2020-03-27 18:06:25 -04:00
Patrick von Platen
17dceae7a1
Fix circle ci flaky fail of wmt example ( #3485 )
...
* force bleu
* fix wrong file name
* rename file
* different filenames for each example test
* test files should clean up after themselves
* test files should clean up after themselves
* do not force bleu
* correct typo
* fix isort
2020-03-27 13:01:28 -04:00
Patrick von Platen
00ea100e96
add summarization and translation to notebook ( #3478 )
2020-03-27 11:05:37 -04:00
Funtowicz Morgan
b08259a120
run_ner.py / bert-base-multilingual-cased can output empty tokens ( #2991 )
...
* Use tokenizer.num_added_tokens to count number of added special_tokens instead of hardcoded numbers.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* run_ner.py - Do not add a label to the labels_ids if word_tokens is empty.
This can happen when using bert-base-multilingual-cased with an input containing a single space.
In this case, the tokenizer outputs an empty word_tokens list, which leads to inconsistent behavior:
labels_ids ends up with one more entry than the tokens vector.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-27 10:59:55 -04:00
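The empty-token case described above can be illustrated with a small sketch. It is not the code from run_ner.py: the function `align_labels`, its parameters, and the default `pad_label_id` are assumptions, and the tokenizer is passed in as any callable that maps a word to a list of sub-tokens.

```python
from typing import Callable, Dict, List, Tuple


def align_labels(
    words: List[str],
    labels: List[str],
    tokenize: Callable[[str], List[str]],
    label_map: Dict[str, int],
    pad_label_id: int = -100,  # assumed placeholder for non-first sub-tokens
) -> Tuple[List[str], List[int]]:
    """Keep tokens and label ids the same length, even for empty tokenizations."""
    tokens: List[str] = []
    label_ids: List[int] = []
    for word, label in zip(words, labels):
        word_tokens = tokenize(word)
        if not word_tokens:
            # The tokenizer produced nothing for this word, so adding a
            # label here would leave label_ids one entry longer than tokens.
            continue
        tokens.extend(word_tokens)
        # Real label on the first sub-token, padding label on the rest.
        label_ids.extend([label_map[label]] + [pad_label_id] * (len(word_tokens) - 1))
    return tokens, label_ids
```

The guard on `word_tokens` is the whole point: without it, the label for the empty word is still appended and the two lists drift out of alignment.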
Patrick von Platen
f4f4946836
Rename t5-large to t5-base in README.md
2020-03-27 15:57:58 +01:00
Patrick von Platen
fa9af2468a
Add T5 to docs ( #3461 )
...
* add t5 docs basis
* improve docs
* add t5 docs
* improve t5 docstring
* add t5 tokenizer docstring
* finish docstring
* make style
* add pretrained models
* correct typo
* make examples work
* finalize docs
2020-03-27 10:57:16 -04:00
Lysandre Debut
ff80b73157
Add option to choose T5 model size. ( #3480 )
...
T5-small in test
isort
2020-03-27 15:56:59 +01:00
LysandreJik
e2c05f06ef
Correct indentation in docstring
...
For some reason Sphinx extremely dislikes this and crashes.
2020-03-27 09:28:52 -04:00
Sam Shleifer
3ee431dd4c
[Bart/Memory] Two separate, smaller decoder attention masks ( #3371 )
2020-03-26 21:34:15 -04:00
Manuel Romero
53fe733805
Model Cards: Fix grammar error ( #3467 )
2020-03-26 21:33:33 -04:00
Sam Shleifer
c10decf7a0
[Bart: example] drop columns that are exclusively pad_token_id… ( #3400 )
...
* trim seq_len below 1024 if there are columns full of pad_token_id
* Centralize trim_batch so SummarizationDataset can use it too
2020-03-26 19:33:54 -04:00
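The trimming described above, dropping columns that contain only pad_token_id, can be sketched in plain Python. The name `trim_batch` comes from the commit message, but the signature and the list-of-lists batch layout here are assumptions; the repository version presumably operates on tensors.

```python
from typing import List


def trim_batch(input_ids: List[List[int]], pad_token_id: int) -> List[List[int]]:
    """Remove every column in which all rows hold pad_token_id (sketch)."""
    if not input_ids:
        return input_ids
    seq_len = len(input_ids[0])
    # A column is kept if at least one row has a non-pad token there.
    keep = [any(row[col] != pad_token_id for row in input_ids) for col in range(seq_len)]
    return [[tok for tok, k in zip(row, keep) if k] for row in input_ids]
```

When a batch is padded to a fixed maximum length but its longest example is much shorter, this reduces the sequence length the model has to process.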
Sam Shleifer
63f4d8cad0
[Bart/Memory] SelfAttention only returns weights if config.outp… ( #3369 )
2020-03-26 18:42:39 -04:00
Sam Shleifer
2b2a2f8df2
[Bart] Fix: put dummy_inputs on correct device ( #3398 )
...
* Dummy inputs to model.device
* Move self.device to ModuleUtilsMixin
2020-03-26 18:42:09 -04:00