Commit Graph

15053 Commits

Author SHA1 Message Date
Patrick von Platen
ab5d06a094
[T5, examples] replace heavy t5 models with tiny random models (#3556)
* replace heavy t5 models with tiny random models as was done by sshleifer

* fix isort
2020-04-02 12:34:05 +02:00
Patrick von Platen
a4ee4da18a
[T5, TF 2.2] change tf t5 argument naming (#3547)
* change tf t5 argument naming for TF 2.2

* correct bug in testing
2020-04-01 22:04:20 +02:00
Patrick von Platen
06dd597552
fix bug in T5 pipeline warnings (#3545) 2020-04-01 21:59:12 +02:00
Anirudh Srinivasan
9de9ceb6c5
Correct output shape for Bert NSP models in docs (#3482) 2020-04-01 15:04:38 -04:00
Patrick von Platen
b815edf69f
[T5, Tests] Add extensive hard-coded integration tests and make sure PT and TF give equal results (#3550)
* add some t5 integration tests

* finish summarization and translation integration tests for T5 - results look good

* add tf test

* fix == vs is bug

* fix tf beam search error and make tf t5 tests pass
2020-04-01 18:01:33 +02:00
HUSEIN ZOLKEPLI
8538ce9044
Add tiny-bert-bahasa-cased model card (#3567)
* add bert bahasa readme

* update readme

* update readme

* added xlnet

* added tiny-bert and fix xlnet readme
2020-04-01 07:15:00 -04:00
Manuel Romero
c1a6252be1
Create model card (#3557)
Create model card for: distilbert-multi-finetuned-for-xqua-on-tydiqa
2020-04-01 07:14:23 -04:00
Julien Chaumond
50e15c825c
Tokenizers: Start cleaning examples a little (#3455)
* Start cleaning examples

* Fixup
2020-04-01 07:13:40 -04:00
Patrick von Platen
b38d552a92
[Generate] Add bad words list argument to the generate function (#3367)
* add bad words list

* make style

* add bad_words_tokens

* make style

* better naming

* make style

* fix typo
2020-03-31 18:42:31 +02:00
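A minimal usage sketch of the bad-words feature described in the commit above. The final argument name is assumed here to be `bad_words_ids`, and the model, tokenizer, and prompt are illustrative choices, not taken from the PR itself.

```python
# Hypothetical sketch: keep generate() from emitting certain words.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Encode each banned word with a leading space so it matches mid-sentence tokens.
bad_words_ids = [tokenizer.encode(" " + word) for word in ["ugly", "stupid"]]

input_ids = tokenizer.encode("The weather today is", return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=30,
    bad_words_ids=bad_words_ids,  # assumed argument name: sequences of banned token ids
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```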
Patrick von Platen
ae6834e028
[Examples] Clean summarization and translation example testing files for T5 and Bart (#3514)
* fix conflicts

* add model size argument to summarization

* correct wrong import

* fix isort

* correct imports

* other isort make style

* make style
2020-03-31 17:54:13 +02:00
Manuel Romero
0373b60c4c
Update README.md (#3552)
- Show that the last uploaded version was trained on more data (custom_license files)
2020-03-31 10:40:34 -04:00
Patrick von Platen
83d1fbcff6
[Docs] Add usage examples for translation and summarization (#3538) 2020-03-31 09:36:03 -04:00
Patrick von Platen
55bcae7f25
remove useless and confusing lm_labels line (#3531) 2020-03-31 09:32:25 -04:00
Patrick von Platen
42e1e3c67f
Update usage doc regarding generate fn (#3504) 2020-03-31 09:31:46 -04:00
Patrick von Platen
57b0fab692
Add better explanation to check docs locally. (#3459) 2020-03-31 09:30:17 -04:00
Manuel Romero
a8d4dff0a1
Update README.md (#3470)
Fix typo
2020-03-31 08:01:09 -04:00
Manuel Romero
4a5663568f
Create card for the model: GPT-2-finetuned-covid-bio-medrxiv (#3453) 2020-03-31 08:01:03 -04:00
Branden Chan
bbedb59675
Create README.md (#3393)
* Create README.md

* Update README.md
2020-03-31 08:00:35 -04:00
Manuel Romero
c2cf192943
Add link to 16 POS tags model (#3465) 2020-03-31 08:00:00 -04:00
Gabriele Sarti
c82ef72158
Added CovidBERT-NLI model card (#3477) 2020-03-31 07:59:49 -04:00
Manuel Romero
b48a1f08c1
Add text shown in example of usage (#3464) 2020-03-31 07:59:36 -04:00
Manuel Romero
99833a9cbf
Create model card (#3487) 2020-03-31 07:59:22 -04:00
Sho Arora
ebceeeacda
Add electra and alectra model cards (#3524) 2020-03-31 07:58:48 -04:00
Leandro von Werra
a6c4ee27fd
Add model cards (#3537)
* feat: add model card bert-imdb

* feat: add model card gpt2-imdb-pos

* feat: add model card gpt2-imdb
2020-03-31 07:54:45 -04:00
Ethan Perez
e5c393dceb
[Bug fix] Using loaded checkpoint with --do_predict (instead of… (#3437)
* Using loaded checkpoint with --do_predict

Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs per validation run. I think this happens since the `model` variable isn't set with the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).

* Update checkpoint loading

* Fixing model loading
2020-03-30 17:06:08 -04:00
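A rough sketch of the kind of fix described in the commit above: make sure the `model` variable actually holds the trained checkpoint before running prediction, rather than a freshly initialized model. The checkpoint layout and loading details are illustrative assumptions, not the exact PR code.

```python
# Illustrative sketch (not the exact PR code): load the saved checkpoint
# into `model` before --do_predict, otherwise predictions come from a
# randomly initialized model and vary between runs.
import glob
import os

import torch


def load_latest_checkpoint(model, output_dir):
    """Load the most recent saved checkpoint into `model` before prediction.

    Assumes Lightning-style *.ckpt files whose weights live under "state_dict".
    """
    checkpoints = sorted(glob.glob(os.path.join(output_dir, "checkpoint*.ckpt")))
    if not checkpoints:
        raise FileNotFoundError(f"No checkpoints found in {output_dir}")
    state = torch.load(checkpoints[-1], map_location="cpu")
    model.load_state_dict(state.get("state_dict", state))
    model.eval()
    return model
```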
Sam Shleifer
8deff3acf2
[bart-tiny-random] Put a 5MB model on S3 to allow faster exampl… (#3488) 2020-03-30 12:28:27 -04:00
dougian
1f72865726
[BART] Update encoder and decoder on set_input_embedding (#3501)
Co-authored-by: Ioannis Douratsos <ioannisd@amazon.com>
2020-03-30 12:20:37 -04:00
Julien Chaumond
cc598b312b [InputExample] Unfreeze for now, cf. #3423 2020-03-30 10:41:49 -04:00
Julien Plu
d38bbb225f
Update the NER TF script (#3511)
* Update the NER TF script to remove the softmax and set the pad token label id to -1

* Reformat the quality and style

Co-authored-by: Julien Plu <julien.plu@adevinta.com>
2020-03-30 09:50:12 -04:00
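A small sketch of what a pad label id of -1 implies for the loss: padded positions are masked out rather than trained on, and the loss is computed from raw logits since the softmax was removed. The loss function and masking shown here are illustrative assumptions, not the script's exact code.

```python
# Illustrative sketch: ignore positions whose label id is -1 (padding)
# when computing token-classification loss from raw logits.
import tensorflow as tf


def masked_ner_loss(labels, logits, pad_label_id=-1):
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    active = tf.not_equal(labels, pad_label_id)   # True for real tokens
    labels = tf.boolean_mask(labels, active)      # drop padded positions
    logits = tf.boolean_mask(logits, active)
    return tf.reduce_mean(loss_fn(labels, logits))
```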
LysandreJik
eff757f2e3 Re-pin isort version 2020-03-30 09:00:47 -04:00
LysandreJik
a009d751c2 Un-pin isort for v2.7.0 pypi 2020-03-30 08:55:10 -04:00
LysandreJik
6f5a12a583 Release: v2.7.0 2020-03-30 08:49:24 -04:00
Patrick von Platen
296252c49e
fix lm labels in docstring (#3529) 2020-03-30 14:26:24 +02:00
Patrick von Platen
75ec6c9e3a
[T5] make decoder input ids optional for t5 training (#3521)
* make decoder input ids optional for t5 training

* lm_labels should not be shifted in t5

* add tests

* finish shift right functionality for PT T5

* move shift right to correct class

* cleaner code

* replace -100 values with pad token id

* add assert statement

* remove unnecessary for loop

* make style
2020-03-30 13:45:26 +02:00
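A condensed sketch of the shift-right idea described in the commit above: decoder input ids can be derived from the labels by shifting them one position to the right, inserting the decoder start token, and replacing -100 ignore-index values with the pad token id. Exact tensor handling in the library may differ; this is illustration only.

```python
# Illustrative sketch of shifting labels right to build decoder_input_ids.
import torch


def shift_right(labels, decoder_start_token_id, pad_token_id):
    shifted = labels.new_zeros(labels.shape)
    shifted[..., 1:] = labels[..., :-1].clone()   # move everything one step right
    shifted[..., 0] = decoder_start_token_id      # first position is the start token
    # -100 marks positions ignored by the loss; they must become real token ids.
    shifted.masked_fill_(shifted == -100, pad_token_id)
    return shifted
```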
Patrick von Platen
5b44e0a31b
[T5] Add training documentation (#3507)
* Add clear description of how to train T5

* correct docstring in T5

* correct typo

* correct docstring format

* update t5 model docs

* implement collins feedback

* fix typo and add more explanation for sentinel tokens

* delete unnecessary todos
2020-03-30 13:35:53 +02:00
Sam Shleifer
33ef7002e1
[Docs] examples/summarization/bart: Simplify CNN/DM preprocessi… (#3516) 2020-03-29 13:25:42 -04:00
Sam Shleifer
f6a23d1911
[BART] add bart-large-xsum weights (#3422) 2020-03-29 10:51:13 -04:00
Stefan Schweter
601ac5b1dc [model_cards]: use MIT license for all dbmdz models 2020-03-27 18:06:25 -04:00
Patrick von Platen
17dceae7a1
Fix circle ci flaky fail of wmt example (#3485)
* force bleu

* fix wrong file name

* rename file

* different filenames for each example test

* test files should clean up after themselves

* test files should clean up after themselves

* do not force bleu

* correct typo

* fix isort
2020-03-27 13:01:28 -04:00
Patrick von Platen
00ea100e96
add summarization and translation to notebook (#3478) 2020-03-27 11:05:37 -04:00
Funtowicz Morgan
b08259a120
run_ner.py / bert-base-multilingual-cased can output empty tokens (#2991)
* Use tokenizer.num_added_tokens to count the number of added special_tokens instead of hardcoded numbers.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* run_ner.py - Do not add a label to labels_ids if word_tokens is empty.

This can happen when using bert-base-multilingual-cased with an input containing a lone space. In that case the tokenizer outputs an empty word_tokens, which led to inconsistent behavior: labels_ids ended up with one more entry than the tokens vector.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-27 10:59:55 -04:00
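A sketch of the guard described in the commit above: only extend the token and label lists when the tokenizer actually returns sub-tokens for a word, so the two lists stay the same length. Variable names and example inputs are assumptions for illustration, not the script's exact code.

```python
# Illustrative sketch: skip words whose sub-token list comes back empty
# (e.g. a lone space with bert-base-multilingual-cased), so that
# label_ids never grows longer than tokens.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
pad_token_label_id = -100
words = ["Hello", " ", "world"]        # the lone space yields no sub-tokens
labels = ["O", "O", "B-LOC"]
label_map = {"O": 0, "B-LOC": 1}

tokens, label_ids = [], []
for word, label in zip(words, labels):
    word_tokens = tokenizer.tokenize(word)
    if not word_tokens:                # tokenizer produced nothing for this word
        continue                       # add neither tokens nor a label
    tokens.extend(word_tokens)
    # Real label on the first sub-token, pad label on the rest.
    label_ids.extend([label_map[label]] + [pad_token_label_id] * (len(word_tokens) - 1))

assert len(tokens) == len(label_ids)
```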
Patrick von Platen
f4f4946836
Rename t5-large to t5-base in README.md 2020-03-27 15:57:58 +01:00
Patrick von Platen
fa9af2468a
Add T5 to docs (#3461)
* add t5 docs basis

* improve docs

* add t5 docs

* improve t5 docstring

* add t5 tokenizer docstring

* finish docstring

* make style

* add pretrained models

* correct typo

* make examples work

* finalize docs
2020-03-27 10:57:16 -04:00
Lysandre Debut
ff80b73157
Add option to choose T5 model size. (#3480)
* T5-small in test

* isort
2020-03-27 15:56:59 +01:00
LysandreJik
e2c05f06ef Correct indentation in docstring
For some reason Sphinx extremely dislikes this and crashes.
2020-03-27 09:28:52 -04:00
Sam Shleifer
3ee431dd4c
[Bart/Memory] Two separate, smaller decoder attention masks (#3371) 2020-03-26 21:34:15 -04:00
Manuel Romero
53fe733805
Model Cards: Fix grammar error (#3467) 2020-03-26 21:33:33 -04:00
Sam Shleifer
c10decf7a0
[Bart: example] drop columns that are exclusively pad_token_id… (#3400)
* trim seq_len below 1024 if there are columns full of pad_token_id
* Centralize trim_batch so SummarizationDataset can use it too
2020-03-26 19:33:54 -04:00
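A sketch of the column-trimming idea from the commit above: drop any columns of the batch that consist entirely of pad_token_id, shrinking the effective sequence length. The `trim_batch` name comes from the commit, but the signature and body here are an illustrative assumption.

```python
# Illustrative sketch of trim_batch: remove columns that are entirely padding.
import torch


def trim_batch(input_ids, pad_token_id, attention_mask=None):
    keep = input_ids.ne(pad_token_id).any(dim=0)  # True for columns with at least one real token
    if attention_mask is None:
        return input_ids[:, keep]
    return input_ids[:, keep], attention_mask[:, keep]
```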
Sam Shleifer
63f4d8cad0
[Bart/Memory] SelfAttention only returns weights if config.outp… (#3369) 2020-03-26 18:42:39 -04:00
Sam Shleifer
2b2a2f8df2
[Bart] Fix: put dummy_inputs on correct device (#3398)
* Dummy inputs to model.device

* Move self.device to ModuleUtilsMixin
2020-03-26 18:42:09 -04:00
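A sketch of the device helper the commit above describes moving into ModuleUtilsMixin: infer the module's device from its first parameter so dummy inputs can be created on the same device as the weights. The class bodies below are illustrative assumptions, not the library's exact code.

```python
# Illustrative sketch: a device property shared via a mixin, used to place
# dummy inputs on the same device as the model's parameters.
import torch
from torch import nn


class ModuleUtilsMixin:
    @property
    def device(self) -> torch.device:
        return next(self.parameters()).device


class TinyModel(ModuleUtilsMixin, nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    @property
    def dummy_inputs(self):
        # Dummy inputs created on the same device as the model's weights.
        return {"input_ids": torch.zeros(1, 4, dtype=torch.long, device=self.device)}


model = TinyModel()
print(model.dummy_inputs["input_ids"].device)
```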