* add some t5 integration tests
* finish summarization and translation integration tests for T5 - results look good
* add tf test
* fix == vs is bug
* fix tf beam search error and make tf t5 tests pass
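The "== vs is" fix above is a common Python pitfall; a minimal, hypothetical illustration (not the actual diff from this change):

```python
# `==` compares values; `is` compares object identity. Using `is` on
# values such as ints or strings can silently fail.
a = 1000
b = int("1000")  # built at runtime, so a distinct object
print(a == b)    # True: equal values
print(a is b)    # False in CPython: different objects

x = None
if x is None:    # identity checks are correct only for singletons like None
    print("x is unset")
```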
* Using loaded checkpoint with --do_predict
Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs between runs. I think this happens because the `model` variable is never assigned the loaded checkpoint, so evaluation runs on a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).
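A minimal sketch of the idea behind the fix, assuming a PyTorch Lightning-style example script (the checkpoint glob pattern and the `args`/`trainer` names are assumptions for illustration): reload the saved checkpoint into `model` before prediction instead of testing the freshly initialized model.

```python
import glob
import os

if args.do_predict:
    # Without re-assigning `model`, trainer.test() runs on randomly
    # initialized weights rather than the trained checkpoint.
    checkpoints = sorted(glob.glob(os.path.join(args.output_dir, "checkpoint*.ckpt")))
    model = model.load_from_checkpoint(checkpoints[-1])
    trainer.test(model)
```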
* Update checkpoint loading
* Fixing model loading
* Update the NER TF script to remove the softmax and set the pad token label id to -1
* Reformat for quality and style
Co-authored-by: Julien Plu <julien.plu@adevinta.com>
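A hedged sketch of the two NER changes above (the loss helper and its names are illustrative, not the script's exact code): the model returns raw logits with no softmax, and positions labeled with the pad sentinel are masked out of the loss.

```python
import tensorflow as tf

pad_token_label_id = -1  # sentinel label for padding / sub-word positions

def masked_loss(labels, logits):
    # Keep only positions whose label is not the pad sentinel.
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    active = tf.reshape(labels, (-1,)) != pad_token_label_id
    labels_flat = tf.boolean_mask(tf.reshape(labels, (-1,)), active)
    logits_flat = tf.boolean_mask(
        tf.reshape(logits, (-1, tf.shape(logits)[-1])), active
    )
    return tf.reduce_mean(loss_fn(labels_flat, logits_flat))
```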
* make decoder input ids optional for t5 training
* lm_labels should not be shifted in t5
* add tests
* finish shift right functionality for PT T5 (see the sketch after this block)
* move shift right to correct class
* cleaner code
* replace -100 values with pad token id
* add assert statement
* remove unnecessary for loop
* make style
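A self-contained sketch of the shift-right behavior described in the commits above, assuming T5-style config values (`decoder_start_token_id`, `pad_token_id`) and shown as an illustration rather than the exact implementation: labels are shifted one position to the right, the decoder start token is prepended, and -100 (the value ignored by the loss) is replaced with the pad token id, with an assert guarding against leftover negative ids.

```python
import torch

def shift_right(input_ids, decoder_start_token_id, pad_token_id):
    # Shift labels one position to the right and prepend the start token.
    shifted = input_ids.new_zeros(input_ids.shape)
    shifted[..., 1:] = input_ids[..., :-1].clone()
    shifted[..., 0] = decoder_start_token_id
    # -100 marks positions ignored by the loss; replace with the pad token id.
    shifted.masked_fill_(shifted == -100, pad_token_id)
    assert torch.all(shifted >= 0).item(), "decoder_input_ids must be non-negative"
    return shifted

labels = torch.tensor([[6, 7, 8, -100]])
print(shift_right(labels, decoder_start_token_id=0, pad_token_id=0))
# tensor([[0, 6, 7, 8]])
```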
* Add clear description of how to train T5
* correct docstring in T5
* correct typo
* correct docstring format
* update t5 model docs
* implement collins feedback
* fix typo and add more explanation for sentinel tokens
* delete unnecessary todos
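A hedged usage sketch of the documented training setup (argument names follow the API of these commits, e.g. `lm_labels`; later versions renamed it to `labels`): only the encoder inputs and the target ids are passed, and decoder_input_ids are created internally by shifting the labels right.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer.encode(
    "translate English to German: The house is wonderful.", return_tensors="pt"
)
target_ids = tokenizer.encode("Das Haus ist wunderbar.", return_tensors="pt")

# decoder_input_ids are optional: the model derives them from the labels.
outputs = model(input_ids=input_ids, lm_labels=target_ids)
loss = outputs[0]
```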
* force bleu
* fix wrong file name
* rename file
* different filenames for each example test
* test files should clean up after themselves
* test files should clean up after themselves
* do not force bleu
* correct typo
* fix isort
* Use tokenizer.num_added_tokens to count the number of added special tokens instead of hardcoded numbers.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
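A sketch of the replacement (grounded in the commit message; `num_added_tokens` was the tokenizer method of this era, later renamed `num_special_tokens_to_add`; the truncation snippet around it is illustrative):

```python
# Before: special_tokens_count was hardcoded (e.g. 2, or 3 with an extra SEP).
# After: ask the tokenizer how many special tokens it will add.
special_tokens_count = tokenizer.num_added_tokens()
if len(tokens) > max_seq_length - special_tokens_count:
    tokens = tokens[: max_seq_length - special_tokens_count]
```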
* run_ner.py - Do not add a label to label_ids if word_tokens is empty.
This can happen when using bert-base-multilingual-cased with an input containing a lone space.
In this case, the tokenizer outputs an empty word_tokens, which leads to inconsistent behavior:
label_ids ends up with one more entry than the tokens vector.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
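A sketch of the guard described above, with names in the style of the era's utils_ner feature conversion (`tokens`, `label_ids`, `label_map`, and `pad_token_label_id` are taken from context and should be treated as illustrative):

```python
for word, label in zip(example.words, example.labels):
    word_tokens = tokenizer.tokenize(word)
    # bert-base-multilingual-cased can return [] for a lone space;
    # skip such words so label_ids stays aligned with tokens.
    if len(word_tokens) > 0:
        tokens.extend(word_tokens)
        # Label the first sub-token; pad the remaining sub-tokens.
        label_ids.extend(
            [label_map[label]] + [pad_token_label_id] * (len(word_tokens) - 1)
        )
```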