Commit Graph

691 Commits

Author SHA1 Message Date
Lysandre Debut
18c32eeb21
Model parallel tests should return, not pass in non model parallel settings. (#8825) 2020-11-27 16:41:29 -05:00
Max Del
0a921b6459
BART & FSMT: fix decoder not returning hidden states from the last layer (#8597)
* Fix decoder not returning hidden states from the last layer

* Resolve conflict

* Change the way to gather hidden states

* Add decoder hidden states test

* Make pytest and black happy

* Remove redundant line

* remove new line

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2020-11-27 18:35:34 +01:00
Moussa Kamal Eddine
81fe0bf085
Add barthez model (#8393)
* Add init barthez

* Add barthez model, tokenizer and docs

BARThez is a pre-trained french seq2seq model that uses BART objective.

* Apply suggestions from code review docs typos

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add license

* Change URLs scheme

* Remove barthez model keep tokenizer

* Fix style

* Fix quality

* Update tokenizer

* Add fast tokenizer

* Add fast tokenizer test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-27 12:31:42 -05:00
Patrick von Platen
a7d46a0609
Fix dpr<>bart config for RAG (#8808)
* correct dpr test and bert pos fault

* fix dpr bert config problem

* fix layoutlm

* add config to dpr as well
2020-11-27 16:26:45 +01:00
Patrick von Platen
a2cf37595e
[Flax test] Add require pytorch to flix flax test (#8816)
* try flax fix

* same for roberta
2020-11-27 14:40:42 +01:00
Kristian Holsheimer
f8eda599bd
[FlaxBert] Fix non-broadcastable attention mask for batched forward-passes (#8791)
* [FlaxBert] Fix non-broadcastable attention mask for batched forward-passes

* [FlaxRoberta] Fix non-broadcastable attention mask

* Use jax.numpy instead of ordinary numpy (otherwise not jit-able)

* Partially revert "Use jax.numpy ..."

* Add tests for batched forward passes

* Avoid unnecessary OOMs due to preallocation of GPU memory by XLA

* Auto-fix style

* Re-enable GPU memory preallocation but with mem fraction < 1/paralleism
2020-11-27 13:21:19 +01:00
Patrick von Platen
2a6fbe6a40
[XLNet] Fix mems behavior (#8567)
* fix mems in xlnet

* fix use_mems

* fix use_mem_len

* fix use mems

* clean docs

* fix tf typo

* make xlnet tf for generation work

* fix tf test

* refactor use cache

* add use cache for missing models

* correct use_cache in generate

* correct use cache in tf generate

* fix tf

* correct getattr typo

* make sylvain happy

* change in docs as well

* do not apply to cookie cutter statements

* fix tf test

* make pytorch model fully backward compatible
2020-11-25 16:54:59 -05:00
Joe Davison
369f1d77b4
Return correct Bart hidden state tensors (#8747)
* bart output hidden states upstream

* same w/ decoder

* add tests

* fix prophetnet

* fix gpt2 and ctrl

* fix fstm and skip test for reformer and longformer

* fix all models

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-25 22:06:04 +01:00
Lysandre Debut
138f45c184
Fix QA argument handler (#8765)
* Fix QA argument handler

* Attempt to get a better fix for QA (#8768)

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2020-11-25 14:02:15 -05:00
Julien Plu
29d4992453
New TF model inputs (#8602)
* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add input processing for TF Flaubert

* Add deprecated arguments

* Add input processing to TF XLM

* remove unused imports

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Bug fix

* Retry to bugfix

* Retry bug fix

* Fix wrong model name

* Try another fix

* Fix BART

* Fix input precessing

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Bug fix

* Address Patrick's comments

* Address Patrick's comments

* Address Sylvain's comments

* Add the new inputs in new Longformer models

* Update the template with the new input processing

* Remove useless assert

* Apply style

* Trigger CI
2020-11-24 13:55:00 -05:00
Stas Bekman
82d443a7fd
[core] implement support for run-time dependency version checking (#8645)
* implement support for run-time dependency version checking

* try not escaping !

* use findall that works on py36

* small tweaks

* autoformatter worship

* simplify

* shorter names

* add support for non-versioned checks

* add deps

* revert

* tokenizers not required, check version only if installed

* make a proper distutils cmd and add make target

* tqdm must be checked before tokenizers

* workaround the DistributionNotFound peculiar setup

* handle the rest of packages in setup.py

* fully sync setup.py's install_requires - to check them all

* nit

* make install_requires more readable

* typo

* Update setup.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* restyle

* add types

* simplify

* simplify2

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-24 13:22:25 -05:00
Lysandre Debut
e09e54fd9d
MT5 should have an autotokenizer (#8743)
* MT5 should have an autotokenizer

* Different configurations should be able to point to same tokenizers
2020-11-24 09:50:25 -05:00
Lysandre Debut
6fdd0bb231
Fix slow tests v2 (#8746)
* Fix BART test

* Fix MBART tests

* Remove erroneous line from yaml

* Update tests/test_modeling_bart.py

* Quality
2020-11-24 09:35:12 -05:00
zhiheng-huang
2c83b3c38d
Support various BERT relative position embeddings (2nd) (#8276)
* Support BERT relative position embeddings

* Fix typo in README.md

* Address review comment

* Fix failing tests

* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py

* make fix copies

* fix configs of electra and albert and fix longformer

* remove copy statement from longformer

* fix albert

* fix electra

* Add bert variants forward tests for various position embeddings

* [tiny] Fix style for test_modeling_bert.py

* improve docstring

* [tiny] improve docstring and remove unnecessary dependency

* [tiny] Remove unused import

* re-add to ALBERT

* make embeddings work for ALBERT

* add test for albert

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-24 14:40:53 +01:00
LysandreJik
7f2c00913a TF BERT test update 2020-11-23 18:20:19 -05:00
LysandreJik
e1b7e10d5f Update TF BERT test 2020-11-23 18:19:12 -05:00
Colin Brochtrup
8ffc01a76a
Add early stopping callback to pytorch trainer (#8581)
* Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer

* Add early stopping test

* Set patience counter to 0 if best metric not defined yet

* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.

* Run make style

* make funciton name sensible

* Improve new argument docstring wording and hope that flakey CI test passes.

* Use on_evaluation callback instead of custom. Remove some debug printing

* Move early stopping arguments and state into early stopping callback

* Run make style

* Remove old code

* Fix docs formatting. make style went rogue on me.

* Remove copied attributes and fix variable

* Add assertions on training arguments instead of mutating them. Move comment out of public docs.

* Make separate test for early stopping callback. Add test of invalid arguments.

* Run make style... I remembered before CI this time!

* appease flake8

* Add EarlyStoppingCallback to callback docs

* Make docstring EarlyStoppingCallabck match other callbacks.

* Fix typo in docs
2020-11-23 17:25:35 -05:00
Stas Bekman
e84786aaa6
consistent ignore keys + make private (#8737)
* consistent ignore keys + make private

* style

* - authorized_missing_keys    => _keys_to_ignore_on_load_missing
  - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected

* move public doc of private attributes to private comment
2020-11-23 12:33:13 -08:00
alexorona
1cd9be2aeb
gpt2 and t5 parallel modeling (#8696)
* gpt2 and t5 parallel modeling

* model_parallel utils update

* adding missing model_parallel_utils

Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5

* training_args reformat

Reformatted training_args

* style formatting

Style formatting doc string length on training_args and model_parallel_utils

* style changes

make style && make quality for training_args and model_parallel_utils.

* adding tests

* minor change in trainer

reverts loss calculation

* Update training_args.py

* Update training_args.py

added back docstring language for adam_beta1 and adam_beta2

* Update trainer.py

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix style & rebase

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-23 14:41:23 -05:00
Julien Chaumond
0cc5ab1333
Improve bert-japanese tokenizer handling (#8659)
* Make ci fail

* Try to make tests actually run?

* CI finally failing?

* Fix CI

* Revert "Fix CI"

This reverts commit ca7923be73.

* Ooops wrong one

* one more try

* Ok ok let's move this elsewhere

* Alternative to globals() (#8667)

* Alternative to globals()

* Error is raised later so return None

* Sentencepiece not installed make some tokenizers None

* Apply Lysandre wisdom

* Slightly clearer comment?

cc @sgugger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-23 11:15:02 -05:00
Yossi Synett
18c8cf000b
Fix bug in x-attentions output for roberta and harden test to catch it (#8660) 2020-11-23 13:28:29 +01:00
Patrick von Platen
9c0afdaf7b
fix flaky ci (#8694) 2020-11-20 22:07:21 +01:00
Sylvain Gugger
6494910f27
Add sentencepiece to the CI and fix tests (#8672)
* Fix the CI and tests

* Fix quality

* Remove that m form nowhere
2020-11-19 16:44:20 -05:00
Stas Bekman
42111f1d56
[tokenizers] convert_to_tensors: don't reconvert when the type is already right (#8283)
* don't reconvert when the type is already right

* better name

* adjust logic as suggested

* merge
2020-11-19 12:06:01 -08:00
Zhylko Dima
ca0109bd68
disable_ngram_loss fix for prophetnet (#8554)
* `disable_ngram_loss` fix for prophetnet

* add changes documentation

* fix _compute_loss to use mean reduction and -100 to masked tokens & remove unnecessary arguments

* mean label smoothing loss

* small refactor

* fix test

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-11-19 19:18:07 +01:00
Sylvain Gugger
4208f496ee
Better filtering of the model outputs in Trainer (#8633)
* Better filtering of the model outputs in Trainer

* Fix examples tests

* Add test for Lysandre
2020-11-19 10:43:15 -05:00
Lysandre Debut
f2e07e7272
Fix a bunch of slow tests (#8634)
* CI should install `sentencepiece`

* Requiring TF

* Fixing some TFDPR bugs

* remove return_dict=False/True hack

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-11-19 10:41:41 -05:00
elk-cloner
5362bb8a6b
Tf longformer for sequence classification (#8231)
* working on LongformerForSequenceClassification

* add TFLongformerForMultipleChoice

* add TFLongformerForTokenClassification

* use add_start_docstrings_to_model_forward

* test TFLongformerForSequenceClassification

* test TFLongformerForMultipleChoice

* test TFLongformerForTokenClassification

* remove test from repo

* add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice

* add requested classes to modeling_tf_auto.py
update dummy_tf_objects
fix tests
fix bugs in requested classes

* pass all tests except test_inputs_embeds

* sync with master

* pass all tests except test_inputs_embeds

* pass all tests

* pass all tests

* work on test_inputs_embeds

* fix style and quality

* make multi choice work

* fix TFLongformerForTokenClassification signature

* fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature

* fix mult choice

* fix mc hint

* fix input embeds

* fix input embeds

* refactor input embeds

* fix copy issue

* apply sylvains changes and clean more

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-19 10:37:27 -05:00
Sylvain Gugger
1e62e999e8
Fixes the training resuming with gradient accumulation (#8624) 2020-11-18 12:00:11 -05:00
Nicola De Cao
2f9d49b389
Adding PrefixConstrainedLogitsProcessor (#8529)
* Adding PrefixConstrainedLogitsProcessor

* fixing RAG and style_doc

* fixing black (v20 instead of v19)

* Improving doc in generation_logits_process.py

* Improving docs and typing in generation_utils.py

* docs improvement

* adding test and fixing doc typo

* fixing doc_len

* isort on test

* fixed test

* improve docstring a bit

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-18 17:06:25 +01:00
Sylvain Gugger
dd52804f5f
Remove deprecated (#8604)
* Remove old deprecated arguments

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-17 15:11:29 -05:00
Lysandre Debut
3095ee9dab
Tokenizers should be framework agnostic (#8599)
* Tokenizers should be framework agnostic

* Run the slow tests

* Not testing

* Fix documentation

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-17 14:03:03 -05:00
Patrick von Platen
86822a358b
T5 & mT5 (#8552)
* add mt5 and t5v1_1 model

* fix tests

* correct some imports

* add tf model

* finish tf t5

* improve examples

* fix copies

* clean doc
2020-11-17 12:23:09 +01:00
Sylvain Gugger
c89bdfbe72
Reorganize repo (#8580)
* Put models in subfolders

* Styling

* Fix imports in tests

* More fixes in test imports

* Sneaky hidden imports

* Fix imports in doc files

* More sneaky imports

* Finish fixing tests

* Fix examples

* Fix path for copies

* More fixes for examples

* Fix dummy files

* More fixes for example

* More model import fixes

* Is this why you're unhappy GitHub?

* Fix imports in conver command
2020-11-16 21:43:42 -05:00
Sylvain Gugger
1073a2bde5
Switch return_dict to True by default. (#8530)
* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests
2020-11-16 11:43:00 -05:00
LSinev
afb50c663a
Fix GPT2DoubleHeadsModel to work with model.generate() (#6601)
* Fix passing token_type_ids during GPT2DoubleHeadsModel.generate() if used

and for GPT2LMHeadModel too

* Update tests to check token_type_ids usage in GPT2 models
2020-11-16 14:35:44 +01:00
Yusuke Mori
04d8136bde
Adding the prepare_seq2seq_batch function to ProphetNet (#8515)
* Simply insert T5Tokenizer's prepare_seq2seq_batch

* Update/Add some 'import'

* fix RunTimeError caused by '.view'

* Moves .view related error avoidance from seq2seq_trainer to inside prophetnet

* Update test_tokenization_prophetnet.py

* Format the test code with black

* Re-format the test code

* Update test_tokenization_prophetnet.py

* Add importing require_torch in the test code

* Add importing BatchEncoding in the test code

* Re-format the test code on Colab
2020-11-16 14:18:25 +01:00
Thomas Wolf
f4e04cd2c6
[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073)
* Fixing roberta for slow-fast tests

* WIP getting equivalence on pipelines

* slow-to-fast equivalence - working on question-answering pipeline

* optional FAISS tests

* Pipeline Q&A

* Move pipeline tests to their own test job again

* update tokenizer to add sequence id methods

* update to tokenizers 0.9.4

* set sentencepiecce as optional

* clean up squad

* clean up pipelines to use sequence_ids

* style/quality

* wording

* Switch to use_fast = True by default

* update tests for use_fast at True by default

* fix rag tokenizer test

* removing protobuf from required dependencies

* fix NER test for use_fast = True by default

* fixing example tests (Q&A examples use slow tokenizers for now)

* protobuf in main deps extras["sentencepiece"] and example deps

* fix protobug install test

* try to fix seq2seq by switching to slow tokenizers for now

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-15 22:50:59 +01:00
Julien Plu
24184e73c4
Rework some TF tests (#8492)
* Update some tests

* Small update

* Apply style

* Use max_position_embeddings

* Create a fake attribute

* Create a fake attribute

* Update wrong name

* Wrong TransfoXL model file

* Keep the common tests agnostic
2020-11-13 17:07:17 -05:00
Patrick von Platen
42e2d02e44
[T5] Bug correction & Refactor (#8518)
* fix bug

* T5 refactor

* refactor tf

* apply sylvains suggestions
2020-11-13 16:57:31 +01:00
Julien Plu
5d80539488
Add pretraining loss computation for TF Bert pretraining (#8470)
* Add pretraining loss computation for TF Bert pretraining

* Fix labels creation

* Fix T5 model

* restore T5 kwargs

* try a generic fix for pretraining models

* Apply style

* Overide the prepare method for the BERT tests
2020-11-12 14:08:26 -05:00
Lysandre
c7b6bbec5c Skip test until investigation 2020-11-11 12:59:40 -05:00
Ratthachat (Jung)
026a2ff225
Add TFDPR (#8203)
* Create modeling_tf_dpr.py

* Add TFDPR

* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot

last commit accidentally deleted these 4 lines, so I recover them back

* Add TFDPR

* Add TFDPR

* clean up some comments, add TF input-style doc string

* Add TFDPR

* Make return_dict=False as default

* Fix return_dict bug (in .from_pretrained)

* Add get_input_embeddings()

* Create test_modeling_tf_dpr.py

The current version is already passed all 27 tests!
Please see the test run at : 
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing

* fix quality

* delete init weights

* run fix copies

* fix repo consis

* del config_class, load_tf_weights

They shoud be 'pytorch only'

* add config_class back

after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion

* newline after .. note::

* import tf, np (Necessary for ModelIntegrationTest)

* slow_test from_pretrained with from_pt=True

At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug

* Add simple TFDPRModelIntegrationTest

Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet

* upload correct tf model

* remove position_ids as missing keys

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
2020-11-11 12:28:09 -05:00
Julien Plu
da842e4e72
Add next sentence prediction loss computation (#8462)
* Add next sentence prediction loss computation

* Apply style

* Fix tests

* Add forgotten import

* Add forgotten import

* Use a new parameter

* Remove kwargs and use positional arguments
2020-11-11 15:02:06 +01:00
Patrick von Platen
70708cca1a
fix t5 token type ids (#8437) 2020-11-10 14:21:54 -05:00
Lysandre Debut
9fd1f56236
[No merge] TF integration testing (#7621)
* stash

* TF Integration testing for ELECTRA, BERT, Longformer

* Trigger slow tests

* Apply suggestions from code review
2020-11-10 14:02:33 -05:00
Stas Bekman
02bdfc0251
using multi_gpu consistently (#8446)
* s|multiple_gpu|multi_gpu|g; s|multigpu|multi_gpu|g'

* doc
2020-11-10 13:23:58 -05:00
Patrick von Platen
b93569457f
fix t5 special tokens (#8435) 2020-11-10 18:54:17 +01:00
Sam Shleifer
c314b1fd3b
[docs] improve bart/marian/mBART/pegasus docs (#8421) 2020-11-10 10:18:34 -05:00
Lysandre Debut
850afb422d
Patch token classification pipeline (#8364)
* Patch token classification pipeline

* Some added tests for TokenClassificationArgumentHandler (#8366)

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2020-11-10 07:29:34 -05:00