Commit Graph

166 Commits

Author SHA1 Message Date
Julien Plu
4fbcf8ea49
Fix TF input for np.ndarray (#9294)
* Fix input for np.ndarray"

* add a test

* add a test

* Add a test

* Apply style

* Fix test
2021-01-08 08:23:29 -05:00
Julien Plu
812045adcc
New serving (#9419)
* Add a serving method

* Add albert

* Add serving for BERT and BART

* Add more models

* Finish the serving addition

* Temp fix

* Restore DPR

* Fix funnel attribute

* Fix attributes GPT2

* Fix OpenAIGPT attribute

* Fix T5 attributes

* Fix Bart attributes

* Fix TransfoXL attributes

* Add versioning

* better test

* Update template

* Fix Flaubert

* Fix T5

* Apply style

* Remove unused imports

* Deactivate extra parameters

* Remove too long test + saved_model default to False

* Ignore the saved model test for some models

* Fix some inputs

* Fix mpnet serving

* Trigger CI

* Address all comments
2021-01-07 11:48:49 +01:00
Julien Plu
4225740a7b
Use stable functions (#9369) 2021-01-05 03:58:26 -05:00
Julien Plu
ef2d4cd445
Fix tf2.4 (#9120)
* Fix tests for TF 2.4

* Remove <2.4 limitation

* Add version condition

* Update tests/test_optimization_tf.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_optimization_tf.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_optimization_tf.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-12-15 10:10:46 -05:00
Julien Plu
df3f4d2aef
Fix T5 and BART for TF (#9063)
* Fix T5 for graphe compilation+execution

* Fix BART

* Fix import

* Fix naming

* fix attribute name

* Oops

* fix import

* fix tests

* fix tests

* Update test

* Add mising import

* Address Patrick's comments

* Style

* Address Patrick's comment
2020-12-14 18:47:00 +01:00
Julien Plu
51d9c569fa
Fix embeddings resizing in TF models (#8657)
* Resize the biases in same time than the embeddings

* Trigger CI

* Biases are not reset anymore

* Remove get_output_embeddings + better LM model detection in generation utils

* Apply style

* First test on BERT

* Update docstring + new name

* Apply the new resizing logic to all the models

* fix tests

* Apply style

* Update the template

* Fix naming

* Fix naming

* Apply style

* Apply style

* Remove unused import

* Revert get_output_embeddings

* Trigger CI

* Update num parameters

* Restore get_output_embeddings in TFPretrainedModel and add comments

* Style

* Add decoder resizing

* Style

* Fix tests

* Separate bias and decoder resize

* Fix tests

* Fix tests

* Apply style

* Add bias resizing in MPNet

* Trigger CI

* Apply style
2020-12-13 23:05:24 -05:00
Patrick von Platen
06971ac4f9
[Bart] Refactor - fix issues, consistency with the library, naming (#8900)
* remove make on the fly linear embedding

* start refactor

* big first refactor

* save intermediate

* save intermediat

* correct mask issue

* save tests

* refactor padding masks

* make all tests pass

* further refactor

* make pegasus test pass

* fix bool if

* fix leftover tests

* continue

* bart renaming

* delete torchscript test hack

* fix imports in tests

* correct shift

* fix docs and repo cons

* re-add fix for FSTM

* typo in test

* fix typo

* fix another typo

* continue

* hot fix 2 for tf

* small fixes

* refactor types linting

* continue

* finish refactor

* fix import in tests

* better bart names

* further refactor and add test

* delete hack

* apply sylvains and lysandres commens

* small perf improv

* further perf improv

* improv perf

* fix typo

* make style

* small perf improv
2020-12-09 20:55:24 +01:00
Julien Plu
29d4992453
New TF model inputs (#8602)
* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add input processing for TF Flaubert

* Add deprecated arguments

* Add input processing to TF XLM

* remove unused imports

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Bug fix

* Retry to bugfix

* Retry bug fix

* Fix wrong model name

* Try another fix

* Fix BART

* Fix input precessing

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Bug fix

* Address Patrick's comments

* Address Patrick's comments

* Address Sylvain's comments

* Add the new inputs in new Longformer models

* Update the template with the new input processing

* Remove useless assert

* Apply style

* Trigger CI
2020-11-24 13:55:00 -05:00
Sylvain Gugger
1073a2bde5
Switch return_dict to True by default. (#8530)
* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests
2020-11-16 11:43:00 -05:00
Julien Plu
24184e73c4
Rework some TF tests (#8492)
* Update some tests

* Small update

* Apply style

* Use max_position_embeddings

* Create a fake attribute

* Create a fake attribute

* Update wrong name

* Wrong TransfoXL model file

* Keep the common tests agnostic
2020-11-13 17:07:17 -05:00
Julien Plu
5d80539488
Add pretraining loss computation for TF Bert pretraining (#8470)
* Add pretraining loss computation for TF Bert pretraining

* Fix labels creation

* Fix T5 model

* restore T5 kwargs

* try a generic fix for pretraining models

* Apply style

* Overide the prepare method for the BERT tests
2020-11-12 14:08:26 -05:00
Julien Plu
da842e4e72
Add next sentence prediction loss computation (#8462)
* Add next sentence prediction loss computation

* Apply style

* Fix tests

* Add forgotten import

* Add forgotten import

* Use a new parameter

* Remove kwargs and use positional arguments
2020-11-11 15:02:06 +01:00
Guillaume Filion
27b402cab0
Output global_attentions in Longformer models (#7562)
* Output global_attentions in Longformer models

* make style

* small refactoring

* fix tests

* make fix-copies

* add for tf as well

* remove comments in test

* make fix-copies

* make style

* add docs

* make docstring pretty

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-11-05 21:10:43 +01:00
Lysandre Debut
10f8c63620
Ci test tf super slow (#8007)
* Test TF GPU CI

* Change cache

* Fix missing torch requirement

* Fix some model tests


Style

* LXMERT

* MobileBERT

* Longformer skip test

* XLNet

* The rest of the tests

* RAG goes OOM in multi gpu setup

* YAML test files

* Last fixes

* Skip doctests

* Fill mask tests

* Yaml files

* Last test fix

* Style

* Update cache

* Change ONNX tests to slow + use tiny model
2020-10-30 10:25:48 -04:00
Thomas Wolf
3a40cdf58d
[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970)
* WIP refactoring pipeline tests - switching to fast tokenizers

* fix dialog pipeline and fill-mask

* refactoring pipeline tests backbone

* make large tests slow

* fix tests (tf Bart inactive for now)

* fix doc...

* clean up for merge

* fixing tests - remove bart from summarization until there is TF

* fix quality and RAG

* Add new translation pipeline tests - fix JAX tests

* only slow for dialog

* Fixing the missing TF-BART imports in modeling_tf_auto

* spin out pipeline tests in separate CI job

* adding pipeline test to CI YAML

* add slow pipeline tests

* speed up tf and pt join test to avoid redoing all the standalone pt and tf tests

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update src/transformers/pipelines.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/testing_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add require_torch and require_tf in is_pt_tf_cross_test

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-23 15:58:19 +02:00
Sam Shleifer
829842159e
Add TFBartForConditionalGeneration (#5411)
* half done

* doc improvement

* Cp test file

* brokedn

* broken test

* undo some mess

* ckpt

* borked

* Halfway

* 6 passing

* boom boom

* Much progress but still 6

* boom boom

* merged master

* 10 passing

* boom boom

* Style

* no t5 changes

* 13 passing

* Integration test failing, but not gibberish

* Frustrated

* Merged master

* 4 fail

* 4 fail

* fix return_dict

* boom boom

* Still only 4

* prepare method

* prepare method

* before delete classif

* Skip tests to avoid adding boilerplate

* boom boom

* fast tests passing

* style

* boom boom

* Switch to supporting many input types

* remove FIXMENORM

* working

* Fixed past_key_values/decoder_cached_states confusion

* new broken test

* Fix attention mask kwarg name

* undo accidental

* Style and reviewers

* style

* Docs and common tests

* Cleaner assert messages

* copy docs

* style issues

* Sphinx fix

* Simplify caching logic

* test does not require torch

* copy _NoLayerEmbedTokens

* Update src/transformers/modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update tests/test_modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Line length and dont document None

* Add pipeline test coverage

* assert msg

* At parity

* Assert messages

* mark slow

* Update compile test

* back in init

* Merge master

* Fix tests

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-21 13:10:16 +02:00
Patrick von Platen
62f5ae68ec
[Seq2Seq] Fix a couple of bugs and clean examples (#7474)
* clean T5

* fix t5 tests

* fix index typo

* fix tf common test

* fix examples

* change positional ordering for Bart and FSTM

* add signature test

* clean docs and add tests

* add docs to encoder decoder

* clean docs

* correct two doc strings

* remove sig test for TF Elektra & Funnel

* fix tf t5 slow tests

* fix input_ids to inputs in tf

* Update src/transformers/modeling_bart.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_bart.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* implement lysandre results

* make style

* fix encoder decoder typo

* fix tf slow tests

* fix slow tests

* renaming

* remove unused input

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-01 17:38:50 +02:00
Julien Plu
324f361e91
Fix saving TF custom models (#7291)
* Fix #7277

* Apply style

* Add a full training pipeline test

* Apply style
2020-09-22 09:31:13 -04:00
Sylvain Gugger
15a189049e
Add TF Funnel Transformer (#7029)
* Add TF Funnel Transformer

* Proper dummy input

* Formatting

* Update src/transformers/modeling_tf_funnel.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* One review comment forgotten

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-10 10:41:56 -04:00
Puneetha Pai
4ebb52afdb
test_tf_common: remove un_used mixin class parameters (#6866) 2020-09-02 10:54:40 -04:00
Julien Chaumond
3242e4d942 [model_cards] Fix tiny typos 2020-08-26 23:16:06 +02:00
Patrick von Platen
858b7d5873
[TF Longformer] Improve Speed for TF Longformer (#6447)
* add tf graph compile tests

* fix conflict

* remove more tf transpose statements

* fix conflicts

* fix comment typos

* move function to class function

* fix black

* fix black

* make style
2020-08-26 14:55:41 -04:00
Lysandre
a75c64d80c Black 20 release 2020-08-26 17:20:22 +02:00
Sylvain Gugger
a573777901
Update repo to isort v5 (#6686)
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
2020-08-24 11:03:01 -04:00
sgugger
86c07e634f One last threshold to raise 2020-08-20 14:23:09 -04:00
Sylvain Gugger
e8af90c052
Move threshold up for flaky test with Electra (#6622)
* Move threshold up for flaky test with Electra

* Update above as well
2020-08-20 13:59:40 -04:00
Lysandre Debut
f7cbc13db7
Test model outputs equivalence (#6445)
* Test model outputs equivalence

* Fix failing tests

* From dict to kwargs

* DistilBERT

* Addressing @sgugger and @patrickvonplaten's comments
2020-08-13 11:59:35 -04:00
Sylvain Gugger
c67d1a0259
Tf model outputs (#6247)
* TF outputs and test on BERT

* Albert to DistilBert

* All remaining TF models except T5

* Documentation

* One file forgotten

* TF outputs and test on BERT

* Albert to DistilBert

* All remaining TF models except T5

* Documentation

* One file forgotten

* Add new models and fix issues

* Quality improvements

* Add T5

* A bit of cleanup

* Fix for slow tests

* Style
2020-08-05 11:34:39 -04:00
Julien Plu
9996f697e3
Fix saved model creation (#5468)
* Fix TF Serving when output_hidden_states and output_attentions are True

* Add tests for saved model creation + bug fix for multiple choices models

* remove unused import

* Fix the input for several layers

* Fix test

* Fix conflict printing

* Apply style

* Fix XLM and Flaubert for TensorFlow

* Apply style

* Fix TF check version

* Apply style

* Trigger CI
2020-08-03 08:10:40 -04:00
Lysandre Debut
3f94170a10
[WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests

* AutoModels


Tiny tweaks

* Style

* Final changes before merge

* Re-order for simpler review

* Final fixes

* Addressing @sgugger's comments

* Test MultipleChoice
2020-07-29 14:26:26 -04:00
Sylvain Gugger
edfd82f5ff
Change model outputs types to self-document outputs (#5438)
* [WIP] Proposal for model outputs

* All Bert models

* Make CI green maybe?

* Fix ONNX test

* Isolate ModelOutput from pt and tf

* Formatting

* Add Electra models

* Auto-generate docstrings from outputs

* Add TF outputs

* Add some BERT models

* Revert TF side

* Remove last traces of TF changes

* Fail with a clear error message

* Add Albert and work through Bart

* Add CTRL and DistilBert

* Formatting

* Progress on Bart

* Renames and finish Bart

* Formatting

* Fix last test

* Add DPR

* Finish Electra and add FlauBERT

* Add GPT2

* Add Longformer

* Add MMBT

* Add MobileBert

* Add GPT

* Formatting

* Add Reformer

* Add Roberta

* Add T5

* Add Transformer XL

* Fix test

* Add XLM + fix XLMForTokenClassification

* Style + XLMRoberta

* Add XLNet

* Formatting

* Add doc of return_tuple arg
2020-07-10 11:36:53 -04:00
Patrick von Platen
4dc65591b5
[Almost all TF models] TF clean up: add missing CLM / MLM loss; fix T5 naming and keras compile (#5395)
* add first version of clm tf

* make style

* add more tests for bert

* update tf clm loss

* fix tests

* correct tf ner script

* add mlm loss

* delete bogus file

* clean tf auto model + add tests

* finish adding clm loss everywhere

* fix training in distilbert

* fix flake8

* save intermediate

* fix tf t5 naming

* remove prints

* finish up

* up

* fix tf gpt2

* fix new test utils import

* fix flake8

* keep backward compatibility

* Update src/transformers/modeling_tf_albert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_electra.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_roberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_mobilebert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_distilbert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply sylvains suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-07-07 18:15:53 +02:00
Sam Shleifer
13deb95a40
Move tests/utils.py -> transformers/testing_utils.py (#5350) 2020-07-01 10:31:17 -04:00
Lysandre Debut
cf10d4cfdd
Cleaning TensorFlow models (#5229)
* Cleaning TensorFlow models

Update all classes


stylr

* Don't average loss
2020-06-24 11:37:20 -04:00
Patrick von Platen
c2a26ec8a6
[Use cache] Align logic of use_cache with output_attentions and output_hidden_states (#5194)
* fix use cache

* add bart use cache

* fix bart

* finish bart
2020-06-24 16:09:17 +02:00
Joseph Liu
f4e1f02210
Output hidden states (#4978)
* Configure all models to use output_hidden_states as argument passed to foward()

* Pass all tests

* Remove cast_bool_to_primitive in TF Flaubert model

* correct tf xlnet

* add pytorch test

* add tf test

* Fix broken tests

* Configure all models to use output_hidden_states as argument passed to foward()

* Pass all tests

* Remove cast_bool_to_primitive in TF Flaubert model

* correct tf xlnet

* add pytorch test

* add tf test

* Fix broken tests

* Refactor output_hidden_states for mobilebert

* Reset and remerge to master

Co-authored-by: Joseph Liu <joseph.liu@coinflex.com>
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-06-22 10:10:45 -04:00
Deniz
32e94cff64
tf add resize_token_embeddings method (#4351)
* resize token embeddings

* add tokens

* add tokens

* add tokens

* add t5 token method

* add t5 token method

* add t5 token method

* typo

* debugging input

* debugging input

* debug

* debug

* debug

* trying to set embedding tokens properly

* set embeddings for generation head too

* set embeddings for generation head too

* debugging

* debugging

* enable generation

* add base method

* add base method

* add base method

* return logits in the main call

* reverting to generation

* revert back

* set embeddings for the bert main layer

* description

* fix conflicts

* logging

* set base model as self

* refactor

* tf_bert add method

* tf_bert add method

* tf_bert add method

* tf_bert add method

* tf_bert add method

* tf_bert add method

* tf_bert add method

* tf_bert add method

* v0

* v0

* finalize

* final

* black

* add tests

* revert back the emb call

* comments

* comments

* add the second test

* add vocab size condig

* add tf models

* add tf models. add common tests

* remove model specific embedding tests

* stylish

* remove files

* stylez

* Update src/transformers/modeling_tf_transfo_xl.py

change the error.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* adding unchanged weight test

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-18 18:41:26 -04:00
Sylvain Gugger
20451195f0
Support multiple choice in tf common model tests (#4920)
* Support multiple choice in tf common model tests

* Add the input_embeds test
2020-06-11 10:31:26 -04:00
Bharat Raghunathan
6e603cb789
[All models] Extend config.output_attentions with output_attentions function arguments (#4538)
* DOC: Replace instances of ``config.output_attentions`` with function argument ``output_attentions``

* DOC: Apply Black Formatting

* Fix errors where output_attentions was undefined

* Remove output_attentions in classes per review

* Fix regressions on tests having `output_attention`

* Fix further regressions in tests relating to `output_attentions`

Ensure proper propagation of `output_attentions` as a function parameter
to all model subclasses

* Fix more regressions in `test_output_attentions`

* Fix issues with BertEncoder

* Rename related variables to `output_attentions`

* fix pytorch tests

* fix bert and gpt2 tf

* Fix most TF tests for `test_output_attentions`

* Fix linter errors and more TF tests

* fix conflicts

* DOC: Apply Black Formatting

* Fix errors where output_attentions was undefined

* Remove output_attentions in classes per review

* Fix regressions on tests having `output_attention`

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* fix pytorch tests

* fix conflicts

* fix conflicts

* Fix linter errors and more TF tests

* fix tf tests

* make style

* fix isort

* improve output_attentions

* improve tensorflow

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-06-09 23:39:06 +02:00
Julien Plu
f9414f7553
Tensorflow improvements (#4530)
* Better None gradients handling

* Apply Style

* Apply Style

* Create a loss class per task to compute its respective loss

* Add loss classes to the ALBERT TF models

* Add loss classes to the BERT TF models

* Add question answering and multiple choice to TF Camembert

* Remove prints

* Add multiple choice model to TF DistilBERT + loss computation

* Add question answering model to TF Electra + loss computation

* Add token classification, question answering and multiple choice models to TF Flaubert

* Add multiple choice model to TF Roberta + loss computation

* Add multiple choice model to TF XLM + loss computation

* Add multiple choice and question answering models to TF XLM-Roberta

* Add multiple choice model to TF XLNet + loss computation

* Remove unused parameters

* Add task loss classes

* Reorder TF imports + add new model classes

* Add new model classes

* Bugfix in TF T5 model

* Bugfix for TF T5 tests

* Bugfix in TF T5 model

* Fix TF T5 model tests

* Fix T5 tests + some renaming

* Fix inheritance issue in the AutoX tests

* Add tests for TF Flaubert and TF XLM Roberta

* Add tests for TF Flaubert and TF XLM Roberta

* Remove unused piece of code in the TF trainer

* bugfix and remove unused code

* Bugfix for TF 2.2

* Apply Style

* Divide TFSequenceClassificationAndMultipleChoiceLoss into their two respective name

* Apply style

* Mirror the PT Trainer in the TF one: fp16, optimizers and tb_writer as class parameter and better dataset handling

* Fix TF optimizations tests and apply style

* Remove useless parameter

* Bugfix and apply style

* Fix TF Trainer prediction

* Now the TF models return the loss such as their PyTorch couterparts

* Apply Style

* Ignore some tests output

* Take into account the SQuAD cls_index, p_mask and is_impossible parameters for the QuestionAnswering task models.

* Fix names for SQuAD data

* Apply Style

* Fix conflicts with 2.11 release

* Fix conflicts with 2.11

* Fix wrongname

* Add better documentation on the new create_optimizer function

* Fix isort

* logging_dir: use same default as PyTorch

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-06-04 19:45:53 -04:00
Patrick von Platen
01c37dcdb5
[Config, Caching] Remove output_past everywhere and replace by use_cache argument (#3734)
* remove output_past from pt

* make style

* add optional input length for gpt2

* add use cache to prepare input

* save memory in gpt2

* correct gpt2 test inputs

* make past input optional for gpt2

* finish use_cache for all models

* make style

* delete modeling_gpt2 change in test file

* correct docstring

* correct is true statements for gpt2
2020-04-14 14:40:28 -04:00
Patrick von Platen
2ee410560e
[Generate, Test] Split generate test function into beam search, no beam search (#3601)
* split beam search and no beam search test

* fix test

* clean generate tests
2020-04-06 10:37:05 +02:00
Patrick von Platen
a4ee4da18a
[T5, TF 2.2] change tf t5 argument naming (#3547)
* change tf t5 argument naming for TF 2.2

* correct bug in testing
2020-04-01 22:04:20 +02:00
Patrick von Platen
b38d552a92
[Generate] Add bad words list argument to the generate function (#3367)
* add bad words list

* make style

* add bad_words_tokens

* make style

* better naming

* make style

* fix typo
2020-03-31 18:42:31 +02:00
Patrick von Platen
bbf26c4e61
Support T5 Generation (#3228)
* fix conflicts

* update bart max length test

* correct spelling mistakes

* implemented model specific encode function

* fix merge conflicts

* better naming

* save intermediate state -> need to rethink strucuture a bit

* leave tf problem as it is for now

* current version

* add layers.pop

* remove ipdb

* make style

* clean return cut decoding

* remove ipdbs

* Fix restoring layers in the decoders that doesnt exists.

* push good intermediate solution for now

* fix conflicts

* always good to refuse to merge conflicts when rebasing

* fix small bug

* improve function calls

* remove unused file

* add correct scope behavior for t5_generate

Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
2020-03-19 23:18:23 +01:00
Patrick von Platen
292186a3e7
Adding LM Head to Transfo-XL and first step to fixing problem with Adaptive Embeddings in TransfoXL (#3286)
* first commit

* work in progress

* make language generation task pass

* update to working version for LM

* delete print

* remove dead code

* make style
2020-03-18 09:24:27 -04:00
Patrick von Platen
e8f44af5bf
[generate] do_sample default back to False (#3298)
* change do_samples back

* None better default as boolean

* adapt do_sample to True in test example

* make style
2020-03-17 10:52:37 -04:00
Thomas Wolf
9499a3778e
Merge pull request #3103 from gthb/keras-serialization
Support keras JSON/HDF5 serialization of main layers
2020-03-06 12:59:13 +01:00
patrickvonplaten
61fef6e957 added beam_search generation for tf 2.0 2020-03-04 17:27:47 +01:00
Gunnlaugur Thor Briem
96c4990165 fix unused imports and style 2020-03-03 22:57:05 +00:00
Gunnlaugur Thor Briem
470753bcf5 Put @keras_serializable only on layers it works on
And only run the test on TF*MainLayer classes so marked.
2020-03-03 22:44:45 +00:00
Gunnlaugur Thor Briem
0c716ede8c Use class decorator instead of superclass
When supplied by Keras deserialization, the config parameter to initializers
will be a dict. So intercept it and convert to PretrainedConfig object (and
store in instance attribute for get_config to get at it) before passing to the
actual initializer. To accomplish this, and repeat as little code as possible,
use a class decorator on TF*MainLayer classes.
2020-03-03 22:31:42 +00:00
Gunnlaugur Thor Briem
b8da16f390 Add (failing) tests for Keras save/load 2020-03-03 15:22:34 +00:00
Patrick von Platen
4134100363
Add generate() functionality to TF 2.0 (#3063)
* add first copy past test to tf 2 generate

* add tf top_k_top_p_filter fn

* add generate function for TF

* add generate function for TF

* implemented generate for all models expect transfoXL

* implemented generate for all models expect transfoXL

* implemented generate for all models expect transfoXL

* make style

* change permission of test file to correct ones

* delete ipdb

* delete ipdb

* fix bug and finish simple gpt2 integration test

* clean test file

* clean test file

* make style

* make style

* make style

* make style

* change import style

* change import style

* make style

* make style

* add decorators

* add decorators

* fix tf ctrl bug dim => axis in TF

* make style

* make style

* refactored test file

* refactored test file

* take out test_torch_tf_conversion if nothing is defined

* take out test_torch_tf_conversion if nothing is defined

* remove useless files

* remove useless files

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* fix conflicts

* solve conflicts

* solve conflicts

* fix conflicts

* fix conflicts

* merge conflicts

* delete ipdb

* exposed top_k_top_p_filtering fns

* delete weirdly created w! file

* add comment to test tf common modeling

* fix conflicts

* fix conflicts

* make style

* merge conflicts

* make style

* change tf.tensor.shape to shape_list(tensor)
2020-03-03 09:42:15 -05:00
Julien Chaumond
f169957d0c
TF GPU CI (#3085)
* debug env

* Restrict TF GPU memory

* Fixup

* One more test

* rm debug logs

* Fixup
2020-03-02 15:45:25 -05:00
Lysandre
ea2600bd5f Absolute definitive HeisenDistilBug solve
cc @julien-c @thomwolf
2020-01-27 21:58:36 -05:00
Lysandre
875c4ae48f Definitive HeisenDistilBug fix
cc @julien-c @@thomwolf
2020-01-27 12:09:58 -05:00
alberduris
81d6841b4b GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
Julien Chaumond
594ca6dead [debug] Debug Heisenbug, the old school way. 2019-12-29 10:07:21 -05:00
Aymeric Augustin
e6c0019c80 Remove unused variables in tests. 2019-12-23 22:38:18 +01:00
Aymeric Augustin
798b3b3899 Remove sys.version_info[0] == 2 or 3. 2019-12-22 18:38:42 +01:00
Aymeric Augustin
c824d15aa1 Remove __future__ imports. 2019-12-22 17:47:54 +01:00
Aymeric Augustin
345c23a60f Replace (TF)CommonTestCases for modeling with a mixin.
I suspect the wrapper classes were created in order to prevent the
abstract base class (TF)CommonModelTester from being included in test
discovery and running, because that would fail.

I solved this by replacing the abstract base class with a mixin.

Code changes are just de-indenting and automatic reformattings
performed by black to use the extra line space.
2019-12-22 15:35:18 +01:00
Aymeric Augustin
7e98e211f0 Remove unittest.main() in test modules.
This construct isn't used anymore these days.

Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.

Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin
ced0a94204 Switch test files to the standard test_*.py scheme. 2019-12-22 14:15:13 +01:00