Julien Plu
bd701ab1a0
Fix template ( #9840 )
2021-01-27 07:40:30 -05:00
Julien Plu
4adbdce5ee
Clean TF Bert ( #9788 )
...
* Start cleaning BERT
* Clean BERT and all the models that depend on it
* Fix attribute name
* Apply style
* Apply Sylvain's comments
* Apply Lysandre's comments
* remove unused import
2021-01-27 11:28:11 +01:00
Julien Plu
a1720694a5
Remove a TF usage warning and rework the documentation ( #9756 )
...
* Rework documentation
* Update the template
* Trigger CI
* Restore the warning but with the TF logger
* Update convbert doc
2021-01-27 10:45:42 +01:00
Lysandre
897a24c869
Fix head_mask for model templates
2021-01-26 11:02:48 +01:00
Julien Plu
a7dabfb3d1
Fix TF s2s models ( #9478 )
...
* Fix Seq2Seq models for serving
* Apply style
* Fix Longformer
* Fix mBart/Pegasus/Blenderbot
* Apply style
* Add a main intermediate layer
* Apply style
* Remove import
* Apply tf.function to Longformer
* Fix utils check_copy
* Update S2S template
* Fix BART + Blenderbot
* Fix BlenderbotSmall
* Fix BlenderbotSmall
* Fix BlenderbotSmall
* Fix MBart
* Fix Marian
* Fix Pegasus + template
* Apply style
* Fix common attributes test
* Forgot to fix the LED test
* Apply Patrick's comment on LED Decoder
2021-01-21 17:03:29 +01:00
Julien Plu
3f290e6c84
Fix mixed precision in TF models ( #9163 )
...
* Fix Gelu precision
* Fix gelu_fast
* Naming
* Fix usage and apply style
* add TF gelu approximate version
* add TF gelu approximate version
* add TF gelu approximate version
* Apply style
* Fix albert
* Remove the usage of the Activation layer
2021-01-21 07:00:11 -05:00
Julien Plu
7251a4736d
Fix template ( #9697 )
2021-01-20 09:04:53 -05:00
Julien Plu
14042d560f
New TF embeddings (cleaner and faster) ( #9418 )
...
* Create new embeddings + add to BERT
* Add Albert
* Add DistilBert
* Add Albert + Electra + Funnel
* Add Longformer + Lxmert
* Add last models
* Apply style
* Update the template
* Remove unused imports
* Rename attribute
* Import embeddings in their own model file
* Replace word_embeddings with weight
* fix naming
* Fix Albert
* Fix Albert
* Fix Longformer
* Fix Lxmert Mobilebert and MPNet
* Fix copy
* Fix template
* Update the get weights function
* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/electra/modeling_tf_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* address Sylvain's comments
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-20 12:08:12 +01:00
Sylvain Gugger
7e662e6a3b
Fix model templates and use less than 119 chars ( #9684 )
...
* Fix model templates and use less than 119 chars
* Missing new line
2021-01-19 17:11:22 -05:00
Yusuke Mori
b020a736c3
Update past_key_values in GPT-2 ( #9596 )
...
* Update past_key_values in gpt2 (#9391 )
* Update generation_utils, and rename some items
* Update modeling_gpt2 to avoid an error in gradient_checkpointing
* Remove 'reorder_cache' from util and add variations to XLNet, TransfoXL, GPT-2
* Change the location of '_reorder_cache' in modeling files
* Add '_reorder_cache' in modeling_ctrl
* Fix a bug of my last commit in CTRL
* Add '_reorder_cache' to GPT2DoubleHeadsModel
* Manage 'use_cache' in config of test_modeling_gpt2
* Clean up the doc string
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix the doc string (GPT-2, CTRL)
* improve gradient_checkpointing_behavior
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-19 16:00:15 +01:00
Patrick von Platen
7f28613213
[TFBart] Split TF-Bart ( #9497 )
...
* make templates ready
* make add_new_model_command_ready
* finish tf bart
* prepare tf mbart
* finish tf bart
* add tf mbart
* add marian
* prep pegasus
* add tf pegasus
* push blenderbot tf
* add blenderbot
* add blenderbot small
* clean-up
* make fix copy
* define blend bot tok
* fix
* up
* make style
* add to docs
* add copy statements
* overwrite changes
* improve
* fix docs
* finish
* fix last slow test
* fix missing git conflict line
* fix blenderbot
* up
* fix blenderbot small
* load changes
* finish copied from
* upload fix
2021-01-12 02:06:32 +01:00
Julien Plu
1e3c362235
Fix template ( #9512 )
2021-01-11 08:03:28 -05:00
Julien Plu
1243ee7d0c
Full rework of the TF input/output embeddings and bias resizing ( #9193 )
...
* Start rework resizing
* Rework bias/decoder resizing
* Full resizing rework
* Full resizing rework
* Start to update the models with the new approach
* Finish updating the models
* Update all the tests
* Update the template
* Fix tests
* Fix tests
* Test a new approach
* Refactoring
* Refactoring
* Refactoring
* New rework
* Rework BART
* Rework bert+blenderbot
* Rework CTRL
* Rework Distilbert
* Rework DPR
* Rework Electra
* Rework Flaubert
* Rework Funnel
* Rework GPT2
* Rework Longformer
* Rework Lxmert
* Rework marian+mbart
* Rework mobilebert
* Rework mpnet
* Rework openai
* Rework pegasus
* Rework Roberta
* Rework T5
* Rework xlm+xlnet
* Rework template
* Fix TFT5EncoderOnly + DPRs
* Restore previous methods
* Fix Funnel
* Fix CTRL and TransfoXL
* Apply style
* Apply Sylvain's comments
* Restore a test in DPR
* Address the comments
* Fix bug
* Apply style
* remove unused import
* Fix test
* Forgot a method
* missing test
* Trigger CI
* naming update
* Rebase
* Trigger CI
2021-01-11 06:27:28 -05:00
Julien Plu
cf416764f4
Fix template ( #9504 )
2021-01-11 05:21:25 -05:00
Julien Plu
4f7022d68d
Reformat ( #9482 )
2021-01-10 15:10:15 +01:00
Sylvain Gugger
1bdf42409c
Fast imports part 3 ( #9474 )
...
* New intermediate inits
* Update template
* Avoid importing torch/tf/flax in tokenization unless necessary
* Styling
* Shutup flake8
* Better python version check
2021-01-08 07:40:59 -05:00
Sylvain Gugger
758ed3332b
Transformers fast import part 2 ( #9446 )
...
* Main init work
* Add version
* Change from absolute to relative imports
* Fix imports
* One more typo
* More typos
* Styling
* Make quality script pass
* Add necessary replace in template
* Fix typos
* Spaces are ignored in replace for some reason
* Forgot one model.
* Fixes for import
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
* Add documentation
* Styling
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2021-01-07 09:36:14 -05:00
Julien Plu
812045adcc
New serving ( #9419 )
...
* Add a serving method
* Add albert
* Add serving for BERT and BART
* Add more models
* Finish the serving addition
* Temp fix
* Restore DPR
* Fix funnel attribute
* Fix attributes GPT2
* Fix OpenAIGPT attribute
* Fix T5 attributes
* Fix Bart attributes
* Fix TransfoXL attributes
* Add versioning
* better test
* Update template
* Fix Flaubert
* Fix T5
* Apply style
* Remove unused imports
* Deactivate extra parameters
* Remove too long test + saved_model default to False
* Ignore the saved model test for some models
* Fix some inputs
* Fix mpnet serving
* Trigger CI
* Address all comments
2021-01-07 11:48:49 +01:00
Patrick von Platen
b8462b5b2a
[GenerationOutputs] Fix GenerationOutputs Tests ( #9443 )
...
* fix generation models
* fix led
* fix docs
* add is_decoder
* fix last docstrings
* make style
* fix t5 cross attentions
* correct t5
2021-01-06 19:37:02 +01:00
Patrick von Platen
eef66035a2
[PyTorch Bart] Split Bart into different models ( #9343 )
...
* first try
* remove old template
* finish bart
* finish mbart
* delete unnecessary line
* init pegasus
* save intermediate
* correct pegasus
* finish pegasus
* remove cookie cutter leftover
* add marian
* finish blenderbot
* replace in file
* correctly split blenderbot
* delete "old" folder
* correct "add statement"
* adapt config for tf comp
* correct configs for tf
* remove ipdb
* fix more stuff
* fix mbart
* push pegasus fix
* fix mbart
* more fixes
* fix research projects code
* finish docs for bart, mbart, and marian
* delete unnecessary file
* correct attn typo
* correct configs
* remove pegasus for seq class
* correct peg docs
* correct peg docs
* finish configs
* further improve docs
* add copied from statements to mbart
* fix copied from in mbart
* add copy statements to marian
* add copied from to marian
* add pegasus copied from
* finish pegasus
* finish copied from
* Apply suggestions from code review
* make style
* backward comp blenderbot
* apply lysandres and sylvains suggestions
* apply suggestions
* push last fixes
* fix docs
* fix tok tests
* fix imports code style
* fix doc
2021-01-05 22:00:05 +01:00
Patrick von Platen
912f6881d2
add import math ( #9346 )
2020-12-29 19:35:06 +01:00
Patrick von Platen
785e52cd30
improve templates ( #9342 )
2020-12-29 16:48:44 +01:00
Patrick von Platen
83fdd252f6
[Seq2Seq Templates] Correct some TF-serving errors and add gradient checkpointing to PT by default. ( #9334 )
...
* correct tests
* correct shape and get_tf_activation
* more correction tf
* add gradient checkpointing to templates
* correct typo
2020-12-28 17:51:04 +01:00
Patrick von Platen
6c091abef2
[Templates] Adapt Bert ( #9284 )
...
* adapt templates
* adapt config
* add test as well
* fix output type
* fix cache false naming
* finish tests
* last fix
2020-12-24 01:44:33 +01:00
Patrick von Platen
d5db6c37d4
[Seq2Seq Templates] Fix check_repo.py templates file ( #9277 )
...
* add enc dec pt model to check repo
* fix indent
2020-12-23 11:40:20 +01:00
Patrick von Platen
cbe63949d7
Model Templates for Seq2Seq ( #9251 )
...
* adapt cookie cutter
* fix copy paste statement
* delete copy statements for now
* remove unused import from template
* make doc rst
* correct config docstring
* correct training
* correct inputs processing tf enc dec
* make style
* adapt templates
* clean tabs
* correct tensor -> Tensor naming
* correct indent
* correct templates
* fix the test
* break lines to avoid > 119
* Apply suggestions from code review
2020-12-22 23:41:20 +01:00
Julien Plu
161a6461db
Fix TF template ( #9234 )
2020-12-21 13:52:16 +01:00
Julien Plu
5a8a4eb187
Improve BERT-like models performance with better self attention ( #9124 )
...
* Improve BERT-like models attention layers
* Apply style
* Put back error raising instead of assert
* Update template
* Fix copies
* Apply raising valueerror in MPNet
* Restore the copy check for the Intermediate layer in Longformer
* Update longformer
2020-12-21 13:10:15 +01:00
Lysandre Debut
6587cf9f84
Patch *ForCausalLM model ( #9092 )
2020-12-14 00:39:55 -05:00
Julien Plu
51d9c569fa
Fix embeddings resizing in TF models ( #8657 )
...
* Resize the biases at the same time as the embeddings
* Trigger CI
* Biases are not reset anymore
* Remove get_output_embeddings + better LM model detection in generation utils
* Apply style
* First test on BERT
* Update docstring + new name
* Apply the new resizing logic to all the models
* fix tests
* Apply style
* Update the template
* Fix naming
* Fix naming
* Apply style
* Apply style
* Remove unused import
* Revert get_output_embeddings
* Trigger CI
* Update num parameters
* Restore get_output_embeddings in TFPretrainedModel and add comments
* Style
* Add decoder resizing
* Style
* Fix tests
* Separate bias and decoder resize
* Fix tests
* Fix tests
* Apply style
* Add bias resizing in MPNet
* Trigger CI
* Apply style
2020-12-13 23:05:24 -05:00
Lysandre Debut
67ff1c314a
Templates overhaul 1 ( #8993 )
2020-12-08 18:00:07 -05:00
Sylvain Gugger
00aa9dbca2
Copyright ( #8970 )
...
* Add copyright everywhere missing
* Style
2020-12-07 18:36:34 -05:00
Julien Plu
dcd3046f98
Better booleans handling in the TF models ( #8777 )
...
* Apply on BERT and ALBERT
* Update TF Bart
* Add input processing to TF BART
* Add input processing for TF CTRL
* Add input processing to TF Distilbert
* Add input processing to TF DPR
* Add input processing to TF Electra
* Add deprecated arguments
* Add input processing to TF XLM
* Add input processing to TF Funnel
* Add input processing to TF GPT2
* Add input processing to TF Longformer
* Add input processing to TF Lxmert
* Apply style
* Add input processing to TF Mobilebert
* Add input processing to TF GPT
* Add input processing to TF Roberta
* Add input processing to TF T5
* Add input processing to TF TransfoXL
* Apply style
* Rebase on master
* Bug fix
* Retry to bugfix
* Retry bug fix
* Fix wrong model name
* Try another fix
* Fix BART
* Fix input processing
* Apply style
* Put the deprecated warnings in the input processing function
* Remove the unused imports
* Raise an error when len(kwargs)>0
* test ModelOutput instead of TFBaseModelOutput
* Bug fix
* Address Patrick's comments
* Address Patrick's comments
* Address Sylvain's comments
* Add boolean processing for the inputs
* Apply style
* Missing optional
* Fix some missing input processing
* Update the template
* Fix missing inputs
* Missing input
* Fix args parameter
* Trigger CI
* Trigger CI
* Trigger CI
* Address Patrick's and Sylvain's comments
* Replace warn by warning
* Trigger CI
* Fix XLNET
* Fix detection
2020-12-04 09:08:29 -05:00
Patrick von Platen
443f67e887
[PyTorch] Refactor Resize Token Embeddings ( #8880 )
...
* fix resize tokens
* correct mobile_bert
* move embedding fix into modeling_utils.py
* refactor
* fix lm head resize
* refactor
* break lines to make sylvain happy
* add new tests
* fix typo
* improve test
* skip bart-like for now
* check if base_model = get(...) is necessary
* clean files
* improve test
* fix tests
* revert style templates
* Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py
2020-12-02 19:19:50 +01:00
Julien Plu
29d4992453
New TF model inputs ( #8602 )
...
* Apply on BERT and ALBERT
* Update TF Bart
* Add input processing to TF BART
* Add input processing for TF CTRL
* Add input processing to TF Distilbert
* Add input processing to TF DPR
* Add input processing to TF Electra
* Add input processing for TF Flaubert
* Add deprecated arguments
* Add input processing to TF XLM
* remove unused imports
* Add input processing to TF Funnel
* Add input processing to TF GPT2
* Add input processing to TF Longformer
* Add input processing to TF Lxmert
* Apply style
* Add input processing to TF Mobilebert
* Add input processing to TF GPT
* Add input processing to TF Roberta
* Add input processing to TF T5
* Add input processing to TF TransfoXL
* Apply style
* Rebase on master
* Bug fix
* Retry to bugfix
* Retry bug fix
* Fix wrong model name
* Try another fix
* Fix BART
* Fix input processing
* Apply style
* Put the deprecated warnings in the input processing function
* Remove the unused imports
* Raise an error when len(kwargs)>0
* test ModelOutput instead of TFBaseModelOutput
* Bug fix
* Address Patrick's comments
* Address Patrick's comments
* Address Sylvain's comments
* Add the new inputs in new Longformer models
* Update the template with the new input processing
* Remove useless assert
* Apply style
* Trigger CI
2020-11-24 13:55:00 -05:00
Stas Bekman
e84786aaa6
consistent ignore keys + make private ( #8737 )
...
* consistent ignore keys + make private
* style
* - authorized_missing_keys => _keys_to_ignore_on_load_missing
- authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected
* move public doc of private attributes to private comment
2020-11-23 12:33:13 -08:00
Sylvain Gugger
dd52804f5f
Remove deprecated ( #8604 )
...
* Remove old deprecated arguments
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
* Remove needless imports
* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-17 15:11:29 -05:00
Sylvain Gugger
36a19915ea
Fix model templates ( #8595 )
...
* First fixes
* Fix imports and add init
* Fix typo
* Move init to final dest
* Fix tokenization import
* More fixes
* Styling
2020-11-17 10:35:38 -05:00
Sylvain Gugger
c89bdfbe72
Reorganize repo ( #8580 )
...
* Put models in subfolders
* Styling
* Fix imports in tests
* More fixes in test imports
* Sneaky hidden imports
* Fix imports in doc files
* More sneaky imports
* Finish fixing tests
* Fix examples
* Fix path for copies
* More fixes for examples
* Fix dummy files
* More fixes for example
* More model import fixes
* Is this why you're unhappy GitHub?
* Fix imports in convert command
2020-11-16 21:43:42 -05:00
Sylvain Gugger
1073a2bde5
Switch return_dict to True by default. ( #8530 )
...
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Use the CI to identify failing tests
* Remove from all examples and tests
* More default switch
* Fixes
* More test fixes
* More fixes
* Last fixes hopefully
* Run on the real suite
* Fix slow tests
2020-11-16 11:43:00 -05:00
Lysandre Debut
826f04576f
Model templates encoder only ( #8509 )
...
* Model templates
* TensorFlow
* Remove pooler
* CI
* Tokenizer + Refactoring
* Encoder-Decoder
* Let's go testing
* Encoder-Decoder in TF
* Let's go testing in TF
* Documentation
* README
* Fixes
* Better names
* Style
* Update docs
* Choose to skip either TF or PT
* Code quality fixes
* Add to testing suite
* Update file path
* Cookiecutter path
* Update `transformers` path
* Handle rebasing
* Remove seq2seq from model templates
* Remove s2s config
* Apply Sylvain and Patrick comments
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Last fixes from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-13 11:59:30 -05:00