Commit Graph

9385 Commits

Author SHA1 Message Date
Sanchit Gandhi
e231c72906
[FlaxSpeechEncoderDecoder] Fix feature extractor gradient test (#16407) 2022-03-25 17:46:53 +01:00
lewtun
a97f3150c4
Add ONNX support for Blenderbot and BlenderbotSmall (#15875)
* Add ONNX support for Blenderbot

* Add BlenderbotSmall ONNX configuration

* Update serialization table
2022-03-25 17:04:43 +01:00
Sylvain Gugger
b473617d63
Checkpoint sharding (#16343)
* Sharded checkpoint support

* Handle distant sharded checkpoints

* Add tests

* TODO is done

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix docstring

* Add example and format

* Address review comments

* More review comments

* End of merge

* Revert unintentional change

* VsCode what did you do?

* Style

* Changes

* Address final comments

* Quality

* Moar tests

* Move import beneath is_pt_available

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2022-03-25 11:59:25 -04:00
Matt
7fa7408b26
Terminate previous pushes when we get to the final push (#16409) 2022-03-25 15:47:05 +00:00
Sylvain Gugger
867f3950fa
Rename master to main for notebooks links and leftovers (#16397) 2022-03-25 09:12:23 -04:00
Atharva Ingle
7e7490473e
fixed typo from enable to disable in disable_progress_bar function (#16406) 2022-03-25 09:07:43 -04:00
Sylvain Gugger
088c1880b7
Big file_utils cleanup (#16396)
* Big file_utils cleanup

* This one still needs to be treated separately
2022-03-25 07:25:20 -04:00
Michael Benayoun
2b23e0801a
Make FeaturesManager.get_model_from_feature a static method (#16357) 2022-03-25 11:35:48 +01:00
NielsRogge
aa6cfe9c4b
Rename to SemanticSegmenterOutput (#15849)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-24 20:44:15 +01:00
Yi Heng Lim
70a9bc69a8
Added type hints (#16389)
* Added type hints for PyTorch T5 model

* removed a type hint

* ran make style

* added type hints for ibert pytorch

* added type hints for lxmert pytorch

* removed kwargs type hint and fixed arguments order
2022-03-24 19:14:34 +00:00
Sylvain Gugger
cae394c8fa Adapt import to new structure 2022-03-24 14:40:05 -04:00
Robot Jelly
4e0f583eea
TF - variable naming for Distilbert model (unpack_inputs decorator) (#16384)
* variable naming for Distilbert model

* adding unpack inputs at top

* make style/quality

Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-24 16:13:08 +00:00
Sylvain Gugger
3a0f1684c3
Fix readme links and add CI check (#16392)
* Fix doc links in README

* Fix name

* Fix links in READMEs and doc index

* Error if there is something wrong so the CI knows
2022-03-24 11:59:09 -04:00
Lysandre Debut
8cbd9b8fb1
Fix style (#16391) 2022-03-24 11:47:49 -04:00
Yih-Dar
9d88be5778
bump cookiecutter version (#16387)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-24 11:08:31 -04:00
Yih-Dar
f571dc20ac
Update PT Flax equivalence tests in PT test file (#16280)
* update PT/Flax equivalence tests on PT side

* overwrite check_outputs in BigBirdModelTest

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-24 14:45:30 +01:00
Zehua Li
41bfc1e262
Add type hints for ConvBert model (#16377)
* Add missing type hints for ConvBERT flavored models.

* Update src/transformers/models/convbert/modeling_convbert.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-03-24 13:23:54 +00:00
Dahlbomii
23a75a5338
Type hints and decorator for TF T5 (#16376)
* Type hints and TF decorator added

* Re-add XLA generation method

* Re-add lines that were deleted by conflicting updates

* Re-add lines that were deleted by conflicting updates

* Re-add lines that were deleted by conflicting updates

Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-24 13:19:40 +00:00
Yih-Dar
2a27c80063
Fix BigBirdModelTester (#16310)
* fix

* update the expected value in test_fast_integration

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-24 13:43:52 +01:00
Nathan Cooper
f5e8c9bdea
Update readme with how to train offline and fix BPE command (#15897)
* Update readme with how to train offline and fix BPE command

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
2022-03-24 11:00:46 +01:00
Yih-Dar
9badcecf69
[Doctests] Make TFRoberta-like meaningfull (#16370)
* update doc examples for TFRoberta

* fix style

* fix style

* use TF ckpt

* apply suggestion

* add the code file to test here

* fix style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-24 10:26:27 +01:00
Patrick von Platen
77c5a80536
[Doctests] Make roberta-like meaningfull (#16363)
* [Doctests] Make roberta-like meaningfull

* correct

* final correct

* Trigger test

* make style

* apply suggestion from sylvain
2022-03-24 00:17:00 +01:00
Xu Zhao
5f0d07b36b
Make BigBird model compatiable to fp16 dtype. (#16034)
* Make BigBird model compatiable to fp16 dtype.

* Use tree_map instead of map

* Reformat the code

* Fix import order

* Convert masks to the correct dtype

* Fix format issue

* Address comments.
2022-03-24 00:07:34 +01:00
Yih-Dar
1cf28da66d
Update docs/README.md (#16333)
* Update docs/README.md

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 22:46:11 +01:00
Thomas Chaigneau
029b0d95ed
add GPT-J ONNX config to Transformers (#16274)
* add GPT-J ONNX config to Transformers

* remove token-classification features mapping

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* add question-answering features mapping

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* add GPT2 config init to GPT2 config + copie shebang for fix-copies

Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-03-23 16:36:11 -04:00
Edward Beeching
aff9bc405a
Decision transformer gym (#15845)
* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Add type hints for Pegasus (#16324)

* Funnel type hints (#16323)

* add pt funnel type hints

* add tf funnel type hints

* Add type hints for ProphetNet PyTorch (#16272)

* [GLPN] Improve docs (#16331)

* Add link to notebook

* Add link

* Fix bug

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Added type hints for Pytorch Marian calls (#16200)

* Added type hinting for forward functions in pytorch marian

* typo correction

* Removed type hints on functions from BART per Suraj Patil request

* fix import pb

* fix typo

* corrected tuple call

* ran black

* after fix-copies
Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List

* Fixing copies to roformer and pegasus

Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* done (#16340)

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add type annotations for Rembert/Splinter and copies (#16338)

* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`

Co-authored-by: matt <rocketknight1@gmail.com>

* [Bug template] Shift responsibilities for long-range (#16344)

* Fix code repetition in serialization guide (#16346)

* Adopt framework-specific blocks for content (#16342)

*  refactor code samples with framework-specific blocks

*  update training.mdx

* 🖍 apply feedback

* Updates the default branch from master to main (#16326)

* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Updated copies, config auto, and readme files.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com>
Co-authored-by: Adam Montgomerie <adam@avanssion.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-03-23 16:18:43 -04:00
Sylvain Gugger
c595b6e6a9
Make Transformers use cache files when hf.co is down (#16362)
* Make Transformers use cache files when hf.co is down

* Fix tests

* Was there a random circleCI failure?

* Isolate patches

* Style

* Comment out the failure since it doesn't fail anymore

* Better comment
2022-03-23 15:56:49 -04:00
OllieBroadhurst
8a69e023bf
Swap inequalities (#16368)
* Swap inequalities

* Update src/transformers/trainer_callback.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer_callback.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 14:50:09 -04:00
Joao Gante
9e8c37dc82
TF - Fix interchangeable past/past_key_values and revert output variable name in GPT2 (#16332)
* revert tf gpt2

* add test for unpack_inputs and fix test case

* add changes to vision encoder decoder
2022-03-23 18:41:18 +00:00
Sylvain Gugger
12428f0ef1 Fix style 2022-03-23 11:44:09 -04:00
João Gustavo A. Amorim
1dfc11e9e0
complete the type annotations for config parameters (#16263) 2022-03-23 15:15:59 +00:00
Rishav Chandra Varma
bb3a1d345a
Adding missing type hints for mBART model (TF) (#16281)
* added type hints for mbart tensorflow tf implementation

* Adding missing type hints for mBART model 

Tensorflow Implementation model added with missing type hints

* Missing Type hints - correction

For TF model

* Code fixup using make quality tests

* Hint types - typo error

* make fix-copies and make fixup

* type hints

* updated files

Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-23 15:14:55 +00:00
OllieBroadhurst
935330ddfd
Trainer evaluation delay (#16356)
* Initial commit

* Reversed signs, adjusted log entery.

* Check only when

* Cleanup checks

* Only trigger if we want to eval

* Run

* Move changes to callback
2022-03-23 11:11:34 -04:00
Patrick von Platen
a220f160e0
[FlaxBart] make sure no grads are computed an bias (#16345)
* [FlaxBart] make sure no grads are computed an bias

* correct all other seq2seq models
2022-03-23 15:56:11 +01:00
Sylvain Gugger
4975002df5
Reorganize file utils (#16264)
* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wront move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit
2022-03-23 10:26:33 -04:00
Patrick von Platen
7135603423
[T5] Add t5 download script (#16328)
* [T5] Add bash download script

* up

* up

* up

* Update src/transformers/models/t5/download_from_gcp.sh
2022-03-23 13:25:30 +01:00
Lysandre Debut
eca77f4719
Updates the default branch from master to main (#16326)
* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 03:46:59 -04:00
Steven Liu
7732148124
Adopt framework-specific blocks for content (#16342)
*  refactor code samples with framework-specific blocks

*  update training.mdx

* 🖍 apply feedback
2022-03-22 16:14:58 -05:00
Omar Sanseviero
62cbd8423b
Fix code repetition in serialization guide (#16346) 2022-03-22 16:57:19 -04:00
Patrick von Platen
4f6c938342
[Bug template] Shift responsibilities for long-range (#16344) 2022-03-22 21:55:22 +01:00
Jacob Dineen
ec3aace0ae
Add type annotations for Rembert/Splinter and copies (#16338)
* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`

Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-22 20:07:48 +00:00
Francesco Saverio Zuppichini
c30798ec9d
done (#16340) 2022-03-22 18:06:17 +01:00
Clémentine Fourrier
d49f8d3189
Added type hints for Pytorch Marian calls (#16200)
* Added type hinting for forward functions in pytorch marian

* typo correction

* Removed type hints on functions from BART per Suraj Patil request

* fix import pb

* fix typo

* corrected tuple call

* ran black

* after fix-copies
Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List

* Fixing copies to roformer and pegasus

Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-22 14:45:59 +00:00
NielsRogge
a2379b9257
[GLPN] Improve docs (#16331)
* Add link to notebook

* Add link

* Fix bug

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-22 15:45:29 +01:00
Dan Tegzes
87a9af533c
Add type hints for ProphetNet PyTorch (#16272) 2022-03-22 13:55:58 +00:00
Adam Montgomerie
7b262b9692
Funnel type hints (#16323)
* add pt funnel type hints

* add tf funnel type hints
2022-03-22 13:52:29 +00:00
Dan Tegzes
deb61e5f07
Add type hints for Pegasus (#16324) 2022-03-22 13:17:55 +00:00
Beomseok Lee
7cc2c9c6b0
Fix bugs of s2t fairseq model converting (#15593)
* Fix bugs for argument typo and positional embedding weight loading

* Reflect code review suggestion to cover different missing keys cases
2022-03-22 12:09:51 +01:00
Suraj Patil
7865f4d01f
add xglm conversion script (#16305)
* add xglm conversion script

* style

* update script
2022-03-22 11:45:50 +01:00
NielsRogge
0c55d47cde
Add GLPN (#16199)
* First draft

* Fix logits calculation

* Improve tests

* Add copied from statements

* Fix base_model_prefix

* Improve implementation, upload new models

* Update design

* Fix integration test

* Add model to README and toctree

* Add document image

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add decoder_hidden_size attribute

* Update design of decoder

* Add DepthEstimatorOutput class

* Rename in_index to head_in_index and add feature extractor tests

* Apply suggestions from code review

* Apply suggestions from code review

* Update pretrained model name and add to doc tests

* Remove test.py script

* Update copied from statements and clean up

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-22 08:51:13 +01:00