Commit Graph

9366 Commits

Author SHA1 Message Date
Nathan Cooper
f5e8c9bdea
Update readme with how to train offline and fix BPE command (#15897)
* Update readme with how to train offline and fix BPE command

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
2022-03-24 11:00:46 +01:00
Yih-Dar
9badcecf69
[Doctests] Make TFRoberta-like meaningfull (#16370)
* update doc examples for TFRoberta

* fix style

* fix style

* use TF ckpt

* apply suggestion

* add the code file to test here

* fix style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-24 10:26:27 +01:00
Patrick von Platen
77c5a80536
[Doctests] Make roberta-like meaningfull (#16363)
* [Doctests] Make roberta-like meaningfull

* correct

* final correct

* Trigger test

* make style

* apply suggestion from sylvain
2022-03-24 00:17:00 +01:00
Xu Zhao
5f0d07b36b
Make BigBird model compatiable to fp16 dtype. (#16034)
* Make BigBird model compatiable to fp16 dtype.

* Use tree_map instead of map

* Reformat the code

* Fix import order

* Convert masks to the correct dtype

* Fix format issue

* Address comments.
2022-03-24 00:07:34 +01:00
Yih-Dar
1cf28da66d
Update docs/README.md (#16333)
* Update docs/README.md

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 22:46:11 +01:00
Thomas Chaigneau
029b0d95ed
add GPT-J ONNX config to Transformers (#16274)
* add GPT-J ONNX config to Transformers

* remove token-classification features mapping

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* add question-answering features mapping

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* add GPT2 config init to GPT2 config + copie shebang for fix-copies

Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-03-23 16:36:11 -04:00
Edward Beeching
aff9bc405a
Decision transformer gym (#15845)
* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Add type hints for Pegasus (#16324)

* Funnel type hints (#16323)

* add pt funnel type hints

* add tf funnel type hints

* Add type hints for ProphetNet PyTorch (#16272)

* [GLPN] Improve docs (#16331)

* Add link to notebook

* Add link

* Fix bug

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Added type hints for Pytorch Marian calls (#16200)

* Added type hinting for forward functions in pytorch marian

* typo correction

* Removed type hints on functions from BART per Suraj Patil request

* fix import pb

* fix typo

* corrected tuple call

* ran black

* after fix-copies
Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List

* Fixing copies to roformer and pegasus

Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* done (#16340)

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add type annotations for Rembert/Splinter and copies (#16338)

* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`

Co-authored-by: matt <rocketknight1@gmail.com>

* [Bug template] Shift responsibilities for long-range (#16344)

* Fix code repetition in serialization guide (#16346)

* Adopt framework-specific blocks for content (#16342)

*  refactor code samples with framework-specific blocks

*  update training.mdx

* 🖍 apply feedback

* Updates the default branch from master to main (#16326)

* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Updated copies, config auto, and readme files.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com>
Co-authored-by: Adam Montgomerie <adam@avanssion.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-03-23 16:18:43 -04:00
Sylvain Gugger
c595b6e6a9
Make Transformers use cache files when hf.co is down (#16362)
* Make Transformers use cache files when hf.co is down

* Fix tests

* Was there a random circleCI failure?

* Isolate patches

* Style

* Comment out the failure since it doesn't fail anymore

* Better comment
2022-03-23 15:56:49 -04:00
OllieBroadhurst
8a69e023bf
Swap inequalities (#16368)
* Swap inequalities

* Update src/transformers/trainer_callback.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer_callback.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 14:50:09 -04:00
Joao Gante
9e8c37dc82
TF - Fix interchangeable past/past_key_values and revert output variable name in GPT2 (#16332)
* revert tf gpt2

* add test for unpack_inputs and fix test case

* add changes to vision encoder decoder
2022-03-23 18:41:18 +00:00
Sylvain Gugger
12428f0ef1 Fix style 2022-03-23 11:44:09 -04:00
João Gustavo A. Amorim
1dfc11e9e0
complete the type annotations for config parameters (#16263) 2022-03-23 15:15:59 +00:00
Rishav Chandra Varma
bb3a1d345a
Adding missing type hints for mBART model (TF) (#16281)
* added type hints for mbart tensorflow tf implementation

* Adding missing type hints for mBART model 

Tensorflow Implementation model added with missing type hints

* Missing Type hints - correction

For TF model

* Code fixup using make quality tests

* Hint types - typo error

* make fix-copies and make fixup

* type hints

* updated files

Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-23 15:14:55 +00:00
OllieBroadhurst
935330ddfd
Trainer evaluation delay (#16356)
* Initial commit

* Reversed signs, adjusted log entery.

* Check only when

* Cleanup checks

* Only trigger if we want to eval

* Run

* Move changes to callback
2022-03-23 11:11:34 -04:00
Patrick von Platen
a220f160e0
[FlaxBart] make sure no grads are computed an bias (#16345)
* [FlaxBart] make sure no grads are computed an bias

* correct all other seq2seq models
2022-03-23 15:56:11 +01:00
Sylvain Gugger
4975002df5
Reorganize file utils (#16264)
* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wront move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit
2022-03-23 10:26:33 -04:00
Patrick von Platen
7135603423
[T5] Add t5 download script (#16328)
* [T5] Add bash download script

* up

* up

* up

* Update src/transformers/models/t5/download_from_gcp.sh
2022-03-23 13:25:30 +01:00
Lysandre Debut
eca77f4719
Updates the default branch from master to main (#16326)
* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 03:46:59 -04:00
Steven Liu
7732148124
Adopt framework-specific blocks for content (#16342)
*  refactor code samples with framework-specific blocks

*  update training.mdx

* 🖍 apply feedback
2022-03-22 16:14:58 -05:00
Omar Sanseviero
62cbd8423b
Fix code repetition in serialization guide (#16346) 2022-03-22 16:57:19 -04:00
Patrick von Platen
4f6c938342
[Bug template] Shift responsibilities for long-range (#16344) 2022-03-22 21:55:22 +01:00
Jacob Dineen
ec3aace0ae
Add type annotations for Rembert/Splinter and copies (#16338)
* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`

Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-22 20:07:48 +00:00
Francesco Saverio Zuppichini
c30798ec9d
done (#16340) 2022-03-22 18:06:17 +01:00
Clémentine Fourrier
d49f8d3189
Added type hints for Pytorch Marian calls (#16200)
* Added type hinting for forward functions in pytorch marian

* typo correction

* Removed type hints on functions from BART per Suraj Patil request

* fix import pb

* fix typo

* corrected tuple call

* ran black

* after fix-copies
Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List

* Fixing copies to roformer and pegasus

Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-22 14:45:59 +00:00
NielsRogge
a2379b9257
[GLPN] Improve docs (#16331)
* Add link to notebook

* Add link

* Fix bug

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-22 15:45:29 +01:00
Dan Tegzes
87a9af533c
Add type hints for ProphetNet PyTorch (#16272) 2022-03-22 13:55:58 +00:00
Adam Montgomerie
7b262b9692
Funnel type hints (#16323)
* add pt funnel type hints

* add tf funnel type hints
2022-03-22 13:52:29 +00:00
Dan Tegzes
deb61e5f07
Add type hints for Pegasus (#16324) 2022-03-22 13:17:55 +00:00
Beomseok Lee
7cc2c9c6b0
Fix bugs of s2t fairseq model converting (#15593)
* Fix bugs for argument typo and positional embedding weight loading

* Reflect code review suggestion to cover different missing keys cases
2022-03-22 12:09:51 +01:00
Suraj Patil
7865f4d01f
add xglm conversion script (#16305)
* add xglm conversion script

* style

* update script
2022-03-22 11:45:50 +01:00
NielsRogge
0c55d47cde
Add GLPN (#16199)
* First draft

* Fix logits calculation

* Improve tests

* Add copied from statements

* Fix base_model_prefix

* Improve implementation, upload new models

* Update design

* Fix integration test

* Add model to README and toctree

* Add document image

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add decoder_hidden_size attribute

* Update design of decoder

* Add DepthEstimatorOutput class

* Rename in_index to head_in_index and add feature extractor tests

* Apply suggestions from code review

* Apply suggestions from code review

* Update pretrained model name and add to doc tests

* Remove test.py script

* Update copied from statements and clean up

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-22 08:51:13 +01:00
Johnny Greco
df32b5d89b
TFLongformer: Add missing type hints and unpack inputs decorator (#16228)
* Add type annotations for TF Longformer

* Update docstring data types to include numpy array

* Implement unpack_inputs decorator

* fixup after decorator updates

* Numpy array -> np.ndarray in docstring

Co-authored-by: Johnny Greco <johnny.greco@radpartners.com>
2022-03-21 22:56:17 +00:00
Thomas Chaigneau
0aac9ba2da
Add Flaubert OnnxConfig to Transformers (#16279)
* Add Flaubert to ONNX to make it available for conversion.

* Fixed features for FlauBERT. fixup command remove flaubert to docs list.

Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
2022-03-21 21:46:31 +01:00
Joao Gante
9fef668338
TF - update (vision_)encoder_decoder past variable (#16260) 2022-03-21 19:55:41 +00:00
Gunjan Chhablani
f9387c948d
Update Makefile Phonies (#16306) 2022-03-21 15:28:23 -04:00
ivanllt
96cd5bcbb9
added type hints for blenderbot and blenderbot_small (#16307) 2022-03-21 19:13:58 +00:00
Anton Lozhkov
e226a24f84
[xtreme-s] Update Minds14 results (#16241)
* update results

* per-language metrics

* Format the per-language metrics
2022-03-21 19:33:59 +01:00
Gunjan Chhablani
6f1727d83a
Fix Seq2SeqTrainingArguments docs (#16295)
* Indent Seq2Seq Train Args docs

* Add Args keyword to Seq2Seq Train Args docs
2022-03-21 13:48:07 -04:00
Johnny Greco
7643b1caa6
Added type hints to PyTorch Longformer models (#16244) 2022-03-21 17:09:03 +00:00
Suraj Patil
c77092a5ed
[FlaxGPTJ] Fix bug in rotary embeddings (#16298) 2022-03-21 18:07:56 +01:00
Yih-Dar
4b2774832d
fix last element in hidden_states for XGLM (#16301)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-21 17:38:52 +01:00
Steven Liu
5a42bb431e
Update troubleshoot with more content (#16243)
* 📝 first draft

* 🖍 apply feedback
2022-03-21 11:37:18 -05:00
NielsRogge
fbb454307d
[SegFormer] Remove unused attributes (#16285)
* Remove unused attributes

* Add link to blog and add clarification about input size

* Improve readability of the code

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-03-21 17:34:10 +01:00
Suraj Patil
f0c00d8ca9
Fix Marian conversion script (#16300) 2022-03-21 17:23:40 +01:00
Yi Heng Lim
94be424308
Added type hints for PyTorch T5 model (#16257)
* Added type hints for PyTorch T5 model

* removed a type hint

* ran make style
2022-03-21 16:17:52 +00:00
Christopher Akiki
250b478a2c
GPT2 TensorFlow Type Hints (#16261)
* Add typing hints for base model class

* Add typing hints for causal LM model class

* Add typing hints for double heads model class

* Add typing hints for sequence classification model class

* Add typing hints for Main Layer

* Run fixup
2022-03-21 16:11:03 +00:00
Francesco Saverio Zuppichini
9ad77affee
test (#16294) 2022-03-21 16:59:47 +01:00
Robot Jelly
d50f62f2de
added type hints for BART model (#16270)
* added type hints for BART model

* make fixup, adding imports to copied files

* Adding some missing types to cookiecutter

* Adding some missing types to cookiecutter

* Adding some missing types to cookiecutter

Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-21 15:18:01 +00:00
Jack McDonald
460f36d352
Add type hints transfoxl (#16267)
* Add type hint for pt transfo_xl model

* Add type hint for tf transfo_xl model
2022-03-21 15:04:13 +00:00
Xia
2afe9cd279
Add argument "cache_dir" for transformers.onnx (#16284)
* Add argument "cache_dir" for transformers.onnx

* Reformate files that can't pass CI.
2022-03-21 15:26:44 +01:00