Sylvain Gugger
269c9638df
Merge branch 'master' of github.com:huggingface/transformers
2021-04-08 21:14:56 -04:00
Sylvain Gugger
d31c7b104e
Skip Megatron tests for now
2021-04-08 21:14:43 -04:00
Stas Bekman
c2e0fd5283
[setup] make fairscale and deepspeed setup extras ( #11151 )
...
* make fairscale and deepspeed setup extras
* fix default
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* no reason not to ask for the good version
* update the CIs
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-08 15:46:54 -07:00
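A minimal sketch of using the new extras; the install commands are assumptions based on the extra names in the PR title, and the snippet only sanity-checks that the optional backends are importable.

```python
# Hypothetical install commands, assuming the extras are named as in the PR title:
#   pip install transformers[deepspeed]
#   pip install transformers[fairscale]
import importlib.util

# Check that the optional backends pulled in by the extras are importable.
for pkg in ("deepspeed", "fairscale"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg} available: {found}")
```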
Sylvain Gugger
ba8b1f4754
Add support for multiple models for one config in auto classes ( #11150 )
...
* Add support for multiple models for one config in auto classes
* Use get_values everywhere
* Prettier doc
2021-04-08 18:41:36 -04:00
Stas Bekman
97ccf67bb3
[setup] extras[docs] must include 'all' ( #11148 )
...
* extras[doc] must include 'all'
* fix
* better
* regroup
2021-04-08 18:10:44 -04:00
Stas Bekman
66446909b2
[tests] relocate core integration tests ( #11146 )
...
* relocate core integration tests
* add sys.path context manager
* cleanup
* try
* try2
* fix path
* doc
* style
* add dep
* add 2 more deps
2021-04-08 13:13:17 -07:00
Andrea Cappelli
6c40e49712
Run mlm pad to multiple for fp16 ( #11128 )
...
* Add mlm collator pad to multiple option (#10627 )
* Use padding to 8x in run mlm (#10627 )
2021-04-08 16:12:49 -04:00
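A minimal sketch of the collator option the commit refers to, assuming `DataCollatorForLanguageModeling` accepts `pad_to_multiple_of` as in current releases; the checkpoint name is only an example.

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Pad every batch to a multiple of 8 so fp16 tensor cores get well-shaped inputs.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,
    pad_to_multiple_of=8,
)

batch = data_collator(
    [tokenizer("a short example"), tokenizer("another, slightly longer example")]
)
print(batch["input_ids"].shape)  # sequence length is a multiple of 8
```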
Sylvain Gugger
dfed4ec263
Don't duplicate logs in TensorBoard and handle --use_env ( #11141 )
2021-04-08 16:12:36 -04:00
Philipp Schmid
9c9b8e707b
Updates SageMaker docs for updating DLCs ( #11140 )
2021-04-08 16:05:53 -04:00
Lysandre Debut
ba2cf5f90d
Add fairscale and deepspeed back to the CI ( #11147 )
...
* Add fairscale and deepspeed back to the CI
* Add deepspeed to single GPU tests
2021-04-08 11:36:45 -07:00
Stas Bekman
1ed24afe91
[trainer] solve "scheduler before optimizer step" warning ( #11144 )
...
* solve "scheduler before optimizer step" warning
* style
* correct the state evaluation test
2021-04-08 11:28:48 -07:00
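The warning in question is PyTorch's complaint when `lr_scheduler.step()` runs in an iteration where `optimizer.step()` was skipped (for example after an AMP overflow). A generic sketch of the ordering the fix enforces, not the Trainer's actual code:

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
scaler = torch.cuda.amp.GradScaler(enabled=False)  # enabled=True for fp16 GPU runs

for step in range(3):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).sum()
    scaler.scale(loss).backward()
    scaler.step(optimizer)            # may skip the real optimizer step on overflow
    scale_before = scaler.get_scale()
    scaler.update()
    # Only advance the LR schedule if the optimizer step was not skipped,
    # which avoids the "scheduler before optimizer step" warning.
    if scaler.get_scale() >= scale_before:
        scheduler.step()
```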
Julien Demouth
02ec02d6d3
Add nvidia megatron models ( #10911 )
...
* Add support for NVIDIA Megatron models
* Add support for NVIDIA Megatron GPT2 and BERT
Add the megatron_gpt2 model. That model reuses the existing GPT2 model. This
commit includes a script to convert a Megatron-GPT2 checkpoint downloaded
from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details.
Add the megatron_bert model. That model is implemented as a modification of
the existing BERT model in Transformers. This commit includes a script to
convert a Megatron-BERT checkpoint downloaded from NVIDIA GPU Cloud. See
examples/megatron-models/README.md for details.
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Remove model.half in tests + add "# Copied ..."
Remove the model.half() instruction which makes tests fail on the CPU.
Add a comment "# Copied ..." before many classes in the model to enable automatic
tracking in CI between the new Megatron classes and the original Bert ones.
* Fix issues
* Fix Flax/TF tests
* Fix copyright
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/megatron_bert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/megatron_gpt2.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Resolve most of 'sgugger' comments
* Fix conversion issue + Run make fix-copies/quality/docs
* Apply suggestions from code review
* Causal LM & merge
* Fix init
* Add CausalLM to last auto class
Co-authored-by: Julien Demouth <jdemouth@nvidia.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-04-08 14:09:11 -04:00
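A hedged sketch of how a converted checkpoint might be loaded, assuming the conversion script has already produced a directory of Transformers-format weights; the directory name is a placeholder, and examples/megatron-models/README.md has the actual conversion steps.

```python
from transformers import BertTokenizer, MegatronBertForMaskedLM

# "megatron-bert-345m-converted/" is a placeholder for the output of the
# convert_megatron_bert_checkpoint.py script mentioned in the commit.
model = MegatronBertForMaskedLM.from_pretrained("megatron-bert-345m-converted")
tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")

inputs = tokenizer("Paris is the [MASK] of France.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)
```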
Stas Bekman
c6d664849b
[DeepSpeed] ZeRO Stage 3 ( #10753 )
...
* synced gpus
* fix
* fix
* need to use t5-small for quality tests
* notes
* complete merge
* fix a disappearing std stream problem
* start zero3 tests
* wip
* tune params
* sorting out the pre-trained model loading
* reworking generate loop wip
* wip
* style
* fix tests
* split the tests
* refactor tests
* wip
* parameterized
* fix
* workout the resume from non-ds checkpoint pass + test
* cleanup
* remove no longer needed code
* split getter/setter functions
* complete the docs
* suggestions
* gpus and their compute capabilities link
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* style
* remove invalid paramgd
* automatically configure zero3 params that rely on hidden size
* make _get_resized_embeddings zero3-aware
* add test exercising resize_token_embeddings()
* add docstring
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-08 09:53:01 -07:00
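A minimal sketch of a ZeRO stage 3 configuration as it might be passed to the Trainer via --deepspeed; only the stage and fp16 keys are shown, and the batch size value is an assumption.

```python
import json

# Minimal DeepSpeed config enabling ZeRO stage 3; real configs usually also set
# optimizer, scheduler and the stage3_* tuning knobs.
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 3},
    "train_micro_batch_size_per_gpu": 8,
}

with open("ds_config_zero3.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# Hypothetical launch, e.g.:
#   deepspeed examples/seq2seq/run_translation.py --deepspeed ds_config_zero3.json ...
```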
Stas Bekman
acc851e1ff
[run_clm] clarify why we get the tokenizer warning on long input ( #11145 )
...
* clarify why we get the warning here
* Update examples/language-modeling/run_clm.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* wording
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-08 09:46:28 -07:00
Yusuke Mori
5bf5d50c8d
Typo fix of the name of BertLMHeadModel in BERT doc ( #11133 )
2021-04-08 08:22:58 -04:00
Jannis Born
f8e90d6fb9
Fix typing error in Trainer class (prediction_step) ( #11138 )
...
* fix: docstrings in prediction_step
* ci: Satisfy line length requirements
* ci: character length requirements
2021-04-08 08:22:25 -04:00
Sylvain Gugger
ffe0761777
Fix and refactor check_repo ( #11127 )
2021-04-07 17:56:21 -04:00
Philipp Schmid
3fd7eee18f
Adds use_auth_token with pipelines ( #11123 )
...
* added model_kwargs to infer_framework_from_model
* added model_kwargs to tokenizer
* added use_auth_token as named parameter
* added dynamic get for use_auth_token
2021-04-07 20:32:59 +02:00
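A hedged sketch of the new keyword, assuming `use_auth_token` is forwarded from `pipeline(...)` to the model and tokenizer downloads; the model id is a placeholder for a private repository.

```python
from transformers import pipeline

# "my-org/private-sentiment-model" is a placeholder for a private Hub repository.
classifier = pipeline(
    "sentiment-analysis",
    model="my-org/private-sentiment-model",
    use_auth_token=True,  # or an explicit token string
)
print(classifier("Authentication now works for private models."))
```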
Stas Bekman
1c15128312
[versions] handle version requirement ranges ( #11110 )
...
* handle version requirement ranges
* add mixed requirement test
* cleanup
2021-04-07 09:09:38 -07:00
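A small sketch of what a ranged requirement check might look like, assuming the internal `require_version` helper in `transformers.utils.versions` is the function being extended; the requirement string and hint are illustrative.

```python
# Assumed helper: transformers.utils.versions.require_version
from transformers.utils.versions import require_version

# A range spec combines a lower and an upper bound in a single requirement string.
require_version("tokenizers>=0.10.1,<0.11", "pip install -U tokenizers")
```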
Vasudev Gupta
7442801df5
fix tests ( #11109 )
2021-04-07 10:07:26 -04:00
Lysandre Debut
c0d97cee13
Adds a note to resize the token embedding matrix when adding special tokens ( #11120 )
...
* Adds a note to resize the token embedding matrix when adding special tokens
* Remove superfluous space
2021-04-07 10:06:45 -04:00
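The note being added boils down to the following pattern; a minimal sketch with a standard BERT checkpoint and made-up special tokens.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# After adding special tokens, the embedding matrix must be resized to match
# the new vocabulary size, otherwise the new token ids index out of range.
tokenizer.add_special_tokens({"additional_special_tokens": ["<ent>", "</ent>"]})
model.resize_token_embeddings(len(tokenizer))
```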
Sylvain Gugger
02f7c2fe66
Some styling of the training table in Notebooks ( #11118 )
2021-04-07 10:00:33 -04:00
Sylvain Gugger
11505fa139
Dummies multi backend ( #11100 )
...
* Replaces requires_xxx by one generic method
* Quality and update check_dummies
* Fix inits check
* Post-merge cleanup
2021-04-07 09:56:40 -04:00
Stas Bekman
424419f549
[examples] fix white space ( #11099 )
...
these get concatenated without whitespace, so fix it
2021-04-07 09:20:58 -04:00
Stas Bekman
c9035e4537
fix: The 'warn' method is deprecated ( #11105 )
...
* The 'warn' method is deprecated
* fix test
2021-04-07 09:20:06 -04:00
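For reference, the deprecation is in the standard library: `logging.Logger.warn` is a deprecated alias of `warning`, so the fix is a mechanical rename along these lines.

```python
import logging

logger = logging.getLogger(__name__)

# Deprecated spelling (emits a DeprecationWarning on recent Python versions):
# logger.warn("...")

# Preferred spelling:
logger.warning("The 'warn' method is deprecated, use 'warning' instead")
```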
Leo Gao
247bed3857
GPTNeo: handle padded wte ( #11079 )
...
* GPTNeo: handle padded wte
* Switch to config.vocab_size
* apply review suggestion
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-04-07 17:35:20 +05:30
cronoik
083ad7d46c
dead link fixed ( #11103 )
2021-04-07 07:50:47 -04:00
Sylvain Gugger
fd338abdeb
Style
2021-04-06 19:54:13 -04:00
SHYAM SUNDER KUMAR
aef4cf8c52
accelerate question answering examples with no trainer ( #11091 )
...
* accelerate question answering examples with no trainer
* removed train and eval flags also fixed fill np array function
* Update examples/question-answering/run_qa_beam_search_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/question-answering/run_qa_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-06 19:35:21 -04:00
Sylvain Gugger
403d530eec
Auto feature extractor ( #11097 )
...
* AutoFeatureExtractor
* Init and first tests
* Tests
* Damn you gitignore
* Quality
* Defensive test for when not all backends are here
* Use pattern for Speech2Text models
2021-04-06 19:20:08 -04:00
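A hedged usage sketch of the new auto class, assuming it resolves the feature extractor from the checkpoint's config the way the other auto classes do; the checkpoint name is only an example.

```python
from transformers import AutoFeatureExtractor

# Resolves to the feature extractor class associated with this checkpoint.
feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
print(type(feature_extractor).__name__)
```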
Stas Bekman
520198f56f
[doc] gpt-neo ( #11098 )
...
make the example work
2021-04-06 16:42:06 -04:00
Lysandre
9853c5dd58
Development on v4.6.0dev0
2021-04-06 12:53:25 -04:00
Lysandre
4906a29f7f
Release v4.5.0
2021-04-06 12:37:47 -04:00
Suraj Patil
2a8115f083
[WIP] GPT Neo cleanup ( #10985 )
...
* better names
* add attention mixin
* all slow tests in one class
* make helper methods static so we can test
* add local attention tests
* better names
* doc
* apply review suggestions
2021-04-06 12:24:15 -04:00
Philipp Schmid
76800fb8e6
added new merged Trainer test ( #11090 )
2021-04-06 15:12:21 +02:00
Philipp Schmid
b219d6b5a5
added social thumbnail for docs ( #11083 )
2021-04-06 14:56:18 +02:00
Sylvain Gugger
6c1bee7d89
Link to new blog
2021-04-06 08:55:40 -04:00
Stas Bekman
f7328de46d
HF emoji unicode doesn't work in console ( #11081 )
...
It doesn't look like using 🤗 is a great idea for printing to console. See attachment.
This PR proposes to replace 🤗 with "HuggingFace" for an exception message.
@LysandreJik
2021-04-06 08:03:00 -04:00
Hemil Desai
6ab7d1a429
Add Readme for language modeling scripts with accelerate ( #11073 )
2021-04-05 20:56:12 -04:00
Sylvain Gugger
2199608ca6
Make a base init in FeatureExtractionMixin ( #11074 )
2021-04-05 18:02:28 -04:00
Sylvain Gugger
04ceee7d24
Fix distributed gather for tuples of tensors of varying sizes ( #11071 )
2021-04-05 16:21:49 -04:00
Sylvain Gugger
f05a8a0c5e
Document common config attributes ( #11070 )
2021-04-05 15:29:01 -04:00
Sylvain Gugger
090e3e6896
Add center_crop to ImageFeatureExtractionMixin ( #11066 )
2021-04-05 15:28:51 -04:00
konstin
abb7430003
Replace pkg_resources with importlib_metadata ( #11061 )
...
* Replace pkg_resources with importlib_metadata
Fixes #10964. The other reason for this change is that pkg_resources has been deprecated (commit 8fe85c22ce) in favor of importlib_metadata.
* Reduce to a single importlib_metadata import switch
* Trigger CI
Co-authored-by: Stas Bekman <stas@stason.org>
2021-04-05 12:12:19 -07:00
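A short sketch of the replacement pattern, assuming the usual fallback from the stdlib module to the backport; the package name is just an example.

```python
# Prefer the stdlib module (Python 3.8+), fall back to the importlib_metadata backport.
try:
    import importlib.metadata as importlib_metadata
except ImportError:
    import importlib_metadata  # type: ignore

try:
    version = importlib_metadata.version("tokenizers")
except importlib_metadata.PackageNotFoundError:
    version = None
print("tokenizers version:", version)
```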
Hemil Desai
b51b87c41d
Add examples/language_modeling/run_clm_no_trainer.py ( #11026 )
...
* Initial draft for clm no trainer
* Remove unwanted args
* Fix bug
* Update examples/language-modeling/run_clm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-05 12:27:52 -04:00
Amala Deshmukh
e1c02e018c
Add example for registering callbacks with trainers ( #10928 )
...
* Add example for callback registry
Resolves: #9036
* Update callback registry documentation
* Added comments for other ways to register callback
2021-04-05 12:27:23 -04:00
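A minimal sketch of the registration pattern the example documents, using the public `TrainerCallback` API; the callback body is illustrative only.

```python
from transformers import Trainer, TrainerCallback

class PrintLossCallback(TrainerCallback):
    """Illustrative callback: print the loss every time the Trainer logs."""
    def on_log(self, args, state, control, logs=None, **kwargs):
        if logs and "loss" in logs:
            print(f"step {state.global_step}: loss = {logs['loss']:.4f}")

# Either pass the callback at construction time...
# trainer = Trainer(model=model, args=training_args, callbacks=[PrintLossCallback])
# ...or register it on an existing Trainer instance:
# trainer.add_callback(PrintLossCallback())
```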
Lysandre Debut
9f4e0c23d6
Documentation about loading a fast tokenizer within Transformers ( #11029 )
...
* Documentation about loading a fast tokenizer within Transformers
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-05 10:51:16 -04:00
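A hedged sketch of the workflow being documented: building a tokenizer with the `tokenizers` library and wrapping it in `PreTrainedTokenizerFast` via `tokenizer_object`; the training corpus is a stand-in.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers
from transformers import PreTrainedTokenizerFast

# Train a tiny BPE tokenizer with the `tokenizers` library.
tok = Tokenizer(models.BPE(unk_token="[UNK]"))
tok.pre_tokenizer = pre_tokenizers.Whitespace()
trainer = trainers.BpeTrainer(special_tokens=["[UNK]", "[PAD]"])
tok.train_from_iterator(["a tiny stand-in corpus", "for this example"], trainer)

# Wrap it so it can be used anywhere a Transformers tokenizer is expected.
fast_tokenizer = PreTrainedTokenizerFast(
    tokenizer_object=tok, unk_token="[UNK]", pad_token="[PAD]"
)
print(fast_tokenizer("a tiny stand-in corpus")["input_ids"])
```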
Sylvain Gugger
6c25f5228e
Refactor AutoModel classes and add Flax Auto classes ( #11027 )
...
* Refactor AutoModel classes and add Flax Auto classes
* Add new objects to the init
* Fix hubconf and sort models
* Fix TF tests
* Missing coma
* Update src/transformers/models/auto/auto_factory.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Fix init
* Fix dummies
* Other init to fix
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-05 10:11:28 -04:00
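A hedged sketch of the Flax auto class covered by the refactor, assuming the checkpoint ships Flax weights on the Hub; the checkpoint name is an example.

```python
from transformers import AutoTokenizer, FlaxAutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = FlaxAutoModel.from_pretrained("bert-base-cased")  # assumes Flax weights exist

inputs = tokenizer("Flax auto classes mirror the PyTorch ones.", return_tensors="np")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```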
Lysandre Debut
eb3479e7cf
Some models have no tokenizers ( #11064 )
2021-04-05 09:37:49 -04:00
Lysandre Debut
773e4c7263
Remove unnecessary space ( #11060 )
2021-04-05 09:36:20 -04:00