Michael Benayoun
d4e4efce68
Initial support for symbolic tracing with torch.fx allowing dynamic axes (#13579)
* Symbolic trace dynamic axes support for BERT-like models (albert, bert, distilbert, mobilebert, electra, megatron-bert)
* Sanity checks before tracing that make sure the model to trace is supported
* Adapted to PyTorch 1.9
Co-authored-by: Michael Benayoun <michael@huggingface.co>
2021-10-05 14:19:47 +02:00
Lysandre Debut
c3d9ac7607
Expose get_config() on ModelTesters (#12812)
* Expose get_config() on ModelTesters
* Typo
2021-07-21 04:13:11 -04:00
Lysandre Debut
8c7bd1b97b
Skip test while the model is not available (#12739)
2021-07-15 09:06:47 -04:00
Michael Benayoun
f4a0d6ff86
A cleaner and more scalable implementation of symbolic tracing (#11763)
Cleaner and more scalable implementation of symbolic tracing with torch.fx, adding support for new architectures:
- ALBERT
- DistilBERT
- MobileBERT
- MegatronBERT
- GPT2
- GPT Neo
Co-authored-by: Michael Benayoun <michael@huggingface.co>
2021-05-20 18:02:29 +02:00
Sylvain Gugger
26212c14e5
Reactivate Megatron tests and use fewer workers
2021-04-09 18:09:53 -04:00
Sylvain Gugger
269c9638df
Merge branch 'master' of github.com:huggingface/transformers
2021-04-08 21:14:56 -04:00
Sylvain Gugger
d31c7b104e
Skip Megatron tests for now
2021-04-08 21:14:43 -04:00
Sylvain Gugger
ba8b1f4754
Add support for multiple models for one config in auto classes (#11150)
* Add support for multiple models for one config in auto classes
* Use get_values everywhere
* Prettier doc
2021-04-08 18:41:36 -04:00
Julien Demouth
02ec02d6d3
Add NVIDIA Megatron models (#10911)
* Add support for NVIDIA Megatron models
* Add support for NVIDIA Megatron GPT2 and BERT
Add the megatron_gpt2 model. That model reuses the existing GPT2 model. This
commit includes a script to convert a Megatron-GPT2 checkpoint downloaded
from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details.
Add the megatron_bert model. That model is implemented as a modification of
the existing BERT model in Transformers. This commit includes a script to
convert a Megatron-BERT checkpoint downloaded from NVIDIA GPU Cloud. See
examples/megatron-models/README.md for details.
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Remove model.half in tests + add "# Copied ..."
Remove the model.half() instruction which makes tests fail on the CPU.
Add a comment "# Copied ..." before many classes in the model to enable automatic
tracking in CI between the new Megatron classes and the original Bert ones.
* Fix issues
* Fix Flax/TF tests
* Fix copyright
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/megatron_bert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/megatron_gpt2.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Resolve most of sgugger's comments
* Fix conversion issue + Run make fix-copies/quality/docs
* Apply suggestions from code review
* Causal LM & merge
* Fix init
* Add CausalLM to last auto class
Co-authored-by: Julien Demouth <jdemouth@nvidia.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-04-08 14:09:11 -04:00