transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-13 01:30:04 +06:00

Julien Demouth 02ec02d6d3 Add nvidia megatron models (#10911)	2021-04-08 14:09:11 -04:00

* Add support for NVIDIA Megatron models
* Add support for NVIDIA Megatron GPT2 and BERT
  Add the megatron_gpt2 model. That model reuses the existing GPT2 model. This commit includes a script to convert a Megatron-GPT2 checkpoint downloaded from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details.
  Add the megatron_bert model. That model is implemented as a modification of the existing BERT model in Transformers. This commit includes a script to convert a Megatron-BERT checkpoint downloaded from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details.
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py (repeated; Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Remove model.half in tests + add "# Copied ..."
  Remove the model.half() instruction which makes tests fail on the CPU. Add a comment "# Copied ..." before many classes in the model to enable automatic tracking in CI between the new Megatron classes and the original Bert ones.
* Fix issues
* Fix Flax/TF tests
* Fix copyright
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py (repeated; Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py (repeated; Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Update docs/source/model_doc/megatron_bert.rst (Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* Update docs/source/model_doc/megatron_gpt2.rst (Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* Update src/transformers/models/megatron_bert/__init__.py (Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py (Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py (repeated; Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py (repeated; Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py (repeated; Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* Resolve most of 'sgugger' comments
* Fix conversion issue + Run make fix-copies/quality/docs
* Apply suggestions from code review
* Causal LM & merge
* Fix init
* Add CausalLM to last auto class

Co-authored-by: Julien Demouth <jdemouth@nvidia.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

albert.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
auto.rst	Auto feature extractor (#11097)	2021-04-06 19:20:08 -04:00
bart.rst	BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128)	2021-02-04 11:56:12 +03:00
barthez.rst	Fix documentation links always pointing to master. (#9217)	2021-01-05 06:18:48 -05:00
bert.rst	Typo fix of the name of BertLMHeadModel in BERT doc (#11133)	2021-04-08 08:22:58 -04:00
bertgeneration.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
bertweet.rst	Improve documentation coverage for Bertweet (#9379)	2021-01-04 13:12:59 -05:00
bigbird.rst	add blog to docs (#10997)	2021-03-31 18:36:00 +03:00
blenderbot_small.rst	BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128)	2021-02-04 11:56:12 +03:00
blenderbot.rst	BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128)	2021-02-04 11:56:12 +03:00
bort.rst	ADD BORT (#9813)	2021-01-27 21:25:11 +03:00
camembert.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
convbert.rst	Fix doc for TFConverBertModel	2021-02-04 10:14:46 -05:00
ctrl.rst	Added TF CTRL Sequence Classification (#9151)	2020-12-17 18:10:57 -05:00
deberta_v2.rst	Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018)	2021-02-19 18:34:44 -05:00
deberta.rst	Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018)	2021-02-19 18:34:44 -05:00
dialogpt.rst	ADD BORT (#9813)	2021-01-27 21:25:11 +03:00
distilbert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
dpr.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
electra.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
encoderdecoder.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
flaubert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
fsmt.rst	Deprecate prepare_seq2seq_batch (#10287)	2021-02-22 12:36:16 -05:00
funnel.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
gpt_neo.rst	[doc] gpt-neo (#11098)	2021-04-06 16:42:06 -04:00
gpt.rst	[doc] update code-block rendering (#11053)	2021-04-05 09:06:07 -04:00
gpt2.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
herbert.rst	Improve documentation coverage for Herbert (#9428)	2021-01-06 09:13:43 -05:00
ibert.rst	Update ibert.rst (#10445)	2021-02-28 19:03:49 +03:00
layoutlm.rst	Layout lm tf 2 (#10636)	2021-03-25 12:32:38 -04:00
led.rst	Upgrade styler to better handle lists (#9423)	2021-01-06 07:46:17 -05:00
longformer.rst	Add message to documentation that longformer doesn't support token_type_ids (#9152)	2020-12-16 11:06:14 -05:00
lxmert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
m2m_100.rst	fix M2M100 example (#10745)	2021-03-16 20:20:00 +05:30
marian.rst	Deprecate prepare_seq2seq_batch (#10287)	2021-02-22 12:36:16 -05:00
mbart.rst	Deprecate prepare_seq2seq_batch (#10287)	2021-02-22 12:36:16 -05:00
megatron_bert.rst	Add nvidia megatron models (#10911)	2021-04-08 14:09:11 -04:00
megatron_gpt2.rst	Add nvidia megatron models (#10911)	2021-04-08 14:09:11 -04:00
mobilebert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
mpnet.rst	MPNet copyright files (#9015)	2020-12-10 09:29:38 -05:00
mt5.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
pegasus.rst	Fix broken link (#10656)	2021-03-11 14:29:02 -05:00
phobert.rst	Improve documentation coverage for Phobert (#9427)	2021-01-06 10:04:32 -05:00
prophetnet.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
rag.rst	Add TFRag (#9002)	2021-03-09 00:49:51 +03:00
reformer.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
retribert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
roberta.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
speech_to_text.rst	Fix S2T example (#10741)	2021-03-16 08:55:07 -04:00
squeezebert.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
t5.rst	Deprecate prepare_seq2seq_batch (#10287)	2021-02-22 12:36:16 -05:00
tapas.rst	Fix URLs to TAPAS notebooks (#9435)	2021-01-06 07:20:41 -05:00
transformerxl.rst	Fix script that check objects are documented (#9259)	2020-12-22 11:12:58 -05:00
vit.rst	Add Vision Transformer and ViTFeatureExtractor (#10950)	2021-04-01 11:16:05 -04:00
wav2vec2.rst	Add Fine-Tuning for Wav2Vec2 (#10145)	2021-03-01 12:13:17 +03:00
xlm.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
xlmprophetnet.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
xlmroberta.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
xlnet.rst	Enforce all objects in the main init are documented (#9014)	2020-12-10 11:57:12 -05:00
xlsr_wav2vec2.rst	[XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models (#10648)	2021-03-11 17:44:18 +03:00