Sylvain Gugger
269c9638df
Merge branch 'master' of github.com:huggingface/transformers
2021-04-08 21:14:56 -04:00
Sylvain Gugger
d31c7b104e
Skip Megatron tests for now
2021-04-08 21:14:43 -04:00
Stas Bekman
c2e0fd5283
[setup] make fairscale and deepspeed setup extras ( #11151 )
...
* make fairscale and deepspeed setup extras
* fix default
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* no reason not to ask for the good version
* update the CIs
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-08 15:46:54 -07:00
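A minimal sketch of using the new extras; the install commands are assumptions based on the extra names in the PR title, and the snippet only sanity-checks that the optional backends are importable.

```python
# Hypothetical install commands, assuming the extras are named as in the PR title:
#   pip install transformers[deepspeed]
#   pip install transformers[fairscale]
import importlib.util

# Check that the optional backends pulled in by the extras are importable.
for pkg in ("deepspeed", "fairscale"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg} available: {found}")
```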
Sylvain Gugger
ba8b1f4754
Add support for multiple models for one config in auto classes ( #11150 )
...
* Add support for multiple models for one config in auto classes
* Use get_values everywhere
* Prettier doc
2021-04-08 18:41:36 -04:00
Stas Bekman
97ccf67bb3
[setup] extras[docs] must include 'all' ( #11148 )
...
* extras[doc] must include 'all'
* fix
* better
* regroup
2021-04-08 18:10:44 -04:00
Stas Bekman
66446909b2
[tests] relocate core integration tests ( #11146 )
...
* relocate core integration tests
* add sys.path context manager
* cleanup
* try
* try2
* fix path
* doc
* style
* add dep
* add 2 more deps
2021-04-08 13:13:17 -07:00
Andrea Cappelli
6c40e49712
Run mlm pad to multiple for fp16 ( #11128 )
...
* Add mlm collator pad to multiple option (#10627 )
* Use padding to 8x in run mlm (#10627 )
2021-04-08 16:12:49 -04:00
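A minimal sketch of the collator option the commit refers to, assuming `DataCollatorForLanguageModeling` accepts `pad_to_multiple_of` as in current releases; the checkpoint name is only an example.

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Pad every batch to a multiple of 8 so fp16 tensor cores get well-shaped inputs.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,
    pad_to_multiple_of=8,
)

batch = data_collator(
    [tokenizer("a short example"), tokenizer("another, slightly longer example")]
)
print(batch["input_ids"].shape)  # sequence length is a multiple of 8
```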
Sylvain Gugger
dfed4ec263
Don't duplicate logs in TensorBoard and handle --use_env ( #11141 )
2021-04-08 16:12:36 -04:00
Philipp Schmid
9c9b8e707b
Updates SageMaker docs for updating DLCs ( #11140 )
2021-04-08 16:05:53 -04:00
Lysandre Debut
ba2cf5f90d
Add fairscale and deepspeed back to the CI ( #11147 )
...
* Add fairscale and deepspeed back to the CI
* Add deepspeed to single GPU tests
2021-04-08 11:36:45 -07:00
Stas Bekman
1ed24afe91
[trainer] solve "scheduler before optimizer step" warning ( #11144 )
...
* solve "scheduler before optimizer step" warning
* style
* correct the state evaluation test
2021-04-08 11:28:48 -07:00
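The warning in question is PyTorch's complaint when `lr_scheduler.step()` runs in an iteration where `optimizer.step()` was skipped (for example after an AMP overflow). A generic sketch of the ordering the fix enforces, not the Trainer's actual code:

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
scaler = torch.cuda.amp.GradScaler(enabled=False)  # enabled=True for fp16 GPU runs

for step in range(3):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).sum()
    scaler.scale(loss).backward()
    scaler.step(optimizer)            # may skip the real optimizer step on overflow
    scale_before = scaler.get_scale()
    scaler.update()
    # Only advance the LR schedule if the optimizer step was not skipped,
    # which avoids the "scheduler before optimizer step" warning.
    if scaler.get_scale() >= scale_before:
        scheduler.step()
```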
Julien Demouth
02ec02d6d3
Add nvidia megatron models ( #10911 )
...
* Add support for NVIDIA Megatron models
* Add support for NVIDIA Megatron GPT2 and BERT
Add the megatron_gpt2 model. That model reuses the existing GPT2 model. This
commit includes a script to convert a Megatron-GPT2 checkpoint downloaded
from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details.
Add the megatron_bert model. That model is implemented as a modification of
the existing BERT model in Transformers. This commit includes a script to
convert a Megatron-BERT checkpoint downloaded from NVIDIA GPU Cloud. See
examples/megatron-models/README.md for details.
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Remove model.half in tests + add "# Copied ..."
Remove the model.half() instruction which makes tests fail on the CPU.
Add a comment "# Copied ..." before many classes in the model to enable automatic
tracking in CI between the new Megatron classes and the original Bert ones.
* Fix issues
* Fix Flax/TF tests
* Fix copyright
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/megatron_bert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/megatron_gpt2.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Resolve most of 'sgugger' comments
* Fix conversion issue + Run make fix-copies/quality/docs
* Apply suggestions from code review
* Causal LM & merge
* Fix init
* Add CausalLM to last auto class
Co-authored-by: Julien Demouth <jdemouth@nvidia.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-04-08 14:09:11 -04:00
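A hedged sketch of how a converted checkpoint might be loaded, assuming the conversion script has already produced a directory of Transformers-format weights; the directory name is a placeholder, and examples/megatron-models/README.md has the actual conversion steps.

```python
from transformers import BertTokenizer, MegatronBertForMaskedLM

# "megatron-bert-345m-converted/" is a placeholder for the output of the
# convert_megatron_bert_checkpoint.py script mentioned in the commit.
model = MegatronBertForMaskedLM.from_pretrained("megatron-bert-345m-converted")
tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")

inputs = tokenizer("Paris is the [MASK] of France.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)
```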
Stas Bekman
c6d664849b
[DeepSpeed] ZeRO Stage 3 ( #10753 )
...
* synced gpus
* fix
* fix
* need to use t5-small for quality tests
* notes
* complete merge
* fix a disappearing std stream problem
* start zero3 tests
* wip
* tune params
* sorting out the pre-trained model loading
* reworking generate loop wip
* wip
* style
* fix tests
* split the tests
* refactor tests
* wip
* parameterized
* fix
* workout the resume from non-ds checkpoint pass + test
* cleanup
* remove no longer needed code
* split getter/setter functions
* complete the docs
* suggestions
* gpus and their compute capabilities link
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* style
* remove invalid paramgd
* automatically configure zero3 params that rely on hidden size
* make _get_resized_embeddings zero3-aware
* add test exercising resize_token_embeddings()
* add docstring
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-08 09:53:01 -07:00
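A minimal sketch of a ZeRO stage 3 configuration as it might be passed to the Trainer via --deepspeed; only the stage and fp16 keys are shown, and the batch size value is an assumption.

```python
import json

# Minimal DeepSpeed config enabling ZeRO stage 3; real configs usually also set
# optimizer, scheduler and the stage3_* tuning knobs.
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 3},
    "train_micro_batch_size_per_gpu": 8,
}

with open("ds_config_zero3.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# Hypothetical launch, e.g.:
#   deepspeed examples/seq2seq/run_translation.py --deepspeed ds_config_zero3.json ...
```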
Stas Bekman
acc851e1ff
[run_clm] clarify why we get the tokenizer warning on long input ( #11145 )
...
* clarify why we get the warning here
* Update examples/language-modeling/run_clm.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* wording
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-08 09:46:28 -07:00
Yusuke Mori
5bf5d50c8d
Typo fix of the name of BertLMHeadModel in BERT doc ( #11133 )
2021-04-08 08:22:58 -04:00
Jannis Born
f8e90d6fb9
Fix typing error in Trainer class (prediction_step) ( #11138 )
...
* fix: docstrings in prediction_step
* ci: Satisfy line length requirements
* ci: character length requirements
2021-04-08 08:22:25 -04:00
Sylvain Gugger
ffe0761777
Fix and refactor check_repo ( #11127 )
2021-04-07 17:56:21 -04:00
Philipp Schmid
3fd7eee18f
Adds use_auth_token with pipelines ( #11123 )
...
* added model_kwargs to infer_framework_from_model
* added model_kwargs to tokenizer
* added use_auth_token as named parameter
* added dynamic get for use_auth_token
2021-04-07 20:32:59 +02:00
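A hedged sketch of the new keyword, assuming `use_auth_token` is forwarded from `pipeline(...)` to the model and tokenizer downloads; the model id is a placeholder for a private repository.

```python
from transformers import pipeline

# "my-org/private-sentiment-model" is a placeholder for a private Hub repository.
classifier = pipeline(
    "sentiment-analysis",
    model="my-org/private-sentiment-model",
    use_auth_token=True,  # or an explicit token string
)
print(classifier("Authentication now works for private models."))
```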
Stas Bekman
1c15128312
[versions] handle version requirement ranges ( #11110 )
...
* handle version requirement ranges
* add mixed requirement test
* cleanup
2021-04-07 09:09:38 -07:00
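A small sketch of what a ranged requirement check might look like, assuming the internal `require_version` helper in `transformers.utils.versions` is the function being extended; the requirement string and hint are illustrative.

```python
# Assumed helper: transformers.utils.versions.require_version
from transformers.utils.versions import require_version

# A range spec combines a lower and an upper bound in a single requirement string.
require_version("tokenizers>=0.10.1,<0.11", "pip install -U tokenizers")
```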
Vasudev Gupta
7442801df5
fix tests ( #11109 )
2021-04-07 10:07:26 -04:00
Lysandre Debut
c0d97cee13
Adds a note to resize the token embedding matrix when adding special tokens ( #11120 )
...
* Adds a note to resize the token embedding matrix when adding special tokens
* Remove superfluous space
2021-04-07 10:06:45 -04:00
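The note being added boils down to the following pattern; a minimal sketch with a standard BERT checkpoint and made-up special tokens.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# After adding special tokens, the embedding matrix must be resized to match
# the new vocabulary size, otherwise the new token ids index out of range.
tokenizer.add_special_tokens({"additional_special_tokens": ["<ent>", "</ent>"]})
model.resize_token_embeddings(len(tokenizer))
```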
Sylvain Gugger
02f7c2fe66
Some styling of the training table in Notebooks ( #11118 )
2021-04-07 10:00:33 -04:00
Sylvain Gugger
11505fa139
Dummies multi backend ( #11100 )
...
* Replaces requires_xxx by one generic method
* Quality and update check_dummies
* Fix inits check
* Post-merge cleanup
2021-04-07 09:56:40 -04:00
Stas Bekman
424419f549
[examples] fix white space ( #11099 )
...
these get concatenated without whitespace, so fix it
2021-04-07 09:20:58 -04:00
Stas Bekman
c9035e4537
fix: The 'warn' method is deprecated ( #11105 )
...
* The 'warn' method is deprecated
* fix test
2021-04-07 09:20:06 -04:00
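For reference, the deprecation is in the standard library: `logging.Logger.warn` is a deprecated alias of `warning`, so the fix is a mechanical rename along these lines.

```python
import logging

logger = logging.getLogger(__name__)

# Deprecated spelling (emits a DeprecationWarning on recent Python versions):
# logger.warn("...")

# Preferred spelling:
logger.warning("The 'warn' method is deprecated, use 'warning' instead")
```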
Leo Gao
247bed3857
GPTNeo: handle padded wte ( #11079 )
...
* GPTNeo: handle padded wte
* Switch to config.vocab_size
* apply review suggestion
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-04-07 17:35:20 +05:30
cronoik
083ad7d46c
dead link fixed ( #11103 )
2021-04-07 07:50:47 -04:00
Sylvain Gugger
fd338abdeb
Style
2021-04-06 19:54:13 -04:00
SHYAM SUNDER KUMAR
aef4cf8c52
accelerate question answering examples with no trainer ( #11091 )
...
* accelerate question answering examples with no trainer
* removed train and eval flags also fixed fill np array function
* Update examples/question-answering/run_qa_beam_search_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update examples/question-answering/run_qa_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-06 19:35:21 -04:00
Sylvain Gugger
403d530eec
Auto feature extractor ( #11097 )
...
* AutoFeatureExtractor
* Init and first tests
* Tests
* Damn you gitignore
* Quality
* Defensive test for when not all backends are here
* Use pattern for Speech2Text models
2021-04-06 19:20:08 -04:00
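A hedged usage sketch of the new auto class, assuming it resolves the feature extractor from the checkpoint's config the way the other auto classes do; the checkpoint name is only an example.

```python
from transformers import AutoFeatureExtractor

# Resolves to the feature extractor class associated with this checkpoint.
feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
print(type(feature_extractor).__name__)
```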
Stas Bekman
520198f56f
[doc] gpt-neo ( #11098 )
...
make the example work
2021-04-06 16:42:06 -04:00
Lysandre
9853c5dd58
Development on v4.6.0dev0
2021-04-06 12:53:25 -04:00
Lysandre
4906a29f7f
Release v4.5.0
2021-04-06 12:37:47 -04:00
Suraj Patil
2a8115f083
[WIP] GPT Neo cleanup ( #10985 )
...
* better names
* add attention mixin
* all slow tests in one class
* make helper methods static so we can test
* add local attention tests
* better names
* doc
* apply review suggestions
2021-04-06 12:24:15 -04:00
Philipp Schmid
76800fb8e6
added new merged Trainer test ( #11090 )
2021-04-06 15:12:21 +02:00
Philipp Schmid
b219d6b5a5
added social thumbnail for docs ( #11083 )
2021-04-06 14:56:18 +02:00
Sylvain Gugger
6c1bee7d89
Link to new blog
2021-04-06 08:55:40 -04:00
Stas Bekman
f7328de46d
HF emoji unicode doesn't work in console ( #11081 )
...
It doesn't look like using 🤗 is a great idea for printing to console. See attachment.
This PR proposes to replace 🤗 with "HuggingFace" for an exception message.
@LysandreJik
2021-04-06 08:03:00 -04:00
Hemil Desai
6ab7d1a429
Add Readme for language modeling scripts with accelerate ( #11073 )
2021-04-05 20:56:12 -04:00
Sylvain Gugger
2199608ca6
Make a base init in FeatureExtractionMixin ( #11074 )
2021-04-05 18:02:28 -04:00
Sylvain Gugger
04ceee7d24
Fix distributed gather for tuples of tensors of varying sizes ( #11071 )
2021-04-05 16:21:49 -04:00
Sylvain Gugger
f05a8a0c5e
Document common config attributes ( #11070 )
2021-04-05 15:29:01 -04:00
Sylvain Gugger
090e3e6896
Add center_crop to ImageFeatureExtractionMixin ( #11066 )
2021-04-05 15:28:51 -04:00
konstin
abb7430003
Replace pkg_resources with importlib_metadata ( #11061 )
...
* Replace pkg_resources with importlib_metadata
Fixes #10964. The other reason for this change is that pkg_resources has been deprecated (commit 8fe85c22ce) in favor of importlib_metadata.
* Reduce to a single importlib_metadata import switch
* Trigger CI
Co-authored-by: Stas Bekman <stas@stason.org>
2021-04-05 12:12:19 -07:00
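A short sketch of the replacement pattern, assuming the usual fallback from the stdlib module to the backport; the package name is just an example.

```python
# Prefer the stdlib module (Python 3.8+), fall back to the importlib_metadata backport.
try:
    import importlib.metadata as importlib_metadata
except ImportError:
    import importlib_metadata  # type: ignore

try:
    version = importlib_metadata.version("tokenizers")
except importlib_metadata.PackageNotFoundError:
    version = None
print("tokenizers version:", version)
```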
Hemil Desai
b51b87c41d
Add examples/language_modeling/run_clm_no_trainer.py ( #11026 )
...
* Initial draft for clm no trainer
* Remove unwanted args
* Fix bug
* Update examples/language-modeling/run_clm_no_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-05 12:27:52 -04:00
Amala Deshmukh
e1c02e018c
Add example for registering callbacks with trainers ( #10928 )
...
* Add example for callback registry
Resolves: #9036
* Update callback registry documentation
* Added comments for other ways to register callback
2021-04-05 12:27:23 -04:00
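A minimal sketch of the registration pattern the example documents, using the public `TrainerCallback` API; the callback body is illustrative only.

```python
from transformers import Trainer, TrainerCallback

class PrintLossCallback(TrainerCallback):
    """Illustrative callback: print the loss every time the Trainer logs."""
    def on_log(self, args, state, control, logs=None, **kwargs):
        if logs and "loss" in logs:
            print(f"step {state.global_step}: loss = {logs['loss']:.4f}")

# Either pass the callback at construction time...
# trainer = Trainer(model=model, args=training_args, callbacks=[PrintLossCallback])
# ...or register it on an existing Trainer instance:
# trainer.add_callback(PrintLossCallback())
```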
Lysandre Debut
9f4e0c23d6
Documentation about loading a fast tokenizer within Transformers ( #11029 )
...
* Documentation about loading a fast tokenizer within Transformers
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-05 10:51:16 -04:00
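A hedged sketch of the workflow being documented: building a tokenizer with the `tokenizers` library and wrapping it in `PreTrainedTokenizerFast` via `tokenizer_object`; the training corpus is a stand-in.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers
from transformers import PreTrainedTokenizerFast

# Train a tiny BPE tokenizer with the `tokenizers` library.
tok = Tokenizer(models.BPE(unk_token="[UNK]"))
tok.pre_tokenizer = pre_tokenizers.Whitespace()
trainer = trainers.BpeTrainer(special_tokens=["[UNK]", "[PAD]"])
tok.train_from_iterator(["a tiny stand-in corpus", "for this example"], trainer)

# Wrap it so it can be used anywhere a Transformers tokenizer is expected.
fast_tokenizer = PreTrainedTokenizerFast(
    tokenizer_object=tok, unk_token="[UNK]", pad_token="[PAD]"
)
print(fast_tokenizer("a tiny stand-in corpus")["input_ids"])
```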
Sylvain Gugger
6c25f5228e
Refactor AutoModel classes and add Flax Auto classes ( #11027 )
...
* Refactor AutoModel classes and add Flax Auto classes
* Add new objects to the init
* Fix hubconf and sort models
* Fix TF tests
* Missing coma
* Update src/transformers/models/auto/auto_factory.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Fix init
* Fix dummies
* Other init to fix
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-05 10:11:28 -04:00
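A hedged sketch of the Flax auto class covered by the refactor, assuming the checkpoint ships Flax weights on the Hub; the checkpoint name is an example.

```python
from transformers import AutoTokenizer, FlaxAutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = FlaxAutoModel.from_pretrained("bert-base-cased")  # assumes Flax weights exist

inputs = tokenizer("Flax auto classes mirror the PyTorch ones.", return_tensors="np")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```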
Lysandre Debut
eb3479e7cf
Some models have no tokenizers ( #11064 )
2021-04-05 09:37:49 -04:00
Lysandre Debut
773e4c7263
Remove unnecessary space ( #11060 )
2021-04-05 09:36:20 -04:00