transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	b45192ec47	Fix big model inference for T5 models in float16 (#22095 ) * Fix big model inference for T5 models in float16 * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Style * Trigger CI with latest release --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-03-14 09:20:16 -04:00
Nicola Procopio	7f5ad6c35b	Translation Italian: perf_train_cpu and perf_train_cpu_many (#22151 ) * added translated files added perf_train_cpu and perf_train_cpu_many * updated toctree	2023-03-14 11:09:36 +00:00
Yih-Dar	ff88703501	Update 2 doctest expected values for torch 2.0.0 (#22148 ) update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-14 09:13:16 +00:00
Alara Dirik	cdddfbffa1	Add ConvNeXT V2 (#21679 ) * Add ConvNeXt V2 to transformers * TF model is separated from the PR to fix issues	2023-03-14 12:08:14 +03:00
Yih-Dar	6c2ad00c46	Move `is_pipeline_test_to_skip` to specific model test classes (#21999 ) * Move `is_pipeline_test_to_skip` to specific model test classes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-14 10:03:02 +01:00
Arthur	2beabd24f0	[🛠️] Fix-whisper-breaking-changes (#21965 ) * temp fix * temporary fix * update * fix tests * fixup * update based on reveiew Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * update to fix tests * update docstring --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-03-14 09:23:48 +01:00
MichaelRipa	101a6cd276	docs: New terms and updates to glossary (#21982 ) * Updated glossary with new terms, added abbreviations for certain terms and merged autoencoding models, autoregressive models and causal language modeling into encoder and decoder models * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added link to 'Pipeline for inference' tutorial * Trigger CI * Update docs/source/en/glossary.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added entry for self supervised learning, added deleted entries + fixed broken links * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 19:09:37 -04:00
Yih-Dar	ba9e0191de	Prepare daily CI for torch 2.0.0 (#22135 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-13 22:21:15 +01:00
Patrick von Platen	f780557a34	[Safetensors] Add explicit flag to from pretrained (#22083 ) * [Safetensors] Add explicit flag to from pretrained * add test * remove @ * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 21:39:06 +01:00
Sylvain Gugger	3a35937ede	Remove backend check for torch.compile (#22140 ) * Remove backend enforcment for torch.compile * Update error * Update src/transformers/training_args.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Style --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2023-03-13 16:34:00 -04:00
Stas Bekman	618697ef53	[deepspeed docs] Activation Checkpointing (#22099 ) * [deepspeed docs] Activation Checkpointing * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update deepspeed.mdx --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 12:52:42 -07:00
Stas Bekman	5b85add7d5	[trainer] fix bug in grad accum with multiple epochs (#22098 ) * [trainer] fix bug in grad accum * comment out debug * fix one-off * rename counter	2023-03-13 12:51:40 -07:00
Sylvain Gugger	1c801d65eb	Enforce same behavior as PyTorch 2.0 for older versions (#22136 )	2023-03-13 15:50:50 -04:00
Joao Gante	e16cbe88ae	Trainer: let generate pick its inputs (#22108 ) * Let generate pick its inputs * fix squad seq2seq example	2023-03-13 19:00:25 +00:00
Younes Belkada	d979cf6efd	[`Whiper`] add `get_input_embeddings` to `WhisperForAudioClassification` (#22133 ) * add `get_input_embeddings` to `WhisperForAudioClassification` * add common tests * fix another common test * Update tests/models/whisper/test_modeling_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-03-13 19:46:01 +01:00
bishmdl76	987972377d	Update configuration_align.py (projected_dim=640) (#22139 ) Update configuration_align.py updated projected_dim=640 from 512 in arguments of AlignConfig	2023-03-13 14:12:12 -04:00
Yih-Dar	54ee56b15b	Add a new script to check model testers' config (#22063 ) * Add script --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-13 19:11:19 +01:00
mollerup23	a096eaca65	Adding Type Hints to TF_Pegasus model (#21941 ) * Adding Type Hints to TF_Pegasus model * Updated some parameters per maintainer comments	2023-03-13 15:58:29 +00:00
Sylvain Gugger	6cb5132a7f	Fix doc link for MGP-STR (#22138 )	2023-03-13 15:26:50 +00:00
Maria Khalusova	8def252de2	Zero-shot image classification task guide (#22132 ) * WIP * WIP * manual inference example * make style * Apply suggestions from code review Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> --------- Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>	2023-03-13 10:57:17 -04:00
Karim Foda	e61081e725	Fix gradient checkpointing bug in trocr (#22126 ) * Fix gradient checkpointing bug in trocr * Fix format * Update src/transformers/models/trocr/modeling_trocr.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-03-13 15:45:47 +01:00
Karim Foda	ef74e7e783	Fix gradient checkpointing bug in LongT5 (#22130 )	2023-03-13 14:06:17 +00:00
Karim Foda	c1db6a3bab	Fix gradient checkpointing bug in xmod (#22129 )	2023-03-13 15:05:11 +01:00
Younes Belkada	6652e7da0d	[`Blip2`] skip accelerate test (#22124 ) skip accelerate test	2023-03-13 15:03:21 +01:00
Nicola Procopio	dd3a0580a6	Added big_models.mdx italian translation #17600 (#22115 ) * updated toctree * italian translation big_model.mdx * italian translation big_models	2023-03-13 10:02:03 -04:00
Karim Foda	0768c5e274	Fix gradient checkpointing bug in xlm_roberta_xl (#22128 )	2023-03-13 13:52:34 +00:00
Karim Foda	4c14c1f47b	Fix gradient checkpointing bug in Trajectory Transformer (#22125 )	2023-03-13 13:50:02 +00:00
Karim Foda	d0876a095f	Fix gradient checkpointing bug in xglm (#22127 )	2023-03-13 13:49:23 +00:00
Alex Calabrese	0c883766bd	Add pr_checks.mdx Italian translation (#17459 ) (#22116 ) * Add pr_checks.mdx Italian translation (#17459) * Updated pr_checks.mdx Italian translation (#17459)	2023-03-13 09:24:34 -04:00
wangpeng	102b5ff4a8	add new model of MGP-STR (#21418 ) * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * add new model of MGP-STR * fix the check failings * remove torch and numpy from mgp_tokenization * remove unused import from modeling_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str.py * add test_processing_mgp_str * add test_processing_mgp_str * add test_processing_mgp_str * rm test_processing_mgp_str and add softmax outs to model * rewrite the code of mgp-str according to PR suggestions * rewrite the code of mgp-str according to PR suggestions * remove representation_size from MGPSTRConfig * reformat configuration_mgp_str.py * format test_processor_mgp_str.py * add test for tokenizer and complete model/processer test and model file * rm Unnecessary tupple in modeling_mgp_str * reduce hidden_size/layers/label_size in test_model * add integration tests and change MGPSTR to Mgpstr * add test for logit values * reformat test model file --------- Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>	2023-03-13 10:11:31 +00:00
Alara Dirik	32e3466d38	Add AutoModelForZeroShotImageClassification (#22087 ) Adds AutoModelForZeroShotImageClassification to transformers	2023-03-13 12:46:14 +03:00
Sanchit Gandhi	b90fbc7e0b	[Whisper] Remove embed_tokens from encoder docstring (#21996 ) * [Whisper] Remove embed_tokens from encoder docstring * new line to retrigger CI * remove new line	2023-03-11 14:03:36 +01:00
Yih-Dar	2f320661f3	Revert "[GPT2] Propose fix for #21080 " (#22093 ) Revert "[GPT2] Propose fix for #21080 (#21853)" to avoid CI failure This reverts commit `a3fef89b26`.	2023-03-10 22:08:21 +01:00
Sylvain Gugger	499770c088	Fix imports of TF MobileViT (#22065 ) * Fix imports of TF MobileViT * Fix copies	2023-03-10 14:46:34 -05:00
Maria Khalusova	bdec2768bd	GPT-J specific half precision on CPU note (#22086 ) * re: #21989 * update re: #21989 * removed cpu option * make style	2023-03-10 14:03:43 -05:00
Dean Wyatte	2f4cdd97f5	handle numpy inputs in whole word mask data collator (#22032 )	2023-03-10 10:50:29 -05:00
J-shang	a70da86b84	Fix hint in src/transformers/modeling_utils.py (#22074 ) fix hint	2023-03-10 08:56:42 -05:00
Karim Foda	419d979f7f	Fix gradient checkpointing bug in Speecht5 (#22080 ) * Fix gradient checkpointing bug in Speecht5 * Update modeling_speech_to_text.py * Update src/transformers/models/speech_to_text/modeling_speech_to_text.py * Fix change errors --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-03-10 13:36:09 +00:00
Joao Gante	7014fc360d	Generate - Fix broken documentation links (#22078 ) fix broken links	2023-03-10 13:28:30 +00:00
Kevin Jiang	ade26bf991	Fix small typo in flan-ul2.mdx (#22068 ) * Update flan-ul2.mdx * Update flan-ul2.mdx	2023-03-10 07:44:45 -05:00
Arthur	a3fef89b26	[GPT2] Propose fix for #21080 (#21853 ) * Make sure position ids are masked * test that padded input produce the same results * fix failing tests * fixup * fix batch test	2023-03-10 07:15:25 -05:00
Karim Foda	eee195b3aa	Fix gradient checkpointing bug in switch transformer (#22081 )	2023-03-10 11:31:08 +00:00
Karim Foda	b9273353dc	Fix gradient checkpointing bug in Speech2Text (#22079 ) * Fix gradient checkpointing bug in Speech2Text * Update modeling_speech_to_text.py * Update modeling_speech_to_text_2.py	2023-03-10 11:30:42 +00:00
Sylvain Gugger	a9bd5df16a	Add a progress bar for the total download of shards (#22062 ) * Add a progress bar for the total download of shards * Check for no cache at all * Fix check	2023-03-09 16:58:03 -05:00
aws-sangeetha	1a5fc300f4	Fix case when using --gradient_accumulation_steps with DDP disabled. (#22007 ) Co-authored-by: EC2 Default User <ec2-user@ip-172-31-42-72.us-west-2.compute.internal>	2023-03-09 14:31:58 -05:00
Yih-Dar	6d9031f285	Update tiny model creation script (#22058 ) Update the script Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-09 19:53:54 +01:00
Sylvain Gugger	7a2b915e92	Add setters by type of args to TrainingArguments (#21570 ) * Add setters by type of args to TrainingArguments * Define more setters	2023-03-09 13:13:23 -05:00
Yih-Dar	ab81d31d20	Skip 3 tests for `WhisperEncoderModelTest` (#22060 ) * skip 3 tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-09 19:09:23 +01:00
Jiali Mei	8434cb878e	Edit the docstring of `image_processing_donut` to match code (#22033 ) * Edit the docstring of `image_processing_donut` to match code * improve style * more style improvement after installing quality	2023-03-09 17:35:43 +00:00
Stas Bekman	ec24132b6c	[deepspeed] offload + non-cpuadam optimizer exception (#22043 ) * [deepspeed] offload + non-cpuadam optimizer exception * flip * revert min version	2023-03-09 08:12:57 -08:00

12327 Commits