transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 20:18:24 +06:00

Author	SHA1	Message	Date
wangpeng	af1c864cdc	fix code example in mgp-str doc (#22219 ) Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>	2023-03-17 09:40:06 +00:00
Kevin Turner	33d033d694	fix typos in llama.mdx (#22223 )	2023-03-17 08:43:18 +00:00
Yih-Dar	97a3d16a69	Hotfix for natten issue with torch 2.0.0 on CircleCI (#22218 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 23:57:26 +01:00
Yih-Dar	5110e5748e	🔥py38 + torch 2 🔥🔥🔥🚀 (#22204 ) * py38 + torch 2 * increment cache versions --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 22:59:23 +01:00
Susnato Dhar	fb366b9a2a	fixes a typo in WhisperFeatureExtractor docs. (#22208 ) * fixes a typo * .	2023-03-16 16:08:05 +00:00
Younes Belkada	da3ba3a167	[`XGLM`] Add `accelerate` support for XGLM (#22207 ) * add `accelerate` support for XGLM * fix order	2023-03-16 16:18:05 +01:00
SatyaJandhyalaAtMS	a88a4dae19	Temporarily fix ONNX model exporting error (#21830 ) * Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143 * Reduced column width * Fix formatting. * Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143" This reverts commit 6e95a108042118d204da447729f3834affa354fc. * Fix export error. * Revert "Fix formatting." This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8. * Propagated changes made in SwinV2 to Swin2SR	2023-03-16 10:56:26 -04:00
Yih-Dar	4c5c0af7e5	Update tiny model creation script (#22202 ) * Update UNCONVERTIBLE_MODEL_ARCHITECTURES * Deal with 2 model tester classes in single test file * Deal with 2 model tester classes in single test file * Deal with 2 model tester classes in single test file * make style and quality --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 14:21:58 +01:00
Jason Phang	464d420775	LLaMA Implementation (#21955 ) * LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by: Stella Biderman <stellabiderman@gmail.com>	2023-03-16 09:01:15 -04:00
Jason Phang	0041be5b3d	LLaMA Implementation (#21955 ) * LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by: Stella Biderman <stellabiderman@gmail.com>	2023-03-16 09:00:53 -04:00
Baelish03	09922da4a7	Italian Translation of migration.mdx (#22183 ) * Tranlstion Italian: migration * Update migration.mdx minor fixes * Update _toctree.yml * Delete migration.mdx * Add italian translation of migration.mdx * Update of migration.mdx translation and toctree	2023-03-16 12:00:07 +00:00
Yih-Dar	52a57f7c7c	Update expected values in `MgpstrModelIntegrationTest` (#22195 ) Update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 11:48:52 +00:00
Alara Dirik	1485bd9c02	Fix typo in Align docs (#22199 ) Fix align docs typo	2023-03-16 13:41:48 +03:00
Yih-Dar	1c4a9acc73	Fix DeepSpeed CI (#22194 ) * Deal with torch-tensorrt --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 05:52:40 +01:00
Prathik Rao	7c4999e495	t5 remove data dependency (#22097 ) * t5 remove data dependency * make style * make fix-copies --------- Co-authored-by: Prathik Rao <prathikrao@microsoft.com>	2023-03-15 16:11:15 -04:00
Anahita Bhiwandiwalla	16121bae5c	Update BridgeTowerForContrastiveLearning (#22145 ) * Use return_loss for BridgeTowerForContrastiveLearning, add example * fix tests * Update example in BridgeTowerForContrastiveLearning * Update test_modeling_bridgetower.py * update model output format * minor update * Update src/transformers/models/bridgetower/modeling_bridgetower.py * make style --------- Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by: Tiep Le <tiep.le@intel.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-15 20:54:38 +01:00
Sylvain Gugger	42ad693b7b	Regression pipeline device (#22190 ) * Fix regression in pipeline when device=-1 is passed * Add regression test	2023-03-15 14:13:38 -04:00
amyeroberts	737681477c	Revert 22152 MaskedImageCompletionOutput changes (#22187 ) Revert changes	2023-03-15 18:37:23 +01:00
浮躁的小螃蟹	7b0e2cfdfb	Fix: unfinished_sequences with correct device (#22184 ) Fix: unfinished_sequences with correct device The original code was causing errors when running torch.jit.trace due to the tensor options being incorrect. I fixed this by using torch.ones to create a tensor with the correct device and dtype. This should resolve the issue with running torch.jit.trace.	2023-03-15 16:27:19 +00:00
Sylvain Gugger	f7329751fe	Run all tests by default (#22162 )	2023-03-14 17:30:43 -04:00
Sylvain Gugger	b7036f4912	Load optimizer state on CPU to avoid CUDA OOM (#22159 )	2023-03-14 17:30:32 -04:00
Sylvain Gugger	ebdb185bef	v4.28.0.dev0	2023-03-14 13:49:10 -04:00
Sylvain Gugger	c52c5282ef	Revert "Enforce same behavior as PyTorch 2.0 for older versions" (#22163 ) Revert "Enforce same behavior as PyTorch 2.0 for older versions (#22136)" This reverts commit `1c801d65eb`.	2023-03-14 13:45:46 -04:00
Stas Bekman	085bf5c1fe	[trainer] add `--optim adamw_torch_fused` for pt-2.0+ (#22144 ) * [trainer] add --optim adamw_torch_fused * change optim default * deal with non-torch * revert default change; prep; add fp16/amp assert * typo * typo	2023-03-14 10:22:03 -07:00
amyeroberts	c6318c3788	to_pil - don't rescale if int and in range 0-255 (#22158 ) * Don't rescale if in and in range 0-255 * Raise value error if int values too large * Update tests/test_image_transforms.py * Update tests/test_image_transforms.py	2023-03-14 15:43:44 +00:00
Alara Dirik	3b22bfbc6a	Create MaskedImageCompletionOutput and fix ViT docs (#22152 ) * create MaskedImageCompletionOutput * fix bugs * fix bugs	2023-03-14 13:55:18 +00:00
Sylvain Gugger	b45192ec47	Fix big model inference for T5 models in float16 (#22095 ) * Fix big model inference for T5 models in float16 * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Style * Trigger CI with latest release --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-03-14 09:20:16 -04:00
Nicola Procopio	7f5ad6c35b	Translation Italian: perf_train_cpu and perf_train_cpu_many (#22151 ) * added translated files added perf_train_cpu and perf_train_cpu_many * updated toctree	2023-03-14 11:09:36 +00:00
Yih-Dar	ff88703501	Update 2 doctest expected values for torch 2.0.0 (#22148 ) update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-14 09:13:16 +00:00
Alara Dirik	cdddfbffa1	Add ConvNeXT V2 (#21679 ) * Add ConvNeXt V2 to transformers * TF model is separated from the PR to fix issues	2023-03-14 12:08:14 +03:00
Yih-Dar	6c2ad00c46	Move `is_pipeline_test_to_skip` to specific model test classes (#21999 ) * Move `is_pipeline_test_to_skip` to specific model test classes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-14 10:03:02 +01:00
Arthur	2beabd24f0	[🛠️] Fix-whisper-breaking-changes (#21965 ) * temp fix * temporary fix * update * fix tests * fixup * update based on reveiew Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * update to fix tests * update docstring --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-03-14 09:23:48 +01:00
MichaelRipa	101a6cd276	docs: New terms and updates to glossary (#21982 ) * Updated glossary with new terms, added abbreviations for certain terms and merged autoencoding models, autoregressive models and causal language modeling into encoder and decoder models * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added link to 'Pipeline for inference' tutorial * Trigger CI * Update docs/source/en/glossary.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added entry for self supervised learning, added deleted entries + fixed broken links * Update docs/source/en/glossary.mdx Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 19:09:37 -04:00
Yih-Dar	ba9e0191de	Prepare daily CI for torch 2.0.0 (#22135 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-13 22:21:15 +01:00
Patrick von Platen	f780557a34	[Safetensors] Add explicit flag to from pretrained (#22083 ) * [Safetensors] Add explicit flag to from pretrained * add test * remove @ * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 21:39:06 +01:00
Sylvain Gugger	3a35937ede	Remove backend check for torch.compile (#22140 ) * Remove backend enforcment for torch.compile * Update error * Update src/transformers/training_args.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Style --------- Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2023-03-13 16:34:00 -04:00
Stas Bekman	618697ef53	[deepspeed docs] Activation Checkpointing (#22099 ) * [deepspeed docs] Activation Checkpointing * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update deepspeed.mdx --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-13 12:52:42 -07:00
Stas Bekman	5b85add7d5	[trainer] fix bug in grad accum with multiple epochs (#22098 ) * [trainer] fix bug in grad accum * comment out debug * fix one-off * rename counter	2023-03-13 12:51:40 -07:00
Sylvain Gugger	1c801d65eb	Enforce same behavior as PyTorch 2.0 for older versions (#22136 )	2023-03-13 15:50:50 -04:00
Joao Gante	e16cbe88ae	Trainer: let generate pick its inputs (#22108 ) * Let generate pick its inputs * fix squad seq2seq example	2023-03-13 19:00:25 +00:00
Younes Belkada	d979cf6efd	[`Whiper`] add `get_input_embeddings` to `WhisperForAudioClassification` (#22133 ) * add `get_input_embeddings` to `WhisperForAudioClassification` * add common tests * fix another common test * Update tests/models/whisper/test_modeling_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-03-13 19:46:01 +01:00
bishmdl76	987972377d	Update configuration_align.py (projected_dim=640) (#22139 ) Update configuration_align.py updated projected_dim=640 from 512 in arguments of AlignConfig	2023-03-13 14:12:12 -04:00
Yih-Dar	54ee56b15b	Add a new script to check model testers' config (#22063 ) * Add script --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-13 19:11:19 +01:00
mollerup23	a096eaca65	Adding Type Hints to TF_Pegasus model (#21941 ) * Adding Type Hints to TF_Pegasus model * Updated some parameters per maintainer comments	2023-03-13 15:58:29 +00:00
Sylvain Gugger	6cb5132a7f	Fix doc link for MGP-STR (#22138 )	2023-03-13 15:26:50 +00:00
Maria Khalusova	8def252de2	Zero-shot image classification task guide (#22132 ) * WIP * WIP * manual inference example * make style * Apply suggestions from code review Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> --------- Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>	2023-03-13 10:57:17 -04:00
Karim Foda	e61081e725	Fix gradient checkpointing bug in trocr (#22126 ) * Fix gradient checkpointing bug in trocr * Fix format * Update src/transformers/models/trocr/modeling_trocr.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-03-13 15:45:47 +01:00
Karim Foda	ef74e7e783	Fix gradient checkpointing bug in LongT5 (#22130 )	2023-03-13 14:06:17 +00:00
Karim Foda	c1db6a3bab	Fix gradient checkpointing bug in xmod (#22129 )	2023-03-13 15:05:11 +01:00
Younes Belkada	6652e7da0d	[`Blip2`] skip accelerate test (#22124 ) skip accelerate test	2023-03-13 15:03:21 +01:00

15053 Commits