transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-06 06:10:04 +06:00

Author	SHA1	Message	Date
Adilzhan Ismailov	e2b6df7971	[LLaVa] Add past_key_values to _skip_keys_device_placement to fix multi-GPU dispatch (#28051 ) Add past_key_values to _skip_keys_device_placement for LLaVa	2023-12-15 14:05:20 +00:00
Yoach Lacombe	deb72cb6d9	Skip M4T `test_retain_grad_hidden_states_attentions` (#28060 ) * skip test from SpeechInput * refine description of skip	2023-12-15 13:39:16 +00:00
Younes Belkada	d269c4b2d7	[`Mixtral`] update conversion script to reflect new changes (#28068 ) * Update convert_mixtral_weights_to_hf.py * forward contrib credits from original fix --------- Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com>	2023-12-15 14:05:20 +01:00
Cylis	70a127a37a	doc: Correct spelling mistake (#28064 )	2023-12-15 13:01:39 +00:00
Yoach Lacombe	c817c17dbe	Remove SpeechT5 deprecated argument (#28062 )	2023-12-15 12:15:06 +00:00
Sanchit Gandhi	6af3ce7757	[Flax LLaMA] Fix attn dropout (#28059 )	2023-12-15 10:57:36 +00:00
Sanchit Gandhi	7e876dca54	[Flax BERT] Update deprecated 'split' method (#28012 ) * [Flax BERT] Update deprecated 'split' method * fix copies	2023-12-15 10:57:18 +00:00
Younes Belkada	e737446ee6	[`Modeling` / `Mixtral`] Fix GC + PEFT issues with Mixtral (#28061 ) fix for mistral	2023-12-15 11:34:42 +01:00
Younes Belkada	1e20931765	[`FA-2`] Fix fa-2 issue when passing `config` to `from_pretrained` (#28043 ) * fix fa-2 issue * fix test * Update src/transformers/modeling_utils.py Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> * clenaer fix * up * add more robust tests * Update src/transformers/modeling_utils.py Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> * fixup * Update src/transformers/modeling_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * pop * add test --------- Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-12-15 11:08:27 +01:00
amyeroberts	1a585c1222	Remove warning when Annotion enum is created (#28048 ) Remove warning when enum is created	2023-12-14 19:50:20 +00:00
Matt	3060899be5	Replace build() with build_in_name_scope() for some TF tests (#28046 ) Replace build() with build_in_name_scope() for some tests	2023-12-14 17:42:25 +00:00
Matt	050e0b44f6	Proper build() methods for TF (#27794 ) * Add a convenience method for building in your own name scope * Second attempt at auto layer building * Revert "Second attempt at auto layer building" This reverts commit e03a3aaecf9ec41a805582b83cbdfe3290a631be. * Attempt #3 * Revert "Attempt #3" This reverts commit b9df7a0857560d29b5abbed6127d9e9eca77cf47. * Add missing attributes that we're going to need later * Add some attributes we're going to need later * A fourth attempt! Feel the power flow through you! * Revert "A fourth attempt! Feel the power flow through you!" This reverts commit 6bf4aaf3875d6f28485f50187617a4c616c8aff7. * Add more values we'll need later * TF refactor that we'll need later * Revert "TF refactor that we'll need later" This reverts commit ca07202fb5b7b7436b893baa8d688b4f348ea7b9. * Revert "Revert "TF refactor that we'll need later"" This reverts commit 1beb0f39f293ed9c27594575e1c849aadeb15c13. * make fixup * Attempt five! * Revert "Attempt five!" This reverts commit 3302207958dfd0374b0447a51c06eea51a506044. * Attempt six - this time don't add empty methods * Revert "Attempt six - this time don't add empty methods" This reverts commit 67d60129be75416b6beb8f47c7d38d77b18d79bb. * Attempt seven - better base model class detection! * Revert "Attempt seven - better base model class detection!" This reverts commit 5f14845e92ea0e87c598da933bfbfee10f553bc9. * Another attribute we'll need later * Try again with the missing attribute! * Revert "Try again with the missing attribute!" This reverts commit 760c6f30c5dffb3e04b0e73c34a77d1882a0fef7. * This is the attempt that will pierce the heavens! * Revert "This is the attempt that will pierce the heavens!" This reverts commit c868bb657de057aca7a5260350a3f831fc4dfee6. * Attempt seven - snag list is steadily decreasing * Revert "Attempt seven - snag list is steadily decreasing" This reverts commit 46fbd975deda64429bfb3e5fac4fc0370c00d316. * Attempt eight - will an empty snag list do it? 
* Revert "Attempt eight - will an empty snag list do it?" This reverts commit 7c8a3c2b083253649569e9877e02054ae5cec67b. * Fixes to Hubert issues that cause problems later * Trying again with Conv1D/SeparableConv fixes * Revert "Trying again with Conv1D/SeparableConv fixes" This reverts commit 55092bca952bc0f750aa1ffe246a640bf1e2036e. * Apply the build shape fixes to Wav2Vec2 as well * One more attempt! * Revert "One more attempt!" This reverts commit 5ac3e4cb01b9458cc93312873725f9444ae7261c. * Another attempt! * Revert "Another attempt!" This reverts commit ea16d890e019d7de8792a3b8e72f3b1c02adae50. * Let's see how many failures we get without the internal build method * Fix OpenAI * Fix MobileBERT * (Mostly) fix GroupVIT * Fix BLIP * One more BLIP fix * One more BLIP fix! * Fix Regnet * Finally fully fix GroupViT * Fix Data2Vec and add the new AdaptivePool * Fix Segformer * Fix Albert * Fix Deberta/DebertaV2 * Fix XLM * Actually fix XLM * Fix Flaubert * Fix lxmert * Fix Resnet * Fix ConvBERT * Fix ESM * Fix Convnext / ConvnextV2 * Fix SAM * Fix Efficientformer * Fix LayoutLMv3 * Fix speech_to_text * Fix mpnet and mobilevit * Fix Swin * Fix CTRL * Fix CVT * Fix DPR * Fix Wav2Vec2 * Fix T5 * Fix Hubert * Fix GPT2 * Fix Whisper * Fix DeiT * Fix the encoder-decoder / dual-encoder classes * make fix-copies * build in name scope * Fix summarization test * Fix tied weight names for BART + Blenderbot * Fix tied weight name building * Fix to TFESM weight building * Update TF SAM * Expand all the shapes out into Big Boy Shapes	2023-12-14 15:17:30 +00:00
Sanchit Gandhi	52c37882fb	[Seamless] Fix links in docs (#27905 ) * [Seamless] Fix links in docs * apply suggestions from code review	2023-12-14 15:14:13 +00:00
Joao Gante	388fd314d8	Generate: Mistral/Mixtral FA2 cache fix when going beyond the context window (#28037 )	2023-12-14 14:52:45 +00:00
James E. Dobson	0ede762636	Fixed spelling error in T5 tokenizer warning message (s/thouroughly/t… (#28014 ) Fixed spelling error in T5 tokenizer warning message (s/thouroughly/thoroughly)	2023-12-14 14:52:03 +00:00
Yoach Lacombe	bb1d0d0d9e	Fix languages covered by M4Tv2 (#28019 ) * correct language assessment + add tests * Update src/transformers/models/seamless_m4t_v2/modeling_seamless_m4t_v2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make style + simplify and enrich test --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-12-14 14:43:44 +00:00
Joao Gante	e2b16485f3	SeamlessM4T: `test_retain_grad_hidden_states_attentions` is flaky (#28035 )	2023-12-14 13:56:03 +00:00
Joao Gante	9e5c28c573	Generate: assisted decoding now uses `generate` for the assistant (#28030 ) generate refactor	2023-12-14 13:31:13 +00:00
Yih-Dar	dde6c427a1	Fix AMD push CI not triggered (#28029 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-14 12:44:00 +01:00
Younes Belkada	73de5108e1	[`core` / `modeling`] Fix training bug with PEFT + GC (#28031 ) fix trainign bug	2023-12-14 12:19:45 +01:00
Arthur	2788f8d8d5	[`SeamlessM4TTokenizer`] Safe import (#28026 ) safe import	2023-12-14 08:46:10 +01:00
Arthur	131a528be0	well well well (#28011 )	2023-12-14 06:51:04 +01:00
Marc Sun	17506d1256	add `modules_in_block_to_quantize` arg in GPTQconfig (#27956 ) * add inside_layer_modules arg * fix * change to modules_to_quantize_inside_block * fix * remane again * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * better docsting * fix again with less explanation * Update src/transformers/utils/quantization_config.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * style --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-12-13 14:13:44 -05:00
Rockerz	fe44b1f1a9	Add model_docs from cpmant.md to derformable_detr.md (#27884 ) * upfaste * Update * Update docs/source/ja/model_doc/deformable_detr.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/data2vec.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/cvt.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * add suggestions * Toctree update * remove git references * Update docs/source/ja/_toctree.yml Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/decision_transformer.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2023-12-13 10:02:29 -08:00
Lysandre	3ed3e3190c	Dev version	2023-12-13 18:29:31 +01:00
Aaron Jimenez	815ea8e8a2	[Doc] Spanish translation of glossary.md (#27958 ) * Add glossary to es/_toctree.yml * Add glossary.md to es/ * A section translated * B and C section translated * Fix typo in en/glossary.md C section * D section translated \| Add a extra line in en/glossary.md * E and F section translated \| Fix typo in en/glossary.md * Fix words preentrenado * H and I section translated \| Fix typo in en/glossary.md * L section translated * M and N section translated * P section translated * R section translated * S section translated * T section translated * U and Z section translated \| Fix TensorParallel link in both files * Fix word	2023-12-13 09:21:59 -08:00
Zach Mueller	93766251cb	Fix bug with rotating checkpoints (#28009 ) * Fix bug * Write test * Keep back old modification for grad accum steps * Whitespace... * Whitespace again * Race condition * Wait for everyone	2023-12-13 12:17:30 -05:00
Arthur	ec43d6870a	[`CI slow`] Fix expected values (#27999 ) * fix expected values * style * test is slow	2023-12-13 13:37:10 +01:00
Arindam Jati	749f94e460	Fix PatchTSMixer slow tests (#27997 ) * fix slow tests * revert formatting --------- Co-authored-by: Arindam Jati <arindam.jati@ibm.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>	2023-12-13 13:34:25 +01:00
Younes Belkada	c7f076a00e	Adds VIP-llava to transformers (#27932 ) * v1 * add-new-model-like * revert * fix forward and conversion script * revert * fix copies * fixup * fix * Update docs/source/en/index.md * Apply suggestions from code review * push * fix * fixes here and there * up * fixup and fix tests * Apply suggestions from code review * add docs * fixup * fixes * docstring * add docstring * fixup * docstring * fixup * nit * docs * more copies * fix copies * nit * update test	2023-12-13 10:42:24 +01:00
Arthur	371fb0b7dc	[`Whisper`] raise better errors (#27971 ) * [`Whisper`] raise better erros fixes #27893 * update torch as well	2023-12-13 09:13:01 +01:00
Arthur	230ac352d8	[`Tokenizer Serialization`] Fix the broken serialisation (#27099 ) * nits * nits * actual fix * style * ze fix * fix fix fix style	2023-12-13 09:11:34 +01:00
Dave Berenbaum	f4db565b69	fix typo in dvclive callback (#27983 )	2023-12-12 16:29:58 -05:00
Stas Bekman	9936143014	[doc] fix typo (#27981 )	2023-12-12 20:32:42 +00:00
fxmarty	78172dcdb7	Fix SDPA correctness following torch==2.1.2 regression (#27973 ) * fix sdpa with non-contiguous inputs for gpt_bigcode * fix other archs * add currently comment * format	2023-12-13 00:33:46 +09:00
Matt	5e4ef0a0f6	Better key error for AutoConfig (#27976 ) * Improve the error printed when loading an unrecognized architecture * Improve the error printed when loading an unrecognized architecture * Raise a ValueError instead because KeyError prints weirdly * make fixup	2023-12-12 14:41:55 +00:00
saswatmeher	a49f4acab3	Fix link in README.md of Image Captioning (#27969 ) Update the link for vision encoder decoder doc used by FlaxVisionEncoderDecoderModel link.	2023-12-12 08:07:15 -05:00
Arthur	680c610f97	Hot-fix-mixstral-loss (#27948 ) * fix loss computation * compute on GPU if possible	2023-12-12 12:20:28 +01:00
Joao Gante	4b759da8be	Generate: `assisted_decoding` now accepts arbitrary candidate generators (#27750 ) Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-12-12 09:25:57 +00:00
Anthony Susevski	e660424717	fixed typos (issue 27919) (#27920 ) * fixed typos (issue 27919) * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-12-11 18:44:23 -05:00
dancingpipi	e5079b0b2a	Support PeftModel signature inspect (#27865 ) * Support PeftModel signature inspect * Use get_base_model() to get the base model --------- Co-authored-by: shujunhua1 <shujunhua1@jd.com>	2023-12-11 19:30:11 +00:00
Steven Liu	35478182ce	[docs] Fused AWQ modules (#27896 ) streamline	2023-12-11 10:41:33 -08:00
NielsRogge	67b1335cb9	Update bounding box format everywhere (#27944 ) Update formats	2023-12-11 18:03:42 +00:00
Younes Belkada	54d0b1c278	[`Mixtral`] Change mistral op order (#27955 ) up	2023-12-11 19:03:18 +01:00
Adam Louly	4850aaba6f	fix no sequence length models error (#27522 ) * fix no sequence length models error * block size check --------- Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-12-11 18:01:26 +00:00
Ashish Tawari	4b4b864224	Fix for stochastic depth decay rule in the TimeSformer implementation (#27875 ) Update modeling_timesformer.py Fixing typo to correct the stochastic depth decay rule	2023-12-11 16:20:31 +00:00
Chenhao Xu	c0a354d8d7	fix bug in mask2former: cost matrix is infeasible (#27897 ) fix bug: cost matrix is infeasible	2023-12-11 16:19:16 +00:00
rjenc29	7e35f37071	Fix a couple of typos and add an illustrative test (#26941 ) * fix a typo and add an illustrative test * appease black * reduce code duplication and add Annotion type back with a pending deprecation warning * remove unused code * change warning type * black formatting fix * change enum deprecation approach to support 3.8 and earlier * add stacklevel * fix black issue * fix ruff issues * fix ruff issues * move tests to own mixin * include yolos * fix black formatting issue * fix black formatting issue * use logger instead of warnings and include target version for deprecation	2023-12-11 15:51:51 +00:00
Ella Charlaix	39acfe84ba	Add deepspeed test to amd scheduled CI (#27633 ) * add deepspeed scheduled test for amd * fix image * add dockerfile * add comment * enable tests * trigger * remove trigger for this branch * trigger * change runner env to trigger the docker build image test * use new docker image * remove test suffix from docker image tag * replace test docker image with original image * push new image * Trigger * add back amd tests * fix typo * add amd tests back * fix * comment until docker image build scheduled test fix * remove deprecated deepspeed build option * upgrade torch * update docker & make tests pass * Update docker/transformers-pytorch-deepspeed-amd-gpu/Dockerfile * fix * tmp disable test * precompile deepspeed to avoid timeout during tests * fix comment * trigger deepspeed tests with new image * comment tests * trigger * add sklearn dependency to fix slow tests * enable back other tests * final update --------- Co-authored-by: Felix Marty <felix@hf.co> Co-authored-by: Félix Marty <9808326+fxmarty@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-11 16:33:36 +01:00
Yih-Dar	0f59d2f173	Fix AMD scheduled CI not triggered (#27951 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-11 16:22:10 +01:00

15053 Commits