Commit Graph

15616 Commits

yuanzhoulvpi
47c9570903
fix error: TypeError: Object of type Tensor is not JSON serializable … (#29568)
fix error: TypeError: Object of type Tensor is not JSON serializable in Trainer

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-03-11 17:15:36 +00:00
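
The failure mode in the commit above is the standard one where a torch.Tensor ends up in a dict handed to json.dumps. A minimal sketch of the error and the usual remedy (illustrative only, not the actual Trainer patch):

```python
import json

import torch

metrics = {"loss": torch.tensor(0.35), "epoch": 1}

# json.dumps(metrics) raises:
#   TypeError: Object of type Tensor is not JSON serializable

# Converting tensors to plain Python scalars first sidesteps the error.
safe = {k: v.item() if isinstance(v, torch.Tensor) else v for k, v in metrics.items()}
print(json.dumps(safe))
```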
Yih-Dar
e5eb55b88b
Don't use a subset in test fetcher if on main branch (#28816)
save ci life

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-03-11 16:58:06 +01:00
Klaus Hipp
dd1c905215
[Docs] Fix FastSpeech2Conformer model doc links (#29574)
[Docs] Fix FastSpeech2Conformer links
2024-03-11 14:14:03 +00:00
Yitong Huang
873d9bb3cc
Make torch xla available on GPU (#29334)
* add USE_TORCH_XLA env

* rename torch_tpu to torch_xla

* better is_torch_xla_available; fix some fsdp and performance issues

* fix format

* fix bug when pjrt_device is cpu

* fix bug

* fix the deprecation handling

---------

Co-authored-by: anw90 <ang868@gmail.com>
Co-authored-by: wangang.wa <wangang.wa@alibaba-inc.com>
2024-03-11 14:07:16 +00:00
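
A rough sketch of how an availability check gated on an environment variable like the USE_TORCH_XLA introduced above typically looks; the accepted values here are an assumption, not the commit's exact logic:

```python
import importlib.util
import os


def is_torch_xla_available() -> bool:
    # An explicit USE_TORCH_XLA=0/false opt-out wins over autodetection.
    if os.environ.get("USE_TORCH_XLA", "").strip().lower() in {"0", "false", "off"}:
        return False
    # Otherwise, availability is just "is the torch_xla package importable?"
    return importlib.util.find_spec("torch_xla") is not None


print(is_torch_xla_available())
```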
Damith Senanayake
9a3f4d4daf
Bark model: enable Flash Attention 2 to pass the check_device_map parameter on to super() (#29357)
* Fixing error #29332. The _check_and_enable_flash_attn_2() method receives a check_device_map parameter and fails.

* style fixup
2024-03-11 12:44:12 +00:00
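
The shape of the fix is simply accepting the new keyword in the override and forwarding it instead of dropping it. A self-contained toy version (class names simplified, not the real transformers classes):

```python
class Base:
    @classmethod
    def _check_and_enable_flash_attn_2(cls, config, check_device_map=True):
        print(f"base saw check_device_map={check_device_map}")
        return config


class BarkLike(Base):
    @classmethod
    def _check_and_enable_flash_attn_2(cls, config, check_device_map=True):
        # Before the fix the override's signature lacked check_device_map,
        # so callers passing it hit a TypeError. Accept it and pass it on.
        return super()._check_and_enable_flash_attn_2(config, check_device_map=check_device_map)


BarkLike._check_and_enable_flash_attn_2({}, check_device_map=False)
```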
Tanay Mehta
6d67837f06
Add Fill-in-the-middle training objective example - PyTorch (#27464)
* add: initial script to train clm fim

* fix: if training model from scratch, new tokens will be added and embeddings resized

* fix: fixed attention_mask errors when generating FIM data

* fix: file formatted using black

* add: run_fim_no_trainer.py and fixed some comments in run_fim.py

* add: added fim examples to the README.md and ran code fixup

* fix: little bug in both fim training scripts

* fix: remove comment from notebook and added a note on fim related params

* fix: minor typo in README

* add: suggested minor changes to README and run_fim.py

* add: gradient_accumulation_steps and gradient_checkpointing args

* add: improved model embedding resizing

* add: pad_to_multiple_of and attn_implementation params

* add: requested minor changes

* add: deepspeed zero compatibility

* add: resize embeddings layer with zero3 support for fim model initialization
2024-03-11 12:14:02 +00:00
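
For readers unfamiliar with the objective: fill-in-the-middle training rearranges a document into prefix/suffix/middle segments separated by sentinel tokens, so a causal LM learns to infill. A minimal sketch under assumed sentinel names (the scripts' actual tokens and sampling logic may differ):

```python
import random

FIM_PREFIX, FIM_MIDDLE, FIM_SUFFIX = "<fim_prefix>", "<fim_middle>", "<fim_suffix>"


def make_fim_example(text: str, rng: random.Random) -> str:
    # Choose two cut points and emit the PSM (prefix-suffix-middle) layout;
    # the model is trained to generate the middle after seeing both sides.
    lo, hi = sorted(rng.sample(range(len(text) + 1), 2))
    prefix, middle, suffix = text[:lo], text[lo:hi], text[hi:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"


print(make_fim_example("def add(a, b):\n    return a + b\n", random.Random(0)))
```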
j-gc
d80c9a3497
[Docs] fixed minor typo (#29555) 2024-03-11 11:05:16 +00:00
Arthur
4f27ee936a
[Mamba doc] Post merge updates (#29472)
* post merge update

* nit

* oups
2024-03-11 09:46:24 +01:00
Winston H
0290ec19c9
feat: use warning_advice for tensorflow warning (#29540)
feat: use `warning_advice` instead of tensorflow warning
2024-03-08 17:27:30 +00:00
Zach Mueller
469c13280d
Fix eval thread fork bomb (#29538)
* Fix eval thread fork bomb

* Keep eval dl persistent and prepare after so free_memory doesn't destroy it

* Add note

* Quality
2024-03-08 11:04:18 -05:00
Fanli Lin
3f6973db06
[tests] use the correct n_gpu in TrainerIntegrationTest::test_train_and_eval_dataloaders for XPU (#29307)
* fix n_gpu

* fix style
2024-03-08 10:52:25 -05:00
Yoach Lacombe
1ba89dc2d2
Fix WhisperNoSpeechDetection when input is full silence (#29065)
fix total silence input with no_speech_threshold
2024-03-08 14:31:05 +00:00
Yun Dai
697f05bab3
fix typos in FSDP config parsing logic in TrainingArguments (#29189)
fix FSDP config
2024-03-08 08:36:30 -05:00
Jonatan Kłosko
608fa5496c
Make sliding window size inclusive in eager attention (#29519)
* Make sliding window size inclusive in eager attention

* Fix tests
2024-03-08 12:53:17 +00:00
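
As an illustration of what "inclusive" means here: with window size w, position i may attend back to position i - w itself, not only to i - w + 1. A hedged sketch of such a mask (not the library's implementation):

```python
import torch


def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    # True where attention is allowed: causal (j <= i) and within the
    # window, with the boundary position i - window included.
    i = torch.arange(seq_len).unsqueeze(1)
    j = torch.arange(seq_len).unsqueeze(0)
    return (j <= i) & (i - j <= window)


print(sliding_window_causal_mask(5, 2).int())
```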
liangjs
f386c51ad9
StableLM: Fix dropout argument type error (#29236)
* fix stablelm dropout argument type error

* fix docs of _flash_attention_forward

* fix all docs of _flash_attention_forward

* fix docs of _flash_attention_forward in starcoder2

---------

Co-authored-by: oliang <oliang@tencent.com>
2024-03-08 11:58:25 +00:00
Fanli Lin
1ea3ad1aec
[tests] use torch_device instead of auto for model testing (#29531)
* use torch_device

* skip for XPU

* Update tests/generation/test_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-03-08 11:21:43 +00:00
Clémentine Fourrier
14536c339a
Typo fix in error message (#29535) 2024-03-08 11:20:31 +00:00
Wang, Yi
8ee1d47203
fix image-to-text batch incorrect output issue (#29342)
* fix image-to-text batch incorrect output issue

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add ci test

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* update ci test

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2024-03-08 11:11:10 +00:00
Fanli Lin
8e589c83b6
[tests] add the missing require_sacremoses decorator (#29504)
* add sacremoses check

* fix style

* for FlaubertTokenizer

* HerbertTokenizer fix

* add typeHint

* Update src/transformers/testing_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make less skipped

* make quality

* remove import

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-03-08 10:13:54 +00:00
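
The require_* helpers in testing_utils all follow the same pattern: skip the test when an optional dependency is absent. A rough, self-contained approximation of such a decorator (the real helper builds on the library's own availability checks):

```python
import importlib.util
import unittest


def require_sacremoses(test_case):
    # Skip rather than fail when the optional sacremoses package is missing.
    available = importlib.util.find_spec("sacremoses") is not None
    return unittest.skipUnless(available, "test requires sacremoses")(test_case)


@require_sacremoses
class FlaubertTokenizerTest(unittest.TestCase):
    def test_tokenize(self):
        self.assertTrue(True)
```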
Joao Gante
bc764f4263
Generate: left-padding test, revisited (#29515)
* left-padding test revisited

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-03-08 10:06:46 +00:00
Pedro Cuenca
631fa7bf6b
Typo in mlx tensor support (#29509)
Potential typo in mlx support
2024-03-08 09:47:44 +00:00
Nick DeGroot
b338a6c3b8
Fix VisionEncoderDecoder Positional Arg (#29497)
* 🐛 Fix vision encoder decoder positional arg

* Add test for VisionEncoderDecoder with LayoutLMv3 encoder

---------

Co-authored-by: Nick DeGroot <1966472+nickthegroot@users.noreply.github.com>
2024-03-07 20:45:51 +00:00
Alvaro Bartolome
ddf177ee4a
Set inputs as kwarg in TextClassificationPipeline (#29495)
* Set `inputs` as kwarg in `TextClassificationPipeline`

This change was made to align the `TextClassificationPipeline` with the rest of the pipelines, and to make calls such as `pipeline(**{"inputs": "text"})` possible, which they weren't before since `*args` was being used instead.

* Add `noqa: C409` on `tuple([inputs],)`

Even though it is discouraged by the linter, the cast `tuple(list(...),)` is required here, as otherwise the original list in `inputs` would be transformed into a `tuple` and elements 1...N would be ignored by the `Pipeline`

* Run `ruff format`

* Simplify `tuple` conversion with `(inputs,)`

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-03-07 20:43:57 +00:00
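
After this change both call styles below should work; the checkpoint name is just a common example, not something the PR pins down:

```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# Positional, as before:
print(clf("I love this movie"))
# And now also as a keyword, e.g. when unpacking a payload dict:
print(clf(**{"inputs": "I love this movie"}))
```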
amyeroberts
4ed9ae623d
test_generation_config_is_loaded_with_model - fall back to pytorch model for now (#29521)
* Fall back to pytorch model for now

* Fix up
2024-03-07 17:30:28 +00:00
Alex Ishida
45c0651090
Add support for metadata format MLX (#29335)
Add support for loading safetensors files saved with metadata format mlx.
2024-03-07 14:51:59 +01:00
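
safetensors files carry an optional metadata dict in their header; MLX writes {"format": "mlx"} there, which loading previously rejected. A small demonstration of producing such a file with the public safetensors API:

```python
import torch
from safetensors.torch import load_file, save_file

# MLX-saved checkpoints mark their header with format=mlx.
save_file({"w": torch.zeros(2, 2)}, "weights.safetensors", metadata={"format": "mlx"})
print(load_file("weights.safetensors"))
```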
Raushan Turganbay
923733c22b
Flava multimodal add attention mask (#29446)
* flava multimodal add attn mask

* make style

* check mask is not None
2024-03-07 12:45:47 +01:00
Ashok Pon Kumar
9288e759ad
fix: Avoid error when fsdp_config is missing xla_fsdp_v2 (#29480)
Signed-off-by: Ashok Pon Kumar Sree Prakash <ashokponkumar@gmail.com>
2024-03-07 12:44:23 +01:00
Lysandre Debut
f6133d767a
Revert "Automatic safetensors conversion when lacking these files (#2… (#29507)
Revert "Automatic safetensors conversion when lacking these files (#29390)"

This reverts commit a69cbf4e64.
2024-03-07 12:12:41 +01:00
Joao Gante
ffe60fdcd6
v4.39 deprecations 🧼 (#29492) 2024-03-07 10:44:43 +00:00
regisss
979fccc90f
Enable BLIP for auto VQA (#29499)
* Enable BLIP for auto VQA

* Make style

* Add VQA to BLIP pipeline tests
2024-03-07 10:28:01 +01:00
Park Jun
d45f47ab7f
Fix: Disable torch.autocast in RotaryEmbedding of Gemma and LLaMa for MPS device (#29439)
* Fix: Disable torch.autocast in RotaryEmbedding of Gemma and LLaMa for MPS devices

* Update src/transformers/models/gemma/modeling_gemma.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update llama and gemma rope to use cpu on mps devices

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-03-07 00:57:22 +01:00
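
The underlying constraint is that torch.autocast has no "mps" backend, so code entering an autocast-disabled region must substitute "cpu" as the device type on MPS. A hedged sketch of that guard (the helper name is invented for illustration):

```python
import torch


def autocast_safe_device_type(device: torch.device) -> str:
    # torch.autocast accepts "cuda"/"cpu" (among others) but not "mps",
    # so rotary-embedding code falls back to "cpu" there.
    return device.type if device.type != "mps" else "cpu"


device = torch.device("cpu")  # would be torch.device("mps") on Apple Silicon
with torch.autocast(device_type=autocast_safe_device_type(device), enabled=False):
    inv_freq = 1.0 / 10000 ** (torch.arange(0, 8, 2).float() / 8)
    angles = torch.outer(torch.arange(16).float(), inv_freq)
```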
Glen Taggart
2a939f20ff
Substantially reduce memory usage in _update_causal_mask for large batches by using .expand instead of .repeat [needs tests+sanity check] (#29413)
* try to fix gemma mem use

* fix: handle attention mask dim==2 case

* remove logits=logits.float()

* clean up + add llama

* apply formatting

* readability edit: swap order of items being multiplied

* revert change unrelated to PR

* revert black autoformat

* switch to one .to

* Accept style edits

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-03-07 00:56:25 +01:00
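
The saving comes from the fact that .repeat materializes copies while .expand returns a stride-0 view over the same storage. A quick illustration at a modest size, on recent torch (scaled down; at batch 32 with 4k context the difference is gigabytes):

```python
import torch

mask = torch.full((1, 1, 1024, 1024), float("-inf")).triu(1)

repeated = mask.repeat(32, 1, 1, 1)    # allocates 32 real copies
expanded = mask.expand(32, 1, -1, -1)  # a view: no new memory

print(repeated.shape == expanded.shape)     # True
print(repeated.untyped_storage().nbytes())  # 32x the mask's storage
print(expanded.untyped_storage().nbytes())  # same storage as the original mask
```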
Alvaro Bartolome
965cf67769
Fix TextGenerationPipeline.__call__ docstring (#29491) 2024-03-06 09:03:55 -08:00
Moshe Berchansky
19fb1e22d2
added the max_matching_ngram_size to GenerationConfig (#29131)
* added the max_matching_ngram_size parameter to the GenerationConfig, for the PromptLookupCandidateGenerator

* switched back to keyword arguments

* added PromptLookupCandidateGenerator docstring for its parameters

* ruff reformat

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-03-06 15:06:45 +00:00
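
Usage-wise, the new knob sits next to prompt_lookup_num_tokens, which switches generation to prompt-lookup candidate generation; a brief sketch:

```python
from transformers import GenerationConfig

gen_config = GenerationConfig(
    prompt_lookup_num_tokens=10,  # enables prompt-lookup decoding
    max_matching_ngram_size=2,    # the new cap on the n-gram used for matching
)
```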
Joao Gante
ddb4fda3cb
Generate: torch.compile-ready generation config preparation (#29443) 2024-03-06 14:28:45 +00:00
Zach Mueller
9322576e2f
Fix test failure on DeepSpeed (#29444)
* Fix test failure

* use item
2024-03-06 07:11:53 -05:00
Ofir Zafrir
0a5b0516f8
Avoid dummy token in PLD to optimize performance (#29445) 2024-03-06 11:19:47 +00:00
Joao Gante
700d48fb2d
Generate: get generation mode from the generation config instance 🧼 (#29441) 2024-03-06 11:18:35 +00:00
Joao Gante
41f7b7ae4b
Generate: add tests for caches with pad_to_multiple_of (#29462) 2024-03-06 10:57:04 +00:00
Matthew Hoffman
2890116ab7
Fix TrainingArguments regression with torch <2.0.0 for dataloader_prefetch_factor (#29447)
* Fix TrainingArguments regression with torch <2.0.0 for dataloader_prefetch_factor

dataloader_prefetch_factor was added to TrainingArguments in #28498 with the default value None, but versions of torch<2.0.0 do not accept None and will raise an error if num_workers == 0 and prefetch_factor != 2

* Add is_torch_available() check

* Use is_torch_greater_or_equal_than_2_0

add back check for dataloader_prefetch_factor
2024-03-06 09:44:08 +00:00
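
The compatible pattern is to omit the argument entirely on older torch rather than passing None; a minimal sketch (the helper name is invented here):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def build_loader(dataset, num_workers=0, prefetch_factor=None):
    # torch < 2.0 validates prefetch_factor even when num_workers == 0 and
    # rejects None, so only forward it when it was explicitly set.
    kwargs = {"prefetch_factor": prefetch_factor} if prefetch_factor is not None else {}
    return DataLoader(dataset, num_workers=num_workers, **kwargs)


loader = build_loader(TensorDataset(torch.arange(8)))
print(len(list(loader)))
```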
Younes Belkada
b27aa206dd
[docs] Add starcoder2 docs (#29454)
* add accelerate docs

* Apply suggestions from code review

Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>

* Update starcoder2.md

* add correct generation

---------

Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>
2024-03-06 06:58:37 +01:00
Younes Belkada
2a002d073a
[Docs / Awq] Add docs on exllamav2 + AWQ (#29474)
* add docs on exllamav2 + AWQ

* Update docs/source/en/quantization.md
2024-03-06 06:30:47 +01:00
Fanli Lin
00bf44270f
[FIX] offload_weight() takes from 3 to 4 positional arguments but 5 were given (#29457)
* use require_torch_gpu

* enable on XPU

* fix
2024-03-06 03:58:42 +01:00
AI4Harmony
7b01579f73
🌐 [i18n-KO] Translated generation_strategies.md to Korean (#29086)
* Update ko _toctree.yml

* Create ko: generation_strategies.md

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2024-03-05 15:47:33 -08:00
Michael
638c423c89
[i18n-zh] Translate add_new_pipeline.md into Chinese (#29432)
* [i18n-zh] Translate add_new_pipeline.md into Chinese

* apply suggestions from Fan-Lin
2024-03-05 09:19:00 -08:00
Lysandre Debut
a69cbf4e64
Automatic safetensors conversion when lacking these files (#29390)
* Automatic safetensors conversion when lacking these files

* Remove debug

* Thread name

* Typo

* Ensure that raises do not affect the main thread
2024-03-05 13:37:55 +01:00
Logan Adams
9c5e560924
Update pytest import_path location (#29154)
* Update to pull function from proper lib

* Fix ruff formatting error

* Remove accidentally added file
2024-03-05 12:23:34 +00:00
AleksanderWWW
8f3f8e6766
Fix bug with passing capture_* args to neptune callback (#29041)
* Fix bug with passing capture_* args to neptune callback

* ruff happy?

* instantiate (frozen)set only once

* code review

* code review 2

* ruff happy?

* code review
2024-03-05 11:54:00 +00:00
Arthur
fb1c62e973
[Add Mamba] Adds support for the Mamba models (#28094)
* initial-commit

* start cleaning

* small nits

* small nits

* current updates

* add kernels

* small refactoring little step

* add comments

* styling

* nit

* nits

* Style

* Small changes

* Push dummy mamba simple slow

* nit

* Use original names

* Use original names and remove norm

* Updates for inference params

* Style and updates

* nits

* Match logits

* Add a test

* Add expected generated text

* nits doc, imports and styling

* style

* oups

* don't install kernels, invite users to install the required kernels

* let users use the original packages

* styling

* nits

* fix some copied-from statements

* update doc

* fix-copies

* styling done

* nits

* fix import check

* runs but wrong cuda res

* mamba CUDA works :)

* fix the fast path

* config naming nits

* conversion script is not required at this stage

* finish fixing the fast path: generation make sense now!

* nit

* Let's start working on the CIs

* style

* better style

* more nits

* test nit

* quick fix for now

* nits

* nit

* nit

* nit

* nits

* update test rest

* fixup

* update test

* nit

* some fixes

* nits

* update test values

* fix styling

* nit

* support peft

* integration tests require torch

* also add slow markers

* styling

* chose forward wisely

* nits

* update tests

* fix gradient checkpointing

* fixup

* nit

* fix doc

* check copies

* fix the docstring

* fix some more tests

* style

* fix beam search

* add init scheme

* update

* nit

* fix

* fixup the doc

* fix the doc

* fixup

* tentative update but slow is no longer good

* nit

* should we always use float32?

* nits

* revert wrong changes

* res in float32

* cleanup

* skip fmt for now

* update generation values

* update test values running original model

* fixup

* update tests + rename inference_params to cache_params + make sure training does not use cache_params

* small nits

* more nits

* fix final CIs

* style

* nit doc

* I hope final doc nits

* nit

* 🫠

* final touch!

* fix torch import

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Apply suggestions from code review

* fix fix and fix

* fix base model prefix!

* nit

* Update src/transformers/models/mamba/__init__.py

* Update docs/source/en/model_doc/mamba.md

Co-authored-by: Lysandre Debut <hi@lysand.re>

* nit

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-03-05 20:01:06 +09:00
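
Once merged, the models load through the usual causal-LM interfaces; a quick usage sketch (the checkpoint name is an assumption about the converted weights, not stated in the commit):

```python
from transformers import AutoTokenizer, MambaForCausalLM

tok = AutoTokenizer.from_pretrained("state-spaces/mamba-130m-hf")
model = MambaForCausalLM.from_pretrained("state-spaces/mamba-130m-hf")

inputs = tok("The capital of France is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=10)
print(tok.decode(out[0]))
```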
Joao Gante
87a0783dde
Generate: inner decoding methods are no longer public (#29437) 2024-03-05 10:27:36 +00:00