transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 10:12:23 +06:00

Author	SHA1	Message	Date
Afanti	81aa9b2e07	fix typos in the docs directory (#36639 ) * chore: fix typos in the docs directory * chore: fix typos in the docs directory * chore: fix typos in the docs directory	2025-03-11 09:41:41 -07:00
Marc Sun	cb384dcd7a	Fix gguf docs (#36601 ) * update * doc * update * Update docs/source/en/gguf.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-03-11 15:29:14 +01:00
Matt	1e4286fd59	Remove research projects (#36645 ) * Remove research projects * Add new README to explain where the projects went * Trigger tests * Cleanup all references to research_projects	2025-03-11 13:47:38 +00:00
Steven Liu	ed1807bab3	[docs] Update docs dependency (#36635 ) update	2025-03-11 13:42:49 +00:00
Matt	b80b3ec529	Stop warnings from unnecessary torch.tensor() overuse (#36538 )	2025-03-11 13:41:13 +00:00
Matt	556d2c23c6	Remove remote code warning (#36285 ) * Remove redundant pipeline warning * Remove redundant pipeline warning	2025-03-11 13:29:15 +00:00
ivarflakstad	b1a51ea464	Fix AriaForConditionalGeneration flex attn test (#36604 ) AriaForConditionalGeneration depends on idefics3 vision transformer which does not support flex attn	2025-03-11 11:05:49 +01:00
Arthur	d126f35427	Proper_flex (#36643 ) * proper performant flex attention implementation * wrapper for flex attention to compile only when triggered * wrapper for flex attention to compile only when triggered * attention mask type detection * Update src/transformers/integrations/flex_attention.py Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> * nit * nit * nit * nit * gemma2 support * add citation for torchtune * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update flex_attention.py * nit * nit * nit * reset gemma2 modifications * nit * nit * nit * licencing * apply changes to other models * safe import --------- Co-authored-by: Sung Ching Liu <sunny19981005@outlook.com> Co-authored-by: Sung Ching Liu <22844540+bursteratom@users.noreply.github.com> Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>	2025-03-11 10:24:12 +01:00
Travis Johnson	d8663cb8c5	Fix bugs in mllama image processing (#36156 ) * fix: handle input_channel_dim == channels_last Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> * fix: default PIL images to channels_last Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> * Apply suggestions from code review Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fixup from review batch Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> * test: add 1x1 PIL image to ambiguous channel test Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> * fix(mllama): avoid 0 dimension for image with impractical aspect ratio Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> --------- Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-03-11 10:22:48 +01:00
Arthur	1c4b62b219	Refactor some core stuff (#36539 ) * some config changes * update * current state * update * update * updates and cleanup * something that works * fixup * fixes * nits * nit * nits and fix * Update src/transformers/integrations/tensor_parallel.py Co-authored-by: Lysandre Debut <hi@lysand.re> * Update src/transformers/integrations/tensor_parallel.py Co-authored-by: Lysandre Debut <hi@lysand.re> * cleanup * style * safe import * fix * updates * rename stuff an clean * style * small updates * ups * oups * nit * protect imports * update tp * rodfl * arf * turbo nit on init * fix import error * frumble gumbgle * try to fix the import error * should fix the non model test * update keep in float32 * update * fix * nits * fix subvconfigs * test was weird * nit * fix failing test * fix instruct blip * fixes * style * x.com * fix overwrite * ok last bit of failing test --------- Co-authored-by: Lysandre Debut <hi@lysand.re>	2025-03-11 09:26:28 +01:00
Steven Liu	e9756cdbc7	[docs] Serving LLMs (#36522 ) * initial * fix * model-impl	2025-03-10 13:14:19 -07:00
Afanti	af9b2eaa54	chore: fix typos in language models (#36586 ) * chore: fix typos in language models * chore: fix typos in mistral model * chore: fix model copy from issue * chore: fix model copy from issue * chore: fix model copy from issue * chore: fix model copy from issue * chore: fix model copy from issue	2025-03-10 15:54:49 +00:00
Matt	a929c466d0	Fix auto-assign reviewers (#36631 ) * Fix auto-assign reviewers * Clean up endanchor a bit * We don't actually need the end anchor at all	2025-03-10 15:52:13 +00:00
Joao Gante	858545047c	[`HybridCache`] disable automatic compilation (#36620 )	2025-03-10 09:24:26 +00:00
Kevron Rees	94ae1ba5b5	Fix check for XPU. PyTorch >= 2.6 no longer needs ipex. (#36593 )	2025-03-07 14:09:35 +00:00
gautham	a1cf9f3390	Fixed datatype related issues in `DataCollatorForLanguageModeling` (#36457 ) Fixed 2 issues regarding `tests/trainer/test_data_collator.py::TFDataCollatorIntegrationTest::test_all_mask_replacement`: 1. I got the error `RuntimeError: "bernoulli_tensor_cpu_p_" not implemented for 'Long'`. This is because the `mask_replacement_prob=1` and `torch.bernoulli` doesn't accept this type (which would be a `torch.long` dtype instead). I fixed this by manually casting the probability arguments in the `__post_init__` function of `DataCollatorForLanguageModeling`. 2. I also got the error `tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute Equal as input #1(zero-based) was expected to be a int64 tensor but is a int32 tensor [Op:Equal]` due to the line `tf.reduce_all((batch["input_ids"] == inputs) | (batch["input_ids"] == tokenizer.mask_token_id))` in `test_data_collator.py`. This occurs because the type of the `inputs` variable is `tf.int32`. Solved this by manually casting it to `tf.int64` in the test, as the expected return type of `batch["input_ids"]` is `tf.int64`.	2025-03-07 14:09:27 +00:00
dependabot[bot]	4fce7a0f0f	Bump jinja2 from 3.1.5 to 3.1.6 in /examples/research_projects/decision_transformer (#36582 ) Bump jinja2 in /examples/research_projects/decision_transformer Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6. - [Release notes](https://github.com/pallets/jinja/releases) - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6) --- updated-dependencies: - dependency-name: jinja2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-07 13:35:59 +00:00
Joao Gante	f2fb41948e	Update "who to tag" / "who can review" (#36394 ) update who to tag	2025-03-07 13:09:31 +00:00
Krishnakumar Kannan	1b9978c360	Update chat_extras.md with content correction (#36599 ) Update chat_extras.md - content Fixed a typo in the content, that may confuse the readers.	2025-03-07 13:09:02 +00:00
Matt	f2e197c30a	Github action for auto-assigning reviewers (#35846 ) * First draft of github action on PR opening for auto-assigning reviewers * fix missing import * Don't reassign reviewers if we already have them * Temporarily comment out the opened line so we can test the script * Correct path for codeowners file * Update workflow permissions * Update workflow permissions * Update debug logs * Strip inline comments * Remove prefix * Request reviews instead of assigning * Request reviews instead of assigning * Add TODO * Use pull-request-target instead * Update the script * Set back to pull_request for testing * Set to pull_request_target, testing works! * Add licence * Tighten up one of the globs * Refactor things to be a bit less convoluted * Only assign reviewers when marked ready for review	2025-03-07 12:18:49 +00:00
Andreas Abdi	8a16edce67	Export base streamer. (#36500 ) * Export base streamer. Previously, the base streamer class was not exported so the set of available streamers was fixed to 3 streamer classes. This change makes it so that customers may extend the default base streamer class. * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Joao Gante <joao@huggingface.co>	2025-03-07 11:16:09 +00:00
Dolen	6f775970c7	avoid errors when the size of `input_ids` passed to `PrefixConstrainedLogitsProcessor` is zero (#36489 ) * avoid errors when the size of `input_ids` passed to PrefixConstrainedLogitsProcessor is zero * use more reasonable process * avoid early return --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-03-07 11:02:49 +00:00
Nouamane Tazi	51ed61e2f0	Mention UltraScale Playbook 🌌 in docs (#36589 )	2025-03-06 14:48:11 -08:00
Aritra Roy Gosthipaty	159445d044	fix: argument (#36558 ) `752ef3fd4e/utils/modular_model_converter.py (L1729)`	2025-03-06 13:11:19 -08:00
Joao Gante	5275ef6f3d	[XGLM] tag tests as slow (#36592 ) these tests should be slow	2025-03-06 17:54:41 +00:00
Joao Gante	c1b24c0b73	[bark] fix loading of generation config (#36587 )	2025-03-06 16:55:19 +00:00
Shaohon Chen	0440dbc0e1	Integrate SwanLab for offline/online experiment tracking and local visualization (#36433 ) * add swanlab integration * feat(integrate): add SwanLab as an optional experiment tracking tool in transformers - Integrated SwanLab into the transformers library as an alternative for experiment tracking. - Users can now log training metrics, hyperparameters, and other experiment details to SwanLab by setting `report_to="swanlab"` in the `TrainingArguments`. - Added necessary dependencies and documentation for SwanLab integration. * Fix the spelling error of SwanLabCallback in callback.md * Apply suggestions from code review Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Fix typo in comment * Fix typo in comment * Fix typos and update comments * fix annotation * chore: opt some comments --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: AAssets <20010618@qq.com> Co-authored-by: ZeYi Lin <944270057@qq.com> Co-authored-by: KAAANG <79990647+SAKURA-CAT@users.noreply.github.com>	2025-03-06 17:35:30 +01:00
hlky	bc30dd1efb	Modular Conversion --fix_and_overwrite on Windows (#36583 ) * Modular Conversion --fix_and_overwrite on Windows * -newline on read	2025-03-06 13:12:30 +00:00
湛露先生	9e385109cf	Delete redundancy if case in model_utils (#36559 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2025-03-06 11:36:11 +00:00
dependabot[bot]	acc49e390d	Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/pplm (#36540 ) Bump transformers in /examples/research_projects/pplm Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-06 11:35:47 +00:00
Afanti	9e84b38135	chore: enhance message descriptions in parameters,comments,logs and docstrings (#36554 ) * chore: enhance message descriptons in parameters,comments,logs and docstrings * chore: enhance message descriptons in parameters,comments,logs and docstrings * Update src/transformers/hf_argparser.py * Update src/transformers/keras_callbacks.py --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2025-03-06 11:02:35 +00:00
湛露先生	6966fa1901	Fix typos . (#36551 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2025-03-05 16:31:43 -08:00
co63oc	996f512d52	Fix typos in tests (#36547 ) Signed-off-by: co63oc <co63oc@users.noreply.github.com>	2025-03-05 15:04:06 -08:00
Marc Sun	752ef3fd4e	guard torch version for uint16 (#36520 ) * u16 * style * fix	2025-03-05 11:27:01 +01:00
Afanti	66f29aaaf5	chore: enhance messages in docstrings (#36525 ) chore: enhance the message in docstrings	2025-03-04 16:31:20 +00:00
Mohamed Mekkouri	89d27fa6ff	Fix links in quantization doc (#36528 ) fix quantization doc	2025-03-04 16:43:03 +01:00
ivarflakstad	c0c5acff07	Fix bamba tests amd (#36535 )	2025-03-04 15:24:27 +01:00
co63oc	37508816d6	chore: Fix typos in docs and examples (#36524 ) Fix typos in docs and examples Signed-off-by: co63oc <co63oc@users.noreply.github.com>	2025-03-04 13:47:41 +00:00
Arthur	84f0186e89	Add aya (#36521 ) * initial commit * small fix * move stuff to image processing file * remove stuff in validate turn and fix return tensor * remove liquid stuff * in the process of addressing comments * changes to get the right tokenization * new __init__ works * fixing defulat std and mean * works * small testing scipt -- to be deleted before merge * remove redundant code * addressing comments * fix inits, add docs templates * refactor processor, switch to gotocr image processor * remove image proc from init * refactor to working llava-style architecture * Change AyaVisionModel to AyaVisionForConditionalGeneration * add tests * fixups * update doc * Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model * better variable names + remove code paths * Updates to aya_vision.md * address comments * adding copied from * make style and remove unused projector_hidden_act from config * sort init * include usage of fast image proc and proc on cuda in doc * update checkpoint iin test processor * update checkpoint in test processor 2 * remove test_model and update docstring * skip failing tests --------- Co-authored-by: Saurabh Dash <saurabh@cohere.com> Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>	2025-03-04 12:24:33 +01:00
Steven Liu	c0f8d055ce	[docs] Redesign (#31757 ) * toctree * not-doctested.txt * collapse sections * feedback * update * rewrite get started sections * fixes * fix * loading models * fix * customize models * share * fix link * contribute part 1 * contribute pt 2 * fix toctree * tokenization pt 1 * Add new model (#32615) * v1 - working version * fix * fix * fix * fix * rename to correct name * fix title * fixup * rename files * fix * add copied from on tests * rename to `FalconMamba` everywhere and fix bugs * fix quantization + accelerate * fix copies * add `torch.compile` support * fix tests * fix tests and add slow tests * copies on config * merge the latest changes * fix tests * add few lines about instruct * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * fix tests --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * "to be not" -> "not to be" (#32636) * "to be not" -> "not to be" * Update sam.md * Update trainer.py * Update modeling_utils.py * Update test_modeling_utils.py * Update test_modeling_utils.py * fix hfoption tag * tokenization pt. 
2 * image processor * fix toctree * backbones * feature extractor * fix file name * processor * update not-doctested * update * make style * fix toctree * revision * make fixup * fix toctree * fix * make style * fix hfoption tag * pipeline * pipeline gradio * pipeline web server * add pipeline * fix toctree * not-doctested * prompting * llm optims * fix toctree * fixes * cache * text generation * fix * chat pipeline * chat stuff * xla * torch.compile * cpu inference * toctree * gpu inference * agents and tools * gguf/tiktoken * finetune * toctree * trainer * trainer pt 2 * optims * optimizers * accelerate * parallelism * fsdp * update * distributed cpu * hardware training * gpu training * gpu training 2 * peft * distrib debug * deepspeed 1 * deepspeed 2 * chat toctree * quant pt 1 * quant pt 2 * fix toctree * fix * fix * quant pt 3 * quant pt 4 * serialization * torchscript * scripts * tpu * review * model addition timeline * modular * more reviews * reviews * fix toctree * reviews reviews * continue reviews * more reviews * modular transformers * more review * zamba2 * fix * all frameworks * pytorch * supported model frameworks * flashattention * rm check_table * not-doctested.txt * rm check_support_list.py * feedback * updates/feedback * review * feedback * fix * update * feedback * updates * update --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2025-03-03 10:33:46 -08:00
Matt	6aa9888463	Remove unused code (#36459 )	2025-03-03 18:31:10 +00:00
Kashif Rasul	9fe82793ee	[Style] fix E721 warnings (#36474 ) * fix E721 warnings * config.hidden_size is not a tuple * fix copies * fix-copies * not a tuple * undo * undo	2025-03-03 18:03:42 +00:00
Matt	1975be4d97	Fix edge case for continue_final_message (#36404 ) * Fix edge case for continue_final_message * lstrip() correctly * Add regression test * Add a clearer error message when the final message is not present * Add a clearer error message when the final message is not present * Fix massive bug!	2025-03-03 18:03:03 +00:00
Matt	2aff938992	Fix pipeline+peft interaction (#36480 ) * Fix pipeline-peft interaction * once again you have committed a debug breakpoint * Remove extra testing line * Add a test to check adapter loading * Correct adapter path * make fixup * Remove unnecessary check * Make check a little more stringent	2025-03-03 18:01:43 +00:00
Afanti	28159aee63	chore: fix message descriptions in arguments and comments (#36504 ) chore: fix messagedescriptions in arguments and comments	2025-03-03 17:54:57 +00:00
co63oc	acb8586dd9	Fix some typos in docs (#36502 ) Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2025-03-03 17:53:53 +00:00
Marc Sun	0463901c92	fix torch_dtype, contiguous, and load_state_dict regression (#36512 ) * fix regression * fix param * fix load_state_dict * style * better fix for module * fix tests * quick fix for now * rm print	2025-03-03 18:35:37 +01:00
Marcel	3e83ee75ec	Fix kwargs UserWarning in SamImageProcessor (#36479 ) transformers/image_processing_utils.py:41: UserWarning: The following named arguments are not valid for `SamImageProcessor.preprocess` and were ignored: 'point_pad_value'	2025-03-03 16:23:34 +00:00
Yih-Dar	9e3a1072c2	Check `TRUST_REMOTE_CODE` for `RealmRetriever` for security (#36511 ) * fix * repush --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-03 15:08:12 +01:00
Zach Mueller	4d8259d245	Fix loading zero3 weights (#36455 ) * Check if fixes * Fix zero3 loading * Quality * Fix marc nit * Add fast tests * Migrate to integrations.deepspeed rather than modeling_utils * Style	2025-03-03 15:05:58 +01:00
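
Note on a1cf9f3390 (#36457): the message describes casting the probability arguments in `__post_init__` so that an integer such as `1` does not later surface as an integer (Long) tensor. The sketch below illustrates that general pattern in plain Python only. The class name `MaskingConfig` and its field names are hypothetical stand-ins, not the actual `DataCollatorForLanguageModeling` code, and the range check is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class MaskingConfig:
    """Hypothetical stand-in for a collator that holds masking probabilities."""

    mlm_probability: float = 0.15
    mask_replace_prob: float = 0.8
    random_replace_prob: float = 0.1

    def __post_init__(self):
        # Dataclasses do not enforce annotations, so a caller can pass the
        # integer 1. Cast explicitly so downstream code always sees floats.
        self.mlm_probability = float(self.mlm_probability)
        self.mask_replace_prob = float(self.mask_replace_prob)
        self.random_replace_prob = float(self.random_replace_prob)

        # Assumed sanity check (not taken from the commit): each value must
        # be a valid probability.
        for name in ("mlm_probability", "mask_replace_prob", "random_replace_prob"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


# An integer argument is accepted and normalized to a float.
config = MaskingConfig(mask_replace_prob=1, random_replace_prob=0)
print(type(config.mask_replace_prob).__name__, config.mask_replace_prob)
```

Casting once at construction time keeps the conversion in a single place instead of at every call site that builds a tensor from these values.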


18207 Commits