transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-03 21:00:08 +06:00

Author	SHA1	Message	Date
efsotr	2b4734bd49	Support passing flash_attn_kwargs when gradient_checkpointing is enabled (#37037 ) * support passing flash_attn_kwargs when gradient_checkpointing is enabled * make modeling_deepspeek_v3.py consistent with modular_deepseek_v3.py	2025-03-31 10:53:02 +02:00
cyyever	41a0e58e5b	Set weights_only in torch.load (#36991 )	2025-03-27 14:55:50 +00:00
cyyever	2b550c47b2	Remove deprecated training arguments (#36946 ) * Remove deprecated training arguments * More fixes * More fixes * More fixes	2025-03-26 16:44:48 +00:00
Yih-Dar	121830ab47	update examples after ruff being updated (#36972 ) * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-03-25 18:15:47 +01:00
湛露先生	ebd2029483	Change GPUS to GPUs (#36945 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2025-03-25 17:25:39 +01:00
Arthur Zucker	4542b8fb27	push v4.51.0.dev0	2025-03-21 13:45:25 +01:00
Matt	9be4728af8	Just import torch AdamW instead (#36177 ) * Just import torch AdamW instead * Update docs too * Make AdamW undocumented * make fixup * Add a basic wrapper class * Add it back to the docs * Just remove AdamW entirely * Remove some AdamW references * Drop AdamW from the public init * make fix-copies * Cleanup some references * make fixup * Delete lots of transformers.AdamW references * Remove extra references to adamw_hf	2025-03-19 18:29:40 +00:00
Matt	1e4286fd59	Remove research projects (#36645 ) * Remove research projects * Add new README to explain where the projects went * Trigger tests * Cleanup all references to research_projects	2025-03-11 13:47:38 +00:00
dependabot[bot]	4fce7a0f0f	Bump jinja2 from 3.1.5 to 3.1.6 in /examples/research_projects/decision_transformer (#36582 ) Bump jinja2 in /examples/research_projects/decision_transformer Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6. - [Release notes](https://github.com/pallets/jinja/releases) - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6) --- updated-dependencies: - dependency-name: jinja2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-07 13:35:59 +00:00
dependabot[bot]	acc49e390d	Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/pplm (#36540 ) Bump transformers in /examples/research_projects/pplm Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-06 11:35:47 +00:00
co63oc	37508816d6	chore: Fix typos in docs and examples (#36524 ) Fix typos in docs and examples Signed-off-by: co63oc <co63oc@users.noreply.github.com>	2025-03-04 13:47:41 +00:00
Kashif Rasul	9fe82793ee	[Style] fix E721 warnings (#36474 ) * fix E721 warnings * config.hidden_size is not a tuple * fix copies * fix-copies * not a tuple * undo * undo	2025-03-03 18:03:42 +00:00
Cyril Vallez	da4ab2a1b6	Fix doc formatting in forward passes & modular (#36243 ) * fix indentation issues + modular without magic keyword * style * Update doc.py * style * Fix all decorators indentation * all models * style * style * Update doc.py * fix * general fix * style	2025-02-25 11:09:01 +01:00
Cyril Vallez	bc65f3fc1c	[modular] Do not track imports in functions (#36279 ) * Add check * just check for function * Update examples	2025-02-25 10:29:47 +01:00
Mohamed Mekkouri	e5cea20743	Add Example for Custom quantization (#36286 ) * add example * rename	2025-02-19 17:09:23 +01:00
Parteek	8eaae6bee9	Added Support for Custom Quantization (#35915 ) * Added Support for Custom Quantization * Update code * code reformatted * Updated Changes * Updated Changes --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-02-18 16:14:19 +01:00
dependabot[bot]	3e970dbbf1	Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/codeparrot/examples (#36237 ) Bump transformers in /examples/research_projects/codeparrot/examples Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-17 16:28:43 +00:00
Arthur Zucker	c877c9fa5b	v4.45.0-dev0	2025-02-17 15:21:20 +01:00
dependabot[bot]	c5506f4f00	Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/adversarial (#36168 ) Bump transformers in /examples/research_projects/adversarial Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-13 15:06:16 +00:00
dependabot[bot]	d7c5d1b539	Bump transformers from 4.38.0 to 4.48.0 in /examples/tensorflow/language-modeling-tpu (#36167 ) Bump transformers in /examples/tensorflow/language-modeling-tpu Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-13 14:46:38 +00:00
Thomas Bauwens	8f137b2427	Move `DataCollatorForMultipleChoice` from the docs to the package (#34763 ) * Add implementation for DataCollatorForMultipleChoice based on docs. * Add DataCollatorForMultipleChoice to import structure. * Remove custom DataCollatorForMultipleChoice implementations from example scripts. * Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean. * Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable. * Apply suggested changes and run make fixup. * fix copies, style and fixup * add missing documentation * nits * fix docstring * style * nits * isort --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2025-02-13 12:01:28 +01:00
dependabot[bot]	d52a9d08ce	Bump cryptography from 43.0.1 to 44.0.1 in /examples/research_projects/decision_transformer (#36142 ) Bump cryptography in /examples/research_projects/decision_transformer Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.1 to 44.0.1. - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](https://github.com/pyca/cryptography/compare/43.0.1...44.0.1) --- updated-dependencies: - dependency-name: cryptography dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-12 13:34:52 +00:00
dependabot[bot]	31e4831b98	Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/vqgan-clip (#36136 ) Bump transformers in /examples/research_projects/vqgan-clip Bumps [transformers](https://github.com/huggingface/transformers) from 4.38.0 to 4.48.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-02-12 13:21:09 +00:00
Fanli Lin	9b69986e8a	[docs] minor doc fix (#36127 ) fix	2025-02-11 10:31:12 -08:00
Liangliang Ma	315a9f494e	Add XPU type for work-around -inf mask causing sdpa NaN issue in modeling files (#35647 ) * add xpu for unmask * change modular for generated matching * add lastest modeling for helium	2025-02-05 13:28:31 +01:00
Yoni Gozlan	fa56dcc2ab	Refactoring of ImageProcessorFast (#35069 ) * add init and base image processing functions * add add_fast_image_processor to transformers-cli * add working fast image processor clip * add fast image processor to doc, working tests * remove "to be implemented" SigLip * fix unprotected import * fix unprotected vision import * update ViTImageProcessorFast * increase threshold slow fast ewuivalence * add fast img blip * add fast class in tests with cli * improve cli * add fast image processor convnext * add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision * add device kwarg to ImagesKwargs for fast processing on cuda * cleanup * fix unprotected import * group images by sizes and add batch processing * Add batch equivalence tests, skip when center_crop is used * cleanup * update init and cli * fix-copies * refactor convnext, cleanup base * fix * remove patching mixins, add piped torchvision transforms for ViT * fix unbatched processing * fix f strings * protect imports * change llava onevision to class transforms (test) * fix convnext * improve formatting (following Pavel review) * fix handling device arg * improve cli * fix * fix inits * Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs * uniformize qwen2_vl fast * fix docstrings * add add fast image processor llava * remove min_pixels max_pixels from accepted size * nit * nit * refactor fast image processors docstrings * cleanup and remove fast class transforms * update add fast image processor transformers cli * cleanup docstring * uniformize pixtral fast and make _process_image explicit * fix prepare image structure llava next/onevision * Use typed kwargs instead of explicit args * nit fix import Unpack * clearly separate pops and gets in base preprocess. Use explicit typed kwargs * make qwen2_vl preprocess arguments hashable	2025-02-04 17:52:31 -05:00
Ryoo Kwangrok	b1954fd64a	layernorm_decay_fix (#35927 ) * layernorm_decay_fix * W293 fix * ruff format fix * black format * ruff format * erase last layer * add test_get_parameter_names_rmsnorm * rmsnorm fix	2025-02-04 11:01:49 +01:00
Gar	9d2056f12b	Add mean_resizing for every VLMs' resizing_token_embeddings() (#35717 ) * refine all resize_token_embedding() * ruff format * hotfix	2025-02-03 15:03:49 +01:00
Sugendran Ganess	14a9bb520e	Fix fast image processor warnings in object detection examples (#35892 ) Have the DETR examples default to using the fast image processor	2025-01-27 08:32:44 +00:00
Joao Gante	90b46e983f	Remove old `benchmark` code (#35730 ) * remove traces of the old deprecated benchmarks * also remove old tf benchmark example, which uses deleted code * run doc builder	2025-01-21 17:56:43 +00:00
Louie Tsai	f82b19cb6f	add a new flax example for Bert model inference (#34794 ) * add a new example for flax inference cases * Update examples/flax/language-modeling/README.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update examples/flax/language-modeling/README.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update examples/flax/language-modeling/README.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update examples/flax/language-modeling/README.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update examples/flax/language-modeling/README.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update examples/flax/language-modeling/README.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix for "make fixup" --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-01-21 14:09:29 +01:00
Cyril Vallez	91be6a5eb2	Modular: support for importing functions from any file (#35692 ) * fix function imports * improve comment * Update modeling_switch_function.py * make checks more robust * improvement * rename * final test update	2025-01-16 16:37:53 +00:00
Raushan Turganbay	09d5f76274	Clean-up composite configs (#34603 ) * remove manual assignment tie-word-embeddings * remove another unused attribute * fix tests * fix tests * remove unnecessary overwrites * fix * decoder=True * clean pix2struct * run-all * forgot `_tied_weights_keys` when adding Emu3 * also Aria + fix-copies * and clean aria	2025-01-15 10:04:07 +01:00
Arthur Zucker	f63829c87b	v4.49.0-dev	2025-01-10 12:31:11 +01:00
Cyril Vallez	46276f9a7f	Fix modular edge case + modular sorting order (#35562 ) * look-ahead negation * re add examples by default * Fix the bug in topological sort * Update create_dependency_mapping.py * start adding test * finalize test * more tests * style * style	2025-01-09 17:17:52 +01:00
Kevin R	665a4942e4	Check whether rescale is requested before checking is_scaled_image (#35439 )	2025-01-07 11:39:45 +00:00
dependabot[bot]	86fa3cedad	Bump jinja2 from 3.1.4 to 3.1.5 in /examples/research_projects/decision_transformer (#35408 ) Bump jinja2 in /examples/research_projects/decision_transformer Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5. - [Release notes](https://github.com/pallets/jinja/releases) - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/jinja/compare/3.1.4...3.1.5) --- updated-dependencies: - dependency-name: jinja2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-01-06 16:58:29 +00:00
Arthur	2c47618c1a	🚨All attention refactor🚨 (#35235 ) * refactor LlamaAttention * minimal changes * fix llama * update * modular gemmas * modular nits * modular updates * nits * simplify * gpt2 * more modualr and fixes * granite * modular modular modular * nits * update * qwen2 + starcoder2 * mostly gemma2 * Update image_processing_auto.py * fix * Update modular_starcoder2.py * fix * remove all copied from attentions * remove gcv * make fix-copies * oups * oups2.0 * fix some modulars + all copied from * should be good now * revert unwanted changes * Update modeling_decision_transformer.py * finish cleanup * Update modeling_olmo.py * consistency * re-add gradient checkpointing attribute * fix * style * make config necessary * bis * bis * Update modeling_my_new_model2.py * is_causal attr * fix * remove past kv return from decoder layer * fix * default rope config * correctly fix rope config * fix bias * fix gpt2 attention output * fix test * fix inits * fix default sdpa * fix default sdpa implementation * harmonize classes * fix mistral * fix sliding window models * mixtral * be more explicit * style * fix * several fixes * Update modeling_dbrx.py * fix test * olmo + phi * rotary * syle * phi * phi again * again * kwargs * Update test_modeling_common.py * skip fx tracing tests * Update modeling_utils.py * gemma 2 * again * Update modeling_recurrent_gemma.py * gemma2 * granite * style * starcoder * Update sdpa_attention.py * switch args * Update modeling_mllama.py * fix * cache type tests * gpt2 * Update test_modeling_common.py * fix * consistency * fix shape with encoder * should be the last one * tests non model * most comments * small oupsi * be more explicit in modulars * more explicit modulars * CIs! it works locally * add kwargs to _flash_attention_forward --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2024-12-18 16:53:39 +01:00
Pavel Iakubovskii	5fcf6286bf	Add TimmWrapper (#34564 ) * Add files * Init * Add TimmWrapperModel * Fix up * Some fixes * Fix up * Remove old file * Sort out import orders * Fix some model loading * Compatible with pipeline and trainer * Fix up * Delete test_timm_model_1/config.json * Remove accidentally commited files * Delete src/transformers/models/modeling_timm_wrapper.py * Remove empty imports; fix transformations applied * Tidy up * Add image classifcation model to special cases * Create pretrained model; enable device_map='auto' * Enable most tests; fix init order * Sort imports * [run-slow] timm_wrapper * Pass num_classes into timm.create_model * Remove train transforms from image processor * Update timm creation with pretrained=False * Fix gamma/beta issue for timm models * Fixing gamma and beta renaming for timm models * Simplify config and model creation * Remove attn_implementation diff * Fixup * Docstrings * Fix warning msg text according to test case * Fix device_map auto * Set dtype and device for pixel_values in forward * Enable output hidden states * Enable tests for hidden_states and model parallel * Remove default scriptable arg * Refactor inner model * Update timm version * Fix _find_mismatched_keys function * Change inheritance for Classification model (fix weights loading with device_map) * Minor bugfix * Disable save pretrained for image processor * Rename hook method for loaded keys correction * Rename state dict keys on save, remove `timm_model` prefix, make checkpoint compatible with `timm` * Managing num_labels <-> num_classes attributes * Enable loading checkpoints in Trainer to resume training * Update error message for output_hidden_states * Add output hidden states test * Decouple base and classification models * Add more test cases * Add save-load-to-timm test * Fix test name * Fixup * Add do_pooling * Add test for do_pooling * Fix doc * Add tests for TimmWrapperModel * Add validation for `num_classes=0` in timm config + test for DINO checkpoint * Adjust atol for test * Fix docs * dev-ci * dev-ci * Add tests for image processor * Update docs * Update init to new format * Update docs in configuration * Fix some docs in image processor * Improve docs for modeling * fix for is_timm_checkpoint * Update code examples * Fix header * Fix typehint * Increase tolerance a bit * Fix Path * Fixing model parallel tests * Disable "parallel" tests * Add comment for metadata * Refactor AutoImageProcessor for timm wrapper loading * Remove custom test_model_outputs_equivalence * Add require_timm decorator * Fix comment * Make image processor work with older timm versions and tensor input * Save config instead of whole model in image processor tests * Add docstring for `image_processor_filename` * Sanitize kwargs for timm image processor * Fix doc style * Update check for tensor input * Update normalize * Remove _load_timm_model function --------- Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>	2024-12-11 12:40:30 +00:00
Cyril Vallez	d363e71d0e	🧹 Remove deprecated RotaryEmbedding parts in the Attention layers (#34858 ) * update * style * fix missing args * remove last trace of old rope classes * remove deprecated copied from * fix copies * trigger CIs * post rebase clean-up * reverse mistral * cleanup after dropping commits * Add comment	2024-12-11 11:16:52 +01:00
Lysandre	66ab300aaf	Dev version	2024-12-05 19:12:22 +01:00
Cyril Vallez	1da1e0d7f2	Support for easier multimodal use of modular (#35056 ) * update modular and add examples * style * improve example comments * style * fix small logic issue for imports * fix relative order issue when files do not make sense * Improve comments * trigger CIs	2024-12-04 15:13:11 +01:00
dependabot[bot]	1de3598d30	Bump tornado from 6.4.1 to 6.4.2 in /examples/research_projects/lxmert (#34917 ) Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2. - [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst) - [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2) --- updated-dependencies: - dependency-name: tornado dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-11-25 15:19:29 +00:00
dependabot[bot]	62ab94dea8	Bump tornado from 6.4.1 to 6.4.2 in /examples/research_projects/visual_bert (#34887 ) Bump tornado in /examples/research_projects/visual_bert Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.1 to 6.4.2. - [Changelog](https://github.com/tornadoweb/tornado/blob/v6.4.2/docs/releases.rst) - [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.1...v6.4.2) --- updated-dependencies: - dependency-name: tornado dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-11-25 12:54:55 +00:00
Yoni Gozlan	3a8eb74668	Fix support for image processors modifications in modular (#34866 ) * add fix and examples * fix camel case naming	2024-11-22 18:14:24 -05:00
dependabot[bot]	15dd625a0f	Bump aiohttp from 3.10.2 to 3.10.11 in /examples/research_projects/decision_transformer (#34792 ) Bump aiohttp in /examples/research_projects/decision_transformer Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.2 to 3.10.11. - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](https://github.com/aio-libs/aiohttp/compare/v3.10.2...v3.10.11) --- updated-dependencies: - dependency-name: aiohttp dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-11-19 16:08:07 +00:00
David Zhang	427b62ed1a	Fix post process function called in the instance segmentation example of mask2former (#34588 ) * Fix post process function called in the instance segmentation example of mask2former * fix description and additional notes for post_process_instance_segmentation of maskformers * remove white space in maskformers post_process_instance_segmentation doc * change image.size[::-1] to height and width for clarity in segmentation examples	2024-11-19 16:49:25 +01:00
Cyril Vallez	e3a5889ef0	Modular fix (#34802 ) * Modular fix * style * remove logger warning * Update modular_model_converter.py	2024-11-19 16:08:57 +01:00
ZuoChen_BUPT	c772d4d91e	fix a typo bug where 'id2label' was incorrectly written as 'i2label' when reading config (#34637 ) fix a bug where 'id2label' was incorrectly written as 'i2label' when reading the config from pretrained config	2024-11-18 14:41:48 +01:00
Eon Kim	45b0c7680c	Remove unused test_dataset (#34516 )	2024-11-05 14:01:25 +00:00

1 2 3 4 5 ...

2579 Commits