transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Anton Vlasjuk	1dc619e59f	[`FlexAttn`] Fix models with unique characteristics (#38433 ) * fix * style * check * check 2 * add deepseek workaround	2025-06-04 13:37:28 +02:00
Yih-Dar	ff3fad61e3	Fix `deepseekv3` (#38562 ) * fix 1 * fix 2 * fix 3 * fix 4 * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-04 11:40:14 +02:00
Yih-Dar	6085cded38	update `utils/notification_service.py` for AMD vs Nvidia (#38563 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-04 11:38:25 +02:00
Yih-Dar	3c995c1fdc	Fix `chameleon` tests (#38565 ) * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-04 10:13:35 +02:00
Armaghan Shakir	55736eea99	Add support for MiniMax's MiniMax-Text-01 (#35831 ) * end-to-end architecture * lightning-attn: refactor, clean, optimize * put minimax_text_01 in other files * use latest __init__ standards and auto-generate modular * support attention_mask for lightning-attn * Revert "use latest __init__ standards and auto-generate modular" This reverts commit `d8d3c409d8`. * fix modular conversion * pass both attention masks instead of tuple * formatting * Updated Dynamic Cache * created MiniMaxText01Cache * fix hardcoded slope_rate * update attn_type_list in config * fix lightning when use_cache=False * copy tests from mixtral * (checkpoint) all tests pass for normal attention * fix all unittests * fix import sorting * fix consistency and formatting tests * fix config * update tests, since changes in main * fix seq_len error * create dummy docs * fix checkpoint * add checkpoint in config docstring * run modular_conversion * update docs * fix checkpoint path and update tests * fix ruff * remove repeated expected_slice * update docs * rename "minimax-text-01" to "minimax" * inherit config from mixtral * remove from docs in other languages * undo files that should be untouched * move minimax to end in conversation docs * use MiniMaxForCausalLM as it is * ruff fixes * run modular * fix docstring example in causallm * refactor attention loop and decay factors * refactor config in modular * run modular * refactor cache * rename static_cache to linear_cache * make positional embeddings necessary * remove unnecessary layernorms declarations * fix import in tests * refactor attention in next tokens * remove outdated code * formatting and modular * update tests * rename layernorm alpha/beta factors * register decay factors as buffers * remove unused declarations of decay factors * update config for alpha/beta factors * run modular * remove head_dim in tests * remove minimax from fx.py * remove stuff that is not really needed * update __init__ * update qkv 
torch.split Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * fix qkv torch.split * quality fixes * remove mistakenly added dummy * purge unused ModelTester code * fix-copies * run fix-copies * fix head_dim * write cache formatting tests * remove postnorm * avoid contiguous in attention current states * update expected_slice * add generation test for integration * fix dtype in generation test * update authors * update with changes in main * update graident checkpointing and minor fixes * fix mutable attn_type_list * rename: attn_type -> layer_type * update for layer_types * update integration tests * update checkpoint * clean overview in docs --------- Co-authored-by: Shakib-IO <shakib.khan17@northsouth.edu> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>	2025-06-04 09:38:40 +02:00
Rémi Ouazan	037acf1d10	[janus] Fix failing tests on mi3XX (#38426 ) * Fix multiple devices error on Janus * Fix AttributeError on Janus BOI token * Initialize lm first in Janus to get correct device map * Added expectations for Janus test_model_generate_images * Fixed JanusVisionEncoderLayer being split across devices * Code formatting * Adding modeling file * Reverted changes out of scope for this PR	2025-06-04 09:38:10 +02:00
Steven Liu	78d771c3c2	[docs] Format fix (#38414 ) fix table	2025-06-03 09:53:23 -07:00
Marc Sun	0f41c41a46	Fix hqq issue (#38551 ) * bc * style	2025-06-03 17:58:31 +02:00
Driss Guessous	279000bb70	Name change AOPermod -> ModuleFqn (#38456 ) Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-06-03 15:43:31 +00:00
Yih-Dar	e8b292e35f	Fix `utils/notification_service.py` (#38556 ) * fix * fix * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-03 13:59:31 +00:00
Muqi Li	8cb96787a6	Explicitly setting encoding in tokenization_utils_base.py (#38553 ) Update tokenization_utils_base.py Add encoding explicitly	2025-06-03 12:08:35 +00:00
Matej Sirovatka	caf708da1b	[TP] Change command in tests to `python3` (#38555 ) * Fix: change to `python3` * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-03 11:03:33 +00:00
Zhen	fdf86fb440	[bugfix] [WIP] fix apply_rotary_emb error on Ascend NPU (#38491 ) [bugfix] fix apply_rotary_emb error on Ascend NPU	2025-06-03 09:31:49 +00:00
Yih-Dar	ca0a682796	Update docker image to use `av` (#38548 ) * Update * Update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-03 11:04:41 +02:00
jiqing-feng	814432423c	update emu3 test (#38543 ) Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-06-03 11:02:01 +02:00
Raushan Turganbay	55ec319de6	Don't use default attn if pre-set in sub-config (#38526 ) * don't use default attn if pre-set in sib-config * style * add a test maybe	2025-06-03 07:53:07 +00:00
Raushan Turganbay	bf68dd9e6e	[tests] expand flex-attn test for vision models (#38434 ) * expand the test for VLMs * typo * mark models `supports_flex` + expand test for additional kwargs * flex attn for refactored vision models * fix copies * fix * unskip * style * address comments	2025-06-03 07:40:44 +00:00
Yih-Dar	de4cf5a38e	Fix blip2 tests (#38510 ) * fix 1: not sure * fix 2: _supports_flex_attn = False * fix 3: embedding_output = self.layernorm(query_embeds.to(self.layernorm.weight.dtype)) * fix 4: query_embeds = query_embeds.to(self.layernorm.weight.dtype) * fix 5: text_embeds = text_embeds.to(dtype=torch.float16) * fix 5: question_embeds.to(dtype=torch.float16) * fix 6: text_embeds = text_embeds.to(dtype=self.itm_head.weight.dtype) * fix 7: image_embeds and question_embeds * fix 8: fix other 2 fp16 tests * fix 9: fix T5 OOM * fix 10: fix T5 OOM * fix 11: fix T5 * fix 11: fix T5 beam * fix 12: _supports_sdpa=False * fix 12: style and expect * revert * revert --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-02 22:46:35 +02:00
Yih-Dar	ccc859620a	Fix `Gemma2IntegrationTest` (#38492 ) * fix * fix * skip-ci * skip-ci * skip-ci * skip-ci * skip-ci * skip-ci * skip-ci * skip-ci * skip-ci * skip-ci * skip-ci * update * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-06-02 22:45:09 +02:00
Yaswanth Gali	1094dd34f7	Remove type annotation in Siglip Attention Module (#38503 ) * Remove type annotation * remove print statement	2025-06-02 17:51:07 +02:00
Lysandre Debut	afb35a10ed	Num parameters in model.safetensors.index.json (#38531 ) Num parameters in index.json	2025-06-02 17:16:31 +02:00
Yiding Jia	cceab972ba	[flax/mistral] support sliding_window: null in config (#37402 ) flax/mistral: Allow sliding_window to be set to none	2025-06-02 16:45:02 +02:00
Marc Sun	1a25fd2f6d	Fix amp deprecation issue (#38100 ) apex amp is deprecated	2025-06-02 16:15:41 +02:00
Ita Zaporozhets	05ad826002	remove unhandled parameter (#38145 )	2025-06-02 15:57:32 +02:00
Tony Wu	c72ba69441	Add ColQwen2 to 🤗 transformers (#35778 ) * feat: add colqwen2 (wip) * tests: fix test_attention_outputs * tests: reduce hidden size to accelerate tests * tests: fix `test_attention_outputs` 🥳 * fix: fix wrong parent class for `ColQwen2ForRetrievalOutput` * fix: minor typing and style changes * chore: run `make style` * feat: remove redundant `max_num_visual_tokens` attribute in `ColQwen2Processor` * tests: tweak comments * style: apply ruff formatter * feat: move default values for `visual_prompt_prefix` and `query_prefix` * docs: update ColQwen2 model card * docs: tweak model cards * docs: add required example config checkpoint * tests: update expected scores in integration test * docs: tweak quickstart snippets * fix: address PR comments * tests: fix colqwen2 tests + tweak comment in colpali test * tests: unskip useful tests * fix: fix bug when `visual_prompt_prefix` or `query_prefix` is an empty string * fix: fix ColPali outputs when `return_dict == False` * fix: fix issue with PaliGemma output not being a dict * docs: set default dtype to bfloat16 in quickstart snippets * fix: fix error when `return_dict=False` in ColPali and ColQwen2 * tests: fix special tokens not being replaced in input_ids * style: fix lint * fix: `ColQwen2Processor`'s `padding_side` is now set from `processor_config.json` * fix: remove unused `padding_side` in ColQwen2 model * docs: update ColQwen2's model doc * fix: fix harcoded vlm backbone class in ColQwen2Config * fix: remove `padding_side` from ColQwen2Processor as should fed from kwargs * docs: fix typo in model docstring * docs: add illuin mention in model docs * fix: let `padding_size` be handled by `tokenizer_config.json` * docs: add colpali reference url in colqwen2's model doc * docs: add Hf mention in model docs * docs: add late interaction mention in model docs * docs: tweak colqwen2 model doc * docs: update reference checkpoint for ColPali to v1.3 * docs: simplify quickstart snippets * docs: remove 
redundant `.eval()` * refactor: use `can_return_tuple` decorator for ColPali and ColQwen2 * docs: fix copyright date * docs: add missing copyright in tests * fix: raise error when `initializer_range` is not in config * docs: remove redundant `.eval()` in colpali doc * fix: fix `get_text_config` now that Qwen2VL has a proper `text_config` attribute See https://github.com/huggingface/transformers/pull/37268 for details about changes in Qwen2VL's config. * fix: add missing `initializer_range` attribute in `ColQwen2Config` * fix: use `get_text_config` in `resize_token_embeddings` * update colwen2 with auto_docstring * docs: fix wrong copyright year * chore: remove `raise` as `initializer_range` has a default value in `ColQwen2Config` * refactor: merge `inner_forward` into `forward` * Refactor colqwen2 after refactoring of qwen2VL, use modular for modeling code * protect torch import in modular to protect in processing * protect torch import in modular to protect in processing * tests: fix hf model path in ColQwen2 integration test * docs: clarify `attn_implementation` and add comments * docs: add fallback snippet for using offline PIL dummy images * docs: temporarily revert attn_implementation to `None` while sdpa is not fixed * docs: tweaks in colpali/colqwen2 quick start snippets * fix: add missing flags to enable SDPA/Flex Attention in ColQwen2 model * fix: add missing changes in modular file * fix modeling tests --------- Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>	2025-06-02 12:58:01 +00:00
Joao Gante	beaed8ce01	[generate] move `SinkCache` to a `custom_generate` repo (#38399 ) remove sink cache	2025-06-02 12:13:30 +02:00
Joao Gante	fe5bfaa4b5	[generate] add soft deprecations on custom generation methods (#38406 ) soft deprecations	2025-06-02 12:11:46 +02:00
mohammed benyamna	a75b9ffb5c	Update Loss Functions to Accept Tensor num_items_in_batch (#38029 ) * Update Loss Functions to Accept Tensor num_items_in_batch * Fix device mismatch by moving num_items_in_batch to loss device in fixed_cross_entropy * fix the ruff check * delete the unused if stat * fix the type problem	2025-06-02 11:31:44 +02:00
Rémi Ouazan	493cf1554b	[seamless_m4t] Skip some tests when speech is not available (#38430 ) * Added the require_speech decorator * Added require_speecj to some seamless_m4t tests * Changed skip message	2025-06-02 09:17:28 +00:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟	64d14ef28d	Fix setting FLASH_ATTENTION_DETERMINISTIC after importing (#37185 ) transformers.enable_full_determinism enables deterministic flash attention using `FLASH_ATTENTION_DETERMINISTIC` `800510c67b/src/transformers/trainer_utils.py (L79)` However, current checks use a global variable `deterministic_g`, which will do the environment variable check as soon as importing, this will cause issues as users can call `transformers.enable_full_determinism` after `transformers.modeling_flash_attention_utils` is imported. This behavior is introduced in https://github.com/huggingface/transformers/pull/33932/files#r1806668579 to fix the graph break. As a result, this PR implement fixes by delaying the environment variable check to the first time when `_flash_attention_forward` is executed, so that we can fix this issue and we won't introduce a graph break. Signed-off-by: Hollow Man <hollowman@opensuse.org>	2025-06-02 11:08:20 +02:00
Yuanyuan Chen	fde1120b6c	Remove deprecated use_flash_attention_2 parameter (#37131 ) Signed-off-by: cyy <cyyever@outlook.com>	2025-06-02 11:06:25 +02:00
Fanli Lin	51d732709e	[docs] add xpu environment variable for gpu selection (#38194 ) * squash commits * rename gpu * rename accelerator * change _toctree.yml * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: sdp <sdp@a4bf01943ff7.jf.intel.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-05-30 16:05:07 +00:00
Marc Sun	c7f2b79dd8	protect dtensor import (#38496 ) protect	2025-05-30 17:36:00 +02:00
Marc Sun	051a8acc9a	Align TP check (#38328 ) align tp check	2025-05-30 17:15:39 +02:00
M Saqlain	e0545ef0b8	[Tests] Reduced model size for albert-test model (#38480 ) * Reduced model size for albert-test model * Run checks * Removed test_save_load * Removed test skipping functions	2025-05-30 14:22:32 +00:00
dependabot[bot]	f962c862ff	Bump torch from 2.2.0 to 2.6.0 in /examples/flax/vision (#37618 ) Bumps [torch](https://github.com/pytorch/pytorch) from 2.2.0 to 2.6.0. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v2.2.0...v2.6.0) --- updated-dependencies: - dependency-name: torch dependency-version: 2.6.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-05-30 14:04:52 +01:00
islemyakoubi	98568d1e25	Fix incorrect bbox_embed initialization when decoder_bbox_embed_share=False in GroundingDINO (#38238 ) * A shallow copy in groundingdino Fixes #37333 * Remove an empty line in the GroundingDinoForObjectDetection class * Translate comments in the GroundingDinoForObjectDetection class from French to English	2025-05-30 15:02:18 +02:00
Winston Castorp	d0fccbf7ef	Fix convert_internvl_weights_to_hf.py to support local paths (#38264 ) fix(internvl): add local path support to convert_internvl_weights_to_hf.py	2025-05-30 14:56:32 +02:00
Arthur	858ce6879a	make it go brrrr (#38409 ) * make it go brrrr * date time * update * fix * up * uppp * up * no number i * udpate * fix * [paligemma] fix processor with suffix (#38365) fix pg processor * [video utils] group and reorder by number of frames (#38374) fix * Fix convert to original state dict for VLMs (#38385) * fix convert to original state dict * fix * lint * Update modeling_utils.py * update * warn * no verbose * fginal * ouft * style --------- Co-authored-by: Raushan Turganbay <raushan@huggingface.co> Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-05-30 11:19:42 +02:00
Luc Georges	ab5067e7fd	fix: handle no scheduler passed by user (#38407 )	2025-05-30 11:00:44 +02:00
XING, Zhenghao	42ef218b58	[Qwen2.5-Omni] Fix dtype of cos,sin when used with flash attention (#38453 ) * Fix dtype of cos,sin when used with flash attention * Fix dtype of cos,sin when used with flash attention	2025-05-29 18:24:40 +00:00
Yih-Dar	81cff7ad34	Fix `Gemma3IntegrationTest` (#38471 ) * check * check * check * check * check * check * check * test style bot * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-05-29 16:51:12 +02:00
Lukas Geiger	e508965df7	Cleanup `BatchFeature` and `BatchEncoding` (#38459 ) * Use dict comprehension to create dict * Fix type annotation Union[Any] doesn't really make any sense * Remove methods that are already implemented in the `UserDict` parent class	2025-05-29 14:13:43 +00:00
Rahul	8e5cefcb1e	Fix TypeError in save_pretrained error handling (fixes #38422 ) (#38449 )	2025-05-29 13:58:16 +00:00
Raushan Turganbay	ad9dd3d17b	🔴 [VLM] modeling updates (#38317 ) * updates * fixup * fix tests * fix test * fix * let it be here for now, till monday * two more fixes * persimmon * fixup * fix * fixup * make sure fuyu runs now that LM has new attn API * fixup + tests * qwen vl uses new mask interface as well * qwen image features format * update * remove image_sizes * address comments * i am dumb...	2025-05-29 11:08:23 +00:00
Yaswanth Gali	a6f7acb603	[Tests] Clean up test cases for few models (#38315 ) * Update tests * revert aria change * too slow hence revert	2025-05-29 08:21:28 +00:00
Luc Georges	8010f3cf61	feat: add cache retention for requests (#38446 ) * feat: add cache retention for requests * fix: propagate `manual_eviction` param & refactor `finish_request` `finish_request` now only takes `request_id: str` as an input rather than the full `RequestState`, which was not needed and simplifies calling from `ContinuousBatchingManager::evict_request_from_cache` * refactor: pop req from `active_requests` * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-05-28 18:15:10 +00:00
Yih-Dar	66da700145	Fix GLM4 checkpoints (#38412 ) * fix * fix * fix * fix * fix * fix * test style bot * Apply style fixes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-05-28 16:40:08 +00:00
Avasam	2872e8bac5	Merge type hints from `microsoft/python-type-stubs` (post dropping support for Python 3.8) (#38335 ) * Merge type hints from microsoft/python-type-stubs (post Python 3.8) * Remove mention of pylance * Resolved conflict * Merge type hints from microsoft/python-type-stubs (post Python 3.8) * Remove mention of pylance * Resolved conflict * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Avasam <samuel.06@hotmail.com> --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>	2025-05-28 16:21:40 +00:00
Yuanzhou Cai	942c60956f	Model card for mobilenet v1 and v2 (#37948 ) * doc: #36979 * doc: update hfoptions * add model checkpoints links * add model checkpoints links * update example output * update style #36979 * add pipeline tags * improve comments * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * apply suggested changes * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-05-28 09:20:19 -07:00

19186 Commits
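
Note on commit 64d14ef28d: its message describes moving the `FLASH_ATTENTION_DETERMINISTIC` check from import time to the first call of `_flash_attention_forward`. The general pattern might look roughly like the sketch below. This is an illustration only, not the code from the PR: the helper names `is_deterministic` and `enable_determinism` are hypothetical, and so is the assumption that the value `"1"` means "enabled".

```python
import os
from typing import Optional

# Hypothetical module-level cache; None means "environment not checked yet".
_deterministic: Optional[bool] = None


def is_deterministic() -> bool:
    """Read the environment variable on first call, then reuse the cached result."""
    global _deterministic
    if _deterministic is None:
        # Assumption for this sketch: "1" means deterministic mode is on.
        _deterministic = os.environ.get("FLASH_ATTENTION_DETERMINISTIC", "0") == "1"
    return _deterministic


def enable_determinism() -> None:
    """Stand-in for a user-facing switch that sets the variable after import."""
    os.environ["FLASH_ATTENTION_DETERMINISTIC"] = "1"
```

Because the variable is read lazily, calling `enable_determinism()` after the module has been imported but before the first `is_deterministic()` call still takes effect. Later changes to the variable are ignored once the value is cached.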