Commit Graph

18498 Commits

Author SHA1 Message Date
Nikos Antoniou
f74d7da836
Introduce modular files for speech models (#35902)
* WAV_2_VEC_2 to WAV2VEC2

* added modular files for hubert, wavlm, wav2vec2_bert, data2vec_audio

* remove unnessary definitions in modulars

* added modular files for UniSpeech, UniSpeechSat, Wav2Vec2Conformer

* docstring fix for UniSpeechForCTC

* removed unneccessary re-definition of modular classes

* reverted lazy imports change on modular_model_converter, type-alias for Wav2Vec2BaseModelOutput

* top-level import of deepspeed in seamless_m4t, speecht5

* avoid tracking imports inside classes, relocate lazy deepspeed, peft imports in their original locations

* convert modular

* tiny modular typing fixes

* some more modular fixes

* make style

---------

Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com>
2025-04-04 11:46:27 +02:00
Ita Zaporozhets
d130cd0e16
update error msg (#37207) 2025-04-04 10:21:30 +02:00
Raushan Turganbay
41b9b92b52
[qwen-vl] fix image processor (#37258)
* fix

* add test
2025-04-03 19:48:56 +02:00
Surya Garikipati
8dd0a2b89c
Update model card for electra (#37063)
* Update ELECTRA model card with new format

* Update ELECTRA model card with new format

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/electra.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* close hfoption block

---------

Co-authored-by: Wun0 <f20191221@hyderabad.bits-pilani.ac.in>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 10:45:35 -07:00
Parag Ekbote
15ac2b6ac5
Update Model Card for ModernBERT (#37052)
* Modify Model Card for ModernBERT.

* Update as per code review.

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update model card.

* Update model card.

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 10:14:02 -07:00
Abhishek Ranjan
b552708694
chore: Update model doc for code_llama (#37115)
* Update code_llama.md

aims to handle https://github.com/huggingface/transformers/issues/36979#issuecomment-2758560598

sub part of https://github.com/huggingface/transformers/issues/36979

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* make changes as per code review

* chore: make the function smaller for attention mask visualizer

* chore[docs]: update code_llama.md with some more suggested changes

* Update docs/source/en/model_doc/code_llama.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* chore[docs] : Update code_llama.md with indentation changes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 10:09:41 -07:00
Bimal Gajera
2b84831a93
Update model card for Cohere (#37056)
* Update Cohere model card to follow standard template

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update cohere.md

Update code snippet for AutoModel, quantization, and transformers-cli

* Update cohere.md

* Update docs/source/en/model_doc/cohere.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 09:51:40 -07:00
Matt
2d46a08b63
Purge unused ModelTester code (#37085)
* Purge correctly this time

* Remove more methods from recent PRs

* make fixup
2025-04-03 17:48:35 +01:00
Avigyan Sinha
1b29409d89
feat: updated model card for qwen_2.5_vl (#37099)
* feat: updated model card for qwen_2.5_vl

* applied suggested change 1

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* applied suggested change 2

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* applied suggested change 3

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: made requested changes for quantization and notes

* suggeested model card change 4

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated model card wiht suggested change 5

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated model card wiht suggested change 6

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* updated model card wiht suggested change 7

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* feat: applied requested changes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-03 09:13:26 -07:00
cyyever
8a828a747e
Add Optional to types (#37163)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-03 16:38:01 +01:00
Ryan Mullins
3f6af96732
Adding links to ShieldGemma 2 technical report (#37247) 2025-04-03 16:26:29 +01:00
Joao Gante
9a1c1fe7ed
[CI] green llama tests (#37244)
* green llama tests

* use cleanup instead

* better test comment; cleanup upgrade

* better test comment; cleanup upgrade
2025-04-03 14:15:53 +01:00
Matt
782d7d945d
Allow flexible generation params arg when checking pipeline specs (#37211)
* Allow flexible generation params arg

* Trigger tests

* Add docstring and rename js_generate to hub_generate
2025-04-03 13:29:36 +01:00
Jaime Fraustro
afafb84b59
Add support for fast image processing in image-pretraining example (#37021)
* Add support for fast image processing in image-pretraining example

Fix typo: correct tuple formatting in IMAGE_PROCESSOR_MAPPING_NAMES

Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>

* Use fast image processor by default

Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>

---------

Signed-off-by: jafraustro <jaime.fraustro.valdez@intel.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-04-03 13:26:46 +01:00
Matt
34ccfebf32
Fix AST parsing when looking for remote code imports (#37245)
* Not all Call.func nodes have id because they can be methods

* Trigger tests

* Trigger tests
2025-04-03 13:00:51 +01:00
Yao Matrix
f697b3f824
enable 2 types of case on XPU (#37198)
enable 2 types of case on XPU 1. test_resize_tokens_embeddings_with_deepspeed_multi_gpu 2. test_resize_embeddings_untied_with_deepspeed_multi_gpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-03 11:37:55 +02:00
Joao Gante
2099287a59
[CI] lazy loading external datasets (#37218) 2025-04-03 09:57:45 +01:00
Fanli Lin
a0803a9555
[tests] fix mamba integration simple inference precision issue (#37193)
* fix precision issue

* use float32
2025-04-03 10:38:03 +02:00
Cyril Vallez
6ce238fe7a
Fix test (#37213)
* Update test_modeling_common.py

* style
2025-04-03 10:24:34 +02:00
regisss
12048990a9
Add new dim to num_items_in_batch if necessary (#36967)
* Add new dim to `num_items_in_batch` if necessary

* Unsqueeze only in the DP case

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-03 09:57:03 +02:00
Raushan Turganbay
98601cc818
[Phi4] add multimodal chat template (#36996)
* phi4 chat template

* remove from valid kwargs
2025-04-03 09:52:09 +02:00
Guang Yang
c9302c0983
Fix static cache export (#37229)
Co-authored-by: Guang Yang <guangyang@fb.com>
2025-04-03 07:05:57 +02:00
ARAVINDHAN T
2056287940
Updated model card for Qwen2 (#37192)
* Update qwen2.md

* Update qwen2.md

* Update qwen2.md

* Update qwen2.md

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update qwen2.md

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-02 18:10:41 -07:00
Ricardo Alanis
3e96a0c32b
Update falcon model card (#37184)
* feat: updated model card for falcon

* fix:rewrite model description

* fix: add link to conversion script

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: Add suggested changes

* fix: typo in link for quantization

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/falcon.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: fix indent and close ticks

* fix: add indent

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-02 17:30:37 -07:00
Purusharth Malik
199d7adf10
Updated the model card for CLIP (#37040)
* Update clip.md

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Incorporated suggested changes

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/model_doc/clip.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-02 14:57:38 -07:00
Matt
126abe3461
More ReDOS fixes! (#36964)
* More ReDOS fixes!

* Slight regex cleanup

* Cleanup regex replacement

* Drop that regex entirely too

* The regex didn't match config.json, let's make sure we don't either

* Cleanup allowed_value_chars a little

* Cleanup the import search

* Catch multi-condition blocks too

* Trigger tests

* Trigger tests
2025-04-02 18:46:14 +01:00
Matt
3d133cc557
Stop DOSing the Hub in the CI (#37209)
* As the title suggests, stop hammering the same files

* make fixup

* Use shutil instead of pathlib
2025-04-02 17:19:33 +01:00
Joao Gante
e90d55ebcc
[Tests] add min_new_tokens to prevent flaky length checks (#37175) 2025-04-02 15:24:00 +01:00
Matt
cbfa14823b
No more dtype_byte_size() (#37144)
* No more dtype_byte_size()

* Remove function once again

* Fix rebase cruft

* Trigger tests
2025-04-02 14:58:38 +01:00
cyyever
7613cf1a45
Add py.typed (#37022) 2025-04-02 14:17:27 +01:00
cyyever
32c12aaec3
[3/N] Use pyupgrade --py39-plus to improve code (#36936)
Use pyupgrade --py39-plus to improve code

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:16:06 +01:00
cyyever
764ab0d46a
Merge tensor operations with device transfer operations (#37097)
* Merge operations with to

Signed-off-by: cyy <cyyever@outlook.com>

* Use dtype

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: cyy <cyyever@outlook.com>
2025-04-02 14:15:23 +01:00
湛露先生
c94c6ed397
Fix some code annotation typos. (#37102)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-04-02 14:00:41 +01:00
Dan Saattrup Nielsen
e94d607c8b
fix: Add 'image-text-to-text' to TASK_MAPPING (#37107)
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2025-04-02 14:51:03 +02:00
Yih-Dar
adfc91cd46
Try to avoid/reduce some remaining CI job failures (#37202)
* try

* try

* Update tests/pipelines/test_pipelines_video_classification.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2025-04-02 14:39:57 +02:00
Xavier Dupré
6f5dc9c82e
Fixes DynamicCache export issues due to control flow and inplace modifications (#36652)
* Remove unnecessary masked_fill in deberta models

* Enable some code when exporting but not compiling

* add missing import

* style

* replace if by torch.cond

* style

* use numel

* style

* add unit tests

* style

* change empty value for dynamic cache

* replace != [] by numel()

* fix import issue

* style
2025-04-02 12:04:40 +01:00
Jerry Zhang
a165458901
Add device workaround for int4 weight only quantization after API update (#36980)
* merge

* fix import

* format

* reformat

* reformat

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-02 12:42:22 +02:00
Yih-Dar
ed95493ce0
Skip code 307 in RequestCounter (#36953)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:35:46 +02:00
Raushan Turganbay
211e4dc9a4
[chat-template] fix video loading (#37146)
* fix

* add video

* trigger

* push new iamges

* fix tests

* revert

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-02 11:27:50 +02:00
Bowen Bao
800510c67b
[doc] Fix link for Quark quantization page (#37179)
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-04-01 20:57:38 +02:00
Cyril Vallez
41f5c3216c
Revert #37031 (#37178)
Update modeling_utils.py
2025-04-01 19:48:15 +02:00
Cyril Vallez
bc2dea3f54
Fix meta state dict loading with quantizers (#37136)
Update modeling_utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-04-01 18:45:58 +02:00
Yih-Dar
35253076f4
Avoid pipeline test failing related to Hub call (#37170)
* cls

* cls

* cls

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-04-01 18:22:45 +02:00
Yufeng Xu
bf41e54fc8
Fixes the inconsistency of the optionality of attention_mask (#37153)
* debugging issue 36758

* debugging issue 36758

* debugging issue 36758

* updated attn_mask type specification in _flash_attention_forward

* removed pdb

* added a blank line

* removed indentation
2025-04-01 15:31:10 +01:00
Pavel Iakubovskii
3249c5dc15
Refactor attention for SigLIP based models (#36981)
* Update Siglip attention implementation

* Update tests for Siglip

* Remove one level of indentation

* Update test to be more specific

* Fixup

* Idefics2

* Idefics3

* Emu3

* SmolVLM

* Phi4 (just init small update)

* Idefics2 (test fix)

* Update siglip2 tests

* Update eager

* trigger

* Clean up

* Transfer inputs to device in test

* Fixing test

* Fixing test

* Revert contiguous

* Remove unused is_flash_attn_2_available

* Move flaky to specific models
2025-04-01 15:37:25 +02:00
Yao Matrix
24e311f42b
fix XPU UT error case brough by RNG difference btw XPU and CUDA (#37121)
* fix XPU UT error case brough by RNG difference btw XPU and CUDA

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Revert "enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu"

This reverts commit 3ef83a4f02.

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-01 13:52:55 +01:00
Tom Aarsen
897ff9af0e
[ModernBERT] Never save 'reference_compile' config; should be set based on end user (#36305)
* Never save 'reference_compile' config; should be set based on end user

* Reformat (I ran 'make style' from the wrong env)

* Use pop instead of del

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Use pop instead of del

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-04-01 14:14:39 +02:00
Tugsbayasgalan Manlaibaatar
c0bd8048a5
Make canine model exportable by removing unncessary complicated logic (#37124) 2025-04-01 12:31:12 +01:00
Ilyas Moutawwakil
60b75d99b6
Only count num items in batch when needed (#36867)
only count num itels when needed
2025-04-01 12:30:39 +02:00
Qizhi Chen
fac70ff3c0
Convert _VALID_DICT_FIELDS to class attribute for shared dict parsing in subclasses (#36736)
* make _VALID_DICT_FIELDS as a class attribute

* fix test case about TrainingArguments
2025-04-01 12:29:12 +02:00