ivarflakstad
7ec35bc3bd
Add missing atol to torch.testing.assert_close where rtol is specified ( #36234 )
2025-02-17 14:57:50 +01:00
Joao Gante
dad513e0c2
[generate] remove cache v4.47 deprecations ( #36212 )
2025-02-17 13:55:03 +00:00
ivarflakstad
936aeb70ab
AMD DeepSpeed image additional HIP dependencies ( #36195 )
...
* Add hipsolver and hipblaslt as dependencies
* Upgrade torch libs with rocm6.2.4 index
2025-02-17 11:50:49 +01:00
Yih-Dar
23d6095e8f
Fix LlavaForConditionalGenerationModelTest::test_config after #36077 ( #36230 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-17 11:49:07 +01:00
Fanli Lin
fae0f3dde8
[tests] fix EsmModelIntegrationTest::test_inference_bitsandbytes ( #36225 )
...
fix failed test
2025-02-17 11:10:33 +01:00
Yih-Dar
dd16acb8a3
set test_torchscript = False for Blip2 testing ( #35972 )
...
* just skip
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-14 17:43:32 +01:00
Yih-Dar
0a9923a609
Use args.num_workers in check_modular_conversion.py ( #36200 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-14 17:31:03 +01:00
Mayank Mishra
a570e2ba87
add shared experts for upcoming Granite 4.0 language models ( #35894 )
...
* Modular GraniteMoE with shared Experts.
Signed-off-by: Shawn Tan <shawntan@ibm.com>
* Modified
* Import order.
* Modified for style
* Fix space.
* Test
* Remove extra granitemoe file.
* New converted file and tests
* Modified __init__ files.
* Formatting.
* Dummy PT objects
* register granitemoe shared model
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix linting of a file
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix import in modeling file
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* update generated modeling file
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* add documentation
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* update docstrings
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* update generated modeling file
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* fix docstrings in config class
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
* merge main
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
---------
Signed-off-by: Shawn Tan <shawntan@ibm.com>
Signed-off-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Shawn Tan <shawntan@ibm.com>
Co-authored-by: Shawn Tan <shawn@wtf.sg>
Co-authored-by: Sukriti-Sharma4 <sukriti.sharma4@ibm.com>
Co-authored-by: Sukriti Sharma <Ssukriti@users.noreply.github.com>
2025-02-14 16:55:28 +01:00
ivarflakstad
7ae7e87a09
Add @require_bitsandbytes to Aria test_batched_generation ( #36192 )
2025-02-14 15:48:47 +01:00
Kyle Sayers
bcfc9d795e
[Bugfix] Fix reloading of pixtral/llava configs ( #36077 )
...
* add is_composition flag to LlavaConfig
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* WIP: pixtral text config
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* fix style
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* add test
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* use is_composition for pixtral
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* Revert "use is_composition for pixtral"
This reverts commit a53d5f9fc5.
* Revert "Revert "use is_composition for pixtral""
This reverts commit 3ab1c99404.
---------
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-02-14 15:27:05 +01:00
Raushan Turganbay
0c78ef6cd3
🔴 VLM: compile compatibility ( #35724 )
...
* llavas
* add more models
* fix `compile_forward` test for all models
* fix copies
* make style
* also doesn't support cache class
* fix some tests
* not copied from
* ci green?
* fix tests
* fix copies
* fix tests
* check with `numel` and remove `item`
* fix copies
* fix copies
* Update src/transformers/models/cohere2/modeling_cohere2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* opt remove cross attn
* gemma2
* fixup
* fixup
* fix newly added test
* maybe fixed?
* green please?
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-14 15:23:49 +01:00
David LaPalomento
b45cf0e90a
Guard against unset resolved_archive_file ( #35628 )
...
* archive_file may not be specified
When loading a pre-trained model from a gguf file, resolved_archive_file may not be set. Guard against that case in the safetensors availability check.
* Remap partial disk offload to cpu for GGUF files
GGUF files don't support disk offload so attempt to remap them to the CPU when device_map is auto. If device_map is anything else but None, raise a NotImplementedError.
* Don't remap auto device_map and raise RuntimeError
If device_map=auto and modules are selected for disk offload, don't attempt to map them to any other device. Raise a runtime error when a GGUF model is configured to map any modules to disk.
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-14 14:44:31 +01:00
Arthur
96f01a36ac
Revert qwen2 breaking changes related to attention refactor ( #36162 )
...
* ditto
* add a test
* update
* test needs fa2
* update test and configuration
* test requires fa2
* style
2025-02-14 13:44:14 +01:00
Mohamed Mekkouri
cb586a3999
Add require_read_token to fp8 tests ( #36189 )
...
fix
2025-02-14 12:27:35 +01:00
Andrei Panferov
5f726f8b8e
New HIGGS quantization interfaces, JIT kernel compilation support. ( #36148 )
...
* new flute
* new higgs working
* small adjustments
* progress and quality
* small updates
* style
---------
Co-authored-by: Andrey Panferov <panferov.andrey3@wb.ru>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2025-02-14 12:26:45 +01:00
Raushan Turganbay
15ec971b8e
Prepare processors for VideoLLMs ( #36149 )
...
* allow processor to preprocess conversation + video metadata
* allow callable
* add test
* fix test
* nit: fix
* add metadata frames_indices
* Update src/transformers/processing_utils.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Update src/transformers/processing_utils.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* port updates from Orr and add one more test
* Update src/transformers/processing_utils.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* typo
* as dataclass
* style
* docstring + make sure tests are green
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-02-14 11:34:08 +01:00
Isotr0py
33d1d715b0
Add ImageProcessorFast to Qwen2.5-VL processor ( #36164 )
...
* add qwen2 fast image processor to modular file
Signed-off-by: isotr0py <2037008807@qq.com>
* fix modular
Signed-off-by: isotr0py <2037008807@qq.com>
* fix circular import
Signed-off-by: isotr0py <2037008807@qq.com>
* add docs
Signed-off-by: isotr0py <2037008807@qq.com>
* fix typo
Signed-off-by: isotr0py <2037008807@qq.com>
* add modular generated files
Signed-off-by: isotr0py <2037008807@qq.com>
* revert qwen2vl fast image processor
Signed-off-by: isotr0py <2037008807@qq.com>
* remove qwen2.5-vl image processor from modular
Signed-off-by: isotr0py <2037008807@qq.com>
* re-generate qwen2.5-vl files
Signed-off-by: isotr0py <2037008807@qq.com>
* remove unnecessary test
Signed-off-by: isotr0py <2037008807@qq.com>
* fix auto map
Signed-off-by: isotr0py <2037008807@qq.com>
* cleanup
Signed-off-by: isotr0py <2037008807@qq.com>
* fix model_input_names
Signed-off-by: isotr0py <2037008807@qq.com>
* remove import
Signed-off-by: isotr0py <2037008807@qq.com>
* make fix-copies
Signed-off-by: isotr0py <2037008807@qq.com>
---------
Signed-off-by: isotr0py <2037008807@qq.com>
2025-02-14 17:34:55 +08:00
Raushan Turganbay
1931a35140
Chat template docs ( #36163 )
...
* decompose chat template docs
* add docs
* update model docs
* qwen2-5
* pixtral
* remove old chat template
* also video as list frames supported
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* remove audio for now
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-14 10:32:14 +01:00
Raushan Turganbay
3bf02cf440
CI: fix test-save-trainer ( #36191 )
...
* fix
* also the docstring
2025-02-14 10:20:56 +01:00
Amit Garg
0ae93d31ce
Add support for partial rotary embeddings in Phi3 model ( #35947 )
...
* Added support for partial_rotary_factor
* addressed comments
* refactored
2025-02-14 09:37:38 +01:00
Yoni Gozlan
336dc69d63
Uniformize OwlViT and Owlv2 processors ( #35700 )
...
* uniformize owlvit processor
* uniformize owlv2
* nit
* add positional arg test owlvit
* run-slow: owlvit, owlv2
* run-slow: owlvit, owlv2
* remove one letter variable
2025-02-13 17:30:26 -05:00
Yoni Gozlan
e6a7981711
Fix make_batched_videos and add tests ( #36143 )
...
* add support for initial shift in video processing and other fixes
* revert modifications video loading functions
2025-02-13 17:14:30 -05:00
Yih-Dar
8fd4bc7d1d
Fix a mistake in #36175 ( #36179 )
...
fix my bad
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-13 18:33:02 +01:00
Mohamed Mekkouri
b1a2de075d
Follow up to SpQR integration ( #36176 )
...
fix
2025-02-13 17:40:59 +01:00
Wizyoung
12962fe84b
Fix the key name for _load_rng_state under torch.cuda ( #36138 )
...
fix load key name for _load_rng_state under torch.cuda
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 11:35:08 -05:00
Yih-Dar
bfe46c98b5
Make check_repository_consistency run faster by MP ( #36175 )
...
* speeddddd
* speeddddd
* speeddddd
* speeddddd
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-13 17:25:17 +01:00
Jiahao Li
5f0fd1185b
Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks ( #35837 )
...
* Optimize Qwen2VL vision model by precomputing cos/sin embeds before ViT blocks
* Make rotary_pos_emb optional & fix type
* Adapt pre-computed cos/sin to Qwen2.5VL
* More concise
2025-02-13 17:10:58 +01:00
மனோஜ்குமார் பழனிச்சாமி
d72642bccc
Use tqdm auto ( #35726 )
...
* Remove traces of the progressbar
* Use tqdm auto
2025-02-13 15:41:30 +00:00
Joao Gante
62c7ea0201
CI: avoid human error, automatically infer generative models ( #33212 )
...
* tmp commit
* move tests to the right class
* remove ALL all_generative_model_classes = ...
* skip tf roberta
* skip InstructBlipForConditionalGenerationDecoderOnlyTest
* videollava
* reduce diff
* reduce diff
* remove on vlms
* fix a few more
* manual rebase bits
* more manual rebase
* remove all manual generative model class test entries
* fix up to ernie
* a few more removals
* handle remaining cases
* recurrent gemma
* it's better here
* make fixup
* tf idefics is broken
* tf bert + generate is broken
* don't touch tf :()
* don't touch tf :(
* make fixup
* better comments for test skips
* revert tf changes
* remove empty line removal
* one more
* missing one
2025-02-13 16:27:11 +01:00
Arthur
06231fdfc7
add disable compile option ( #36161 )
...
* add disable compile code
* fix
2025-02-13 16:24:46 +01:00
Arthur
0ca7259217
fix training issues ( #36158 )
...
* fix training issues
* Update
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 16:24:28 +01:00
Elvir Crnčević
845b0a2616
Efficient Inference Kernel for SpQR ( #34976 )
...
* Resolve vptq conflict
* Rename spqr package to spqr_quant
* Get rid of aqlm mention
* Start working on tests
* Resolve ruff code checks
* Ruff format
* Isort
* Test updates
* Add gpu tag
* Rename to modules_to_not_convert
* Config update
* Docs and config update
* Docs and config update
* Update to update_torch_dtype
* spqr config parameter validation
* Ruff update
* Apply ruff fixes
* Test fixes
* Ruff update
* Mark tests as @slow again; Ruff; Docstring update
* Ruff
* Remove absolute path
* Resolve typo
* Remove redundant log
* Check accelerate/spqr availability
* Ruff fix
* Check if the config contains proper shapes
* Ruff test
* Documentation update
* overview update
* Ruff checks
* Ruff code quality
* Make style
* Update docs/source/en/quantization/spqr.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update spqr.md
* Enable gptqmodel (#35012 )
* gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update readme
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* gptqmodel need use checkpoint_format (#1 )
* gptqmodel need use checkpoint_format
* fix quantize
* Update quantization_config.py
* Update quantization_config.py
* Update quantization_config.py
---------
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* Revert quantizer_gptq.py (#2 )
* revert quantizer_gptq.py change
* pass **kwargs
* limit gptqmodel and optimum version
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix warning
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix version check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* revert unrelated changes
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* enable gptqmodel tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix requires gptq
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Fix Transformer compat (#3 )
* revert quantizer_gptq.py change
* pass **kwargs
* add meta info
* cleanup
* cleanup
* Update quantization_config.py
* hf_select_quant_linear pass checkpoint_format and meta
* fix GPTQTestCUDA
* Update test_gptq.py
* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2
* cleanup
* add backend
* cleanup
* cleanup
* no need check exllama version
* Update quantization_config.py
* lower checkpoint_format and backend
* check none
* cleanup
* Update quantization_config.py
* fix self.use_exllama == False
* spell
* fix unittest
* fix unittest
---------
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format again
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update gptqmodel version (#6 )
* update gptqmodel version
* update gptqmodel version
* fix unit test (#5 )
* update gptqmodel version
* update gptqmodel version
* "not self.use_exllama" is not equivalent to "self.use_exllama==False"
* fix unittest
* update gptqmodel version
* backend is loading_attibutes (#7 )
* fix format and tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix memory check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix device mismatch
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix result check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* update tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* review: update docs (#10 )
* review: update docs (#12 )
* review: update docs
* fix typo
* update tests for gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update document (#9 )
* update overview.md
* cleanup
* Update overview.md
* Update overview.md
* Update overview.md
* update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
---------
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* typo
* doc note for asymmetric quant
* typo with apple silicon(e)
* typo for marlin
* column name revert: review
* doc rocm support
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fix : Nemotron Processor in GGUF conversion (#35708 )
* fixing nemotron processor
* make style
* Update docs/source/en/quantization/spqr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add missing TOC to doc
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-02-13 16:22:58 +01:00
dependabot[bot]
c5506f4f00
Bump transformers from 4.38.0 to 4.48.0 in /examples/research_projects/adversarial ( #36168 )
...
Bump transformers in /examples/research_projects/adversarial
Bumps [transformers](https://github.com/huggingface/transformers ) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases )
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0 )
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 15:06:16 +00:00
dependabot[bot]
d7c5d1b539
Bump transformers from 4.38.0 to 4.48.0 in /examples/tensorflow/language-modeling-tpu ( #36167 )
...
Bump transformers in /examples/tensorflow/language-modeling-tpu
Bumps [transformers](https://github.com/huggingface/transformers ) from 4.38.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases )
- [Commits](https://github.com/huggingface/transformers/compare/v4.38.0...v4.48.0 )
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-13 14:46:38 +00:00
Joao Gante
636ee57489
[generate] revert change in Aria: the maximum cache length must match max_length ( #36120 )
...
* revert inputs_embeds len
* Update test_utils.py
* make fixup
2025-02-13 14:36:33 +00:00
Mohamed Mekkouri
b41591d847
Fix : fix doc fp8 ( #36173 )
...
* fix
* fix
2025-02-13 15:29:59 +01:00
Arthur
b079dd1fa2
Fix red CI ( #36174 )
...
test was weird
2025-02-13 14:27:55 +01:00
Joao Gante
d114a6f78e
[Modular] skip modular checks based on diff ( #36130 )
...
skip modular checks based on diff
2025-02-13 12:53:21 +00:00
Pavel Iakubovskii
6397916dd2
Remove loading custom kernel for RT-DETRv2 ( #36098 )
...
* Remove loading custom kernels
* Remove config param
* Fixup
2025-02-13 12:01:53 +00:00
Mohamed Mekkouri
efe72fe21f
Adding FP8 Quantization to transformers ( #36026 )
...
* first commit
* adding kernels
* fix create_quantized_param
* fix quantization logic
* end2end
* fix style
* fix imports
* fix consistency
* update
* fix style
* update
* update after review
* make style
* update
* update
* fix
* update
* fix docstring
* update
* update after review
* update
* fix scheme
* update
* update
* fix
* update
* fix docstring
* add source
* fix test
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2025-02-13 13:01:19 +01:00
Lysandre Debut
c82319b493
Helium documentation fixes ( #36170 )
...
* Helium documentation fixes
* Update helium.md
* Update helium.md
* Update helium.md
2025-02-13 12:20:53 +01:00
Thomas Bauwens
8f137b2427
Move DataCollatorForMultipleChoice from the docs to the package ( #34763 )
...
* Add implementation for DataCollatorForMultipleChoice based on docs.
* Add DataCollatorForMultipleChoice to import structure.
* Remove custom DataCollatorForMultipleChoice implementations from example scripts.
* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.
* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.
* Apply suggested changes and run make fixup.
* fix copies, style and fixup
* add missing documentation
* nits
* fix docstring
* style
* nits
* isort
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2025-02-13 12:01:28 +01:00
CL-ModelCloud
35c155052d
Fix PretrainedTokenizerFast check => Fix PretrainedTokenizerFast Save ( #35835 )
...
* Fix the bug in tokenizer.save_pretrained when saving tokenizer_class to tokenizer_config.json
* Update tokenization_utils_base.py
* Update tokenization_utils_base.py
* Update tokenization_utils_base.py
* add tokenizer class type test
* code review
* code opt
* fix bug
* Update test_tokenization_fast.py
* ruff check
* make style
* code opt
* Update test_tokenization_fast.py
---------
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
2025-02-13 12:00:33 +01:00
Marco Edward Gorelli
3c912c9089
docs: fix return type annotation of get_default_model_revision ( #35982 )
2025-02-13 11:59:15 +01:00
gewenbin0992
6a1ab634b6
qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1 ( #36083 )
...
* qwen2.5vl: fix bugs when using flash2+bf16 or num_return_sequences>1
* fix
* fix
* fix
* fix
* add tests
* fix test bugs
* fix
* fix failed tests
* fix
2025-02-13 11:35:28 +01:00
Pavel Iakubovskii
d419862889
Fix tests for vision models ( #35654 )
...
* Trigger tests
* [run-slow] beit, detr, dinov2, vit, textnet
* Fix BEiT interpolate_pos_encoding
* Fix DETR test
* Update DINOv2 test
* Fix textnet
* Fix vit
* Fix DPT
* fix data2vec test
* Fix textnet test
* Update interpolation check
* Fix ZoeDepth tests
* Update interpolate embeddings for BEiT
* Apply suggestions from code review
2025-02-13 10:28:37 +00:00
Lucain
e60ae0d078
Replace deprecated update_repo_visibility ( #35970 )
2025-02-13 11:27:55 +01:00
Nerogar
9065cf0d92
Fix Gemma2 dtype issue when storing weights in float16 precision ( #35398 )
...
fix gemma2 dtype issue when storing weights in float16 precision
2025-02-13 11:17:37 +01:00
Ben Schneider
08ab1abff4
Add reminder config to issue template and print DS version in env ( #35156 )
...
* update env command to log deepspeed version
* suppress deepspeed import logging
* Add reminder to include configs to repro description in bug report.
* make fixup
* [WIP] update import utils for deepspeed
* Change to using is_deepspeed_available() from integrations.
* make fixup
2025-02-13 10:55:49 +01:00
Sambhav Dixit
950cfb0b4f
Fix PaliGemma Pad Token Masking During Training #35855 ( #35859 )
...
* change order of unmasking of tokens
* library import
* class setup
* test function
* refactor
* add commit message
* test modified
* explicit initialisation of weights + made model smaller
* removed separate testing file
* fixup
* fixup core
* test attention mask with token types
* tests fixup
* removed PaliGemmaAttentionMaskTest class
---------
Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
2025-02-13 10:11:44 +01:00