* Enable non-safetensor serialization and deserialization for TorchAoConfig quantized model
Summary:
After https://github.com/huggingface/huggingface_hub/pull/2440 added non-safetensor serialization and deserialization
to huggingface_hub, we can now add the corresponding support in transformers.
Note that we don't plan to add safetensor serialization, since wrapper tensor subclasses and safetensors have different goals;
see the README for more details.
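For reference, a minimal sketch of the round trip this enables (checkpoint name and quantization settings are illustrative; assumes torchao is installed):
```python
import torch
from transformers import AutoModelForCausalLM, TorchAoConfig

# int4 weight-only quantization via torchao (settings are illustrative)
quant_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",  # stand-in checkpoint for illustration
    torch_dtype=torch.bfloat16,
    quantization_config=quant_config,
)

# torchao wraps weights in tensor subclasses, which safetensors cannot store,
# so serialize with safe_serialization=False (pickle-based .bin files)
model.save_pretrained("opt-125m-int4", safe_serialization=False)

# the quantized checkpoint loads back without re-quantizing
reloaded = AutoModelForCausalLM.from_pretrained("opt-125m-int4", torch_dtype=torch.bfloat16)
```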
Test Plan:
tested locally
* formatting
* formatting
* minor fix
* formatting
* address comments
* comments
* minor fix
* update doc
* refactor compressed tensor quantizer
* fix return type
* update to union
* fix gate_logits typing
* fix num_experts type
* fix typing
* run fix-copies
* add doc for top_k
* run fix-copies
* empty commit to trigger CI
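For context, a hedged sketch of the shape these typing fixes converge on, modeled on the Mixtral-style load-balancing helper (the exact signature and body here are assumptions, not the merged code):
```python
from typing import Optional, Tuple, Union

import torch

def load_balancing_loss_func(
    gate_logits: Union[torch.Tensor, Tuple[torch.Tensor], None],
    num_experts: Optional[int] = None,
    top_k: int = 2,  # number of experts each token is routed to
) -> Union[torch.Tensor, int]:
    # with no router logits the loss degenerates to 0 (an int),
    # hence the Union return type
    if gate_logits is None or not isinstance(gate_logits, tuple):
        return 0

    # stack per-layer router logits, then compare the fraction of tokens
    # dispatched to each expert with the mean router probability per expert
    concatenated = torch.cat([layer.to(gate_logits[0].device) for layer in gate_logits], dim=0)
    routing_weights = torch.nn.functional.softmax(concatenated, dim=-1)
    _, selected_experts = torch.topk(routing_weights, top_k, dim=-1)
    expert_mask = torch.nn.functional.one_hot(selected_experts, num_experts)
    tokens_per_expert = torch.mean(expert_mask.float(), dim=0)
    router_prob_per_expert = torch.mean(routing_weights, dim=0)
    return torch.sum(tokens_per_expert * router_prob_per_expert.unsqueeze(0)) * num_experts
```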
* Make audio classification pipeline spec-compliant and add test
* Check that test actually running in CI
* Try a different pipeline for the CI
* Move the test so it gets triggered
* Move it again, this time into task_tests!
* make fixup
* indentation fix
* comment
* Move everything from testing_utils to test_pipeline_mixin
* Add output testing too
* revert small diff with main
* make fixup
* Clarify comment
* Update tests/pipelines/test_pipelines_audio_classification.py
Co-authored-by: Lucain <lucainp@gmail.com>
* Update tests/test_pipeline_mixin.py
Co-authored-by: Lucain <lucainp@gmail.com>
* Rename function and js_args -> hub_args
* Cleanup the spec recursion
* Check keys for all outputs
---------
Co-authored-by: Lucain <lucainp@gmail.com>
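A hedged illustration of the output spec the new test enforces (checkpoint and file names are illustrative):
```python
from transformers import pipeline

classifier = pipeline("audio-classification", model="superb/wav2vec2-base-superb-ks")
outputs = classifier("sample.wav", top_k=2)  # any local audio file

# task spec: a list of dicts, each with exactly a float "score" and a str "label"
for result in outputs:
    assert set(result.keys()) == {"score", "label"}
    assert isinstance(result["score"], float) and isinstance(result["label"], str)
```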
* Cleanup return_text and return_full_text options in TextGenerationPipeline
* Cleanup return_text and return_full_text options in TextGenerationPipeline
* Cleanup return_text and return_full_text options in TextGenerationPipeline
* Cleanup return_text and return_full_text options in TextGenerationPipeline
* Revert pipeline code, but update docs instead
* Restore pipeline test
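The behavior the updated docs spell out, as a minimal sketch (gpt2 used for illustration):
```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "Hello, I'm a language model,"

# default return_full_text=True: "generated_text" holds prompt + completion
full = generator(prompt, max_new_tokens=20)[0]["generated_text"]

# return_full_text=False: completion only
completion = generator(prompt, max_new_tokens=20, return_full_text=False)[0]["generated_text"]

# return_tensors=True: token ids under "generated_token_ids" instead of text
ids = generator(prompt, max_new_tokens=20, return_tensors=True)[0]["generated_token_ids"]
```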
* add bloom arch support for gguf
* apply format
* small refactoring, bug fix in GGUF_TENSOR_MAPPING naming
* optimize bloom GGUF_TENSOR_MAPPING
* implement reverse reshaping for bloom gguf
* add qkv weights test
* add q_8 test for bloom
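With this support, loading a GGUF BLOOM checkpoint should follow the existing GGUF path; a sketch (repo and file names below are hypothetical; q8_0 matches the test added here):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bloom-560m-gguf"        # hypothetical repo
gguf_file = "bloom-560m.q8_0.gguf"  # hypothetical quantized file inside the repo

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)
```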
* Update siglip.md
This was already partially fixed relative to the deployed docs, but the partial fix made it inconsistent. Additionally, echoing the full template text ("This is a photo of...") is likely not the desired output.
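For context, the doc example in question roughly follows this pattern (a sketch; the fix concerns the labels shown in the output, which should be the bare candidate labels rather than the full template sentence):
```python
import requests
from PIL import Image
from transformers import pipeline

image = Image.open(
    requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw
)
classifier = pipeline("zero-shot-image-classification", model="google/siglip-base-patch16-224")
outputs = classifier(image, candidate_labels=["2 cats", "a plane", "a remote"])
# expected: each result's "label" is e.g. "2 cats", not "This is a photo of 2 cats."
```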
* clean_up_tokenization_spaces=False if unset
* deprecate warning
* updating param for old models
* update models
* make fix-copies
* fix-copies and update bert models
* warning msg
* update prophet and clvp
* updating test since the space before is arbitrarily removed
* remove warning for 4.45
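A sketch of what the deprecation means for callers (bert-base-uncased used for illustration):
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
ids = tokenizer("Hello , world !")["input_ids"]

# passing the flag explicitly avoids relying on the old implicit default
print(tokenizer.decode(ids, clean_up_tokenization_spaces=False))  # keeps the space before punctuation
print(tokenizer.decode(ids, clean_up_tokenization_spaces=True))   # collapses " ," into ","
```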
* Add Idefics 3!
* fixes to make both pipelines identical
* fix for quantized models
* First pass at the review
* remove vocab size from the main config (it's still in the text_config)
* hotfix for Merve
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* re-add model_type for text_config
* remove support for old_cache
* remove hidden_size from main config
* rename idefics3 HF repo
* few changes suggested in the PR
* fix to input_data_format computation
* remove overwrite of _autoset_attn_implementation following @zucchini-nlp suggestion
* improve example
* few improvements from amy's review
* big change to enable processing input images as numpy arrays
* Changes to the code to uniformize processor kwargs
* image processing tests
* image processing tests fixes and some bugs they discovered
* addressed review comments from Yoni
* fix modeling tests
* remove special tokens that are not special
* fixes tests
* skip failing tests - they also fail for idefics2
* added paper and re-added the multi-GPU tests, who knows
* Update docs/source/en/model_doc/idefics3.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* review amy until image_processing_idefics3
* last comments from Amy
* review amy
* Update src/transformers/models/idefics3/image_processing_idefics3.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/idefics3/modeling_idefics3.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/idefics3.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* doc improvement - amy review
* fix runtime error during fine-tuning
* amy's review
* Update src/transformers/models/idefics3/image_processing_idefics3.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/idefics3/image_processing_idefics3.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/idefics3/modeling_idefics3.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* ruff
* amy's comment on the order
* ruff ruff
* fix copies
* square images when they are not split
* ruff :(
* Update src/transformers/models/idefics3/image_processing_idefics3.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/idefics3/test_processing_idefics3.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix small bug introduced in refactor
* amy's image processing changes
* fixes peft tests and ruff
* modify to_pil_image from transformers, and address review from Emanuele
* add modified to_pil_image
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
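A hedged usage sketch for the new model (the repo id reflects the rename above but should be treated as an assumption):
```python
import requests
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

model_id = "HuggingFaceM4/Idefics3-8B-Llama3"  # assumed post-rename repo
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, device_map="auto")

image = Image.open(
    requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw
)
messages = [
    {"role": "user", "content": [{"type": "image"}, {"type": "text", "text": "Describe this image."}]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)
generated = model.generate(**inputs, max_new_tokens=50)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```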
* Add AdEMAMix optimizer
* Fix test
* Update tests/trainer/test_trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
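A minimal sketch of opting in via the Trainer (the optim string is an assumption based on this PR; requires a bitsandbytes release that ships AdEMAMix):
```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="ademamix",  # assumed flag name for the new optimizer
)
```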
* Add compressed-tensors HFQuantizer implementation
* flag serializable as False
* run
* revive lines deleted by ruff
* fixes to load+save from sparseml, edit config to quantization_config, and load back
* address satrat comment
* rename compressed_tensors to compressed-tensors and revert is_serializable
* rename quant_method from sparseml to compressed-tensors
* tests
* edit tests
* clean up tests
* make style
* cleanup
* cleanup
* add test skip for when compressed tensors is not installed
* remove pydantic import + style
* delay torch import in test
* initial docs
* update main init for compressed tensors config
* make fix-copies
* docstring
* remove fill_docstring
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* review comments
* review comments
* comments - suppress warnings on state dict load, tests, fixes
* bug-fix - remove unnecessary call to apply quant lifecycle
* run_compressed compatibility
* revert changes not needed for compression
* no longer need unexpected keys fn
* unexpected keys not needed either
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* add to_diff_dict
* update docs and expand testing
* Update _toctree.yml with compressed-tensors
* Update src/transformers/utils/quantization_config.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update doc
* add note about saving a loaded model
---------
Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>
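A hedged sketch of the load path this quantizer enables (repo name is hypothetical; the quantization_config embedded in the checkpoint's config.json selects the compressed-tensors backend):
```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "nm-testing/tinyllama-w8a8-compressed-tensors"  # hypothetical compressed checkpoint
)
# per the note added above, re-saving a loaded compressed model has caveats;
# keep the original checkpoint around
```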
This commit fixes the following errors:
* Fix "expected all tensors to be on the same device" error
* Fix "can't convert device type tensor to numpy"
According to the PyTorch documentation, torch.Tensor.numpy(force=False)
performs the conversion only if the tensor is on the CPU (plus a few other restrictions),
which is not the case here. For our case we need force=True, since we just
need the data and don't care about tensor coherency.
Fixes: #33517
See: https://pytorch.org/docs/2.4/generated/torch.Tensor.numpy.html
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
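A minimal repro of the distinction (assumes a non-CPU device is available):
```python
import torch

t = torch.ones(3, device="cuda")  # any non-CPU device (e.g. "xpu") hits the same error
# t.numpy() raises: TypeError: can't convert cuda:0 device type tensor to numpy
arr = t.numpy(force=True)  # detaches and copies to CPU first, then converts
```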
* Fixed docstring for Cohere model regarding unavailability of the prune_heads() method
The docstring stated that the Cohere model supports the prune_heads() method. Fixed the docstring by explicitly stating that it doesn't support that functionality.
* Update src/transformers/models/cohere/modeling_cohere.py
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
* update example
* update
* push the converted diff files for testing and ci
* correct one example
* fix class attributes and docstring
* nits
* oups
* fixed config!
* update
* nits
* class attributes are not matched against each other; this is missing
* fixed overwriting self.xxx now onto the attributes I think
* partial fix, now order with docstring
* fix docstring order?
* more fixes
* update
* fix missing docstrings!
* examples don't all work yet
* fixup
* nit
* updated
* hick
* update
* delete
* update
* update
* update
* fix
* all default
* no local import
* fix more diff
* some fix related to "safe imports"
* push fixed
* add helper!
* style
* add a check
* all by default
* add the
* update
* FINALLY!
* nit
* fix config dependencies
* man that is it
* fix fix
* update diffs
* fix the last issue
* re-default to all
* alll the fixes
* nice
* fix properties vs setter
* fixup
* updates
* update dependencies
* make sure to install what needs to be installed
* fixup
* quick fix for now
* fix!
* fixup
* update
* update
* updates
* whitespaces
* nit
* fix
* simplify everything, and make it file agnostic (should work for image processors)
* style
* finish fixing all import issues
* fixup
* empty modeling should not be written!
* Add logic to find who depends on what
* update
* cleanup
* update
* update gemma to support positions
* some small nits
* this is the correct docstring for gemma2
* fix merging of docstrings
* update
* fixup
* update
* take doc into account
* styling
* update
* fix hidden activation
* more fixes
* final fixes!
* fixup
* fixup instruct blip video
* update
* fix bugs
* align gemma2 with the rest as well
* updates
* revert
* update
* more reversion
* grind
* more
* arf
* update
* order will matter
* finish del stuff
* update
* rename to modular
* fixup
* nits
* update makefile
* fixup
* update order of the checks!
* fix
* fix docstring that has a call inside
* fix conversion check
* style
* add some initial documentation
* update
* update doc
* some fixup
* updates
* yups
* Mostly todo, gimme a minute
* update
* fixup
* revert some stuff
* Review docs for the modular transformers (#33472)
Docs
* good update
* fixup
* mmm current updates lead to this code
* okay, this fixes it
* cool
* fixes
* update
* nit
* updates
* nits
* fix doc
* update
* revert bad changes
* update
* updates
* proper update
* update
* update?
* up
* update
* cool
* nits
* nits
* bon bon
* fix
* ?
* minimise changes
* update
* update
* update
* updates?
* fixed gemma2
* kind of a hack
* nits
* update
* remove `diffs` in favor of `modular`
* fix make fix copies
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
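A hedged sketch of the workflow this introduces (file, class, and flag names are illustrative):
```python
# modular_mymodel.py — a "modular" definition file: inherit from an existing
# model and override only the differences; the converter script expands it
# into a full, self-contained modeling_mymodel.py
from transformers.models.llama.modeling_llama import LlamaForCausalLM, LlamaModel

class MyModelModel(LlamaModel):
    pass  # identical to Llama except for whatever gets overridden here

class MyModelForCausalLM(LlamaForCausalLM):
    pass
```
The expansion step would then be run with something like `python utils/modular_model_converter.py --files_to_parse src/transformers/models/mymodel/modular_mymodel.py` (treat the exact invocation as an assumption).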