transformers/docs/source/en/main_classes
Benjamin Fineran 574a9e12bb
HFQuantizer implementation for compressed-tensors library (#31704)
* Add compressed-tensors HFQuantizer implementation

* flag serializable as False

* run

* revive lines deleted by ruff

* fixes to load+save from sparseml, edit config to quantization_config, and load back

* address satrat comment

* compressed_tensors to compressed-tensors and revert back is_serializable

* rename quant_method from sparseml to compressed-tensors

* tests

* edit tests

* clean up tests

* make style

* cleanup

* cleanup

* add test skip for when compressed tensors is not installed

* remove pydantic import + style

* delay torch import in test

* initial docs

* update main init for compressed tensors config

* make fix-copies

* docstring

* remove fill_docstring

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* review comments

* review comments

* comments - suppress warnings on state dict load, tests, fixes

* bug-fix - remove unnecessary call to apply quant lifecycle

* run_compressed compatability

* revert changes not needed for compression

* no longer need unexpected keys fn

* unexpected keys not needed either

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* add to_diff_dict

* update docs and expand testing

* Update _toctree.yml with compressed-tensors

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update doc

* add note about saving a loaded model

---------

Co-authored-by: George Ohashi <george@neuralmagic.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Sara Adkins <sara.adkins65@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Dipika Sikka <ds3822@columbia.edu>
Co-authored-by: Dipika <dipikasikka1@gmail.com>
2024-09-25 14:31:38 +02:00
..
agent.md Decorator for easier tool building (#33439) 2024-09-18 11:07:51 +02:00
backbones.md doc: fix broken BEiT and DiNAT model links on Backbone page (#32029) 2024-07-17 20:24:10 +01:00
callback.md Update CometCallback to allow reusing of the running experiment (#31366) 2024-07-05 08:13:46 +02:00
configuration.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
data_collator.md Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs (#31629) 2024-07-23 15:56:41 +02:00
deepspeed.md [docs] DeepSpeed (#28542) 2024-01-24 08:31:28 -08:00
executorch.md Make StaticCache configurable at model construct time (#32830) 2024-09-10 16:35:57 +01:00
feature_extractor.md Fixed typos (#26810) 2023-10-16 09:52:29 +02:00
image_processor.md Fast image processor (#28847) 2024-06-11 15:47:38 +01:00
keras_callbacks.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
logging.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
model.md Speedup model init on CPU (by 10x+ for llama-3-8B as one example) (#31771) 2024-07-16 09:32:01 -04:00
onnx.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
optimizer_schedules.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
output.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
pipelines.md Allow FP16 or other precision inference for Pipelines (#31342) 2024-07-05 17:21:50 +01:00
processors.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
quantization.md HFQuantizer implementation for compressed-tensors library (#31704) 2024-09-25 14:31:38 +02:00
text_generation.md Add Watermarking LogitsProcessor and WatermarkDetector (#29676) 2024-05-14 13:31:39 +05:00
tokenizer.md [PretrainedTokenizer] add some of the most important functions to the doc (#27313) 2023-11-06 15:11:00 +01:00
trainer.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00