transformers/docs/source/en/main_classes
Andrei Panferov 64c05eecd6
HIGGS Quantization Support (#34997)
* higgs init

* working with crunches

* per-model workspaces

* style

* style 2

* tests and style

* higgs tests passing

* protecting torch import

* removed torch.Tensor type annotations

* torch.nn.Module inheritance fix maybe

* hide inputs inside quantizer calls

* style structure something

* Update src/transformers/quantizers/quantizer_higgs.py

* reworked num_sms

* Update src/transformers/integrations/higgs.py

* revamped device checks

* docstring upd

* Update src/transformers/quantizers/quantizer_higgs.py

* edited tests and device map assertions

* minor edits

* updated flute cuda version in docker

* Added p=1 and 2,3bit HIGGS

* flute version check update

* incorporated `modules_to_not_convert`

* less hardcoding

* Fixed comment

* Added docs

* Fixed gemma support

* example in docs

* fixed torch_dtype for HIGGS

* Update docs/source/en/quantization/higgs.md

* Collection link

* dequantize interface

* newer flute version, torch.compile support

* unittest message fix

* docs update compile

* isort

* ValueError instead of assert

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-23 16:54:49 +01:00
agent.md Decorator for easier tool building (#33439) 2024-09-18 11:07:51 +02:00
backbones.md doc: fix broken BEiT and DiNAT model links on Backbone page (#32029) 2024-07-17 20:24:10 +01:00
callback.md Update CometCallback to allow reusing of the running experiment (#31366) 2024-07-05 08:13:46 +02:00
configuration.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
data_collator.md Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs (#31629) 2024-07-23 15:56:41 +02:00
deepspeed.md [docs] DeepSpeed (#28542) 2024-01-24 08:31:28 -08:00
executorch.md Fix flax failures (#33912) 2024-10-11 14:38:35 +02:00
feature_extractor.md Fixed typos (#26810) 2023-10-16 09:52:29 +02:00
image_processor.md Fall back to slow image processor in ImageProcessingAuto when no fast processor available (#34785) 2024-12-15 14:00:36 -05:00
keras_callbacks.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
logging.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
model.md Speedup model init on CPU (by 10x+ for llama-3-8B as one example) (#31771) 2024-07-16 09:32:01 -04:00
onnx.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
optimizer_schedules.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
output.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
pipelines.md Add image text to text pipeline (#34170) 2024-10-31 15:48:11 -04:00
processors.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
quantization.md HIGGS Quantization Support (#34997) 2024-12-23 16:54:49 +01:00
text_generation.md Add SynthID (watermerking by Google DeepMind) (#34350) 2024-10-23 21:18:52 +01:00
tokenizer.md VLM: special multimodal Tokenizer (#34461) 2024-11-04 16:37:51 +01:00
trainer.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00