transformers/docs/source
Andrei Panferov 64c05eecd6
HIGGS Quantization Support (#34997)
* higgs init

* working with crunches

* per-model workspaces

* style

* style 2

* tests and style

* higgs tests passing

* protecting torch import

* removed torch.Tensor type annotations

* torch.nn.Module inheritance fix maybe

* hide inputs inside quantizer calls

* style structure something

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* reworked num_sms

* Update src/transformers/integrations/higgs.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* revamped device checks

* docstring upd

* Update src/transformers/quantizers/quantizer_higgs.py

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* edited tests and device map assertions

* minor edits

* updated flute cuda version in docker

* Added p=1 and 2,3bit HIGGS

* flute version check update

* incorporated `modules_to_not_convert`

* less hardcoding

* Fixed comment

* Added docs

* Fixed gemma support

* example in docs

* fixed torch_dtype for HIGGS

* Update docs/source/en/quantization/higgs.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Collection link

* dequantize interface

* newer flute version, torch.compile support

* unittest message fix

* docs update compile

* isort

* ValueError instead of assert

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
2024-12-23 16:54:49 +01:00

| Name | Last commit | Date |
| --- | --- | --- |
| ar | FEAT : Adding VPTQ quantization method to HFQuantizer (#34770) | 2024-12-20 09:45:53 +01:00 |
| de | Fix typos in translated quicktour docs (#35302) | 2024-12-17 09:32:00 -08:00 |
| en | HIGGS Quantization Support (#34997) | 2024-12-23 16:54:49 +01:00 |
| es | Fix typos in translated quicktour docs (#35302) | 2024-12-17 09:32:00 -08:00 |
| fr | Fix typos in translated quicktour docs (#35302) | 2024-12-17 09:32:00 -08:00 |
| hi | [i18n-HI] Translated TFLite page to Hindi (#34572) | 2024-11-04 09:40:30 -08:00 |
| it | Fix typos in translated quicktour docs (#35302) | 2024-12-17 09:32:00 -08:00 |
| ja | Fix typos in translated quicktour docs (#35302) | 2024-12-17 09:32:00 -08:00 |
| ko | FEAT : Adding VPTQ quantization method to HFQuantizer (#34770) | 2024-12-20 09:45:53 +01:00 |
| ms | Remove old TF port docs (#30426) | 2024-04-23 16:06:20 +01:00 |
| pt | Fix typos in translated quicktour docs (#35302) | 2024-12-17 09:32:00 -08:00 |
| te | Fix typos in translated quicktour docs (#35302) | 2024-12-17 09:32:00 -08:00 |
| tr | Translate index.md to Turkish (#27093) | 2023-11-08 08:35:20 -05:00 |
| zh | Fix image preview in multi-GPU inference docs (#35303) | 2024-12-17 09:33:50 -08:00 |
| _config.py | [#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888) | 2024-04-08 14:21:16 +01:00 |