Poedator
|
d78e78a0e4
|
HfQuantizer class for quantization-related stuff in modeling_utils.py (#26610)
* squashed earlier commits for easier rebase
* rm rebase leftovers
* 4bit save enabled @quantizers
* TMP gptq test use exllama
* fix AwqConfigTest::test_wrong_backend for A100
* quantizers AWQ fixes
* _load_pretrained_model low_cpu_mem_usage branch
* quantizers style
* remove require_low_cpu_mem_usage attr
* rm dtype arg from process_model_before_weight_loading
* rm config_origin from Q-config
* rm inspect from q_config
* fixed docstrings in QuantizationConfigParser
* logger.warning fix
* mv is_loaded_in_4(8)bit to BnbHFQuantizer
* is_accelerate_available error msg fix in quantizer
* split is_model_trainable in bnb quantizer class
* rm llm_int8_skip_modules as separate var in Q
* Q rm todo
* fwd ref to HFQuantizer in type hint
* rm note re optimum.gptq.GPTQQuantizer
* quantization_config in __init__ simplified
* replaced NonImplemented with create_quantized_param
* rm load_in_4/8_bit deprecation warning
* QuantizationConfigParser refactoring
* awq-related minor changes
* awq-related changes
* awq config.modules_to_not_convert
* raise error if no q-method in q-config in args
* minor cleanup
* awq quantizer docstring
* combine common parts in bnb process_model_before_weight_loading
* revert test_gptq
* .process_model_ cleanup
* restore dict config warning
* removed typevars in quantizers.py
* cleanup post-rebase 16 jan
* QuantizationConfigParser classmethod refactor
* rework of handling of unexpected aux elements of bnb weights
* moved q-related stuff from save_pretrained to quantizers
* refactor v1
* more changes
* fix some tests
* remove it from main init
* ooops
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* fix awq issues
* fix
* fix
* fix
* fix
* fix
* fix
* add docs
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/hf_quantizer.md
* address comments
* fix
* fixup
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address final comment
* update
* Update src/transformers/quantizers/base.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/quantizers/auto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* add kwargs update
* fixup
* add `optimum_quantizer` attribute
* oops
* rm unneeded file
* fix doctests
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
|
2024-01-30 02:48:25 +01:00 |
|
Younes Belkada
|
9b25c164bd
|
[core / Quantization ] Fix for 8bit serialization tests (#27234)
* fix for 8bit serialization
* added regression tests.
* fixup
|
2023-11-02 12:03:51 +01:00 |
|
Younes Belkada
|
4bb50aa212
|
[Quantization / tests ] Fix bnb MPT test (#27178)
fix bnb mpt test
|
2023-10-31 16:25:53 +01:00 |
|
Younes Belkada
|
6b466771b0
|
[tests / Quantization ] Fix bnb test (#27145)
* fix bnb test
* link to GH issue
|
2023-10-30 15:43:08 +01:00 |
|
Younes Belkada
|
fd6a0ade9b
|
🚨🚨🚨 [Quantization ] Store the original dtype in the config as a private attribute 🚨🚨🚨 (#26761)
* First step
* fix
* add adjustments for gptq
* change to `_pre_quantization_dtype`
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix serialization
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
|
2023-10-16 19:56:53 +02:00 |
|
Younes Belkada
|
2aef9a9601
|
[PEFT ] Final fixes (#26559)
* fix issues with PEFT
* logger warning futurewarning issues
* fixup
* adapt from suggestions
* oops
* rm test
|
2023-10-03 14:53:09 +02:00 |
|
Younes Belkada
|
6824461f2a
|
[core / auto ] Fix bnb test with code revision + bug with code revision (#26431)
* fix bnb test with code revision
* fix test
* Apply suggestions from code review
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
|
2023-10-02 11:35:07 +02:00 |
|
Younes Belkada
|
4b79697865
|
🚨🚨🚨 [Refactor ] Move third-party related utility files into integrations/ folder 🚨🚨🚨 (#25599)
* move deepspeed to `lib_integrations.deepspeed`
* more refactor
* oops
* fix slow tests
* Fix docs
* fix docs
* address feedback
* address feedback
* final modifs for PEFT
* fixup
* ok now
* trigger CI
* trigger CI again
* Update docs/source/en/main_classes/deepspeed.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* import from `integrations`
* address feedback
* revert removal of `deepspeed` module
* revert removal of `deepspeed` module
* fix conflicts
* ooops
* oops
* add deprecation warning
* place it on the top
* put `FutureWarning`
* fix conflicts with not_doctested.txt
* add back `bitsandbytes` module with a depr warning
* fix
* fix
* fixup
* oops
* fix doctests
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
|
2023-08-25 17:13:34 +02:00 |
|
Marc Sun
|
55db70c63d
|
GPTQ integration (#25062)
* GPTQ integration
* Add tests for gptq
* support for more quantization model
* fix style
* typo
* fix method
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add dataclass and fix quantization_method
* fix doc
* Update tests/quantization/gptq/test_gptq.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* modify dataclass
* add gptqconfig import
* fix typo
* fix tests
* remove dataset as req arg
* remove tokenizer import
* add offload cpu quantization test
* fix check dataset
* modify dockerfile
* protect trainer
* style
* test for config
* add more log
* overwrite torch_dtype
* draft doc
* modify quantization_config docstring
* fix class name in docstring
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* more warning
* fix 8bit kwargs tests
* peft compatibility
* remove var
* fix is_gptq_quantized
* remove is_gptq_quantized
* fix wrap
* Update src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add exllama
* skip test
* overwrite float16
* style
* fix skip test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix docstring formatting
* add doc
* better test
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
|
2023-08-10 16:06:29 -04:00 |
|