Omar Sanseviero (a989c6c6eb)
Don't allow passing load_in_8bit and load_in_4bit at the same time (#28266)
* Update quantization_config.py
* Style
* Protect from setting directly
* add tests
* Update tests/quantization/bnb/test_4bit.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-01-30 01:43:40 +01:00
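The guard this commit describes can be sketched as a config object whose two flags are mutually exclusive and protected behind property setters, so they cannot be flipped into an invalid state after construction. The `QuantizationFlags` class below is a hypothetical stand-in for `BitsAndBytesConfig`, not the actual transformers implementation.

```python
class QuantizationFlags:
    """Toy config: load_in_8bit and load_in_4bit may not both be True."""

    def __init__(self, load_in_8bit=False, load_in_4bit=False):
        self._load_in_8bit = False
        self._load_in_4bit = False
        # Route through the setters so the constructor is validated too.
        self.load_in_8bit = load_in_8bit
        self.load_in_4bit = load_in_4bit

    @property
    def load_in_8bit(self):
        return self._load_in_8bit

    @load_in_8bit.setter
    def load_in_8bit(self, value):
        if value and self._load_in_4bit:
            raise ValueError(
                "load_in_4bit and load_in_8bit are both True, "
                "but only one can be used at the same time"
            )
        self._load_in_8bit = value

    @property
    def load_in_4bit(self):
        return self._load_in_4bit

    @load_in_4bit.setter
    def load_in_4bit(self, value):
        if value and self._load_in_8bit:
            raise ValueError(
                "load_in_4bit and load_in_8bit are both True, "
                "but only one can be used at the same time"
            )
        self._load_in_4bit = value
```

Both the constructor and later attribute assignment raise `ValueError` when the flags would conflict, which matches the "Protect from setting directly" bullet above.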
Poedator (4f7806ef7e)
[bnb] Let's make serialization of 4bit models possible (#26037)
* updated bitsandbytes.py
* rm test_raise_* from test_4bit.py
* add test_4bit_serialization.py
* modeling_utils bulk edits
* bnb_ver 0.41.3 in integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* @slow reinstated
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* bnb ver 0.41.3 in src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* rm bnb version todo in integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* moved 4b serialization tests to test_4bit
* tests upd for opt
* to torch_device
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* ruff fixes to tests
* rm redundant bnb version check in mod_utils
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* restore _hf_peft_config_loaded modeling_utils.py::2188
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* restore _hf_peft_config_loaded test in modeling_utils.py::2199
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* fixed NOT getattr(self, "is_8bit_serializable")
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* setting model.is_4bit_serializable
* rm separate fp16_statistics arg from set_module...
* rm else branch in integrations::bnb::set_module
* bnb 4bit dtype check
* upd comment on 4bit weights
* upd tests for FP4 safe
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-12-21 11:54:44 +01:00
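The reason 4-bit serialization needs dedicated support is that two 4-bit codes are packed into each stored byte, so saving and loading must pack and unpack explicitly (bitsandbytes additionally serializes quantization state such as per-block absmax, which this toy omits). The helpers below illustrate only the packing idea and are not the real checkpoint format.

```python
def pack_4bit(codes):
    """Pack a list of 4-bit integers (0..15) into bytes, two codes per byte."""
    if len(codes) % 2:
        codes = codes + [0]  # pad to an even count
    out = bytearray()
    for hi, lo in zip(codes[0::2], codes[1::2]):
        out.append((hi << 4) | lo)
    return bytes(out)


def unpack_4bit(packed, n):
    """Recover the first n 4-bit codes from packed bytes."""
    codes = []
    for byte in packed:
        codes.append(byte >> 4)      # high nibble
        codes.append(byte & 0x0F)    # low nibble
    return codes[:n]
```

Because the stored byte count is half the element count, a loader must also know the original number of elements, which is why `n` is passed to `unpack_4bit`.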
Younes Belkada (fd6a0ade9b)
🚨🚨🚨 [Quantization] Store the original dtype in the config as a private attribute 🚨🚨🚨 (#26761)
* First step
* fix
* add adjustements for gptq
* change to `_pre_quantization_dtype`
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix serialization
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-10-16 19:56:53 +02:00
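The behavior in the commit title can be sketched as: before quantization overwrites the model's dtype, the original value is stashed in the config under a private attribute named `_pre_quantization_dtype`. The dict-like config and string dtypes below are illustrative stand-ins, not the transformers `PretrainedConfig` API.

```python
class SimpleConfig:
    """Toy model config holding a dtype, as a stand-in for PretrainedConfig."""

    def __init__(self, torch_dtype="float32"):
        self.torch_dtype = torch_dtype
        self._pre_quantization_dtype = None


def quantize(config, target="int4"):
    """Quantize a model (mocked): remember the pre-quantization dtype first."""
    config._pre_quantization_dtype = config.torch_dtype
    config.torch_dtype = target
    return config
```

Keeping the attribute private (leading underscore) signals that it is internal bookkeeping for serialization and dequantization, not a user-facing field, which is also why the commit title carries breaking-change sirens.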
Younes Belkada (7ccac73f74)
[RWKV] Final fix RWKV 4bit (#26134)
* Final fix RWKV 4bit
* fixup
* add a test
* add more clarifications
2023-09-13 16:30:20 +02:00
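RWKV rescales some weights in place after loading, and a packed 4-bit tensor cannot be divided directly, so the fix follows a dequantize, rescale, requantize pattern. The toy symmetric quantizer below illustrates that pattern only; the function names and the `(codes, scale)` representation are hypothetical, not the transformers or bitsandbytes code.

```python
def dequantize(codes, scale):
    """Toy symmetric dequantization: value = code * scale."""
    return [c * scale for c in codes]


def rescale_4bit_weight(codes, scale, factor, qmax=7):
    """Divide a quantized weight by `factor` via dequantize -> rescale -> requantize."""
    values = [v / factor for v in dequantize(codes, scale)]
    new_scale = max(abs(v) for v in values) / qmax or 1.0
    new_codes = [max(-qmax, min(qmax, round(v / new_scale))) for v in values]
    return new_codes, new_scale
```

With a purely linear scale the codes come back unchanged and only the scale shrinks; a nonlinear 4-bit codebook (as bitsandbytes uses) makes the full round trip genuinely necessary.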
Younes Belkada (c8b26096d4)
[core] fix 4bit num_parameters (#26132)
* fix 4bit `num_parameters`
* stronger check
2023-09-13 14:12:35 +02:00
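The bug this commit fixes: 4-bit layers store two parameters per byte, so counting stored elements undercounts the true parameter count by half. The sketch below shows the corrected counting logic on a toy layer description; the tuple representation is illustrative, not the bitsandbytes `Params4bit` API.

```python
def num_parameters(layers):
    """Count parameters given (num_stored_elements, is_4bit_packed) tuples.

    A 4-bit packed tensor holds two parameters per stored element, so its
    element count must be doubled to get the real parameter count.
    """
    total = 0
    for stored, packed_4bit in layers:
        total += stored * 2 if packed_4bit else stored
    return total
```

The "stronger check" bullet suggests the real fix gates the doubling on positively identifying a 4-bit packed parameter rather than guessing from shape alone.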
Marc Sun (55db70c63d)
GPTQ integration (#25062)
* GPTQ integration
* Add tests for gptq
* support for more quantization model
* fix style
* typo
* fix method
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add dataclass and fix quantization_method
* fix doc
* Update tests/quantization/gptq/test_gptq.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* modify dataclass
* add GPTQConfig import
* fix typo
* fix tests
* remove dataset as req arg
* remove tokenizer import
* add offload cpu quantization test
* fix check dataset
* modify dockerfile
* protect trainer
* style
* test for config
* add more log
* overwrite torch_dtype
* draft doc
* modify quantization_config docstring
* fix class name in docstring
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* more warning
* fix 8bit kwargs tests
* peft compatibility
* remove var
* fix is_gptq_quantized
* remove is_gptq_quantized
* fix wrap
* Update src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add exllama
* skip test
* overwrite float16
* style
* fix skip test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix docsting formatting
* add doc
* better test
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-08-10 16:06:29 -04:00
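The integration above exposes GPTQ through a config dataclass (`GPTQConfig`) with options such as the calibration dataset and exllama kernels. The toy functions below sketch only the group-wise scale-and-round idea underlying GPTQ-style weight quantization; real GPTQ additionally applies Hessian-based error compensation, and nothing here is the auto-gptq API.

```python
def quantize_groupwise(weights, group_size=4, bits=4):
    """Quantize a flat weight list with one symmetric scale per group."""
    qmax = 2 ** (bits - 1) - 1  # e.g. codes in [-7, 7] for 4 bits
    quantized, scales = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        scale = max(abs(w) for w in group) / qmax or 1.0  # avoid zero scale
        scales.append(scale)
        quantized.extend(max(-qmax, min(qmax, round(w / scale))) for w in group)
    return quantized, scales


def dequantize_groupwise(quantized, scales, group_size=4):
    """Map integer codes back to floats using each group's scale."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]
```

Smaller groups track the weight distribution more closely at the cost of storing more scales, which is the same accuracy/size trade-off the `group_size` field in `GPTQConfig` controls.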