Heinz-Alexander Fuetterer
883ed4b344
chore: fix typos ( #26756 )
2023-10-12 18:00:27 +02:00
Marc Sun
06a1d75bd5
fix gptq nits ( #25500 )
* fix nits
* fix docstring
* fix doc
* fix damp_percent
* fix doc
2023-08-14 11:43:38 -04:00
Marc Sun
55db70c63d
GPTQ integration ( #25062 )
* GTPQ integration
* Add tests for gptq
* support for more quantization model
* fix style
* typo
* fix method
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add dataclass and fix quantization_method
* fix doc
* Update tests/quantization/gptq/test_gptq.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* modify dataclass
* add gtpqconfig import
* fix typo
* fix tests
* remove dataset as req arg
* remove tokenizer import
* add offload cpu quantization test
* fix check dataset
* modify dockerfile
* protect trainer
* style
* test for config
* add more log
* overwrite torch_dtype
* draft doc
* modify quantization_config docstring
* fix class name in docstring
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* more warning
* fix 8bit kwargs tests
* peft compatibility
* remove var
* fix is_gptq_quantized
* remove is_gptq_quantized
* fix wrap
* Update src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add exllama
* skip test
* overwrite float16
* style
* fix skip test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix docsting formatting
* add doc
* better test
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-08-10 16:06:29 -04:00
Younes Belkada
972fdcc778
[Docs/quantization] Clearer explanation on how things works under the hood. + remove outdated info ( #25216 )
* clearer explanation on how things works under the hood.
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add `load_in_4bit` in `from_pretrained`
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-08-01 10:56:52 +02:00
Stas Bekman
5220606607
[quantization.md] fix ( #25190 )
Update quantization.md
2023-07-31 09:37:29 -07:00
Younes Belkada
ca974aff0f
[Docs] Clarify 4bit docs ( #24878 )
* clarify 4bit docs
* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2023-07-18 13:39:08 +02:00
Marc Sun
35eac0df75
add link to accelerate doc ( #24601 )
2023-07-10 17:49:30 -04:00
Sylvain Gugger
eb849f6604
Migrate doc files to Markdown. ( #24376 )
* Rename index.mdx to index.md
* With saved modifs
* Address review comment
* Treat all files
* .mdx -> .md
* Remove special char
* Update utils/tests_fetcher.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2023-06-20 18:07:47 -04:00