transformers/tests/quantization
Marc Sun 9ea1eacd11
remove .to() restriction for 4-bit model (#33122)
* remove .to() restriction for 4-bit model

* Update src/transformers/modeling_utils.py

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* bitsandbytes: prevent dtype casting while allowing device movement with .to() or .cuda()

* quality fix

* Improve warning message for .to() and .cuda() on bnb quantized models

---------

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
2024-09-02 16:28:50 +02:00
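A minimal sketch of the .to()/.cuda() behavior the commit above describes, assuming a CUDA machine with bitsandbytes installed; the checkpoint name is an arbitrary example, and the exact warning/error text varies across transformers versions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load any causal LM in 4-bit via bitsandbytes
# (the checkpoint here is an arbitrary example).
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

# Device movement: previously blocked for 4-bit models, now allowed
# (the commit emits a warning here instead of raising).
model = model.to("cuda:0")  # model.cuda() behaves the same way

# Dtype casting is still rejected: the weights are packed 4-bit values
# and cannot simply be cast to a new floating-point dtype
# (a ValueError in current releases).
try:
    model.to(torch.float16)
except ValueError as err:
    print(err)
```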
aqlm_integration Cache: use batch_size instead of max_batch_size (#32657) 2024-08-16 11:48:45 +01:00
autoawq Skip tests properly (#31308) 2024-06-26 21:59:08 +01:00
bnb remove .to() restriction for 4-bit model (#33122) 2024-09-02 16:28:50 +02:00
eetq_integration [FEAT]: EETQ quantizer support (#30262) 2024-04-22 20:38:58 +01:00
fbgemm_fp8 Add new quant method (#32047) 2024-07-22 20:21:59 +02:00
ggml Support dequantizing GGUF FP16 format (#31783) 2024-07-24 17:59:59 +02:00
gptq 🚨 Remove dataset with restrictive license (#31452) 2024-06-17 17:56:51 +01:00
hqq Quantization / HQQ: Fix HQQ tests on our runner (#30668) 2024-05-06 11:33:52 +02:00
quanto_integration Skip tests properly (#31308) 2024-06-26 21:59:08 +01:00
torchao_integration Add TorchAOHfQuantizer (#32306) 2024-08-14 16:14:24 +02:00