transformers/docs/source

Latest commit 28de2f4de3 by Marc Sun:
[Quantization] Quanto quantizer (#29023)
* start integration
* fix
* add and debug tests
* update tests
* make pytorch serialization work
* compatible with device_map and offload
* fix tests
* make style
* add ref
* guard against safetensors
* add float8 and style
* fix is_serializable
* Fix shard_checkpoint compatibility with quanto
* more tests
* docs
* adjust memory
* better
* style
* pass tests
* Update src/transformers/modeling_utils.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add is_safe_serialization instead
* Update src/transformers/quantizers/quantizer_quanto.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add QbitsTensor tests
* fix tests
* simplify activation list
* Update docs/source/en/quantization.md
  Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* better comment
* Update tests/quantization/quanto_integration/test_quanto.py
  Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* Update tests/quantization/quanto_integration/test_quanto.py
  Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* find and fix edge case
* Update docs/source/en/quantization.md
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* pass weights_only_kwarg instead
* fix shard_checkpoint loading
* simplify update_missing_keys
* Update tests/quantization/quanto_integration/test_quanto.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* recursion to get all tensors
* block serialization
* skip serialization tests
* fix
* change to cuda:0 for now
* fix regression
* update device_map
* fix doc
* add notebook
* update torch_dtype
* update doc
* typo
* typo
* remove comm

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
2024-03-15 11:51:29 -04:00
Name        Latest commit                                                             Last updated
de          Make torch xla available on GPU (#29334)                                  2024-03-11 14:07:16 +00:00
en          [Quantization] Quanto quantizer (#29023)                                  2024-03-15 11:51:29 -04:00
es          [docs] Spanish translate chat_templating.md & yml addition (#29559)       2024-03-13 09:28:11 -07:00
fr          Update all references to canonical models (#29001)                        2024-02-16 08:16:58 +01:00
hi          Update all references to canonical models (#29001)                        2024-02-16 08:16:58 +01:00
it          Update all references to canonical models (#29001)                        2024-02-16 08:16:58 +01:00
ja          [docs] Remove broken ChatML format link from chat_templating.md (#29643)  2024-03-13 13:04:51 -07:00
ko          Make torch xla available on GPU (#29334)                                  2024-03-11 14:07:16 +00:00
ms          [Docs] Add missing language options and fix broken links (#28852)         2024-02-06 12:01:01 -08:00
pt          Update all references to canonical models (#29001)                        2024-02-16 08:16:58 +01:00
te          Update all references to canonical models (#29001)                        2024-02-16 08:16:58 +01:00
tr          Translate index.md to Turkish (#27093)                                    2023-11-08 08:35:20 -05:00
zh          [docs] Remove broken ChatML format link from chat_templating.md (#29643)  2024-03-13 13:04:51 -07:00
_config.py  [Styling] stylify using ruff (#27144)                                     2023-11-16 17:43:19 +01:00