transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 03:58:25 +06:00

History

Marc Sun 28de2f4de3 [Quantization] Quanto quantizer (#29023 ) * start integration * fix * add and debug tests * update tests * make pytorch serialization works * compatible with device_map and offload * fix tests * make style * add ref * guard against safetensors * add float8 and style * fix is_serializable * Fix shard_checkpoint compatibility with quanto * more tests * docs * adjust memory * better * style * pass tests * Update src/transformers/modeling_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add is_safe_serialization instead * Update src/transformers/quantizers/quantizer_quanto.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add QbitsTensor tests * fix tests * simplify activation list * Update docs/source/en/quantization.md Co-authored-by: David Corvoysier <david.corvoysier@gmail.com> * better comment * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: David Corvoysier <david.corvoysier@gmail.com> * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: David Corvoysier <david.corvoysier@gmail.com> * find and fix edge case * Update docs/source/en/quantization.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * pass weights_only_kwarg instead * fix shard_checkpoint loading * simplify update_missing_keys * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * recursion to get all tensors * block serialization * skip serialization tests * fix * change by cuda:0 for now * fix regression * update device_map * fix doc * add noteboon * update torch_dtype * update doc * typo * typo * remove comm --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: David Corvoysier <david.corvoysier@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Younes Belkada <younesbelkada@gmail.com>		2024-03-15 11:51:29 -04:00
..
internal	v4.39 deprecations 🧼 (#29492 )	2024-03-07 10:44:43 +00:00
main_classes	[Quantization] Quanto quantizer (#29023 )	2024-03-15 11:51:29 -04:00
model_doc	Cohere Model Release (#29622 )	2024-03-15 14:29:11 +01:00
tasks	Cohere Model Release (#29622 )	2024-03-15 14:29:11 +01:00
_config.py	[`Styling`] stylify using ruff (#27144 )	2023-11-16 17:43:19 +01:00
_redirects.yml	Extended semantic segmentation to image segmentation (#27039 )	2023-11-23 15:58:21 +00:00
_toctree.yml	Cohere Model Release (#29622 )	2024-03-15 14:29:11 +01:00
accelerate.md	Fix typos (#25936 )	2023-09-04 11:15:12 +01:00
add_new_model.md	Tiny improvement for doc (#29581 )	2024-03-11 17:43:35 +00:00
add_new_pipeline.md	Fix broken link on page (#28451 )	2024-01-11 09:26:13 -08:00
add_tensorflow_model.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
attention.md	[Docs] Fix broken links and syntax issues (#28918 )	2024-02-08 14:13:35 -08:00
autoclass_tutorial.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
benchmarks.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
bertology.md	Migrate doc files to Markdown. (#24376 )	2023-06-20 18:07:47 -04:00
big_models.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
chat_templating.md	[docs] Remove broken ChatML format link from chat_templating.md (#29643 )	2024-03-13 13:04:51 -07:00
community.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
contributing.md	Enable doc in Spanish (#16518 )	2022-04-04 10:25:46 -04:00
create_a_model.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
custom_models.md	[Docs] Add language identifiers to fenced code blocks (#28955 )	2024-02-12 10:48:31 -08:00
custom_tools.md	Warn about tool use (#29628 )	2024-03-13 14:53:13 +01:00
debugging.md	[Docs] Fix spelling and grammar mistakes (#28825 )	2024-02-02 08:45:00 +01:00
deepspeed.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
fast_tokenizers.md	Migrate doc files to Markdown. (#24376 )	2023-06-20 18:07:47 -04:00
fsdp.md	[docs] Trainer docs (#28145 )	2023-12-20 10:37:23 -08:00
generation_strategies.md	Generate: inner decoding methods are no longer public (#29437 )	2024-03-05 10:27:36 +00:00
glossary.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
hf_quantizer.md	[CI] Quantization workflow (#29046 )	2024-02-28 10:09:25 -05:00
hpo_train.md	Remove-auth-token (#27060 )	2023-11-13 14:20:54 +01:00
index.md	Cohere Model Release (#29622 )	2024-03-15 14:29:11 +01:00
installation.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
llm_tutorial_optimization.md	F.scaled_dot_product_attention support (#26572 )	2023-12-09 05:38:14 +09:00
llm_tutorial.md	Generate: All logits processors are documented and have examples (#27796 )	2023-12-07 15:11:35 +00:00
model_memory_anatomy.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
model_sharing.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
model_summary.md	Migrate doc files to Markdown. (#24376 )	2023-06-20 18:07:47 -04:00
multilingual.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
notebooks.md	Enable doc in Spanish (#16518 )	2022-04-04 10:25:46 -04:00
pad_truncation.md	[Doc] Spanish translation of pad_truncation.md (#27890 )	2023-12-08 10:32:18 -08:00
peft.md	[`Peft`] `modules_to_save` support for peft integration (#27466 )	2023-11-14 10:32:57 +01:00
perf_hardware.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_infer_cpu.md	[Docs] Fix spelling and grammar mistakes (#28825 )	2024-02-02 08:45:00 +01:00
perf_infer_gpu_one.md	Cohere Model Release (#29622 )	2024-03-15 14:29:11 +01:00
perf_torch_compile.md	Fix rendering for `torch.compile()` docs (#25432 )	2023-08-10 13:25:00 +02:00
perf_train_cpu_many.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_train_cpu.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_train_gpu_many.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_train_gpu_one.md	Fix minor typo: softare => software (#29602 )	2024-03-12 10:39:56 +00:00
perf_train_special.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_train_tpu_tf.md	Migrate doc files to Markdown. (#24376 )	2023-06-20 18:07:47 -04:00
performance.md	[docs] Update CPU/GPU inference docs (#26881 )	2023-10-31 09:44:51 -07:00
perplexity.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
philosophy.md	[docs] fixed links with 404 (#27327 )	2023-11-06 19:45:03 +00:00
pipeline_tutorial.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
pipeline_webserver.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
pr_checks.md	[Docs] Fix spelling and grammar mistakes (#28825 )	2024-02-02 08:45:00 +01:00
preprocessing.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
quantization.md	[Quantization] Quanto quantizer (#29023 )	2024-03-15 11:51:29 -04:00
quicktour.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
run_scripts.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
sagemaker.md	[docs] fixed links with 404 (#27327 )	2023-11-06 19:45:03 +00:00
serialization.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
task_summary.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
tasks_explained.md	[docs] Spanish translation of tasks_explained.md (#29224 )	2024-02-26 08:18:15 -08:00
testing.md	Make torch xla available on GPU (#29334 )	2024-03-11 14:07:16 +00:00
tf_xla.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
tflite.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
tokenizer_summary.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
torchscript.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
trainer.md	[Docs] Spanish Translation -Torchscript md & Trainer md (#29310 )	2024-03-04 13:57:51 -08:00
training.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
transformers_agents.md	[doc] Always call it Agents for consistency (#25958 )	2023-09-05 12:27:20 +01:00
troubleshooting.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00