transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-13 17:48:22 +06:00

Jerry Zhang 78d78cdf8a Add TorchAOHfQuantizer (#32306)		2024-08-14 16:14:24 +02:00

* Add TorchAOHfQuantizer
  Summary: Enable loading torchao quantized model in huggingface.
  Test Plan: local test
  Reviewers: Subscribers: Tasks: Tags:
* Fix a few issues
* style
* Added tests and addressed some comments about dtype conversion
* fix torch_dtype warning message
* fix tests
* style
* TorchAOConfig -> TorchAoConfig
* enable offload + fix memory with multi-gpu
* update torchao version requirement to 0.4.0
* better comments
* add torch.compile to torchao README, add perf number link

---------

Co-authored-by: Marc Sun <marc@huggingface.co>

agent.md	Add stream messages from agent run for gradio chatbot (#32142)	2024-07-29 20:12:44 +02:00
backbones.md	doc: fix broken BEiT and DiNAT model links on Backbone page (#32029)	2024-07-17 20:24:10 +01:00
callback.md	Update CometCallback to allow reusing of the running experiment (#31366)	2024-07-05 08:13:46 +02:00
configuration.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
data_collator.md	Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs (#31629)	2024-07-23 15:56:41 +02:00
deepspeed.md	[docs] DeepSpeed (#28542)	2024-01-24 08:31:28 -08:00
feature_extractor.md	Fixed typos (#26810)	2023-10-16 09:52:29 +02:00
image_processor.md	Fast image processor (#28847)	2024-06-11 15:47:38 +01:00
keras_callbacks.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
logging.md	Warnings controlled by logger level (#26527)	2023-10-12 10:48:38 +02:00
model.md	Speedup model init on CPU (by 10x+ for llama-3-8B as one example) (#31771)	2024-07-16 09:32:01 -04:00
onnx.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
optimizer_schedules.md	Add WSD scheduler (#30231)	2024-04-25 12:07:21 +01:00
output.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
pipelines.md	Allow FP16 or other precision inference for Pipelines (#31342)	2024-07-05 17:21:50 +01:00
processors.md	[docs] fixed links with 404 (#27327)	2023-11-06 19:45:03 +00:00
quantization.md	Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00
text_generation.md	Add Watermarking LogitsProcessor and WatermarkDetector (#29676)	2024-05-14 13:31:39 +05:00
tokenizer.md	[`PretrainedTokenizer`] add some of the most important functions to the doc (#27313)	2023-11-06 15:11:00 +01:00
trainer.md	[docs] Trainer docs (#28145)	2023-12-20 10:37:23 -08:00