transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-15 02:28:24 +06:00

History

Younes Belkada f6261d7d81 FEAT / Optim: Add GaLore optimizer (#29588 ) * add galore v1 * add import * add tests and doc * fix doctest * forward contrib credits from discussions * forward contrib credits from discussions * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix failing tests' * switch to `optim_target_modules` and clarify docs * more clarification * enhance lookup logic * update a test to add peak memory * add regex, all-linear and single string support * add layer-wise optimization through DummyOptimizers and LRSchedulers * forward contrib credits from discussions and original idea * add a section about DDP not supported in layerwise * Update src/transformers/trainer.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix self * check only if layer_wise * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * oops * make use of intervals * clarify comment * add matching tests * GaLoRe -> GaLore * move to `get_scheduler` * add note on docs * add a warning * adapt a bit the docs * update docstring * support original API * Update docs/source/en/trainer.md * slightly refactor * Update docs/source/en/trainer.md Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix args parsing and add tests * remove warning for regex * fix type hint * add note about extra args * make `is_regex` return optional --------- Co-authored-by: Maxime <maximegmd @users.noreply.github.com> Co-authored-by: Wing Lian <winglian @users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: hiyouga <hiyouga@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>		2024-03-19 11:40:23 +01:00
..
internal	v4.39 deprecations 🧼 (#29492 )	2024-03-07 10:44:43 +00:00
main_classes	[Quantization] Quanto quantizer (#29023 )	2024-03-15 11:51:29 -04:00
model_doc	Add MusicGen Melody (#28819 )	2024-03-18 13:06:12 +00:00
tasks	Add MusicGen Melody (#28819 )	2024-03-18 13:06:12 +00:00
_config.py	[`Styling`] stylify using ruff (#27144 )	2023-11-16 17:43:19 +01:00
_redirects.yml	Extended semantic segmentation to image segmentation (#27039 )	2023-11-23 15:58:21 +00:00
_toctree.yml	Add MusicGen Melody (#28819 )	2024-03-18 13:06:12 +00:00
accelerate.md	Fix typos (#25936 )	2023-09-04 11:15:12 +01:00
add_new_model.md	Tiny improvement for doc (#29581 )	2024-03-11 17:43:35 +00:00
add_new_pipeline.md	Fix broken link on page (#28451 )	2024-01-11 09:26:13 -08:00
add_tensorflow_model.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
attention.md	[Docs] Fix broken links and syntax issues (#28918 )	2024-02-08 14:13:35 -08:00
autoclass_tutorial.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
benchmarks.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
bertology.md	Migrate doc files to Markdown. (#24376 )	2023-06-20 18:07:47 -04:00
big_models.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
chat_templating.md	[docs] Remove broken ChatML format link from chat_templating.md (#29643 )	2024-03-13 13:04:51 -07:00
community.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
contributing.md	Enable doc in Spanish (#16518 )	2022-04-04 10:25:46 -04:00
create_a_model.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
custom_models.md	[Docs] Add language identifiers to fenced code blocks (#28955 )	2024-02-12 10:48:31 -08:00
custom_tools.md	Warn about tool use (#29628 )	2024-03-13 14:53:13 +01:00
debugging.md	[Docs] Fix spelling and grammar mistakes (#28825 )	2024-02-02 08:45:00 +01:00
deepspeed.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
fast_tokenizers.md	Migrate doc files to Markdown. (#24376 )	2023-06-20 18:07:47 -04:00
fsdp.md	[docs] Trainer docs (#28145 )	2023-12-20 10:37:23 -08:00
generation_strategies.md	Generate: inner decoding methods are no longer public (#29437 )	2024-03-05 10:27:36 +00:00
glossary.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
hf_quantizer.md	[CI] Quantization workflow (#29046 )	2024-02-28 10:09:25 -05:00
hpo_train.md	Remove-auth-token (#27060 )	2023-11-13 14:20:54 +01:00
index.md	Add MusicGen Melody (#28819 )	2024-03-18 13:06:12 +00:00
installation.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
llm_tutorial_optimization.md	F.scaled_dot_product_attention support (#26572 )	2023-12-09 05:38:14 +09:00
llm_tutorial.md	Generate: All logits processors are documented and have examples (#27796 )	2023-12-07 15:11:35 +00:00
model_memory_anatomy.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
model_sharing.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
model_summary.md	Migrate doc files to Markdown. (#24376 )	2023-06-20 18:07:47 -04:00
multilingual.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
notebooks.md	Enable doc in Spanish (#16518 )	2022-04-04 10:25:46 -04:00
pad_truncation.md	[Doc] Spanish translation of pad_truncation.md (#27890 )	2023-12-08 10:32:18 -08:00
peft.md	[`Peft`] `modules_to_save` support for peft integration (#27466 )	2023-11-14 10:32:57 +01:00
perf_hardware.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_infer_cpu.md	[Docs] Fix spelling and grammar mistakes (#28825 )	2024-02-02 08:45:00 +01:00
perf_infer_gpu_one.md	Cohere Model Release (#29622 )	2024-03-15 14:29:11 +01:00
perf_torch_compile.md	Fix rendering for `torch.compile()` docs (#25432 )	2023-08-10 13:25:00 +02:00
perf_train_cpu_many.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_train_cpu.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_train_gpu_many.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_train_gpu_one.md	Fix minor typo: softare => software (#29602 )	2024-03-12 10:39:56 +00:00
perf_train_special.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
perf_train_tpu_tf.md	Migrate doc files to Markdown. (#24376 )	2023-06-20 18:07:47 -04:00
performance.md	[docs] Update CPU/GPU inference docs (#26881 )	2023-10-31 09:44:51 -07:00
perplexity.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
philosophy.md	[docs] fixed links with 404 (#27327 )	2023-11-06 19:45:03 +00:00
pipeline_tutorial.md	Update the pipeline tutorial to include `gradio.Interface.from_pipeline` (#29684 )	2024-03-18 09:17:41 -07:00
pipeline_webserver.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
pr_checks.md	[Docs] Fix spelling and grammar mistakes (#28825 )	2024-02-02 08:45:00 +01:00
preprocessing.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
quantization.md	[Quantization] Quanto quantizer (#29023 )	2024-03-15 11:51:29 -04:00
quicktour.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
run_scripts.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
sagemaker.md	[docs] fixed links with 404 (#27327 )	2023-11-06 19:45:03 +00:00
serialization.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
task_summary.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
tasks_explained.md	[docs] Spanish translation of tasks_explained.md (#29224 )	2024-02-26 08:18:15 -08:00
testing.md	Make torch xla available on GPU (#29334 )	2024-03-11 14:07:16 +00:00
tf_xla.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
tflite.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
tokenizer_summary.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
torchscript.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
trainer.md	FEAT / Optim: Add GaLore optimizer (#29588 )	2024-03-19 11:40:23 +01:00
training.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
transformers_agents.md	[doc] Always call it Agents for consistency (#25958 )	2023-09-05 12:27:20 +01:00
troubleshooting.md	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00