transformers/docs/source/en
Aaron V d5f992f5e6
Enhance Model Loading By Providing Parallelism, Uses Optional Env Flag (#36835)
* Get parallel loader working. Include tests.

* Update the tests for parallel loading

* Rename env variables.

* Add docs for parallel model weight loading.

* Touch up parallel model loading docs.

* Touch up parallel model loading docs again.

* Edit comment in test_modeling_utils_parallel_loading.py

* Make sure HF_PARALLEL_LOADING_WORKERS is spelled correctly in modeling_utils.py

* Correct times for parallelized loading, previous times were for a "hot" filesystem

* Update parallel model loading so the spawn method is encapsulated. DRY up the code by leveraging get_submodule.

* Update docs on model loading parallelism so that details on setting the multiprocessing start method are removed, now that the package handles this step internally.

* Fix style on model loading parallelism changes.

* Merge latest version of master's modeling_utils.

* Removed unused variable.

* Fix argument packing for the parallel loader.

* Fix state dict being undefined in the parallel model loader.

* Rename variables used in parallel model loading for clarity. Use get_module_from_name().

* Switch to the use of threads for parallel model loading.

* Update docs for parallel loading.

* Remove the use of json.loads when evaluating HF_ENABLE_PARALLEL_LOADING. Prefer simple casting.

* Move parallelized shard loading into its own function.

* Remove use of is_true(). Favor checking env var true values for HF_ENABLE_PARALLEL_LOADING.

* Update copyright to 2025 in readme for parallel model loading.

* Remove garbage collection line in load_shard_file, implicit garbage collection already occurs.

* Run formatter on modeling_utils.py

* Apply style fixes

* Delete tests/utils/test_modeling_utils_parallel_loading.py

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-05-23 16:39:47 +00:00
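
The commit notes above describe an opt-in, thread-based parallel loader controlled by the HF_ENABLE_PARALLEL_LOADING and HF_PARALLEL_LOADING_WORKERS environment variables. The following is a minimal standard-library sketch of that general pattern, not the code from modeling_utils.py. The set of accepted "true" strings, the default worker count of 8, and the helper names (parallel_loading_enabled, worker_count, load_shard, load_shards) are all assumptions made for illustration, and load_shard is a stand-in that does not read real checkpoint files.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Assumption: which strings count as "true" is illustrative only.
TRUE_VALUES = {"1", "true", "yes", "on"}


def parallel_loading_enabled(env=None):
    """Return True if the opt-in flag is set to a recognized true value."""
    env = os.environ if env is None else env
    return env.get("HF_ENABLE_PARALLEL_LOADING", "").strip().lower() in TRUE_VALUES


def worker_count(env=None, default=8):
    """Read the worker count with a simple int cast (default is an assumption)."""
    env = os.environ if env is None else env
    return max(1, int(env.get("HF_PARALLEL_LOADING_WORKERS", default)))


def load_shard(path):
    """Hypothetical stand-in for reading one checkpoint shard from disk."""
    return {f"{path}.weight": len(path)}


def load_shards(paths, env=None):
    """Merge all shards into one dict, using threads only when opted in."""
    state = {}
    if parallel_loading_enabled(env):
        with ThreadPoolExecutor(max_workers=worker_count(env)) as pool:
            # pool.map yields results in the same order as `paths`.
            for part in pool.map(load_shard, paths):
                state.update(part)
    else:
        for path in paths:
            state.update(load_shard(path))
    return state
```

Threads fit this kind of workload when shard reads are dominated by file I/O, and they avoid the need to configure a multiprocessing start method, which matches the direction the commit notes describe.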
internal Adds use_repr to model_addition_debugger_context (#37984) 2025-05-23 09:35:13 +00:00
main_classes 🔴 Video processors as a separate class (#35206) 2025-05-12 11:55:51 +02:00
model_doc 🚨Early-error🚨 config will error out if output_attentions=True and the attn implementation is wrong (#38288) 2025-05-23 17:17:38 +02:00
quantization Support AOPerModuleConfig and include_embedding (#37802) 2025-04-30 20:16:29 +02:00
reference Enhance Model Loading By Providing Parallelism, Uses Optional Env Flag (#36835) 2025-05-23 16:39:47 +00:00
tasks Enhance documentation to explain chat-based few-shot prompting (#37828) 2025-04-30 11:00:10 -07:00
_config.py Add optimized PixtralImageProcessorFast (#34836) 2024-11-28 16:04:05 +01:00
_redirects.yml Docs / Quantization: Redirect deleted page (#31063) 2024-05-28 18:29:22 +02:00
_toctree.yml Enhance Model Loading By Providing Parallelism, Uses Optional Env Flag (#36835) 2025-05-23 16:39:47 +00:00
accelerate.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
add_new_model.md Transformers cli clean command (#37657) 2025-04-30 12:15:43 +01:00
add_new_pipeline.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
agents.md [agents] remove agents 🧹 (#37368) 2025-04-11 18:42:37 +01:00
attention_interface.md 🚨🚨[core] Completely rewrite the masking logic for all attentions (#37866) 2025-05-22 11:38:26 +02:00
attention.md [Docs] Fix broken links and syntax issues (#28918) 2024-02-08 14:13:35 -08:00
auto_docstring.md [AutoDocstring] Based on inspect parsing of the signature (#33771) 2025-05-08 17:46:07 -04:00
backbones.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
cache_explanation.md Fix typos (#36910) 2025-03-24 14:08:29 +00:00
chat_extras.md Update chat_extras.md with content correction (#36599) 2025-03-07 13:09:02 +00:00
chat_templating_multimodal.md [chat-template] Unify tests and clean up 🧼 (#37275) 2025-04-10 14:42:32 +02:00
chat_templating_writing.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
chat_templating.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
community.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
conversations.md [chat] generate parameterization powered by GenerationConfig and UX-related changes (#38047) 2025-05-12 14:04:41 +01:00
custom_models.md Fix typos (#36910) 2025-03-24 14:08:29 +00:00
debugging.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
deepspeed.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
executorch.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
fast_tokenizers.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
feature_extractors.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
fsdp.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
generation_features.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
generation_strategies.md [custom_generate] don't forward custom_generate and trust_remote_code (#38304) 2025-05-23 14:49:39 +00:00
gguf.md Fix gguf docs (#36601) 2025-03-11 15:29:14 +01:00
glossary.md Fix typos (#31819) 2024-07-08 11:52:47 +01:00
gpu_selection.md Fix typos (#36910) 2025-03-24 14:08:29 +00:00
how_to_hack_models.md [doc] fix bugs in how_to_hack_models.md (#38198) 2025-05-19 10:37:54 -07:00
hpo_train.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
image_processors.md 🔴 Video processors as a separate class (#35206) 2025-05-12 11:55:51 +02:00
index.md Adding Qwen3 and Qwen3MoE (#36878) 2025-03-31 09:50:49 +02:00
installation.md byebye torch 2.0 (#37277) 2025-04-07 15:19:47 +02:00
kv_cache.md fix link in kv_cache.md (#37652) 2025-04-21 09:01:11 -07:00
llm_optims.md [CI] green llama tests (#37244) 2025-04-03 14:15:53 +01:00
llm_tutorial_optimization.md fix typos in the docs directory (#36639) 2025-03-11 09:41:41 -07:00
llm_tutorial.md [custom_generate] don't forward custom_generate and trust_remote_code (#38304) 2025-05-23 14:49:39 +00:00
model_memory_anatomy.md Enable BNB multi-backend support (#31098) 2024-09-24 03:40:56 -06:00
model_sharing.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
model_summary.md model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702) 2024-03-23 18:29:39 -07:00
models.md [docs] minor fixes in models.md (#38193) 2025-05-19 13:14:21 +00:00
modular_transformers.md Support custom docstrings in modular (#36726) 2025-03-18 14:00:54 -04:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
optimizers.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
pad_truncation.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
peft.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_hardware.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
perf_infer_cpu.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_infer_gpu_multi.md Fix: make docs work better with doc builder (#38213) 2025-05-20 08:23:03 +00:00
perf_infer_gpu_one.md Small typo lines 47 and 199 perf_infer_gpu_one.md (#37938) 2025-05-06 14:32:55 +01:00
perf_torch_compile.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_cpu_many.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_cpu.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_gaudi.md Add Intel Gaudi doc (#37855) 2025-04-29 13:28:06 -07:00
perf_train_gpu_many.md docs: fix typo (#37567) 2025-04-17 14:54:44 +01:00
perf_train_gpu_one.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_special.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perf_train_tpu_tf.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
perplexity.md [docs] use device-agnostic API instead of cuda (#34913) 2024-11-26 09:23:34 -08:00
philosophy.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
pipeline_gradio.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
pipeline_tutorial.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
pipeline_webserver.md fix and enhance pipeline_webserver.md (#36992) 2025-04-15 08:35:05 -07:00
pr_checks.md Fixed Majority of the Typos in transformers[en] Documentation (#33350) 2024-09-09 10:47:24 +02:00
processors.md [docs] add Audio import (#38195) 2025-05-19 13:16:35 +00:00
quicktour.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
run_scripts.md Remove research projects (#36645) 2025-03-11 13:47:38 +00:00
serialization.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
serving.md fix docs serving typos. (#37936) 2025-05-06 14:32:44 +01:00
task_summary.md [doctest] Fixes (#35863) 2025-01-26 15:26:38 -08:00
tasks_explained.md fix: Wrong task mentioned in docs (#34757) 2024-11-18 18:42:28 +00:00
testing.md chore: Fix typos in docs and examples (#36524) 2025-03-04 13:47:41 +00:00
tf_xla.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
tflite.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
tokenizer_summary.md [docs] Spanish translation of tokenizer_summary.md (#31154) 2024-06-03 16:52:23 -07:00
tools.md [agents] remove agents 🧹 (#37368) 2025-04-11 18:42:37 +01:00
torchscript.md Fix wording in torchscript.md (#38004) 2025-05-08 16:47:45 +01:00
trainer.md Update trainer.md (#38113) 2025-05-14 12:40:00 +00:00
training.md [docs] Redesign (#31757) 2025-03-03 10:33:46 -08:00
troubleshooting.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
video_processors.md 🔴 Video processors as a separate class (#35206) 2025-05-12 11:55:51 +02:00