Mirror of https://github.com/huggingface/transformers.git, synced 2025-07-15 02:28:24 +06:00
Latest commit (squashed):

* [WIP] add support for bf16 mode
* prep for bf16
* prep for bf16
* fix; zero2/bf16 is ok
* check bf16 is available
* test fixes
* enable zero3_bf16
* config files
* docs
* split stage_dtype; merge back to non-dtype-specific config file
* fix doc
* cleanup
* cleanup
* bfloat16 => bf16 to match the PR changes
* s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
* test fixes/skipping
* move
* fix
* Update docs/source/main_classes/deepspeed.mdx (Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* backticks
* cleanup
* cleanup
* cleanup
* new version
* add note about grad accum in bf16

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
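The squashed commits above track the addition of bf16 support to the DeepSpeed integration (enabling ZeRO stage 2 and stage 3 with bf16, and renaming the fp16-specific save options to 16-bit generic ones). As a hedged illustration only, not the exact config file from this PR, a minimal DeepSpeed config fragment enabling bf16 under ZeRO stage 3 looks roughly like this; the `"auto"` value is the convention the HF Trainer integration uses to fill a setting in from its own command-line arguments:

```json
{
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 3
  }
}
```

With such a config passed via `--deepspeed ds_config.json`, the Trainer resolves `"auto"` from the `--bf16` flag, so the same config file can serve both bf16 and non-bf16 runs.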
* callback.mdx
* configuration.mdx
* data_collator.mdx
* deepspeed.mdx
* feature_extractor.mdx
* keras_callbacks.mdx
* logging.mdx
* model.mdx
* onnx.mdx
* optimizer_schedules.mdx
* output.mdx
* pipelines.mdx
* processors.mdx
* text_generation.mdx
* tokenizer.mdx
* trainer.mdx