Mirror of https://github.com/huggingface/transformers.git (synced 2025-07-18 12:08:22 +06:00).
* Added lion and paged optimizers and made original tests pass.
* Added tests for paged and lion optimizers.
* Added and fixed optimizer tests.
* Style and quality checks.
* Initial draft. Some tests fail.
* Fixed dtype bug.
* Fixed bug caused by torch_dtype='auto'.
* All test green for 8-bit and 4-bit layers.
* Added fix for fp32 layer norms and bf16 compute in LLaMA.
* Fixing issues for PR #23479.
* Reverted variable name change.
* Added missing tests.
* Fixup changes.
* Added fixup changes.
* Missed some variables to rename.
* revert trainer tests
* revert test trainer
* another revert
* fix tests and safety checkers
* protect import
* simplify a bit
* Update src/transformers/trainer.py
* few fixes
* add warning
* replace with `load_in_kbit = load_in_4bit or load_in_8bit`
* fix test
* fix tests
* this time fix tests
* safety checker
* add docs
* revert torch_dtype
* Apply suggestions from code review (Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>)
* multiple fixes
* update docs
* version checks and multiple fixes
* replace `is_loaded_in_kbit`
* replace `load_in_kbit`
* change methods names
* better checks
* oops
* address final comments

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
internal/
main_classes/
model_doc/
tasks/
_config.py
_toctree.yml
accelerate.mdx
add_new_model.mdx
add_new_pipeline.mdx
add_tensorflow_model.mdx
attention.mdx
autoclass_tutorial.mdx
benchmarks.mdx
bertology.mdx
big_models.mdx
community.mdx
contributing.md
create_a_model.mdx
custom_models.mdx
custom_tools.mdx
debugging.mdx
fast_tokenizers.mdx
generation_strategies.mdx
glossary.mdx
hpo_train.mdx
index.mdx
installation.mdx
model_sharing.mdx
model_summary.mdx
multilingual.mdx
notebooks.md
pad_truncation.mdx
perf_hardware.mdx
perf_infer_cpu.mdx
perf_infer_gpu_many.mdx
perf_infer_gpu_one.mdx
perf_infer_special.mdx
perf_train_cpu_many.mdx
perf_train_cpu.mdx
perf_train_gpu_many.mdx
perf_train_gpu_one.mdx
perf_train_special.mdx
perf_train_tpu_tf.mdx
perf_train_tpu.mdx
performance.mdx
perplexity.mdx
philosophy.mdx
pipeline_tutorial.mdx
pipeline_webserver.mdx
pr_checks.mdx
preprocessing.mdx
quicktour.mdx
run_scripts.mdx
sagemaker.mdx
serialization.mdx
task_summary.mdx
tasks_explained.mdx
testing.mdx
tf_xla.mdx
tokenizer_summary.mdx
torchscript.mdx
training.mdx
transformers_agents.mdx
troubleshooting.mdx