transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-21 13:38:31 +06:00

Author	SHA1	Message	Date
Howard Liberty	f16caf44bb	Add FSDP config for CPU RAM efficient loading through accelerate (#30002) * Add FSDP config for CPU RAM efficient loading * Style fix * Update src/transformers/training_args.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add sync_module_states and cpu_ram_efficient_loading validation logic * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Style --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-22 13:15:28 +01:00
Zach Mueller	60d5f8f9f0	🚨🚨🚨Deprecate `evaluation_strategy` to `eval_strategy`🚨🚨🚨 (#30190) * Alias * Note alias * Tests and src * Rest * Clean * Change typing? * Fix tests * Deprecation versions	2024-04-18 12:49:43 -04:00
Sourab Mangrulkar	350c5d1566	Add support for FSDP+QLoRA and DeepSpeed ZeRO3+QLoRA (#29587) * fsdp+qlora related changes * fixes * Update quantization_config.py * support fsdp+qlora and dsz3+qlora * Update quantization_config.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * Update modeling_utils.py * handle fsdp+qlora and dsz3+qlora correctly while model loading * fix param count * quality * fsdp related changes * fsdp changes only when using LoRA/QLoRA * add accelerate version check * refactor, update min accelerate version and add tests 1. Update minimum accelerate version to 0.26.0 2. Clean the trainer wrt accelerate version checks 3. FSDP refactor and test for fsdp config 4. use `itemsize` instead of `dtype2bytes` dict * fix test * Address comments Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fix the conditional flag * fix conditional flag * address comments Co-Authored-By: Zach Mueller <7831895+muellerzr@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Zach Mueller <7831895+muellerzr@users.noreply.github.com>	2024-03-13 22:03:02 +05:30
Lysandre Debut	f497f564bb	Update all references to canonical models (#29001) * Script & Manual edition * Update	2024-02-16 08:16:58 +01:00
Sourab Mangrulkar	238d2e3c44	fix resuming from ckpt when using FSDP with FULL_STATE_DICT (#27891) * fix resuming from ckpt when using FSDP with FULL_STATE_DICT * update tests * fix tests	2023-12-16 19:41:43 +05:30
Hz, Ji	82c7e87987	device agnostic fsdp testing (#27120) * make fsdp test cases device agnostic * make style	2023-11-01 07:17:06 +01:00
Yih-Dar	3e93dd295b	Skip `TrainerIntegrationFSDP::test_basic_run_with_cpu_offload` if `torch < 2.1` (#26764) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-10-12 18:22:09 +02:00
Sourab Mangrulkar	86ffd5ffa2	fix name error when accelerate is not available (#26278) * fix name error when accelerate is not available * fix `is_fsdp_available`	2023-09-20 08:02:55 +02:00
Sourab Mangrulkar	382ba670ed	FSDP tests and checkpointing fixes (#26180) * add fsdp tests * Update test_fsdp.py * Update test_fsdp.py * fixes * checks * Update trainer.py * fix * fixes for saving/resuming checkpoints * fixes * add tests and delete debug statements * fixing tests * Update test_fsdp.py * fix tests * fix tests * minor nits * fix code style and quality * refactor and modularize test code * reduce the time of tests * reduce the test time * fix test * reduce test time * reduce test time * fix failing tests * fix * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * resolve comments --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-09-20 10:26:16 +05:30

9 Commits