transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-21 05:28:21 +06:00

Author	SHA1	Message	Date
Sangbum Daniel Choi	cb298978ad	add gather_use_object arguments (#31514 ) * add gather_use_object arguments * fix name and pass the CI test for Seq2SeqTrainer * make style * make it to functools * fix typo * add accelerate version: * adding warning * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * make style * Update src/transformers/training_args.py * check function move to initial part * add test for eval_use_gather_object --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-06-28 13:50:27 +01:00
amyeroberts	1de7dc7403	Skip tests properly (#31308 ) * Skip tests properly * [test_all] * Add 'reason' as kwarg for skipTest * [test_all] Fix up * [test_all]	2024-06-26 21:59:08 +01:00
Bastien Le Chenadec	485fd81471	Support multiple validation datasets when `dataloader_persistent_workers=True` (#30627 ) * Support multiple validation datasets when dataloader_persistent_workers=True * Test support of multiple validation datasets	2024-06-17 16:58:39 +01:00
조준래	60861fe1fd	Implement JSON dump conversion for torch_dtype in TrainingArguments (#31224 ) * Implement JSON dump conversion for torch_dtype in TrainingArguments * Add unit test for converting torch_dtype in TrainingArguments to JSON * move unit test for converting torch_dtype into TrainerIntegrationTest class * reformating using ruff * convert dict_torch_dtype_to_str to private method _dict_torch_dtype_to_str --------- Co-authored-by: jun.4 <jun.4@kakaobrain.com>	2024-06-07 15:43:34 +01:00
Zach Mueller	daf281f44f	Enforce saving at end of training if saving option chosen (#30160 ) * Enforce saving at end of training * Fix test * Rework test * Fixup tests' * Update comment based on sourab feedback * Clean	2024-05-21 07:50:11 -04:00
Mohit Sharma	7a4792e6b3	CI: AMD MI300 tests fix (#30797 ) * add fix * update import * updated dicts and comments * remove prints * Update testing_utils.py	2024-05-21 12:46:07 +01:00
Younes Belkada	8871b26150	FEAT / Trainer: LOMO optimizer support (#30178 ) * add V1 - adalomo not working yet * add todo docs + refactor from comments * adjust LR * add docs * add more elaborated test * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix * push * add accelerate check * fix DDP case * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix * init kwargs * safely add attribute * revert to enum logic * Update src/transformers/trainer.py --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-21 10:16:37 +02:00
Zach Mueller	92d1d97c05	Introduce configured_state arg for accelerator_config (#29781 ) * Introduce configured_state * Include note on tuning * Allow for users to have defined a state already * Include tests * Add note on hpam tune * Guard a bit better * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Finish rebase * Finish rebase * Guard carefully * Fixup test * Refactor * Fin refactor * Comment * Update wrt feedback --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-20 09:21:40 -04:00
fxmarty	37bba2a32d	CI: update to ROCm 6.0.2 and test MI300 (#30266 ) * update to ROCm 6.0.2 and test MI300 * add callers for mi300 * update dockerfile * fix trainer tests * remove apex * style * Update tests/trainer/test_trainer_seq2seq.py * Update tests/trainer/test_trainer_seq2seq.py * Update tests/trainer/test_trainer_seq2seq.py * Update tests/trainer/test_trainer_seq2seq.py * update to torch 2.3 * add workflow dispatch target * we may need branches: mi300-ci after all * nit * fix docker build * nit * add check runner * remove docker-gpu * fix issues * fix --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-05-13 18:14:36 +02:00
Nate Cibik	df475bf8e6	Trainer - add cache clearing and the option for batched eval metrics computation (#28769 ) * Added cache clearing for GPU efficiency. * Added cache clearing for GPU efficiency. * Added batch_eval_metrics capability * Ran make fixup * Fixed bug * Fixed whitespace issue * Fixed outdated condition * Updated docstrings with instructions for batch_eval_metrics. Updated end of dataloader logic * Added first version of batch_eval_metrics Trainer test * Fixed batch_eval_metrics Trainer tests for both eval and predict * Fixed batch_eval_metrics behavior for new Trainer variables * Fixed batch_eval_metrics Trainer tests * Ran fixup	2024-05-06 08:23:40 -04:00
Clara Pohland	e076953079	Trainer._load_from_checkpoint - support loading multiple Peft adapters (#30505 ) * Trainer: load checkpoint model with multiple adapters * Trainer._load_from_checkpoint support multiple active adapters * PeftModel.set_adapter does not support multiple adapters yet * Trainer._load_from_checkpoint test multiple adapters --------- Co-authored-by: Clara Luise Pohland <clara-luise.pohland@telekom.de>	2024-05-06 08:22:52 -04:00
Marc Sun	b4fd49b6c5	Update unwrap from accelerate (#29933 ) * Use unwrap with the one in accelerate * oups * update unwrap * fix * wording * raise error instead * comment * doc * Update src/transformers/modeling_utils.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * style * put else --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-04-19 18:05:34 +02:00
Zach Mueller	60d5f8f9f0	🚨🚨🚨Deprecate `evaluation_strategy` to `eval_strategy`🚨🚨🚨 (#30190 ) * Alias * Note alias * Tests and src * Rest * Clean * Change typing? * Fix tests * Deprecation versions	2024-04-18 12:49:43 -04:00
Zach Mueller	e27d9308be	Raise relevent err when wrong type is passed in as the accelerator_config (#29997 ) * Raise relevent err * Use type instead	2024-04-16 11:21:24 -04:00
Zach Mueller	3b8e2932ce	Rework tests to compare trainer checkpoint args (#29883 ) * Start rework * Fix failing test * Include max * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-30 22:19:17 -04:00
Yu Chin Fabian Lim	4df5b9b4b2	Allow GradientAccumulationPlugin to be configured from AcceleratorConfig (#29589 ) * add gradient_accumulation_kwargs to AcceleratorConfig * add suggestions from @muellerzr to docstrings, new behavior and tests * Documentation suggestions from @muellerz Co-authored-by: Zach Mueller <muellerzr@gmail.com> * addressed @muellerzr comments regarding tests and test utils * moved accelerate version to top of file. * @muellerzr's variable fix Co-authored-by: Zach Mueller <muellerzr@gmail.com> * address @amyeroberts. fix tests and docstrings * address @amyeroberts additional suggestions --------- Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-03-28 14:01:40 +00:00
Christopher Keibel	aac7099c92	add functions to inspect model and optimizer status to trainer.py (#29838 ) * add functions to get number of params which require grad, get optimizer group for parameters and get learning rates of param groups to trainer.py * add tests and raise ValueError when optimizer is None * add second layer to test and freeze its weigths * check if torch is available before running tests * use decorator to check if torch is available Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix test indentation Co-authored-by: Zach Mueller <muellerzr@gmail.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-03-28 10:37:16 +00:00
Yanyi Liu	ef60995858	Add `cosine_with_min_lr` scheduler in Trainer (#29341 ) * Add cosine_with_min_lr scheduler * Update error message for missing min_lr or min_lr_rate	2024-03-26 13:57:07 +01:00
Jonathan Flynn	b5a6d6eeab	Add warnings if training args differ from checkpoint trainer state (#29255 ) * add warnings if training args differ from checkpoint args stored in trainer_state.json * run formatting and styling * add a test * format and styling --------- Co-authored-by: Jonathan Flynn <jonl.flynn@guardian.co.uk>	2024-03-26 07:13:13 +01:00
Younes Belkada	f6261d7d81	FEAT / Optim: Add GaLore optimizer (#29588 ) * add galore v1 * add import * add tests and doc * fix doctest * forward contrib credits from discussions * forward contrib credits from discussions * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix failing tests' * switch to `optim_target_modules` and clarify docs * more clarification * enhance lookup logic * update a test to add peak memory * add regex, all-linear and single string support * add layer-wise optimization through DummyOptimizers and LRSchedulers * forward contrib credits from discussions and original idea * add a section about DDP not supported in layerwise * Update src/transformers/trainer.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix self * check only if layer_wise * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * oops * make use of intervals * clarify comment * add matching tests * GaLoRe -> GaLore * move to `get_scheduler` * add note on docs * add a warning * adapt a bit the docs * update docstring * support original API * Update docs/source/en/trainer.md * slightly refactor * Update docs/source/en/trainer.md Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix args parsing and add tests * remove warning for regex * fix type hint * add note about extra args * make `is_regex` return optional --------- Co-authored-by: Maxime <maximegmd@users.noreply.github.com> Co-authored-by: Wing Lian <winglian@users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: hiyouga <hiyouga@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>	2024-03-19 11:40:23 +01:00
Fanli Lin	3f6973db06	[tests] use the correct `n_gpu` in `TrainerIntegrationTest::test_train_and_eval_dataloaders` for XPU (#29307 ) * fix n_gpu * fix style	2024-03-08 10:52:25 -05:00
Zach Mueller	1681a6d452	🚨 Fully revert atomic checkpointing 🚨 (#29370 ) Fully revert atomic checkpointing	2024-03-04 06:17:42 -05:00
Zach Mueller	1a7c117df9	Fix deprecated arg issue (#29372 ) * Fix deprecated arg issue * Trainer check too * Check for dict or dataclass * Simplify, make config always AcceleratorConfig * Upstream to Trainer	2024-03-01 12:00:29 -05:00
Younes Belkada	efdd436663	FIX [`PEFT` / `Trainer` ] Handle better peft + quantized compiled models (#29055 ) * handle peft + compiled models * add tests * fixup * adapt from suggestions * clarify comment	2024-02-20 12:45:08 +01:00
Younes Belkada	f7ef7cec6c	FEAT [`Trainer` / `bnb`]: Add RMSProp from `bitsandbytes` to HF `Trainer` (#29082 ) * add RMSProp to Trainer * revert some change * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-20 02:43:02 +01:00
Zach Mueller	636b03244c	Fix trainer test wrt DeepSpeed + auto_find_bs (#29061 ) * FIx trainer test * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 10:04:24 -05:00
Lysandre Debut	f497f564bb	Update all references to canonical models (#29001 ) * Script & Manual edition * Update	2024-02-16 08:16:58 +01:00
Younes Belkada	7a0fccc6eb	FIX [`Trainer` / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` (#29009 ) * fix trainer tags * add test	2024-02-14 23:56:35 +01:00
Zach Mueller	0507e69d34	Introduce AcceleratorConfig dataclass (#28664 ) * Introduce acceleratorconfig dataclass * Extra second warn * Move import * Try moving import under is_accelerate_available * Quality * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Clean * Remove to_kwargs * Change version * Improve tests by including dispatch and split batches * Improve reliability * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixup tests and review nits * Make tests pass * protect import * Protect import * Empty-Commit * Make training_args.to_dict handle the AcceleratorConfig --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-14 10:18:09 -05:00
Huazhong Ji	69ca640dd6	Set the dataset format used by `test_trainer` to float32 (#28920 ) Co-authored-by: unit_test <test@unit.com>	2024-02-14 13:55:12 +00:00
Yih-Dar	d336c56d94	Avoid root logger's level being changed (#28638 ) * avoid root logger's level being changed --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-22 14:45:30 +01:00
Zach Mueller	6015d0ad6c	Support `DeepSpeed` when using auto find batch size (#28088 ) Fixup test	2024-01-10 06:03:13 -05:00
Zach Mueller	a777f52599	Skip now failing test in the Trainer tests (#28421 ) * Fix test * Skip	2024-01-10 06:02:31 -05:00
peter-sk	769a9542de	move code to Trainer.evaluate to enable use of that function with multiple datasets (#27844 ) * move code to Trainer.evaluate to enable use of that function with multiple datasets * test * update doc string * and a tip * forgot the type --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>	2023-12-20 10:55:56 +01:00
Zach Mueller	44127ec667	Fix test for auto_find_batch_size on multi-GPU (#27947 ) * Fix test for multi-GPU * WIth CPU handle	2023-12-11 09:57:41 -05:00
Zach Mueller	6757ed28ce	Allow `resume_from_checkpoint` to handle `auto_find_batch_size` (#27568 ) * Fuffill request * Add test * Better test * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Better test * Better test * MOre comments --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-12-08 11:51:02 -05:00
Jonathon Belotti	4c5ed1d0c9	fix: non-atomic checkpoint save (#27820 )	2023-12-08 14:08:54 +01:00
Charbel Abi Daher	2ca73e5ee3	Fixed passing scheduler-specific kwargs via TrainingArguments lr_scheduler_kwargs (#27595 ) * Fix passing scheduler-specific kwargs through TrainingArguments `lr_scheduler_kwargs` * Added test for lr_scheduler_kwargs	2023-11-28 08:33:45 +01:00
Dave Berenbaum	8eb9e29d8d	dvclive callback: warn instead of fail when logging non-scalars (#27608 ) * dvclive callback: warn instead of fail when logging non-scalars * tests: log lr as scalar	2023-11-21 09:29:51 +01:00
Arthur	651408a077	[`Styling`] stylify using ruff (#27144 ) * try to stylify using ruff * might need to remove these changes? * use ruf format andruff check * use isinstance instead of type comparision * use # fmt: skip * use # fmt: skip * nits * soem styling changes * update ci job * nits isinstance * more files update * nits * more nits * small nits * check and format * revert wrong changes * actually use formatter instead of checker * nits * well docbuilder is overwriting this commit * revert notebook changes * try to nuke docbuilder * style * fix feature exrtaction test * remve `indent-width = 4` * fixup * more nits * update the ruff version that we use * style * nuke docbuilder styling * leve the print for detected changes * nits * Remove file I/O Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com> * style * nits * revert notebook changes * Add # fmt skip when possible * Add # fmt skip when possible * Fix * More ` # fmt: skip` usage * More ` # fmt: skip` usage * More ` # fmt: skip` usage * NIts * more fixes * fix tapas * Another way to skip * Recommended way * Fix two more fiels * Remove asynch Remove asynch --------- Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>	2023-11-16 17:43:19 +01:00
Hz, Ji	1ffc4dee5b	enable memory tracker metrics for npu (#27280 )	2023-11-06 13:44:21 +00:00
Lysandre Debut	113ebf80ac	Safetensors serialization by default (#27064 ) * Safetensors serialization by default * First pass on the tests * Second pass on the tests * Third pass on the tests * Fix TF weight loading from TF-format safetensors * Specific encoder-decoder fixes for weight crossloading * Add VisionEncoderDecoder fixes for TF too * Change filename test for pt-to-tf * One missing fix for TFVisionEncoderDecoder * Fix the other crossload test * Support for flax + updated tests * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Sanchit's comments * Sanchit's comments 2 * Nico's comments * Fix tests * cleanup * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-31 19:16:49 +01:00
Younes Belkada	309a90664f	[FEAT] Add Neftune into transformers Trainer (#27141 ) * add v1 neftune * use `unwrap_model` instead * add test + docs * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * more details * fixup * Update docs/source/en/main_classes/trainer.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * refactor a bit * more elaborated test * fix unwrap issue --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-31 16:03:59 +01:00
Hz, Ji	5bbf671276	Device agnostic trainer testing (#27131 )	2023-10-30 18:16:40 +00:00
Younes Belkada	5fbed2d7ca	[`Trainer` / `GC`] Add `gradient_checkpointing_kwargs` in trainer and training arguments (#27068 ) * add `gradient_checkpointing_kwargs` in trainer and training arguments * add comment * add test - currently failing * now tests pass	2023-10-30 12:41:48 +01:00
Zach Mueller	34a640642b	Save TB logs as part of push_to_hub (#27022 ) * Support runs/ * Upload runs folder as part of push to hub * Add a test * Add to test deps * Update with proposed solution from Slack * Ensure that repo gets deleted in tests	2023-10-26 12:13:19 -04:00
Wang, Yi	8f609ab9e0	enable optuna multi-objectives feature (#25969 ) * enable optuna multi-objectives feature Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update hpo doc * update docstring Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * extend direction to List[str] type Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Update src/transformers/integrations/integration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-12 18:01:22 +01:00
Zach Mueller	be0e189bd3	Revert frozen training arguments (#25903 ) * Revert frozen training arguments * TODO	2023-09-01 11:24:12 -04:00
Zach Mueller	ca51499248	Make training args fully immutable (#25435 ) * Make training args fully immutable * Working tests, PyTorch * In test_trainer * during testing * Use proper dataclass way * Fix test * Another one * Fix tf * Lingering slow * Exception * Clean	2023-08-15 11:47:47 -04:00
Sylvain Gugger	baf1daa58e	Migrate Trainer from `Repository` to `upload_folder` (#25095 ) * First draft * Deal with progress bars * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Address review comments * Forgot one * Pin hf_hub * Add argument for push all and fix tests * Fix tests * Address review comments --------- Co-authored-by: Lucain <lucainp@gmail.com>	2023-08-07 17:47:22 +02:00

94 Commits