transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-29 01:02:25 +06:00

Author	SHA1	Message	Date
Marc Sun	b4fd49b6c5	Update unwrap from accelerate (#29933 ) * Use unwrap with the one in accelerate * oups * update unwrap * fix * wording * raise error instead * comment * doc * Update src/transformers/modeling_utils.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * style * put else --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-04-19 18:05:34 +02:00
Zach Mueller	60d5f8f9f0	🚨🚨🚨Deprecate `evaluation_strategy` to `eval_strategy`🚨🚨🚨 (#30190 ) * Alias * Note alias * Tests and src * Rest * Clean * Change typing? * Fix tests * Deprecation versions	2024-04-18 12:49:43 -04:00
Zach Mueller	e27d9308be	Raise relevent err when wrong type is passed in as the accelerator_config (#29997 ) * Raise relevent err * Use type instead	2024-04-16 11:21:24 -04:00
Zach Mueller	3b8e2932ce	Rework tests to compare trainer checkpoint args (#29883 ) * Start rework * Fix failing test * Include max * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-30 22:19:17 -04:00
Yu Chin Fabian Lim	4df5b9b4b2	Allow GradientAccumulationPlugin to be configured from AcceleratorConfig (#29589 ) * add gradient_accumulation_kwargs to AcceleratorConfig * add suggestions from @muellerzr to docstrings, new behavior and tests * Documentation suggestions from @muellerz Co-authored-by: Zach Mueller <muellerzr@gmail.com> * addressed @muellerzr comments regarding tests and test utils * moved accelerate version to top of file. * @muellerzr's variable fix Co-authored-by: Zach Mueller <muellerzr@gmail.com> * address @amyeroberts. fix tests and docstrings * address @amyeroberts additional suggestions --------- Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-03-28 14:01:40 +00:00
Christopher Keibel	aac7099c92	add functions to inspect model and optimizer status to trainer.py (#29838 ) * add functions to get number of params which require grad, get optimizer group for parameters and get learning rates of param groups to trainer.py * add tests and raise ValueError when optimizer is None * add second layer to test and freeze its weigths * check if torch is available before running tests * use decorator to check if torch is available Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix test indentation Co-authored-by: Zach Mueller <muellerzr@gmail.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-03-28 10:37:16 +00:00
Yanyi Liu	ef60995858	Add `cosine_with_min_lr` scheduler in Trainer (#29341 ) * Add cosine_with_min_lr scheduler * Update error message for missing min_lr or min_lr_rate	2024-03-26 13:57:07 +01:00
Jonathan Flynn	b5a6d6eeab	Add warnings if training args differ from checkpoint trainer state (#29255 ) * add warnings if training args differ from checkpoint args stored in trainer_state.json * run formatting and styling * add a test * format and styling --------- Co-authored-by: Jonathan Flynn <jonl.flynn@guardian.co.uk>	2024-03-26 07:13:13 +01:00
Younes Belkada	f6261d7d81	FEAT / Optim: Add GaLore optimizer (#29588 ) * add galore v1 * add import * add tests and doc * fix doctest * forward contrib credits from discussions * forward contrib credits from discussions * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix failing tests' * switch to `optim_target_modules` and clarify docs * more clarification * enhance lookup logic * update a test to add peak memory * add regex, all-linear and single string support * add layer-wise optimization through DummyOptimizers and LRSchedulers * forward contrib credits from discussions and original idea * add a section about DDP not supported in layerwise * Update src/transformers/trainer.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix self * check only if layer_wise * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * oops * make use of intervals * clarify comment * add matching tests * GaLoRe -> GaLore * move to `get_scheduler` * add note on docs * add a warning * adapt a bit the docs * update docstring * support original API * Update docs/source/en/trainer.md * slightly refactor * Update docs/source/en/trainer.md Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix args parsing and add tests * remove warning for regex * fix type hint * add note about extra args * make `is_regex` return optional --------- Co-authored-by: Maxime <maximegmd @users.noreply.github.com> Co-authored-by: Wing Lian <winglian @users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: hiyouga <hiyouga@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>	2024-03-19 11:40:23 +01:00
Fanli Lin	3f6973db06	[tests] use the correct `n_gpu` in `TrainerIntegrationTest::test_train_and_eval_dataloaders` for XPU (#29307 ) * fix n_gpu * fix style	2024-03-08 10:52:25 -05:00
Zach Mueller	1681a6d452	🚨 Fully revert atomic checkpointing 🚨 (#29370 ) Fully revert atomic checkpointing	2024-03-04 06:17:42 -05:00
Zach Mueller	1a7c117df9	Fix deprecated arg issue (#29372 ) * Fix deprecated arg issue * Trainer check too * Check for dict or dataclass * Simplify, make config always AcceleratorConfig * Upstream to Trainer	2024-03-01 12:00:29 -05:00
Younes Belkada	efdd436663	FIX [`PEFT` / `Trainer` ] Handle better peft + quantized compiled models (#29055 ) * handle peft + compiled models * add tests * fixup * adapt from suggestions * clarify comment	2024-02-20 12:45:08 +01:00
Younes Belkada	f7ef7cec6c	FEAT [`Trainer` / `bnb`]: Add RMSProp from `bitsandbytes` to HF `Trainer` (#29082 ) * add RMSProp to Trainer * revert some change * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-20 02:43:02 +01:00
Zach Mueller	636b03244c	Fix trainer test wrt DeepSpeed + auto_find_bs (#29061 ) * FIx trainer test * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 10:04:24 -05:00
Lysandre Debut	f497f564bb	Update all references to canonical models (#29001 ) * Script & Manual edition * Update	2024-02-16 08:16:58 +01:00
Younes Belkada	7a0fccc6eb	FIX [`Trainer` / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` (#29009 ) * fix trainer tags * add test	2024-02-14 23:56:35 +01:00
Zach Mueller	0507e69d34	Introduce AcceleratorConfig dataclass (#28664 ) * Introduce acceleratorconfig dataclass * Extra second warn * Move import * Try moving import under is_accelerate_available * Quality * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Clean * Remove to_kwargs * Change version * Improve tests by including dispatch and split batches * Improve reliability * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixup tests and review nits * Make tests pass * protect import * Protect import * Empty-Commit * Make training_args.to_dict handle the AcceleratorConfig --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-14 10:18:09 -05:00
Huazhong Ji	69ca640dd6	Set the dataset format used by `test_trainer` to float32 (#28920 ) Co-authored-by: unit_test <test@unit.com>	2024-02-14 13:55:12 +00:00
Yih-Dar	d336c56d94	Avoid root logger's level being changed (#28638 ) * avoid root logger's level being changed --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-22 14:45:30 +01:00
Zach Mueller	6015d0ad6c	Support `DeepSpeed` when using auto find batch size (#28088 ) Fixup test	2024-01-10 06:03:13 -05:00
Zach Mueller	a777f52599	Skip now failing test in the Trainer tests (#28421 ) * Fix test * Skip	2024-01-10 06:02:31 -05:00
peter-sk	769a9542de	move code to Trainer.evaluate to enable use of that function with multiple datasets (#27844 ) * move code to Trainer.evaluate to enable use of that function with multiple datasets * test * update doc string * and a tip * forgot the type --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>	2023-12-20 10:55:56 +01:00
Zach Mueller	44127ec667	Fix test for auto_find_batch_size on multi-GPU (#27947 ) * Fix test for multi-GPU * WIth CPU handle	2023-12-11 09:57:41 -05:00
Zach Mueller	6757ed28ce	Allow `resume_from_checkpoint` to handle `auto_find_batch_size` (#27568 ) * Fuffill request * Add test * Better test * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Better test * Better test * MOre comments --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-12-08 11:51:02 -05:00
Jonathon Belotti	4c5ed1d0c9	fix: non-atomic checkpoint save (#27820 )	2023-12-08 14:08:54 +01:00
Charbel Abi Daher	2ca73e5ee3	Fixed passing scheduler-specific kwargs via TrainingArguments lr_scheduler_kwargs (#27595 ) * Fix passing scheduler-specific kwargs through TrainingArguments `lr_scheduler_kwargs` * Added test for lr_scheduler_kwargs	2023-11-28 08:33:45 +01:00
Dave Berenbaum	8eb9e29d8d	dvclive callback: warn instead of fail when logging non-scalars (#27608 ) * dvclive callback: warn instead of fail when logging non-scalars * tests: log lr as scalar	2023-11-21 09:29:51 +01:00
Arthur	651408a077	[`Styling`] stylify using ruff (#27144 ) * try to stylify using ruff * might need to remove these changes? * use ruf format andruff check * use isinstance instead of type comparision * use # fmt: skip * use # fmt: skip * nits * soem styling changes * update ci job * nits isinstance * more files update * nits * more nits * small nits * check and format * revert wrong changes * actually use formatter instead of checker * nits * well docbuilder is overwriting this commit * revert notebook changes * try to nuke docbuilder * style * fix feature exrtaction test * remve `indent-width = 4` * fixup * more nits * update the ruff version that we use * style * nuke docbuilder styling * leve the print for detected changes * nits * Remove file I/O Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com> * style * nits * revert notebook changes * Add # fmt skip when possible * Add # fmt skip when possible * Fix * More ` # fmt: skip` usage * More ` # fmt: skip` usage * More ` # fmt: skip` usage * NIts * more fixes * fix tapas * Another way to skip * Recommended way * Fix two more fiels * Remove asynch Remove asynch --------- Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>	2023-11-16 17:43:19 +01:00
Hz, Ji	1ffc4dee5b	enable memory tracker metrics for npu (#27280 )	2023-11-06 13:44:21 +00:00
Lysandre Debut	113ebf80ac	Safetensors serialization by default (#27064 ) * Safetensors serialization by default * First pass on the tests * Second pass on the tests * Third pass on the tests * Fix TF weight loading from TF-format safetensors * Specific encoder-decoder fixes for weight crossloading * Add VisionEncoderDecoder fixes for TF too * Change filename test for pt-to-tf * One missing fix for TFVisionEncoderDecoder * Fix the other crossload test * Support for flax + updated tests * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Sanchit's comments * Sanchit's comments 2 * Nico's comments * Fix tests * cleanup * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-31 19:16:49 +01:00
Younes Belkada	309a90664f	[FEAT] Add Neftune into transformers Trainer (#27141 ) * add v1 neftune * use `unwrap_model` instead * add test + docs * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * more details * fixup * Update docs/source/en/main_classes/trainer.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * refactor a bit * more elaborated test * fix unwrap issue --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-31 16:03:59 +01:00
Hz, Ji	5bbf671276	Device agnostic trainer testing (#27131 )	2023-10-30 18:16:40 +00:00
Younes Belkada	5fbed2d7ca	[`Trainer` / `GC`] Add `gradient_checkpointing_kwargs` in trainer and training arguments (#27068 ) * add `gradient_checkpointing_kwargs` in trainer and training arguments * add comment * add test - currently failing * now tests pass	2023-10-30 12:41:48 +01:00
Zach Mueller	34a640642b	Save TB logs as part of push_to_hub (#27022 ) * Support runs/ * Upload runs folder as part of push to hub * Add a test * Add to test deps * Update with proposed solution from Slack * Ensure that repo gets deleted in tests	2023-10-26 12:13:19 -04:00
Wang, Yi	8f609ab9e0	enable optuna multi-objectives feature (#25969 ) * enable optuna multi-objectives feature Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update hpo doc * update docstring Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * extend direction to List[str] type Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Update src/transformers/integrations/integration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-12 18:01:22 +01:00
Zach Mueller	be0e189bd3	Revert frozen training arguments (#25903 ) * Revert frozen training arguments * TODO	2023-09-01 11:24:12 -04:00
Zach Mueller	ca51499248	Make training args fully immutable (#25435 ) * Make training args fully immutable * Working tests, PyTorch * In test_trainer * during testing * Use proper dataclass way * Fix test * Another one * Fix tf * Lingering slow * Exception * Clean	2023-08-15 11:47:47 -04:00
Sylvain Gugger	baf1daa58e	Migrate Trainer from `Repository` to `upload_folder` (#25095 ) * First draft * Deal with progress bars * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Address review comments * Forgot one * Pin hf_hub * Add argument for push all and fix tests * Fix tests * Address review comments --------- Co-authored-by: Lucain <lucainp@gmail.com>	2023-08-07 17:47:22 +02:00
Zach Mueller	0284285501	Fix pad across processes dim in trainer and not being able to set the timeout (#24775 ) * dim, and rm copy * Don't rm copy for now * Oops * pad index * Should be a working test * Tickle down ddp timeout * Put fix back in now that testing locally is done * Better comment specifying timeout Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-12 10:01:51 -04:00
Alex Hall	b6295b26c5	Refactor hyperparameter search backends (#24384 ) * Refactor hyperparameter search backends * Simpler refactoring without abstract base class * black * review comments: specify name in class use methods instead of callable class attributes name constant better * review comments: safer bool checking, log multiple available backends * test ALL_HYPERPARAMETER_SEARCH_BACKENDS vs HPSearchBackend in unit test, not module. format with black. * copyright	2023-06-22 14:28:25 -04:00
Zach Mueller	ebd94b0f6f	🚨🚨🚨 Replace DataLoader logic for Accelerate in Trainer, remove unneeded tests 🚨🚨🚨 (#24028 ) * Working integration * Fix failing test * Revert label host logic * Bring it back!	2023-06-12 11:23:37 -04:00
Tim Dettmers	796162c512	Paged Optimizer + Lion Optimizer for Trainer (#23217 ) * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-05-24 12:53:28 +02:00
Maxime Méloux	9b435204b1	Add Trainer support for ReduceLROnPlateau (#23010 ) * Add Trainer support for ReduceLROnPlateau Fixes #16503 * Remove training argument and add default instance --------- Co-authored-by: mmeloux <maxime.meloux@loria.fr>	2023-04-28 09:17:30 -04:00
Viktor Scherbakov	871598be55	Implemented safetensors checkpoints save/load for Trainer (#22498 ) * implemented safetensors save/load * remove duplicated file * added tests * more tests * style fix * fix tf tests * change to list comprehension Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * review fixes + safe load for sharded checkpoint * style fix * remove rogue import * remove partial to avoid undefined exception * use naming alias instead of safetensors.torch * fix safe sharding in tests * grammar Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor corrections * style --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-04 09:05:04 -04:00
Yih-Dar	5110e5748e	🔥py38 + torch 2 🔥🔥🔥🚀 (#22204 ) * py38 + torch 2 * increment cache versions --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 22:59:23 +01:00
Lucain	923110b74f	Remove set_access_token usage + fail tests if FutureWarning (#22051 ) * Remove set_access_token usage + fail tests if FutureWarning * do not fail on FutureWarning in CI --------- Co-authored-by: testbot <lucainp@hf.co>	2023-03-09 09:23:48 -05:00
Sylvain Gugger	b29e2dcaff	Fix flaky test for log level (#21776 ) * Fix flaky test for log level * Fix other flaky test	2023-02-28 16:24:14 -05:00
ydshieh	aa3787c8f0	Skip test_log_level for now	2023-02-23 12:11:20 +01:00
Sylvain Gugger	b19d64d852	Respect documentation on passive log level (#21700 ) * Respect documentation on passive log level * Fix test and set log level in examples * Add doc	2023-02-22 09:39:18 +01:00

1 2

83 Commits