transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-05 13:50:13 +06:00

Author	SHA1	Message	Date
Zach Mueller	60d5f8f9f0	🚨🚨🚨Deprecate `evaluation_strategy` to `eval_strategy`🚨🚨🚨 (#30190 ) * Alias * Note alias * Tests and src * Rest * Clean * Change typing? * Fix tests * Deprecation versions	2024-04-18 12:49:43 -04:00
Pavel Iakubovskii	c15aad0939	Add strategy to store results in evaluation loop (#30267 ) * Add evaluation loop container for interm. results * Add tests for EvalLoopContainer * Formatting * Fix padding_index in test and typo * Move EvalLoopContainer to pr_utils to avoid additional imports * Fix `eval_do_concat_batches` arg description * Fix EvalLoopContainer import	2024-04-17 12:42:27 +01:00
Zach Mueller	e27d9308be	Raise relevent err when wrong type is passed in as the accelerator_config (#29997 ) * Raise relevent err * Use type instead	2024-04-16 11:21:24 -04:00
Zach Mueller	3b8e2932ce	Rework tests to compare trainer checkpoint args (#29883 ) * Start rework * Fix failing test * Include max * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-30 22:19:17 -04:00
Yu Chin Fabian Lim	4df5b9b4b2	Allow GradientAccumulationPlugin to be configured from AcceleratorConfig (#29589 ) * add gradient_accumulation_kwargs to AcceleratorConfig * add suggestions from @muellerzr to docstrings, new behavior and tests * Documentation suggestions from @muellerz Co-authored-by: Zach Mueller <muellerzr@gmail.com> * addressed @muellerzr comments regarding tests and test utils * moved accelerate version to top of file. * @muellerzr's variable fix Co-authored-by: Zach Mueller <muellerzr@gmail.com> * address @amyeroberts. fix tests and docstrings * address @amyeroberts additional suggestions --------- Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-03-28 14:01:40 +00:00
Christopher Keibel	aac7099c92	add functions to inspect model and optimizer status to trainer.py (#29838 ) * add functions to get number of params which require grad, get optimizer group for parameters and get learning rates of param groups to trainer.py * add tests and raise ValueError when optimizer is None * add second layer to test and freeze its weigths * check if torch is available before running tests * use decorator to check if torch is available Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix test indentation Co-authored-by: Zach Mueller <muellerzr@gmail.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2024-03-28 10:37:16 +00:00
Yanyi Liu	ef60995858	Add `cosine_with_min_lr` scheduler in Trainer (#29341 ) * Add cosine_with_min_lr scheduler * Update error message for missing min_lr or min_lr_rate	2024-03-26 13:57:07 +01:00
Jonathan Flynn	b5a6d6eeab	Add warnings if training args differ from checkpoint trainer state (#29255 ) * add warnings if training args differ from checkpoint args stored in trainer_state.json * run formatting and styling * add a test * format and styling --------- Co-authored-by: Jonathan Flynn <jonl.flynn@guardian.co.uk>	2024-03-26 07:13:13 +01:00
Zach Mueller	c78f57729f	Update test reqs to include sentencepiece (#29756 ) * Update test reqs * Clean	2024-03-20 15:53:42 +00:00
Younes Belkada	f6261d7d81	FEAT / Optim: Add GaLore optimizer (#29588 ) * add galore v1 * add import * add tests and doc * fix doctest * forward contrib credits from discussions * forward contrib credits from discussions * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix failing tests' * switch to `optim_target_modules` and clarify docs * more clarification * enhance lookup logic * update a test to add peak memory * add regex, all-linear and single string support * add layer-wise optimization through DummyOptimizers and LRSchedulers * forward contrib credits from discussions and original idea * add a section about DDP not supported in layerwise * Update src/transformers/trainer.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix self * check only if layer_wise * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * oops * make use of intervals * clarify comment * add matching tests * GaLoRe -> GaLore * move to `get_scheduler` * add note on docs * add a warning * adapt a bit the docs * update docstring * support original API * Update docs/source/en/trainer.md * slightly refactor * Update docs/source/en/trainer.md Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix args parsing and add tests * remove warning for regex * fix type hint * add note about extra args * make `is_regex` return optional --------- Co-authored-by: Maxime <maximegmd@users.noreply.github.com> Co-authored-by: Wing Lian <winglian@users.noreply.github.com> Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: hiyouga <hiyouga@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>	2024-03-19 11:40:23 +01:00
Joao Gante	c47fcd0830	Trainer: fail early in the presence of an unsavable `generation_config` (#29675 )	2024-03-15 12:59:10 +00:00
Fanli Lin	3f6973db06	[tests] use the correct `n_gpu` in `TrainerIntegrationTest::test_train_and_eval_dataloaders` for XPU (#29307 ) * fix n_gpu * fix style	2024-03-08 10:52:25 -05:00
Zach Mueller	1681a6d452	🚨 Fully revert atomic checkpointing 🚨 (#29370 ) Fully revert atomic checkpointing	2024-03-04 06:17:42 -05:00
Zach Mueller	1a7c117df9	Fix deprecated arg issue (#29372 ) * Fix deprecated arg issue * Trainer check too * Check for dict or dataclass * Simplify, make config always AcceleratorConfig * Upstream to Trainer	2024-03-01 12:00:29 -05:00
Younes Belkada	efdd436663	FIX [`PEFT` / `Trainer` ] Handle better peft + quantized compiled models (#29055 ) * handle peft + compiled models * add tests * fixup * adapt from suggestions * clarify comment	2024-02-20 12:45:08 +01:00
Younes Belkada	f7ef7cec6c	FEAT [`Trainer` / `bnb`]: Add RMSProp from `bitsandbytes` to HF `Trainer` (#29082 ) * add RMSProp to Trainer * revert some change * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-20 02:43:02 +01:00
Zach Mueller	636b03244c	Fix trainer test wrt DeepSpeed + auto_find_bs (#29061 ) * FIx trainer test * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 10:04:24 -05:00
Lysandre Debut	f497f564bb	Update all references to canonical models (#29001 ) * Script & Manual edition * Update	2024-02-16 08:16:58 +01:00
Younes Belkada	7a0fccc6eb	FIX [`Trainer` / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` (#29009 ) * fix trainer tags * add test	2024-02-14 23:56:35 +01:00
Zach Mueller	0507e69d34	Introduce AcceleratorConfig dataclass (#28664 ) * Introduce acceleratorconfig dataclass * Extra second warn * Move import * Try moving import under is_accelerate_available * Quality * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Clean * Remove to_kwargs * Change version * Improve tests by including dispatch and split batches * Improve reliability * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixup tests and review nits * Make tests pass * protect import * Protect import * Empty-Commit * Make training_args.to_dict handle the AcceleratorConfig --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-14 10:18:09 -05:00
Huazhong Ji	69ca640dd6	Set the dataset format used by `test_trainer` to float32 (#28920 ) Co-authored-by: unit_test <test@unit.com>	2024-02-14 13:55:12 +00:00
Yih-Dar	d336c56d94	Avoid root logger's level being changed (#28638 ) * avoid root logger's level being changed --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-22 14:45:30 +01:00
Zach Mueller	6015d0ad6c	Support `DeepSpeed` when using auto find batch size (#28088 ) Fixup test	2024-01-10 06:03:13 -05:00
Zach Mueller	a777f52599	Skip now failing test in the Trainer tests (#28421 ) * Fix test * Skip	2024-01-10 06:02:31 -05:00
peter-sk	769a9542de	move code to Trainer.evaluate to enable use of that function with multiple datasets (#27844 ) * move code to Trainer.evaluate to enable use of that function with multiple datasets * test * update doc string * and a tip * forgot the type --------- Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com>	2023-12-20 10:55:56 +01:00
Zach Mueller	93766251cb	Fix bug with rotating checkpoints (#28009 ) * Fix bug * Write test * Keep back old modification for grad accum steps * Whitespace... * Whitespace again * Race condition * Wait for everyone	2023-12-13 12:17:30 -05:00
Zach Mueller	44127ec667	Fix test for auto_find_batch_size on multi-GPU (#27947 ) * Fix test for multi-GPU * WIth CPU handle	2023-12-11 09:57:41 -05:00
Zach Mueller	6757ed28ce	Allow `resume_from_checkpoint` to handle `auto_find_batch_size` (#27568 ) * Fuffill request * Add test * Better test * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Better test * Better test * MOre comments --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-12-08 11:51:02 -05:00
Jonathon Belotti	4c5ed1d0c9	fix: non-atomic checkpoint save (#27820 )	2023-12-08 14:08:54 +01:00
Charbel Abi Daher	2ca73e5ee3	Fixed passing scheduler-specific kwargs via TrainingArguments lr_scheduler_kwargs (#27595 ) * Fix passing scheduler-specific kwargs through TrainingArguments `lr_scheduler_kwargs` * Added test for lr_scheduler_kwargs	2023-11-28 08:33:45 +01:00
Dave Berenbaum	8eb9e29d8d	dvclive callback: warn instead of fail when logging non-scalars (#27608 ) * dvclive callback: warn instead of fail when logging non-scalars * tests: log lr as scalar	2023-11-21 09:29:51 +01:00
Arthur	651408a077	[`Styling`] stylify using ruff (#27144 ) * try to stylify using ruff * might need to remove these changes? * use ruf format andruff check * use isinstance instead of type comparision * use # fmt: skip * use # fmt: skip * nits * soem styling changes * update ci job * nits isinstance * more files update * nits * more nits * small nits * check and format * revert wrong changes * actually use formatter instead of checker * nits * well docbuilder is overwriting this commit * revert notebook changes * try to nuke docbuilder * style * fix feature exrtaction test * remve `indent-width = 4` * fixup * more nits * update the ruff version that we use * style * nuke docbuilder styling * leve the print for detected changes * nits * Remove file I/O Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com> * style * nits * revert notebook changes * Add # fmt skip when possible * Add # fmt skip when possible * Fix * More ` # fmt: skip` usage * More ` # fmt: skip` usage * More ` # fmt: skip` usage * NIts * more fixes * fix tapas * Another way to skip * Recommended way * Fix two more fiels * Remove asynch Remove asynch --------- Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>	2023-11-16 17:43:19 +01:00
Zach Mueller	067c4a310d	Have seq2seq just use gather (#27025 ) * Have seq2seq just use gather * Change * Reset after * Make slow * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Clean * Simplify and just use gather * Update tests/trainer/test_trainer_seq2seq.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * gather always for seq2seq --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-11-14 14:54:44 -05:00
Hz, Ji	1ffc4dee5b	enable memory tracker metrics for npu (#27280 )	2023-11-06 13:44:21 +00:00
Lysandre Debut	113ebf80ac	Safetensors serialization by default (#27064 ) * Safetensors serialization by default * First pass on the tests * Second pass on the tests * Third pass on the tests * Fix TF weight loading from TF-format safetensors * Specific encoder-decoder fixes for weight crossloading * Add VisionEncoderDecoder fixes for TF too * Change filename test for pt-to-tf * One missing fix for TFVisionEncoderDecoder * Fix the other crossload test * Support for flax + updated tests * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Sanchit's comments * Sanchit's comments 2 * Nico's comments * Fix tests * cleanup * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-31 19:16:49 +01:00
Younes Belkada	309a90664f	[FEAT] Add Neftune into transformers Trainer (#27141 ) * add v1 neftune * use `unwrap_model` instead * add test + docs * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * more details * fixup * Update docs/source/en/main_classes/trainer.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * refactor a bit * more elaborated test * fix unwrap issue --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-31 16:03:59 +01:00
Hz, Ji	5bbf671276	Device agnostic trainer testing (#27131 )	2023-10-30 18:16:40 +00:00
Younes Belkada	5fbed2d7ca	[`Trainer` / `GC`] Add `gradient_checkpointing_kwargs` in trainer and training arguments (#27068 ) * add `gradient_checkpointing_kwargs` in trainer and training arguments * add comment * add test - currently failing * now tests pass	2023-10-30 12:41:48 +01:00
Zach Mueller	34a640642b	Save TB logs as part of push_to_hub (#27022 ) * Support runs/ * Upload runs folder as part of push to hub * Add a test * Add to test deps * Update with proposed solution from Slack * Ensure that repo gets deleted in tests	2023-10-26 12:13:19 -04:00
Wang, Yi	8f609ab9e0	enable optuna multi-objectives feature (#25969 ) * enable optuna multi-objectives feature Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update hpo doc * update docstring Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * extend direction to List[str] type Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Update src/transformers/integrations/integration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-12 18:01:22 +01:00
Abhilash Majumder	70a98024b1	Patch with accelerate xpu (#25714 ) * patch with accelerate xpu * patch with accelerate xpu * formatting * fix tests * revert ruff unrelated fixes * revert ruff unrelated fixes * revert ruff unrelated fixes * fix test * review fixes * review fixes * black fixed * review commits * review commits * style fix * use pytorch_utils * revert markuplm test	2023-09-05 15:41:42 +01:00
Zach Mueller	be0e189bd3	Revert frozen training arguments (#25903 ) * Revert frozen training arguments * TODO	2023-09-01 11:24:12 -04:00
Zach Mueller	ca51499248	Make training args fully immutable (#25435 ) * Make training args fully immutable * Working tests, PyTorch * In test_trainer * during testing * Use proper dataclass way * Fix test * Another one * Fix tf * Lingering slow * Exception * Clean	2023-08-15 11:47:47 -04:00
Sylvain Gugger	baf1daa58e	Migrate Trainer from `Repository` to `upload_folder` (#25095 ) * First draft * Deal with progress bars * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Address review comments * Forgot one * Pin hf_hub * Add argument for push all and fix tests * Fix tests * Address review comments --------- Co-authored-by: Lucain <lucainp@gmail.com>	2023-08-07 17:47:22 +02:00
Zach Mueller	3b734f5042	Add dispatch_batches to training arguments (#25038 ) * Dispatch batches * Copy items	2023-07-24 09:27:19 -04:00
statelesshz	9c875839c0	add ascend npu accelerator support (#24879 ) * Add Ascend NPU accelerator support * fix style warining	2023-07-18 08:20:32 -04:00
Zach Mueller	0284285501	Fix pad across processes dim in trainer and not being able to set the timeout (#24775 ) * dim, and rm copy * Don't rm copy for now * Oops * pad index * Should be a working test * Tickle down ddp timeout * Put fix back in now that testing locally is done * Better comment specifying timeout Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-12 10:01:51 -04:00
Xiaoli Wang	239ace152b	Fix TypeError: Object of type int64 is not JSON serializable (#24340 ) * Fix TypeError: Object of type int64 is not JSON serializable * Convert numpy.float64 and numpy.int64 to float and int for json serialization * Black reformatted examples/pytorch/token-classification/run_ner_no_trainer.py * * make style	2023-06-27 12:15:49 +01:00
Alex Hall	b6295b26c5	Refactor hyperparameter search backends (#24384 ) * Refactor hyperparameter search backends * Simpler refactoring without abstract base class * black * review comments: specify name in class use methods instead of callable class attributes name constant better * review comments: safer bool checking, log multiple available backends * test ALL_HYPERPARAMETER_SEARCH_BACKENDS vs HPSearchBackend in unit test, not module. format with black. * copyright	2023-06-22 14:28:25 -04:00
Zach Mueller	ebd94b0f6f	🚨🚨🚨 Replace DataLoader logic for Accelerate in Trainer, remove unneeded tests 🚨🚨🚨 (#24028 ) * Working integration * Fix failing test * Revert label host logic * Bring it back!	2023-06-12 11:23:37 -04:00
Tim Dettmers	796162c512	Paged Optimizer + Lion Optimizer for Trainer (#23217 ) * Added lion and paged optimizers and made original tests pass. * Added tests for paged and lion optimizers. * Added and fixed optimizer tests. * Style and quality checks. --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-05-24 12:53:28 +02:00
Maxime Méloux	9b435204b1	Add Trainer support for ReduceLROnPlateau (#23010 ) * Add Trainer support for ReduceLROnPlateau Fixes #16503 * Remove training argument and add default instance --------- Co-authored-by: mmeloux <maxime.meloux@loria.fr>	2023-04-28 09:17:30 -04:00
Zachary Mueller	03462875cc	Introduce `PartialState` as the device handler in the `Trainer` (#22752 ) * Use accelerate for device management * Add accelerate to setup Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-17 15:09:45 -04:00
Stas Bekman	1306b7d3ae	[tests] switch to torchrun (#22712 )	2023-04-12 08:25:45 -07:00
Viktor Scherbakov	871598be55	Implemented safetensors checkpoints save/load for Trainer (#22498 ) * implemented safetensors save/load * remove duplicated file * added tests * more tests * style fix * fix tf tests * change to list comprehension Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * review fixes + safe load for sharded checkpoint * style fix * remove rogue import * remove partial to avoid undefined exception * use naming alias instead of safetensors.torch * fix safe sharding in tests * grammar Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update docs Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * minor corrections * style --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-04-04 09:05:04 -04:00
Yih-Dar	5110e5748e	🔥py38 + torch 2 🔥🔥🔥🚀 (#22204 ) * py38 + torch 2 * increment cache versions --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 22:59:23 +01:00
Dean Wyatte	2f4cdd97f5	handle numpy inputs in whole word mask data collator (#22032 )	2023-03-10 10:50:29 -05:00
Lucain	923110b74f	Remove set_access_token usage + fail tests if FutureWarning (#22051 ) * Remove set_access_token usage + fail tests if FutureWarning * do not fail on FutureWarning in CI --------- Co-authored-by: testbot <lucainp@hf.co>	2023-03-09 09:23:48 -05:00
Sylvain Gugger	b29e2dcaff	Fix flaky test for log level (#21776 ) * Fix flaky test for log level * Fix other flaky test	2023-02-28 16:24:14 -05:00
ydshieh	aa3787c8f0	Skip test_log_level for now	2023-02-23 12:11:20 +01:00
Sylvain Gugger	b19d64d852	Respect documentation on passive log level (#21700 ) * Respect documentation on passive log level * Fix test and set log level in examples * Add doc	2023-02-22 09:39:18 +01:00
Aaron Gokaslan	5e8c8eb5ba	Apply ruff flake8-comprehensions (#21694 )	2023-02-22 09:14:54 +01:00
Sylvain Gugger	cc8407522a	Fix epoch number when resuming training (#21478 )	2023-02-06 19:34:34 -05:00
Sylvain Gugger	6f79d26442	Update quality tooling for formatting (#21480 ) * Result of black 23.1 * Update target to Python 3.7 * Switch flake8 to ruff * Configure isort * Configure isort * Apply isort with line limit * Put the right black version * adapt black in check copies * Fix copies	2023-02-06 18:10:56 -05:00
jeffhataws	c59d71b282	Add AWS Neuron torchrun support (#20806 ) * Add XLA torchrun support * Clarify that currently DDP doesn't work with torch.distributed XLA backend yet * Enable DDP with torchrun and XLA (now available in PT-XLA 1.13) * Add check for AWS Neuron availability and AWS Neuron specific compiler flag * Change the new test's name to TestTrainerDistributedNeuronCore * Remove "assert" and replace raised exception * Remove compiler flag as it is optional. If needed, will be another PR. * Use TORCHELASTIC_RUN_ID to determine whether torchrun is used	2023-01-18 11:21:19 -05:00
Sylvain Gugger	05e72aa0c4	Adapt repository creation to latest hf_hub (#21158 ) * Adapt repository creation to latest hf_hub * Update all examples * Fix other tests, add Flax examples * Address review comments	2023-01-18 11:14:00 -05:00
Yih-Dar	b3a0aad37d	Fix past CI (#20967 ) * Fix for Past CI * make style * clean up * unindent 2 blocks Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-01-12 18:04:21 +01:00
Thomas-MMJ	7ef3f19c3c	fix typo output not ouput in bitsandbytes trainer test (#20839 ) fix typo output not ouput typo was causing an error on pytest collection	2022-12-20 03:16:26 -05:00
Sylvain Gugger	08b4621899	Repurpose torchdynamo training args towards torch._dynamo (#20498 ) * Repurpose torchdynamo training args towards torch._dynamo * Add doc	2022-11-30 11:10:45 -05:00
Stas Bekman	a547d5bda5	[AnyPrecisionAdamW] test fix (#20454 )	2022-11-25 09:02:10 -08:00
atturaioe	84c9cc6d15	Add AnyPrecisionAdamW optimizer (#18961 ) * Add AnyPrecisionAdamW optimizer * Add optim_args argument to TrainingArgs * Add tests for AnyPrecisionOptimizer * Change AnyPrecisionAdam default params to float32 * Move default_anyprecision_kwargs in trainer test * Rename AnyPrecisionAdamW	2022-11-18 09:27:08 -05:00
Alexander Markov	610acc5ae9	Data collator for token classification pads labels column when receives pytorch tensors (#20244 ) * token cls data_collator pads labels column * remove walrus operator for code quality * remove redundat space * remove comment that was fixed * PR comments fix Co-authored-by: Alexander Markov <amarkov.me@gmail.com>	2022-11-16 12:18:46 -05:00
Yih-Dar	16242e1bf0	Run `torchdynamo` tests (#19056 ) * Enable torchdynamo tests * make style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-09-15 11:10:16 -07:00
Younes Belkada	1ccd2515ed	small change (#18584 )	2022-08-12 20:04:38 +02:00
Wei	7ea6ccc2b3	Enable torchdynamo with torch_tensorrt(fx path) (#17765 ) * enable fx2trt * Update perf_train_gpu_one.mdx * Update perf_train_gpu_one.mdx * add lib check * update * format * update * fix import check * fix isort * improve doc * refactor ctx manager * fix isort * black format * isort fix * fix format * update args * update black * cleanups * Update perf_train_gpu_one.mdx * code refactor * code refactor to init * remove redundancy * isort * replace self.args with args Co-authored-by: Stas Bekman <stas@stason.org>	2022-07-13 12:43:28 -04:00
jianan-gu	b7d8bd378c	Enhance IPEX integration in Trainer (#18072 ) * enhance ipex import * refine codes * refine style * add link * style Co-authored-by: Stas Bekman <stas@stason.org>	2022-07-11 21:34:09 -07:00
neverix	8b332a6a16	Make predict() close progress bars after finishing (#17952 ) (#18078 ) * Make Trainer.predict call on_evaluate (#17952) * Add on_predict * Small fix * Small and different fix * Add tests	2022-07-08 16:44:24 -04:00
Yih-Dar	664688b94f	higher atol to avoid flaky trainer test failure (#17979 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-07-01 17:53:16 +02:00
Yih-Dar	fe14046421	skip some ipex tests until it works with torch 1.12 (#17964 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-30 18:05:29 +02:00
Yih-Dar	f717d47fe0	Fix `test_number_of_steps_in_training_with_ipex` (#17889 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-06-28 08:55:02 +02:00
Lysandre Debut	6a5272b205	Prepare transformers for v0.8.0 huggingface-hub release (#17716 ) * Prepare CI for v0.8.0 * pin hfh (revert before merge) * Revert "pin hfh (revert before merge)" This reverts commit `a0103140e1`. * Test rc3 * Test latest rc * Unpin to the RC Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-06-21 11:51:18 -04:00
Stas Bekman	a2d34b7c04	deprecate is_torch_bf16_available (#17738 ) * deprecate is_torch_bf16_available * address suggestions	2022-06-20 08:40:11 -04:00
jianan-gu	3b29c9fdb7	Extend Transformers Trainer Class to Enable PyTorch Torchscript for Inference (#17153 ) * add jit mode option and model wrap * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refine code * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add ut and refine code * code refine * refine code * add inference doc * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add cpu inference performance doc * Update perf_infer_cpu.mdx * Update perf_infer_cpu.mdx * Update performance.mdx * Update _toctree.yml * refine jit func naming * Update _toctree.yml * Delete perf_infer_gpu_one.mdx * Update perf_infer_cpu.mdx * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * add none check before jit * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-14 07:56:47 -04:00
jianan-gu	34097b3304	Extend Transformers Trainer Class to Enable CPU AMP and Integrate Intel Extension for PyTorch (#17138 ) * init PR * fix import ipex * minor fix on bf16 * refine optimizer * refine args notes * refine code * refine ipex optimize args * refine half_precision_backend * black format * isort format * isort format files * flake8 format * doc builder format * refine codes * remove jit and optim bits * black preview format * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refine code * refine notes * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * code refine * add ipex ut * add performance cpu doc * link to the cpu doc from main perf doc * install ipex into CI's docker * Update perf_train_cpu.mdx * Update docs/source/en/perf_train_cpu.mdx Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update perf_train_cpu.mdx * Update perf_train_cpu.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2022-06-08 09:41:57 -04:00
Animesh Jain	897a8dd89f	Support compilation via Torchdynamo, AOT Autograd, NVFuser (#17308 ) * Support compilation via Torchdynamo, AOT Autograd, NVFuser * Address comments * Lint * Stas comments - missing quality test * Lintere * Quality test * Doc lint * Reset CUDA peak mem * Add CustomTrainer * require a single gpu Co-authored-by: Stas Bekman <stas@stason.org>	2022-05-25 11:16:09 -04:00
Stas Bekman	3601aa8fc9	[tests] fix copy-n-paste error (#17312 ) * [tests] fix copy-n-paste error * fix	2022-05-18 16:00:47 -07:00
Yih-Dar	66b3e106a1	Make TrainerHyperParameterSigOptIntegrationTest slow test (#17288 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-05-16 14:18:09 -04:00
Antoni Baum	47412c7d43	Ensure tensors are at least 1d for pad and concat (#17179 ) * Ensure tensors are at least 1d for pad and concat * Compatibility * Fix * Fix * Add test * Retrigger CI * Consistency with master * Retrigger CI	2022-05-11 13:19:08 -04:00
Antoni Baum	edcc66d27c	Remove unnecessary columns for all dataset types in `Trainer` (#17166 ) * Remove unneeded columns for IterableDataset * Add test * Update trainer tests * Edit docstring * Lint * Apply feedback * Apply feedback	2022-05-11 11:11:26 -04:00
Zachary Mueller	2fbb237967	Add the auto_find_batch_size capability from Accelerate into Trainer (#17068 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> - Adds auto_batch_size finder - Moves training loop to an inner training loop	2022-05-09 12:29:18 -04:00
Sylvain Gugger	1c9fcd0e04	Fix RNG reload in resume training from epoch checkpoint (#17055 ) * Fix RNG reload in resume training from epoch checkpoint * Fix test	2022-05-03 10:31:24 -04:00
Sylvain Gugger	a8fa2f91f4	Make Trainer compatible with sharded checkpoints (#17053 ) * Make Trainer compatible with sharded checkpoints * Add doc	2022-05-03 09:55:10 -04:00
Manuel R. Ciosici	3104036e7f	Add support for bitsandbytes (#15622 ) * Add initial BNB integration * fixup! Add initial BNB integration * Add bnb test decorator * Update Adamw8bit option name * Use the full bnb package name * Overide bnb for all embedding layers * Fix package name * Formatting * Remove unnecessary import * Update src/transformers/trainer.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename AdamwBNB optimizer option * Add training test checking that bnb memory utilization is lower * fix merge * fix merge; fix + extend new test * cleanup * expand bnb * move all require_* candidates to testing_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-04-19 16:01:29 -04:00
code-review-doctor	a2392415e9	Some tests misusing assertTrue for comparisons fix (#16771 ) * Fix issue avoid-misusing-assert-true found at https://codereview.doctor * fix tests * fix tf Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-19 14:44:08 +02:00
Sander Land	d7c8ce57d4	Avoid accessing .dataset of a DataLoader in Trainer (#16451 ) * Avoid accessing .dataset of a dataloader * style * fix * cleaning up, reverting some misunderstandings * black * add train_dataset argument to get_train_dataloader, and fix other instances of length checks * flake8 * address comments * fix bug * cleanup * add test * Update tests/trainer/test_trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * under torch * merge * stylistic suggestion Co-authored-by: Sander Land <sander@chatdesk.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-29 15:00:18 -04:00
Sylvain Gugger	4975002df5	Reorganize file utils (#16264 ) * Split file_utils in several submodules * Fixes * Add back more objects * More fixes * Who exactly decided to import that from there? * Second suggestion to code with code review * Revert wront move * Fix imports * Adapt all imports * Adapt all imports everywhere * Revert this import, will fix in a separate commit	2022-03-23 10:26:33 -04:00
David Hall	5b7dcc7342	Seed _get_train_sampler's generator with arg seed to improve reproducibility (#15961 ) * Seed get_train_sampler's generator with arg seed to improve reproducibility and make the world_size<=1 code path more similar to the others * move test file into trainer test explicitly * dumb typo * make style lint happy * per discussion, switch to data_seed * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-08 13:45:41 -05:00
Lysandre Debut	29c10a41d0	[Test refactor 1/5] Per-folder tests reorganization (#15725 ) * Per-folder tests reorganization Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-02-23 15:46:28 -05:00

198 Commits