transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-14 10:08:29 +06:00

Author	SHA1	Message	Date
Sylvain Gugger	115d97dd2f	Remove subclass for sortish sampler (#9907 ) * Remove subclass for sortish sampler * Use old Seq2SeqTrainer in script * Styling	2021-02-01 08:06:32 -05:00
Stas Bekman	6bab83683b	fix logger format for non-main process (#9911 )	2021-02-01 03:08:12 -05:00
Stas Bekman	6bf94bc0b6	correctly handle mt5 (#9879 )	2021-01-29 08:11:22 -08:00
Sylvain Gugger	b4e559cfa1	Deprecate model_path in Trainer.train (#9854 )	2021-01-28 08:32:46 -05:00
Sylvain Gugger	f2fabedbab	Setup logging with a stdout handler (#9816 )	2021-01-27 03:39:11 -05:00
Magdalena Biesialska	8f6c12d306	Fix fine-tuning translation scripts (#9809 )	2021-01-26 11:30:31 -05:00
Andrea Cappelli	10e5f28212	Improve pytorch examples for fp16 (#9796 ) * Pad to 8x for fp16 multiple choice example (#9752) * Pad to 8x for fp16 squad trainer example (#9752) * Pad to 8x for fp16 ner example (#9752) * Pad to 8x for fp16 swag example (#9752) * Pad to 8x for fp16 qa beam search example (#9752) * Pad to 8x for fp16 qa example (#9752) * Pad to 8x for fp16 seq2seq example (#9752) * Pad to 8x for fp16 glue example (#9752) * Pad to 8x for fp16 new ner example (#9752) * update script template #9752 * Update examples/multiple-choice/run_swag.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/question-answering/run_qa.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/question-answering/run_qa_beam_search.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve code quality #9752 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-26 04:47:07 -05:00
Sylvain Gugger	caf4abf768	Auto-resume training from checkpoint (#9776 ) * Auto-resume training from checkpoint * Update examples/text-classification/run_glue.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Roll out to other examples Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-25 12:03:51 -05:00
Sylvain Gugger	411c582109	Fixes to run_seq2seq and instructions (#9734 ) * Fixes to run_seq2seq and instructions * Add more defaults for summarization	2021-01-22 10:03:57 -05:00
Sylvain Gugger	5f80c15ef5	Fix memory regression in Seq2Seq example (#9713 ) * Fix memory regression in Seq2Seq example * Fix test and properly deal with -100 * Easier condition with device safety * Patch for MBartTokenzierFast	2021-01-21 12:05:46 -05:00
Sylvain Gugger	e4c06ed664	New run_seq2seq script (#9605 ) * New run_seq2seq script * Add tests * Mark as slow * Update examples/seq2seq/run_seq2seq.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/data/data_collator.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update src/transformers/data/data_collator.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Address review comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-01-19 15:22:17 -05:00
Sylvain Gugger	97b787fb4e	Fix old Seq2SeqTrainer (#9675 )	2021-01-19 09:56:25 -05:00
Stas Bekman	c60e0e1ee4	deepspeed + grad acumm (#9622 )	2021-01-15 10:12:26 -08:00
Sylvain Gugger	329fe2746a	Upstream (and rename) sortish sampler (#9574 ) * Upstream (and rename) sortish sampler * Use proper sampler * Update src/transformers/trainer_pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-14 10:38:14 -05:00
Stas Bekman	2df34f4aba	[trainer] deepspeed integration (#9211 ) * deepspeed integration * style * add test * ds wants to do its own backward * fp16 assert * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style * for clarity extract what args are being passed to deepspeed * introduce the concept of self.wrapped_model * s/self.wrapped_model/self.model_wrapped/ * complete transition to self.wrapped_model / self.model * fix * doc * give ds its own init * add custom overrides, handle bs correctly * fix test * clean up model_init logic, fix small bug * complete fix * collapse --deepspeed_config into --deepspeed * style * start adding doc notes * style * implement hf2ds optimizer and scheduler configuration remapping * oops * call get_num_training_steps absolutely when needed * workaround broken auto-formatter * deepspeed_config arg is no longer needed - fixed in deepspeed master * use hf's fp16 args in config * clean * start on the docs * rebase cleanup * finish up --fp16 * clarify the supported stages * big refactor thanks to discovering deepspeed.init_distributed * cleanup * revert fp16 part * add checkpoint-support * more init ds into integrations * extend docs * cleanup * unfix docs * clean up old code * imports * move docs * fix logic * make it clear which file it's referring to * document nodes/gpus * style * wrong format * style * deepspeed handles gradient clipping * easier to read * major doc rewrite * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * docs * switch to AdamW optimizer * style * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * clarify doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-12 19:05:18 -08:00
Patrick von Platen	eef66035a2	[PyTorch Bart] Split Bart into different models (#9343 ) * first try * remove old template * finish bart * finish mbart * delete unnecessary line * init pegasus * save intermediate * correct pegasus * finish pegasus * remove cookie cutter leftover * add marian * finish blenderbot * replace in file * correctly split blenderbot * delete "old" folder * correct "add statement" * adapt config for tf comp * correct configs for tf * remove ipdb * fix more stuff * fix mbart * push pegasus fix * fix mbart * more fixes * fix research projects code * finish docs for bart, mbart, and marian * delete unnecessary file * correct attn typo * correct configs * remove pegasus for seq class * correct peg docs * correct peg docs * finish configs * further improve docs * add copied from statements to mbart * fix copied from in mbart * add copy statements to marian * add copied from to marian * add pegasus copied from * finish pegasus * finish copied from * Apply suggestions from code review * make style * backward comp blenderbot * apply lysandres and sylvains suggestions * apply suggestions * push last fixes * fix docs * fix tok tests * fix imports code style * fix doc	2021-01-05 22:00:05 +01:00
Sylvain Gugger	a1cb6e9866	Adapt to new name of `label_smoothing_factor` training arg (#9282 )	2020-12-23 11:05:21 -05:00
Sylvain Gugger	e6c1f1cad8	Revert renaming in finetune_trainer (#9262 )	2020-12-22 15:42:34 -05:00
Manuel Romero	37d6fb5d04	Fix link to bertabs/README.md (#9255 )	2020-12-22 11:41:23 -05:00
Sylvain Gugger	490b39e614	Seq2seq trainer (#9241 ) * Add label smoothing in Trainer * Add options for scheduler and Adafactor in Trainer * Put Seq2SeqTrainer in the main lib * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments and adapt scripts * Documentation * Move test not using script to tests folder Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-22 11:33:44 -05:00
Stas Bekman	f38c4ad302	better logging and help (#9203 )	2020-12-20 10:28:28 -08:00
Sylvain Gugger	1198ba8fba	Add timing inside Trainer (#9196 ) * Add timing inside Trainer * Fix tests * Add n_objs for train * Sort logs	2020-12-18 15:10:39 -05:00
Stas Bekman	f06d0fadc9	[trainer] apex fixes and tests (#9180 )	2020-12-17 16:49:11 -08:00
Stas Bekman	63841c559b	add tests for the new sharded ddp fairscale integration (#9177 )	2020-12-17 14:24:03 -08:00
Sylvain Gugger	9a67185344	Experimental support for fairscale ShardedDDP (#9139 ) * Experimental stupport for fairscale ShardedDDP * Add import error if fairscale not available * Address review comments * Fix seq2seq trainer	2020-12-16 13:47:48 -05:00
Stas Bekman	14c79c3e31	native amp leak fix landed in 1.7.1 (#9115 ) update README with good news that the leak fix has been applied to pytorch-1.7.1.	2020-12-15 09:10:41 -05:00
Stas Bekman	c19d04623e	[finetune_trainer] enhancements and fixes (#9042 ) * trainer and finetune_trainer enhancements and fixes * add fallback default * move the fixing of incorrect keys back into finetune trainer * s/eval/val/ to match the split * trainer can now use a different prefix than eval_ for metrics * document new arg * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * use 'eval' as the default for metric_key_prefix * complete adjust var names + disambiguate * fix logger * add clarifying comment * add clarifying comment * style * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/trainer.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * complete removal of optional for metric_key_prefix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-14 17:45:33 -08:00
Sylvain Gugger	783d7d2629	Reorganize examples (#9010 ) * Reorganize example folder * Continue reorganization * Change requirements for tests * Final cleanup * Finish regroup with tests all passing * Copyright * Requirements and readme * Make a full link for the documentation * Address review comments * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Add symlink * Reorg again * Apply suggestions from code review Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Adapt title * Update to new strucutre * Remove test * Update READMEs Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-12-11 10:07:02 -05:00
Stas Bekman	df311a5ccf	[seq2seq] document the caveat of leaky native amp (#8930 ) * document the caveat of leaky native amp * Update examples/seq2seq/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-12-04 15:43:35 -08:00
Stas Bekman	4c3d98dddc	[s2s finetune_trainer] add instructions for distributed training (#8884 )	2020-12-03 16:05:55 -08:00
Stas Bekman	379005c9d2	start using training_args.parallel_mode (#8882 )	2020-12-01 11:40:36 -08:00
Stas Bekman	7f34d75780	[s2s trainer] fix DP mode (#8823 ) * fix DP case on multi-gpu * make executable * test all 3 modes * use the correct check for distributed * dp doesn't need a special case * restore original name * cleanup	2020-11-30 12:55:56 -08:00
Sylvain Gugger	5530299096	Remove deprecated `evalutate_during_training` (#8852 ) * Remove deprecated `evalutate_during_training` * Update src/transformers/training_args_tf.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-30 11:12:15 -05:00
Stas Bekman	ddf3c64654	potpurri of small fixes (#8807 )	2020-11-26 14:06:27 -08:00
Patrick von Platen	8f07f5c44b	Revert "finetune.py: specifying generation min_length (#8478 )" (#8805 ) This reverts commit `5aa361f3e5`.	2020-11-26 20:12:01 +01:00
Daniel Khashabi	5aa361f3e5	finetune.py: specifying generation min_length (#8478 )	2020-11-26 12:33:02 +05:30
Stas Bekman	1e45bef0a7	[trainer] make generate work with multigpu (#8716 ) * make generate work with multigpu * better fix - thanks @sgugger	2020-11-23 10:57:27 -08:00
Stas Bekman	0ad45e108d	[examples/seq2seq] fix PL deprecation warning (#8577 ) * fix deprecation warning * fix	2020-11-19 21:46:04 +01:00
Sylvain Gugger	4208f496ee	Better filtering of the model outputs in Trainer (#8633 ) * Better filtering of the model outputs in Trainer * Fix examples tests * Add test for Lysandre	2020-11-19 10:43:15 -05:00
Stas Bekman	d86d57faa3	[s2s] distillation apex breaks return_dict obj (#8631 ) * apex breaks return_dict obj * style	2020-11-18 12:51:29 -08:00
Stas Bekman	cdf1b7ae82	fix to adjust for #8530 changes (#8612 )	2020-11-18 10:25:00 -05:00
Stas Bekman	2819da02f7	[s2s] broken test (#8613 )	2020-11-18 10:15:53 -05:00
Sylvain Gugger	dd52804f5f	Remove deprecated (#8604 ) * Remove old deprecated arguments Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr> * Remove needless imports * Fix tests Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-11-17 15:11:29 -05:00
Stas Bekman	f0435f5a61	these should run fine on multi-gpu (#8582 )	2020-11-17 14:00:41 -05:00
Julien Chaumond	042a6aa777	Tokenizers: ability to load from model subfolder (#8586 ) * <small>tiny typo</small> * Tokenizers: ability to load from model subfolder * use subfolder for local files as well * Uniformize model shortcut name => model id * from s3 => from huggingface.co Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>	2020-11-17 08:58:45 -05:00
Sylvain Gugger	c89bdfbe72	Reorganize repo (#8580 ) * Put models in subfolders * Styling * Fix imports in tests * More fixes in test imports * Sneaky hidden imports * Fix imports in doc files * More sneaky imports * Finish fixing tests * Fix examples * Fix path for copies * More fixes for examples * Fix dummy files * More fixes for example * More model import fixes * Is this why you're unhappy GitHub? * Fix imports in conver command	2020-11-16 21:43:42 -05:00
Sylvain Gugger	1073a2bde5	Switch `return_dict` to `True` by default. (#8530 ) * Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Run on the real suite * Fix slow tests	2020-11-16 11:43:00 -05:00
Thomas Wolf	f4e04cd2c6	[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073 ) * Fixing roberta for slow-fast tests * WIP getting equivalence on pipelines * slow-to-fast equivalence - working on question-answering pipeline * optional FAISS tests * Pipeline Q&A * Move pipeline tests to their own test job again * update tokenizer to add sequence id methods * update to tokenizers 0.9.4 * set sentencepiecce as optional * clean up squad * clean up pipelines to use sequence_ids * style/quality * wording * Switch to use_fast = True by default * update tests for use_fast at True by default * fix rag tokenizer test * removing protobuf from required dependencies * fix NER test for use_fast = True by default * fixing example tests (Q&A examples use slow tokenizers for now) * protobuf in main deps extras["sentencepiece"] and example deps * fix protobug install test * try to fix seq2seq by switching to slow tokenizers for now * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-15 22:50:59 +01:00
Julien Plu	27b3ff316a	Try to understand and apply Sylvain's comments (#8458 )	2020-11-12 13:43:00 -05:00
Sumithra Bhakthavatsalam	81ebd70671	[s2s] distill t5-large -> t5-small (#8376 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-11 17:58:45 -05:00


216 Commits