Commit Graph

248 Commits

Each entry below lists: Author · SHA1 · Message · Date
Sylvain Gugger
acc3bd9d2a
Enforce string-formatting with f-strings (#10980)
* First third

* Styling and fix mistake

* Quality

* All the rest

* Treat %s and %d

* typo

* Missing )

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-31 10:00:27 -04:00
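
As an illustration of the rewrite this PR enforces repo-wide — a hypothetical before/after (the variable names and message are invented for the example):

```python
name, shards = "t5-small", 3

# Before: old-style % interpolation and str.format
print("Loading %s (%d shards)" % (name, shards))
print("Loading {} ({} shards)".format(name, shards))

# After: the f-string form this PR enforces
print(f"Loading {name} ({shards} shards)")
```
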
Philipp Schmid
3e09d813aa
[examples/s2s] added py7zr dep (#10971)
* added py7zr

* comment out check_min for sagemaker test

* added min version again
2021-03-30 23:17:12 +05:30
Eliza Szczechla
9f8fa4e973
Use DataCollatorForSeq2Seq in run_summarization in all cases (#10856)
Co-authored-by: Eliza <eliza@habanero.tiger.com.pl>
2021-03-22 15:05:39 -04:00
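
A minimal sketch of what using `DataCollatorForSeq2Seq` in all cases looks like, assuming a T5 checkpoint (the argument values are illustrative):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, DataCollatorForSeq2Seq

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Pads inputs dynamically per batch and pads labels with -100 so padding is
# ignored by the loss; passing the model lets the collator also derive
# decoder_input_ids from the labels.
data_collator = DataCollatorForSeq2Seq(
    tokenizer,
    model=model,
    label_pad_token_id=-100,
)
```
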
Sylvain Gugger
946400fb68
Expand a bit the presentation of examples (#10799)
* Expand a bit the presentation of examples

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-19 10:06:08 -04:00
Stas Bekman
9352b5151a
[examples/seq2seq/README.md] fix t5 examples (#10734)
* [examples/seq2seq] fix t5 examples

This PR:
* fixes T5 examples to include `--source_prefix` - it's **not** optional. If you give it a try you will see that you get ~10x worse BLEU scores without it: w/ `27.6849`, w/o `2.374`
* added a normal translation example w/o the peculiarities of MBart and T5
* reduces the default max samples to 50 so it's much faster to test quickly

summarization seems to be broken for t5 score-wise: https://github.com/huggingface/transformers/issues/10733

@sgugger

* specify explicitly the t5 models requiring the special handling

* one more

* update the t5 summarization example to use cnn_dailymail

* move `max_*_samples` into the top level README.md

* better wording

* better wording
2021-03-18 09:55:39 -07:00
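
The substance of the fix: t5-* checkpoints were pretrained with a task prefix, so preprocessing has to prepend one. A sketch assuming wmt16-style `{"translation": {"en": ..., "ro": ...}}` records (function name and lengths are illustrative):

```python
# Passed on the command line as --source_prefix "translate English to Romanian: "
prefix = "translate English to Romanian: "

def preprocess_function(examples, tokenizer, max_length=128):
    # Omitting the prefix is what produces the ~27.7 vs ~2.4 BLEU gap above.
    inputs = [prefix + ex["en"] for ex in examples["translation"]]
    targets = [ex["ro"] for ex in examples["translation"]]
    model_inputs = tokenizer(inputs, max_length=max_length, truncation=True)
    labels = tokenizer(targets, max_length=max_length, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```
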
Lysandre
1b5ce1e63b Development on v4.5.0dev0 2021-03-16 11:41:15 -04:00
Lysandre
c988db5af2 Release v4.4.0 2021-03-16 11:33:35 -04:00
Sylvain Gugger
4c379daf64
Add minimum version check in examples (#10724)
* Add minimum version check in examples

* Style

* No need for new line maybe?

* Add helpful comment
2021-03-15 19:29:54 -04:00
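
The check this PR adds near the top of every example script, so a script taken from a newer branch fails fast on an older install:

```python
from transformers.utils import check_min_version

# Raises if the installed transformers is older than the examples expect
# (the version string is illustrative; it tracks the branch's dev version).
check_min_version("4.5.0.dev0")
```
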
Théo Matussière
6f840990a7
split seq2seq script into summarization & translation (#10611)
* split seq2seq script, update docs

* needless diff

* fix readme

* remove test diff

* s/summarization/translation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* cr

* fix arguments & better mbart/t5 refs

* copyright

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* reword readme

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* s/summarization/translation

* short script names

* fix tests

* fix isort, include mbart doc

* delete old script, update tests

* automate source prefix

* automate source prefix for translation

* s/translation/trans

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* fix script name (short version)

* typos

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* exact parameter

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* remove superfluous source_prefix calls in docs

* rename scripts & warn for source prefix

* black

* flake8

Co-authored-by: theo <theo@matussie.re>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-03-15 09:11:42 -04:00
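
Among the changes, the renamed scripts warn when a T5 checkpoint is used without a source prefix instead of silently producing poor scores. A sketch of that guard (names and wording are approximations, not the verbatim source):

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Checkpoints pretrained with a task prefix.
T5_PREFIX_MODELS = ["t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"]

def warn_if_missing_prefix(model_name: str, source_prefix: Optional[str]) -> None:
    if model_name in T5_PREFIX_MODELS and not source_prefix:
        logger.warning(
            "You're running a t5 model but didn't provide a source prefix, "
            "e.g. --source_prefix 'summarize: '"
        )
```
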
Lysandre Debut
9fbb4cdc80
Specify minimum version for sacrebleu (#10662) 2021-03-11 13:45:06 -05:00
ArvidYin
27d9e05ce2
Update README.md (#10647)
correct spelling error: 'nether'
2021-03-11 08:58:04 -05:00
Bhadresh Savani
dfd16af832
Added max_sample_ arguments (#10551)
* reverted changes of logging and saving metrics

* added max_sample arguments

* fixed code

* white space diff

* reformatting code

* reformatted code
2021-03-08 13:57:10 -05:00
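
What the new `max_*_samples`-style arguments boil down to — capping each split for fast debugging runs (dataset name and cap are illustrative):

```python
from datasets import load_dataset

train_dataset = load_dataset("cnn_dailymail", "3.0.0", split="train")

# None means "use the full split"; a small cap makes smoke tests fast.
max_train_samples = 50
if max_train_samples is not None:
    train_dataset = train_dataset.select(range(max_train_samples))
```
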
Stas Bekman
e6ce636e02
fix nltk lookup (#10585) 2021-03-07 22:09:58 -08:00
Stas Bekman
88a951e3cc
offline mode for firewalled envs (#10407)
* offline mode start

* add specific values

* fix fallback

* add test

* better values check and range

* test that actually works

* document the offline mode

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more strict check

* cleaner test

* pt-only test

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
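
Usage sketch of the offline mode this PR documents: set the environment variables before transformers is imported, then rely on the local cache (checkpoint name illustrative):

```python
import os

# Must be set before the libraries read them (i.e. before import/use).
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_DATASETS_OFFLINE"] = "1"

from transformers import AutoModel

# Succeeds only if the files are already in the local cache or a local path.
model = AutoModel.from_pretrained("bert-base-uncased")
```
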
Stas Bekman
f52a15897b
[run_seq2seq.py] restore functionality: saving to test_generations.txt (#10428)
This PR restores the original functionality that for some reason was modified.

Fixes: https://github.com/huggingface/transformers/issues/10381

@sgugger
2021-02-27 08:21:50 -08:00
Stas Bekman
ee04b69822
[examples] better model example (#10427)
* refactors

* typo
2021-02-26 17:01:01 -08:00
Akmal
23e87c27be
Fix broken examples/seq2seq/README.md markdown (#10344) 2021-02-23 10:49:25 -05:00
Stas Bekman
622a8c5995
[trainer] add Trainer methods for metrics logging and saving (#10266)
* make logging and saving trainer built-in

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-22 13:02:53 -08:00
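
The built-ins this PR adds, sketched with an assumed already-configured `trainer`:

```python
train_result = trainer.train()
metrics = train_result.metrics

# New Trainer helpers: pretty-print the metrics and persist them as JSON
# (e.g. train_results.json) in the output directory.
trainer.log_metrics("train", metrics)
trainer.save_metrics("train", metrics)
```
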
Stas Bekman
f991daed18
defensive programming + expand/correct README (#10295) 2021-02-22 10:58:50 -08:00
Stas Bekman
97e688bc22
[Trainer] memory tracker metrics (#10225)
* memory tracker metrics

* go back to eval for some consistency

* handle no-gpu case

* deal with stackable eval calls

* restore callback order

* style

* simplify the API

* add test

* docs

* consistently use eval_ prefix

* improve docs

* Update src/transformers/trainer_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* rename method

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-18 09:27:32 -08:00
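
With the tracker enabled, stage metrics gain memory entries using the `eval_`-style prefixes mentioned above (key names such as `train_mem_gpu_alloc_delta` follow the PR's scheme). A sketch, assuming a configured `trainer`:

```python
metrics = trainer.train().metrics

# Memory deltas are reported per stage (init/train/eval) and per device.
for key, value in sorted(metrics.items()):
    if "_mem_" in key:
        print(f"{key}: {value}")
```
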
Zhang Cheng
df1b0fb54d
set tgt_lang of MBart Tokenizer for summarization (#10205) 2021-02-16 09:39:37 -05:00
Suraj Patil
1c8c2d9ab3
[WIP][examples/seq2seq] move old s2s scripts to legacy (#10136)
* move old s2s scripts to legacy

* add the tests back

* proper rename

* restore

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-15 10:48:02 -08:00
Stas Bekman
0b1f552a24
fix run_seq2seq.py; porting trainer tests to it (#10162)
* fix run_seq2seq.py; porting DeepSpeed tests to it

* unrefactor

* defensive programming

* defensive programming 2

* port the rest of the trainer tests

* style

* a cleaner scripts dir finder

* cleanup
2021-02-15 09:12:17 -08:00
Suraj Patil
f51188cbe7
[examples/run_s2s] remove task_specific_params and update rouge computation (#10133)
* fix rouge metrics and task specific params

* fix typo

* round metrics

* typo

* remove task_specific_params
2021-02-12 17:18:21 +05:30
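
The updated ROUGE computation, roughly as it stands after this PR (requires the `rouge_score` package; the example strings are invented):

```python
from datasets import load_metric

decoded_preds = ["the cat sat on the mat"]
decoded_labels = ["a cat was sitting on the mat"]

metric = load_metric("rouge")
result = metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)

# Keep the mid f-measure as a percentage, rounded for readability.
result = {key: round(value.mid.fmeasure * 100, 4) for key, value in result.items()}
print(result)
```
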
Suraj Patil
63fddcf69c
[examples/s2s] add test set predictions (#10085)
* add do_predict, pass eval_beams during eval

* update help

* apply suggestions from code review
2021-02-09 20:41:41 +05:30
Stas Bekman
781220acab
transition to new tests dir (#10080) 2021-02-08 12:41:52 -08:00
Stas Bekman
322037e842
[trainer] deepspeed bug fixes and tests (#10039)
* deepspeed bug fixes and tests

* manual wrap?
2021-02-08 09:44:02 -08:00
Olivier
ece6c51458
[s2s examples] Replace -100 token ids with the tokenizer pad_id for compute_metrics (#10046)
* replace -100 token ids with the tokenizer pad_id for compute_metrics

* fixed typo for label_ids
2021-02-08 10:08:16 -05:00
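
The core of the fix: -100 marks positions the loss ignores, but it is not a real token id, so it must be swapped back before decoding. A self-contained sketch (the id values are illustrative):

```python
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

labels = np.array([[644, 1566, 12, 1], [644, 1, -100, -100]])

# Replace the loss-masking value with the tokenizer's pad id, then decode.
labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
```
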
Stas Bekman
24db8cc329
Can't mix --fp16 and --device cpu (#10041) 2021-02-07 17:54:20 -08:00
Stas Bekman
769948fad2
json to jsonlines, and doc, and typo (#10043) 2021-02-07 17:51:34 -08:00
Stas Bekman
8ea412a86f
[examples] make run scripts executable (#10037)
* make executable

* make executable

* same for the template

* cleanup
2021-02-05 15:51:18 -08:00
Suraj Patil
1cd16512dc
[examples/seq2seq] support label smoothing (#9844)
* add prepare_decoder_input_ids_from_labels in s2s models

* support lbl smoothing and enc/emb freezing

* fix freezing

* use pad_token_id from config

* remove embed freezing and add warning

* prepare decoder_input_ids inside DataCollatorForSeq2Seq
2021-02-05 23:21:57 +05:30
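
The collator-side half of the change, paraphrased as a sketch (not the verbatim source): when the loss — including label smoothing — is computed outside the model, decoder inputs must be built from the labels up front.

```python
def add_decoder_inputs(batch, model):
    # New model hook added by this PR: shift labels into decoder_input_ids
    # so teacher forcing still works when the model isn't handed labels.
    if hasattr(model, "prepare_decoder_input_ids_from_labels"):
        batch["decoder_input_ids"] = model.prepare_decoder_input_ids_from_labels(
            labels=batch["labels"]
        )
    return batch
```
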
Sylvain Gugger
115d97dd2f
Remove subclass for sortish sampler (#9907)
* Remove subclass for sortish sampler

* Use old Seq2SeqTrainer in script

* Styling
2021-02-01 08:06:32 -05:00
Stas Bekman
6bab83683b
fix logger format for non-main process (#9911) 2021-02-01 03:08:12 -05:00
Stas Bekman
6bf94bc0b6
correctly handle mt5 (#9879) 2021-01-29 08:11:22 -08:00
Sylvain Gugger
b4e559cfa1
Deprecate model_path in Trainer.train (#9854) 2021-01-28 08:32:46 -05:00
Sylvain Gugger
f2fabedbab
Setup logging with a stdout handler (#9816) 2021-01-27 03:39:11 -05:00
Magdalena Biesialska
8f6c12d306
Fix fine-tuning translation scripts (#9809) 2021-01-26 11:30:31 -05:00
Andrea Cappelli
10e5f28212
Improve pytorch examples for fp16 (#9796)
* Pad to 8x for fp16 multiple choice example (#9752)

* Pad to 8x for fp16 squad trainer example (#9752)

* Pad to 8x for fp16 ner example (#9752)

* Pad to 8x for fp16 swag example (#9752)

* Pad to 8x for fp16 qa beam search example (#9752)

* Pad to 8x for fp16 qa example (#9752)

* Pad to 8x for fp16 seq2seq example (#9752)

* Pad to 8x for fp16 glue example (#9752)

* Pad to 8x for fp16 new ner example (#9752)

* update script template #9752

* Update examples/multiple-choice/run_swag.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/question-answering/run_qa.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/question-answering/run_qa_beam_search.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve code quality #9752

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-26 04:47:07 -05:00
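
The recurring change across these commits, sketched (checkpoint name illustrative): tensor cores are fastest when dimensions are multiples of 8, so under `--fp16` every padded batch is rounded up to the next multiple of 8.

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
fp16 = True  # stands in for training_args.fp16

data_collator = DataCollatorWithPadding(
    tokenizer,
    pad_to_multiple_of=8 if fp16 else None,
)
```
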
Sylvain Gugger
caf4abf768
Auto-resume training from checkpoint (#9776)
* Auto-resume training from checkpoint

* Update examples/text-classification/run_glue.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Roll out to other examples

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-25 12:03:51 -05:00
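
The pattern rolled out to the examples, assuming `training_args` and a configured `trainer`:

```python
import os
from transformers.trainer_utils import get_last_checkpoint

# If output_dir already contains checkpoint-* folders, resume from the latest.
last_checkpoint = None
if os.path.isdir(training_args.output_dir):
    last_checkpoint = get_last_checkpoint(training_args.output_dir)

train_result = trainer.train(resume_from_checkpoint=last_checkpoint)
```
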
Sylvain Gugger
411c582109
Fixes to run_seq2seq and instructions (#9734)
* Fixes to run_seq2seq and instructions

* Add more defaults for summarization
2021-01-22 10:03:57 -05:00
Sylvain Gugger
5f80c15ef5
Fix memory regression in Seq2Seq example (#9713)
* Fix memory regression in Seq2Seq example

* Fix test and properly deal with -100

* Easier condition with device safety

* Patch for MBartTokenzierFast
2021-01-21 12:05:46 -05:00
Sylvain Gugger
e4c06ed664
New run_seq2seq script (#9605)
* New run_seq2seq script

* Add tests

* Mark as slow

* Update examples/seq2seq/run_seq2seq.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/data/data_collator.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/data/data_collator.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Address review comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-01-19 15:22:17 -05:00
Sylvain Gugger
97b787fb4e
Fix old Seq2SeqTrainer (#9675) 2021-01-19 09:56:25 -05:00
Stas Bekman
c60e0e1ee4
deepspeed + grad acumm (#9622) 2021-01-15 10:12:26 -08:00
Sylvain Gugger
329fe2746a
Upstream (and rename) sortish sampler (#9574)
* Upstream (and rename) sortish sampler

* Use proper sampler

* Update src/transformers/trainer_pt_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-14 10:38:14 -05:00
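
After the upstreaming, the sampler is available to any Trainer through a single flag rather than a seq2seq-only subclass (a sketch; `output_dir` is illustrative):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    # Groups samples of similar length into batches to minimize padding;
    # this is the renamed "sortish sampler".
    group_by_length=True,
)
```
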
Stas Bekman
2df34f4aba
[trainer] deepspeed integration (#9211)
* deepspeed integration

* style

* add test

* ds wants to do its own backward

* fp16 assert

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* for clarity extract what args are being passed to deepspeed

* introduce the concept of self.wrapped_model

* s/self.wrapped_model/self.model_wrapped/

* complete transition to self.wrapped_model / self.model

* fix

* doc

* give ds its own init

* add custom overrides, handle bs correctly

* fix test

* clean up model_init logic, fix small bug

* complete fix

* collapse --deepspeed_config into --deepspeed

* style

* start adding doc notes

* style

* implement hf2ds optimizer and scheduler configuration remapping

* oops

* call get_num_training_steps absolutely when needed

* workaround broken auto-formatter

* deepspeed_config arg is no longer needed - fixed in deepspeed master

* use hf's fp16 args in config

* clean

* start on the docs

* rebase cleanup

* finish up --fp16

* clarify the supported stages

* big refactor thanks to discovering deepspeed.init_distributed

* cleanup

* revert fp16 part

* add checkpoint-support

* move more of the ds init into integrations

* extend docs

* cleanup

* unfix docs

* clean up old code

* imports

* move docs

* fix logic

* make it clear which file it's referring to

* document nodes/gpus

* style

* wrong format

* style

* deepspeed handles gradient clipping

* easier to read

* major doc rewrite

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* docs

* switch to AdamW optimizer

* style

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* clarify doc

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 19:05:18 -08:00
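
A minimal usage sketch of the integration, under stated assumptions: the config below is deliberately skeletal (real configs also set batch sizes, optimizer, and scheduler), and training is typically launched with the `deepspeed` launcher.

```python
import json

from transformers import TrainingArguments

# Skeletal, illustrative DeepSpeed config.
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}
with open("ds_config.json", "w") as f:
    json.dump(ds_config, f)

# Per this PR, a single --deepspeed flag takes the config file path
# (the separate --deepspeed_config flag was collapsed into it).
args = TrainingArguments(output_dir="out", fp16=True, deepspeed="ds_config.json")
```
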
Patrick von Platen
eef66035a2
[PyTorch Bart] Split Bart into different models (#9343)
* first try

* remove old template

* finish bart

* finish mbart

* delete unnecessary line

* init pegasus

* save intermediate

* correct pegasus

* finish pegasus

* remove cookie cutter leftover

* add marian

* finish blenderbot

* replace in file

* correctly split blenderbot

* delete "old" folder

* correct "add statement"

* adapt config for tf comp

* correct configs for tf

* remove ipdb

* fix more stuff

* fix mbart

* push pegasus fix

* fix mbart

* more fixes

* fix research projects code

* finish docs for bart, mbart, and marian

* delete unnecessary file

* correct attn typo

* correct configs

* remove pegasus for seq class

* correct peg docs

* correct peg docs

* finish configs

* further improve docs

* add copied from statements to mbart

* fix copied from in mbart

* add copy statements to marian

* add copied from to marian

* add pegasus copied from

* finish pegasus

* finish copied from

* Apply suggestions from code review

* make style

* backward comp blenderbot

* apply Lysandre's and Sylvain's suggestions

* apply suggestions

* push last fixes

* fix docs

* fix tok tests

* fix imports code style

* fix doc
2021-01-05 22:00:05 +01:00
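
The visible effect of the split: each architecture now has its own module and classes instead of subclassing a shared Bart implementation.

```python
# All of these previously routed through shared Bart code; after this PR
# each lives in its own model file.
from transformers import (
    BartForConditionalGeneration,
    BlenderbotForConditionalGeneration,
    MarianMTModel,
    MBartForConditionalGeneration,
    PegasusForConditionalGeneration,
)
```
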
Sylvain Gugger
a1cb6e9866
Adapt to new name of label_smoothing_factor training arg (#9282) 2020-12-23 11:05:21 -05:00
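
Usage after the rename (a sketch; the value 0.1 is illustrative):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    label_smoothing_factor=0.1,  # previously exposed as label_smoothing
)
```
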
Sylvain Gugger
e6c1f1cad8
Revert renaming in finetune_trainer (#9262) 2020-12-22 15:42:34 -05:00