transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-24 23:08:57 +06:00

Author	SHA1	Message	Date
Stas Bekman	769948fad2	json to jsonlines, and doc, and typo (#10043 )	2021-02-07 17:51:34 -08:00
Stas Bekman	8ea412a86f	[examples] make run scripts executable (#10037 ) * make executable * make executable * same for the template * cleanup	2021-02-05 15:51:18 -08:00
Suraj Patil	1cd16512dc	[examples/seq2seq] support label smoothing (#9844 ) * add prepare_decoder_input_ids_from_labels in s2s models * support lbl smoothing and enc/emb freezing * fix freezing * use pad_token_id from config * remove embed freezing and add warning * prepare decoder_input_ids inside DataCollatorForSeq2Seq	2021-02-05 23:21:57 +05:30
Suraj Patil	bca0dd5ee3	[run_clm.py] fix getting extention	2021-02-03 20:14:42 +05:30
Stas Bekman	d55e10beab	[research proj] [lxmert] rm bleach dependency (#9970 ) Looks like a vulnerability and it's not really used anywhere in the code, so just as well remove it completely from deps. https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/bleach/open	2021-02-03 05:24:40 -05:00
Patrick von Platen	538b3b4607	[Tokenizer Utils Base] Make pad function more flexible (#9928 ) * change tokenizer requirement * split line * Correct typo from list to str * improve style * make other function pretty as well * add comment * correct typo * add new test * pass tests for tok without padding token * Apply suggestions from code review	2021-02-02 10:35:27 +03:00
Sylvain Gugger	115d97dd2f	Remove subclass for sortish sampler (#9907 ) * Remove subclass for sortish sampler * Use old Seq2SeqTrainer in script * Styling	2021-02-01 08:06:32 -05:00
wlhgtc	1682804ebd	Fit chinese wwm to new datasets (#9887 ) * MOD: fit chinese wwm to new datasets * MOD: move wwm to new folder * MOD: formate code * Styling * MOD add param and recover trainer Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2021-02-01 03:37:59 -05:00
Stas Bekman	6bab83683b	fix logger format for non-main process (#9911 )	2021-02-01 03:08:12 -05:00
Stas Bekman	6bf94bc0b6	correctly handle mt5 (#9879 )	2021-01-29 08:11:22 -08:00
Sylvain Gugger	b4e559cfa1	Deprecate model_path in Trainer.train (#9854 )	2021-01-28 08:32:46 -05:00
Sylvain Gugger	f2fabedbab	Setup logging with a stdout handler (#9816 )	2021-01-27 03:39:11 -05:00
Yusuke Mori	059bb25817	Fix a bug in run_glue.py (#9812 ) (#9815 )	2021-01-26 14:32:19 -05:00
Magdalena Biesialska	8f6c12d306	Fix fine-tuning translation scripts (#9809 )	2021-01-26 11:30:31 -05:00
Andrea Cappelli	10e5f28212	Improve pytorch examples for fp16 (#9796 ) * Pad to 8x for fp16 multiple choice example (#9752) * Pad to 8x for fp16 squad trainer example (#9752) * Pad to 8x for fp16 ner example (#9752) * Pad to 8x for fp16 swag example (#9752) * Pad to 8x for fp16 qa beam search example (#9752) * Pad to 8x for fp16 qa example (#9752) * Pad to 8x for fp16 seq2seq example (#9752) * Pad to 8x for fp16 glue example (#9752) * Pad to 8x for fp16 new ner example (#9752) * update script template #9752 * Update examples/multiple-choice/run_swag.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/question-answering/run_qa.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/question-answering/run_qa_beam_search.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve code quality #9752 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-26 04:47:07 -05:00
Sylvain Gugger	caf4abf768	Auto-resume training from checkpoint (#9776 ) * Auto-resume training from checkpoint * Update examples/text-classification/run_glue.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Roll out to other examples Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-25 12:03:51 -05:00
Wilfried L. Bounsi	9152f16023	Fix broken [Open in Colab] links (#9761 )	2021-01-23 15:11:46 +05:30
Sylvain Gugger	411c582109	Fixes to run_seq2seq and instructions (#9734 ) * Fixes to run_seq2seq and instructions * Add more defaults for summarization	2021-01-22 10:03:57 -05:00
Stefan Schweter	08b22722c7	examples: fix XNLI url (#9741 )	2021-01-22 18:13:52 +05:30
Sylvain Gugger	5f80c15ef5	Fix memory regression in Seq2Seq example (#9713 ) * Fix memory regression in Seq2Seq example * Fix test and properly deal with -100 * Easier condition with device safety * Patch for MBartTokenzierFast	2021-01-21 12:05:46 -05:00
Sylvain Gugger	582f516adb	Use datasets squad_v2 metric in run_qa (#9677 )	2021-01-20 04:52:13 -05:00
Sylvain Gugger	a1ad16a446	Restrain tokenizer.model_max_length default (#9681 ) * Restrain tokenizer.model_max_length default * Fix indent	2021-01-20 04:17:39 -05:00
Sylvain Gugger	e4c06ed664	New run_seq2seq script (#9605 ) * New run_seq2seq script * Add tests * Mark as slow * Update examples/seq2seq/run_seq2seq.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/data/data_collator.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update src/transformers/data/data_collator.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Address review comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-01-19 15:22:17 -05:00
Sylvain Gugger	97b787fb4e	Fix old Seq2SeqTrainer (#9675 )	2021-01-19 09:56:25 -05:00
Stas Bekman	c60e0e1ee4	deepspeed + grad acumm (#9622 )	2021-01-15 10:12:26 -08:00
Sylvain Gugger	329fe2746a	Upstream (and rename) sortish sampler (#9574 ) * Upstream (and rename) sortish sampler * Use proper sampler * Update src/transformers/trainer_pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-14 10:38:14 -05:00
Sylvain Gugger	46ed56cfd1	Switch metrics in run_ner to datasets (#9567 ) * Switch metrics in run_ner to datasets * Add flag to return all metrics * Upstream (and rename) sortish_sampler * Revert "Upstream (and rename) sortish_sampler" This reverts commit `e07d0dcf65`.	2021-01-14 03:37:07 -05:00
Yusuke Mori	eabad8fd9c	Update run_glue for do_predict with local test data (#9442 ) (#9486 ) * Update run_glue for do_predict with local test data (#9442) * Update run_glue (#9442): fix comments ('files' to 'a file') * Update run_glue (#9442): reflect the code review * Update run_glue (#9442): auto format * Update run_glue (#9442): reflect the code review	2021-01-13 07:48:35 -05:00
Pavel Tarashkevich	27d0e01d75	Fix classification script: enable dynamic padding with truncation (#9554 ) Co-authored-by: Pavel Tarashkevich <Pavel.Tarashkievich@orange.com>	2021-01-13 07:46:48 -05:00
Stas Bekman	2df34f4aba	[trainer] deepspeed integration (#9211 ) * deepspeed integration * style * add test * ds wants to do its own backward * fp16 assert * Update src/transformers/training_args.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style * for clarity extract what args are being passed to deepspeed * introduce the concept of self.wrapped_model * s/self.wrapped_model/self.model_wrapped/ * complete transition to self.wrapped_model / self.model * fix * doc * give ds its own init * add custom overrides, handle bs correctly * fix test * clean up model_init logic, fix small bug * complete fix * collapse --deepspeed_config into --deepspeed * style * start adding doc notes * style * implement hf2ds optimizer and scheduler configuration remapping * oops * call get_num_training_steps absolutely when needed * workaround broken auto-formatter * deepspeed_config arg is no longer needed - fixed in deepspeed master * use hf's fp16 args in config * clean * start on the docs * rebase cleanup * finish up --fp16 * clarify the supported stages * big refactor thanks to discovering deepspeed.init_distributed * cleanup * revert fp16 part * add checkpoint-support * more init ds into integrations * extend docs * cleanup * unfix docs * clean up old code * imports * move docs * fix logic * make it clear which file it's referring to * document nodes/gpus * style * wrong format * style * deepspeed handles gradient clipping * easier to read * major doc rewrite * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * docs * switch to AdamW optimizer * style * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * clarify doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-12 19:05:18 -08:00
Sylvain Gugger	3ec40299c1	Remove nested lxmert (#9440 )	2021-01-07 04:10:41 -05:00
Sylvain Gugger	453a70d4cb	Allow example to use a revision and work with private models (#9407 ) * Allow example to use a revision and work with private models * Copy to other examples and template * Styling	2021-01-06 06:49:23 -05:00
Patrick von Platen	eef66035a2	[PyTorch Bart] Split Bart into different models (#9343 ) * first try * remove old template * finish bart * finish mbart * delete unnecessary line * init pegasus * save intermediate * correct pegasus * finish pegasus * remove cookie cutter leftover * add marian * finish blenderbot * replace in file * correctly split blenderbot * delete "old" folder * correct "add statement" * adapt config for tf comp * correct configs for tf * remove ipdb * fix more stuff * fix mbart * push pegasus fix * fix mbart * more fixes * fix research projects code * finish docs for bart, mbart, and marian * delete unnecessary file * correct attn typo * correct configs * remove pegasus for seq class * correct peg docs * correct peg docs * finish configs * further improve docs * add copied from statements to mbart * fix copied from in mbart * add copy statements to marian * add copied from to marian * add pegasus copied from * finish pegasus * finish copied from * Apply suggestions from code review * make style * backward comp blenderbot * apply lysandres and sylvains suggestions * apply suggestions * push last fixes * fix docs * fix tok tests * fix imports code style * fix doc	2021-01-05 22:00:05 +01:00
Yusuke Mori	57a6626929	[examples/text-classification] Fix a bug for using one's own dataset of a regression task (#9411 )	2021-01-05 08:15:06 -05:00
dependabot[bot]	5dd389d1c7	Bump notebook from 6.1.4 to 6.1.5 in /examples/research_projects/lxmert (#9402 ) Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5. - [Release notes](https://github.com/jupyter/jupyterhub/releases) - [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md) - [Commits](https://github.com/jupyter/jupyterhub/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-01-04 10:02:07 -05:00
Sylvain Gugger	23a71449c0	Put back LXMert example (#9401 )	2021-01-04 09:59:07 -05:00
Sam Shleifer	8eb7f26d5d	simplify marian distillation script (#9394 )	2021-01-04 11:21:24 +05:30
Yoshitomo Matsubara	d944966b19	Fix typos in README and bugs in RAG example code for end-to-end evaluation and finetuning (#9355 ) * fix a bug in eval_batch_retrieval * should return parser as well as other staticmethod * remove duplicate argument * these kwargs are no longer accepted (cause TypeError in self.generator.generate of modeling_rag.py) * fixed file paths in README * moved an arg to add_ray_specific_args	2021-01-03 16:00:30 +01:00
Sylvain Gugger	a1cb6e9866	Adapt to new name of `label_smoothing_factor` training arg (#9282 )	2020-12-23 11:05:21 -05:00
Sylvain Gugger	e6c1f1cad8	Revert renaming in finetune_trainer (#9262 )	2020-12-22 15:42:34 -05:00
Sylvain Gugger	ab17758874	Add speed metrics to all example scripts + template (#9260 )	2020-12-22 14:02:26 -05:00
Manuel Romero	37d6fb5d04	Fix link to bertabs/README.md (#9255 )	2020-12-22 11:41:23 -05:00
Manuel Romero	189c1b91a6	Fix link to old language modeling script (#9254 )	2020-12-22 11:40:47 -05:00
Sylvain Gugger	490b39e614	Seq2seq trainer (#9241 ) * Add label smoothing in Trainer * Add options for scheduler and Adafactor in Trainer * Put Seq2SeqTrainer in the main lib * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments and adapt scripts * Documentation * Move test not using script to tests folder Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-22 11:33:44 -05:00
Sylvain Gugger	ec07da65e2	Update the README of the text classification example (#9237 ) * Update the README of the text classification example * Update examples/README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Adapt comment from review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-21 15:23:40 -05:00
Teven	4eef5889ac	Adding performer fine-tuning research exampke (#9239 ) * added run_mlm_performer.py research example * make styke * make styke * Added a README !	2020-12-21 21:19:41 +01:00
Amog Kamsetty	a4b21cdd20	[RAG] Add Ray implementation for distributed retrieval (#9197 ) * wip * wip * wip * wip * wip * wip * wip * wip * uncomment * uncomment * wip * updates * add docstring * updates * fix arg * fixes * add unit tests * update readme * update readme * update finetune script * update test * add test * add ray to test dependencies * separate ray and ray tune * formatting * shutdown ray at end of test * fix tests * formatting * formatting * even more formatting * address comments * formatting * add files * Update examples/research_projects/rag/test_distributed_retriever.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address comments * addressing comments Co-authored-by: Ubuntu <ubuntu@ip-172-31-21-208.us-west-2.compute.internal> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-12-21 10:39:30 +01:00
Stas Bekman	f38c4ad302	better logging and help (#9203 )	2020-12-20 10:28:28 -08:00
Stas Bekman	6b850b671d	[run_glue] add speed metrics (#9198 ) * add speed metrics * suggestions	2020-12-18 17:09:30 -08:00
Aleksey Tikhonov	291974c65c	GPT-model attention heads pruning example (#9189 ) * Pruning for GPT attn heads * The code formatted according to the transformers requirements * Update run_prune_gpt.py * Update run_prune_gpt.py	2020-12-18 16:32:10 -05:00
Sylvain Gugger	1198ba8fba	Add timing inside Trainer (#9196 ) * Add timing inside Trainer * Fix tests * Add n_objs for train * Sort logs	2020-12-18 15:10:39 -05:00
Sylvain Gugger	9a25c5bd3a	Add new run_swag example (#9175 ) * Add new run_swag example * Add check * Add sample * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Very important change to make Lysandre happy Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-12-18 14:19:24 -05:00
Manuel Romero	077a5dce32	Fix link to old SQUAD fine-tuning script (#9181 )	2020-12-18 09:12:10 -05:00
Wissam Antoun	fd7b6a5274	fixed JSON error in run_qa with fp16 (#9186 )	2020-12-18 07:53:23 -05:00
Manuel Romero	66a14a2f6f	Fix link to old NER fine-tuning script (#9182 )	2020-12-17 19:50:01 -05:00
Stas Bekman	f06d0fadc9	[trainer] apex fixes and tests (#9180 )	2020-12-17 16:49:11 -08:00
Stas Bekman	63841c559b	add tests for the new sharded ddp fairscale integration (#9177 )	2020-12-17 14:24:03 -08:00
Sylvain Gugger	9a67185344	Experimental support for fairscale ShardedDDP (#9139 ) * Experimental stupport for fairscale ShardedDDP * Add import error if fairscale not available * Address review comments * Fix seq2seq trainer	2020-12-16 13:47:48 -05:00
Sylvain Gugger	4d48973523	Update notebook table and transformers intro notebook (#9136 )	2020-12-16 10:24:31 -05:00
Patrick von Platen	640e6fe190	[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054 ) * save intermediate * save intermediate * save intermediate * correct flax bert model file * new module / model naming * make style * almost finish BERT * finish roberta * make fix-copies * delete keys file * last refactor * fixes in run_mlm_flax.py * remove pooled from run_mlm_flax.py` * fix gelu \| gelu_new * remove Module from inits * splits * dirty print * preventing warmup_steps == 0 * smaller splits * make fix-copies * dirty print * dirty print * initial_evaluation argument * declaration order fix * proper model initialization/loading * proper initialization * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug * removed tokenizers warning hack, fixed model re-initialization * reverted training_args.py changes * fix flax from pretrained * improve test in flax * apply sylvains tips * update init * make 0.3.0 compatible * revert tevens changes * revert tevens changes 2 * finalize revert * fix bug * add docs * add pretrained to init * Update src/transformers/modeling_flax_utils.py * fix copies * final improvements Co-authored-by: TevenLeScao <teven.lescao@gmail.com>	2020-12-16 13:03:32 +01:00
Teven	2a7e8e1608	[Examples] Add automatic dataset splitting in language-modeling examples (#9133 ) * replaced jnp.split + removing textual model inputs + ensuring warmup_steps > 0 * Add automatic dataset splitting in language-modeling examples	2020-12-15 16:02:43 -05:00
Stas Bekman	14c79c3e31	native amp leak fix landed in 1.7.1 (#9115 ) update README with good news that the leak fix has been applied to pytorch-1.7.1.	2020-12-15 09:10:41 -05:00
Yoshitomo Matsubara	44c340f45f	fix a bug in eval_batch_retrieval (#9089 )	2020-12-15 14:46:55 +01:00
Stas Bekman	c19d04623e	[finetune_trainer] enhancements and fixes (#9042 ) * trainer and finetune_trainer enhancements and fixes * add fallback default * move the fixing of incorrect keys back into finetune trainer * s/eval/val/ to match the split * trainer can now use a different prefix than eval_ for metrics * document new arg * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * use 'eval' as the default for metric_key_prefix * complete adjust var names + disambiguate * fix logger * add clarifying comment * add clarifying comment * style * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/trainer.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * complete removal of optional for metric_key_prefix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-14 17:45:33 -08:00
Sylvain Gugger	29e4597950	Fix min_null_pred in the run_qa script (#9067 )	2020-12-11 16:26:05 -05:00
dependabot[bot]	24f6cdeab6	Bump notebook in /examples/research_projects/movement-pruning/lxmert (#9062 ) Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5. - [Release notes](https://github.com/jupyter/jupyterhub/releases) - [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md) - [Commits](https://github.com/jupyter/jupyterhub/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2020-12-11 10:32:43 -05:00
Sylvain Gugger	783d7d2629	Reorganize examples (#9010 ) * Reorganize example folder * Continue reorganization * Change requirements for tests * Final cleanup * Finish regroup with tests all passing * Copyright * Requirements and readme * Make a full link for the documentation * Address review comments * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Add symlink * Reorg again * Apply suggestions from code review Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Adapt title * Update to new strucutre * Remove test * Update READMEs Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-12-11 10:07:02 -05:00
NatLun137	91ab02af28	Fix typo #9012 (#1 ) (#9038 ) There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)	2020-12-10 16:41:00 -05:00
Funtowicz Morgan	75627148ee	Flax Masked Language Modeling training example (#8728 ) * Remove "Model" suffix from Flax models to look more 🤗 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Initial working (forward + backward) for Flax MLM training example. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Simply code Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Addressing comments, using module and moving to LM task. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Restore parameter name "module" wrongly renamed model. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Restore correct output ordering... Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Actually commit the example 😅 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Add FlaxBertModelForMaskedLM after rebasing. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Make it possible to initialize the training from scratch Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Reuse flax linen example of cross entropy loss Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added specific data collator for flax Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Remove todo for data collator Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added evaluation step Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added ability to provide dtype to support bfloat16 on TPU Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable flax tensorboard output Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable jax.pmap support. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Ensure batches are correctly sized to be dispatched with jax.pmap Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable bfloat16 with --fp16 cmdline args Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Correctly export metrics to tensorboard Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added dropout and ability to use it. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Effectively enable & disable during training and evaluation steps. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Oops. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable specifying kernel initializer scale Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added warmup step to the learning rate scheduler. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix typo. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Print training loss Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Make style Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * fix linter issue (flake8) Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix model matching Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix dummies Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix non default dtype on Flax models Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Use the same create_position_ids_from_input_ids for FlaxRoberta Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Make Roberta attention as Bert Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * fix copy Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Wording. Co-authored-by: Marc van Zee <marcvanzee@gmail.com> Co-authored-by: Marc van Zee <marcvanzee@gmail.com>	2020-12-09 17:13:56 +01:00
Sylvain Gugger	447808c85f	New squad example (#8992 ) * Add new SQUAD example * Same with a task-specific Trainer * Address review comment. * Small fixes * Initial work for XLNet * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Final clean up and working XLNet script * Test and debug * Final working version * Add new SQUAD example * Same with a task-specific Trainer * Address review comment. * Small fixes * Initial work for XLNet * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Final clean up and working XLNet script * Test and debug * Final working version * Add tick * Update README * Address review comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-08 14:39:29 -05:00
Sylvain Gugger	00aa9dbca2	Copyright (#8970 ) * Add copyright everywhere missing * Style	2020-12-07 18:36:34 -05:00
Sylvain Gugger	62d30e0583	Small fix to the run clm script (#8973 )	2020-12-07 17:32:09 -05:00
Sylvain Gugger	7f9ccffc5b	Use word_ids to get labels in run_ner (#8962 ) * Use word_ids to get labels in run_ner * Add sanity check	2020-12-07 14:26:36 -05:00
Ethan Perez	8dfc8c7221	Don't pass in token_type_ids to BART for GLUE (#8929 ) Without this fix, training a `BARTForSequenceClassification` model with `run_pl_glue.py` gives `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not have token_type_ids. I've solved this issue in the same way as it's solved for the "distilbert" model, and I can train BART models on SNLI without errors now.	2020-12-05 09:52:16 -05:00
Stas Bekman	df311a5ccf	[seq2seq] document the caveat of leaky native amp (#8930 ) * document the caveat of leaky native amp * Update examples/seq2seq/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-12-04 15:43:35 -08:00
Stas Bekman	4c3d98dddc	[s2s finetune_trainer] add instructions for distributed training (#8884 )	2020-12-03 16:05:55 -08:00
Stas Bekman	379005c9d2	start using training_args.parallel_mode (#8882 )	2020-12-01 11:40:36 -08:00
Stas Bekman	7f34d75780	[s2s trainer] fix DP mode (#8823 ) * fix DP case on multi-gpu * make executable * test all 3 modes * use the correct check for distributed * dp doesn't need a special case * restore original name * cleanup	2020-11-30 12:55:56 -08:00
Sylvain Gugger	5530299096	Remove deprecated `evalutate_during_training` (#8852 ) * Remove deprecated `evalutate_during_training` * Update src/transformers/training_args_tf.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-30 11:12:15 -05:00
Stefan Schweter	19fa01ce2a	token-classification: use is_world_process_zero instead of deprecated is_world_master() (#8828 )	2020-11-30 09:21:56 -05:00
Stas Bekman	ddf3c64654	potpurri of small fixes (#8807 )	2020-11-26 14:06:27 -08:00
chutaklee	52708d2637	Fix PPLM (#8779 ) * Fix pplm * fix style * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-26 22:23:36 +01:00
Patrick von Platen	8f07f5c44b	Revert "finetune.py: specifying generation min_length (#8478 )" (#8805 ) This reverts commit `5aa361f3e5`.	2020-11-26 20:12:01 +01:00
Daniel Khashabi	5aa361f3e5	finetune.py: specifying generation min_length (#8478 )	2020-11-26 12:33:02 +05:30
Stas Bekman	82d443a7fd	[core] implement support for run-time dependency version checking (#8645 ) * implement support for run-time dependency version checking * try not escaping ! * use findall that works on py36 * small tweaks * autoformatter worship * simplify * shorter names * add support for non-versioned checks * add deps * revert * tokenizers not required, check version only if installed * make a proper distutils cmd and add make target * tqdm must be checked before tokenizers * workaround the DistributionNotFound peculiar setup * handle the rest of packages in setup.py * fully sync setup.py's install_requires - to check them all * nit * make install_requires more readable * typo * Update setup.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * restyle * add types * simplify * simplify2 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-24 13:22:25 -05:00
Quentin Lhoest	a7d73cfdd4	fix rag index names in eval_rag.py example (#8730 )	2020-11-24 17:04:47 +01:00
zhiheng-huang	2c83b3c38d	Support various BERT relative position embeddings (2nd) (#8276 ) * Support BERT relative position embeddings * Fix typo in README.md * Address review comment * Fix failing tests * [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py * make fix copies * fix configs of electra and albert and fix longformer * remove copy statement from longformer * fix albert * fix electra * Add bert variants forward tests for various position embeddings * [tiny] Fix style for test_modeling_bert.py * improve docstring * [tiny] improve docstring and remove unnecessary dependency * [tiny] Remove unused import * re-add to ALBERT * make embeddings work for ALBERT * add test for albert Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-24 14:40:53 +01:00
Sylvain Gugger	367f497dec	Fix max length in run_plm script (#8738 )	2020-11-23 16:02:31 -05:00
Stas Bekman	1e45bef0a7	[trainer] make generate work with multigpu (#8716 ) * make generate work with multigpu * better fix - thanks @sgugger	2020-11-23 10:57:27 -08:00
Santiago Castro	e1f3156b21	Fix many typos (#8708 )	2020-11-21 22:58:10 -05:00
Quentin Lhoest	8062fa63c5	Fix rag finetuning + add finetuning test (#8585 ) * replace init_ddp_connection for index init * style * add finetune test * add test data * move generate tensors to device * add test on EM metric * style * allow multi process test * keep gloo process group for retrieval * add multi-gpu test * use custom accelerator * clean test finetune * minor * style * style * typo * use python call instead of imported main fumction * return_dict fix in modeling_rag * use float32 in retrieval * store as float32 as well in the custom knowledge dataset example * style * rename to finetune_rag * style * update readme * rename utils and callbacks to utils_rag and callbacks_rag * fix test * patrick's comments * generate dummy data in the finetue test script * remove dummy data files * style	2020-11-20 19:05:03 +01:00
Stas Bekman	0ad45e108d	[examples/seq2seq] fix PL deprecation warning (#8577 ) * fix deprecation warning * fix	2020-11-19 21:46:04 +01:00
Sylvain Gugger	20b658607e	Fix run_ner script (#8664 ) * Fix run_ner script * Pin datasets	2020-11-19 13:59:30 -05:00
Sylvain Gugger	cb3e5c33f7	Fix a few last paths for the new repo org (#8666 )	2020-11-19 11:56:42 -05:00
Matthias	a79a96ddaa	fix small typo (#8644 ) Fixed a small typo on the XLNet and permutation language modelling section	2020-11-19 11:24:11 -05:00
Sylvain Gugger	4208f496ee	Better filtering of the model outputs in Trainer (#8633 ) * Better filtering of the model outputs in Trainer * Fix examples tests * Add test for Lysandre	2020-11-19 10:43:15 -05:00
Quentin Lhoest	62cd9ce9f8	fix missing return dict (#8653 )	2020-11-19 15:17:18 +01:00
Tim Isbister	28d16e7ac5	Update README.md (#8635 )	2020-11-18 18:35:23 -05:00
Stas Bekman	d86d57faa3	[s2s] distillation apex breaks return_dict obj (#8631 ) * apex breaks return_dict obj * style	2020-11-18 12:51:29 -08:00
Sylvain Gugger	a0c62d2493	Fix training from scratch in new scripts (#8623 )	2020-11-18 12:15:26 -05:00
Stas Bekman	cdf1b7ae82	fix to adjust for #8530 changes (#8612 )	2020-11-18 10:25:00 -05:00
Stas Bekman	2819da02f7	[s2s] broken test (#8613 )	2020-11-18 10:15:53 -05:00
Sylvain Gugger	dd52804f5f	Remove deprecated (#8604 ) * Remove old deprecated arguments Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr> * Remove needless imports * Fix tests Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-11-17 15:11:29 -05:00
Stas Bekman	f0435f5a61	these should run fine on multi-gpu (#8582 )	2020-11-17 14:00:41 -05:00
Julien Chaumond	042a6aa777	Tokenizers: ability to load from model subfolder (#8586 ) * <small>tiny typo</small> * Tokenizers: ability to load from model subfolder * use subfolder for local files as well * Uniformize model shortcut name => model id * from s3 => from huggingface.co Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>	2020-11-17 08:58:45 -05:00
Sylvain Gugger	c89bdfbe72	Reorganize repo (#8580 ) * Put models in subfolders * Styling * Fix imports in tests * More fixes in test imports * Sneaky hidden imports * Fix imports in doc files * More sneaky imports * Finish fixing tests * Fix examples * Fix path for copies * More fixes for examples * Fix dummy files * More fixes for example * More model import fixes * Is this why you're unhappy GitHub? * Fix imports in conver command	2020-11-16 21:43:42 -05:00
Sylvain Gugger	1073a2bde5	Switch `return_dict` to `True` by default. (#8530 ) * Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Run on the real suite * Fix slow tests	2020-11-16 11:43:00 -05:00
Thomas Wolf	f4e04cd2c6	[breaking\|pipelines\|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073 ) * Fixing roberta for slow-fast tests * WIP getting equivalence on pipelines * slow-to-fast equivalence - working on question-answering pipeline * optional FAISS tests * Pipeline Q&A * Move pipeline tests to their own test job again * update tokenizer to add sequence id methods * update to tokenizers 0.9.4 * set sentencepiecce as optional * clean up squad * clean up pipelines to use sequence_ids * style/quality * wording * Switch to use_fast = True by default * update tests for use_fast at True by default * fix rag tokenizer test * removing protobuf from required dependencies * fix NER test for use_fast = True by default * fixing example tests (Q&A examples use slow tokenizers for now) * protobuf in main deps extras["sentencepiece"] and example deps * fix protobug install test * try to fix seq2seq by switching to slow tokenizers for now * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-15 22:50:59 +01:00
Julien Plu	27b3ff316a	Try to understand and apply Sylvain's comments (#8458 )	2020-11-12 13:43:00 -05:00
zeyuyun1	924c624a46	quick fix on concatenating text to support more datasets (#8474 )	2020-11-12 09:47:08 -05:00
Sumithra Bhakthavatsalam	81ebd70671	[s2s] distill t5-large -> t5-small (#8376 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-11 17:58:45 -05:00
sarnoult	a38d1c7c31	Example NER script predicts on tokenized dataset (#8468 ) The new run_ner.py script tries to run prediction on the input test set `datasets["test"]`, but it should be the tokenized set `tokenized_datasets["test"]`	2020-11-11 10:28:23 -05:00
Stas Bekman	02bdfc0251	using multi_gpu consistently (#8446 ) * s\|multiple_gpu\|multi_gpu\|g; s\|multigpu\|multi_gpu\|g' * doc	2020-11-10 13:23:58 -05:00
Stas Bekman	5d4972e608	[examples] better PL version check (#8429 )	2020-11-10 09:33:23 -05:00
Shichao Sun	ae1cb4ec22	[s2s/distill] hparams.tokenizer_name = hparams.teacher (#8382 )	2020-11-10 09:32:01 -05:00
Julien Chaumond	55e8d0cea2	Update links from s3 to huggingface.co	2020-11-10 14:03:29 +01:00
Stas Bekman	190df58560	[github CI] add a multi-gpu job for all example tests (#8341 ) * add a multi-gpu job for all example tests * run only ported tests * rename * explain why env is re-activated on each step * mark all unported/checked tests with @require_torch_non_multigpu_but_fix_me * style * Apply suggestions from code review Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-09 15:47:38 -05:00
Patrick von Platen	9c83b96e62	[Tests] Add Common Test for Training + Fix a couple of bugs (#8415 ) * add training tests * correct longformer * fix docs * fix some tests * fix some more train tests * remove ipdb * fix multiple edge case model training * fix funnel and prophetnet * clean gpt models * undo renaming of albert	2020-11-09 18:24:41 +01:00
Sylvain Gugger	5c766ecb50	Fix typo	2020-11-09 11:50:51 -05:00
Sylvain Gugger	908a28894c	Add new token classification example (#8340 ) * Add new token classification example * Remove txt file * Add test * With actual testing done * Less warmup is better * Update examples/token-classification/run_ner_new.py Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Address review comments * Fix test * Make Lysandre happy * Last touches and rename * Rename in tests * Address review comments * More run_ner -> run_ner_old Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-11-09 11:39:55 -05:00
Sam Shleifer	ebde57acac	examples/docs: caveat that PL examples don't work on TPU (#8309 )	2020-11-09 08:55:22 -05:00
Sam Shleifer	e6d9cdaafe	[s2s/distill] remove run_distiller.sh, fix xsum script (#8412 )	2020-11-08 16:57:43 -05:00
Stas Bekman	66582492d3	[s2s test_finetune_trainer] failing multigpu test (#8400 )	2020-11-08 16:45:40 -05:00
Stas Bekman	f62755a600	[s2s examples test] fix data path (#8398 )	2020-11-08 16:44:18 -05:00
Jonathan Chang	5807ba3fa9	Fix typo (#8351 )	2020-11-06 11:19:41 -05:00
Stas Bekman	9edafaebef	[s2s] test_bash_script.py - actually learn something (#8318 ) * use decorator * remove hardcoded paths * make the test use more data and do real quality tests * shave off 10 secs * add --eval_beams 2, reformat * reduce train size, use smaller custom dataset	2020-11-05 23:15:14 -05:00
Leandro von Werra	17450397a7	Docs bart training ref (#8330 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-05 17:20:57 -05:00
Stas Bekman	d787935a14	[s2s] test_distributed_eval (#8315 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-05 16:01:15 -05:00
Sam Shleifer	7abc1d96d1	no warn (#8329 )	2020-11-05 11:42:24 -05:00
Bobby Donchev	52f44dd6d2	change TokenClassificationTask class methods to static methods (#7902 ) * change TokenClassificationTask class methods to static methods Since we do not require self in the class methods of TokenClassificationTask we should probably switch to static methods. Also, since the class TokenClassificationTask does not contain a constructor it is currently unusable as is. By switching to static methods this fixes the issue of having to document the intent of the broken class. Also, since the get_labels and read_examples_from_file methods are ought to be implemented. Static method definitions are unchanged even after inheritance, which means that it can be overridden, similar to other class methods. * Trigger Build Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-11-05 09:38:30 -05:00
Guillem García Subies	77c8f6c627	Corrected typo in readme (#8320 )	2020-11-05 07:48:36 -05:00
Sylvain Gugger	9c4aa4ac1a	Clean up data collators and datasets (#8308 ) * Clean up data collators and datasets * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Remove needless clone Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-04 17:24:49 -05:00
Manuel Romero	b1d3e95eb5	Fix path to old run_language_modeling.py script (#8302 )	2020-11-04 13:17:57 -05:00
Sylvain Gugger	cf89724696	Fix validation file loading in scripts (#8298 )	2020-11-04 10:42:18 -05:00
Pengzhi Gao	734afa37f6	Fix typo in language-modeling README.md (#8287 )	2020-11-04 09:38:02 -05:00
Stas Bekman	1bb4bba53c	[CIs] Better reports everywhere (#8275 ) * make it possible to invoke testconf.py in both test suites without crashing on having the same option added * perl -pi -e 's\|--make_reports\|--make-reports\|' to be consistent with other opts * add `pytest --make-reports` to all CIs (and artifacts) * fix	2020-11-03 16:57:12 -05:00
Patrick von Platen	068e6b5edd	make files independent (#8267 )	2020-11-03 21:13:33 +01:00
Stas Bekman	cd360dcb26	[examples] minimal version requirement run-time check in PL (#8133 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-03 13:17:11 -05:00
Lysandre	eb6313e823	Fix Tatoeba skip	2020-11-03 10:35:00 -05:00
Sam Shleifer	b63beb743c	Skip tatoeba tests if Tatoeba-Challenge not cloned (#8260 )	2020-11-03 09:49:29 -05:00
Patrick von Platen	9f1747f999	[Seq2Seq] Correct import in Seq2Seq Trainer (#8254 )	2020-11-03 07:56:41 -05:00
Sylvain Gugger	e1b1b614b1	Add line by line option to mlm/plm scripts (#8240 ) * Make line by line optional in run_mlm * Add option to disable dynamic padding * Add option to plm too and update README * Typos * More typos * Even more typos * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-02 12:27:04 -05:00
Patrick von Platen	9bd30f7cf4	[Seq2SeqTrainer] Move import to init to make file self-contained (#8194 ) * boom boom * reverse order	2020-11-01 23:31:55 +01:00
Sylvain Gugger	9eb3a410cd	Remove deprecated arguments from new run_clm (#8197 )	2020-10-30 15:27:20 -04:00
Sylvain Gugger	cdc48ce92d	Finalize lm examples (#8188 ) * Finish the cleanup of the language-modeling examples * Update main README * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Apply suggestions from code review Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Propagate changes Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-10-30 14:20:18 -04:00
wlhgtc	9a21b50614	Fix eval ref miss in Chinese WWM. (#8115 ) * ADD: add whole word mask proxy for both eng and chinese * MOD: adjust format * MOD: reformat code * MOD: update import * MOD: fix bug * MOD: add import * MOD: fix bug * MOD: decouple code and update readme * MOD: reformat code * Update examples/language-modeling/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * change wwm to whole_word_mask * reformat code * reformat * format * Code quality * ADD: update chinese ref readme * MOD: small changes * MOD: small changes2 * update readme * fix eval ref file miss bug * format file * MOD: move ref code to contrib * MOD: add delimeter check * reformat code * refomat code * Update examples/language-modeling/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-29 17:08:39 -04:00
Sylvain Gugger	691176283d	Add a template for examples and apply it for mlm and plm examples (#8153 ) * Add a template for example scripts and apply it to mlm * Formatting * Fix test * Add plm script * Add a template for example scripts and apply it to mlm * Formatting * Fix test * Add plm script * Add a template for example scripts and apply it to mlm * Formatting * Fix test * Add plm script * Styling	2020-10-29 13:38:11 -04:00
Sam Shleifer	49e4fece5c	[s2s] distillBART docs for paper replication (#8150 )	2020-10-29 12:01:15 -04:00
Sylvain Gugger	acf56408d8	Smarter prediction loop and no- -> no_ in console args (#8151 ) * Smarter prediction loop and no- -> no_ in console args * Fix test	2020-10-29 10:56:25 -04:00
Santiago Castro	969859d5f6	Fix doc errors and typos across the board (#8139 ) * Fix doc errors and typos across the board * Fix a typo * Fix the CI * Fix more typos * Fix CI * More fixes * Fix CI * More fixes * More fixes	2020-10-29 10:33:33 -04:00
Stas Bekman	825925dfaa	[s2s test] cleanup (#8131 )	2020-10-28 16:50:36 -04:00
Sean Naren	5e24982e58	Upgrade PyTorch Lightning to 1.0.2 (#7852 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-10-28 14:59:14 -04:00
Sylvain Gugger	378142afdf	Rename add_start_docstrings_to_callable (#8120 )	2020-10-28 13:42:31 -04:00
Stas Bekman	5423f2a9d4	[testing] port test_trainer_distributed to distributed pytest + TestCasePlus enhancements (#8107 ) * move the helper code into testing_utils * port test_trainer_distributed to work with pytest * improve docs * simplify notes * doc * doc * style * doc * further improvements * torch might not be available * real fix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-28 11:51:32 -04:00
Sylvain Gugger	47dfa65b0c	New run_clm script (#8105 ) * New run_clm script * Formatting * More comments * Remove unused imports * Apply suggestions from code review Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Address review comments * Change link to the hub Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-10-28 10:38:58 -04:00
Sylvain Gugger	1e01db3579	Remove header	2020-10-27 17:36:13 -04:00
Sylvain Gugger	b715e40ced	Fix typo	2020-10-27 17:34:05 -04:00
Sylvain Gugger	41cc5f3f59	Move installation instructions to the top (#8106 )	2020-10-27 17:32:20 -04:00
Stas Bekman	bfd5e370a7	[CI] generate separate report files as artifacts (#7995 ) * better reports * a whole bunch of reports in their own files * clean up * improvements * github artifacts experiment * style * complete the report generator with multiple improvements/fixes * fix * save all reports under one dir to easy upload * can remove temp failing tests * doc fix * some cleanup	2020-10-27 09:25:07 -04:00
Patrick von Platen	664c7ec453	[Seq2Seq Trainer] Make sure padding is implemented for models without pad_token (#8043 ) * make sure padding is implemented for non-padding tokens models as well * add better error message * add better warning * remove results files * Update examples/seq2seq/seq2seq_trainer.py * remove unnecessary copy line * correct usage of labels * delete test files	2020-10-26 17:28:16 +01:00
mohammadreza-Banaei73	098ddc2244	Update README.md (#8050 ) --wwm cant be used as an argument given run_language_modeling.py and should be changed to --whole_word_mask	2020-10-26 12:00:18 -04:00
suliuzh	20a0894d1a	update version for scipy (#7998 )	2020-10-26 08:56:56 -04:00
Patrick von Platen	3c682ea15c	[Examples] Allow EncoderDecoderModels to be trained with Seq2Seq (#7809 ) * Make Seq2Seq Trainer more similar to Trainer * fix typo * fix seq2seq trainer * remove from tests * remove lock * remove train files * delete test files * correct typo * check at init * make sure trainer is not slowed down on TPU * correct isort * remove use cache * fix use cache * add last use chache = false	2020-10-23 23:05:51 +02:00
Ethan Perez	d39da5a2ab	Handling longformer model_type (#7990 ) Updating the run_squad training script to handle the "longformer" `model_type`. The longformer is trained in the same was as RoBERTa, so I've added the "longformer" `model_type` (that's the right hugginface name for the LongFormer model, right?) everywhere there was a "roberta" `model_type` reference. The longformer (like RoBERTa) doesn't use `token_type_ids` (as I understand from looking at the [longformer notebook](https://github.com/patil-suraj/Notebooks/blob/master/longformer_qa_training.ipynb), which is what gets updated after this change. This fix might be related to [this issue](https://github.com/huggingface/transformers/issues/7249) with SQuAD training when using run_squad.py	2020-10-23 10:34:06 -04:00
Lalit Pagaria	88b3a91e61	Handle the case when title is None (#7941 )	2020-10-23 15:54:45 +02:00
Stas Bekman	023f0f3708	[s2s trainer] tests to use distributed on multi-gpu machine (#7965 )	2020-10-22 17:26:22 -04:00
Sylvain Gugger	2e5052d4f1	New run glue script (#7917 ) * Start simplification * More progress * Finished script * Address comments and update tests instructions * Wrong test * Accept files as inputs and fix test * Update src/transformers/trainer_utils.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Fix labels and add combined score * Add special labels * Update TPU command * Revert to old label strategy * Use model labels * Fix for STT-B * Styling * Apply suggestions from code review Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Code styling * Fix review comments Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-10-22 11:42:22 -04:00
wlhgtc	a16e568f22	# Add whole word mask support for lm fine-tune (#7925 ) * ADD: add whole word mask proxy for both eng and chinese * MOD: adjust format * MOD: reformat code * MOD: update import * MOD: fix bug * MOD: add import * MOD: fix bug * MOD: decouple code and update readme * MOD: reformat code * Update examples/language-modeling/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * change wwm to whole_word_mask * reformat code * reformat * format * Code quality * ADD: update chinese ref readme * MOD: small changes * MOD: small changes2 * update readme Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2020-10-22 09:19:00 -04:00
Stas Bekman	8b38173398	[seq2seq testing] multigpu test run via subprocess (#7281 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-10-21 17:20:53 -04:00
Stas Bekman	0e24e4c136	[s2s] create doc for pegasus/fsmt replication (#7934 )	2020-10-20 15:07:52 -04:00
Stas Bekman	3e31e7f956	[testing] rename skip targets + docs (#7863 ) * rename skip targets + docs * fix quotes * style * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * small improvements * fix Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-20 04:39:13 -04:00
Quentin Lhoest	033f29c625	Allow Custom Dataset in RAG Retriever (#7763 ) * add CustomHFIndex * typo in config * update tests * add custom dataset example * clean script * update test data * minor in test * docs * docs * style * fix imports * allow to pass the indexed dataset directly * update tests * use multiset DPR * address thom and patrick's comments * style * update dpr tokenizer * add output_dir flag in use_own_knowledge_dataset.py * allow custom datasets in examples/rag/finetune.py * add test for custom dataset in distributed rag retriever	2020-10-19 19:42:45 +02:00
Thomas Wolf	ba8c4d0ac0	[Dependencies\|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659 ) * splitting fast and slow tokenizers [WIP] * [WIP] splitting sentencepiece and tokenizers dependencies * update dummy objects * add name_or_path to models and tokenizers * prefix added to file names * prefix * styling + quality * spliting all the tokenizer files - sorting sentencepiece based ones * update tokenizer version up to 0.9.0 * remove hard dependency on sentencepiece 🎉 * and removed hard dependency on tokenizers 🎉 * update conversion script * update missing models * fixing tests * move test_tokenization_fast to main tokenization tests - fix bugs * bump up tokenizers * fix bert_generation * update ad fix several tokenizers * keep sentencepiece in deps for now * fix funnel and deberta tests * fix fsmt * fix marian tests * fix layoutlm * fix squeezebert and gpt2 * fix T5 tokenization * fix xlnet tests * style * fix mbart * bump up tokenizers to 0.9.2 * fix model tests * fix tf models * fix seq2seq examples * fix tests without sentencepiece * fix slow => fast conversion without sentencepiece * update auto and bert generation tests * fix mbart tests * fix auto and common test without tokenizers * fix tests without tokenizers * clean up tests lighten up when tokenizers + sentencepiece are both off * style quality and tests fixing * add sentencepiece to doc/examples reqs * leave sentencepiece on for now * style quality split hebert and fix pegasus * WIP Herbert fast * add sample_text_no_unicode and fix hebert tokenization * skip FSMT example test for now * fix style * fix fsmt in example tests * update following Lysandre and Sylvain's comments * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: 
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-10-18 20:51:24 +02:00
Stas Bekman	9f7b2b2432	[s2s testing] turn all to unittests, use auto-delete temp dirs (#7859 )	2020-10-17 14:33:21 -04:00
Stas Bekman	1652ddad35	[seq2seq testing] improve readability (#7845 )	2020-10-16 09:05:29 -04:00
Quentin Lhoest	466115b279	Fix missing reference titles in retrieval evaluation of RAG (#7817 )	2020-10-16 10:15:49 +02:00
Stas Bekman	464b53f5e4	[testing] disable FutureWarning in examples tests (#7842 ) * [testing] disable FutureWarning in examples tests same as tests/conftest.py, we can't resolve those warning, so turn the noise off. * fix	2020-10-16 03:35:39 -04:00
Sam Shleifer	96e47d9229	[cleanup] assign todos, faster bart-cnn test (#7835 ) * 2 beam output * unassign/remove TODOs * remove one more	2020-10-16 03:11:18 -04:00
Stas Bekman	2255c2c7a0	[seq2seq] get_git_info fails gracefully (#7843 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-10-16 00:22:43 -04:00
Lysandre	2485b8b0ac	Set XLA example time to 500s	2020-10-15 12:34:29 +02:00
Sylvain Gugger	bb9559a7f9	Don't use `store_xxx` on optional bools (#7786 ) * Don't use `store_xxx` on optional bools * Refine test * Refine test	2020-10-14 12:05:02 -04:00
Sylvain Gugger	a1d1b332d0	Add predict step accumulation (#7767 ) * Add eval_accumulation_step and clean distributed eval * Add TPU test * Add TPU stuff * Fix arg name * Fix Seq2SeqTrainer * Fix total_size * Update src/transformers/trainer_pt_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Doc and add test to TPU * Add unit test * Adapt name Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-14 11:41:45 -04:00
Sam Shleifer	8feb0cc967	fix examples/rag imports, tests (#7712 )	2020-10-14 11:35:00 -04:00
Tiger	7e73c12805	fixed lots of typos. (#7758 )	2020-10-13 10:00:20 -04:00
Sam Shleifer	9c2b2db2cd	[marian] Automate Tatoeba-Challenge conversion (#7709 )	2020-10-12 12:24:25 -04:00
Julien Plu	d9ffb87efb	Fix tf text class (#7724 ) * Fix test * fix generic text classification * fix test * Fix tests	2020-10-12 08:45:15 -04:00
sgugger	d6175a4268	Fix code quality	2020-10-12 08:22:27 -04:00
Kelvin	f176e70723	The input training data files (multiple files in glob format). (#7717 ) Very often splitting large files to smaller files can prevent tokenizer going out of memory in environment like Colab that does not have swap memory	2020-10-12 07:44:02 -04:00
Sam Shleifer	827c519494	[examples] bump pl=0.9.0 (#7053 )	2020-10-11 16:39:38 -04:00
Julien Plu	9ad830596d	Fix dataset cardinality (#7678 ) * Fix test * Fix cardinality issue * Fix test	2020-10-09 10:38:25 -04:00
Sam Shleifer	297233fa92	[s2s] Switch README urls to cdn (#7670 )	2020-10-08 21:22:22 -04:00
Sam Shleifer	a1ecc90d6b	[pseudo] Switch URLS to CDN (#7661 )	2020-10-08 14:12:39 -04:00
Suraj Patil	06a973fd2a	[s2s] configure lr_scheduler from command line (#7641 )	2020-10-08 13:06:35 -04:00
Sam Shleifer	aba4e22944	[pseudolabels] cleanup markdown table (#7653 )	2020-10-07 23:04:18 -04:00
Sam Shleifer	e2bb9abb6a	[s2s] release pseudolabel links and instructions (#7639 )	2020-10-07 11:20:44 -04:00
Sylvain Gugger	08ba4b4902	Trainer callbacks (#7596 ) * Initial callback proposal * Finish various callbacks * Post-rebase conflicts * Fix tests * Don't use something that's not set * Documentation * Remove unwanted print. * Document all models can work * Add tests + small fixes * Update docs/source/internal/trainer_utils.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * Fix TF tests * Real fix this time * This one should work * Fix typo * Really fix typo Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-07 10:50:21 -04:00
Sam Shleifer	500be01c5d	[s2s] save first batch to json for debugging purposes (#6810 )	2020-10-06 16:11:56 -04:00
Sam Shleifer	d5d2744aa7	Support T5 Distillation w/hidden state supervision (#7599 )	2020-10-05 21:31:48 -04:00
Suraj Patil	99cb924bfb	[s2s] add config params like Dropout in Seq2SeqTrainingArguments (#7532 )	2020-10-04 12:42:30 -04:00
Sam Shleifer	9bdce3a4f9	[s2s] fix lockfile and peg distillation constants (#7545 )	2020-10-02 15:58:14 -04:00

1573 Commits