* Update the README of the text classification example
* Update examples/README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Adapt comment from review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add new run_swag example
* Add check
* Add sample
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Very important change to make Lysandre happy
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* trainer and finetune_trainer enhancements and fixes
* add fallback default
* move the fixing of incorrect keys back into finetune trainer
* s/eval/val/ to match the split
* trainer can now use a different prefix than eval_ for metrics (usage sketch after this list)
* document new arg
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* use 'eval' as the default for metric_key_prefix
* complete adjust var names + disambiguate
* fix logger
* add clarifying comment
* add clarifying comment
* style
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/trainer.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* complete removal of optional for metric_key_prefix
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
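The new `metric_key_prefix` argument above can be exercised directly. A minimal usage sketch, assuming a `trainer` and a `val_dataset` have already been built:

```python
# the prefix defaults to "eval", so metrics come back as eval_loss, eval_..., etc.
metrics = trainer.evaluate(eval_dataset=val_dataset, metric_key_prefix="val")
# with the override, the same metrics are reported as val_loss, val_..., etc.
```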
* Reorganize example folder
* Continue reorganization
* Change requirements for tests
* Final cleanup
* Finish regroup with tests all passing
* Copyright
* Requirements and readme
* Make a full link for the documentation
* Address review comments
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Add symlink
* Reorg again
* Apply suggestions from code review
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Adapt title
* Update to new structure
* Remove test
* Update READMEs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Remove "Model" suffix from Flax models to look more 🤗
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Initial working (forward + backward) for Flax MLM training example.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Simplify code
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Addressing comments, using module and moving to LM task.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Restore parameter name "module" that was wrongly renamed to "model".
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Restore correct output ordering...
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Actually commit the example 😅
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Add FlaxBertModelForMaskedLM after rebasing.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make it possible to initialize the training from scratch
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Reuse flax linen example of cross entropy loss
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
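For reference, the cross-entropy formulation borrowed from the flax linen examples looks roughly like this (a sketch, not the example's exact code):

```python
import jax
import jax.numpy as jnp

def cross_entropy_loss(logits, labels):
    # batch mean of -log p(label), computed against one-hot targets
    log_probs = jax.nn.log_softmax(logits)
    one_hot = jax.nn.one_hot(labels, num_classes=logits.shape[-1])
    return -jnp.mean(jnp.sum(one_hot * log_probs, axis=-1))
```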
* Added specific data collator for flax
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove todo for data collator
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Added evaluation step
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Added ability to provide dtype to support bfloat16 on TPU
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Enable flax tensorboard output
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Enable jax.pmap support.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Ensure batches are correctly sized to be dispatched with jax.pmap
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
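The sizing constraint is the usual one for `jax.pmap`: the mapped axis must equal the local device count. A sketch of the reshaping (the `shard` helper name is illustrative):

```python
import jax

def shard(batch):
    # split the global batch across local devices; jax.pmap maps over axis 0
    n_devices = jax.local_device_count()

    def _reshape(x):
        # trim or pad upstream so this always divides evenly
        assert x.shape[0] % n_devices == 0, "batch size must be divisible by the device count"
        return x.reshape((n_devices, -1) + x.shape[1:])

    return jax.tree_util.tree_map(_reshape, batch)
```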
* Enable bfloat16 with --fp16 cmdline args
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Correctly export metrics to tensorboard
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Added dropout and ability to use it.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Effectively enable & disable dropout during training and evaluation steps.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Oops.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Enable specifying kernel initializer scale
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Added warmup step to the learning rate scheduler.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix typo.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Print training loss
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Make style
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* fix linter issue (flake8)
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix model matching
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix dummies
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix non default dtype on Flax models
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use the same create_position_ids_from_input_ids for FlaxRoberta
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
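`create_position_ids_from_input_ids` follows the fairseq/RoBERTa convention: padding tokens keep `padding_idx` as their position, and real tokens count up from `padding_idx + 1`. A sketch of the shared logic:

```python
import jax.numpy as jnp

def create_position_ids_from_input_ids(input_ids, padding_idx):
    # non-pad tokens get consecutive positions starting at padding_idx + 1;
    # pad tokens all map back to padding_idx
    mask = (input_ids != padding_idx).astype("i4")
    return jnp.cumsum(mask, axis=-1) * mask + padding_idx
```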
* Make Roberta attention match Bert's
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* fix copy
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Wording.
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
* Add new SQUAD example
* Same with a task-specific Trainer
* Address review comment.
* Small fixes
* Initial work for XLNet
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Final clean up and working XLNet script
* Test and debug
* Final working version
* Add tick
* Update README
* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Without this fix, training a `BartForSequenceClassification` model with `run_pl_glue.py` fails with `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not use token_type_ids. I solved this the same way it is already handled for DistilBERT, and I can now train BART models on SNLI without errors.
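Roughly, the guard in `run_pl_glue.py` goes from special-casing only `distilbert` to also skipping `bart` (a sketch of the relevant lines, not a verbatim diff):

```python
inputs = {"input_ids": batch[0], "attention_mask": batch[1], "labels": batch[3]}

# BART and DistilBERT take no token_type_ids, so never pass the key for them
if self.config.model_type not in ["distilbert", "bart"]:
    inputs["token_type_ids"] = (
        batch[2] if self.config.model_type in ["bert", "xlnet", "albert"] else None
    )

outputs = self(**inputs)
```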
* fix DP case on multi-gpu (mode check sketched after this list)
* make executable
* test all 3 modes
* use the correct check for distributed
* dp doesn't need a special case
* restore original name
* cleanup
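The "correct check for distributed" boils down to: DDP runs one process per GPU and is signalled by `local_rank`, while DP is a single process driving several GPUs and needs no special case. A sketch (function and argument names assumed):

```python
import torch

def training_mode(local_rank: int) -> str:
    # torch.distributed launchers set local_rank >= 0; otherwise it stays -1
    if local_rank != -1:
        return "ddp"     # one process per GPU
    if torch.cuda.device_count() > 1:
        return "dp"      # one process, nn.DataParallel over all visible GPUs
    return "single"
```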
* implement support for run-time dependency version checking (sketch after this list)
* try not escaping !
* use findall that works on py36
* small tweaks
* autoformatter worship
* simplify
* shorter names
* add support for non-versioned checks
* add deps
* revert
* tokenizers not required, check version only if installed
* make a proper distutils cmd and add make target
* tqdm must be checked before tokenizers
* work around the peculiar DistributionNotFound setup
* handle the rest of packages in setup.py
* fully sync setup.py's install_requires - to check them all
* nit
* make install_requires more readable
* typo
* Update setup.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* restyle
* add types
* simplify
* simplify2
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
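A minimal sketch of what the run-time check can look like, assuming Python 3.8+ for `importlib.metadata` (the helper name is assumed and the real implementation differs in details):

```python
import re
from importlib.metadata import version, PackageNotFoundError
from packaging.version import parse

OPS = {">=": lambda a, b: a >= b, "<=": lambda a, b: a <= b, "==": lambda a, b: a == b,
       "!=": lambda a, b: a != b, ">": lambda a, b: a > b, "<": lambda a, b: a < b}

def require_version(requirement: str) -> None:
    # split "tokenizers>=0.9.4" into (package, op, wanted); re.findall also works on py36
    match = re.findall(r"^([^!=<>\s]+)([!=<>]{1,2})(.+)$", requirement)
    pkg, op, want = match[0] if match else (requirement, None, None)
    try:
        got = version(pkg)
    except PackageNotFoundError:
        raise ImportError(f"{requirement} is required but {pkg} is not installed")
    # non-versioned checks stop at "is it installed?"; versioned ones compare
    if op is not None and not OPS[op](parse(got), parse(want)):
        raise ImportError(f"{requirement} is required, but {pkg}=={got} is installed")
```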
* Support BERT relative position embeddings (usage sketch after this list)
* Fix typo in README.md
* Address review comment
* Fix failing tests
* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py
* make fix copies
* fix configs of electra and albert and fix longformer
* remove copy statement from longformer
* fix albert
* fix electra
* Add bert variants forward tests for various position embeddings
* [tiny] Fix style for test_modeling_bert.py
* improve docstring
* [tiny] improve docstring and remove unnecessary dependency
* [tiny] Remove unused import
* re-add to ALBERT
* make embeddings work for ALBERT
* add test for albert
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
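The feature is switched on through the config: `position_embedding_type` accepts `"absolute"` (the default), `"relative_key"` (Shaw et al., 2018) or `"relative_key_query"` (Huang et al., 2020), and per the commits above the same switch was carried over to ALBERT and ELECTRA:

```python
from transformers import BertConfig, BertModel

# relative position embeddings instead of the default learned absolute ones
config = BertConfig(position_embedding_type="relative_key")
model = BertModel(config)
```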
* replace init_ddp_connection for index init
* style
* add finetune test
* add test data
* move generate tensors to device
* add test on EM metric (EM sketched at the end of this list)
* style
* allow multi process test
* keep gloo process group for retrieval
* add multi-gpu test
* use custom accelerator
* clean test finetune
* minor
* style
* style
* typo
* use a python call instead of the imported main function
* return_dict fix in modeling_rag
* use float32 in retrieval
* store as float32 as well in the custom knowledge dataset example
* style
* rename to finetune_rag
* style
* update readme
* rename utils and callbacks to utils_rag and callbacks_rag
* fix test
* patrick's comments
* generate dummy data in the finetune test script
* remove dummy data files
* style
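For the EM test mentioned above, exact match is the standard open-domain QA metric: 1.0 when the normalized prediction equals the normalized reference, else 0.0. A sketch with SQuAD-style normalization (rules vary by task):

```python
import re
import string

def normalize_answer(s: str) -> str:
    # lowercase, drop punctuation and articles, collapse whitespace
    s = "".join(ch for ch in s.lower() if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match_score(prediction: str, ground_truth: str) -> float:
    return float(normalize_answer(prediction) == normalize_answer(ground_truth))
```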