transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-29 01:02:25 +06:00

Author	SHA1	Message	Date
Wissam Antoun	fd7b6a5274	fixed JSON error in run_qa with fp16 (#9186 )	2020-12-18 07:53:23 -05:00
Manuel Romero	66a14a2f6f	Fix link to old NER fine-tuning script (#9182 )	2020-12-17 19:50:01 -05:00
Stas Bekman	f06d0fadc9	[trainer] apex fixes and tests (#9180 )	2020-12-17 16:49:11 -08:00
Stas Bekman	63841c559b	add tests for the new sharded ddp fairscale integration (#9177 )	2020-12-17 14:24:03 -08:00
Sylvain Gugger	9a67185344	Experimental support for fairscale ShardedDDP (#9139 ) * Experimental stupport for fairscale ShardedDDP * Add import error if fairscale not available * Address review comments * Fix seq2seq trainer	2020-12-16 13:47:48 -05:00
Sylvain Gugger	4d48973523	Update notebook table and transformers intro notebook (#9136 )	2020-12-16 10:24:31 -05:00
Patrick von Platen	640e6fe190	[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054 ) * save intermediate * save intermediate * save intermediate * correct flax bert model file * new module / model naming * make style * almost finish BERT * finish roberta * make fix-copies * delete keys file * last refactor * fixes in run_mlm_flax.py * remove pooled from run_mlm_flax.py` * fix gelu \| gelu_new * remove Module from inits * splits * dirty print * preventing warmup_steps == 0 * smaller splits * make fix-copies * dirty print * dirty print * initial_evaluation argument * declaration order fix * proper model initialization/loading * proper initialization * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug * removed tokenizers warning hack, fixed model re-initialization * reverted training_args.py changes * fix flax from pretrained * improve test in flax * apply sylvains tips * update init * make 0.3.0 compatible * revert tevens changes * revert tevens changes 2 * finalize revert * fix bug * add docs * add pretrained to init * Update src/transformers/modeling_flax_utils.py * fix copies * final improvements Co-authored-by: TevenLeScao <teven.lescao@gmail.com>	2020-12-16 13:03:32 +01:00
Teven	2a7e8e1608	[Examples] Add automatic dataset splitting in language-modeling examples (#9133 ) * replaced jnp.split + removing textual model inputs + ensuring warmup_steps > 0 * Add automatic dataset splitting in language-modeling examples	2020-12-15 16:02:43 -05:00
Stas Bekman	14c79c3e31	native amp leak fix landed in 1.7.1 (#9115 ) update README with good news that the leak fix has been applied to pytorch-1.7.1.	2020-12-15 09:10:41 -05:00
Yoshitomo Matsubara	44c340f45f	fix a bug in eval_batch_retrieval (#9089 )	2020-12-15 14:46:55 +01:00
Stas Bekman	c19d04623e	[finetune_trainer] enhancements and fixes (#9042 ) * trainer and finetune_trainer enhancements and fixes * add fallback default * move the fixing of incorrect keys back into finetune trainer * s/eval/val/ to match the split * trainer can now use a different prefix than eval_ for metrics * document new arg * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * use 'eval' as the default for metric_key_prefix * complete adjust var names + disambiguate * fix logger * add clarifying comment * add clarifying comment * style * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/trainer.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * complete removal of optional for metric_key_prefix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-14 17:45:33 -08:00
Sylvain Gugger	29e4597950	Fix min_null_pred in the run_qa script (#9067 )	2020-12-11 16:26:05 -05:00
dependabot[bot]	24f6cdeab6	Bump notebook in /examples/research_projects/movement-pruning/lxmert (#9062 ) Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5. - [Release notes](https://github.com/jupyter/jupyterhub/releases) - [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md) - [Commits](https://github.com/jupyter/jupyterhub/commits) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2020-12-11 10:32:43 -05:00
Sylvain Gugger	783d7d2629	Reorganize examples (#9010 ) * Reorganize example folder * Continue reorganization * Change requirements for tests * Final cleanup * Finish regroup with tests all passing * Copyright * Requirements and readme * Make a full link for the documentation * Address review comments * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Add symlink * Reorg again * Apply suggestions from code review Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Adapt title * Update to new strucutre * Remove test * Update READMEs Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-12-11 10:07:02 -05:00
NatLun137	91ab02af28	Fix typo #9012 (#1 ) (#9038 ) There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)	2020-12-10 16:41:00 -05:00
Funtowicz Morgan	75627148ee	Flax Masked Language Modeling training example (#8728 ) * Remove "Model" suffix from Flax models to look more 🤗 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Initial working (forward + backward) for Flax MLM training example. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Simply code Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Addressing comments, using module and moving to LM task. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Restore parameter name "module" wrongly renamed model. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Restore correct output ordering... Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Actually commit the example 😅 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Add FlaxBertModelForMaskedLM after rebasing. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Make it possible to initialize the training from scratch Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Reuse flax linen example of cross entropy loss Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added specific data collator for flax Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Remove todo for data collator Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added evaluation step Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added ability to provide dtype to support bfloat16 on TPU Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable flax tensorboard output Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable jax.pmap support. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Ensure batches are correctly sized to be dispatched with jax.pmap Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable bfloat16 with --fp16 cmdline args Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Correctly export metrics to tensorboard Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added dropout and ability to use it. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Effectively enable & disable during training and evaluation steps. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Oops. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Enable specifying kernel initializer scale Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Added warmup step to the learning rate scheduler. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix typo. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Print training loss Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Make style Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * fix linter issue (flake8) Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix model matching Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix dummies Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Fix non default dtype on Flax models Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Use the same create_position_ids_from_input_ids for FlaxRoberta Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Make Roberta attention as Bert Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * fix copy Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Wording. Co-authored-by: Marc van Zee <marcvanzee@gmail.com> Co-authored-by: Marc van Zee <marcvanzee@gmail.com>	2020-12-09 17:13:56 +01:00
Sylvain Gugger	447808c85f	New squad example (#8992 ) * Add new SQUAD example * Same with a task-specific Trainer * Address review comment. * Small fixes * Initial work for XLNet * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Final clean up and working XLNet script * Test and debug * Final working version * Add new SQUAD example * Same with a task-specific Trainer * Address review comment. * Small fixes * Initial work for XLNet * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Final clean up and working XLNet script * Test and debug * Final working version * Add tick * Update README * Address review comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-12-08 14:39:29 -05:00
Sylvain Gugger	00aa9dbca2	Copyright (#8970 ) * Add copyright everywhere missing * Style	2020-12-07 18:36:34 -05:00
Sylvain Gugger	62d30e0583	Small fix to the run clm script (#8973 )	2020-12-07 17:32:09 -05:00
Sylvain Gugger	7f9ccffc5b	Use word_ids to get labels in run_ner (#8962 ) * Use word_ids to get labels in run_ner * Add sanity check	2020-12-07 14:26:36 -05:00
Ethan Perez	8dfc8c7221	Don't pass in token_type_ids to BART for GLUE (#8929 ) Without this fix, training a `BARTForSequenceClassification` model with `run_pl_glue.py` gives `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not have token_type_ids. I've solved this issue in the same way as it's solved for the "distilbert" model, and I can train BART models on SNLI without errors now.	2020-12-05 09:52:16 -05:00
Stas Bekman	df311a5ccf	[seq2seq] document the caveat of leaky native amp (#8930 ) * document the caveat of leaky native amp * Update examples/seq2seq/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-12-04 15:43:35 -08:00
Stas Bekman	4c3d98dddc	[s2s finetune_trainer] add instructions for distributed training (#8884 )	2020-12-03 16:05:55 -08:00
Stas Bekman	379005c9d2	start using training_args.parallel_mode (#8882 )	2020-12-01 11:40:36 -08:00
Stas Bekman	7f34d75780	[s2s trainer] fix DP mode (#8823 ) * fix DP case on multi-gpu * make executable * test all 3 modes * use the correct check for distributed * dp doesn't need a special case * restore original name * cleanup	2020-11-30 12:55:56 -08:00
Sylvain Gugger	5530299096	Remove deprecated `evalutate_during_training` (#8852 ) * Remove deprecated `evalutate_during_training` * Update src/transformers/training_args_tf.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-30 11:12:15 -05:00
Stefan Schweter	19fa01ce2a	token-classification: use is_world_process_zero instead of deprecated is_world_master() (#8828 )	2020-11-30 09:21:56 -05:00
Stas Bekman	ddf3c64654	potpurri of small fixes (#8807 )	2020-11-26 14:06:27 -08:00
chutaklee	52708d2637	Fix PPLM (#8779 ) * Fix pplm * fix style * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-26 22:23:36 +01:00
Patrick von Platen	8f07f5c44b	Revert "finetune.py: specifying generation min_length (#8478 )" (#8805 ) This reverts commit `5aa361f3e5`.	2020-11-26 20:12:01 +01:00
Daniel Khashabi	5aa361f3e5	finetune.py: specifying generation min_length (#8478 )	2020-11-26 12:33:02 +05:30
Stas Bekman	82d443a7fd	[core] implement support for run-time dependency version checking (#8645 ) * implement support for run-time dependency version checking * try not escaping ! * use findall that works on py36 * small tweaks * autoformatter worship * simplify * shorter names * add support for non-versioned checks * add deps * revert * tokenizers not required, check version only if installed * make a proper distutils cmd and add make target * tqdm must be checked before tokenizers * workaround the DistributionNotFound peculiar setup * handle the rest of packages in setup.py * fully sync setup.py's install_requires - to check them all * nit * make install_requires more readable * typo * Update setup.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * restyle * add types * simplify * simplify2 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-11-24 13:22:25 -05:00
Quentin Lhoest	a7d73cfdd4	fix rag index names in eval_rag.py example (#8730 )	2020-11-24 17:04:47 +01:00
zhiheng-huang	2c83b3c38d	Support various BERT relative position embeddings (2nd) (#8276 ) * Support BERT relative position embeddings * Fix typo in README.md * Address review comment * Fix failing tests * [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py * make fix copies * fix configs of electra and albert and fix longformer * remove copy statement from longformer * fix albert * fix electra * Add bert variants forward tests for various position embeddings * [tiny] Fix style for test_modeling_bert.py * improve docstring * [tiny] improve docstring and remove unnecessary dependency * [tiny] Remove unused import * re-add to ALBERT * make embeddings work for ALBERT * add test for albert Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-11-24 14:40:53 +01:00
Sylvain Gugger	367f497dec	Fix max length in run_plm script (#8738 )	2020-11-23 16:02:31 -05:00
Stas Bekman	1e45bef0a7	[trainer] make generate work with multigpu (#8716 ) * make generate work with multigpu * better fix - thanks @sgugger	2020-11-23 10:57:27 -08:00
Santiago Castro	e1f3156b21	Fix many typos (#8708 )	2020-11-21 22:58:10 -05:00
Quentin Lhoest	8062fa63c5	Fix rag finetuning + add finetuning test (#8585 ) * replace init_ddp_connection for index init * style * add finetune test * add test data * move generate tensors to device * add test on EM metric * style * allow multi process test * keep gloo process group for retrieval * add multi-gpu test * use custom accelerator * clean test finetune * minor * style * style * typo * use python call instead of imported main fumction * return_dict fix in modeling_rag * use float32 in retrieval * store as float32 as well in the custom knowledge dataset example * style * rename to finetune_rag * style * update readme * rename utils and callbacks to utils_rag and callbacks_rag * fix test * patrick's comments * generate dummy data in the finetue test script * remove dummy data files * style	2020-11-20 19:05:03 +01:00
Stas Bekman	0ad45e108d	[examples/seq2seq] fix PL deprecation warning (#8577 ) * fix deprecation warning * fix	2020-11-19 21:46:04 +01:00
Sylvain Gugger	20b658607e	Fix run_ner script (#8664 ) * Fix run_ner script * Pin datasets	2020-11-19 13:59:30 -05:00
Sylvain Gugger	cb3e5c33f7	Fix a few last paths for the new repo org (#8666 )	2020-11-19 11:56:42 -05:00
Matthias	a79a96ddaa	fix small typo (#8644 ) Fixed a small typo on the XLNet and permutation language modelling section	2020-11-19 11:24:11 -05:00
Sylvain Gugger	4208f496ee	Better filtering of the model outputs in Trainer (#8633 ) * Better filtering of the model outputs in Trainer * Fix examples tests * Add test for Lysandre	2020-11-19 10:43:15 -05:00
Quentin Lhoest	62cd9ce9f8	fix missing return dict (#8653 )	2020-11-19 15:17:18 +01:00
Tim Isbister	28d16e7ac5	Update README.md (#8635 )	2020-11-18 18:35:23 -05:00
Stas Bekman	d86d57faa3	[s2s] distillation apex breaks return_dict obj (#8631 ) * apex breaks return_dict obj * style	2020-11-18 12:51:29 -08:00
Sylvain Gugger	a0c62d2493	Fix training from scratch in new scripts (#8623 )	2020-11-18 12:15:26 -05:00
Stas Bekman	cdf1b7ae82	fix to adjust for #8530 changes (#8612 )	2020-11-18 10:25:00 -05:00
Stas Bekman	2819da02f7	[s2s] broken test (#8613 )	2020-11-18 10:15:53 -05:00
Sylvain Gugger	dd52804f5f	Remove deprecated (#8604 ) * Remove old deprecated arguments Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr> * Remove needless imports * Fix tests Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-11-17 15:11:29 -05:00
Stas Bekman	f0435f5a61	these should run fine on multi-gpu (#8582 )	2020-11-17 14:00:41 -05:00
Julien Chaumond	042a6aa777	Tokenizers: ability to load from model subfolder (#8586 ) * <small>tiny typo</small> * Tokenizers: ability to load from model subfolder * use subfolder for local files as well * Uniformize model shortcut name => model id * from s3 => from huggingface.co Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>	2020-11-17 08:58:45 -05:00
Sylvain Gugger	c89bdfbe72	Reorganize repo (#8580 ) * Put models in subfolders * Styling * Fix imports in tests * More fixes in test imports * Sneaky hidden imports * Fix imports in doc files * More sneaky imports * Finish fixing tests * Fix examples * Fix path for copies * More fixes for examples * Fix dummy files * More fixes for example * More model import fixes * Is this why you're unhappy GitHub? * Fix imports in conver command	2020-11-16 21:43:42 -05:00
Sylvain Gugger	1073a2bde5	Switch `return_dict` to `True` by default. (#8530 ) * Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Run on the real suite * Fix slow tests	2020-11-16 11:43:00 -05:00
Thomas Wolf	f4e04cd2c6	[breaking\|pipelines\|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073 ) * Fixing roberta for slow-fast tests * WIP getting equivalence on pipelines * slow-to-fast equivalence - working on question-answering pipeline * optional FAISS tests * Pipeline Q&A * Move pipeline tests to their own test job again * update tokenizer to add sequence id methods * update to tokenizers 0.9.4 * set sentencepiecce as optional * clean up squad * clean up pipelines to use sequence_ids * style/quality * wording * Switch to use_fast = True by default * update tests for use_fast at True by default * fix rag tokenizer test * removing protobuf from required dependencies * fix NER test for use_fast = True by default * fixing example tests (Q&A examples use slow tokenizers for now) * protobuf in main deps extras["sentencepiece"] and example deps * fix protobug install test * try to fix seq2seq by switching to slow tokenizers for now * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-15 22:50:59 +01:00
Julien Plu	27b3ff316a	Try to understand and apply Sylvain's comments (#8458 )	2020-11-12 13:43:00 -05:00
zeyuyun1	924c624a46	quick fix on concatenating text to support more datasets (#8474 )	2020-11-12 09:47:08 -05:00
Sumithra Bhakthavatsalam	81ebd70671	[s2s] distill t5-large -> t5-small (#8376 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-11 17:58:45 -05:00
sarnoult	a38d1c7c31	Example NER script predicts on tokenized dataset (#8468 ) The new run_ner.py script tries to run prediction on the input test set `datasets["test"]`, but it should be the tokenized set `tokenized_datasets["test"]`	2020-11-11 10:28:23 -05:00
Stas Bekman	02bdfc0251	using multi_gpu consistently (#8446 ) * s\|multiple_gpu\|multi_gpu\|g; s\|multigpu\|multi_gpu\|g' * doc	2020-11-10 13:23:58 -05:00
Stas Bekman	5d4972e608	[examples] better PL version check (#8429 )	2020-11-10 09:33:23 -05:00
Shichao Sun	ae1cb4ec22	[s2s/distill] hparams.tokenizer_name = hparams.teacher (#8382 )	2020-11-10 09:32:01 -05:00
Julien Chaumond	55e8d0cea2	Update links from s3 to huggingface.co	2020-11-10 14:03:29 +01:00
Stas Bekman	190df58560	[github CI] add a multi-gpu job for all example tests (#8341 ) * add a multi-gpu job for all example tests * run only ported tests * rename * explain why env is re-activated on each step * mark all unported/checked tests with @require_torch_non_multigpu_but_fix_me * style * Apply suggestions from code review Co-authored-by: Sam Shleifer <sshleifer@gmail.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-09 15:47:38 -05:00
Patrick von Platen	9c83b96e62	[Tests] Add Common Test for Training + Fix a couple of bugs (#8415 ) * add training tests * correct longformer * fix docs * fix some tests * fix some more train tests * remove ipdb * fix multiple edge case model training * fix funnel and prophetnet * clean gpt models * undo renaming of albert	2020-11-09 18:24:41 +01:00
Sylvain Gugger	5c766ecb50	Fix typo	2020-11-09 11:50:51 -05:00
Sylvain Gugger	908a28894c	Add new token classification example (#8340 ) * Add new token classification example * Remove txt file * Add test * With actual testing done * Less warmup is better * Update examples/token-classification/run_ner_new.py Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Address review comments * Fix test * Make Lysandre happy * Last touches and rename * Rename in tests * Address review comments * More run_ner -> run_ner_old Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-11-09 11:39:55 -05:00
Sam Shleifer	ebde57acac	examples/docs: caveat that PL examples don't work on TPU (#8309 )	2020-11-09 08:55:22 -05:00
Sam Shleifer	e6d9cdaafe	[s2s/distill] remove run_distiller.sh, fix xsum script (#8412 )	2020-11-08 16:57:43 -05:00
Stas Bekman	66582492d3	[s2s test_finetune_trainer] failing multigpu test (#8400 )	2020-11-08 16:45:40 -05:00
Stas Bekman	f62755a600	[s2s examples test] fix data path (#8398 )	2020-11-08 16:44:18 -05:00
Jonathan Chang	5807ba3fa9	Fix typo (#8351 )	2020-11-06 11:19:41 -05:00
Stas Bekman	9edafaebef	[s2s] test_bash_script.py - actually learn something (#8318 ) * use decorator * remove hardcoded paths * make the test use more data and do real quality tests * shave off 10 secs * add --eval_beams 2, reformat * reduce train size, use smaller custom dataset	2020-11-05 23:15:14 -05:00
Leandro von Werra	17450397a7	Docs bart training ref (#8330 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-05 17:20:57 -05:00
Stas Bekman	d787935a14	[s2s] test_distributed_eval (#8315 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-05 16:01:15 -05:00
Sam Shleifer	7abc1d96d1	no warn (#8329 )	2020-11-05 11:42:24 -05:00
Bobby Donchev	52f44dd6d2	change TokenClassificationTask class methods to static methods (#7902 ) * change TokenClassificationTask class methods to static methods Since we do not require self in the class methods of TokenClassificationTask we should probably switch to static methods. Also, since the class TokenClassificationTask does not contain a constructor it is currently unusable as is. By switching to static methods this fixes the issue of having to document the intent of the broken class. Also, since the get_labels and read_examples_from_file methods are ought to be implemented. Static method definitions are unchanged even after inheritance, which means that it can be overridden, similar to other class methods. * Trigger Build Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-11-05 09:38:30 -05:00
Guillem García Subies	77c8f6c627	Corrected typo in readme (#8320 )	2020-11-05 07:48:36 -05:00
Sylvain Gugger	9c4aa4ac1a	Clean up data collators and datasets (#8308 ) * Clean up data collators and datasets * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Remove needless clone Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-04 17:24:49 -05:00
Manuel Romero	b1d3e95eb5	Fix path to old run_language_modeling.py script (#8302 )	2020-11-04 13:17:57 -05:00
Sylvain Gugger	cf89724696	Fix validation file loading in scripts (#8298 )	2020-11-04 10:42:18 -05:00
Pengzhi Gao	734afa37f6	Fix typo in language-modeling README.md (#8287 )	2020-11-04 09:38:02 -05:00
Stas Bekman	1bb4bba53c	[CIs] Better reports everywhere (#8275 ) * make it possible to invoke testconf.py in both test suites without crashing on having the same option added * perl -pi -e 's\|--make_reports\|--make-reports\|' to be consistent with other opts * add `pytest --make-reports` to all CIs (and artifacts) * fix	2020-11-03 16:57:12 -05:00
Patrick von Platen	068e6b5edd	make files independent (#8267 )	2020-11-03 21:13:33 +01:00
Stas Bekman	cd360dcb26	[examples] minimal version requirement run-time check in PL (#8133 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-11-03 13:17:11 -05:00
Lysandre	eb6313e823	Fix Tatoeba skip	2020-11-03 10:35:00 -05:00
Sam Shleifer	b63beb743c	Skip tatoeba tests if Tatoeba-Challenge not cloned (#8260 )	2020-11-03 09:49:29 -05:00
Patrick von Platen	9f1747f999	[Seq2Seq] Correct import in Seq2Seq Trainer (#8254 )	2020-11-03 07:56:41 -05:00
Sylvain Gugger	e1b1b614b1	Add line by line option to mlm/plm scripts (#8240 ) * Make line by line optional in run_mlm * Add option to disable dynamic padding * Add option to plm too and update README * Typos * More typos * Even more typos * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-11-02 12:27:04 -05:00
Patrick von Platen	9bd30f7cf4	[Seq2SeqTrainer] Move import to init to make file self-contained (#8194 ) * boom boom * reverse order	2020-11-01 23:31:55 +01:00
Sylvain Gugger	9eb3a410cd	Remove deprecated arguments from new run_clm (#8197 )	2020-10-30 15:27:20 -04:00
Sylvain Gugger	cdc48ce92d	Finalize lm examples (#8188 ) * Finish the cleanup of the language-modeling examples * Update main README * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Apply suggestions from code review Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> * Propagate changes Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-10-30 14:20:18 -04:00
wlhgtc	9a21b50614	Fix eval ref miss in Chinese WWM. (#8115 ) * ADD: add whole word mask proxy for both eng and chinese * MOD: adjust format * MOD: reformat code * MOD: update import * MOD: fix bug * MOD: add import * MOD: fix bug * MOD: decouple code and update readme * MOD: reformat code * Update examples/language-modeling/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/language-modeling/run_language_modeling.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * change wwm to whole_word_mask * reformat code * reformat * format * Code quality * ADD: update chinese ref readme * MOD: small changes * MOD: small changes2 * update readme * fix eval ref file miss bug * format file * MOD: move ref code to contrib * MOD: add delimeter check * reformat code * refomat code * Update examples/language-modeling/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-10-29 17:08:39 -04:00
Sylvain Gugger	691176283d	Add a template for examples and apply it for mlm and plm examples (#8153 ) * Add a template for example scripts and apply it to mlm * Formatting * Fix test * Add plm script * Add a template for example scripts and apply it to mlm * Formatting * Fix test * Add plm script * Add a template for example scripts and apply it to mlm * Formatting * Fix test * Add plm script * Styling	2020-10-29 13:38:11 -04:00
Sam Shleifer	49e4fece5c	[s2s] distillBART docs for paper replication (#8150 )	2020-10-29 12:01:15 -04:00
Sylvain Gugger	acf56408d8	Smarter prediction loop and no- -> no_ in console args (#8151 ) * Smarter prediction loop and no- -> no_ in console args * Fix test	2020-10-29 10:56:25 -04:00
Santiago Castro	969859d5f6	Fix doc errors and typos across the board (#8139 ) * Fix doc errors and typos across the board * Fix a typo * Fix the CI * Fix more typos * Fix CI * More fixes * Fix CI * More fixes * More fixes	2020-10-29 10:33:33 -04:00
Stas Bekman	825925dfaa	[s2s test] cleanup (#8131 )	2020-10-28 16:50:36 -04:00
Sean Naren	5e24982e58	Upgrade PyTorch Lightning to 1.0.2 (#7852 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-10-28 14:59:14 -04:00
Sylvain Gugger	378142afdf	Rename add_start_docstrings_to_callable (#8120 )	2020-10-28 13:42:31 -04:00

1 2 3 4 5 ...

1420 Commits