transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-16 11:08:23 +06:00

Author	SHA1	Message	Date
Sam Shleifer	b6b2f2270f	s2s: fix LR logging, remove some dead code. (#6205 )	2020-08-03 10:36:26 -04:00
Stas Bekman	d8dbf3b75d	[s2s] clean up + doc (#6184 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-01 14:51:07 -04:00
Stas Bekman	f250beb8aa	enable easy checkout switch (#5645 ) * enable easy checkout switch allow having multiple repository checkouts and not needing to remember to rerun 'pip install -e .[dev]' when switching between checkouts and running tests. * make isort happy * examples needs one too	2020-07-31 04:34:46 -04:00
Sylvain Gugger	91cb95461e	Switch from return_tuple to return_dict (#6138 ) * Switch from return_tuple to return_dict * Fix test * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614) * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests * AutoModels Tiny tweaks * Style * Final changes before merge * Re-order for simpler review * Final fixes * Addressing @sgugger's comments * Test MultipleChoice * Rework TF trainer (#6038) * Fully rework training/prediction loops * fix method name * Fix variable name * Fix property name * Fix scope * Fix method name * Fix tuple index * Fix tuple index * Fix indentation * Fix variable name * fix eval before log * Add drop remainder for test dataset * Fix step number + fix logging datetime * fix eval loss value * use global step instead of step + fix logging at step 0 * Fix logging datetime * Fix global_step usage * Fix breaking loop + logging datetime * Fix step in prediction loop * Fix step breaking * Fix train/test loops * Force TF at least 2.2 for the trainer * Use assert_cardinality to facilitate the dataset size computation * Log steps per epoch * Make tfds compliant with TPU * Make tfds compliant with TPU * Use TF dataset enumerate instead of the Python one * revert previous commit * Fix data_dir * Apply style * rebase on master * Address Sylvain's comments * Address Sylvain's and Lysandre comments * Trigger CI * Remove unused import * Switch from return_tuple to return_dict * Fix test * Add recent model Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Plu <plu.julien@gmail.com>	2020-07-30 09:17:00 -04:00
Stas Bekman	3212b8850d	[s2s] add support for overriding config params (#6149 )	2020-07-30 01:09:46 -04:00
Julien Plu	54f9fbeff8	Rework TF trainer (#6038 ) * Fully rework training/prediction loops * fix method name * Fix variable name * Fix property name * Fix scope * Fix method name * Fix tuple index * Fix tuple index * Fix indentation * Fix variable name * fix eval before log * Add drop remainder for test dataset * Fix step number + fix logging datetime * fix eval loss value * use global step instead of step + fix logging at step 0 * Fix logging datetime * Fix global_step usage * Fix breaking loop + logging datetime * Fix step in prediction loop * Fix step breaking * Fix train/test loops * Force TF at least 2.2 for the trainer * Use assert_cardinality to facilitate the dataset size computation * Log steps per epoch * Make tfds compliant with TPU * Make tfds compliant with TPU * Use TF dataset enumerate instead of the Python one * revert previous commit * Fix data_dir * Apply style * rebase on master * Address Sylvain's comments * Address Sylvain's and Lysandre comments * Trigger CI * Remove unused import	2020-07-29 14:32:01 -04:00
Lysandre Debut	641b873c13	XLNet PLM Readme (#6121 )	2020-07-29 11:38:15 -04:00
Sam Shleifer	92f8ce2ed6	Fix deebert tests (#6102 )	2020-07-28 18:30:16 -04:00
Sam Shleifer	dafa296c95	[s2s] Delete useless method, log tokens_per_batch (#6081 )	2020-07-28 11:24:23 -04:00
Stas Bekman	f0c70085c2	link to README.md (#6068 ) * add a link to README.md * Update README.md	2020-07-28 20:34:58 +08:00
Sam Shleifer	3c7fbf35a6	MBART: support summarization tasks where max_src_len > max_tgt_len (#6003 ) * MBART: support summarization tasks * fix test * Style * add tokenizer test	2020-07-28 08:18:11 -04:00
Sam Shleifer	7a68d40138	[s2s] Don't mention packed data in README (#6079 )	2020-07-27 20:07:21 -04:00
Sam Shleifer	1e00ef681d	[s2s] dont document packing because it hurts performance (#6077 )	2020-07-27 18:26:00 -04:00
Sam Shleifer	11792d7826	CL util to convert models to fp16 before upload (#5953 )	2020-07-27 12:21:25 -04:00
Sam Shleifer	4302ace5bd	[pack_dataset] don't sort before packing, only pack train (#5954 )	2020-07-27 12:14:23 -04:00
Suraj Patil	d1d15d6f2d	[examples (seq2seq)] fix preparing decoder_input_ids for T5 (#5994 )	2020-07-27 10:10:43 -04:00
Sam Shleifer	c69ea5efc4	[CI] Don't test apex (#6021 )	2020-07-24 15:34:16 -04:00
Sam Shleifer	c3206eef44	[test] partial coverage for train_mbart_enro_cc25.sh (#5976 )	2020-07-22 14:34:49 -04:00
Sam Shleifer	feeb956a19	[docs] Add integration test example to copy pasta template (#5961 ) Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-07-22 12:48:38 -04:00
Sam Shleifer	9dab39feea	seq2seq/run_eval.py can take decoder_start_token_id (#5949 )	2020-07-21 16:58:45 -04:00
Sam Shleifer	5b193b39b0	[examples/seq2seq]: add --label_smoothing option (#5919 )	2020-07-21 16:51:39 -04:00
Sam Shleifer	95d1962b9c	[Doc] explaining romanian postprocessing for MBART BLEU hacking (#5943 )	2020-07-21 14:12:48 -04:00
Aditya Soni	ccbf74a685	typos in seq2seq/readme (#5937 )	2020-07-21 09:44:59 -04:00
Qingqing Cao	8e0bcb56ec	DataParallel fix: multi gpu evaluation (#5926 ) The DataParallel training was fixed in https://github.com/huggingface/transformers/pull/5733, this commit also fixes the evaluation. It's more convenient when the user enables both `do_train` and `do_eval`.	2020-07-20 17:54:08 -04:00
Sam Shleifer	f1a4e06f1f	[Fix] seq2seq pack_dataset.py actually packs (#5913 ) Huge MT speedup!	2020-07-20 15:18:26 -04:00
Stas Bekman	35cb101eae	DataParallel fixes (#5733 ) * DataParallel fixes: 1. switched to a more precise check - if self.args.n_gpu > 1: + if isinstance(model, nn.DataParallel): 2. fix tests - require the same fixup under DataParallel as the training module * another fix	2020-07-20 09:29:12 -04:00
Sam Shleifer	09a2f40684	Seq2SeqDataset uses linecache to save memory by @Pradhy729 (#5792 ) Co-authored-by: Pradhy729 <49659913+Pradhy729@users.noreply.github.com>	2020-07-18 13:57:33 -04:00
Sam Shleifer	dad5e12e54	[seq2seq] distillation.py accepts trainer arguments (#5865 )	2020-07-18 07:43:57 -04:00
Sam Shleifer	ba2400189b	[seq2seq] MAX_LEN env var for MT commands (#5837 )	2020-07-17 22:51:31 -04:00
Nathan Raw	529850ae7b	Lightning Updates for v0.8.5 (#5798 ) Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-07-17 22:43:06 -04:00
Sam Shleifer	e238e3d55a	[seq2seq] Don't copy self.source in sortishsampler (#5818 )	2020-07-17 01:53:25 -04:00
Sam Shleifer	283500ff9f	[seq2seq] pack_dataset.py rewrites dataset in max_tokens format (#5819 )	2020-07-16 14:06:49 -04:00
Sam Shleifer	1a647abf0b	[fix] check code quality (#5772 )	2020-07-15 14:59:38 -04:00
Sam Shleifer	d0486c8bc2	[cleanup] T5 test, warnings (#5761 )	2020-07-15 08:23:22 -04:00
Boris Dayma	4d5a8d6557	docs(wandb): explain how to use W&B integration (#5607 ) * docs(wandb): explain how to use W&B integration fix #5262 * Also mention TensorBoard Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-07-14 05:12:33 -04:00
Julien Chaumond	201d23f285	Update The Big Table of Tasks Co-Authored-By: Suraj Patil <surajp815@gmail.com> Co-Authored-By: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-07-10 18:07:29 +02:00
Lysandre Debut	0533cf4706	Test XLA examples (#5583 ) * Test XLA examples * Style * Using `require_torch_tpu` * Style * No need for pytest	2020-07-09 09:19:19 -04:00
Ji Xin	cfbb982974	Add DeeBERT (entropy-based early exiting for BERT) (#5477 ) Add deebert code * Add readme of deebert * Add test for deebert Update test for Deebert * Update DeeBert (README, class names, function refactoring); remove requirements.txt * Format update * Update test * Update readme and model init methods	2020-07-08 08:17:59 +08:00
Patrick von Platen	fde217c679	readme for benchmark (#5363 )	2020-07-07 23:21:23 +02:00
Sam Shleifer	353b8f1e7a	Add mbart-large-cc25, support translation finetuning (#5129 ) improve unittests for finetuning, especially w.r.t testing frozen parameters fix freeze_embeds for T5 add streamlit setup.cfg	2020-07-07 13:23:01 -04:00
Patrick von Platen	4dc65591b5	[Almost all TF models] TF clean up: add missing CLM / MLM loss; fix T5 naming and keras compile (#5395 ) * add first version of clm tf * make style * add more tests for bert * update tf clm loss * fix tests * correct tf ner script * add mlm loss * delete bogus file * clean tf auto model + add tests * finish adding clm loss everywhere * fix training in distilbert * fix flake8 * save intermediate * fix tf t5 naming * remove prints * finish up * up * fix tf gpt2 * fix new test utils import * fix flake8 * keep backward compatibility * Update src/transformers/modeling_tf_albert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_electra.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_roberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_mobilebert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_tf_distilbert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply sylvains suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-07-07 18:15:53 +02:00
Suraj Patil	e49393c361	[examples] Add trainer support for question-answering (#4829 ) * add SquadDataset * add DataCollatorForQuestionAnswering * update __init__ * add run_squad with trainer * add DataCollatorForQuestionAnswering in __init__ * pass data_collator to trainer * doc tweak * Update run_squad_trainer.py * Update __init__.py * Update __init__.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-07-07 08:57:08 -04:00
Shashank Gupta	3dcb748e31	Added data collator for permutation (XLNet) language modeling and related calls (#5522 ) * Added data collator for XLNet language modeling and related calls Added DataCollatorForXLNetLanguageModeling in data/data_collator.py to generate necessary inputs for language modeling training with XLNetLMHeadModel. Also added related arguments, logic and calls in examples/language-modeling/run_language_modeling.py. Resolves: #4739, #2008 (partially) * Changed name to `DataCollatorForPermutationLanguageModeling` Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModelling`. Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use. CTRL uses a CLM loss just like GPT and GPT-2, so should work out of the box with this script (provided `past` is taken care of similar to `mems` for XLNet). Changed calls and imports appropriately. * Added detailed comments, changed variable names Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain working. Also cleaned up variable names and made them more informative. * Added tests for new data collator Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences. * Fixed styling issues	2020-07-07 10:17:37 +02:00
Lysandre Debut	9d9b872b66	The `add_space_before_punct_symbol` is only for TransfoXL (#5549 )	2020-07-06 12:17:05 -04:00
Sylvain Gugger	734a28a767	Clean up diffs in Trainer/TFTrainer (#5417 ) * Cleanup and unify Trainer/TFTrainer * Forgot to adapt TFTrainingArgs * In tf scripts n_gpu -> n_replicas * Update src/transformers/training_args.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Address review comments * Formatting * Fix typo Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-07-01 11:00:20 -04:00
Sam Shleifer	13deb95a40	Move tests/utils.py -> transformers/testing_utils.py (#5350 )	2020-07-01 10:31:17 -04:00
Sylvain Gugger	4ade7491f4	Fix examples titles and optimization doc page (#5408 )	2020-07-01 08:11:25 -04:00
Hong Xu	501040fd30	In the run_ner.py example, give the optional label arg a default value (#5326 ) Otherwise, if label is not specified, the following error occurs: Traceback (most recent call last): File "run_ner.py", line 303, in <module> main() File "run_ner.py", line 101, in main model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1])) File "/home/user/anaconda3/envs/bert/lib/python3.7/site-packages/transformers/hf_argparser.py", line 159, in parse_json_file obj = dtype(**inputs) TypeError: __init__() missing 1 required positional argument: 'labels'	2020-06-30 19:45:35 -04:00
Sam Shleifer	27a7fe7a8d	examples/seq2seq: never override $WANDB_PROJECT (#5407 )	2020-06-30 15:29:13 -04:00
Kevin Canwen Xu	331d8d2936	Upload DistilBART artwork (#5394 )	2020-06-30 18:11:11 +08:00
MichaelJanz	9a473f1e43	Update Bertabs example to work again (#5355 ) * Fix the bug 'Attempted relative import with no known parent package' when using the bertabs example. Also change the used model from bertabs-finetuned-cnndm, since it seems not be accessible anymore * Update run_summarization.py Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>	2020-06-30 14:05:01 +08:00
Sam Shleifer	a316a6aaa8	[seq2seq docs] Move evaluation down, fix typo (#5365 )	2020-06-29 10:36:04 -04:00
Patrick von Platen	4bcc35cd69	[Docs] Benchmark docs (#5360 ) * first doc version * add benchmark docs * fix typos * improve README * Update docs/source/benchmarks.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix naming and docs Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-06-29 16:08:57 +02:00
Sam Shleifer	45e26125de	save_pretrained: mkdir(exist_ok=True) (#5258 ) * all save_pretrained methods mkdir if not os.path.exists	2020-06-28 14:53:47 -04:00
Suraj Patil	12dfbd4f7a	[examples] fix example links (#5344 )	2020-06-28 12:54:54 -04:00
Sam Shleifer	393b8dc09a	examples/seq2seq/run_eval.py fixes and docs (#5322 )	2020-06-26 19:20:43 -04:00
Sam Shleifer	5543b30aa6	[pl_examples] default warmup steps=0 (#5316 )	2020-06-26 15:03:41 -04:00
Thomas Wolf	601d4d699c	[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308 ) * remove references to old API in docstring - update data processors * style * fix tests - better type checking error messages * better type checking * include awesome fix by @LysandreJik for #5310 * updated doc and examples	2020-06-26 19:48:14 +02:00
Patrick von Platen	79a82cc06a	[Benchmarks] improve Example Plotter (#5245 ) * improve plotting * better labels * fix time plot	2020-06-26 15:00:14 +02:00
Lysandre Debut	7cc15bdd96	Closes #5218	2020-06-25 18:19:21 -04:00
Sam Shleifer	e008d520bb	[examples/seq2seq] more README improvements (#5274 )	2020-06-25 10:13:01 -04:00
Sam Shleifer	40457bcebb	examples/seq2seq supports translation (#5202 )	2020-06-24 23:58:11 -04:00
Victor SANH	4965aee064	[HANS] Fix label_list for RoBERTa/BART (class flipping) (#5196 ) * fix weirdness in roberta/bart for mnli trained checkpoints * black compliance * isort code check	2020-06-24 14:38:15 -04:00
Patrick von Platen	9fe09cec76	[Benchmark] Extend Benchmark to all model type extensions (#5241 ) * add benchmark for all kinds of models * improved import * delete bogus files * make style	2020-06-24 15:11:42 +02:00
Sylvain Gugger	7c41057d50	Add hugs (#5225 )	2020-06-24 07:56:14 -04:00
Sylvain Gugger	5e85b324ec	Use the script in utils (#5224 )	2020-06-24 07:55:58 -04:00
Kevin Canwen Xu	54e9ce785d	Fix PABEE division by zero error (#5233 ) * Fix PABEE division by zero error * patience=0 by default	2020-06-24 16:10:36 +08:00
Sam Shleifer	76e5af4cfd	[pl_examples] revert deletion of optimizer_step (#5227 )	2020-06-23 16:40:45 -04:00
Sam Shleifer	f5c2a122e3	Upgrade examples to pl=0.8.1(#5146 )	2020-06-22 20:40:10 -04:00
Patrick von Platen	fa0be6d761	Benchmarks (#4912 ) * finish benchmark * fix isort * fix setup cfg * retab * fix time measuring of tf graph mode * fix tf cuda * clean code * better error message	2020-06-22 12:06:56 +02:00
Ilya Boytsov	bc3a0c0607	[examples] fixes arguments for summarization finetune scripts (#5157 ) Authored-by: i.boytsov <i.boytsov@MAC867.local>	2020-06-21 11:51:21 -04:00
Kevin Canwen Xu	c0c577cf8f	Fix PABEE's result table (#5158 )	2020-06-20 22:56:39 +08:00
Kevin Canwen Xu	2fd28d4363	Add BERT Loses Patience (Patience-based Early Exit) (#5078 ) * Add BERT Loses Patience (Patience-based Early Exit) * update model archive * update format * sort import * flake8 * Add results * full results * align the table * refactor to inherit * default per gpu eval = 1 * Formatting * Formatting * isort * modify readme * Add check * Fix format * Fix format * Doc strings * ALBERT & BERT for sequence classification don't inherit from the original anymore * Remove incorrect comments * Remove incorrect comments * Remove incorrect comments * Sync up with new code * Sync up with new code * Add a test * Add a test * Add a test * Add a test * Add a test * Add a test * Finishing up!	2020-06-20 13:41:46 +08:00
Sam Shleifer	2db1e2f415	[cleanup] remove redundant code in SummarizationDataset (#5119 )	2020-06-18 20:34:48 -04:00
Lysandre	efeb75b805	Remove misleading comment closes #4958	2020-06-17 18:24:35 -04:00
Sam Shleifer	f1a3d03741	add pandas to setup.cfg (#5093 )	2020-06-17 16:39:17 -04:00
Pranav Dayanand Pawar	049e14f0e3	very minor spelling correction in script command (#5090 ) actual script name - counts_parameters.py	2020-06-17 16:08:43 -04:00
Sam Shleifer	043f9f51f9	[examples] SummarizationModule improvements (#4951 )	2020-06-17 13:51:34 -04:00
Sylvain Gugger	cd40f6564e	Add header and fix command (#5082 )	2020-06-17 11:45:05 -04:00
flozi00	af497b5672	Typo (#5069 )	2020-06-16 16:46:20 -04:00
Yacine Jernite	49c5202522	Eli5 examples (#4968 ) * add eli5 examples * add dense query script * query_di * merging * merging * add_utils * adds nearest neighbor wikipedia * batch queries * training_retriever * new notebooks * moved retriever traiing script * finished wiki40b * max_len_fix * train_s2s * retriever_batch_checkpointing * cleanup * merge * dim_fix * fix_indexer * fix_wiki40b_snippets * fix_embed_for_r * fp32 index * fix_sparse_q * joint_training * remove obsolete datasets * add_passage_nn_results * add_passage_nn_results * add_batch_nn * add_batch_nn * add_data_scripts * notebook * notebook * notebook * fix_multi_gpu * add_app * full_caching * full_caching * notebook * sparse_done * images * notebook * add_image_gif * with_Gif * add_contr_image * notebook * notebook * notebook * train_functions * notebook * min_retrieval_length * pandas_option * notebook * min_retrieval_length * notebook * notebook * eval_Retriever * notebook * images * notebook * add_example * add_example * notebook * fireworks * notebook * notebook * joe's notebook comments * app_update * notebook * notebook_link * captions * notebook * assing RetriBert model * add RetriBert to Auto * change AutoLMHead to AutoSeq2Seq * notebook downloads from hf models * style_black * style_black * app_update * app_update * fix_app_update * style * style * isort * Delete WikiELI5training.ipynb * Delete evaluate_eli5.py * Delete WikiELI5explore.ipynb * Delete ExploreWikiELI5Support.html * Delete explainlikeimfive.py * Delete wiki_snippets.py * children before parent * children before parent * style_black * style_black_only * isort * isort_new * Update src/transformers/modeling_retribert.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> * typo fixes * app_without_asset * cleanup * Delete ELI5animation.gif * Delete ELI5contrastive.svg * Delete ELI5wiki_index.svg * Delete choco_bis.svg * Delete fireworks.gif * Delete huggingface_logo.jpg * Delete huggingface_logo.svg * Delete Long_Form_Question_Answering_with_ELI5_and_Wikipedia.ipynb * Delete eli5_app.py * Delete eli5_utils.py * readme * Update README.md * unused imports * moved_info * default_beam * ftuned model * disclaimer * Update src/transformers/modeling_retribert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * black * add_doc * names * isort_Examples * isort_Examples * Add doc to index Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-06-16 16:36:58 -04:00
Sam Shleifer	c3e607496c	[cleanup] examples test_run_squad uses tiny model (#5059 )	2020-06-16 14:06:45 -04:00
Sylvain Gugger	d5477baf7d	Convert hans to Trainer (#5025 ) * Convert hans to Trainer * Tick box	2020-06-16 08:06:31 -04:00
Anthony MOI	36434220fc	[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510 ) * Use tokenizers pre-tokenized pipeline * failing pretrokenized test * Fix is_pretokenized in python * add pretokenized tests * style and quality * better tests for batched pretokenized inputs * tokenizers clean up - new padding_strategy - split the files * [HUGE] refactoring tokenizers - padding - truncation - tests * style and quality * bump up requied tokenizers version to 0.8.0-rc1 * switched padding/truncation API - simpler better backward compat * updating tests for custom tokenizers * style and quality - tests on pad * fix QA pipeline * fix backward compatibility for max_length only * style and quality * Various cleans up - add verbose * fix tests * update docstrings * Fix tests * Docs reformatted * __call__ method documented Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-06-15 17:12:51 -04:00
Sylvain Gugger	1affde2f10	Make DataCollator a callable (#5015 ) * Make DataCollator a callable * Update src/transformers/data/data_collator.py Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-15 11:58:33 -04:00
Stefan Schweter	d812e6d76e	NER: fix construction of input examples for RoBERTa (#4943 ) * utils_ner: do not add extra sep token for RoBERTa model * run_pl_ner: do not add extra sep token for RoBERTa model	2020-06-15 08:30:40 -04:00
Sylvain Gugger	403d309857	Hans data (#4854 ) * Update hans data to be able to use Trainer * Fixes * Deal with tokenizer that don't have token_ids * Clean up things * Simplify data use * Fix the input dict * Formatting + proper path in README	2020-06-13 09:35:13 -04:00
VictorSanh	473808da0d	update `mvmt-pruning/saving_prunebert` (updating torch to 1.5)	2020-06-11 19:42:45 +00:00
Sylvain Gugger	e8db8b845a	Remove unused arguments in Multiple Choice example (#4853 ) * Remove unused arguments * Formatting * Remove second todo comment	2020-06-09 20:05:09 -04:00
songyouwei	29c36e9f36	run_pplm.py bug fix (#4867 ) `is_leaf` may become `False` after `.to(device=device)` function call.	2020-06-09 19:14:27 -04:00
Sam Shleifer	f90bc44d9a	[examples] Cleanup summarization docs (#4876 )	2020-06-09 17:38:28 -04:00
Amil Khare	02e5f79662	[examples] consolidate summarization examples (#4837 )	2020-06-09 11:14:12 -04:00
daniel-shan	b6f365a8ed	Updates args in tf squad example. (#4820 ) Co-authored-by: Daniel Shan <daniel.shan@workday.com>	2020-06-08 05:36:09 -04:00
Mr Ruben	ddf9a3dfc7	Updated path "cd examples/text-generation/pplm" (#4778 ) https://github.com/huggingface/transformers/issues/4776	2020-06-05 21:16:48 -04:00
Sam Shleifer	875288b344	[isort] add matplotlib to known 3rd party dependencies (#4800 )	2020-06-05 17:27:31 -04:00
Julien Chaumond	b9109f2de1	[doc] Make it clearer that `text-generation` does not involve training	2020-06-05 14:59:22 +02:00
Stefan Schweter	2a4b9e09c0	NER: Add new WNUT’17 example (#4681 ) * ner: add preprocessing script for examples that splits longer sentences * ner: example shell scripts use local preprocessing now * ner: add new example section for WNUT’17 NER task. Remove old English CoNLL-03 results * ner: satisfy black and isort	2020-06-04 19:13:17 -04:00
prajjwal1	48a05026de	removed deprecared use of Variable api from pplm example	2020-06-04 18:07:49 -04:00
Jason Phang	492b352ab6	Remove unnecessary model_type arg in example (#4771 )	2020-06-04 13:41:24 -04:00
Jin Young Sohn	b231a413f5	Add cache_dir to save features in GLUE + Differentiate match/mismatch for MNLI metrics (#4621 ) * Glue task cleaup * Enable writing cache to cache_dir in case dataset lives in readOnly filesystem. * Differentiate match vs mismatch for MNLI metrics. * Style * Fix pytype * Fix type * Use cache_dir in mnli mismatch eval dataset * Small Tweaks Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-02 13:40:14 -04:00
Julien Chaumond	b42586ea56	Fix CI after killing archive maps (#4724 ) * 🐛 Fix model ids for BART and Flaubert	2020-06-02 10:21:09 -04:00
Julien Chaumond	d4c2cb402d	Kill model archive maps (#4636 ) * Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI	2020-06-02 09:39:33 -04:00
Lysandre Debut	88762a2f8c	Specify PyTorch versions for examples (#4710 )	2020-06-02 04:29:28 -04:00
Victor SANH	bf760c80b5	finish README	2020-06-01 09:23:31 -04:00
Victor SANH	9d7d9b3ae0	weird import	2020-06-01 09:23:31 -04:00
Victor SANH	2a3c88a659	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	4ac462bfb8	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	35fa0bbca0	clarify README	2020-06-01 09:23:31 -04:00
Victor SANH	cc746a5020	flake8 compliance	2020-06-01 09:23:31 -04:00
Victor SANH	b11386e158	less prints in saving prunebert	2020-06-01 09:23:31 -04:00
Victor SANH	8b5d4003ab	complete README	2020-06-01 09:23:31 -04:00
Victor SANH	5c8e5b3709	commplying with isort	2020-06-01 09:23:31 -04:00
Victor SANH	db2a3b2e01	space	2020-06-01 09:23:31 -04:00
Victor SANH	5f8f2d849a	add floppy bert model notebok	2020-06-01 09:23:31 -04:00
Victor SANH	b41948f5cd	add requirements	2020-06-01 09:23:31 -04:00
Victor SANH	fb8f4277b2	add scripts	2020-06-01 09:23:31 -04:00
Victor SANH	d489a6d3d5	add masked_run_*	2020-06-01 09:23:31 -04:00
Victor SANH	e4c07faf0a	add sparsity modules	2020-06-01 09:23:31 -04:00
Patrick von Platen	96f57c9ccb	[Benchmark] Memory benchmark utils (#4198 ) * improve memory benchmarking * correct typo * fix current memory * check torch memory allocated * better pytorch function * add total cached gpu memory * add total gpu required * improve torch gpu usage * update memory usage * finalize memory tracing * save intermediate benchmark class * fix conflict * improve benchmark * improve benchmark * finalize * make style * improve benchmarking * correct typo * make train function more flexible * fix csv save * better repr of bytes * better print * fix __repr__ bug * finish plot script * rename plot file * delete csv and small improvements * fix in plot * fix in plot * correct usage of timeit * remove redundant line * remove redundant line * fix bug * add hf parser tests * add versioning and platform info * make style * add gpu information * ensure backward compatibility * finish adding all tests * Update src/transformers/benchmark/benchmark_args.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/benchmark/benchmark_args_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * delete csv files * fix isort ordering * add out of memory handling * add better train memory handling Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-05-27 23:22:16 +02:00
Lysandre Debut	6a17688021	per_device instead of per_gpu/error thrown when argument unknown (#4618 ) * per_device instead of per_gpu/error thrown when argument unknown * [docs] Restore examples.md symlink * Correct absolute links so that symlink to the doc works correctly * Update src/transformers/hf_argparser.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Warning + reorder * Docs * Style * not for squad Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-27 11:36:55 -04:00
Hao Tan	a9aa7456ac	Add back --do_lower_case to uncased models (#4245 ) The option `--do_lower_case` is currently required by the uncased models (i.e., bert-base-uncased, bert-large-uncased). Results: BERT-BASE without --do_lower_case: 'exact': 73.83, 'f1': 82.22 BERT-BASE with --do_lower_case: 'exact': 81.02, 'f1': 88.34	2020-05-26 21:13:07 -04:00
Antonis Maronikolakis	50d1ce411f	add DistilBERT to supported models (#4558 )	2020-05-25 14:50:45 -04:00
Zhangyx	49296533ca	Adds predict stage for glue tasks, and generate result files which can be submitted to gluebenchmark.com (#4463 ) * Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website. * Use Split enum + always output the label name Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-21 09:17:44 -04:00
Tobias Lee	271bedb485	[examples] fix no grad in second pruning in run_bertology (#4479 ) * fix no grad in second pruning and typo * fix prune heads attention mismatch problem * fix * fix * fix * run make style * run make style	2020-05-21 09:17:03 -04:00
Patrick von Platen	aa925a52fa	[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch (#4468 ) * fix gpu slow tests in pytorch * change model to device syntax	2020-05-19 21:35:04 +02:00
Julien Chaumond	5e7fe8b585	Distributed eval: SequentialDistributedSampler + gather all results (#4243 ) * Distributed eval: SequentialDistributedSampler + gather all results * For consistency only write to disk from world_master Close https://github.com/huggingface/transformers/issues/4272 * Working distributed eval * Hook into scripts * Fix #3721 again * TPU.mesh_reduce: stay in tensor space Thanks @jysohn23 * Just a small comment * whitespace * torch.hub: pip install packaging * Add test scenarii	2020-05-18 22:02:39 -04:00
Boris Dayma	d9ece8233d	fix(run_language_modeling): use arg overwrite_cache (#4407 )	2020-05-18 11:37:35 -04:00
Julien Chaumond	757baee846	Fix un-prefixed f-string see https://github.com/huggingface/transformers/pull/4367#discussion_r426356693 Hat/tip @girishponkiya	2020-05-18 11:20:46 -04:00
Julien Chaumond	15550ce0d1	[skip ci] remove local rank	2020-05-15 17:08:38 -04:00
Lysandre Debut	edf9ac11d4	Should return overflowing information for the log (#4385 )	2020-05-15 09:49:11 -04:00
Julien Chaumond	af2e6bf87c	[examples] Streamline doc	2020-05-14 20:34:31 -04:00
Julien Chaumond	448c467256	Fix: unpin flake8 and fix cs errors (#4367 ) * Fix: unpin flake8 and fix cs errors * Ok we still need to quote those	2020-05-14 13:14:26 -04:00
Julien Chaumond	c547f15a17	Use Filelock to ensure distributed barriers see context in https://github.com/huggingface/transformers/pull/4223	2020-05-14 11:58:32 -04:00
Julien Plu	ca13618681	Question Answering for TF trainer (#4320 ) * Add QA trainer example for TF * Make data_dir optional * Fix parameter logic * Fix feature convert * Update the READMEs to add the question-answering task * Apply style * Change 'sequence-classification' to 'text-classification' and prefix with 'eval' all the metric names * Apply style * Apply style	2020-05-13 09:22:31 -04:00
Julien Chaumond	241759101e	(v2) Improvements to the wandb integration (#4324 ) * Improvements to the wandb integration * small reorg + no global necessary * feat(trainer): log epoch and final metrics * Simplify logging a bit * Fixup * Fix crash when just running eval Co-authored-by: Chris Van Pelt <vanpelt@gmail.com> Co-authored-by: Boris Dayma <boris.dayma@gmail.com>	2020-05-12 21:52:01 -04:00
Viktor Alm	e4512aab3b	Add MultipleChoice to TFTrainer [WIP] (#4270 ) * catch gpu len 1 set to gpu0 * Add mpc to trainer * Add MPC for TF * fix TF automodel for MPC and add Albert * Apply style * Fix import * Note to self: double check * Make shape None, None for datasetgenerator output shapes * Add from_pt bool which doesnt seem to work * Original checkpoint dir * Fix docstrings for automodel * Update readme and apply style * Colab should probably not be from users * Colabs should probably not be from users * Add colab * Update README.md * Update README.md * Cleanup __intit__ * Cleanup flake8 trailing comma * Update src/transformers/training_args_tf.py * Update src/transformers/modeling_tf_auto.py Co-authored-by: Viktor Alm <viktoralm@pop-os.localdomain> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-12 08:48:48 -04:00
Stefan Schweter	3f42eb979f	Documentation: fix links to NER examples (#4279 ) * docs: fix link to token classification (NER) example * examples: fix links to NER scripts	2020-05-11 12:48:21 -04:00
Julien Chaumond	7b75aa9fa5	[TPU] Doc, fix xla_spawn.py, only preprocess dataset once (#4223 ) * [TPU] Doc, fix xla_spawn.py, only preprocess dataset once * Update examples/README.md * [xla_spawn] Add `_mp_fn` to other Trainer scripts * [TPU] Fix: eval dataloader was None	2020-05-08 14:10:05 -04:00
Julien Chaumond	c99fe0386b	[doc] Fix broken links + remove crazy big notebook	2020-05-07 18:44:18 -04:00
Julien Chaumond	6669915b65	[examples] Add column for pytorch-lightning support	2020-05-07 15:26:58 -04:00
Julien Chaumond	612fa1b10b	Examples readme.md (#4215 ) * README * Update README.md	2020-05-07 15:00:06 -04:00
Julien Chaumond	0ae96ff8a7	BIG Reorganize examples (#4213 ) * Created using Colaboratory * [examples] reorganize files * remove run_tpu_glue.py as superseded by TPU support in Trainer * Bugfix: int, not tuple * move files around	2020-05-07 13:48:44 -04:00
Lysandre Debut	ebf80e2e70	Tpu trainer (#4146 ) * wip * wip * a last wip * Better logging when using TPUs * Correct argument name * Tests * fix * Metrics in evaluation * Update src/transformers/training_args.py * [tpu] Use launcher script instead * [tpu] lots of tweaks * Fix formatting Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-07 10:34:04 -04:00
Julien Plu	aad50151f3	TF version of the trainer (#4017 ) * First commit to add a TF version of the trainer. * Make the TF trainer closer to what looks the PT trainer * Refactoring common code between the PT and TF trainer into an util file. * Some bugfix + better similarity with the PT trainer * Add missing class in transformers init * Bugfix over prediction + use classification report instead of simple metrics * Fix name error * Fix optimization tests + style * Apply style * Several bugfix for multi-gpu training * Apply style * Apply style * Add glue example for the TF trainer * Several bugix + address the reviews * Fix on the TF training args file * Add a debug mode * Bugfix in utils_ner.py when segment_ids is None * Apply style * Apply style * Add TPU strategy * Fix selection strategy	2020-05-06 12:56:52 -04:00
Simone Primarosa	25296b12aa	Fix overwrite_cache behaviour for pytorch lightning examples (#4093 )	2020-05-06 12:24:49 -04:00
William Falcon	4c5bd92183	Update run_pl_glue.py (#4117 )	2020-05-02 10:38:30 -04:00
William Falcon	5282b31df4	Update run_pl_ner.py (#4118 )	2020-05-02 10:38:21 -04:00
Stefan Schweter	1e616c0af3	NER: parse args from .args file or JSON (#4110 ) * ner: parse args from .args file or JSON * examples: mention json-based configuration file support for run_ner script	2020-05-02 10:29:17 -04:00
Julien Chaumond	b8686174be	Merge pull request #3934 from huggingface/examples_args_from_files [qol] example scripts: parse args from .args file or JSON	2020-04-30 22:40:13 -04:00
Julien Chaumond	455c639093	CDN urls (#4030 ) * [file_utils] use_cdn + documentation * Move to cdn. urls for weights * [urls] Hotfix for bert-base-japanese	2020-04-28 20:27:14 -04:00
Sam Shleifer	d714dfeaa8	[isort] add known 3rd party to setup.cfg (#4053 ) * add known 3rd party to setup.cfg * comment * Update CONTRIBUTING.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-04-28 17:12:00 -04:00
Patrick von Platen	180585741c	[Generation] Generation should allow to start with empty prompt (#3993 ) * fix empty prompt * fix length in generation pipeline	2020-04-28 14:33:15 +02:00
Julien Chaumond	c811526004	[examples] For convenience, also save the tokenizer Close #3921	2020-04-24 09:52:42 -04:00
Cola	b0167632ce	Shuffle train subset for summarization example (#3909 ) * Shuffle train subset * Cleaner shuffle	2020-04-24 07:55:34 -04:00
Julien Chaumond	1dc9b3c784	Fixes #3877	2020-04-22 01:15:10 +00:00
Julien Chaumond	dd9d483d03	Trainer (#3800 ) * doc * [tests] Add sample files for a regression task * [HUGE] Trainer * Feedback from @sshleifer * Feedback from @thomwolf + logging tweak * [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes * [glue] Use default max_seq_length of 128 like before * [glue] move DataTrainingArguments around * [ner] Change interface of InputExample, and align run_{tf,pl} * Re-align the pl scripts a little bit * ner * [ner] Add integration test * Fix language_modeling with API tweak * [ci] Tweak loss target * Don't break console output * amp.initialize: model must be on right device before * [multiple-choice] update for Trainer * Re-align to `827d6d6ef0`	2020-04-21 20:11:56 -04:00
Andrey Kulagin	b1ff0b2ae7	Fix bug in examples: double wrap into DataParallel during eval	2020-04-20 19:37:44 -04:00
Jared T Nielsen	c79b550dd0	Add `qas_id` to SquadResult and SquadExample (#3745 ) * Add qas_id * Fix incorrect name in squad.py * Make output files optional for squad eval	2020-04-20 16:08:57 -04:00
Sam Shleifer	a504cb49ec	[examples] fix summarization do_predict (#3866 )	2020-04-20 10:49:56 -04:00
Thomas Wolf	827d6d6ef0	Cleanup fast tokenizers integration (#3706 ) * First pass on utility classes and python tokenizers * finishing cleanup pass * style and quality * Fix tests * Updating following @mfuntowicz comment * style and quality * Fix Roberta * fix batch_size/seq_length inBatchEncoding * add alignement methods + tests * Fix OpenAI and Transfo-XL tokenizers * adding trim_offsets=True default for GPT2 et RoBERTa * style and quality * fix tests * add_prefix_space in roberta * bump up tokenizers to rc7 * style * unfortunately tensorfow does like these - removing shape/seq_len for now * Update src/transformers/tokenization_utils.py Co-Authored-By: Stefan Schweter <stefan@schweter.it> * Adding doc and docstrings * making flake8 happy Co-authored-by: Stefan Schweter <stefan@schweter.it>	2020-04-18 13:43:57 +02:00
Sam Shleifer	f0c96fafd1	[examples] summarization/bart/finetune.py supports t5 (#3824 ) renames `run_bart_sum.py` to `finetune.py`	2020-04-16 15:15:19 -04:00
Patrick von Platen	80a1694514	[Examples, T5] Change newstest2013 to newstest2014 and clean up (#3817 ) * Refactored use of newstest2013 to newstest2014. Fixed bug where argparse consumed first command line argument as model_size argument rather than using default model_size by forcing explicit --model_size flag inclusion * More pythonic file handling through 'with' context * COSMETIC - ran Black and isort * Fixed reference to number of lines in newstest2014 * Fixed failing test. More pythonic file handling * finish PR from tholiao * remove outcommented lines * make style * make isort happy Co-authored-by: Thomas Liao <tholiao@gmail.com>	2020-04-16 20:00:41 +02:00
Davide Fiocco	b1e2368b32	Typo fix (#3821 )	2020-04-16 11:04:32 -04:00
Sam Shleifer	c59b1e682d	[examples] unit test for run_bart_sum (#3544 ) - adds pytorch-lightning dependency	2020-04-15 18:35:01 -04:00
Patrick von Platen	01c37dcdb5	[Config, Caching] Remove `output_past` everywhere and replace by `use_cache` argument (#3734 ) * remove output_past from pt * make style * add optional input length for gpt2 * add use cache to prepare input * save memory in gpt2 * correct gpt2 test inputs * make past input optional for gpt2 * finish use_cache for all models * make style * delete modeling_gpt2 change in test file * correct docstring * correct is true statements for gpt2	2020-04-14 14:40:28 -04:00
elk-cloner	5ebd898953	fix dataset shuffling for Distributed training (#huggingface#3721) (#3766 )	2020-04-13 10:11:18 -04:00
Jin Young Sohn	700ccf6e35	Fix glue_convert_examples_to_features API breakage (#3742 )	2020-04-10 16:03:27 -04:00
Jin Young Sohn	551b450527	Add `run_glue_tpu.py` that trains models on TPUs (#3702 ) * Initial commit to get BERT + run_glue.py on TPU * Add README section for TPU and address comments. * Cleanup TPU bits from run_glue.py (#3) TPU runner is currently implemented in: https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py. We plan to upstream this directly into `huggingface/transformers` (either `master` or `tpu`) branch once it's been more thoroughly tested. * No need to call `xm.mark_step()` explicitly (#4) Since for gradient accumulation we're accumulating on batches from `ParallelLoader` instance which on next() marks the step itself. * Resolve R/W conflicts from multiprocessing (#5) * Add XLNet in list of models for `run_glue_tpu.py` (#6) * Add RoBERTa to list of models in TPU GLUE (#7) * Add RoBERTa and DistilBert to list of models in TPU GLUE (#8) * Use barriers to reduce duplicate work/resources (#9) * Shard eval dataset and aggregate eval metrics (#10) * Shard eval dataset and aggregate eval metrics Also, instead of calling `eval_loss.item()` every time do summation with tensors on device. * Change defaultdict to float * Reduce the pred, label tensors instead of metrics As brought up during review some metrics like f1 cannot be aggregated via averaging. GLUE task metrics depends largely on the dataset, so instead we sync the prediction and label tensors so that the metrics can be computed accurately on those instead. * Only use tb_writer from master (#11) * Apply huggingface black code formatting * Style * Remove `--do_lower_case` as example uses cased * Add option to specify tensorboard logdir This is needed for our testing framework which checks regressions against key metrics writtern by the summary writer. * Using configuration for `xla_device` * Prefix TPU specific comments. * num_cores clarification and namespace eval metrics * Cache features file under `args.cache_dir` Instead of under `args.data_dir`. This is needed as our test infra uses data_dir with a read-only filesystem. * Rename `run_glue_tpu` to `run_tpu_glue` Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>	2020-04-10 12:53:54 -04:00
Julien Chaumond	cbad305ce6	[docs] The use of `do_lower_case` in scripts is on its way to deprecation (#3738 )	2020-04-10 12:34:04 -04:00
Julien Chaumond	b169ac9c2b	[examples] Generate argparsers from type hints on dataclasses (#3669 ) * [examples] Generate argparsers from type hints on dataclasses * [HfArgumentParser] way simpler API * Restore run_language_modeling.py for easier diff * [HfArgumentParser] final tweaks from code review	2020-04-10 12:21:58 -04:00
Julien Chaumond	f98d0ef2a2	Big cleanup of `glue_convert_examples_to_features` (#3688 ) * Big cleanup of `glue_convert_examples_to_features` * Use batch_encode_plus * Cleaner wrapping of glue_convert_examples_to_features for TF @lysandrejik * Cleanup syntax, thanks to @mfuntowicz * Raise explicit error in case of user error	2020-04-10 10:20:18 -04:00
Sam Shleifer	715aa5b135	[Bart] Replace config.output_past with use_cache kwarg (#3632 )	2020-04-07 19:08:26 -04:00
Sam Shleifer	e344e3d402	[examples] SummarizationDataset cleanup (#3451 )	2020-04-07 19:05:58 -04:00
Patrick von Platen	80fa0f7812	[Examples, Benchmark] Improve benchmark utils (#3674 ) * improve and add features to benchmark utils * update benchmark style * remove output files	2020-04-07 16:25:57 -04:00
Ethan Perez	e52d1258e0	Fix RoBERTa/XLNet Pad Token in run_multiple_choice.py (#3631 ) * Fix RoBERTa/XLNet Pad Token in run_multiple_choice.py `convert_examples_to_features` sets `pad_token=0` by default, which is correct for BERT but incorrect for RoBERTa (`pad_token=1`) and XLNet (`pad_token=5`). I think the other arguments to `convert_examples_to_features` are correct, but it might be helpful if someone checked who is more familiar with this part of the codebase. * Simplifying change to match recent commits	2020-04-06 16:52:22 -04:00
Nicolas	c50aa67bff	Resizing embedding matrix before sending it to the optimizer. (#3532 ) * Resizing embedding matrix after sending it to the optimizer prevents from updating the newly resized matrix. * Remove space for style matter	2020-04-02 15:00:05 -04:00
Mark Kockerbeck	1b10159950	Adding should_continue check for retraining (#3509 )	2020-04-02 14:07:08 -04:00
Patrick von Platen	ab5d06a094	[T5, examples] replace heavy t5 models with tiny random models (#3556 ) * replace heavy t5 models with tiny random models as was done by sshleifer * fix isort	2020-04-02 12:34:05 +02:00
Julien Chaumond	50e15c825c	Tokenizers: Start cleaning examples a little (#3455 ) * Start cleaning examples * Fixup	2020-04-01 07:13:40 -04:00
Patrick von Platen	ae6834e028	[Examples] Clean summarization and translation example testing files for T5 and Bart (#3514 ) * fix conflicts * add model size argument to summarization * correct wrong import * fix isort * correct imports * other isort make style * make style	2020-03-31 17:54:13 +02:00
Ethan Perez	e5c393dceb	[Bug fix] Using loaded checkpoint with --do_predict (instead of… (#3437 ) * Using loaded checkpoint with --do_predict Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs per validation run. I think this happens since the `model` variable isn't set with the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix). * Update checkpoint loading * Fixing model loading	2020-03-30 17:06:08 -04:00
Sam Shleifer	8deff3acf2	[bart-tiny-random] Put a 5MB model on S3 to allow faster exampl… (#3488 )	2020-03-30 12:28:27 -04:00
Julien Plu	d38bbb225f	Update the NER TF script (#3511 ) * Update the NER TF script to remove the softmax and make the pad token label id to -1 * Reformat the quality and style Co-authored-by: Julien Plu <julien.plu@adevinta.com>	2020-03-30 09:50:12 -04:00
Sam Shleifer	33ef7002e1	[Docs] examples/summarization/bart: Simplify CNN/DM preprocessi… (#3516 )	2020-03-29 13:25:42 -04:00
Patrick von Platen	17dceae7a1	Fix circle ci flaky fail of wmt example (#3485 ) * force bleu * fix wrong file name * rename file * different filenames for each example test * test files should clean up after themselves * test files should clean up after themselves * do not force bleu * correct typo * fix isort	2020-03-27 13:01:28 -04:00
Funtowicz Morgan	b08259a120	run_ner.py / bert-base-multilingual-cased can output empty tokens (#2991 ) * Use tokenizer.num_added_tokens to count number of added special_tokens instead of hardcoded numbers. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * run_ner.py - Do not add a label to the labels_ids if word_tokens is empty. This can happen when using bert-base-multilingual-cased with an input containing an unique space. In this case, the tokenizer will output just an empty word_tokens thus leading to an non-consistent behavior over the labels_ids tokens adding one more tokens than tokens vector. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-03-27 10:59:55 -04:00
Patrick von Platen	f4f4946836	Rename `t5-large` to `t5-base` in README.md	2020-03-27 15:57:58 +01:00
Lysandre Debut	ff80b73157	Add option to choose T5 model size. (#3480 ) T5-small in test isort	2020-03-27 15:56:59 +01:00
Patrick von Platen	5ad2ea06af	Add wmt translation example (#3428 ) * add translation example * make style * adapt docstring * add gpu device as input for example * small renaming * better README	2020-03-26 19:07:59 +01:00
Patrick von Platen	e703e923ca	Add t5 summarization example (#3411 ) * rebase to master * change tf to pytorch * change to pytorch * small fix * renaming * add gpu training possibility * renaming * improve README * incoorporate collins feedback * better Readme * better README.md	2020-03-26 18:17:55 +01:00
Lysandre Debut	ffcffebe85	Force the return of token type IDs (#3439 )	2020-03-26 09:41:36 +01:00
Andre Carrera	3d76df3a12	BART for summarization training with CNN/DM using pytorch-lightning	2020-03-24 21:00:24 -04:00
Julien Chaumond	eaabaaf750	[run_language_modeling] Fix: initialize a new model from a config object	2020-03-24 17:56:40 -04:00
Julien Chaumond	f8823bad9a	Expose missing mappings (see #3415 )	2020-03-24 17:46:25 -04:00
Julien Chaumond	a8e3336a85	[examples] Use AutoModels in more examples	2020-03-23 20:11:14 -04:00
Julien Chaumond	f7dcf8fcea	[BertAbs] Move files around for more consistent naming	2020-03-23 13:58:49 -04:00
Julien Chaumond	cf72479bf1	One last reorder of {scheduler,optimizer}.step()	2020-03-20 18:05:50 -04:00
Elijah Rippeth	634bf6cf7e	fixes lr_scheduler warning For more details, see https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate	2020-03-20 18:03:50 -04:00
Patrick von Platen	95e00d0808	Clean special token init in modeling_....py (#3264 ) * make style * fix conflicts	2020-03-20 21:41:04 +01:00
Nitish Shirish Keskar	8becb73293	removing torch.cuda.empty_cache() from TF function (#3267 ) torch.cuda.empty_cache() was being called from a TF function (even when torch is unavailable) not sure any replacement is needed if TF OOMs	2020-03-19 23:25:30 +01:00
Julien Chaumond	656e1386a2	Fix #3305 : run_ner only possible on ModelForTokenClassification models	2020-03-19 16:41:28 -04:00
mataney	c44a17db1b	[FIX] not training when epoch is small (#3006 ) * solving bug where for small epochs and large gradient_accumulation_steps we never train * black formatting * no need to change these files	2020-03-19 11:21:21 -04:00
J.P Lee	2b60a26b46	Update examples/ner/run_ner.py to use AutoModel (#3305 ) * Update examples/ner/run_ner.py to use AutoModel * Fix missing code and apply `make style` command	2020-03-17 12:30:10 -04:00
Nathan Raw	930c9412b4	[WIP] Lightning glue example (#3290 ) * ✨ Alter base pl transformer to use automodels * 🐛 Add batch size env variable to function call * 💄 Apply black code style from Makefile * 🚚 Move lightning base out of ner directory * ✨ Add lightning glue example * 💄 self * move _feature_file to base class * ✨ Move eval logging to custom callback * 💄 Apply black code style * 🐛 Add parent to pythonpath, remove copy command * 🐛 Add missing max_length kwarg	2020-03-17 11:46:42 -04:00
Patrick von Platen	e8f44af5bf	[generate] do_sample default back to False (#3298 ) * change do_samples back * None better default as boolean * adapt do_sample to True in test example * make style	2020-03-17 10:52:37 -04:00
Thomas Wolf	2187c49f5c	CPU/GPU memory benchmarking utilities - Remove support for python 3.5 (now only 3.6+) (#3186 ) * memory benchmark rss * have both forward pass and line-by-line mem tracing * cleaned up tracing * refactored and cleaning up API * no f-strings yet... * add GPU mem logging * fix GPU memory monitoring * style and quality * clean up and doc * update with comments * Switching to python 3.6+ * fix quality	2020-03-17 10:17:11 -04:00
Sam Shleifer	5ea8ba67b4	[BART] Remove unused kwargs (#3279 ) * Remove unused kwargs * dont call forward in tests	2020-03-15 23:00:44 -04:00
Thomas Wolf	3814e167d9	Merge pull request #3225 from patrickvonplaten/finalize_merge_bart_generate_into_default_generate Complete merge Seq-2-Seq generation into default generation	2020-03-14 15:08:59 +01:00
Patrick von Platen	4f75d380a4	make style	2020-03-13 16:35:52 +01:00
Patrick von Platen	c2ee3840ae	update file to new starting token logic	2020-03-13 16:34:44 +01:00
dependabot[bot]	afea70c01c	Bump psutil from 5.6.3 to 5.6.6 in /examples/distillation Bumps [psutil](https://github.com/giampaolo/psutil) from 5.6.3 to 5.6.6. - [Release notes](https://github.com/giampaolo/psutil/releases) - [Changelog](https://github.com/giampaolo/psutil/blob/master/HISTORY.rst) - [Commits](https://github.com/giampaolo/psutil/compare/release-5.6.3...release-5.6.6) Signed-off-by: dependabot[bot] <support@github.com>	2020-03-12 21:14:56 -04:00
Sam Shleifer	2e81b9d8d7	Bart: update example for #3140 compatibility (#3233 ) * Update bart example docs	2020-03-12 10:36:37 -04:00
Patrick von Platen	5b3000d933	renamed min_len to min_length	2020-03-11 11:06:56 +01:00
Shubham Agarwal	5ca356a464	NER - pl example (#3180 ) * 1. seqeval required by ner pl example. install from examples/requirements. 2. unrecognized arguments: save_steps * pl checkpoint callback filenotfound error: make directory and pass * #3159 pl checkpoint path difference * 1. Updated Readme for pl 2. pl script now also correct displays logs 3. pass gpu ids compared to number of gpus * Updated results in readme * 1. updated readme 2. removing deprecated pl methods 3. finalizing scripts * comment length check * using deprecated validation_end for stable results * style related changes	2020-03-09 20:43:38 -04:00
Sam Shleifer	3aca02efb3	Bart example: model.to(device) (#3194 )	2020-03-09 15:09:35 -04:00
Lysandre	eb3e6cb04f	cased -> uncased in BERT SQuAD example closes #3183	2020-03-09 10:54:18 -04:00
Sam Shleifer	857e0a0d3b	Rename BartForMaskedLM -> BartForConditionalGeneration (#3114 ) * improved documentation	2020-03-05 17:41:18 -05:00
Sam Shleifer	5b396457e5	Summarization Examples: add Bart CNN Evaluation (#3082 ) * Rename and improve example * Add test * slightly faster test * style * This breaks remy prolly * shorter test string * no slow * newdir structure * New tree * Style * shorter * docs * clean * Attempt future import * more import hax	2020-03-03 15:29:59 -05:00
Davide Fiocco	c0c7ec3458	Don't crash if fine-tuned model doesn't end with a number (#3099 ) That's the same fix applied in https://github.com/huggingface/transformers/issues/2258 , but for the GLUE example	2020-03-03 08:59:47 -05:00
Victor SANH	6b1ff25084	fix n_gpu count when no_cuda flag is activated (#3077 ) * fix n_gpu count when no_cuda flag is activated * someone was left behind	2020-03-02 10:20:21 -05:00
Julien Chaumond	298bed16a8	make style	2020-03-01 14:08:01 -05:00
VictorSanh	852e032ca6	include roberta in run_squad_w_distillation - cc @graviraja	2020-03-01 01:56:50 +00:00
VictorSanh	b5509abb36	--do_lower_case will always trick me...	2020-03-01 01:39:24 +00:00
srush	908fa43b54	Changes to NER examples for PLT and TPU (#3053 ) * changes to allow for tpu training * black * tpu * tpu	2020-02-27 16:45:32 -05:00
Lysandre Debut	8bcb37bfb8	NER support for Albert in run_ner.py and NerPipeline (#2983 ) * * Added support for Albert when fine-tuning for NER * Added support for Albert in NER pipeline * Added command-line options to examples/ner/run_ner.py to better control tokenization * Added class AlbertForTokenClassification * Changed output for NerPipeline to use .convert_ids_to_tokens(...) instead of .decode(...) to better reflect tokens * Added , * Now passes style guide enforcement * Changes from reviews. * Code now passes style enforcement * Added test for AlbertForTokenClassification * Added test for AlbertForTokenClassification	2020-02-27 10:22:55 -05:00
Martin Malmsten	d762d4289c	Code now passes style enforcement	2020-02-26 23:50:40 +01:00
Martin Malmsten	9495d38b0d	Changes from reviews.	2020-02-26 23:36:39 +01:00
Andrew Walker	5bc99e7f33	fix several typos in Distil* readme (#3034 )	2020-02-26 12:39:54 -05:00
Jhuo IH	7a7ee28cb9	missing ner link (#2967 )	2020-02-25 14:06:57 -05:00
Patrick von Platen	65d74c4965	Add preprocessing step for transfo-xl tokenization to avoid tokenizing words followed by punction to <unk> (#2987 ) * add preprocessing to add space before punctuation for transfo_xl * improve warning messages * make style * compile regex at instantination of tokenizer object	2020-02-24 15:11:10 -05:00
Martin Malmsten	105dcb4162	Now passes style guide enforcement	2020-02-23 21:47:59 +01:00
Martin Malmsten	33eb8a165d	Added ,	2020-02-23 21:43:31 +01:00
Martin Malmsten	869b66f6b3	* Added support for Albert when fine-tuning for NER * Added support for Albert in NER pipeline * Added command-line options to examples/ner/run_ner.py to better control tokenization * Added class AlbertForTokenClassification * Changed output for NerPipeline to use .convert_ids_to_tokens(...) instead of .decode(...) to better reflect tokens	2020-02-23 21:13:03 +01:00
saippuakauppias	cafc4dfc7c	fix hardcoded path in examples readme	2020-02-22 11:12:38 -05:00
Patrick von Platen	fc38d4c86f	Improve special_token_id logic in run_generation.py and add tests (#2885 ) * improving generation * finalized special token behaviour for no_beam_search generation * solved modeling_utils merge conflict * solve merge conflicts in modeling_utils.py * add run_generation improvements from PR #2749 * adapted language generation to not use hardcoded -1 if no padding token is available * remove the -1 removal as hard coded -1`s are not necessary anymore * add lightweight language generation testing for randomely initialized models - just checking whether no errors are thrown * add slow language generation tests for pretrained models using hardcoded output with pytorch seed * delete ipdb * check that all generated tokens are valid * renaming * renaming Generation -> Generate * make style * updated so that generate_beam_search has same token behavior than generate_no_beam_search * consistent return format for run_generation.py * deleted pretrain lm generate tests -> will be added in another PR * cleaning of unused if statements and renaming * run_generate will always return an iterable * make style * consistent renaming * improve naming, make sure generate function always returns the same tensor, add docstring * add slow tests for all lmhead models * make style and improve example comments modeling_utils * better naming and refactoring in modeling_utils * changed fast random lm generation testing design to more general one * delete in old testing design in gpt2 * correct old variable name * temporary fix for encoder_decoder lm generation tests - has to be updated when t5 is fixed * adapted all fast random generate tests to new design * better warning description in modeling_utils * better comment * better comment and error message Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-02-21 12:09:59 -05:00
maximeilluin	c749a543fa	Added CamembertForQuestionAnswering (#2746 ) * Added CamembertForQuestionAnswering * fixed camembert tokenizer case	2020-02-21 12:01:02 -05:00
Martin Malmsten	4452b44b90	Labels are now added to model config under id2label and label2id (#2945 )	2020-02-21 08:53:05 -05:00
Sam Shleifer	53ce3854a1	New BartModel (#2745 ) * Results same as fairseq * Wrote a ton of tests * Struggled with api signatures * added some docs	2020-02-20 18:11:13 -05:00
srush	889d3bfdbb	default arg fix (#2937 )	2020-02-20 15:31:17 -05:00
srush	b662f0e625	Support for torch-lightning in NER examples (#2890 ) * initial pytorch lightning commit * tested multigpu * Fix learning rate schedule * black formatting * fix flake8 * isort * isort * . Co-authored-by: Check your git settings! <chris@chris-laptop>	2020-02-20 11:50:05 -05:00
VictorSanh	2ae98336d1	fix vocab size in binarized_data (distil): int16 vs int32	2020-02-18 16:17:35 +00:00
VictorSanh	0dbddba6d2	fix typo in hans example call	2020-02-17 20:19:57 +00:00
Manuel Romero	4e597c8e4d	Fix typo	2020-02-14 09:07:42 -05:00
Julien Chaumond	4d36472b96	[run_ner] Don't crash if fine-tuning local model that doesn't end with digit	2020-02-14 03:25:29 +00:00
Lysandre	f54a5bd37f	Raise error when using an mlm flag for a clm model + correct TextDataset	2020-02-12 13:23:14 -05:00
Lysandre	569897ce2c	Fix a few issues regarding the language modeling script	2020-02-12 13:23:14 -05:00
VictorSanh	ee5a6856ca	distilbert-base-cased weights + Readmes + omissions	2020-02-07 15:28:13 -05:00
Julien Chaumond	42f08e596f	[examples] rename run_lm_finetuning to run_language_modeling	2020-02-07 09:15:28 -05:00
Julien Chaumond	4f7bdb0958	[examples] Fix broken markdown	2020-02-07 09:15:28 -05:00
Peter Izsak	6fc3d34abd	Fix multi-gpu evaluation in run_glue.py	2020-02-06 16:38:55 -05:00

1293 Commits