transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-30 17:52:35 +06:00

Author	SHA1	Message	Date
abhishek thakur	80e4184fb0	on_log event should occur after the current log is written (#9872 )	2021-01-28 19:11:04 +01:00
Stas Bekman	15e4ce353a	[docs] expand install instructions (#9817 ) * expand install instructions * fix * white space * rewrite as discussed in the PR * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * change the wording to encourage issue report Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-28 09:36:46 -08:00
Daniel Stancl	4c3ae89ad3	Remove redundant `test_head_masking = True` flags in test files (#9858 ) * Remove redundant test_head_masking = True flags * Remove all redundant test_head_masking flags in PyTorch test_modeling_* files * Make test_head_masking = True as a default choice in test_modeling_tf_commong.py * Remove all redundant test_head_masking flags in TensorFlow test_modeling_tf_* files * Put back test_head_masking=False fot TFT5 models	2021-01-28 10:09:13 -05:00
Joe Davison	caddf9126b	tutorial typo	2021-01-28 09:21:58 -05:00
Sylvain Gugger	b4e559cfa1	Deprecate model_path in Trainer.train (#9854 )	2021-01-28 08:32:46 -05:00
Funtowicz Morgan	2ee9f9b69e	Fix computation of attention_probs when head_mask is provided. (#9853 ) * Fix computation of attention_probs when head_mask is provided. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Apply changes to the template Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-01-28 06:11:52 -05:00
Nicolas Patry	b936582f71	Fixing flaky conversational test + flag it as a pipeline test. (#9837 )	2021-01-28 10:19:55 +01:00
Lysandre Debut	58fbef9ebc	Remove submodule (#9868 )	2021-01-28 04:03:53 -05:00
Lysandre Debut	6cb0a6f01a	Partial local tokenizer load (#9807 ) * Allow partial loading of a cached tokenizer * Warning > Info * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Raise error if not local_files_only Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-28 03:29:12 -05:00
abhishek thakur	25fcb5c171	Pin memory in Trainer by default (#9857 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-01-28 08:50:46 +01:00
Stefan Schweter	5ed5a54684	ADD BORT (#9813 ) * tests: add integration tests for new Bort model * bort: add conversion script from Gluonnlp to Transformers 🚀 * bort: minor cleanup (BORT -> Bort) * add docs * make fix-copies * clean doc a bit * correct docs * Update docs/source/model_doc/bort.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/bort.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * correct dialogpt doc * correct link * Update docs/source/model_doc/bort.rst * Update docs/source/model_doc/dialogpt.rst Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make style Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-27 21:25:11 +03:00
Stas Bekman	7c6d63298f	[traner] fix --lr_scheduler_type choices (#9800 ) * fix --lr_scheduler_type choices * rewrite to fix for all enum-based cl args * cleanup * adjust test * style * Proposal that should work * Remove needless code * Fix test Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2021-01-27 10:12:15 -05:00
Sylvain Gugger	893120facc	Allow --arg Value for booleans in HfArgumentParser (#9823 ) * Allow --arg Value for booleans in HfArgumentParser * Update last test * Better error message	2021-01-27 09:31:42 -05:00
Sylvain Gugger	35d55b7b84	When resuming training from checkpoint, Trainer loads model (#9818 ) * Whenresuming training from checkpoint, Trainer loads model * Finish cleaning tests * Address review comment * Use global_step from state	2021-01-27 09:31:18 -05:00
Lysandre Debut	6b6c2b487f	Test (#9851 )	2021-01-27 09:11:53 -05:00
Lysandre Debut	56c3f07a13	Labeled pull requests (#9849 )	2021-01-27 08:45:54 -05:00
Kiyoung Kim	20932e5520	Add tpu_zone and gcp_project in training_args_tf.py (#9825 ) * add tpu_zone and gcp_project in training_args_tf.py * make style Co-authored-by: kykim <kykim>	2021-01-27 08:45:09 -05:00
Lysandre Debut	763ece2fea	Fix model templates (#9842 )	2021-01-27 08:20:58 -05:00
Julien Plu	bd701ab1a0	Fix template (#9840 )	2021-01-27 07:40:30 -05:00
Sylvain Gugger	c7b7bd9963	Add a flag for find_unused_parameters (#9820 ) * Add a flag for find_unused_parameters * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Remove negation Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-01-27 06:18:06 -05:00
Julien Plu	4adbdce5ee	Clean TF Bert (#9788 ) * Start cleaning BERT * Clean BERT and all those depends of it * Fix attribute name * Apply style * Apply Sylvain's comments * Apply Lysandre's comments * remove unused import	2021-01-27 11:28:11 +01:00
tomohideshibata	f0329ea516	Delete a needless duplicate condition (#9826 ) Co-authored-by: Tomohide Shibata <tomshiba@yahoo-corp.jp>	2021-01-27 13:15:23 +03:00
Julien Plu	a1720694a5	Remove a TF usage warning and rework the documentation (#9756 ) * Rework documentation * Update the template * Trigger CI * Restore the warning but with the TF logger * Update convbert doc	2021-01-27 10:45:42 +01:00
Nicolas Patry	285c6262a8	Adding a test to prevent late failure in the Table question answering (#9808 ) pipeline. - If table is empty then the line that contain `answer[0]` will fail. - This PR add a check to prevent `answer[0]`. - Also adds an early check for presence of `table` and `query` to prevent late failure and give better error message. - Adds a few tests to make sure these errors are correctly raised.	2021-01-27 04:10:53 -05:00
Patrick von Platen	a46050d0f5	fix typo with mt5 init (#9830 )	2021-01-27 04:09:56 -05:00
jncasey	f4bf0dea46	Fix auto-resume training from checkpoint (#9822 ) * Fix auto-resume training from checkpoint * style fixes	2021-01-27 03:48:18 -05:00
Sylvain Gugger	f2fabedbab	Setup logging with a stdout handler (#9816 )	2021-01-27 03:39:11 -05:00
Julien Plu	2c891c156d	Add a test for mixed precision (#9806 )	2021-01-27 03:36:49 -05:00
Patrick von Platen	d5b40d6693	[Setup.py] update jaxlib (#9831 ) * update jaxlib * Update setup.py * update table	2021-01-27 11:34:21 +03:00
abhishek thakur	f617490e71	ConvBERT Model (#9717 ) * finalize convbert * finalize convbert * fix * fix * fix * push * fix * tf image patches * fix torch model * tf tests * conversion * everything aligned * remove print * tf tests * fix tf * make tf tests pass * everything works * fix init * fix * special treatment for sepconv1d * style * 🙏🏽 * add doc and cleanup * add electra test again * fix doc * fix doc again * fix doc again * Update src/transformers/modeling_tf_pytorch_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/conv_bert/configuration_conv_bert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update docs/source/model_doc/conv_bert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/auto/configuration_auto.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/conv_bert/configuration_conv_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * conv_bert -> convbert * more fixes from review * add conversion script * dont use pretrained embed * unused config * suggestions from julien * some more fixes * p -> param * fix copyright * fix doc * Update src/transformers/models/convbert/configuration_convbert.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * comments from reviews * fix-copies * fix style * revert shape_list Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-01-27 03:20:09 -05:00
Patrick von Platen	e575e06287	fix led not defined (#9828 )	2021-01-27 10:43:14 +03:00
Yusuke Mori	059bb25817	Fix a bug in run_glue.py (#9812 ) (#9815 )	2021-01-26 14:32:19 -05:00
Tristan Deleu	eba418ac5d	Commit the last step on world_process_zero in WandbCallback (#9805 ) * Commit the last step on world_process_zero in WandbCallback * Use the environment variable WANDB_LOG_MODEL as a default value in WandbCallback	2021-01-26 13:21:26 -05:00
Derrick Blakely	8edc98bb70	Allow RAG to output decoder cross-attentions (#9789 ) * get cross attns * add cross-attns doc strings * fix typo * line length * Apply suggestions from code review Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>	2021-01-26 20:32:46 +03:00
Magdalena Biesialska	8f6c12d306	Fix fine-tuning translation scripts (#9809 )	2021-01-26 11:30:31 -05:00
Michael Glass	c37dcff764	Fixed parameter name for logits_processor (#9790 )	2021-01-26 18:44:02 +03:00
Sylvain Gugger	0d0efd3a0e	Smdistributed trainer (#9798 ) * Add a debug print * Adapt Trainer to use smdistributed if available * Forgotten parenthesis * Real check for sagemaker * Donforget to define device... * Woopsie, local)rank is defined differently * Update since local_rank has the proper value * Remove debug statement * More robust check for smdistributed * Quality * Deal with key not present error	2021-01-26 10:28:21 -05:00
Lysandre	897a24c869	Fix head_mask for model templates	2021-01-26 11:02:48 +01:00
Andrea Cappelli	10e5f28212	Improve pytorch examples for fp16 (#9796 ) * Pad to 8x for fp16 multiple choice example (#9752) * Pad to 8x for fp16 squad trainer example (#9752) * Pad to 8x for fp16 ner example (#9752) * Pad to 8x for fp16 swag example (#9752) * Pad to 8x for fp16 qa beam search example (#9752) * Pad to 8x for fp16 qa example (#9752) * Pad to 8x for fp16 seq2seq example (#9752) * Pad to 8x for fp16 glue example (#9752) * Pad to 8x for fp16 new ner example (#9752) * update script template #9752 * Update examples/multiple-choice/run_swag.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/question-answering/run_qa.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update examples/question-answering/run_qa_beam_search.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve code quality #9752 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-26 04:47:07 -05:00
Nicolas Patry	781e4b1384	Adding `skip_special_tokens=True` to FillMaskPipeline (#9783 ) * We most likely don't want special tokens in this output. * Adding `skip_special_tokens=True` to FillMaskPipeline - It's backward incompatible. - It makes for sense for pipelines to remove references to special_tokens (all of the other pipelines do that). - Keeping special tokens makes it hard for users to actually remove them because all models have different tokens (<s>, <cls>, [CLS], ....) * Fixing `token_str` in the same vein, and actually fix the tests too !	2021-01-26 10:06:28 +01:00
Daniel Stancl	1867d9a8d7	Add head_mask/decoder_head_mask for TF BART models (#9639 ) * Add head_mask/decoder_head_mask for TF BART models * Add head_mask and decoder_head_mask input arguments for TF BART-based models as a TF counterpart to the PR #9569 * Add test_headmasking functionality to tests/test_modeling_tf_common.py * TODO: Add a test to verify that we can get a gradient back for importance score computation * Remove redundant #TODO note Remove redundant #TODO note from tests/test_modeling_tf_common.py * Fix assertions * Make style * Fix ...Model input args and adjust one new test * Add back head_mask and decoder_head_mask to BART-based ...Model after the last commit * Remove head_mask ande decoder_head_mask from input_dict in TF test_train_pipeline_custom_model as these two have different shape than other input args (Necessary for passing this test) * Revert adding global_rng in test_modeling_tf_common.py	2021-01-26 03:50:00 -05:00
Yusuke Mori	cb73ab5a38	Fix broken links in the converting tf ckpt document (#9791 ) * Fix broken links in the converting tf ckpt document * Update docs/source/converting_tensorflow_models.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Reflect the review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-26 03:37:57 -05:00
Patrick von Platen	d94cc2f904	[Flaky Generation Tests] Make sure that no early stopping is happening for beam search (#9794 ) * fix ci * fix ci * renaming * fix dup line	2021-01-26 03:21:44 -05:00
Stas Bekman	0fdbf0850a	[PR/Issue templates] normalize, group, sort + add myself for deepspeed (#9706 ) * normalize, group, sort + add myself for deepspeed * new structure * add ray * typo * more suggestions * more suggestions * white space * Update .github/ISSUE_TEMPLATE/bug-report.md Co-authored-by: Suraj Patil <surajp815@gmail.com> * add bullets * sync * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * sync Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-25 21:09:01 -08:00
Sylvain Gugger	af41da5097	Fix style	2021-01-25 12:40:58 -05:00
Sylvain Gugger	caf4abf768	Auto-resume training from checkpoint (#9776 ) * Auto-resume training from checkpoint * Update examples/text-classification/run_glue.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Roll out to other examples Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-25 12:03:51 -05:00
Lysandre Debut	0f443436fb	Actual fix (#9787 )	2021-01-25 11:12:07 -05:00
Stas Bekman	fac7cfb16a	[fsmt] onnx triu workaround (#9738 ) * onnx triu workaround * style * working this time * add test * more efficient version	2021-01-25 08:57:37 -05:00
Sorami Hisamoto	626116b7d7	Fix a typo in Trainer.hyperparameter_search docstring (#9762 ) `compute_objectie` => `compute_objective`	2021-01-25 06:40:03 -05:00
Kai Fricke	d63ab61525	Use object store to pass trainer object to Ray Tune (#9749 )	2021-01-25 05:01:55 -05:00

6428 Commits