transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Daniel Stancl	1867d9a8d7	Add head_mask/decoder_head_mask for TF BART models (#9639 ) * Add head_mask/decoder_head_mask for TF BART models * Add head_mask and decoder_head_mask input arguments for TF BART-based models as a TF counterpart to the PR #9569 * Add test_headmasking functionality to tests/test_modeling_tf_common.py * TODO: Add a test to verify that we can get a gradient back for importance score computation * Remove redundant #TODO note Remove redundant #TODO note from tests/test_modeling_tf_common.py * Fix assertions * Make style * Fix ...Model input args and adjust one new test * Add back head_mask and decoder_head_mask to BART-based ...Model after the last commit * Remove head_mask ande decoder_head_mask from input_dict in TF test_train_pipeline_custom_model as these two have different shape than other input args (Necessary for passing this test) * Revert adding global_rng in test_modeling_tf_common.py	2021-01-26 03:50:00 -05:00
Yusuke Mori	cb73ab5a38	Fix broken links in the converting tf ckpt document (#9791 ) * Fix broken links in the converting tf ckpt document * Update docs/source/converting_tensorflow_models.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Reflect the review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-26 03:37:57 -05:00
Patrick von Platen	d94cc2f904	[Flaky Generation Tests] Make sure that no early stopping is happening for beam search (#9794 ) * fix ci * fix ci * renaming * fix dup line	2021-01-26 03:21:44 -05:00
Stas Bekman	0fdbf0850a	[PR/Issue templates] normalize, group, sort + add myself for deepspeed (#9706 ) * normalize, group, sort + add myself for deepspeed * new structure * add ray * typo * more suggestions * more suggestions * white space * Update .github/ISSUE_TEMPLATE/bug-report.md Co-authored-by: Suraj Patil <surajp815@gmail.com> * add bullets * sync * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * sync Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-25 21:09:01 -08:00
Sylvain Gugger	af41da5097	Fix style	2021-01-25 12:40:58 -05:00
Sylvain Gugger	caf4abf768	Auto-resume training from checkpoint (#9776 ) * Auto-resume training from checkpoint * Update examples/text-classification/run_glue.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Roll out to other examples Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-01-25 12:03:51 -05:00
Lysandre Debut	0f443436fb	Actual fix (#9787 )	2021-01-25 11:12:07 -05:00
Stas Bekman	fac7cfb16a	[fsmt] onnx triu workaround (#9738 ) * onnx triu workaround * style * working this time * add test * more efficient version	2021-01-25 08:57:37 -05:00
Sorami Hisamoto	626116b7d7	Fix a typo in Trainer.hyperparameter_search docstring (#9762 ) `compute_objectie` => `compute_objective`	2021-01-25 06:40:03 -05:00
Kai Fricke	d63ab61525	Use object store to pass trainer object to Ray Tune (#9749 )	2021-01-25 05:01:55 -05:00
Maria Janina Sarol	6312fed47d	Fix TFTrainer prediction output (#9662 ) * Fix TFTrainer prediction output * Update trainer_tf.py * Fix TFTrainer prediction output * Fix evaluation_loss update in TFTrainer * Fix TFTrainer prediction output	2021-01-25 10:27:12 +01:00
Wilfried L. Bounsi	9152f16023	Fix broken [Open in Colab] links (#9761 )	2021-01-23 15:11:46 +05:30
Stas Bekman	b7b7e5d049	token_type_ids isn't used (#9736 )	2021-01-22 20:38:53 -08:00
Julien Plu	a449ffcbd2	Fix test (#9755 )	2021-01-22 17:40:16 +01:00
Sylvain Gugger	82d46febeb	Add `report_to` training arguments to control the reporting integrations used (#9735 )	2021-01-22 10:34:34 -05:00
Sylvain Gugger	411c582109	Fixes to run_seq2seq and instructions (#9734 ) * Fixes to run_seq2seq and instructions * Add more defaults for summarization	2021-01-22 10:03:57 -05:00
Julien Plu	d7c31abf38	Fix some TF slow tests (#9728 ) * Fix saved model tests + fix a graph issue in longformer * Apply style	2021-01-22 14:50:46 +01:00
Stefan Schweter	08b22722c7	examples: fix XNLI url (#9741 )	2021-01-22 18:13:52 +05:30
Sylvain Gugger	5f80c15ef5	Fix memory regression in Seq2Seq example (#9713 ) * Fix memory regression in Seq2Seq example * Fix test and properly deal with -100 * Easier condition with device safety * Patch for MBartTokenzierFast	2021-01-21 12:05:46 -05:00
Julien Plu	a7dabfb3d1	Fix TF s2s models (#9478 ) * Fix Seq2Seq models for serving * Apply style * Fix lonfgormer * Fix mBart/Pegasus/Blenderbot * Apply style * Add a main intermediate layer * Apply style * Remove import * Apply tf.function to Longformer * Fix utils check_copy * Update S2S template * Fix BART + Blenderbot * Fix BlenderbotSmall * Fix BlenderbotSmall * Fix BlenderbotSmall * Fix MBart * Fix Marian * Fix Pegasus + template * Apply style * Fix common attributes test * Forgot to fix the LED test * Apply Patrick's comment on LED Decoder	2021-01-21 17:03:29 +01:00
Nicolas Patry	23e5a36ee6	Changing model default for TableQuestionAnsweringPipeline. (#9729 ) * Changing model default for TableQuestionAnsweringPipeline. - Discussion: https://discuss.huggingface.co/t/table-question-answering-is-not-an-available-task-under-pipeline/3284/6 * Updating slow tests that were out of sync.	2021-01-21 14:31:51 +01:00
Julien Plu	3f290e6c84	Fix mixed precision in TF models (#9163 ) * Fix Gelu precision * Fix gelu_fast * Naming * Fix usage and apply style * add TF gelu approximate version * add TF gelu approximate version * add TF gelu approximate version * Apply style * Fix albert * Remove the usage of the Activation layer	2021-01-21 07:00:11 -05:00
Suraj Patil	248fa1ae72	fix T5 head mask in model_parallel (#9726 ) * fix head mask in model_parallel * pass correct head mask	2021-01-21 12:16:14 +01:00
Patrick von Platen	ca422e3d7d	finish (#9721 )	2021-01-21 05:17:13 -05:00
Patrick von Platen	c8ea582ed6	reduce led memory (#9723 )	2021-01-21 05:16:15 -05:00
guillaume-be	fb36c273a2	Allow text generation for ProphetNetForCausalLM (#9707 ) * Moved ProphetNetForCausalLM's parent initialization after config update * Added unit tests for generation for ProphetNetForCausalLM	2021-01-21 11:13:38 +01:00
Lysandre Debut	910aa89671	Temporarily deactivate TPU tests while we work on fixing them (#9720 )	2021-01-21 04:17:39 -05:00
Muennighoff	6a346f0358	fix typo (#9708 ) * fix typo Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-01-21 13:51:01 +05:30
Stas Bekman	4a20b7c450	[trainer] no --deepspeed and --sharded_ddp together (#9712 ) * no --deepspeed and --sharded_ddp together * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-20 16:50:21 -08:00
Sylvain Gugger	7acfa95afb	Add missing new line	2021-01-20 14:13:16 -05:00
Darigov Research	5a307ece82	Adds flashcards to Glossary & makes small corrections (#8949 ) * fix: Makes small typo corrections & standardises glossary * feat: Adds introduction & links to transformer flashcards * feat: Adds attribution & adjustments requested in #8949 * feat: Adds flashcards to community.md * refactor: Removes flashcards from glossary	2021-01-20 13:28:40 -05:00
Sylvain Gugger	3cd91e8162	Fix WAND_DISABLED test (#9703 ) * Fix WAND_DISABLED test * Remove duplicate import * Make a test that actually works... * Fix style	2021-01-20 12:30:24 -05:00
Sylvain Gugger	2a703773aa	Fix style	2021-01-20 12:17:40 -05:00
Stas Bekman	cd5565bed3	fix the backward for deepspeed (#9705 )	2021-01-20 09:07:07 -08:00
Gunjan Chhablani	538245b0c2	Fix Trainer and Args to mention AdamW, not Adam. (#9685 ) * Fix Trainer and Args to mention AdamW, not Adam. * Update the docs for Training Arguments. * Change arguments adamw_* to adam_* * Fixed links to AdamW in TrainerArguments docs * Fix line length in Training Args docs.	2021-01-20 11:59:31 -05:00
NielsRogge	88583d4958	Add notebook (#9696 )	2021-01-20 10:19:26 -05:00
NielsRogge	d1370d29b1	Add DeBERTa head models (#9691 ) * Add DebertaForMaskedLM, DebertaForTokenClassification, DebertaForQuestionAnswering * Add docs and fix quality * Fix Deberta not having pooler	2021-01-20 10:18:50 -05:00
Sylvain Gugger	a7b62fece5	Fix Funnel Transformer conversion script (#9683 )	2021-01-20 09:50:20 -05:00
acul3	8940c7662d	Add t5 convert to transformers-cli (#9654 ) * Update run_mlm.py * add t5 model to transformers-cli convert * update rum_mlm.py same as master * update converting model docs * update converting model docs * Update convert.py * Trigger notification * update import sorted * fix typo t5	2021-01-20 09:34:27 -05:00
Julien Plu	7251a4736d	Fix template (#9697 )	2021-01-20 09:04:53 -05:00
Julien Plu	14042d560f	New TF embeddings (cleaner and faster) (#9418 ) * Create new embeddings + add to BERT * Add Albert * Add DistilBert * Add Albert + Electra + Funnel * Add Longformer + Lxmert * Add last models * Apply style * Update the template * Remove unused imports * Rename attribute * Import embeddings in their own model file * Replace word_embeddings per weight * fix naming * Fix Albert * Fix Albert * Fix Longformer * Fix Lxmert Mobilebert and MPNet * Fix copy * Fix template * Update the get weights function * Update src/transformers/modeling_tf_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/electra/modeling_tf_electra.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address Sylvain's comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-01-20 12:08:12 +01:00
Julien Plu	12f0d7e8e0	Fix label datatype in TF Trainer (#9616 ) * Fix label datatype * Apply style	2021-01-20 12:08:00 +01:00
Sylvain Gugger	76f36e183a	Add a community page to the docs (#9682 )	2021-01-20 04:54:36 -05:00
Sylvain Gugger	582f516adb	Use datasets squad_v2 metric in run_qa (#9677 )	2021-01-20 04:52:13 -05:00
LSinev	a98173cc45	make RepetitionPenaltyLogitsProcessor faster (#9600 )	2021-01-20 10:23:01 +01:00
Sylvain Gugger	a1ad16a446	Restrain tokenizer.model_max_length default (#9681 ) * Restrain tokenizer.model_max_length default * Fix indent	2021-01-20 04:17:39 -05:00
Sylvain Gugger	7e662e6a3b	Fix model templates and use less than 119 chars (#9684 ) * Fix model templates and use less than 119 chars * Missing new line	2021-01-19 17:11:22 -05:00
Daniel Stancl	2ebbbf558c	Add separated decoder_head_mask for T5 Models (#9634 ) * Add decoder_head_mask for PyTorch T5 model * Add decoder_head_mask args into T5Model and T5ForConditionalGeneration * Slightly change the order of input args to be in accordance with the convention from BART-based models introduced within the PR #9569. * Make style for modeling_t5.py * Add decoder_head_mask for TF T5 models * Separate head_mask and decoder_head_mask args in TF T5 models * Slightly change the order of input args to follow convention of BART-based models updated in PR #9569 * Update test_forward_signature tests/test_modeling_tf_common.py w.r.t. the changed order of input args * Add FutureWarnings for T5 and TFT5 models * Add FutureWarnings for T5 and TFT5 models warning a user that input argument `head_mask` was split into two arguments - `head_mask` and `decoder_head_mask` * Add default behaviour - `decoder_head_mask` is set to copy `head_mask` * Fix T5 modeling and FutureWarning * Make proper usage of head_mask and decoder_head_mask in cross_attention * Fix conditions for raising FutureWarning * Reformat FutureWarning in T5 modeling * Refactor the warning message	2021-01-19 22:50:25 +01:00
Sylvain Gugger	e4c06ed664	New run_seq2seq script (#9605 ) * New run_seq2seq script * Add tests * Mark as slow * Update examples/seq2seq/run_seq2seq.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/data/data_collator.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Update src/transformers/data/data_collator.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Address review comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-01-19 15:22:17 -05:00
Julien Plu	fa876aee2a	Fix TF Flaubert and XLM (#9661 ) * Fix Flaubert and XLM * Fix Flaubert and XLM * Apply style	2021-01-19 18:02:57 +01:00

6388 Commits