Commit Graph

6600 Commits

Author SHA1 Message Date
acul3
8940c7662d
Add t5 convert to transformers-cli (#9654)
* Update run_mlm.py

* add t5 model to transformers-cli convert

* update rum_mlm.py same as master

* update converting model docs

* update converting model docs

* Update convert.py

* Trigger notification

* update import sorted

* fix typo t5
2021-01-20 09:34:27 -05:00
Julien Plu
7251a4736d
Fix template (#9697) 2021-01-20 09:04:53 -05:00
Julien Plu
14042d560f
New TF embeddings (cleaner and faster) (#9418)
* Create new embeddings + add to BERT

* Add Albert

* Add DistilBert

* Add Albert + Electra + Funnel

* Add Longformer + Lxmert

* Add last models

* Apply style

* Update the template

* Remove unused imports

* Rename attribute

* Import embeddings in their own model file

* Replace word_embeddings per weight

* fix naming

* Fix Albert

* Fix Albert

* Fix Longformer

* Fix Lxmert Mobilebert and MPNet

* Fix copy

* Fix template

* Update the get weights function

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/electra/modeling_tf_electra.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address Sylvain's comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-20 12:08:12 +01:00
Julien Plu
12f0d7e8e0
Fix label datatype in TF Trainer (#9616)
* Fix label datatype

* Apply style
2021-01-20 12:08:00 +01:00
Sylvain Gugger
76f36e183a
Add a community page to the docs (#9682) 2021-01-20 04:54:36 -05:00
Sylvain Gugger
582f516adb
Use datasets squad_v2 metric in run_qa (#9677) 2021-01-20 04:52:13 -05:00
LSinev
a98173cc45
make RepetitionPenaltyLogitsProcessor faster (#9600) 2021-01-20 10:23:01 +01:00
Sylvain Gugger
a1ad16a446
Restrain tokenizer.model_max_length default (#9681)
* Restrain tokenizer.model_max_length default

* Fix indent
2021-01-20 04:17:39 -05:00
Sylvain Gugger
7e662e6a3b
Fix model templates and use less than 119 chars (#9684)
* Fix model templates and use less than 119 chars

* Missing new line
2021-01-19 17:11:22 -05:00
Daniel Stancl
2ebbbf558c
Add separated decoder_head_mask for T5 Models (#9634)
* Add decoder_head_mask for PyTorch T5 model

* Add decoder_head_mask args into T5Model and T5ForConditionalGeneration

* Slightly change the order of input args to be in accordance
with the convention from BART-based models introduced within the PR #9569.

* Make style for modeling_t5.py

* Add decoder_head_mask for TF T5 models

* Separate head_mask and decoder_head_mask args in TF T5 models

* Slightly change the order of input args to follow convention
of BART-based models updated in PR #9569

* Update test_forward_signature tests/test_modeling_tf_common.py
w.r.t. the changed order of input args

* Add FutureWarnings for T5 and TFT5 models

* Add FutureWarnings for T5 and TFT5 models warning a user that
input argument `head_mask` was split into two arguments -
`head_mask` and `decoder_head_mask`

* Add default behaviour - `decoder_head_mask` is set to copy
`head_mask`

* Fix T5 modeling and FutureWarning

* Make proper usage of head_mask and decoder_head_mask
in cross_attention

* Fix conditions for raising FutureWarning

* Reformat FutureWarning in T5 modeling

* Refactor the warning message
2021-01-19 22:50:25 +01:00
Sylvain Gugger
e4c06ed664
New run_seq2seq script (#9605)
* New run_seq2seq script

* Add tests

* Mark as slow

* Update examples/seq2seq/run_seq2seq.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/data/data_collator.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update src/transformers/data/data_collator.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Address review comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-01-19 15:22:17 -05:00
Julien Plu
fa876aee2a
Fix TF Flaubert and XLM (#9661)
* Fix Flaubert and XLM

* Fix Flaubert and XLM

* Apply style
2021-01-19 18:02:57 +01:00
max yue
11ec74905a
Update integrations.py (#9652)
File "/share/apps/anaconda3/envs/my_env/lib/python3.7/site-packages/transformers/integrations.py", line 419, in __init__
    self._SummaryWriter = SummaryWriter
UnboundLocalError: local variable 'SummaryWriter' referenced before assignment
2021-01-19 11:39:49 -05:00
Yusuke Mori
b020a736c3
Update past_key_values in GPT-2 (#9596)
* Update past_key_values in gpt2 (#9391)

* Update generation_utils, and rename some items

* Update modeling_gpt2 to avoid an error in gradient_checkpointing

* Remove 'reorder_cache' from util and add variations to XLNet, TransfoXL, GPT-2

* Change the location of '_reorder_cache' in modeling files

* Add '_reorder_cache' in modeling_ctrl

* Fix a bug of my last commit in CTRL

* Add '_reorder_cache' to GPT2DoubleHeadsModel

* Manage 'use_cache' in config of test_modeling_gpt2

* Clean up the doc string

* Update src/transformers/models/gpt2/modeling_gpt2.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix the doc string (GPT-2, CTRL)

* improve gradient_checkpointing_behavior

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-19 16:00:15 +01:00
Sylvain Gugger
97b787fb4e
Fix old Seq2SeqTrainer (#9675) 2021-01-19 09:56:25 -05:00
Sylvain Gugger
d302d88b47
Fix GPT conversion script (#9676) 2021-01-19 09:55:37 -05:00
Sylvain Gugger
053efc5d2d
Fix imports in conversion scripts (#9674) 2021-01-19 09:40:15 -05:00
Patrick von Platen
2390c16fd2
add mbart to automodel for masked lm (#9673) 2021-01-19 15:19:11 +01:00
Patrick von Platen
b39bd763e8
Update README.md 2021-01-19 12:25:51 +01:00
Sergey Mkrtchyan
917dbb15e0
Fix DPRReaderTokenizer's attention_mask (#9663)
* Fix the attention_mask in DPRReaderTokenizer

* Add an integration test for DPRReader inference

* Run make style
2021-01-19 05:43:11 -05:00
Patrick von Platen
12c1b5b8f4
fix test (#9669) 2021-01-19 09:06:24 +01:00
Daniel Stancl
357fb1c5d8
Add head_mask/decoder_head_mask for BART (#9569)
* Add head_mask/decoder_head_mask for BART

This branch implement head_mask and decoder_head_mask
for BART-based models. Full list below:
- BART
- MBart
- Blenderbot
- BlenderbotSmall
- Marian
- Pegasus

Everything is accompanied with updated testing.

* Fix test_headmasking for BART models

* Fix text_headmasking for BART-like models
which has only 2 layers in each modules.
The condition
```
self.assertNotEqual(attentions[1][..., 0, :, :].flatten().sum().item(), 0.0)
```
is, therefore, invalid for encoder-decoder models considering
the `head_mask`
```
head_mask = torch.ones(
    self.model_tester.num_hidden_layers,
    self.model_tester.num_attention_heads,
    device=torch_device,
)
head_mask[0, 0] = 0
head_mask[-1, :-1] = 0
```
specified in the `test_headmasking` test/function.

* Adjust test_modeling_common.py to reflect T5 input args

* Update tests/test_modeling_common.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make style

* make fix-copies

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-18 13:35:22 +01:00
Devrim
65eb5d9ac5
Fix: torch.utils.checkpoint import error. (#9626) 2021-01-18 04:33:39 -05:00
Anthony MOI
72fc9abf17
Remove duplicated extra["retrieval"] (#9621) 2021-01-18 04:24:21 -05:00
Stas Bekman
c60e0e1ee4
deepspeed + grad acumm (#9622) 2021-01-15 10:12:26 -08:00
Lysandre Debut
6d3b688b04
Ignore lm_head decoder bias warning (#9615)
* Ignore lm_head decoder bias warning

* Revert "Ignore lm_head decoder bias warning"

This reverts commit f25177a9da.

* predictions -> lm_head
2021-01-15 09:40:21 -05:00
Julien Plu
8eba1f8ca8
Remove unused token_type_ids in MPNet (#9564)
* Add warning

* Remove unused import

* Fix missing call

* Fix missing call

* Completely remove token_type_ids

* Apply style

* Remove unused import

* Update src/transformers/models/mpnet/modeling_tf_mpnet.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-15 08:06:29 -05:00
Patrick von Platen
90ca8d36e9
[TF Led] Fix wrong decoder attention mask behavior (#9601)
* fix tf led

* remove loop file
2021-01-15 06:40:27 -05:00
Kiyoung Kim
85788bae5c Revert "Gradient accumulation for TFTrainer (#9585)"
This reverts commit 3f40070c88.
2021-01-15 10:47:01 +01:00
Stas Bekman
82498cbc37
[deepspeed doc] install issues + 1-gpu deployment (#9582)
* [doc] install + 1-gpu deployment

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improvements

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-14 11:05:04 -08:00
Sylvain Gugger
329fe2746a
Upstream (and rename) sortish sampler (#9574)
* Upstream (and rename) sortish sampler

* Use proper sampler

* Update src/transformers/trainer_pt_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-14 10:38:14 -05:00
Kiyoung Kim
3f40070c88
Gradient accumulation for TFTrainer (#9585)
* gradient accumulation for tftrainer

* label naming

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* label naming

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-14 10:16:39 -05:00
Lysandre
e43f3b6190 v4.2.1 in docs 2021-01-14 14:25:30 +01:00
Lysandre Debut
280db79ac1
BatchEncoding.to with device with tests (#9584) 2021-01-14 07:57:58 -05:00
Lysandre Debut
8bf27075a2
Fix conda build (#9589)
* conda build -> conda-build

* Syntax error

* conda build -> conda-build + 4.2.0

* Prepare to merge in `master`
2021-01-14 05:51:52 -05:00
Stas Bekman
c99751dd9d
[setup.py] note on how to get to transformers exact dependencies from shell (#9553)
* note on how to get to deps from shell

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix text

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-14 05:04:08 -05:00
Julien Plu
a26536f0c8
Make logs tf compliant (#9565) 2021-01-14 04:56:53 -05:00
Julien Plu
14d677ca4a
Compliancy with tf-nightly (#9570)
* Compliancy with tf-nightly

* Add more version + restore min version check
2021-01-14 04:35:35 -05:00
Sylvain Gugger
46ed56cfd1
Switch metrics in run_ner to datasets (#9567)
* Switch metrics in run_ner to datasets

* Add flag to return all metrics

* Upstream (and rename) sortish_sampler

* Revert "Upstream (and rename) sortish_sampler"

This reverts commit e07d0dcf65.
2021-01-14 03:37:07 -05:00
Sylvain Gugger
5e1bea4f16
Fix Trainer with a parallel model (#9578)
* Fix Trainer with a parallel model

* More clean up
2021-01-14 03:23:41 -05:00
Patrick von Platen
126fd281bc
Update README.md 2021-01-13 16:55:59 +01:00
Lysandre
e63cad7936 v4.3.0.dev0 2021-01-13 16:16:54 +01:00
Lysandre
33a8497db8 v4.2.0 documentation 2021-01-13 16:15:40 +01:00
Lysandre
7d9a9d0c72 Release: v4.2.0 2021-01-13 16:01:51 +01:00
Lysandre Debut
c949516695
Fix slow tests v4.2.0 (#9561)
* Fix conversational pipeline test

* LayoutLM

* ProphetNet

* BART

* Blenderbot & small

* Marian

* mBART

* Pegasus

* Tapas tokenizer

* BERT2BERT test

* Style

* Example requirements

* TF BERT2BERT test
2021-01-13 09:55:48 -05:00
Sylvain Gugger
04dc65e5c6
Fix data parallelism in Trainer (#9566)
* Fix data parallelism in Trainer

* Update src/transformers/training_args.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-13 09:54:41 -05:00
Stas Bekman
b2dfcc567b
use correct deps for torchhub (#9552) 2021-01-13 08:02:53 -05:00
Yusuke Mori
eabad8fd9c
Update run_glue for do_predict with local test data (#9442) (#9486)
* Update run_glue for do_predict with local test data (#9442)

* Update run_glue (#9442): fix comments ('files' to 'a file')

* Update run_glue (#9442): reflect the code review

* Update run_glue (#9442): auto format

* Update run_glue (#9442): reflect the code review
2021-01-13 07:48:35 -05:00
LSinev
0c9f01a8e5
Speed up TopKLogitsWarper and TopPLogitsWarper (pytorch) (#9557)
* make TopKLogitsWarper faster

* make TopPLogitsWarper faster
2021-01-13 07:47:47 -05:00
Pavel Tarashkevich
27d0e01d75
Fix classification script: enable dynamic padding with truncation (#9554)
Co-authored-by: Pavel Tarashkevich <Pavel.Tarashkievich@orange.com>
2021-01-13 07:46:48 -05:00