Commit Graph

6071 Commits

Author SHA1 Message Date
StillKeepTry
df2af6d8b8 Add MP Net 2 (#9004) 2020-12-09 10:32:43 -05:00
cronoik
8729109855
fixes #8968 (#9009) 2020-12-09 16:21:41 +01:00
Simon Brandeis
e977ed2142
Add the code_search_net dataset tag to CodeBERTa model cards (#9005) 2020-12-09 15:43:19 +01:00
Patrick von Platen
da37a21c89
push (#9008) 2020-12-09 15:14:33 +01:00
Sylvain Gugger
61abd50b98
Remove use of deprected method in Trainer HP search (#8996) 2020-12-09 09:13:41 -05:00
Sylvain Gugger
7e1d709e2a
Fix link to stable version in the doc navbar (#9007) 2020-12-09 09:11:39 -05:00
Patrick von Platen
02d0e0355c
Diverse beam search 2 (#9006)
* diverse beam search

* bug fixes

* bug fixes

* bug fix

* separate out diverse_beam_search function

* separate out diverse_beam_search function

* bug fix

* improve code quality

* bug fix

* bug fix

* separate out diverse beam search scorer

* code format

* code format

* code format

* code format

* add test

* code format

* documentation changes

* code quality

* add slow integration tests

* more general name

* refactor into logits processor

* add test

* avoid too much copy paste

* refactor

* add to docs

* fix-copies

* bug fix

* Revert "bug fix"

This reverts commit c99eb5a8dc.

* improve comment

* implement sylvains feedback

Co-authored-by: Ayush Jain <a.jain@sprinklr.com>
Co-authored-by: ayushtiku5 <40797286+ayushtiku5@users.noreply.github.com>
2020-12-09 15:00:37 +01:00
Lysandre Debut
67ff1c314a
Templates overhaul 1 (#8993) 2020-12-08 18:00:07 -05:00
Sylvain Gugger
447808c85f
New squad example (#8992)
* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add tick

* Update README

* Address review comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-08 14:39:29 -05:00
guillaume-be
7809eb82ae
Removed unused encoder_hidden_states and encoder_attention_mask (#8972)
* Removed unused `encoder_hidden_states` and `encoder_attention_mask` from MobileBert

* Removed decoder tests for MobileBert

* Removed now unnecessary import
2020-12-08 12:04:34 -05:00
Lysandre Debut
b7cdd00f15
Fix interaction of return_token_type_ids and add_special_tokens (#8854) 2020-12-08 12:04:01 -05:00
Sylvain Gugger
04c446f764
Make ModelOutput pickle-able (#8989) 2020-12-08 11:59:40 -05:00
Julien Chaumond
0d9e6ca9ed
[model_card] remove bogus testing changes 2020-12-08 09:58:45 -05:00
Julien Plu
bf7f79cd57
Optional layers (#8961)
* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add deprecated arguments

* Add input processing to TF XLM

* remove unused imports

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Fix wrong model name

* Fix BART

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Address Patrick's comments

* Address Patrick's comments

* Add boolean processing for the inputs

* Take into account the optional layers

* Add missing/unexpected weights in the other models

* Apply style

* rename parameters

* Apply style

* Remove useless

* Remove useless

* Remove useless

* Update num parameters

* Fix tests

* Address Patrick's comment

* Remove useless attribute
2020-12-08 09:14:09 -05:00
Stas Bekman
9d7d0005b0
[training] SAVE_STATE_WARNING was removed in pytorch (#8979)
* [training] SAVE_STATE_WARNING was removed in pytorch

FYI `SAVE_STATE_WARNING` has been removed 3 days ago: pytorch/pytorch#46813

Fixes: #8232

@sgugger

* style, but add () to prevent autoformatters from botching it

* switch to try/except

* cleanup
2020-12-07 21:59:55 -08:00
Lysandre Debut
2ae7388eee
Check table as independent script (#8976) 2020-12-07 19:55:12 -05:00
Sylvain Gugger
00aa9dbca2
Copyright (#8970)
* Add copyright everywhere missing

* Style
2020-12-07 18:36:34 -05:00
Navjot
c108d0b5a4
add max_length to showcase the use of truncation (#8975) 2020-12-07 18:35:39 -05:00
Sylvain Gugger
62d30e0583
Small fix to the run clm script (#8973) 2020-12-07 17:32:09 -05:00
Julien Chaumond
28fa014a1f
transformers-cli: LFS multipart uploads (> 5GB) (#8663)
* initial commit

* [cli] lfs commands

* Fix FileSlice

* Tweak to FileSlice

* [hf_api] Backport filetype arg from `datasets`

cc @lhoestq

* Silm down the CI while i'm working

* Ok let's try this in CI

* Update config.yml

* Do not try this at home

* one more try

* Update lfs.py

* Revert "Tweak to FileSlice"

This reverts commit d7e32c4b35.

* Update test_hf_api.py

* Update test_hf_api.py

* Update test_hf_api.py

* CI still green?

* make CI green again?

* Update test_hf_api.py

* make CI red again?

* Update test_hf_api.py

* add CI style back

* Fix CI?

* oh my

* doc + switch back to real staging endpoint

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>

* Fix docblock + f-strings

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
2020-12-07 16:38:39 -05:00
Wietse de Vries
0bce7c5508
Create README.md (#8964) 2020-12-07 16:04:14 -05:00
Nguyen Van Nha
7ccd973ea1
Update README.txt (#8957) 2020-12-07 16:01:49 -05:00
Stas Bekman
37f4c24f10
> 30 files leads to hanging on --More--
cancel debug printing for now. As it can be seen lead to a failing test here:
https://app.circleci.com/pipelines/github/huggingface/transformers/16894/workflows/cc86f7a9-4020-45af-8ab3-c22f79b427cf/jobs/131924
2020-12-07 12:18:05 -08:00
Sylvain Gugger
7f9ccffc5b
Use word_ids to get labels in run_ner (#8962)
* Use word_ids to get labels in run_ner

* Add sanity check
2020-12-07 14:26:36 -05:00
Clement
de6befd41f
Remove sourcerer (#8965) 2020-12-07 11:15:29 -05:00
sandip
483e13273f
Add TFGPT2ForSequenceClassification based on DialogRPT (#8714)
* Add TFGPT2ForSequenceClassification based on DialogRPT

* Add TFGPT2ForSequenceClassification based on DialogRPT

* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing

* Add TFGPT2ForSequenceClassification based on DialogRPT

* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing

* code refactor for latest other TF PR

* code refactor

* code refactor

* Update modeling_tf_gpt2.py
2020-12-07 16:58:37 +01:00
Sylvain Gugger
28c77ddf3b
Fix QA pipeline on Windows (#8947) 2020-12-07 09:50:32 -05:00
Philip Tamimi-Sarnikowski
72d6c9c68b
Add model card (#8948)
* add model card

* lowercase identifier

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-12-06 11:16:32 -05:00
Machel Reid
ef93a25427
Fix typo for modeling_bert import resulting in ImportError (#8931)
Self-explanatory ;) - Hope it helps!
2020-12-05 09:57:37 -05:00
Ethan Perez
8dfc8c7221
Don't pass in token_type_ids to BART for GLUE (#8929)
Without this fix, training a `BARTForSequenceClassification` model with `run_pl_glue.py` gives `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not have token_type_ids. I've solved this issue in the same way as it's solved for the "distilbert" model, and I can train BART models on SNLI without errors now.
2020-12-05 09:52:16 -05:00
Stas Bekman
df311a5ccf
[seq2seq] document the caveat of leaky native amp (#8930)
* document the caveat of leaky native amp

* Update examples/seq2seq/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-12-04 15:43:35 -08:00
Stas Bekman
73c51f7fcd
[ci] skip doc jobs - circleCI is not reliable - disable skip for now (#8926)
* disable skipping, but leave logging for the future
2020-12-04 10:13:42 -08:00
Lysandre Debut
71688a8889
Fix TF T5 only encoder model with booleans (#8925) 2020-12-04 12:28:47 -05:00
Julien Plu
dcd3046f98
Better booleans handling in the TF models (#8777)
* Apply on BERT and ALBERT

* Update TF Bart

* Add input processing to TF BART

* Add input processing for TF CTRL

* Add input processing to TF Distilbert

* Add input processing to TF DPR

* Add input processing to TF Electra

* Add deprecated arguments

* Add input processing to TF XLM

* Add input processing to TF Funnel

* Add input processing to TF GPT2

* Add input processing to TF Longformer

* Add input processing to TF Lxmert

* Apply style

* Add input processing to TF Mobilebert

* Add input processing to TF GPT

* Add input processing to TF Roberta

* Add input processing to TF T5

* Add input processing to TF TransfoXL

* Apply style

* Rebase on master

* Bug fix

* Retry to bugfix

* Retry bug fix

* Fix wrong model name

* Try another fix

* Fix BART

* Fix input precessing

* Apply style

* Put the deprecated warnings in the input processing function

* Remove the unused imports

* Raise an error when len(kwargs)>0

* test ModelOutput instead of TFBaseModelOutput

* Bug fix

* Address Patrick's comments

* Address Patrick's comments

* Address Sylvain's comments

* Add boolean processing for the inputs

* Apply style

* Missing optional

* Fix missing some input proc

* Update the template

* Fix missing inputs

* Missing input

* Fix args parameter

* Trigger CI

* Trigger CI

* Trigger CI

* Address Patrick's and Sylvain's comments

* Replace warn by warning

* Trigger CI

* Fix XLNET

* Fix detection
2020-12-04 09:08:29 -05:00
Stas Bekman
4c3d98dddc
[s2s finetune_trainer] add instructions for distributed training (#8884) 2020-12-03 16:05:55 -08:00
Lysandre Debut
aa60b230ec
Patch model parallel test (#8920)
* Patch model parallel test

* Remove line

* Remove `ci_*` from scheduled branches
2020-12-03 17:15:47 -05:00
Lysandre Debut
0c5615af66
Put Transformers on Conda (#8918)
* conda

* Guide

* correct tag

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/installation.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Sylvain's comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-12-03 14:28:49 -05:00
Julien Chaumond
9ad6194318
Tweak wording + Add badge w/ number of models on the hub (#8914)
* Add badge w/ number of models on the hub

* try to apease @sgugger 😇

* not sure what this `c` was about [ci skip]

* Fix script and move stuff around

* Fix doc styling error

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2020-12-03 10:56:55 -05:00
Sylvain Gugger
6ed7e32f7c
Fix move when the two cache folders exist (#8917) 2020-12-03 10:50:13 -05:00
Sylvain Gugger
8453201cfe
Avoid erasing the attention mask when double padding (#8915) 2020-12-03 10:45:07 -05:00
Skye Wanderman-Milne
0deece9c53
Don't warn that models aren't available if Flax is available. (#8841) 2020-12-03 10:33:12 -05:00
Julien Chaumond
2b7fc9a0fd [model_cards] lm-head was deprecated
(and wasn't needed here anyways as it was added automatically)
2020-12-03 15:05:01 +01:00
Patrick von Platen
443f67e887
[PyTorch] Refactor Resize Token Embeddings (#8880)
* fix resize tokens

* correct mobile_bert

* move embedding fix into modeling_utils.py

* refactor

* fix lm head resize

* refactor

* break lines to make sylvain happy

* add news tests

* fix typo

* improve test

* skip bart-like for now

* check if base_model = get(...) is necessary

* clean files

* improve test

* fix tests

* revert style templates

* Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py
2020-12-02 19:19:50 +01:00
Devangi Purkayastha
e52f9c0ade
Update README.md (#8906) 2020-12-02 09:28:44 -08:00
ryota-mo
801b2cb36f
Fix typo in docstring (#8905) 2020-12-02 12:08:31 -05:00
Stas Bekman
7e1cb00c37
[trainer] improve code readability (#8903)
* [trainer] improve code

This PR:
- removes redundant code 
```
self.model = model if model is not None else None
```
and
```
self.model = model
```
are the same.

* separate attribute assignment from code logic - which simplifies things further.

* whitespace
2020-12-02 09:07:42 -08:00
Nicolas Patry
a8c3f9aa76
Warning about too long input for fast tokenizers too (#8799)
* Warning about too long input for fast tokenizers too

If truncation is not set in tokenizers, but the tokenization is too long
for the model (`model_max_length`), we used to trigger a warning that

The input would probably fail (which it most likely will).

This PR re-enables the warning for fast tokenizers too and uses common
code for the trigger to make sure it's consistent across.

* Checking for pair of inputs too.

* Making the function private and adding it's doc.

* Remove formatting ?? in odd place.

* Missed uppercase.
2020-12-02 10:18:28 -05:00
sandip
f6b44e6190
Transfoxl seq classification (#8868)
* Transfoxl sequence classification

* Transfoxl sequence classification
2020-12-02 10:08:32 -05:00
Stas Bekman
24f0c2fe33
[ci] skip doc jobs take #3 (#8885)
* check that we get any match first

* docs only

* 2 docs only

* add code

* restore
2020-12-02 10:06:45 -05:00
Stas Bekman
693ac3594b
disable job skip - need more work
reference: https://github.com/huggingface/transformers/pull/8853#issuecomment-736779863
2020-12-01 12:03:29 -08:00