5958 Commits

Author SHA1 Message Date
Sam Shleifer
46509d1c19
[docs] remove sshleifer from issue-template :( (#8418) 2020-11-09 12:51:38 -05:00
Patrick von Platen
9c83b96e62
[Tests] Add Common Test for Training + Fix a couple of bugs (#8415)
* add training tests

* correct longformer

* fix docs

* fix some tests

* fix some more train tests

* remove ipdb

* fix multiple edge case model training

* fix funnel and prophetnet

* clean gpt models

* undo renaming of albert
2020-11-09 18:24:41 +01:00
Sylvain Gugger
52040517b8
Deprecate old data/metrics functions (#8420) 2020-11-09 12:10:09 -05:00
Stas Bekman
d4d1fbfc5a
[fsmt convert script] fairseq broke chkpt data - fixing that (#8377)
* fairseq broke chkpt data - fixing that

* style

* support older bpecodes filenames - specifically "code" in iwslt14
2020-11-09 11:57:42 -05:00
Sylvain Gugger
5c766ecb50
Fix typo 2020-11-09 11:50:51 -05:00
Sylvain Gugger
908a28894c
Add new token classification example (#8340)
* Add new token classification example

* Remove txt file

* Add test

* With actual testing done

* Less warmup is better

* Update examples/token-classification/run_ner_new.py

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Address review comments

* Fix test

* Make Lysandre happy

* Last touches and rename

* Rename in tests

* Address review comments

* More run_ner -> run_ner_old
2020-11-09 11:39:55 -05:00
Sylvain Gugger
c7cb1aa26c
Bump tokenizers (#8419) 2020-11-09 11:32:10 -05:00
Stas Bekman
78d706f3ae
[fsmt tokenizer] support lowercase tokenizer (#8389)
* support lowercase tokenizer

* fix arg pos
2020-11-09 10:41:39 -05:00
Shashank Gupta
1e2acd0dcf
Bug fix for permutation language modelling (#8409) 2020-11-09 10:23:26 -05:00
Philip May
bf8625e70b
add evaluate doc - trainer.evaluate returns 'epoch' from training (#8273)
* add evaluate doc

* fix style with utils/style.doc

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-09 09:00:59 -05:00
Sam Shleifer
ebde57acac
examples/docs: caveat that PL examples don't work on TPU (#8309) 2020-11-09 08:55:22 -05:00
Julien Plu
76e7a44dee
Fix some tooling for windows (#8359)
* Fix some tooling for windows

* Fix conflict

* Trigger CI
2020-11-09 13:50:38 +01:00
dartrevan
507dfb40c3
Update README.md (#8406) 2020-11-09 16:44:43 +08:00
smanjil
7247d0b4ea
updating tag for exbert viz (#8408) 2020-11-09 16:43:55 +08:00
Stas Bekman
4ab5617b0b
comet_ml temporary fix (#8410) 2020-11-09 16:36:06 +08:00
Sam Shleifer
e6d9cdaafe
[s2s/distill] remove run_distiller.sh, fix xsum script (#8412) 2020-11-08 16:57:43 -05:00
Stas Bekman
66582492d3
[s2s test_finetune_trainer] failing multigpu test (#8400) 2020-11-08 16:45:40 -05:00
Stas Bekman
f62755a600
[s2s examples test] fix data path (#8398) 2020-11-08 16:44:18 -05:00
Jonathan Chang
4a53e8e9e4
Fix DataCollatorForWholeWordMask again (#8397) 2020-11-08 09:53:01 -05:00
Manav Rathod
610730998f
fixed default labels for QA model (#8399) 2020-11-08 09:08:14 -05:00
Chengxi Guo
0b02489b2c
Add gpt2-medium-chinese model card (#8402)
* Create README.md

* Update model_cards/mymusise/gpt2-medium-chinese/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-08 05:00:19 -05:00
Stas Bekman
187554366f
fix md table (#8395) 2020-11-08 04:25:14 -05:00
Jonathan Chang
77a257fc21
Fix DataCollatorForWholeWordMask (#8379)
* Fix DataCollatorForWholeWordMask

* Replace all tensorize_batch in data_collator.py
2020-11-07 12:51:56 -05:00
Stas Bekman
517eaf460b
[make] rewrite modified_py_files in python to be cross-platform (#8371)
* rewrite modified_py_files in python to be cross-platform

* try a different way to test for variable not being ""

* improve comment
2020-11-07 18:45:16 +01:00
Patrick von Platen
07708793f2
fix encoder outputs (#8368) 2020-11-06 21:03:25 +01:00
Yossi Synett
bc0d26d1de
[All Seq2Seq model + CLM models that can be used with EncoderDecoder] Add cross-attention weights to outputs (#8071)
* Output cross-attention with decoder attention output

* Update src/transformers/modeling_bert.py

* add cross-attention for t5 and bart as well

* fix tests

* correct typo in docs

* add sylvains and sams comments

* correct typo

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-06 19:34:48 +01:00
hassoudi
30f2507a07
Update README.md (#8360)
Fix website address
2020-11-06 11:45:46 -05:00
Jonathan Chang
5807ba3fa9
Fix typo (#8351) 2020-11-06 11:19:41 -05:00
hassoudi
82146496b6
Update README.md (#8338)
fixes
2020-11-06 06:20:58 -05:00
ktrapeznikov
9e5c4d39ab
Create README.md (#8312)
* Create README.md

* Update model_cards/ktrapeznikov/gpt2-medium-topic-news/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 06:19:59 -05:00
hasantanvir79
06ebc37967
Create README.md (#8255)
* Create README.md

Initial commit

* Updated Read me

Updated

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:34:24 -05:00
Karthik Uppuluri
41cd031cf2
Create README.md (#8169) 2020-11-06 03:26:07 -05:00
Karthik Uppuluri
f932ddeff5
Create README.md (#8170) 2020-11-06 03:25:52 -05:00
Karthik Uppuluri
08b92f78fa
Create README.md (#8168)
* Create README.md

* Update README.md
2020-11-06 03:25:33 -05:00
Karthik Uppuluri
77d62e78b0
Create README.md (#8167)
* Create README.md

Telugu BERTU Readme file

* Update model_cards/kuppuluri/telugu_bertu/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:24:31 -05:00
Yifan Peng
dd6bfcaefb
Create README.md (#8327) 2020-11-06 03:22:52 -05:00
smanjil
ddeecf08e6
german medbert model details (#8266)
* model details

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:21:13 -05:00
Jiaxin Pei
96baaafd34
Create README.md (#8258) 2020-11-06 03:19:12 -05:00
Stefan Schweter
185259c261
[model_cards] Update Italian BERT models and introduce new Italian XXL ELECTRA model 🎉 (#8343) 2020-11-06 03:17:03 -05:00
Manuel Romero
34bbf60bf8
Model card: GPT-2 fine-tuned on CommonGen (#8248) 2020-11-06 03:15:11 -05:00
Manuel Romero
973218fd3b
Model card: CodeBERT fine-tuned for Insecure Code Detection (#8247)
* Model card: CodeBERT fine-tuned for Insecure Code Detection

* Update model_cards/mrm8488/codebert-base-finetuned-detect-insecure-code/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-06 03:13:45 -05:00
Manuel Romero
f833ca418b
Model card: T5-base fine-tuned on QuaRel (#8334) 2020-11-06 03:09:55 -05:00
Stas Bekman
9edafaebef
[s2s] test_bash_script.py - actually learn something (#8318)
* use decorator

* remove hardcoded paths

* make the test use more data and do real quality tests

* shave off 10 secs

* add --eval_beams 2, reformat

* reduce train size, use smaller custom dataset
2020-11-05 23:15:14 -05:00
Leandro von Werra
17450397a7
Docs bart training ref (#8330)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 17:20:57 -05:00
Stas Bekman
d787935a14
[s2s] test_distributed_eval (#8315)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 16:01:15 -05:00
Sylvain Gugger
04e442d575
Make Trainer evaluation handle dynamic seq_length (#8336)
* Make Trainer evaluation handle dynamic seq_length

* Document behavior.

* Fix test

* Better fix

* Fixes for realsies this time

* Address review comments

* Without forgetting to save...
2020-11-05 15:13:51 -05:00
Guillaume Filion
27b402cab0
Output global_attentions in Longformer models (#7562)
* Output global_attentions in Longformer models

* make style

* small refactoring

* fix tests

* make fix-copies

* add for tf as well

* remove comments in test

* make fix-copies

* make style

* add docs

* make docstring pretty

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-11-05 21:10:43 +01:00
Sam Shleifer
7abc1d96d1
no warn (#8329) 2020-11-05 11:42:24 -05:00
Bobby Donchev
52f44dd6d2
change TokenClassificationTask class methods to static methods (#7902)
* change TokenClassificationTask class methods to static methods

Since the class methods of TokenClassificationTask do not require self, we should probably switch them to static methods. Also, since the class TokenClassificationTask does not contain a constructor, it is currently unusable as is. Switching to static methods fixes this and removes the need to document the intent of the broken class.

Also, the get_labels and read_examples_from_file methods ought to be implemented. Static method definitions are unchanged after inheritance, which means that they can still be overridden, similar to other class methods.

* Trigger Build

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-11-05 09:38:30 -05:00
Guillem García Subies
77c8f6c627
Corrected typo in readme (#8320) 2020-11-05 07:48:36 -05:00