10 Commits

Author SHA1 Message Date
Thomas Wolf
ba8c4d0ac0
[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659)
* splitting fast and slow tokenizers [WIP]

* [WIP] splitting sentencepiece and tokenizers dependencies

* update dummy objects

* add name_or_path to models and tokenizers

* prefix added to file names

* prefix

* styling + quality

* splitting all the tokenizer files - sorting sentencepiece based ones

* update tokenizer version up to 0.9.0

* remove hard dependency on sentencepiece 🎉

* and removed hard dependency on tokenizers 🎉

* update conversion script

* update missing models

* fixing tests

* move test_tokenization_fast to main tokenization tests - fix bugs

* bump up tokenizers

* fix bert_generation

* update and fix several tokenizers

* keep sentencepiece in deps for now

* fix funnel and deberta tests

* fix fsmt

* fix marian tests

* fix layoutlm

* fix squeezebert and gpt2

* fix T5 tokenization

* fix xlnet tests

* style

* fix mbart

* bump up tokenizers to 0.9.2

* fix model tests

* fix tf models

* fix seq2seq examples

* fix tests without sentencepiece

* fix slow => fast conversion without sentencepiece

* update auto and bert generation tests

* fix mbart tests

* fix auto and common test without tokenizers

* fix tests without tokenizers

* clean up tests - lighten up when tokenizers + sentencepiece are both off

* style quality and tests fixing

* add sentencepiece to doc/examples reqs

* leave sentencepiece on for now

* style quality split herbert and fix pegasus

* WIP Herbert fast

* add sample_text_no_unicode and fix herbert tokenization

* skip FSMT example test for now

* fix style

* fix fsmt in example tests

* update following Lysandre and Sylvain's comments

* Update src/transformers/testing_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-18 20:51:24 +02:00
Lysandre
a75c64d80c Black 20 release 2020-08-26 17:20:22 +02:00
Sylvain Gugger
a573777901
Update repo to isort v5 (#6686)
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
2020-08-24 11:03:01 -04:00
Patrick von Platen
505f2d749e
[Tests] fix attention masks in Tests (#6621)
* fix distilbert

* fix typo
2020-08-20 13:23:47 -04:00
Stas Bekman
5deed37f9f
cleanup torch unittests (#6196)
* improve unit tests

this is a sample of one test according to the request in https://github.com/huggingface/transformers/issues/5973
before I apply it to the rest

* batch 1

* batch 2

* batch 3

* batch 4

* batch 5

* style

* non-tf template

* last deletion of check_loss_output
2020-08-04 02:42:56 -04:00
Sylvain Gugger
d951c14ae4
Model output test (#6155)
* Use return_dict=True in all tests

* Formatting
2020-07-31 09:44:37 -04:00
Sam Shleifer
13deb95a40
Move tests/utils.py -> transformers/testing_utils.py (#5350) 2020-07-01 10:31:17 -04:00
Sam Shleifer
5144104070
[fix] remove unused import (#5206) 2020-06-22 23:39:04 -04:00
Sam Shleifer
0d158e38c9
[fix] mobilebert had wrong path, causing slow test failure (#5205) 2020-06-22 23:31:36 -04:00
Vasily Shamporov
9a3f91088c
Add MobileBert (#4901)
* Add MobileBert

* Quality + Conversion script

* style

* Update src/transformers/modeling_mobilebert.py

* Links to S3

* Style

* TFMobileBert

Slight fixes to the pytorch MobileBert
Style

* MobileBertForMaskedLM (PT + TF)

* MobileBertForNextSentencePrediction (PT + TF)

* MobileBertFor{MultipleChoice, TokenClassification} (PT + TF)

* Tests + Auto

* Doc

* Tests

* Addressing @sgugger's comments

* Addressing @patrickvonplaten's comments

* Style

* Style

* Integration test

* style

* Model card

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-19 16:38:36 -04:00