5899 Commits

Author SHA1 Message Date
Nicola De Cao
2f9d49b389
Adding PrefixConstrainedLogitsProcessor (#8529)
* Adding PrefixConstrainedLogitsProcessor

* fixing RAG and style_doc

* fixing black (v20 instead of v19)

* Improving doc in generation_logits_process.py

* Improving docs and typing in generation_utils.py

* docs improvement

* adding test and fixing doc typo

* fixing doc_len

* isort on test

* fixed test

* improve docstring a bit

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-18 17:06:25 +01:00
Julien Plu
3bc1540070
New TF loading weights (#8490)
* New TF loading weights

* apply style

* Better naming

* Largely comment the loading method

* Apply style

* Address Patrick's comments

* Remove useless line of code

* Update Docstring

* Address Sylvain's and Lysandre's comments

* Simplify the names computation

* Typos
2020-11-18 10:48:31 -05:00
Ratthachat (Jung)
0df91ee4f7
self.self.activation_dropout -> self.activation_dropout (#8611)
(one line typo)
2020-11-18 10:30:29 -05:00
Stas Bekman
cdf1b7ae82
fix to adjust for #8530 changes (#8612) 2020-11-18 10:25:00 -05:00
Stas Bekman
2819da02f7
[s2s] broken test (#8613) 2020-11-18 10:15:53 -05:00
Michał Pogoda
9fa3ed1a7f
Fix missing space in multiline warning (#8593)
Multiline string informing about missing PyTorch/TensorFlow had missing space.
2020-11-18 10:09:26 -05:00
Sylvain Gugger
8fcb6935a1
Fix DataCollatorForLanguageModeling (#8621) 2020-11-18 10:02:50 -05:00
Benjamin Minixhofer
f6fe41c96b
Reset loss to zero on logging in Trainer to avoid bfloat16 issues (#8561)
* make tr_loss regular float

* Revert "make tr_loss regular float"

This reverts commit c9d7ccfaf0.

* reset loss at each logging step

* keep track of total loss with _total_loss_scalar

* add remaining tr_loss at the end
2020-11-18 09:58:08 -05:00
cronoik
b592728eff
Fixed link to the wrong paper. (#8607) 2020-11-17 19:00:44 -05:00
Sylvain Gugger
0512444ee5 Remove old doc 2020-11-17 17:34:25 -05:00
Caitlin Ostroff
5cf9c79665
Add Harry Potter Model Card (#8605)
* Add Harry Potter Model

* Update model_cards/ceostroff/harry-potter-gpt2-fanfiction/README.md

* Update model_cards/ceostroff/harry-potter-gpt2-fanfiction/README.md

* Update model_cards/ceostroff/harry-potter-gpt2-fanfiction/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-17 16:50:58 -05:00
Sylvain Gugger
dd52804f5f
Remove deprecated (#8604)
* Remove old deprecated arguments

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-11-17 15:11:29 -05:00
Lysandre Debut
3095ee9dab
Tokenizers should be framework agnostic (#8599)
* Tokenizers should be framework agnostic

* Run the slow tests

* Not testing

* Fix documentation

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-17 14:03:03 -05:00
Sylvain Gugger
7f3b41a306
Fix check repo utils (#8600) 2020-11-17 14:01:46 -05:00
Stas Bekman
f0435f5a61
these should run fine on multi-gpu (#8582) 2020-11-17 14:00:41 -05:00
Sylvain Gugger
36a19915ea
Fix model templates (#8595)
* First fixes

* Fix imports and add init

* Fix typo

* Move init to final dest

* Fix tokenization import

* More fixes

* Styling
2020-11-17 10:35:38 -05:00
Julien Chaumond
042a6aa777
Tokenizers: ability to load from model subfolder (#8586)
* <small>tiny typo</small>

* Tokenizers: ability to load from model subfolder

* use subfolder for local files as well

* Uniformize model shortcut name => model id

* from s3 => from huggingface.co

Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
2020-11-17 08:58:45 -05:00
Sylvain Gugger
48395d6b8e
Fix init for MT5 (#8591) 2020-11-17 08:52:13 -05:00
sgugger
a6cf9ca00b Add __init__ to the models folder 2020-11-17 07:39:37 -05:00
Patrick von Platen
5104223552
[MT5] More docs (#8589)
* add docs

* make style
2020-11-17 12:47:57 +01:00
Patrick von Platen
86822a358b
T5 & mT5 (#8552)
* add mt5 and t5v1_1 model

* fix tests

* correct some imports

* add tf model

* finish tf t5

* improve examples

* fix copies

* clean doc
2020-11-17 12:23:09 +01:00
fajri91
9e01f988dd
model_card for indolem/indobert-base-uncased (#8579) 2020-11-17 03:36:50 -05:00
Sylvain Gugger
c89bdfbe72
Reorganize repo (#8580)
* Put models in subfolders

* Styling

* Fix imports in tests

* More fixes in test imports

* Sneaky hidden imports

* Fix imports in doc files

* More sneaky imports

* Finish fixing tests

* Fix examples

* Fix path for copies

* More fixes for examples

* Fix dummy files

* More fixes for example

* More model import fixes

* Is this why you're unhappy GitHub?

* Fix imports in convert command
2020-11-16 21:43:42 -05:00
Julien Plu
901507335f
Fix mixed precision issue for GPT2 (#8572)
* Fix mixed precision issue for GPT2

* Forgot one cast

* oops

* Forgotten casts
2020-11-16 14:44:19 -05:00
Sylvain Gugger
1073a2bde5
Switch return_dict to True by default. (#8530)
* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests
2020-11-16 11:43:00 -05:00
Sylvain Gugger
0d0a0785fd
Update version to v4.0.0-dev (#8568) 2020-11-16 10:21:19 -05:00
LSinev
afb50c663a
Fix GPT2DoubleHeadsModel to work with model.generate() (#6601)
* Fix passing token_type_ids during GPT2DoubleHeadsModel.generate() if used

and for GPT2LMHeadModel too

* Update tests to check token_type_ids usage in GPT2 models
2020-11-16 14:35:44 +01:00
Yusuke Mori
04d8136bde
Adding the prepare_seq2seq_batch function to ProphetNet (#8515)
* Simply insert T5Tokenizer's prepare_seq2seq_batch

* Update/Add some 'import'

* fix RuntimeError caused by '.view'

* Moves .view related error avoidance from seq2seq_trainer to inside prophetnet

* Update test_tokenization_prophetnet.py

* Format the test code with black

* Re-format the test code

* Update test_tokenization_prophetnet.py

* Add importing require_torch in the test code

* Add importing BatchEncoding in the test code

* Re-format the test code on Colab
2020-11-16 14:18:25 +01:00
Stas Bekman
931b10978e
[doc] typo fix (#8535)
* [doc] typo fix

@sgugger

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-16 08:05:30 -05:00
Branden Chan
6db21a06ae
Clearer Model Versioning Example (#8562) 2020-11-16 06:59:10 -05:00
Mehrdad Farahani
daaa68451e
Readme for Wiki Summary [Persian] bert2bert (#8558) 2020-11-16 05:04:46 -05:00
Mehrdad Farahani
06d468d3f0
Readme for News Headline Generation (bert2bert) (#8557) 2020-11-16 05:04:38 -05:00
zhezhaoa
9b7fb8a368
Create README.md for Chinese RoBERTa Miniatures (#8550)
* Create README.md

* Update model_cards/uer/chinese_roberta_L-2_H-128/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-16 05:01:28 -05:00
Thomas Wolf
f4e04cd2c6
[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073)
* Fixing roberta for slow-fast tests

* WIP getting equivalence on pipelines

* slow-to-fast equivalence - working on question-answering pipeline

* optional FAISS tests

* Pipeline Q&A

* Move pipeline tests to their own test job again

* update tokenizer to add sequence id methods

* update to tokenizers 0.9.4

* set sentencepiece as optional

* clean up squad

* clean up pipelines to use sequence_ids

* style/quality

* wording

* Switch to use_fast = True by default

* update tests for use_fast at True by default

* fix rag tokenizer test

* removing protobuf from required dependencies

* fix NER test for use_fast = True by default

* fixing example tests (Q&A examples use slow tokenizers for now)

* protobuf in main deps extras["sentencepiece"] and example deps

* fix protobuf install test

* try to fix seq2seq by switching to slow tokenizers for now

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-15 22:50:59 +01:00
Julien Plu
24184e73c4
Rework some TF tests (#8492)
* Update some tests

* Small update

* Apply style

* Use max_position_embeddings

* Create a fake attribute

* Create a fake attribute

* Update wrong name

* Wrong TransfoXL model file

* Keep the common tests agnostic
2020-11-13 17:07:17 -05:00
Patrick von Platen
f6cdafdec7
fix load weights (#8528)
* fix load weights

* delete line
2020-11-13 20:31:40 +01:00
Joe Davison
f6f4da8dd4
Add bart-large-mnli model card (#8527) 2020-11-13 14:07:25 -05:00
Julien Chaumond
725269746b
Model sharing doc: more tweaks (#8520)
* More doc tweaks

* Update model_sharing.rst

* make style

* missing newline

* Add email tip

Co-authored-by: Pierric Cistac <pierric@huggingface.co>
2020-11-13 12:10:26 -05:00
LysandreJik
9d519dabb7 Fix paths in github YAML 2020-11-13 12:04:17 -05:00
Lysandre Debut
826f04576f
Model templates encoder only (#8509)
* Model templates

* TensorFlow

* Remove pooler

* CI

* Tokenizer + Refactoring

* Encoder-Decoder

* Let's go testing

* Encoder-Decoder in TF

* Let's go testing in TF

* Documentation

* README

* Fixes

* Better names

* Style

* Update docs

* Choose to skip either TF or PT

* Code quality fixes

* Add to testing suite

* Update file path

* Cookiecutter path

* Update `transformers` path

* Handle rebasing

* Remove seq2seq from model templates

* Remove s2s config

* Apply Sylvain and Patrick comments

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Last fixes from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-13 11:59:30 -05:00
Patrick von Platen
42e2d02e44
[T5] Bug correction & Refactor (#8518)
* fix bug

* T5 refactor

* refactor tf

* apply sylvains suggestions
2020-11-13 16:57:31 +01:00
Sylvain Gugger
42f63e3871 Merge remote-tracking branch 'origin/master' 2020-11-13 10:30:04 -05:00
Sylvain Gugger
bb03a14edd Update doc for v3.5.1 2020-11-13 10:29:58 -05:00
Branden Chan
4df6b59318
Update deepset/roberta-base-squad2 model card (#8522)
* Update README.md

* Update README.md
2020-11-13 09:58:27 -05:00
Sylvain Gugger
0c9bae0934
Remove typo 2020-11-12 22:39:57 -05:00
Julien Plu
5d80539488
Add pretraining loss computation for TF Bert pretraining (#8470)
* Add pretraining loss computation for TF Bert pretraining

* Fix labels creation

* Fix T5 model

* restore T5 kwargs

* try a generic fix for pretraining models

* Apply style

* Override the prepare method for the BERT tests
2020-11-12 14:08:26 -05:00
Julien Plu
91a67b7506
Use LF instead of os.linesep (#8491) 2020-11-12 13:52:40 -05:00
Julien Plu
27b3ff316a
Try to understand and apply Sylvain's comments (#8458) 2020-11-12 13:43:00 -05:00
Forrest Iandola
0fa0349883
fix SqueezeBertForMaskedLM (#8479) 2020-11-12 12:19:37 -05:00
Sylvain Gugger
7933054638
Model sharing doc (#8498)
* Model sharing doc

* Style
2020-11-12 11:53:23 -05:00