Commit Graph

70 Commits

Author SHA1 Message Date
Patrick von Platen
fa49b9afea
Clean Encoder-Decoder models with Bart/T5-like API and add generate possibility (#3383)
* change encoder decoder style to bart & t5 style

* make encoder decoder generation dummy work for bert

* make style

* clean init config in encoder decoder

* add tests for encoder decoder models

* refactor and add last tests

* refactor and add last tests

* fix attn masks for bert encoder decoder

* make style

* refactor prepare inputs for Bert

* refactor

* finish encoder decoder

* correct typo

* add docstring to config

* finish

* add tests

* better naming

* make style

* fix flake8

* clean docstring

* make style

* rename
2020-04-28 15:11:09 +02:00
Patrick von Platen
52679fbc2e
add dialogpt training tips (#3996) 2020-04-28 14:32:31 +02:00
Lorenzo Ampil
12bb7fe770
Fix t5 doc typos (#3978)
* Fix tpo in into and add line under

* Add missing blank line under

* Correct types under
2020-04-27 18:27:15 +02:00
Thomas Wolf
827d6d6ef0
Cleanup fast tokenizers integration (#3706)
* First pass on utility classes and python tokenizers

* finishing cleanup pass

* style and quality

* Fix tests

* Updating following @mfuntowicz comment

* style and quality

* Fix Roberta

* fix batch_size/seq_length inBatchEncoding

* add alignement methods + tests

* Fix OpenAI and Transfo-XL tokenizers

* adding trim_offsets=True default for GPT2 et RoBERTa

* style and quality

* fix tests

* add_prefix_space in roberta

* bump up tokenizers to rc7

* style

* unfortunately tensorfow does like these - removing shape/seq_len for now

* Update src/transformers/tokenization_utils.py

Co-Authored-By: Stefan Schweter <stefan@schweter.it>

* Adding doc and docstrings

* making flake8 happy

Co-authored-by: Stefan Schweter <stefan@schweter.it>
2020-04-18 13:43:57 +02:00
Patrick von Platen
d22894dfd4
[Docs] Add DialoGPT (#3755)
* add dialoGPT

* update README.md

* fix conflict

* update readme

* add code links to docs

* Update README.md

* Update dialo_gpt2.rst

* Update pretrained_models.rst

* Update docs/source/model_doc/dialo_gpt2.rst

Co-Authored-By: Julien Chaumond <chaumond@gmail.com>

* change filename of dialogpt

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-04-16 09:04:32 +02:00
Lysandre Debut
d5d7d88612
ELECTRA (#3257)
* Electra wip

* helpers

* Electra wip

* Electra v1

* ELECTRA may be saved/loaded

* Generator & Discriminator

* Embedding size instead of halving the hidden size

* ELECTRA Tokenizer

* Revert BERT helpers

* ELECTRA Conversion script

* Archive maps

* PyTorch tests

* Start fixing tests

* Tests pass

* Same configuration for both models

* Compatible with base + large

* Simplification + weight tying

* Archives

* Auto + Renaming to standard names

* ELECTRA is uncased

* Tests

* Slight API changes

* Update tests

* wip

* ElectraForTokenClassification

* temp

* Simpler arch + tests

Removed ElectraForPreTraining which will be in a script

* Conversion script

* Auto model

* Update links to S3

* Split ElectraForPreTraining and ElectraForTokenClassification

* Actually test PreTraining model

* Remove num_labels from configuration

* wip

* wip

* From discriminator and generator to electra

* Slight API changes

* Better naming

* TensorFlow ELECTRA tests

* Accurate conversion script

* Added to conversion script

* Fast ELECTRA tokenizer

* Style

* Add ELECTRA to README

* Modeling Pytorch Doc + Real style

* TF Docs

* Docs

* Correct links

* Correct model intialized

* random fixes

* style

* Addressing Patrick's and Sam's comments

* Correct links in docs
2020-04-03 14:10:54 -04:00
Patrick von Platen
5b44e0a31b
[T5] Add training documenation (#3507)
* Add clear description of how to train T5

* correct docstring in T5

* correct typo

* correct docstring format

* update t5 model docs

* implement collins feedback

* fix typo and add more explanation for sentinal tokens

* delete unnecessary todos
2020-03-30 13:35:53 +02:00
Patrick von Platen
fa9af2468a
Add T5 to docs (#3461)
* add t5 docs basis

* improve docs

* add t5 docs

* improve t5 docstring

* add t5 tokenizer docstring

* finish docstring

* make style

* add pretrained models

* correct typo

* make examples work

* finalize docs
2020-03-27 10:57:16 -04:00
Sam Shleifer
857e0a0d3b
Rename BartForMaskedLM -> BartForConditionalGeneration (#3114)
* improved documentation
2020-03-05 17:41:18 -05:00
Sam Shleifer
b54ef78d0c
Bart-CNN (#3059)
`generate` code that produces 99% identical summarizations to fairseq on CNN test data, with caching.
2020-03-02 10:35:53 -05:00
Lysandre Debut
bb7c468520
Documentation (#2989)
* All Tokenizers

BertTokenizer + few fixes
RobertaTokenizer
OpenAIGPTTokenizer + Fixes
GPT2Tokenizer + fixes
TransfoXLTokenizer
Correct rst for TransformerXL
XLMTokenizer + fixes
XLNet Tokenizer + Style
DistilBERT + Fix XLNet RST
CTRLTokenizer
CamemBERT Tokenizer
FlaubertTokenizer
XLMRobertaTokenizer
cleanup

* cleanup
2020-02-25 18:43:36 -05:00
Sam Shleifer
53ce3854a1
New BartModel (#2745)
* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs
2020-02-20 18:11:13 -05:00
Lysandre
dd28830327 Update RoBERTa tips 2020-02-07 16:42:35 -05:00
Lysandre
db97930122 Update XLM-R tips 2020-02-07 16:42:35 -05:00
Lysandre
73306d028b FlauBERT documentation 2020-01-30 10:04:18 -05:00
Lysandre
c69b082601 Update documentation 2020-01-29 12:06:13 -05:00
Lysandre
44a5b4bbe7 Update documentation 2020-01-29 11:47:49 -05:00
thomwolf
e0849a66ac adding in the doc 2020-01-27 14:27:07 -05:00
Lysandre
983fef469c AutoModels doc 2020-01-24 16:37:30 -05:00
Lysandre
24d5ad1dcc Run the examples in slow 2020-01-23 09:38:45 -05:00
Lysandre
9ddf60b694 Tips + whitespaces 2020-01-23 09:38:45 -05:00
Lysandre
0e9899f451 Fixes 2020-01-23 09:38:45 -05:00
Lysandre
7511f3dd89 PyTorch CTRL + Style 2020-01-23 09:38:45 -05:00
Lysandre
980211a63a XLM-RoBERTa 2020-01-23 09:38:45 -05:00
Lysandre
db1a7f27a1 PyTorch DistilBERT 2020-01-23 09:38:45 -05:00
Lysandre
b28020f590 TF RoBERTa 2020-01-23 09:38:45 -05:00
Lysandre
3e1bc27e1b Pytorch RoBERTa 2020-01-23 09:38:45 -05:00
Lysandre
f44ff574d3 Camembert 2020-01-23 09:38:45 -05:00
Lysandre
ccebcae75f PyTorch XLM 2020-01-23 09:38:45 -05:00
Lysandre
cd656fb21a PyTorch XLNet 2020-01-23 09:38:45 -05:00
Lysandre
98edad418e PyTorch Transformer-XL 2020-01-23 09:38:45 -05:00
Lysandre
850795c487 Pytorch GPT 2020-01-23 09:38:45 -05:00
Lysandre
1487b840d3 TF GPT2 2020-01-23 09:38:45 -05:00
Lysandre
bd0d3fd76e GPT-2 PyTorch models + better tips for BERT 2020-01-23 09:38:45 -05:00
Lysandre
cd77c750c5 BERT PyTorch models 2020-01-23 09:38:45 -05:00
Lysandre
3922a2497e TF ALBERT + TF Utilities + Fix warnings 2020-01-23 09:38:45 -05:00
Lysandre
00df3d4de0 ALBERT Modeling + required changes to utilities 2020-01-23 09:38:45 -05:00
Lysandre
387217bd3e Added example usage 2020-01-14 14:09:09 +01:00
Lysandre
7d1bb7f256 Add missing XLNet and XLM models 2020-01-14 14:09:09 +01:00
Lysandre Debut
632682726f Updated Configurations 2020-01-14 14:09:09 +01:00
alberduris
81d6841b4b GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
Lysandre
361620954a Remove TFBertForPreTraining from ALBERT doc 2019-11-27 10:11:37 -05:00
Lysandre
ee4647bd5c CamemBERT & ALBERT doc 2019-11-26 15:10:51 -05:00
Julien Chaumond
93d2fff071 Close #1654 2019-11-01 09:47:38 -04:00
LysandreJik
89f86f9661 CTRL added to the documentation 2019-10-09 12:04:06 -04:00
thomwolf
6c3b131516 typo in readme/doc 2019-09-26 16:23:28 +02:00
LysandreJik
7e957237e4 [Doc] XLM + Torch in documentation 2019-09-26 10:08:56 -04:00
LysandreJik
927904bc91 [doc] pytorch_transformers -> transformers 2019-09-26 08:47:15 -04:00
LysandreJik
8349d75773 Various small doc fixes 2019-09-26 07:45:40 -04:00