Patrick von Platen
fa49b9afea
Clean Encoder-Decoder models with Bart/T5-like API and add generate() support ( #3383 )
...
* change encoder decoder style to bart & t5 style
* make dummy encoder-decoder generation work for BERT
* make style
* clean init config in encoder decoder
* add tests for encoder decoder models
* refactor and add last tests
* fix attn masks for bert encoder decoder
* make style
* refactor prepare inputs for Bert
* refactor
* finish encoder decoder
* correct typo
* add docstring to config
* finish
* add tests
* better naming
* make style
* fix flake8
* clean docstring
* make style
* rename
2020-04-28 15:11:09 +02:00
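To illustrate the Bart/T5-style API this PR introduces, here is a minimal sketch: it pairs a pretrained BERT encoder with a BERT decoder and calls generate(). It assumes the public bert-base-uncased checkpoint and a recent transformers release; the cross-attention weights are freshly initialized, so the output is only illustrative.

```python
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Pair a pretrained BERT encoder with a pretrained BERT decoder;
# the decoder's cross-attention weights start out randomly initialized.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

input_ids = tokenizer.encode(
    "The quick brown fox jumps over the lazy dog.", return_tensors="pt"
)
# generate() works on the encoder-decoder wrapper just like on Bart/T5.
generated = model.generate(
    input_ids, decoder_start_token_id=tokenizer.cls_token_id, max_length=20
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```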
Patrick von Platen
52679fbc2e
Add DialoGPT training tips ( #3996 )
2020-04-28 14:32:31 +02:00
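The training tips added here boil down to formatting dialogue data as one long causal-LM string, with each turn terminated by the EOS token. A hedged sketch of that preprocessing, assuming the public microsoft/DialoGPT-small checkpoint:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")

turns = ["Does money buy happiness?", "Depends how much money you spend on it."]
# Concatenate all turns into a single training example, each followed by EOS;
# the result is then trained like any GPT-2-style language model (labels = input_ids).
training_text = "".join(turn + tokenizer.eos_token for turn in turns)
input_ids = tokenizer.encode(training_text, return_tensors="pt")
```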
Lorenzo Ampil
12bb7fe770
Fix T5 doc typos ( #3978 )
...
* Fix typo in intro and add line under
* Add missing blank line under
* Correct typos under
2020-04-27 18:27:15 +02:00
Thomas Wolf
827d6d6ef0
Cleanup fast tokenizers integration ( #3706 )
...
* First pass on utility classes and python tokenizers
* finishing cleanup pass
* style and quality
* Fix tests
* Updating following @mfuntowicz's comment
* style and quality
* Fix Roberta
* fix batch_size/seq_length in BatchEncoding
* add alignment methods + tests
* Fix OpenAI and Transfo-XL tokenizers
* adding trim_offsets=True default for GPT2 and RoBERTa
* style and quality
* fix tests
* add_prefix_space in roberta
* bump up tokenizers to rc7
* style
* unfortunately TensorFlow does not like these - removing shape/seq_len for now
* Update src/transformers/tokenization_utils.py
Co-Authored-By: Stefan Schweter <stefan@schweter.it>
* Adding doc and docstrings
* making flake8 happy
Co-authored-by: Stefan Schweter <stefan@schweter.it>
2020-04-18 13:43:57 +02:00
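A minimal sketch of the BatchEncoding alignment helpers this PR adds, assuming a Rust-backed fast tokenizer and a recent transformers release (the callable tokenizer syntax may postdate this commit):

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
encoding = tokenizer("Hugging Face is based in NYC", return_offsets_mapping=True)

print(encoding.tokens())            # wordpiece tokens, including [CLS]/[SEP]
print(encoding["offset_mapping"])   # (char_start, char_end) span of each token
print(encoding.char_to_token(0))    # index of the token covering character 0
```

The offset mapping is what the trim_offsets and add_prefix_space options mentioned above relate to for the GPT-2/RoBERTa byte-level tokenizers.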
Patrick von Platen
d22894dfd4
[Docs] Add DialoGPT ( #3755 )
...
* add dialoGPT
* update README.md
* fix conflict
* update readme
* add code links to docs
* Update README.md
* Update dialo_gpt2.rst
* Update pretrained_models.rst
* Update docs/source/model_doc/dialo_gpt2.rst
Co-Authored-By: Julien Chaumond <chaumond@gmail.com>
* change filename of dialogpt
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-04-16 09:04:32 +02:00
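For context, a hedged sketch of the interactive usage pattern the DialoGPT docs describe: each user turn is appended with the EOS token and the growing history is fed back into generate(). It assumes the public microsoft/DialoGPT-small checkpoint and the AutoModelForCausalLM class from a recent release.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")

chat_history_ids = None
for user_input in ["Hello, how are you?", "Any plans for the weekend?"]:
    new_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")
    # Append the new user turn to the running conversation history.
    bot_input_ids = new_ids if chat_history_ids is None else torch.cat(
        [chat_history_ids, new_ids], dim=-1
    )
    chat_history_ids = model.generate(
        bot_input_ids, max_length=200, pad_token_id=tokenizer.eos_token_id
    )
    # Decode only the newly generated tokens (the bot's reply).
    reply = tokenizer.decode(
        chat_history_ids[0, bot_input_ids.shape[-1]:], skip_special_tokens=True
    )
    print("Bot:", reply)
```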
Lysandre Debut
d5d7d88612
ELECTRA ( #3257 )
...
* Electra wip
* helpers
* Electra wip
* Electra v1
* ELECTRA may be saved/loaded
* Generator & Discriminator
* Embedding size instead of halving the hidden size
* ELECTRA Tokenizer
* Revert BERT helpers
* ELECTRA Conversion script
* Archive maps
* PyTorch tests
* Start fixing tests
* Tests pass
* Same configuration for both models
* Compatible with base + large
* Simplification + weight tying
* Archives
* Auto + Renaming to standard names
* ELECTRA is uncased
* Tests
* Slight API changes
* Update tests
* wip
* ElectraForTokenClassification
* temp
* Simpler arch + tests
Removed ElectraForPreTraining which will be in a script
* Conversion script
* Auto model
* Update links to S3
* Split ElectraForPreTraining and ElectraForTokenClassification
* Actually test PreTraining model
* Remove num_labels from configuration
* wip
* wip
* From discriminator and generator to electra
* Slight API changes
* Better naming
* TensorFlow ELECTRA tests
* Accurate conversion script
* Added to conversion script
* Fast ELECTRA tokenizer
* Style
* Add ELECTRA to README
* Modeling Pytorch Doc + Real style
* TF Docs
* Docs
* Correct links
* Correct model initialization
* random fixes
* style
* Addressing Patrick's and Sam's comments
* Correct links in docs
2020-04-03 14:10:54 -04:00
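A minimal sketch of the discriminator added in this PR: ElectraForPreTraining scores each token as original or replaced. It assumes the public google/electra-small-discriminator weights and a recent transformers release (model outputs accessed via .logits).

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

tokenizer = ElectraTokenizerFast.from_pretrained("google/electra-small-discriminator")
model = ElectraForPreTraining.from_pretrained("google/electra-small-discriminator")

# "fake" stands in for a token a generator might have substituted.
inputs = tokenizer("The quick brown fox fake over the lazy dog", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits      # one replaced/original score per token
predictions = (logits > 0).long().squeeze().tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
print(list(zip(tokens, predictions)))    # 1 marks a token flagged as replaced
```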
Patrick von Platen
5b44e0a31b
[T5] Add training documentation ( #3507 )
...
* Add clear description of how to train T5
* correct docstring in T5
* correct typo
* correct docstring format
* update t5 model docs
* implement Colin's feedback
* fix typo and add more explanation for sentinel tokens
* delete unnecessary todos
2020-03-30 13:35:53 +02:00
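The training docs boil down to two recipes, sketched below under the assumption of the public t5-small checkpoint and the current labels keyword (older releases used lm_labels): supervised seq2seq training with a task prefix, and unsupervised denoising where masked spans are replaced by sentinel tokens.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Supervised: the model computes the cross-entropy loss from input_ids + labels.
input_ids = tokenizer.encode(
    "translate English to German: The house is wonderful.", return_tensors="pt"
)
labels = tokenizer.encode("Das Haus ist wunderbar.", return_tensors="pt")
loss = model(input_ids=input_ids, labels=labels).loss

# Unsupervised denoising: each masked span becomes a sentinel token in the input
# and is reproduced after the matching sentinel in the target.
input_ids = tokenizer.encode("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt")
labels = tokenizer.encode("<extra_id_0> cute dog <extra_id_1> the <extra_id_2>", return_tensors="pt")
loss = model(input_ids=input_ids, labels=labels).loss
```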
Patrick von Platen
fa9af2468a
Add T5 to docs ( #3461 )
...
* add t5 docs basis
* improve docs
* add t5 docs
* improve t5 docstring
* add t5 tokenizer docstring
* finish docstring
* make style
* add pretrained models
* correct typo
* make examples work
* finalize docs
2020-03-27 10:57:16 -04:00
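In the same spirit as the doc examples, a short inference sketch: prepend a task prefix and call generate() with beam search. The t5-small checkpoint and the beam-search settings are assumptions, not prescriptions.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer.encode(
    "summarize: studies have shown that owning a dog is good for you",
    return_tensors="pt",
)
output_ids = model.generate(input_ids, max_length=40, num_beams=4, early_stopping=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```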
Sam Shleifer
857e0a0d3b
Rename BartForMaskedLM -> BartForConditionalGeneration ( #3114 )
...
* improved documentation
2020-03-05 17:41:18 -05:00
Sam Shleifer
b54ef78d0c
Bart-CNN ( #3059 )
...
`generate` code that produces summaries 99% identical to fairseq's on the CNN test data, with caching.
2020-03-02 10:35:53 -05:00
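A hedged sketch of the summarization path this commit enables, using the class name introduced by the rename above and the CNN/DailyMail checkpoint as it is currently published on the hub (facebook/bart-large-cnn):

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

article = "New York (CNN) The city announced on Tuesday that ..."
input_ids = tokenizer.encode(article, return_tensors="pt", max_length=1024, truncation=True)
# Beam search with caching of past keys/values, as in the fairseq reference.
summary_ids = model.generate(input_ids, num_beams=4, max_length=80, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```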
Lysandre Debut
bb7c468520
Documentation ( #2989 )
...
* All Tokenizers
BertTokenizer + few fixes
RobertaTokenizer
OpenAIGPTTokenizer + Fixes
GPT2Tokenizer + fixes
TransfoXLTokenizer
Correct rst for TransformerXL
XLMTokenizer + fixes
XLNet Tokenizer + Style
DistilBERT + Fix XLNet RST
CTRLTokenizer
CamemBERT Tokenizer
FlaubertTokenizer
XLMRobertaTokenizer
cleanup
* cleanup
2020-02-25 18:43:36 -05:00
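Since this entry touches the docs of a dozen tokenizers at once, a small hedged sketch of the pattern they share: AutoTokenizer resolves the concrete tokenizer class from the checkpoint name. The checkpoints below are the standard public ones and purely illustrative.

```python
from transformers import AutoTokenizer

for checkpoint in ["bert-base-uncased", "roberta-base", "xlnet-base-cased"]:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # Each checkpoint maps to its own tokenizer class and tokenization scheme.
    print(checkpoint, type(tokenizer).__name__, tokenizer.tokenize("Hello world!"))
```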
Sam Shleifer
53ce3854a1
New BartModel ( #2745 )
...
* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs
2020-02-20 18:11:13 -05:00
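A minimal sketch of a bare BartModel forward pass, assuming the public facebook/bart-large weights and a recent release: the base model returns hidden states rather than logits, and derives decoder inputs from the encoder inputs when none are passed.

```python
import torch
from transformers import BartModel, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartModel.from_pretrained("facebook/bart-large")

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)   # (batch_size, sequence_length, hidden_size)
```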
Lysandre
dd28830327
Update RoBERTa tips
2020-02-07 16:42:35 -05:00
Lysandre
db97930122
Update XLM-R tips
2020-02-07 16:42:35 -05:00
Lysandre
73306d028b
FlauBERT documentation
2020-01-30 10:04:18 -05:00
Lysandre
c69b082601
Update documentation
2020-01-29 12:06:13 -05:00
Lysandre
44a5b4bbe7
Update documentation
2020-01-29 11:47:49 -05:00
thomwolf
e0849a66ac
adding in the doc
2020-01-27 14:27:07 -05:00
Lysandre
983fef469c
AutoModels doc
2020-01-24 16:37:30 -05:00
Lysandre
24d5ad1dcc
Run the examples in slow
2020-01-23 09:38:45 -05:00
Lysandre
9ddf60b694
Tips + whitespaces
2020-01-23 09:38:45 -05:00
Lysandre
0e9899f451
Fixes
2020-01-23 09:38:45 -05:00
Lysandre
7511f3dd89
PyTorch CTRL + Style
2020-01-23 09:38:45 -05:00
Lysandre
980211a63a
XLM-RoBERTa
2020-01-23 09:38:45 -05:00
Lysandre
db1a7f27a1
PyTorch DistilBERT
2020-01-23 09:38:45 -05:00
Lysandre
b28020f590
TF RoBERTa
2020-01-23 09:38:45 -05:00
Lysandre
3e1bc27e1b
Pytorch RoBERTa
2020-01-23 09:38:45 -05:00
Lysandre
f44ff574d3
Camembert
2020-01-23 09:38:45 -05:00
Lysandre
ccebcae75f
PyTorch XLM
2020-01-23 09:38:45 -05:00
Lysandre
cd656fb21a
PyTorch XLNet
2020-01-23 09:38:45 -05:00
Lysandre
98edad418e
PyTorch Transformer-XL
2020-01-23 09:38:45 -05:00
Lysandre
850795c487
Pytorch GPT
2020-01-23 09:38:45 -05:00
Lysandre
1487b840d3
TF GPT2
2020-01-23 09:38:45 -05:00
Lysandre
bd0d3fd76e
GPT-2 PyTorch models + better tips for BERT
2020-01-23 09:38:45 -05:00
Lysandre
cd77c750c5
BERT PyTorch models
2020-01-23 09:38:45 -05:00
Lysandre
3922a2497e
TF ALBERT + TF Utilities + Fix warnings
2020-01-23 09:38:45 -05:00
Lysandre
00df3d4de0
ALBERT Modeling + required changes to utilities
2020-01-23 09:38:45 -05:00
Lysandre
387217bd3e
Added example usage
2020-01-14 14:09:09 +01:00
Lysandre
7d1bb7f256
Add missing XLNet and XLM models
2020-01-14 14:09:09 +01:00
Lysandre Debut
632682726f
Updated Configurations
2020-01-14 14:09:09 +01:00
alberduris
81d6841b4b
GPU text generation: moved the encoded_prompt to the correct device
2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b
Moved the encoded_prompts to the correct device
2020-01-06 15:11:12 +01:00
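The fix in these two commits is just a device move: the encoded prompt has to live on the same device as the model before generate() is called. A minimal sketch, with gpt2 chosen purely for illustration:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)

encoded_prompt = tokenizer.encode("Once upon a time", return_tensors="pt")
encoded_prompt = encoded_prompt.to(device)   # the line the fix adds
output = model.generate(encoded_prompt, max_length=30)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```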
Lysandre
361620954a
Remove TFBertForPreTraining from ALBERT doc
2019-11-27 10:11:37 -05:00
Lysandre
ee4647bd5c
CamemBERT & ALBERT doc
2019-11-26 15:10:51 -05:00
Julien Chaumond
93d2fff071
Close #1654
2019-11-01 09:47:38 -04:00
LysandreJik
89f86f9661
CTRL added to the documentation
2019-10-09 12:04:06 -04:00
thomwolf
6c3b131516
typo in readme/doc
2019-09-26 16:23:28 +02:00
LysandreJik
7e957237e4
[Doc] XLM + Torch in documentation
2019-09-26 10:08:56 -04:00
LysandreJik
927904bc91
[doc] pytorch_transformers -> transformers
2019-09-26 08:47:15 -04:00
LysandreJik
8349d75773
Various small doc fixes
2019-09-26 07:45:40 -04:00