Antonio V Mendoza
ea2c6f1afc
Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models ( #5793 )
...
* added template files for LXMERT and completed the configuration_lxmert.py
* added modeling, tokenization, testing, and finishing touches for lxmert [yet to be tested]
* added model card for lxmert
* cleaning up lxmert code
* Update src/transformers/modeling_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/modeling_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* tested torch lxmert, changed documentation, updated outputs, and other small fixes
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* renaming, other small issues, did not change TF code in this commit
* added lxmert question answering model in pytorch
* added capability to edit number of qa labels for lxmert
* made answer optional for lxmert question answering
* add option to return hidden_states for lxmert
* changed default qa labels for lxmert
* changed config archive path
* squashing 3 commits: merged UI + testing improvements + more UI and testing
* changed some variable names for lxmert
* TF LXMERT
* Various fixes to LXMERT
* Final touches to LXMERT
* AutoTokenizer order
* Add LXMERT to index.rst and README.md
* Merge commit test fixes + Style update
* TensorFlow 2.3.0 sequential model changes variable names
Remove inherited test
* Update src/transformers/modeling_tf_pytorch_utils.py
* Update docs/source/model_doc/lxmert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/model_doc/lxmert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* added suggestions
* Fixes
* Final fixes for TF model
* Fix docs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-03 04:02:25 -04:00
Patrick von Platen
4d1a3ffde8
[EncoderDecoder] Add xlm-roberta to encoder decoder ( #6878 )
...
* finish xlm-roberta
* finish docs
* expose XLMRobertaForCausalLM
2020-09-01 21:56:39 +02:00
Patrick von Platen
afc4ece462
[Generate] Facilitate PyTorch generate using ModelOutputs ( #6735 )
...
* fix generate for GPT2 Double Head
* fix gpt2 double head model
* fix bart / t5
* also add for no beam search
* fix no beam search
* fix encoder decoder
* simplify t5
* simplify t5
* fix t5 tests
* fix BART
* fix transfo-xl
* fix conflict
* integrating sylvains and sams comments
* fix tf past_decoder_key_values
* fix enc dec test
2020-09-01 12:38:25 +02:00
Sam Shleifer
0ebc9699fa
[fixdoc] Add import to pegasus usage doc ( #6698 )
2020-08-24 15:54:57 -04:00
Stas Bekman
912a21ec78
remove BartForConditionalGeneration.generate ( #6659 )
...
As suggested here: https://github.com/huggingface/transformers/issues/6651#issuecomment-678594233
this removes the generic `generate` doc, whose examples are not relevant to BART.
2020-08-25 00:42:34 +08:00
Suraj Patil
d0e42a7bed
CamembertForCausalLM ( #6577 )
...
* added CamembertForCausalLM
* add in __init__ and auto model
* style
* doc
2020-08-21 13:52:54 +02:00
Suraj Patil
fb6844aff5
[Pegasus Doc] minor typo ( #6579 )
...
Minor typo correction
@sshleifer
2020-08-18 12:47:47 -04:00
Sam Shleifer
12d7624199
[marian] converter supports models from new Tatoeba project ( #6342 )
2020-08-17 23:55:42 -04:00
Suraj Patil
c9564f5343
[Doc] add more MBart and other doc ( #6490 )
...
* add mbart example
* add Pegasus and MBart in readme
* typo
* add MBart in Pretrained models
* add pre-proc doc
* add DPR in readme
* fix indent
* doc fix
2020-08-17 12:30:26 -04:00
Patrick von Platen
36010cb1e2
fix pegasus doc ( #6533 )
2020-08-17 12:24:43 +02:00
Suraj Patil
680f1337c3
MBartForConditionalGeneration ( #6441 )
...
* add MBartForConditionalGeneration
* style
* rebase and fixes
* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS
* fix docs
* don't ignore mbart
* doc
* fix mbart fairseq link
* put mbart before bart
* apply doc suggestions
2020-08-14 03:21:16 -04:00
Patrick von Platen
0735def8e1
[EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer ( #6411 )
...
* add encoder-decoder for roberta
* fix headmask
* apply Sylvains suggestions
* fix typo
* Apply suggestions from code review
2020-08-12 18:23:30 +02:00
Sylvain Gugger
a8db954cda
Activate check on the CI ( #6427 )
...
* Activate check on the CI
* Fix repo inconsistencies
* Don't document too much
2020-08-12 08:42:14 -04:00
Sylvain Gugger
34fabe1697
Move prediction_loss_only to TrainingArguments ( #6426 )
2020-08-12 08:03:45 -04:00
Sam Shleifer
be1520d3a3
rename prepare_translation_batch -> prepare_seq2seq_batch ( #6103 )
2020-08-11 15:57:07 -04:00
Sam Shleifer
66fa8ceaea
PegasusForConditionalGeneration (torch version) ( #6340 )
...
Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>
2020-08-11 14:31:23 -04:00
Patrick von Platen
00bb0b25ed
TF Longformer ( #5764 )
...
* improve names and tests longformer
* more and better tests for longformer
* add first tf test
* finalize tf basic op functions
* fix merge
* tf shape test passes
* narrow down discrepancies
* make longformer local attn tf work
* correct tf longformer
* add first global attn function
* add more global longformer func
* advance tf longformer
* finish global attn
* upload big model
* finish all tests
* correct false any statement
* fix common tests
* make all tests pass except keras save load
* fix some tests
* fix torch test import
* finish tests
* fix test
* fix torch tf tests
* add docs
* finish docs
* Update src/transformers/modeling_longformer.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/modeling_tf_longformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply Lysandres suggestions
* reverse to assert statement because function will fail otherwise
* applying sylvains recommendations
* Update src/transformers/modeling_longformer.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update src/transformers/modeling_tf_longformer.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-08-10 23:25:06 +02:00
Sylvain Gugger
6ba540b747
Add a script to check all models are tested and documented ( #6298 )
...
* Add a script to check all models are tested and documented
* Apply suggestions from code review
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Address comments
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
2020-08-07 09:18:37 -04:00
Sylvain Gugger
c67d1a0259
Tf model outputs ( #6247 )
...
* TF outputs and test on BERT
* Albert to DistilBert
* All remaining TF models except T5
* Documentation
* One file forgotten
* TF outputs and test on BERT
* Albert to DistilBert
* All remaining TF models except T5
* Documentation
* One file forgotten
* Add new models and fix issues
* Quality improvements
* Add T5
* A bit of cleanup
* Fix for slow tests
* Style
2020-08-05 11:34:39 -04:00
Kevin Canwen Xu
3c289fb38c
Remove outdated BERT tips ( #6217 )
...
* Remove out-dated BERT tips
* Update modeling_outputs.py
* Update bert.rst
* Update bert.rst
2020-08-04 01:17:56 +08:00
Faiaz Rahman
a39dfe4fb1
Fixed typo in Longformer ( #6180 )
2020-08-01 18:20:48 +08:00
Oren Amsalem
d24ea708d7
Actually the extra_id are from 0-99 and not from 1-100 ( #5967 )
...
a = tokenizer.encode("we got a <extra_id_99>", return_tensors='pt',add_special_tokens=True)
print(a)
>tensor([[ 62, 530, 3, 9, 32000]])
a = tokenizer.encode("we got a <extra_id_100>", return_tensors='pt',add_special_tokens=True)
print(a)
>tensor([[ 62, 530, 3, 9, 3, 2, 25666, 834, 23, 26, 834, 2915, 3155]])
2020-07-30 06:13:29 -04:00
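The ids printed in the commit above are consistent with the sentinel tokens occupying the top of the vocabulary, counting downwards. A minimal sketch of that mapping follows; it assumes a vocabulary of 32,100 entries (inferred from `<extra_id_99>` encoding to 32000) and 100 sentinels. The helper name `sentinel_token_id` is hypothetical and not part of the transformers API.

```python
# Hypothetical helper for illustration only; not a transformers function.
T5_VOCAB_SIZE = 32100  # assumption: vocabulary size including the sentinel tokens
NUM_SENTINELS = 100    # sentinels run from <extra_id_0> to <extra_id_99>


def sentinel_token_id(n, vocab_size=T5_VOCAB_SIZE, num_sentinels=NUM_SENTINELS):
    """Return the id of <extra_id_n>, counting down from the last vocabulary id."""
    if not 0 <= n < num_sentinels:
        # <extra_id_100> is not a sentinel, so the tokenizer splits it into pieces.
        raise ValueError(f"<extra_id_{n}> is outside the range 0..{num_sentinels - 1}")
    return vocab_size - 1 - n


print(sentinel_token_id(99))  # 32000, matching the encode() output in the commit
print(sentinel_token_id(0))   # 32099
```

Under these assumptions, asking for `<extra_id_100>` raises an error, which mirrors the tokenizer falling back to ordinary subword pieces in the second `encode` call.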
Sylvain Gugger
a20969170b
Add AlbertForPretraining to doc ( #5914 )
2020-07-20 17:53:21 -04:00
Stas Bekman
45addfe96d
FlaubertForTokenClassification ( #5644 )
...
* implement FlaubertForTokenClassification as a subclass of XLMForTokenClassification
* fix mapping order
* add the doc
* add common tests
2020-07-13 14:59:53 -04:00
Sylvain Gugger
7fad617dc1
Document model outputs ( #5673 )
...
* Document model outputs
* Update docs/source/main_classes/output.rst
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-07-10 17:31:02 -04:00
Sam Shleifer
353b8f1e7a
Add mbart-large-cc25, support translation finetuning ( #5129 )
...
improve unittests for finetuning, especially w.r.t testing frozen parameters
fix freeze_embeds for T5
add streamlit setup.cfg
2020-07-07 13:23:01 -04:00
Quentin Lhoest
fbd8792195
Add DPR model ( #5279 )
...
* beginning of dpr modeling
* wip
* implement forward
* remove biencoder + better init weights
* export dpr model to embed model for nlp lib
* add new api
* remove old code
* make style
* fix dumb typo
* don't load bert weights
* docs
* docs
* style
* move the `k` parameter
* fix init_weights
* add pretrained configs
* minor
* update config names
* style
* better config
* style
* clean code based on PR comments
* change Dpr to DPR
* fix config
* switch encoder config to a dict
* style
* inheritance -> composition
* add messages in assert statements
* add dpr reader tokenizer
* one tokenizer per model
* fix base_model_prefix
* fix imports
* typo
* add convert script
* docs
* change tokenizers conf names
* style
* change tokenizers conf names
* minor
* minor
* fix wrong names
* minor
* remove unused convert functions
* rename convert script
* use return_tensors in tokenizers
* remove n_questions dim
* move generate logic to tokenizer
* style
* add docs
* docs
* quality
* docs
* add tests
* style
* add tokenization tests
* DPR full tests
* Stay true to the attention mask building
* update docs
* missing param in bert input docs
* docs
* style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-07-07 08:56:12 -04:00
Patrick von Platen
d16e36c7e5
[Reformer] Add Masked LM Reformer ( #5426 )
...
* fix conflicts
* fix
* happy rebasing
2020-07-01 22:43:18 +02:00
Patrick von Platen
fe81f7d12c
finish reformer qa head ( #5433 )
2020-07-01 12:27:14 -04:00
Sylvain Gugger
1262495a91
Add TF auto model to the docs + fix sphinx warnings ( #5187 )
2020-06-22 14:43:52 -04:00
Vasily Shamporov
9a3f91088c
Add MobileBert ( #4901 )
...
* Add MobileBert
* Quality + Conversion script
* style
* Update src/transformers/modeling_mobilebert.py
* Links to S3
* Style
* TFMobileBert
Slight fixes to the pytorch MobileBert
Style
* MobileBertForMaskedLM (PT + TF)
* MobileBertForNextSentencePrediction (PT + TF)
* MobileFor{MultipleChoice, TokenClassification} (PT + TF)
ss
* Tests + Auto
* Doc
* Tests
* Addressing @sgugger's comments
* Addressing @patrickvonplaten's comments
* Style
* Style
* Integration test
* style
* Model card
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-19 16:38:36 -04:00
Suraj Patil
18177a1a60
lm_labels => labels ( #5080 )
2020-06-18 09:16:29 +02:00
Sylvain Gugger
204ebc25e6
Update installation page and add contributing to the doc ( #5084 )
...
* Update installation page and add contributing to the doc
* Remove mention of symlinks
2020-06-17 14:01:10 -04:00
Sylvain Gugger
011cc0be51
Fix all Sphinx warnings ( #5068 )
2020-06-16 16:50:02 -04:00
Yacine Jernite
49c5202522
Eli5 examples ( #4968 )
...
* add eli5 examples
* add dense query script
* query_di
* merging
* merging
* add_utils
* adds nearest neighbor wikipedia
* batch queries
* training_retriever
* new notebooks
* moved retriever training script
* finished wiki40b
* max_len_fix
* train_s2s
* retriever_batch_checkpointing
* cleanup
* merge
* dim_fix
* fix_indexer
* fix_wiki40b_snippets
* fix_embed_for_r
* fp32 index
* fix_sparse_q
* joint_training
* remove obsolete datasets
* add_passage_nn_results
* add_passage_nn_results
* add_batch_nn
* add_batch_nn
* add_data_scripts
* notebook
* notebook
* notebook
* fix_multi_gpu
* add_app
* full_caching
* full_caching
* notebook
* sparse_done
* images
* notebook
* add_image_gif
* with_Gif
* add_contr_image
* notebook
* notebook
* notebook
* train_functions
* notebook
* min_retrieval_length
* pandas_option
* notebook
* min_retrieval_length
* notebook
* notebook
* eval_Retriever
* notebook
* images
* notebook
* add_example
* add_example
* notebook
* fireworks
* notebook
* notebook
* joe's notebook comments
* app_update
* notebook
* notebook_link
* captions
* notebook
* adding RetriBert model
* add RetriBert to Auto
* change AutoLMHead to AutoSeq2Seq
* notebook downloads from hf models
* style_black
* style_black
* app_update
* app_update
* fix_app_update
* style
* style
* isort
* Delete WikiELI5training.ipynb
* Delete evaluate_eli5.py
* Delete WikiELI5explore.ipynb
* Delete ExploreWikiELI5Support.html
* Delete explainlikeimfive.py
* Delete wiki_snippets.py
* children before parent
* children before parent
* style_black
* style_black_only
* isort
* isort_new
* Update src/transformers/modeling_retribert.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* typo fixes
* app_without_asset
* cleanup
* Delete ELI5animation.gif
* Delete ELI5contrastive.svg
* Delete ELI5wiki_index.svg
* Delete choco_bis.svg
* Delete fireworks.gif
* Delete huggingface_logo.jpg
* Delete huggingface_logo.svg
* Delete Long_Form_Question_Answering_with_ELI5_and_Wikipedia.ipynb
* Delete eli5_app.py
* Delete eli5_utils.py
* readme
* Update README.md
* unused imports
* moved_info
* default_beam
* ftuned model
* disclaimer
* Update src/transformers/modeling_retribert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* black
* add_doc
* names
* isort_Examples
* isort_Examples
* Add doc to index
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-06-16 16:36:58 -04:00
Sylvain Gugger
f9f8a5312e
Add DistilBertForMultipleChoice ( #5032 )
...
* Add `DistilBertForMultipleChoice`
2020-06-15 18:31:41 -04:00
Suraj Patil
e93ccb3290
BartForQuestionAnswering ( #4908 )
2020-06-12 15:47:57 -04:00
Sylvain Gugger
538531cde5
Add AlbertForMultipleChoice ( #4959 )
...
* Add AlbertForMultipleChoice
* Make up to date and add all models to common tests
2020-06-12 14:20:19 -04:00
Suraj Patil
ef2dcdccaa
ElectraForQuestionAnswering ( #4913 )
...
* ElectraForQuestionAnswering
* update __init__
* add test for electra qa model
* add ElectraForQuestionAnswering in auto models
* add ElectraForQuestionAnswering in all_model_classes
* fix outputs, input_ids defaults to None
* add ElectraForQuestionAnswering in docs
* remove commented line
2020-06-10 15:17:52 -04:00
Sylvain Gugger
41a1d27cde
Add XLMRobertaForQuestionAnswering ( #4855 )
...
* Add XLMRobertaForQuestionAnswering
* Formatting
* Make test happy
2020-06-08 21:22:37 -04:00
Sylvain Gugger
37be3786cf
Clean documentation ( #4849 )
...
* Clean documentation
2020-06-08 11:28:19 -04:00
Julien Chaumond
b42586ea56
Fix CI after killing archive maps ( #4724 )
...
* 🐛 Fix model ids for BART and Flaubert
2020-06-02 10:21:09 -04:00
Patrick von Platen
56ee2560be
[Longformer] Better handling of global attention mask vs local attention mask ( #4672 )
...
* better api
* improve automatic setting of global attention mask
* fix longformer bug
* fix global attention mask in test
* fix global attn mask flatten
* fix slow tests
* update docstring
* update docs and make more robust
* improve attention mask
2020-05-29 17:58:42 +02:00
Patrick von Platen
9c17256447
[Longformer] Multiple choice for longformer ( #4645 )
...
* add multiple choice for longformer
* add models to docs
* adapt docstring
* add test to longformer
* add longformer for mc in init and modeling auto
* fix tests
2020-05-29 13:46:08 +02:00
Patrick von Platen
c589eae2b8
[Longformer For Question Answering] Conversion script, doc, small fixes ( #4593 )
...
* add new longformer for question answering model
* add new config as well
* fix links
* fix links part 2
2020-05-26 14:58:47 +02:00
Patrick von Platen
3e3e552125
[Reformer] fix reformer num buckets ( #4564 )
...
* fix reformer num buckets
* fix
* adapt docs
* set num buckets in config
2020-05-25 16:04:45 -04:00
Alexander Measure
95a26fcf2d
link to paper was broken ( #4526 )
...
changed from https://https://arxiv.org/abs/2001.04451.pdf to https://arxiv.org/abs/2001.04451.pdf
2020-05-22 15:17:09 -04:00
Patrick von Platen
48c3a70b4e
[Longformer] Docs and clean API ( #4464 )
...
* add longformer docs
* improve docs
2020-05-19 21:52:36 +02:00
Soham Chatterjee
fa6113f9a0
Fixed spelling of training ( #4416 )
2020-05-18 11:23:29 -04:00
Sam Shleifer
9a687ebb77
[Marian Fixes] prevent predicting pad_token_id before softmax, support language codes, name multilingual models ( #4290 )
2020-05-13 17:29:41 -04:00
Sam Shleifer
3487be75ef
[Marian] documentation and AutoModel support ( #4152 )
...
- MarianSentencepieceTokenizer - > MarianTokenizer
- Start using unk token.
- add docs page
- add better generation params to MarianConfig
- more conversion utilities
2020-05-10 13:54:57 -04:00
Julien Chaumond
c99fe0386b
[doc] Fix broken links + remove crazy big notebook
2020-05-07 18:44:18 -04:00
Patrick von Platen
dca34695d0
Reformer ( #3351 )
...
* first copy & paste commit from Bert and morgans LSH code
* add easy way to compare to trax original code
* translate most of function
* make trax lsh self attention deterministic with numpy seed + copy paste code
* add same config
* add same config
* make layer init work
* implemented hash_vectors function for lsh attention
* continue reformer translation
* hf LSHSelfAttentionLayer gives same output as trax layer
* refactor code
* refactor code
* refactor code
* refactor
* refactor + add reformer config
* delete bogus file
* split reformer attention layer into two layers
* save intermediate step
* save intermediate step
* make test work
* add complete reformer block layer
* finish reformer layer
* implement causal and self mask
* clean reformer test and refactor code
* fix merge conflicts
* fix merge conflicts
* update init
* fix device for GPU
* fix chunk length init for tests
* include morgans optimization
* improve memory a bit
* improve comment
* factorize num_buckets
* better testing parameters
* make whole model work
* make lm model work
* add t5 copy paste tokenizer
* add chunking feed forward
* clean config
* add improved assert statements
* make tokenizer work
* improve test
* correct typo
* extend config
* add more complex test
* add new axial position embeddings
* add local block attention layer
* clean tests
* refactor
* better testing
* save intermediate progress
* clean test file
* make shorter input length work for model
* allow variable input length
* refactor
* make forward pass for pretrained model work
* add generation possibility
* finish dropout and init
* make style
* refactor
* add first version of RevNet Layers
* make forward pass work and add convert file
* make uploaded model forward pass work
* make uploaded model forward pass work
* refactor code
* add namedtuples and cache buckets
* correct head masks
* refactor
* made reformer more flexible
* make style
* remove set max length
* add attention masks
* fix up tests
* fix lsh attention mask
* make random seed optional for the moment
* improve memory in reformer
* add tests
* make style
* make sure masks work correctly
* detach gradients
* save intermediate
* correct backprop through gather
* make style
* change back num hashes
* rename to labels
* fix rotation shape
* fix detach
* update
* fix trainer
* fix backward dropout
* make reformer more flexible
* fix conflict
* fix
* fix
* add tests for fixed seed in reformer layer
* fix trainer typo
* fix typo in activations
* add fp16 tests
* add fp16 training
* support fp16
* correct gradient bug in reformer
* add fast gelu
* re-add dropout for embedding dropout
* better naming
* better naming
* renaming
* finalize test branch
* finalize tests
* add more tests
* finish tests
* fix
* fix type trainer
* fix fp16 tests
* fix tests
* fix tests
* fix tests
* fix issue with dropout
* fix dropout seeds
* correct random seed on gpu
* finalize random seed for dropout
* finalize random seed for dropout
* remove duplicate line
* correct half precision bug
* make style
* refactor
* refactor
* docstring
* remove sinusoidal position encodings for reformer
* move chunking to modeling_utils
* make style
* clean config
* make style
* fix tests
* fix auto tests
* pretrained models
* fix docstring
* update conversion file
* Update pretrained_models.rst
* fix rst
* fix rst
* update copyright
* fix test path
* fix test path
* fix small issue in test
* include reformer in generation tests
* add docs for axial position encoding
* finish docs
* Update convert_reformer_trax_checkpoint_to_pytorch.py
* remove isort
* include sams comments
* remove wrong comment in utils
* correct typos
* fix typo
* Update reformer.rst
* applied morgans optimization
* make style
* make gpu compatible
* remove bogus file
* big test refactor
* add example for chunking
* fix typo
* add to README
2020-05-07 10:17:01 +02:00
Patrick von Platen
fa49b9afea
Clean Encoder-Decoder models with Bart/T5-like API and add generate possibility ( #3383 )
...
* change encoder decoder style to bart & t5 style
* make encoder decoder generation dummy work for bert
* make style
* clean init config in encoder decoder
* add tests for encoder decoder models
* refactor and add last tests
* refactor and add last tests
* fix attn masks for bert encoder decoder
* make style
* refactor prepare inputs for Bert
* refactor
* finish encoder decoder
* correct typo
* add docstring to config
* finish
* add tests
* better naming
* make style
* fix flake8
* clean docstring
* make style
* rename
2020-04-28 15:11:09 +02:00
Patrick von Platen
52679fbc2e
add dialogpt training tips ( #3996 )
2020-04-28 14:32:31 +02:00
Lorenzo Ampil
12bb7fe770
Fix t5 doc typos ( #3978 )
...
* Fix typo in intro and add line under
* Add missing blank line under
* Correct types under
2020-04-27 18:27:15 +02:00
Thomas Wolf
827d6d6ef0
Cleanup fast tokenizers integration ( #3706 )
...
* First pass on utility classes and python tokenizers
* finishing cleanup pass
* style and quality
* Fix tests
* Updating following @mfuntowicz comment
* style and quality
* Fix Roberta
* fix batch_size/seq_length in BatchEncoding
* add alignment methods + tests
* Fix OpenAI and Transfo-XL tokenizers
* adding trim_offsets=True default for GPT2 and RoBERTa
* style and quality
* fix tests
* add_prefix_space in roberta
* bump up tokenizers to rc7
* style
* unfortunately tensorflow doesn't like these - removing shape/seq_len for now
* Update src/transformers/tokenization_utils.py
Co-Authored-By: Stefan Schweter <stefan@schweter.it>
* Adding doc and docstrings
* making flake8 happy
Co-authored-by: Stefan Schweter <stefan@schweter.it>
2020-04-18 13:43:57 +02:00
Patrick von Platen
d22894dfd4
[Docs] Add DialoGPT ( #3755 )
...
* add dialoGPT
* update README.md
* fix conflict
* update readme
* add code links to docs
* Update README.md
* Update dialo_gpt2.rst
* Update pretrained_models.rst
* Update docs/source/model_doc/dialo_gpt2.rst
Co-Authored-By: Julien Chaumond <chaumond@gmail.com>
* change filename of dialogpt
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-04-16 09:04:32 +02:00
Lysandre Debut
d5d7d88612
ELECTRA ( #3257 )
...
* Electra wip
* helpers
* Electra wip
* Electra v1
* ELECTRA may be saved/loaded
* Generator & Discriminator
* Embedding size instead of halving the hidden size
* ELECTRA Tokenizer
* Revert BERT helpers
* ELECTRA Conversion script
* Archive maps
* PyTorch tests
* Start fixing tests
* Tests pass
* Same configuration for both models
* Compatible with base + large
* Simplification + weight tying
* Archives
* Auto + Renaming to standard names
* ELECTRA is uncased
* Tests
* Slight API changes
* Update tests
* wip
* ElectraForTokenClassification
* temp
* Simpler arch + tests
Removed ElectraForPreTraining which will be in a script
* Conversion script
* Auto model
* Update links to S3
* Split ElectraForPreTraining and ElectraForTokenClassification
* Actually test PreTraining model
* Remove num_labels from configuration
* wip
* wip
* From discriminator and generator to electra
* Slight API changes
* Better naming
* TensorFlow ELECTRA tests
* Accurate conversion script
* Added to conversion script
* Fast ELECTRA tokenizer
* Style
* Add ELECTRA to README
* Modeling Pytorch Doc + Real style
* TF Docs
* Docs
* Correct links
* Correct model initialized
* random fixes
* style
* Addressing Patrick's and Sam's comments
* Correct links in docs
2020-04-03 14:10:54 -04:00
Patrick von Platen
5b44e0a31b
[T5] Add training documentation ( #3507 )
...
* Add clear description of how to train T5
* correct docstring in T5
* correct typo
* correct docstring format
* update t5 model docs
* implement collins feedback
* fix typo and add more explanation for sentinel tokens
* delete unnecessary todos
2020-03-30 13:35:53 +02:00
Patrick von Platen
fa9af2468a
Add T5 to docs ( #3461 )
...
* add t5 docs basis
* improve docs
* add t5 docs
* improve t5 docstring
* add t5 tokenizer docstring
* finish docstring
* make style
* add pretrained models
* correct typo
* make examples work
* finalize docs
2020-03-27 10:57:16 -04:00
Sam Shleifer
857e0a0d3b
Rename BartForMaskedLM -> BartForConditionalGeneration ( #3114 )
...
* improved documentation
2020-03-05 17:41:18 -05:00
Sam Shleifer
b54ef78d0c
Bart-CNN ( #3059 )
...
`generate` code that produces 99% identical summarizations to fairseq on CNN test data, with caching.
2020-03-02 10:35:53 -05:00
Lysandre Debut
bb7c468520
Documentation ( #2989 )
...
* All Tokenizers
BertTokenizer + few fixes
RobertaTokenizer
OpenAIGPTTokenizer + Fixes
GPT2Tokenizer + fixes
TransfoXLTokenizer
Correct rst for TransformerXL
XLMTokenizer + fixes
XLNet Tokenizer + Style
DistilBERT + Fix XLNet RST
CTRLTokenizer
CamemBERT Tokenizer
FlaubertTokenizer
XLMRobertaTokenizer
cleanup
* cleanup
2020-02-25 18:43:36 -05:00
Sam Shleifer
53ce3854a1
New BartModel ( #2745 )
...
* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs
2020-02-20 18:11:13 -05:00
Lysandre
dd28830327
Update RoBERTa tips
2020-02-07 16:42:35 -05:00
Lysandre
db97930122
Update XLM-R tips
2020-02-07 16:42:35 -05:00
Lysandre
73306d028b
FlauBERT documentation
2020-01-30 10:04:18 -05:00
Lysandre
c69b082601
Update documentation
2020-01-29 12:06:13 -05:00
Lysandre
44a5b4bbe7
Update documentation
2020-01-29 11:47:49 -05:00
thomwolf
e0849a66ac
adding in the doc
2020-01-27 14:27:07 -05:00
Lysandre
983fef469c
AutoModels doc
2020-01-24 16:37:30 -05:00
Lysandre
24d5ad1dcc
Run the examples in slow
2020-01-23 09:38:45 -05:00
Lysandre
9ddf60b694
Tips + whitespaces
2020-01-23 09:38:45 -05:00
Lysandre
0e9899f451
Fixes
2020-01-23 09:38:45 -05:00
Lysandre
7511f3dd89
PyTorch CTRL + Style
2020-01-23 09:38:45 -05:00
Lysandre
980211a63a
XLM-RoBERTa
2020-01-23 09:38:45 -05:00
Lysandre
db1a7f27a1
PyTorch DistilBERT
2020-01-23 09:38:45 -05:00
Lysandre
b28020f590
TF RoBERTa
2020-01-23 09:38:45 -05:00
Lysandre
3e1bc27e1b
Pytorch RoBERTa
2020-01-23 09:38:45 -05:00
Lysandre
f44ff574d3
Camembert
2020-01-23 09:38:45 -05:00
Lysandre
ccebcae75f
PyTorch XLM
2020-01-23 09:38:45 -05:00
Lysandre
cd656fb21a
PyTorch XLNet
2020-01-23 09:38:45 -05:00
Lysandre
98edad418e
PyTorch Transformer-XL
2020-01-23 09:38:45 -05:00
Lysandre
850795c487
Pytorch GPT
2020-01-23 09:38:45 -05:00
Lysandre
1487b840d3
TF GPT2
2020-01-23 09:38:45 -05:00
Lysandre
bd0d3fd76e
GPT-2 PyTorch models + better tips for BERT
2020-01-23 09:38:45 -05:00
Lysandre
cd77c750c5
BERT PyTorch models
2020-01-23 09:38:45 -05:00
Lysandre
3922a2497e
TF ALBERT + TF Utilities + Fix warnings
2020-01-23 09:38:45 -05:00
Lysandre
00df3d4de0
ALBERT Modeling + required changes to utilities
2020-01-23 09:38:45 -05:00
Lysandre
387217bd3e
Added example usage
2020-01-14 14:09:09 +01:00
Lysandre
7d1bb7f256
Add missing XLNet and XLM models
2020-01-14 14:09:09 +01:00
Lysandre Debut
632682726f
Updated Configurations
2020-01-14 14:09:09 +01:00
alberduris
81d6841b4b
GPU text generation: Moved the encoded_prompt to correct device
2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b
Moved the encoded_prompts to correct device
2020-01-06 15:11:12 +01:00
Lysandre
361620954a
Remove TFBertForPreTraining from ALBERT doc
2019-11-27 10:11:37 -05:00
Lysandre
ee4647bd5c
CamemBERT & ALBERT doc
2019-11-26 15:10:51 -05:00
Julien Chaumond
93d2fff071
Close #1654
2019-11-01 09:47:38 -04:00
LysandreJik
89f86f9661
CTRL added to the documentation
2019-10-09 12:04:06 -04:00
thomwolf
6c3b131516
typo in readme/doc
2019-09-26 16:23:28 +02:00
LysandreJik
7e957237e4
[Doc] XLM + Torch in documentation
2019-09-26 10:08:56 -04:00
LysandreJik
927904bc91
[doc] pytorch_transformers -> transformers
2019-09-26 08:47:15 -04:00
LysandreJik
8349d75773
Various small doc fixes
2019-09-26 07:45:40 -04:00
LysandreJik
4acd87ff4e
TF models added to documentation
2019-09-26 07:45:40 -04:00
thomwolf
31c23bd5ee
[BIG] pytorch-transformers => transformers
2019-09-26 10:15:53 +02:00
LysandreJik
7f522437bc
Updated documentation for LM finetuning script
2019-09-02 13:40:25 -04:00
LysandreJik
1dc43e56c9
Documentation additions
2019-08-28 09:37:27 -04:00
LysandreJik
572dcfd1db
Doc
2019-08-14 14:56:14 -04:00
thomwolf
0b524b0848
remove derived classes for now
2019-08-05 19:08:19 +02:00
thomwolf
13936a9621
update doc and tests
2019-08-05 18:48:16 +02:00
thomwolf
b90e29d52c
working on automodels
2019-08-05 16:06:34 +02:00
thomwolf
00132b7a7a
updating docs - adding few tests to tokenizers
2019-08-04 22:42:55 +02:00
thomwolf
009273dbdd
big doc update [WIP]
2019-08-04 12:14:57 +02:00
thomwolf
c717d38573
dictionnary => dictionary
2019-07-26 23:30:48 +02:00
Stefan Schweter
5b78400e21
docs: fix link to modeling example source (bert)
2019-07-16 23:41:57 +02:00
thomwolf
2397f958f9
updating examples and doc
2019-07-14 23:20:10 +02:00
LysandreJik
c82b74b996
Fixed Sphinx errors and warnings
2019-07-10 15:30:19 -04:00
LysandreJik
f773faa258
Fixed all links. Removed TPU. Changed CLI to Converting TF models. Many minor formatting adjustments. Added "TODO Lysandre filled" where necessary.
2019-07-10 14:45:56 -04:00
LysandreJik
83fb311ef7
Patched warnings + Refactored XLNet's Docstrings
2019-07-09 16:38:30 -04:00
LysandreJik
8fe2c9d98e
Refactored Docstrings of BERT, GPT2, GPT, TransfoXL, XLM and XLNet.
2019-07-09 15:55:31 -04:00
LysandreJik
ab30651802
Hugging Face theme.
2019-07-08 16:05:26 -04:00
LysandreJik
64fd986376
Tokenizers and Config classes are referenced.
2019-07-05 17:44:59 -04:00
LysandreJik
df759114c9
Single file documentation for each model, accompanied by the Documentation overview.
2019-07-05 17:35:26 -04:00