Commit Graph

695 Commits

Author SHA1 Message Date
yujun
206f06f2dd
Add new model RoFormer (use rotary position embedding ) (#11684)
* add roformer

* Update docs/source/model_doc/roformer.rst

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Update docs/source/model_doc/roformer.rst

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* update

* add TFRoFormerSinusoidalPositionalEmbedding and fix TFMarianSinusoidalPositionalEmbedding

* update docs

* make style and make quality

* roback

* unchanged

* rm copies from , this is a error in TFMarianSinusoidalPositionalEmbedding

* update Copyright year

* move # Add modeling imports here to the correct position

* max_position_embeddings can be set to 1536

* # Copied from transformers.models.bert.modeling_bert.BertOutput with Bert->RoFormer

* # Copied from transformers.models.bert.modeling_bert.BertLayer.__init__ with Bert->RoFormer

* update tokenization_roformer

* make style

* add staticmethod apply_rotary_position_embeddings

* add TF staticmethod apply_rotary_position_embeddings

* update torch apply_rotary_position_embeddings

* fix tf apply_rotary_position_embeddings error

* make style

* add pytorch RoFormerSelfAttentionRotaryPositionEmbeddingTest

* add TF rotary_position_embeddings test

* update test_modeling_rofomer

* Update docs/source/model_doc/roformer.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/roformer/convert_roformer_original_tf_checkpoint_to_pytorch.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/roformer/modeling_roformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/roformer/modeling_roformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/roformer/modeling_tf_roformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refact roformer tokenizer

* add RoFormerTokenizerFast

* add RoFormerTokenizationTest

* add require_jieba

* update Copyright

* update tokenizer & add copy from

* add option rotary_value

* use rust jieba

* use rjieba

* use rust jieba

* fix test_alignement_methods

* slice normalized_string is too slow

* add config.embedding_size when embedding_size!=hidden_size

* fix pickle tokenizer

* Update docs/source/model_doc/roformer.rst

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style and make quality

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-20 08:00:34 -04:00
Albert Villanova del Moral
2582e59a57
Add DOI badge to README (#11771) 2021-05-19 09:48:56 -04:00
Patrick von Platen
cebb96f53a
Add more subsections to main doc (#11758)
* add headers to main doc

* Apply suggestions from code review

* update

* upload
2021-05-18 14:38:56 +01:00
Julien Chaumond
0fc56df5fb
Add visual + link to Premium Support webpage (#11740)
* Update README.md

* Update index.rst
2021-05-17 05:28:56 -04:00
Suraj Patil
f063c56d94
Fix clip docs (#11694)
* fix doc url

* fix example
2021-05-12 15:28:30 +05:30
Suraj Patil
8719afa1ad
CLIP (#11445)
* begin second draft

* fix import, style

* add loss

* fix embeds, logits_scale, and projection

* fix imports

* add conversion script

* add feature_extractor and processor

* style

* add tests for tokenizer, extractor and processor

* add vision model tests

* add weight init

* add more tests

* fix save_load  test

* model output, dosstrings, causal mask

* config doc

* add clip model tests

* return dict

* bigin integration test

* add integration tests

* fix-copies

* fix init

* Clip => CLIP

* fix module name

* docs

* fix doc

* output_dim => projection_dim

* fix checkpoint names

* remoe fast tokenizer file

* fix conversion script

* fix tests, quality

* put causal mask on device

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix attribute test

* style

* address sylvains comments

* style

* fix docstrings

* add qucik_gelu in activations, docstrings

* clean-up attention test

* fix act fun

* fix config

* fix torchscript tests

* even batch_size

* remove comment

* fix ouput tu_tuple

* fix save load tests

* fix add tokens test

* add fast tokenizer

* update copyright

* new processor API

* fix docs

* docstrings

* docs

* fix doc

* fix doc

* fix tokenizer

* fix import in doc example

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* check types of config

* valhalla => openai

* load image using url

* fix test

* typo

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-05-12 13:48:15 +05:30
Matt
b3429ab678
Grammar and style edits for the frontpage README (#11679)
* Grammar and style edits for the frontpage README

* Going all-in on em-dashes because you only live once

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-05-11 15:49:34 +01:00
Vasudev Gupta
dc3f6758cf
Add BigBirdPegasus (#10991)
* init bigbird pegasus

* add debugging nb ; update config

* init conversion

* update conversion script

* complete conversion script

* init forward()

* complete forward()

* add tokenizer

* add some slow tests

* commit current

* fix copies

* add docs

* add conversion script for bigbird-roberta-summarization

* remove TODO

* small fixups

* correct tokenizer

* add bigbird core for now

* fix config

* fix more

* revert pegasus-tokenizer back

* make style

* everything working for pubmed; yayygit status

* complete tests finally

* remove bigbird pegasus tok

* correct tokenizer

* correct tests

* add tokenizer files

* finish make style

* fix test

* update

* make style

* fix tok utils base file

* make fix-copies

* clean a bit

* small update

* fix some suggestions

* add to readme

* fix a bit, clean tests

* fix more tests

* Update src/transformers/__init__.py

* Update src/transformers/__init__.py

* make fix-copies

* complete attn switching, auto-padding left

* make style

* fix auto-padding test

* make style

* fix batched attention tests

* put tolerance at 1e-1 for stand-alone decoder test

* fix docs

* fix tests

* correct slow tokenizer conversion

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* complete remaining suggestions

* fix test

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-05-07 09:27:43 +02:00
NielsRogge
f3cf8ae7b3
Add LUKE (#11223)
* Rebase with master

* Minor bug fix in docs

* Copy files from adding_luke_v2 and improve docs

* change the default value of use_entity_aware_attention to True

* remove word_hidden_states

* fix head models

* fix tests

* fix the conversion script

* add integration tests for the pretrained large model

* improve docstring

* Improve docs, make style

* fix _init_weights for pytorch 1.8

* improve docs

* fix tokenizer to construct entity sequence with [MASK] entity when entities=None

* Make fix-copies

* Make style & quality

* Bug fixes

* Add LukeTokenizer to init

* Address most comments by @patil-suraj and @LysandreJik

* rename _compute_extended_attention_mask to get_extended_attention_mask

* add comments to LukeSelfAttention

* fix the documentation of the tokenizer

* address comments by @patil-suraj, @LysandreJik, and @sgugger

* improve docs

* Make style, quality and fix-copies

* Improve docs

* fix docs

* add "entity_span_classification" task

* update example code for LukeForEntitySpanClassification

* improve docs

* improve docs

* improve the code example in luke.rst

* rename the classification layer in LukeForEntityClassification from typing to classifier

* add bias to the classifier in LukeForEntitySpanClassification

* update docs to use fine-tuned hub models in code examples of the head models

* update the example sentences

* Make style & quality

* Add require_torch to tokenizer tests

* Add require_torch to tokenizer tests

* Address comments by @sgugger and add community notebooks

* Make fix-copies

Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>
2021-05-03 09:07:29 -04:00
Sylvain Gugger
2d27900b5d
Update min versions in README and add Flax (#11472)
* Update min versions in README and add Flax

* Adapt index
2021-04-28 09:10:06 -04:00
NielsRogge
9f1260971f
Add DeiT (PyTorch) (#11056)
* First draft of deit

* More improvements

* Remove DeiTTokenizerFast from init

* Conversion script works

* Add DeiT to ViT conversion script

* Add tests, add head model, add support for deit in vit conversion script

* Update model checkpoint names

* Update image_mean and image_std, set resample to bicubic

* Improve docs

* Docs improvements

* Add DeiTForImageClassificationWithTeacher to init

* Address comments by @sgugger

* Improve feature extractors

* Make fix-copies

* Minor fixes

* Address comments by @patil-suraj

* All models uploaded

* Fix tests

* Remove labels argument from DeiTForImageClassificationWithTeacher

* Fix-copies, style and quality

* Fix tests

* Fix typo

* Multiple docs improvements

* More docs fixes
2021-04-12 18:07:10 -04:00
Kevin Canwen Xu
fb41f9f50c
Add a special tokenizer for CPM model (#11068)
* Add a special tokenizer for CPM model

* make style

* fix

* Add docs

* styles

* cpm doc

* fix ci

* fix the overview

* add test

* make style

* typo

* Custom tokenizer flag

* Add REAMDE.md

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-04-10 02:07:47 +08:00
Julien Demouth
02ec02d6d3
Add nvidia megatron models (#10911)
* Add support for NVIDIA Megatron models

* Add support for NVIDIA Megatron GPT2 and BERT

Add the megatron_gpt2 model. That model reuses the existing GPT2 model. This
commit includes a script to convert a Megatron-GPT2 checkpoint downloaded
from NVIDIA GPU Cloud. See examples/megatron-models/README.md for details.

Add the megatron_bert model. That model is implemented as a modification of
the existing BERT model in Transformers. This commit includes a script to
convert a Megatron-BERT checkpoint downloaded from NVIDIA GPU Cloud. See
examples/megatron-models/README.md for details.

* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Remove model.half in tests + add "# Copied ..."

Remove the model.half() instruction which makes tests fail on the CPU.

Add a comment "# Copied ..." before many classes in the model to enable automatic
tracking in CI between the new Megatron classes and the original Bert ones.

* Fix issues

* Fix Flax/TF tests

* Fix copyright

* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/megatron_bert/configuration_megatron_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update docs/source/model_doc/megatron_bert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/megatron_gpt2.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/convert_megatron_bert_checkpoint.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/megatron_bert/modeling_megatron_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Resolve most of 'sgugger' comments

* Fix conversion issue + Run make fix-copies/quality/docs

* Apply suggestions from code review

* Causal LM & merge

* Fix init

* Add CausalLM to last auto class

Co-authored-by: Julien Demouth <jdemouth@nvidia.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-04-08 14:09:11 -04:00
NielsRogge
30677dc743
Add Vision Transformer and ViTFeatureExtractor (#10950)
* Squash all commits into one

* Update ViTFeatureExtractor to use image_utils instead of torchvision

* Remove torchvision and add Pillow

* Small docs improvement

* Address most comments by @sgugger

* Fix tests

* Clean up conversion script

* Pooler first draft

* Fix quality

* Improve conversion script

* Make style and quality

* Make fix-copies

* Minor docs improvements

* Should use fix-copies instead of manual handling

* Revert "Should use fix-copies instead of manual handling"

This reverts commit fd4e591bce.

* Place ViT in alphabetical order

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-01 11:16:05 -04:00
Suraj Patil
860264379f
GPT Neo (#10848)
* lets begin

* boom boom

* fix out proj in attn

* fix attention

* fix local attention

* add tokenizer

* fix imports

* autotokenizer

* fix checkpoint name

* cleanup

* more clean-up

* more cleanup

* output attentions

* fix attn mask creation

* fix imports

* config doc

* add tests

* add slow tests

* quality

* add conversion script

* copyright

* typo

* another bites the dust

* fix attention tests

* doc

* add embed init in convert function

* fix copies

* remove tokenizer

* enable caching

* address review comments

* improve config and create attn layer list internally

* more consistent naming

* init hf config from mesh-tf config json file

* remove neo tokenizer from doc

* handle attention_mask in local attn layer

* attn_layers => attention_layers

* add tokenizer_class in config

* fix docstring

* raise if len of attention_layers is not same as num_layers

* remove tokenizer_class from config

* more consistent naming

* fix doc

* fix checkpoint names

* fp16 compat

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-30 09:42:30 -04:00
Vasudev Gupta
6dfd027279
BigBird (#10183)
* init bigbird

* model.__init__ working, conversion script ready, config updated

* add conversion script

* BigBirdEmbeddings working :)

* slightly update conversion script

* BigBirdAttention working :) ; some bug in layer.output.dense

* add debugger-notebook

* forward() working for BigBirdModel :) ; replaced gelu with gelu_fast

* tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :)

* BigBirdModel working in block-sparse attention mode :)

* add BigBirdForPreTraining

* small fix

* add tokenizer for BigBirdModel

* fix config & hence modeling

* fix base prefix

* init testing

* init tokenizer test

* pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements

* remove position_embedding_type arg

* complete normal tests

* add comments to block sparse attention

* add attn_probs for sliding & global tokens

* create fn for block sparse attn mask creation

* add special tests

* restore pos embed arg

* minor fix

* attn probs update

* make big bird fully gpu friendly

* fix tests

* remove pruning

* correct tokenzier & minor fixes

* update conversion script , remove norm_type

* tokenizer-inference test add

* remove extra comments

* add docs

* save intermediate

* finish trivia_qa conversion

* small update to forward

* correct qa and layer

* better error message

* BigBird QA ready

* fix rebased

* add triva-qa debugger notebook

* qa setup

* fixed till embeddings

* some issue in q/k/v_layer

* fix bug in conversion-script

* fixed till self-attn

* qa fixed except layer norm

* add qa end2end test

* fix gradient ckpting ; other qa test

* speed-up big bird a bit

* hub_id=google

* clean up

* make quality

* speed up einsum with bmm

* finish perf improvements for big bird

* remove wav2vec2 tok

* fix tokenizer

* include docs

* correct docs

* add helper to auto pad block size

* make style

* remove fast tokenizer for now

* fix some

* add pad test

* finish

* fix some bugs

* fix another bug

* fix buffer tokens

* fix comment and merge from master

* add comments

* make style

* commit some suggestions

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix typos

* fix some more suggestions

* add another patch

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix copies

* another path

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* update

* update nit suggestions

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-30 08:51:34 +03:00
Lysandre
c988db5af2 Release v4.4.0 2021-03-16 11:33:35 -04:00
Patrick von Platen
602d63f05c
[XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models (#10648)
* add conversion script

* add wav2vec2 xslr models

* finish

* Update docs/source/model_doc/xlsr_wav2vec2.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-11 17:44:18 +03:00
Suraj Patil
d26b37e744
Speech2TextTransformer (#10175)
* s2t

* fix config

* conversion script

* fix import

* add tokenizer

* fix tok init

* fix tokenizer

* first version working

* fix embeds

* fix lm head

* remove extra heads

* fix convert script

* handle encoder attn mask

* style

* better enc attn mask

* override _prepare_attention_mask_for_generation

* handle attn_maks in encoder and decoder

* input_ids => input_features

* enable use_cache

* remove old code

* expand embeddings if needed

* remove logits bias

* masked_lm_loss => loss

* hack tokenizer to support feature processing

* fix model_input_names

* style

* fix error message

* doc

* remove inputs_embeds

* remove input_embeds

* remove unnecessary docstring

* quality

* SpeechToText => Speech2Text

* style

* remove shared_embeds

* subsample => conv

* remove Speech2TextTransformerDecoderWrapper

* update output_lengths formula

* fix table

* remove max_position_embeddings

* update conversion scripts

* add possibility to do upper case for now

* add FeatureExtractor and Processor

* add tests for extractor

* require_torch_audio => require_torchaudio

* add processor test

* update import

* remove classification head

* attention mask is now 1D

* update docstrings

* attention mask should be of type long

* handle attention mask from generate

* alwyas return attention_mask

* fix test

* style

* doc

* Speech2TextTransformer => Speech2Text

* Speech2TextTransformerConfig => Speech2TextConfig

* remove dummy_inputs

* nit

* style

* multilinguial tok

* fix tokenizer

* add tgt_lang setter

* save lang_codes

* fix tokenizer

* add forced_bos_token_id to tokenizer

* apply review suggestions

* add torchaudio to extra deps

* add speech deps to CI

* fix dep

* add libsndfile to ci

* libsndfile1

* add speech to extras all

* libsndfile1 -> libsndfile1

* libsndfile

* libsndfile1-dev

* apt update

* add sudo to install

* update deps table

* install libsndfile1-dev on CI

* tuple to list

* init conv layer

* add model tests

* quality

* add integration tests

* skip_special_tokens

* add speech_to_text_transformer in toctree

* fix tokenizer

* fix fp16 tests

* add tokenizer tests

* fix copyright

* input_values => input_features

* doc

* add model in readme

* doc

* change checkpoint names

* fix copyright

* fix code example

* add max_model_input_sizes in tokenizer

* fix integration tests

* add do_lower_case to tokenizer

* remove clamp trick

* fix "Add modeling imports here"

* fix copyrights

* fix tests

* SpeechToTextTransformer => SpeechToText

* fix naming

* fix table formatting

* fix typo

* style

* fix typos

* remove speech dep from extras[testing]

* fix copies

* rename doc file,

* put imports under is_torch_available

* run feat extract tests when torch is available

* dummy objects for processor and extractor

* fix imports in tests

* fix import in modeling test

* fxi imports

* fix torch import

* fix imports again

* fix positional embeddings

* fix typo in import

* adapt new extractor refactor

* style

* fix torchscript test

* doc

* doc

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix docs, copied from, style

* fix docstring

* handle imports

* remove speech from all extra deps

* remove s2t from seq2seq lm mapping

* better names

* skip training tests

* add install instructions

* List => Tuple

* doc

* fix conversion script

* fix urls

* add instruction for libsndfile

* fix fp16 test

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-10 21:42:04 +05:30
Suraj Patil
f6e74a63ca
Add m2m100 (#10236)
* m2m_100

* no layernorm_embedding

* sinusoidal positional embeddings

* update pos embeddings

* add default config values

* tokenizer

* add conversion script

* fix config

* fix pos embed

* remove _float_tensor

* update tokenizer

* update lang codes

* handle lang codes

* fix pos embeds

* fix spm key

* put embedding weights on device

* remove qa and seq classification heads

* fix convert script

* lang codes pn one line

* fix embeds

* fix tokenizer

* fix tokenizer

* add fast tokenizer

* style

* M2M100MT => M2M100

* fix copyright, style

* tokenizer converter

* vocab file

* remove fast tokenizer

* fix embeds

* fix tokenizer

* fix tests

* add tokenizer tests

* add integration test

* quality

* fix model name

* fix test

* doc

* doc

* fix doc

* add copied from statements

* fix tokenizer tests

* apply review suggestions

* fix urls

* fix shift_tokens_right

* apply review suggestions

* fix

* fix doc

* add lang code to id

* remove unused function

* update checkpoint names

* fix copy

* fix tokenizer

* fix checkpoint names

* fix merge issue

* style
2021-03-06 22:14:16 +05:30
Lysandre Debut
0c2325198f
Add I-BERT to README (#10462) 2021-03-01 12:12:31 -05:00
Lysandre Debut
cd8c4c3fc2
DeBERTa-v2 fixes (#10328)
Co-authored-by: Pengcheng He <penhe@microsoft.com>

Co-authored-by: Pengcheng He <penhe@microsoft.com>
2021-02-22 07:45:18 -05:00
Pengcheng He
9a7e63729f
Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018)
* Integrate DeBERTa v2(the 1.5B model surpassed human performance on SuperGLUE); Add DeBERTa v2 900M,1.5B models;

* DeBERTa-v2

* Fix v2 model loading issue (#10129)

* Doc members

* Update src/transformers/models/deberta/modeling_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Address Sylvain's comments

* Address Patrick's comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-19 18:34:44 -05:00
Suraj Patil
6fc940ed09
Add mBART-50 (#10154)
* add tokenizer for mBART-50

* update tokenizers

* make src_lang and tgt_lang optional

* update tokenizer test

* add setter

* update docs

* update conversion script

* update docs

* update conversion script

* update tokenizer

* update test

* update docs

* doc

* address Sylvain's suggestions

* fix test

* fix formatting

* nits
2021-02-15 20:58:54 +05:30
Sylvain Gugger
6710d1d5ef Typo fix 2021-02-11 15:12:35 -05:00
yylun
5442a11f5f
fix steps_in_epoch variable in trainer when using max_steps (#9969)
* fix steps_in_epoch variable when using max_steps

* redundant sentence

* Revert "redundant sentence"

This reverts commit ad5c0e9b6e.

* remove redundant sentence

Co-authored-by: wujindou <wujindou@sogou-inc.com>
2021-02-03 09:30:37 -05:00
Patrick von Platen
d6217fb30c
Wav2Vec2 (#9659)
* add raw scaffold

* implement feat extract layers

* make style

* remove +

* correctly convert weights

* make feat extractor work

* make feature extraction proj work

* run forward pass

* finish forward pass

* Succesful decoding example

* remove unused files

* more changes

* add wav2vec tokenizer

* add new structure

* fix run forward

* add other layer norm architecture

* finish 2nd structure

* add model tests

* finish tests for tok and model

* clean-up

* make style

* finish docstring for model and config

* make style

* correct docstring

* correct tests

* change checkpoints to fairseq

* fix examples

* finish wav2vec2

* make style

* apply sylvains suggestions

* apply lysandres suggestions

* change print to log.info

* re-add assert statement

* add input_values as required input name

* finish wav2vec2 tokenizer

* Update tests/test_tokenization_wav2vec2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* apply sylvains suggestions

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-02 15:52:10 +03:00
Stas Bekman
15e4ce353a
[docs] expand install instructions (#9817)
* expand install instructions

* fix

* white space

* rewrite as discussed in the PR

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* change the wording to encourage issue report

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-28 09:36:46 -08:00
Stefan Schweter
5ed5a54684
ADD BORT (#9813)
* tests: add integration tests for new Bort model

* bort: add conversion script from Gluonnlp to Transformers 🚀

* bort: minor cleanup (BORT -> Bort)

* add docs

* make fix-copies

* clean doc a bit

* correct docs

* Update docs/source/model_doc/bort.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/bort.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* correct dialogpt doc

* correct link

* Update docs/source/model_doc/bort.rst

* Update docs/source/model_doc/dialogpt.rst

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-27 21:25:11 +03:00
abhishek thakur
f617490e71
ConvBERT Model (#9717)
* finalize convbert

* finalize convbert

* fix

* fix

* fix

* push

* fix

* tf image patches

* fix torch model

* tf tests

* conversion

* everything aligned

* remove print

* tf tests

* fix tf

* make tf tests pass

* everything works

* fix init

* fix

* special treatment for sepconv1d

* style

* 🙏🏽

* add doc and cleanup

* add electra test again

* fix doc

* fix doc again

* fix doc again

* Update src/transformers/modeling_tf_pytorch_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/conv_bert/configuration_conv_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update docs/source/model_doc/conv_bert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/conv_bert/configuration_conv_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* conv_bert -> convbert

* more fixes from review

* add conversion script

* dont use pretrained embed

* unused config

* suggestions from julien

* some more fixes

* p -> param

* fix copyright

* fix doc

* Update src/transformers/models/convbert/configuration_convbert.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* comments from reviews

* fix-copies

* fix style

* revert shape_list

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-27 03:20:09 -05:00
Lysandre
7d9a9d0c72 Release: v4.2.0 2021-01-13 16:01:51 +01:00
Patrick von Platen
9e1ea846bc
[README] Add new models (#9465)
* add new models

* make fix-copies
2021-01-08 05:49:43 -05:00
Clement
4eec5d0cf6
improve readme text to private models/versioning/api (#9424) 2021-01-05 15:02:46 -05:00
Sylvain Gugger
6d2e864db7
Put all models in the constants (#9170)
* Put all models in the constants

* Add Google AI mention in the main README
2020-12-17 11:23:21 -05:00
NielsRogge
1551e2dc6d
[WIP] Tapas v4 (tres) (#9117)
* First commit: adding all files from tapas_v3

* Fix multiple bugs including soft dependency and new structure of the library

* Improve testing by adding torch_device to inputs and adding dependency on scatter

* Use Python 3 inheritance rather than Python 2

* First draft model cards of base sized models

* Remove model cards as they are already on the hub

* Fix multiple bugs with integration tests

* All model integration tests pass

* Remove print statement

* Add test for convert_logits_to_predictions method of TapasTokenizer

* Incorporate suggestions by Google authors

* Fix remaining tests

* Change position embeddings sizes to 512 instead of 1024

* Comment out positional embedding sizes

* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES

* Added more model names

* Fix truncation when no max length is specified

* Disable torchscript test

* Make style & make quality

* Quality

* Address CI needs

* Test the Masked LM model

* Fix the masked LM model

* Truncate when overflowing

* More much needed docs improvements

* Fix some URLs

* Some more docs improvements

* Test PyTorch scatter

* Set to slow + minify

* Calm flake8 down

* First commit: adding all files from tapas_v3

* Fix multiple bugs including soft dependency and new structure of the library

* Improve testing by adding torch_device to inputs and adding dependency on scatter

* Use Python 3 inheritance rather than Python 2

* First draft model cards of base sized models

* Remove model cards as they are already on the hub

* Fix multiple bugs with integration tests

* All model integration tests pass

* Remove print statement

* Add test for convert_logits_to_predictions method of TapasTokenizer

* Incorporate suggestions by Google authors

* Fix remaining tests

* Change position embeddings sizes to 512 instead of 1024

* Comment out positional embedding sizes

* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES

* Added more model names

* Fix truncation when no max length is specified

* Disable torchscript test

* Make style & make quality

* Quality

* Address CI needs

* Test the Masked LM model

* Fix the masked LM model

* Truncate when overflowing

* More much needed docs improvements

* Fix some URLs

* Some more docs improvements

* Add add_pooling_layer argument to TapasModel

Fix comments by @sgugger and @patrickvonplaten

* Fix issue in docs + fix style and quality

* Clean up conversion script and add task parameter to TapasConfig

* Revert the task parameter of TapasConfig

Some minor fixes

* Improve conversion script and add test for absolute position embeddings

* Improve conversion script and add test for absolute position embeddings

* Fix bug with reset_position_index_per_cell arg of the conversion cli

* Add notebooks to the examples directory and fix style and quality

* Apply suggestions from code review

* Move from `nielsr/` to `google/` namespace

* Apply Sylvain's comments

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-12-15 17:08:49 -05:00
StillKeepTry
df2af6d8b8 Add MP Net 2 (#9004) 2020-12-09 10:32:43 -05:00
Sylvain Gugger
00aa9dbca2
Copyright (#8970)
* Add copyright everywhere missing

* Style
2020-12-07 18:36:34 -05:00
Clement
de6befd41f
Remove sourcerer (#8965) 2020-12-07 11:15:29 -05:00
Lysandre Debut
0c5615af66
Put Transformers on Conda (#8918)
* conda

* Guide

* correct tag

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/installation.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Sylvain's comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-12-03 14:28:49 -05:00
Julien Chaumond
9ad6194318
Tweak wording + Add badge w/ number of models on the hub (#8914)
* Add badge w/ number of models on the hub

* try to apease @sgugger 😇

* not sure what this `c` was about [ci skip]

* Fix script and move stuff around

* Fix doc styling error

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2020-12-03 10:56:55 -05:00
Devangi Purkayastha
e52f9c0ade
Update README.md (#8906) 2020-12-02 09:28:44 -08:00
Sylvain Gugger
75f8100fc7
Add a direct link to the big table (#8850) 2020-11-30 10:29:23 -05:00
Moussa Kamal Eddine
81fe0bf085
Add barthez model (#8393)
* Add init barthez

* Add barthez model, tokenizer and docs

BARThez is a pre-trained french seq2seq model that uses BART objective.

* Apply suggestions from code review docs typos

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add license

* Change URLs scheme

* Remove barthez model keep tokenizer

* Fix style

* Fix quality

* Update tokenizer

* Add fast tokenizer

* Add fast tokenizer test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-27 12:31:42 -05:00
Kevin Canwen Xu
94caaa93c2
Update the bibtex with EMNLP demo (#8678)
* Update the bibtex with EMNLP demo

* Update README.md

* Update README.md
2020-11-20 13:26:33 +08:00
Stas Bekman
06518404cb
revert 2020-11-19 12:12:46 -08:00
Stas Bekman
297a29382f
Please fix your software not to ping master
You may be unaware but you're running some software that meddles with every commit on https://github.com/huggingface/transformers/

Something is wrong with the software you're using. It adds a reference to almost every PR in the master tree. Which is very wrong. Please check your software and please don't do it again.

Example:
see the bottom of this PR and most other PRs:
https://github.com/huggingface/transformers/pull/8639
2020-11-19 12:11:35 -08:00
Patrick von Platen
5104223552
[MT5] More docs (#8589)
* add docs

* make style
2020-11-17 12:47:57 +01:00
Sylvain Gugger
08f534d2da
Doc styling (#8067)
* Important files

* Styling them all

* Revert "Styling them all"

This reverts commit 7d029395fd.

* Syling them for realsies

* Fix syntax error

* Fix benchmark_utils

* More fixes

* Fix modeling auto and script

* Remove new line

* Fixes

* More fixes

* Fix more files

* Style

* Add FSMT

* More fixes

* More fixes

* More fixes

* More fixes

* Fixes

* More fixes

* More fixes

* Last fixes

* Make sphinx happy
2020-10-26 18:26:02 -04:00
Lysandre
eb0e0ce2ad Release: v3.4.0 2020-10-20 16:22:26 +02:00
Weizhen
2422cda01b
ProphetNet (#7157)
* add new model prophetnet

prophetnet modified

modify codes as suggested v1

add prophetnet test files

* still bugs, because of changed output formats of encoder and decoder

* move prophetnet into the latest version

* clean integration tests

* clean tokenizers

* add xlm config to init

* correct typo in init

* further refactoring

* continue refactor

* save parallel

* add decoder_attention_mask

* fix use_cache vs. past_key_values

* fix common tests

* change decoder output logits

* fix xlm tests

* make common tests pass

* change model architecture

* add tokenizer tests

* finalize model structure

* no weight mapping

* correct n-gram stream attention mask as discussed with qweizhen

* remove unused import

* fix index.rst

* fix tests

* delete unnecessary code

* add fast integration test

* rename weights

* final weight remapping

* save intermediate

* Descriptions for Prophetnet Config File

* finish all models

* finish new model outputs

* delete unnecessary files

* refactor encoder layer

* add dummy docs

* code quality

* fix tests

* add model pages to doctree

* further refactor

* more refactor, more tests

* finish code refactor and tests

* remove unnecessary files

* further clean up

* add docstring template

* finish tokenizer doc

* finish prophetnet

* fix copies

* fix typos

* fix tf tests

* fix fp16

* fix tf test 2nd try

* fix code quality

* add test for each model

* merge new tests to branch

* Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update src/transformers/modeling_prophetnet.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update utils/check_repo.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* apply sams and sylvains comments

* make style

* remove unnecessary code

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/configuration_prophetnet.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* implement lysandres comments

* correct docs

* fix isort

* fix tokenizers

* fix copies

Co-authored-by: weizhen <weizhen@mail.ustc.edu.cn>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-19 17:36:09 +02:00
Terencio Agozzino
7e6b6fbec9
style: fix typo in the README (#7882) 2020-10-19 08:43:25 -04:00
Sylvain Gugger
a3cea6a8cc
Better links for models in READMED and doc index (#7680) 2020-10-09 11:17:16 -04:00
sgugger
bc00b37a0d Revert "Better model links in the README and index"
This reverts commit 76e05518bb.
2020-10-09 10:56:13 -04:00
sgugger
76e05518bb Better model links in the README and index 2020-10-09 10:54:40 -04:00
Forrest Iandola
02ef825be2
SqueezeBERT architecture (#7083)
* configuration_squeezebert.py

thin wrapper around bert tokenizer

fix typos

wip sb model code

wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working

set up squeezebert to use BertModelOutput when returning results.

squeezebert documentation

formatting

allow head mask that is an array of [None, ..., None]

docs

docs cont'd

path to vocab

docs and pointers to cloud files (WIP)

line length and indentation

squeezebert model cards

formatting of model cards

untrack modeling_squeezebert_scratchpad.py

update aws paths to vocab and config files

get rid of stub of NSP code, and advise users to pretrain with mlm only

fix rebase issues

redo rebase of modeling_auto.py

fix issues with code formatting

more code format auto-fixes

move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert

tests for squeezebert modeling and tokenization

fix typo

move squeezebert before bert in modeling_auto.py to fix inheritance problem

disable test_head_masking, since squeezebert doesn't yet implement head masking

fix issues exposed by the test_modeling_squeezebert.py

fix an issue exposed by test_tokenization_squeezebert.py

fix issue exposed by test_modeling_squeezebert.py

auto generated code style improvement

issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()

update copyright

resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask

docs

add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli

autogenerated formatting tweaks

integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings

* tiny change to order of imports
2020-10-05 04:25:43 -04:00
Akshay Gupta
381443c096
Update README.md (#7498)
Making transformers readme more robust.
2020-10-01 07:42:07 -04:00
Sylvain Gugger
dc7d2daa4c
Alphabetize model lists (#7478) 2020-09-30 10:43:58 -04:00
Pengcheng He
7a0cf0ec93
Add DeBERTa model (#5929)
* Add DeBERTa model

* Remove dependency of deberta

* Address comments

* Patch DeBERTa
Documentation
Style

* Add final tests

* Style

* Enable tests + nitpicks

* position IDs

* BERT -> DeBERTa

* Quality

* Style

* Tokenization

* Last updates.

* @patrickvonplaten's comments

* Not everything can be a copy

* Apply most of @sgugger's review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Last reviews

* DeBERTa -> Deberta

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-30 07:07:30 -04:00
Sylvain Gugger
f1220c5fe2
Add a code of conduct (#7433) 2020-09-29 13:38:47 -04:00
Minghao Li
cd9a0585ea
Add LayoutLM Model (#7064)
* first version

* finish test docs readme model/config/tokenization class

* apply make style and make quality

* fix layoutlm GitHub link

* fix conflict in index.rst and add layoutlm to pretrained_models.rst

* fix bug in test_parents_and_children_in_mappings

* reformat modeling_auto.py and tokenization_auto.py

* fix bug in test_modeling_layoutlm.py

* Update docs/source/model_doc/layoutlm.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/layoutlm.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove inh, add tokenizer fast, and update some doc

* copy and rename necessary class from modeling_bert to modeling_layoutlm

* Update src/transformers/configuration_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/configuration_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/configuration_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/configuration_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* add mish to activations.py, import ACT2FN and import logging from utils

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-22 09:28:02 -04:00
Manuel Romero
a4faeceaed
Fix typo in model name (#7268) 2020-09-20 19:12:30 +02:00
Sameer Zahid
5c1d5ea667
Fixed typo in README (#7233) 2020-09-18 04:52:43 -04:00
Sylvain Gugger
108c9aefcc
Update README (#7133)
* Rewrite and update README

* Typo and migration guide

* Apply suggestions from code review

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Address Clem's comments

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-09-16 12:12:12 -04:00
Sylvain Gugger
d155b38d6e
Funnel transformer (#6908)
* Initial model

* Fix upsampling

* Add special cls token id and test

* Formatting

* Test and fist FunnelTokenizerFast

* Common tests

* Fix the check_repo script and document Funnel

* Doc fixes

* Add all models

* Write doc

* Fix test

* Initial model

* Fix upsampling

* Add special cls token id and test

* Formatting

* Test and fist FunnelTokenizerFast

* Common tests

* Fix the check_repo script and document Funnel

* Doc fixes

* Add all models

* Write doc

* Fix test

* Fix copyright

* Forgot some layers can be repeated

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/modeling_funnel.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Update src/transformers/modeling_funnel.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments

* Update src/transformers/modeling_funnel.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Slow integration test

* Make small integration test

* Formatting

* Add checkpoint and separate classification head

* Formatting

* Expand list, fix link and add in pretrained models

* Styling

* Add the model in all summaries

* Typo fixes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-08 08:08:08 -04:00
Antonio V Mendoza
ea2c6f1afc
Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793)
* added template files for LXMERT and competed the configuration_lxmert.py

* added modeling, tokization, testing, and finishing touched for lxmert [yet to be tested]

* added model card for lxmert

* cleaning up lxmert code

* Update src/transformers/modeling_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* tested torch lxmert, changed documtention, updated outputs, and other small fixes

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* renaming, other small issues, did not change TF code in this commit

* added lxmert question answering model in pytorch

* added capability to edit number of qa labels for lxmert

* made answer optional for lxmert question answering

* add option to return hidden_states for lxmert

* changed default qa labels for lxmert

* changed config archive path

* squshing 3 commits: merged UI + testing improvments + more UI and testing

* changed some variable names for lxmert

* TF LXMERT

* Various fixes to LXMERT

* Final touches to LXMERT

* AutoTokenizer order

* Add LXMERT to index.rst and README.md

* Merge commit test fixes + Style update

* TensorFlow 2.3.0 sequential model changes variable names

Remove inherited test

* Update src/transformers/modeling_tf_pytorch_utils.py

* Update docs/source/model_doc/lxmert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/lxmert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* added suggestions

* Fixes

* Final fixes for TF model

* Fix docs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-03 04:02:25 -04:00
Julien Chaumond
3242e4d942 [model_cards] Fix tiny typos 2020-08-26 23:16:06 +02:00
Sam Shleifer
f230a64094
new paper bibtex (#6656) 2020-08-23 10:03:41 -04:00
Suraj Patil
c9564f5343
[Doc] add more MBart and other doc (#6490)
* add mbart example

* add Pegasus and MBart in readme

* typo

* add MBart in Pretrained models

* add pre-proc doc

* add DPR in readme

* fix indent

* doc fix
2020-08-17 12:30:26 -04:00
Clement
54f49af4ae
Add inference widget examples (#5825) 2020-07-28 09:14:00 -04:00
Clement
2513fe0d02
added subtitle for recent contributors in readme (#5130) 2020-06-29 09:05:08 -04:00
Thomas Wolf
601d4d699c
[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308)
* remove references to old API in docstring - update data processors

* style

* fix tests - better type checking error messages

* better type checking

* include awesome fix by @LysandreJik for #5310

* updated doc and examples
2020-06-26 19:48:14 +02:00
Sylvain Gugger
24f46ea3f3
Remove links for all docs (#5280) 2020-06-25 11:45:05 -04:00
Sylvain Gugger
c439752482
Switch master/stable doc and add older releases (#5193) 2020-06-22 16:38:53 -04:00
Tim Suchanek
68e19f1c22
Fix typo in root README (#5073) 2020-06-20 23:00:04 +08:00
Sylvain Gugger
e4aaa45805
Update pipeline examples to doctest syntax (#5030) 2020-06-16 18:14:58 -04:00
Lysandre Debut
88762a2f8c
Specify PyTorch versions for examples (#4710) 2020-06-02 04:29:28 -04:00
Lysandre Debut
6a17688021
per_device instead of per_gpu/error thrown when argument unknown (#4618)
* per_device instead of per_gpu/error thrown when argument unknown

* [docs] Restore examples.md symlink

* Correct absolute links so that symlink to the doc works correctly

* Update src/transformers/hf_argparser.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Warning + reorder

* Docs

* Style

* not for squad

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-27 11:36:55 -04:00
Iz Beltagy
8f1d047148
Longformer (#4352)
* first commit

* bug fixes

* better examples

* undo padding

* remove wrong VOCAB_FILES_NAMES

* License

* make style

* make isort happy

* unit tests

* integration test

* make `black` happy by undoing `isort` changes!!

* lint

* no need for the padding value

* batch_size not bsz

* remove unused type casting

* seqlen not seq_len

* staticmethod

* `bert` selfattention instead of `n2`

* uint8 instead of bool + lints

* pad inputs_embeds using embeddings not a constant

* black

* unit test with padding

* fix unit tests

* remove redundant unit test

* upload model weights

* resolve todo

* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_

* increase unittest coverage
2020-05-19 16:04:43 +02:00
Sam Shleifer
3487be75ef
[Marian] documentation and AutoModel support (#4152)
- MarianSentencepieceTokenizer - > MarianTokenizer
- Start using unk token.
- add docs page
- add better generation params to MarianConfig
- more conversion utilities
2020-05-10 13:54:57 -04:00
Julien Chaumond
c99fe0386b [doc] Fix broken links + remove crazy big notebook 2020-05-07 18:44:18 -04:00
Julien Chaumond
0ae96ff8a7 BIG Reorganize examples (#4213)
* Created using Colaboratory

* [examples] reorganize files

* remove run_tpu_glue.py as superseded by TPU support in Trainer

* Bugfix: int, not tuple

* move files around
2020-05-07 13:48:44 -04:00
Patrick von Platen
dca34695d0
Reformer (#3351)
* first copy & past commit from Bert and morgans LSH code

* add easy way to compare to trax original code

* translate most of function

* make trax lsh self attention deterministic with numpy seed + copy paste code

* add same config

* add same config

* make layer init work

* implemented hash_vectors function for lsh attention

* continue reformer translation

* hf LSHSelfAttentionLayer gives same output as trax layer

* refactor code

* refactor code

* refactor code

* refactor

* refactor + add reformer config

* delete bogus file

* split reformer attention layer into two layers

* save intermediate step

* save intermediate step

* make test work

* add complete reformer block layer

* finish reformer layer

* implement causal and self mask

* clean reformer test and refactor code

* fix merge conflicts

* fix merge conflicts

* update init

* fix device for GPU

* fix chunk length init for tests

* include morgans optimization

* improve memory a bit

* improve comment

* factorize num_buckets

* better testing parameters

* make whole model work

* make lm model work

* add t5 copy paste tokenizer

* add chunking feed forward

* clean config

* add improved assert statements

* make tokenizer work

* improve test

* correct typo

* extend config

* add complexer test

* add new axial position embeddings

* add local block attention layer

* clean tests

* refactor

* better testing

* save intermediate progress

* clean test file

* make shorter input length work for model

* allow variable input length

* refactor

* make forward pass for pretrained model work

* add generation possibility

* finish dropout and init

* make style

* refactor

* add first version of RevNet Layers

* make forward pass work and add convert file

* make uploaded model forward pass work

* make uploaded model forward pass work

* refactor code

* add namedtuples and cache buckets

* correct head masks

* refactor

* made reformer more flexible

* make style

* remove set max length

* add attention masks

* fix up tests

* fix lsh attention mask

* make random seed optional for the moment

* improve memory in reformer

* add tests

* make style

* make sure masks work correctly

* detach gradients

* save intermediate

* correct backprob through gather

* make style

* change back num hashes

* rename to labels

* fix rotation shape

* fix detach

* update

* fix trainer

* fix backward dropout

* make reformer more flexible

* fix conflict

* fix

* fix

* add tests for fixed seed in reformer layer

* fix trainer typo

* fix typo in activations

* add fp16 tests

* add fp16 training

* support fp16

* correct gradient bug in reformer

* add fast gelu

* re-add dropout for embedding dropout

* better naming

* better naming

* renaming

* finalize test branch

* finalize tests

* add more tests

* finish tests

* fix

* fix type trainer

* fix fp16 tests

* fix tests

* fix tests

* fix tests

* fix issue with dropout

* fix dropout seeds

* correct random seed on gpu

* finalize random seed for dropout

* finalize random seed for dropout

* remove duplicate line

* correct half precision bug

* make style

* refactor

* refactor

* docstring

* remove sinusoidal position encodings for reformer

* move chunking to modeling_utils

* make style

* clean config

* make style

* fix tests

* fix auto tests

* pretrained models

* fix docstring

* update conversion file

* Update pretrained_models.rst

* fix rst

* fix rst

* update copyright

* fix test path

* fix test path

* fix small issue in test

* include reformer in generation tests

* add docs for axial position encoding

* finish docs

* Update convert_reformer_trax_checkpoint_to_pytorch.py

* remove isort

* include sams comments

* remove wrong comment in utils

* correct typos

* fix typo

* Update reformer.rst

* applied morgans optimization

* make style

* make gpu compatible

* remove bogus file

* big test refactor

* add example for chunking

* fix typo

* add to README
2020-05-07 10:17:01 +02:00
Clement
877fc56410
change order pytorch/tf in readme (#4167) 2020-05-06 16:31:07 -04:00
Jared T Nielsen
64070cbb88
Fix TF input docstrings to refer to tf.Tensor rather than torch.FloatTensor. (#4051) 2020-04-30 14:28:56 +02:00
Clement
6ba254ee54
quick fix wording readme for community models (#3900) 2020-04-23 14:19:45 -04:00
Julien Chaumond
dd9d483d03
Trainer (#3800)
* doc

* [tests] Add sample files for a regression task

* [HUGE] Trainer

* Feedback from @sshleifer

* Feedback from @thomwolf + logging tweak

* [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes

* [glue] Use default max_seq_length of 128 like before

* [glue] move DataTrainingArguments around

* [ner] Change interface of InputExample, and align run_{tf,pl}

* Re-align the pl scripts a little bit

* ner

* [ner] Add integration test

* Fix language_modeling with API tweak

* [ci] Tweak loss target

* Don't break console output

* amp.initialize: model must be on right device before

* [multiple-choice] update for Trainer

* Re-align to 827d6d6ef0
2020-04-21 20:11:56 -04:00
Patrick von Platen
a21d4fa410
add "by" to ReadMe 2020-04-18 18:07:17 +02:00
Patrick von Platen
d22894dfd4
[Docs] Add DialoGPT (#3755)
* add dialoGPT

* update README.md

* fix conflict

* update readme

* add code links to docs

* Update README.md

* Update dialo_gpt2.rst

* Update pretrained_models.rst

* Update docs/source/model_doc/dialo_gpt2.rst

Co-Authored-By: Julien Chaumond <chaumond@gmail.com>

* change filename of dialogpt

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-04-16 09:04:32 +02:00
Julien Chaumond
cbad305ce6
[docs] The use of do_lower_case in scripts is on its way to deprecation (#3738) 2020-04-10 12:34:04 -04:00
Julien Chaumond
83703cd077 Update doc for {Summarization,Translation}Pipeline and other tweaks 2020-04-08 09:45:00 -04:00
Lysandre Debut
d5d7d88612
ELECTRA (#3257)
* Electra wip

* helpers

* Electra wip

* Electra v1

* ELECTRA may be saved/loaded

* Generator & Discriminator

* Embedding size instead of halving the hidden size

* ELECTRA Tokenizer

* Revert BERT helpers

* ELECTRA Conversion script

* Archive maps

* PyTorch tests

* Start fixing tests

* Tests pass

* Same configuration for both models

* Compatible with base + large

* Simplification + weight tying

* Archives

* Auto + Renaming to standard names

* ELECTRA is uncased

* Tests

* Slight API changes

* Update tests

* wip

* ElectraForTokenClassification

* temp

* Simpler arch + tests

Removed ElectraForPreTraining which will be in a script

* Conversion script

* Auto model

* Update links to S3

* Split ElectraForPreTraining and ElectraForTokenClassification

* Actually test PreTraining model

* Remove num_labels from configuration

* wip

* wip

* From discriminator and generator to electra

* Slight API changes

* Better naming

* TensorFlow ELECTRA tests

* Accurate conversion script

* Added to conversion script

* Fast ELECTRA tokenizer

* Style

* Add ELECTRA to README

* Modeling Pytorch Doc + Real style

* TF Docs

* Docs

* Correct links

* Correct model intialized

* random fixes

* style

* Addressing Patrick's and Sam's comments

* Correct links in docs
2020-04-03 14:10:54 -04:00
Thomas Wolf
2187c49f5c
CPU/GPU memory benchmarking utilities - Remove support for python 3.5 (now only 3.6+) (#3186)
* memory benchmark rss

* have both forward pass and line-by-line mem tracing

* cleaned up tracing

* refactored and cleaning up API

* no f-strings yet...

* add GPU mem logging

* fix GPU memory monitoring

* style and quality

* clean up and doc

* update with comments

* Switching to python 3.6+

* fix quality
2020-03-17 10:17:11 -04:00
Sam Shleifer
087465b943
add BART to README (#3255) 2020-03-12 19:38:05 -04:00
Julien Chaumond
d6de6423ba [doc] --organization tweak
Co-Authored-By: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-03-10 16:52:44 -04:00
Julien Chaumond
0e56dc3078 [doc] Document the new --organization flag of CLI 2020-03-10 16:42:01 -04:00
Santiago Castro
976e9afece Add syntax highlighting to the BibTeX in README 2020-02-20 10:06:15 -05:00
Lysandre
59c23ad9c9 README link + better instructions for release 2020-02-19 11:57:17 -05:00
VictorSanh
ee5a6856ca distilbert-base-cased weights + Readmes + omissions 2020-02-07 15:28:13 -05:00
Clement
c069932f5d Add contributors snapshot
powered by https://github.com/sourcerer-io/hall-of-fame
2020-02-06 15:25:47 -05:00
Julien Chaumond
eae8ee0389 [doc] model sharing: mention README.md + tweaks
cc @lysandrejik @thomwolf
2020-02-05 14:20:03 -05:00
Arnaud
3a21d6da6b Typo on markdown link in README.md 2020-01-31 10:58:49 -05:00
Lysandre
0aa40e9569 v2.4.0 documentation 2020-01-31 09:55:34 -05:00
Julien Chaumond
9fa836a73f
fill_mask helper (#2576)
* fill_mask helper

* [poc] FillMaskPipeline

* Revert "[poc] FillMaskPipeline"

This reverts commit 67eeea55b0.

* Revert "fill_mask helper"

This reverts commit cacc17b884.

* README: clarify that Pipelines can also do text-classification

cf. question at the AI&ML meetup last week, @mfuntowicz

* Fix test: test feature-extraction pipeline

* Test tweaks

* Slight refactor of existing pipeline (in preparation of new FillMaskPipeline)

* Extraneous doc

* More robust way of doing this

@mfuntowicz as we don't rely on the model name anymore (see AutoConfig)

* Also add RobertaConfig as a quickfix for wrong token_type_ids

* cs

* [BIG] FillMaskPipeline
2020-01-30 18:15:42 -05:00
Hang Le
f0a4fc6cd6 Add Flaubert 2020-01-30 10:04:18 -05:00
Julien Chaumond
119dc50e2a Doc tweak on model sharing 2020-01-22 22:40:38 -05:00
alberduris
81d6841b4b GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
alberduris
dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
Julien Chaumond
78528742f1 Fix syntax + link to community page 2020-01-05 12:43:39 -05:00
Clement
12e0aa4368 Proposition to include community models in readme 2020-01-05 12:37:11 -05:00
Julien Chaumond
9b2badf3c9 [cli] Update doc 2019-12-27 22:54:29 -05:00
Aymeric Augustin
3233b58ad4 Quote square brackets in shell commands.
This ensures compatibility with zsh.

Fix #2316.
2019-12-27 08:50:25 +01:00
Aymeric Augustin
a8d34e534e Remove [--editable] in install instructions.
Use -e only in docs targeted at contributors.

If a user copy-pastes  command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forwards, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin
70373a5f7c Update contribution instructions.
Also provide shortcuts in a Makefile.
2019-12-23 21:05:30 +01:00
Aymeric Augustin
45841eaf7b Remove references to Python 2 in documentation. 2019-12-22 18:38:56 +01:00
Aymeric Augustin
b6ea0f43ae Remove duplicate -v flag. 2019-12-22 17:47:27 +01:00
Aymeric Augustin
ced0a94204 Switch test files to the standard test_*.py scheme. 2019-12-22 14:15:13 +01:00
Aymeric Augustin
067395d5c5 Move tests outside of library. 2019-12-22 13:47:17 +01:00
Aymeric Augustin
698f9e3d7a Remove trailing whitespace in README. 2019-12-22 13:29:58 +01:00
thomwolf
1ab25c49d3 Merge branch 'master' into pr/2115 2019-12-21 14:54:30 +01:00
Thomas Wolf
6e7102cfb3
Merge pull request #2203 from gthb/patch-1
fix: wrong architecture count in README
2019-12-21 14:31:44 +01:00
Lysandre
a436574bfd Release: v2.3.0 2019-12-20 16:22:20 -05:00
thomwolf
71883b6ddc update link in readme 2019-12-20 19:40:23 +01:00
Morgan Funtowicz
b98ff88544 Added pipelines quick tour in README 2019-12-20 15:52:50 +01:00
Stefan Schweter
3e89fca543 readme: add XLM-RoBERTa to model architecture list 2019-12-18 19:44:23 +01:00
Gunnlaugur Thor Briem
d303f84e7b
fix: wrong architecture count in README
Just say “the following” so that this intro doesn't so easily fall out of date :) )
2019-12-17 16:18:00 +00:00
Julien Chaumond
3f5ccb183e [doc] Clarify uploads
cf 855ff0e91d (commitcomment-36452545)
2019-12-16 18:20:29 -05:00
Julien Chaumond
855ff0e91d [doc] Model upload and sharing
ping @lysandrejik @thomwolf

Is this clear enough? Anything we should add?
2019-12-16 12:42:22 -05:00
Thomas Wolf
e92bcb7eb6
Merge pull request #1739 from huggingface/t5
[WIP] Adding Google T5 model
2019-12-14 09:40:43 +01:00
Lysandre
7bd11dda6f Release: v2.2.2 2019-12-13 16:45:30 -05:00
thomwolf
0558c9cb9b Merge branch 'master' into t5 2019-12-10 12:58:48 +01:00
Suvrat Bhooshan
df3961121f Add MMBT Model to Transformers Repo 2019-12-09 18:36:48 -08:00
Pierric Cistac
5c877fe94a
fix albert links 2019-12-09 18:53:00 -05:00
Aymeric Augustin
35401fe50f Remove dependency on pytest for running tests (#2055)
* Switch to plain unittest for skipping slow tests.

Add a RUN_SLOW environment variable for running them.

* Switch to plain unittest for PyTorch dependency.

* Switch to plain unittest for TensorFlow dependency.

* Avoid leaking open files in the test suite.

This prevents spurious warnings when running tests.

* Fix unicode warning on Python 2 when running tests.

The warning was:

    UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal

* Support running PyTorch tests on a GPU.

Reverts 27e015bd.

* Tests no longer require pytest.

* Make tests pass on cuda
2019-12-06 13:57:38 -05:00
VictorSanh
552c44a9b1 release distilm-bert 2019-12-05 10:14:58 -05:00
LysandreJik
8101924a68 Patch: v2.2.1 2019-12-03 11:20:26 -05:00
Julien Chaumond
b5d884d25c Uniformize #1952 2019-11-27 11:05:55 -05:00
Lysandre
cf26a0c85e Fix pretrained models table 2019-11-26 15:40:03 -05:00
Lysandre Debut
b632145273
Update master documentation link in README 2019-11-26 14:27:15 -05:00
Lysandre
ae98d45991 Release: v2.2.0 2019-11-26 14:12:44 -05:00
Julien Chaumond
176cd1ce1b [doc] homogenize instructions slightly 2019-11-23 11:18:54 -05:00
Rémi Louf
6f70bb8c69 add instructions to run the examples 2019-11-21 14:41:19 -05:00
Julien Chaumond
3916b334a8
[camembert] Acknowledge the full author list 2019-11-18 09:29:11 -05:00
Sebastian Stabinger
44455eb5b6 Adds CamemBERT to Model architectures list 2019-11-18 09:23:14 -05:00
Thomas Wolf
df99f8c5a1
Merge pull request #1832 from huggingface/memory-leak-schedulers
replace LambdaLR scheduler wrappers by function
2019-11-14 22:10:31 +01:00
Rémi Louf
2276bf69b7 update the examples, docs and template 2019-11-14 20:38:02 +01:00
thomwolf
8aba81a0b6 fix #1789 2019-11-12 08:52:43 +01:00
thomwolf
f03c0c1423 adding models in readme and auto classes 2019-11-08 11:49:46 +01:00
Lysandre
68f7064a3e Add model.train() line to ReadMe training example
Co-Authored-By: Santosh-Gupta <San.Gupta.ML@gmail.com>
2019-11-04 11:52:35 -05:00
Thomas Wolf
7f84fc571a
Merge pull request #1670 from huggingface/templates
Templates and explanation for adding a new model and example script
2019-10-30 17:05:58 +01:00
Thomas Wolf
5c6a19a94a
Merge pull request #1604 from huggingface/deploy_doc
Versioning in documentation
2019-10-30 17:03:14 +01:00
thomwolf
328a86d2af adding links to the templates in readme and contributing 2019-10-30 11:37:55 +01:00
Lysandre
b82bfbd0c3 Updated README to show all available documentation 2019-10-24 15:55:31 +00:00
Julien Chaumond
ef1b8b2ae5 [CTRL] warn if generation prompt does not start with a control code
see also https://github.com/salesforce/ctrl/pull/50
2019-10-22 21:30:32 +00:00
Julián Peller (dataista)
e16d46843a Fix architectures count 2019-10-22 15:13:47 -04:00
thomwolf
4d456542e9 Fix citation 2019-10-21 16:34:14 +02:00
Lysandre Debut
c544194611
Remove special_tokens_mask from inputs in README
Co-authored-by: Thomas Wolf @thomwolf
2019-10-16 11:05:13 -04:00
Emrah Budur
5a8c6e771a Fixed the sample code in the title 'Quick tour'. 2019-10-12 14:17:17 +03:00
thomwolf
4b8f3e8f32 adding citation 2019-10-11 16:18:16 +02:00
thomwolf
d9e60f4f0d Merge branch 'master' into pr/1383 2019-10-09 17:25:08 +02:00
Julien Chaumond
d688af19e5 Update link to swift-coreml-transformers
cc @lysandrejik
2019-10-08 16:37:52 -04:00
seanBE
6dc6c716c5 fix pytorch-transformers migration description in README 2019-10-07 09:59:54 +01:00
Christopher Goh
904158ac4d Rephrase forward method to reduce ambiguity 2019-10-06 23:40:52 -04:00
Christopher Goh
0f65d8cbbe Fix some typos in README 2019-10-06 23:40:52 -04:00
keskarnitish
dbed1c5d94 Adding CTRL (squashed commit)
adding conversion script

adding first draft of modeling & tokenization

adding placeholder for test files

bunch of changes

registering the tokenizer/model/etc

tests

change link; something is very VERY wrong here

weird end-of-word thingy going on

i think the tokenization works now ; wrote the unit tests

overall structure works;load w next

the monster is alive!

works after some cleanup as well

adding emacs autosave to gitignore

currently only supporting the 48 layer one; seems to infer fine on my macbook

cleanup

fixing some documentation

fixing some documentation

tests passing?

now works on CUDA also

adding greedy?

adding greedy sampling

works well
2019-10-03 22:29:03 -07:00
VictorSanh
35071007cb incoming release 🔥 update links to arxiv preprint 2019-10-03 10:27:11 -04:00
DenysNahurnyi
6971556ab8 Fix syntax typo in README.md 2019-10-01 14:59:31 -04:00
Santosh Gupta
5c3b32d44d Update README.md
Lines 183 - 200, fixed indentation. Line 198, replaced `tokenizer_class` with `BertTokenizer`, since `tokenizer_class` is not defined in the loop it belongs to.
2019-09-30 18:48:01 +00:00
wangfei
60f791631b Fix link in readme 2019-09-28 16:20:17 +08:00
BramVanroy
15749bfc10 Add small note about the output of hidden states 2019-09-27 10:01:36 +02:00
thomwolf
6c3b131516 typo in readme/doc 2019-09-26 16:23:28 +02:00
thomwolf
4e63c90720 update installation instructions in readme 2019-09-26 16:14:21 +02:00
Lysandre Debut
0f92f76ca3
CircleCI reference in README 2019-09-26 08:59:52 -04:00
thomwolf
9676d1a2a8 update readme and setup.py 2019-09-26 13:47:58 +02:00
thomwolf
4dde31cb76 update readme 2019-09-26 12:18:26 +02:00
thomwolf
4ddc31ff40 update readme with migration change 2019-09-26 12:00:38 +02:00
thomwolf
f47f7f4611 add logo 2019-09-26 11:28:44 +02:00
thomwolf
9fabc0b6a9 wip readme 2019-09-26 11:21:34 +02:00
thomwolf
31c23bd5ee [BIG] pytorch-transformers => transformers 2019-09-26 10:15:53 +02:00
Julien Chaumond
62760baf46 tiny fixes 2019-09-17 18:29:15 -04:00
Julien Chaumond
f9453d15e5
Fix broken link 2019-09-05 12:35:22 -04:00
Julien Chaumond
f7ee2e5d20 [README] link to Write With Transformer 2019-09-05 12:33:46 -04:00
Thomas Wolf
50e615f43d
Merge branch 'master' into improved_testing 2019-08-30 13:40:35 +02:00
thomwolf
306af132d7 update readme to mention add_special_tokens more clearly in example 2019-08-30 11:30:51 +02:00
LysandreJik
75bc2a03cc Updated article link 2019-08-28 10:05:15 -04:00
thomwolf
912a377e90 dilbert -> distilbert 2019-08-28 13:59:42 +02:00
thomwolf
4ce5f36f78 update readmes 2019-08-28 12:14:31 +02:00
VictorSanh
497f73c964 add DilBERT to master REAME 2019-08-28 07:16:30 +00:00
thomwolf
e00b4ff1de fix #1017 2019-08-21 22:22:17 +02:00
Nikolay Korolev
ad6e62cd82
Fix typo. configuratoin -> configuration 2019-08-20 15:43:06 +03:00
Christophe Bourguignat
189ff9b664 Update README after RoBERTa addition 2019-08-17 13:18:37 -04:00
LysandreJik
9d0029e215 Added RoBERTa example to README 2019-08-15 17:17:35 -04:00
Lysandre Debut
88efc65bac
Merge pull request #964 from huggingface/RoBERTa
RoBERTa: model conversion, inference, tests 🔥
2019-08-15 11:11:10 -04:00
Julien Chaumond
c4ef103447 [RoBERTa] First 4 authors
cf. https://github.com/huggingface/pytorch-transformers/pull/964#discussion_r313574354

Co-Authored-By: Myle Ott <myleott@fb.com>
2019-08-14 12:31:09 -04:00
carefree0910
a7b4cfe919 Update README.md
I assume that it should test the `re-load` functionality after testing the `save` functionality, however I'm also surprised that nobody points this out after such a long time, so maybe I've misunderstood the purpose. This PR is just in case :)
2019-08-12 09:53:05 -04:00
LysandreJik
d2cc6b101e Merge branch 'master' into RoBERTa 2019-08-08 09:42:05 -04:00
Christopher Goh
a6f412da01 Fixed typo in migration guide 2019-08-07 02:19:14 +08:00
Thomas Wolf
d43dc48b34
Merge branch 'master' into auto_models 2019-08-05 19:17:35 +02:00
thomwolf
7223886dc9 fix #944 2019-08-05 17:16:56 +02:00
thomwolf
58830807d1 inidicate we only support pytorch 1.0.0+ now 2019-08-05 14:38:59 +02:00
thomwolf
328afb7097 cleaning up tokenizer tests structure (at last) - last remaining ppb refs 2019-08-05 14:08:56 +02:00
Julien Chaumond
05c083520a [RoBERTa] model conversion, inference, tests 🔥 2019-08-04 21:39:21 -04:00
thomwolf
009273dbdd big doc update [WIP] 2019-08-04 12:14:57 +02:00
Julien Chaumond
44dd941efb link to swift-coreml-transformers 2019-08-01 09:50:30 -04:00
Anthony MOI
f2a3eb987e
Fix small typos 2019-07-31 11:05:06 -04:00
Pierric Cistac
97091acb8c
Small spelling fix 2019-07-31 10:37:56 -04:00
Grégory Châtel
769bb643ce Fixing a broken link. 2019-07-31 10:22:41 -04:00
Thomas Wolf
fec76a481d
Update readme 2019-07-23 16:05:29 +02:00
thomwolf
ba52fe69d5 update breaking change section regarding from_pretrained keyword arguments 2019-07-23 15:10:02 +02:00
rish-16
2f869dc665 Fixed typo 2019-07-21 11:05:36 -04:00
Thomas Wolf
dbecfcf321
Merge pull request #815 from praateekmahajan/update-readme-link
Update Readme link for Fine Tune/Usage section
2019-07-18 18:30:32 +02:00
Peiqin Lin
acc48a0cc9 typos 2019-07-18 09:54:04 -04:00
Praateek Mahajan
0d46b17553
Update Readme
Incorrect link for `Quick tour: Fine-tuning/usage scripts`
2019-07-17 22:50:10 -07:00
thomwolf
c5b3d86a91 Merge branch 'master' of https://github.com/huggingface/pytorch-pretrained-BERT 2019-07-16 21:21:05 +02:00
thomwolf
6b70760204 typos 2019-07-16 21:21:03 +02:00
Thomas Wolf
b33a385091
update readme 2019-07-16 16:18:37 +02:00
thomwolf
6a72d9aa52 updated examples in readme 2019-07-16 16:09:29 +02:00
thomwolf
b59043bf8f update readme 2019-07-16 16:03:48 +02:00
thomwolf
edc79acb3b simpler quick tour 2019-07-16 16:02:32 +02:00
thomwolf
5c82d3488f indicate default evaluation in breaking changes 2019-07-16 15:45:58 +02:00
thomwolf
4acaa65068 model in evaluation mode by default after from_pretrained 2019-07-16 15:41:57 +02:00
thomwolf
1849aa7d39 update readme and pretrained model weight files 2019-07-16 15:11:29 +02:00
thomwolf
43e0e8fa04 updates to readme and doc 2019-07-16 13:56:47 +02:00
thomwolf
352e3ff998 added migration guide to readme 2019-07-16 09:03:49 +02:00
thomwolf
8ad7e5b4f2 indeed 2019-07-16 00:29:15 +02:00
thomwolf
064d0a0b76 update readme 2019-07-16 00:21:33 +02:00
thomwolf
3b8b0e01bb update readme 2019-07-16 00:12:55 +02:00
thomwolf
2397f958f9 updating examples and doc 2019-07-14 23:20:10 +02:00
thomwolf
6135de2fa3 readme update 2019-07-11 15:39:49 +02:00
thomwolf
e468192e2f Merge branch 'pytorch-transformers' into xlnet 2019-07-09 17:05:37 +02:00
LysandreJik
ab30651802 Hugging Face theme. 2019-07-08 16:05:26 -04:00
thomwolf
eb91f6437e update readme and setup 2019-07-05 12:30:15 +02:00
thomwolf
0231ba291e circle-ci 2019-07-05 11:59:04 +02:00
thomwolf
0bab55d5d5 [BIG] name change 2019-07-05 11:55:36 +02:00
thomwolf
93e9971c54 fix tests 2019-06-26 10:02:45 +02:00
thomwolf
e55d4c4ede various updates to conversion, models and examples 2019-06-26 00:57:53 +02:00
thomwolf
603c513b35 update main conversion script and readme 2019-06-25 10:45:07 +02:00
thomwolf
62d78aa37e updating GLUE utils for compatibility with XLNet 2019-06-24 14:36:11 +02:00
thomwolf
c304593d8f BERTology details in readme 2019-06-20 10:05:06 +02:00
thomwolf
34d706a0e1 pruning in bertology 2019-06-19 15:25:49 +02:00
thomwolf
dc8e0019b7 updating examples 2019-06-19 13:23:20 +02:00
thomwolf
68ab9599ce small fix and updates to readme 2019-06-19 09:38:38 +02:00
thomwolf
4d8c4337ae test barrier in distrib training 2019-06-18 22:41:28 +02:00
thomwolf
15ebd67d4e cache in run_classifier + various fixes to the examples 2019-06-18 15:58:22 +02:00
thomwolf
d82e5deeb1 set find_unused_parameters=True in DDP 2019-06-18 12:13:14 +02:00
thomwolf
f964753090 explanation on the current location of the caching folder 2019-06-18 11:36:28 +02:00
thomwolf
382e2d1e50 spliting config and weight files for bert also 2019-06-18 10:37:16 +02:00
thomwolf
4447f270b2 updating hub 2019-06-17 16:21:28 +02:00
thomwolf
33d3db5c43 updating head masking, readme and docstrings 2019-06-17 15:51:28 +02:00
thomwolf
34858ae1d9 adding bert whole words, bertgerman and gpt-2 medium models, head masking 2019-06-17 11:02:39 +02:00
timoeller
16af9ff7b0 Add German Bert model to code, update readme 2019-06-14 17:42:46 +02:00
Colanim
1eba8b9d96
Fix link in README 2019-05-30 14:01:46 +09:00
lukovnikov
331a46ff04 - replaced OpenAIGPTAdam with OpenAIAdam in docs 2019-04-25 16:04:37 +02:00
lukovnikov
704037ad51 - updated docs for new LR API
- added some images for illustration
- updated comments in optimization
2019-04-25 15:59:39 +02:00
thomwolf
18a8a15f78 improving GPT2 tokenization and adding tests 2019-04-16 17:00:55 +02:00
thomwolf
1135f2384a clean up logger in examples for distributed case 2019-04-15 15:22:40 +02:00
thomwolf
cc43307023 update readme 2019-04-15 15:06:10 +02:00
thomwolf
60ea6c59d2 added best practices for serialization in README and examples 2019-04-15 15:00:33 +02:00
thomwolf
20577d8a7c add configuration serialization to readme 2019-04-15 14:21:41 +02:00
thomwolf
b17963d82f update readme 2019-04-15 13:44:30 +02:00
Weixin Wang
f26ce6992e
Fix links in README 2019-04-02 17:20:32 +08:00
Sepehr Sameni
b588ff362a
fix lm_finetuning's link 2019-03-29 12:39:24 +04:30
Thomas Wolf
694e2117f3
Merge pull request #388 from ananyahjha93/master
Added remaining GLUE tasks to 'run_classifier.py'
2019-03-28 09:06:53 +01:00
Thomas Wolf
bbff03fbfc
Merge pull request #394 from desireevl/master
Minor change in README
2019-03-27 12:03:00 +01:00
thomwolf
34561e61a5 update main readme also 2019-03-27 12:00:04 +01:00
Ananya Harsh Jha
f471979167 added GLUE dev set results and details on how to run GLUE tasks 2019-03-21 15:38:30 -04:00
Desiree Vogt-Lee
d52f914e24
weigths to weights 2019-03-21 15:02:59 +10:00
Junjie Qian
d648a02203 Correct line number in README for classes 2019-03-08 16:28:03 -08:00
thomwolf
7cc35c3104 fix openai gpt example and updating readme 2019-03-06 11:43:21 +01:00
thomwolf
906b638efa updating readme 2019-03-06 10:24:19 +01:00
John Hewitt
e14c6b52e3 add BertTokenizer flag to skip basic tokenization 2019-02-26 20:11:24 -08:00
Joel Grus
8722e9eb3b finish updating docstrings 2019-02-23 06:31:59 -08:00
Stanislas Polu
ff22b3acc0 Few small nits in GPT-2's code examples 2019-02-21 09:15:27 +00:00
Tong Guo
09efcece75
Update README.md 2019-02-21 11:25:33 +08:00
Tony Lin
5b0e0b61f0
fix typo in readme 2019-02-19 20:34:18 +08:00
Davide Fiocco
0ae8eece55
MInor README typos corrected 2019-02-18 21:28:28 +01:00
sam-qordoba
1cb9c76ec5
Fix typo in GPT2Model code sample
Typo prevented code from running
2019-02-18 09:27:26 -08:00
Thomas Wolf
a25d056b7a
update readme 2019-02-18 15:30:11 +01:00
Thomas Wolf
517d7c8624
update readme 2019-02-18 14:39:55 +01:00
Thomas Wolf
ada22a1c9e
more details in GPT-2 usage example 2019-02-18 14:37:41 +01:00
Thomas Wolf
522733f6cb
readme typo fixes 2019-02-18 14:32:10 +01:00
thomwolf
d44db1145c update readme 2019-02-18 11:12:09 +01:00
Thomas Wolf
0e774e57a6
Update readme
Adding details on how to extract a full list of hidden states for the Transformer-XL
2019-02-14 08:39:58 +01:00
Thomas Wolf
4e56da38d9
Merge pull request #268 from wangxiaodiu/master
fixed a minor bug in README.md
2019-02-13 10:19:25 +01:00
thomwolf
67376c02e2 update readme for tokenizers 2019-02-13 10:11:11 +01:00
Liang Niu
e1b3cfb504 fixed a minor bug in README.md 2019-02-12 15:54:23 +04:00
Thomas Wolf
3c33499f87
fix typo in readme 2019-02-12 10:22:54 +01:00
thomwolf
1e71f11dec Release: 0.5.0 2019-02-11 14:16:27 +01:00
thomwolf
eebc8abbe2 clarify and unify model saving logic in examples 2019-02-11 14:04:19 +01:00
thomwolf
81c7e3ec9f fix typo in readme 2019-02-11 13:37:12 +01:00
thomwolf
884ca81d87 transposing the inputs of Transformer-XL to have a unified interface 2019-02-11 13:19:59 +01:00
thomwolf
32fea876bb add distant debugging to run_transfo_xl 2019-02-11 12:53:32 +01:00
thomwolf
b31ba23913 cuda on in the examples by default 2019-02-11 12:15:43 +01:00
thomwolf
2071a9b86e fix python 2.7 imports 2019-02-11 10:35:36 +01:00
thomwolf
b514a60c36 added tests for OpenAI GPT and Transformer-XL tokenizers 2019-02-11 10:17:16 +01:00
thomwolf
9f9909ea2f update readme 2019-02-09 16:59:21 +01:00
thomwolf
0c1a6f9b1d update readme 2019-02-08 22:32:25 +01:00
thomwolf
009b581316 updated readme 2019-02-07 23:15:05 +01:00
thomwolf
f99f2fb661 docstrings 2019-02-07 17:07:22 +01:00
Thomas Wolf
848aae49e1
Merge branch 'master' into python_2 2019-02-06 00:13:20 +01:00
thomwolf
ba37ddc5ce fix run_lm_modeling example command line 2019-02-06 00:07:08 +01:00
Girishkumar
0dd2b750ca
Minor update in README
Update links to classes in `modeling.py`
2019-01-30 23:49:15 +05:30
thomwolf
3a848111e6 update config, docstrings and readme to switch to seperated tokens and position embeddings 2019-01-29 11:00:11 +01:00
Davide Fiocco
35115eaf93
(very) minor update to README 2019-01-16 21:05:24 +01:00
nhatchan
8edc898f63 Fix documentation (missing backslashes)
This PR adds missing backslashes in LM Fine-tuning subsection in README.md.
2019-01-13 21:23:19 +09:00
thomwolf
e5c78c6684 update readme and few typos 2019-01-10 01:40:00 +01:00
thomwolf
fa5222c296 update readme 2019-01-10 01:25:28 +01:00
Thomas Wolf
c18bdb4433
Merge pull request #124 from deepset-ai/master
Add example for fine tuning BERT language model
2019-01-07 12:03:51 +01:00
Julien Chaumond
8da280ebbe Setup CI 2018-12-20 16:33:39 -05:00
tholor
e5fc98c542 add exemplary training data. update to nvidia apex. refactor 'item -> line in doc' mapping. add warning for unknown word. 2018-12-20 18:30:52 +01:00
tholor
67f4dd56a3 update readme for run_lm_finetuning 2018-12-19 09:22:37 +01:00
Julien Chaumond
d57763f582 Fix typos 2018-12-18 19:23:22 -05:00
Thomas Wolf
786cc41299
Typos in readme 2018-12-17 09:22:18 +01:00
Daniel Khashabi
8b1b93947f
Minor fix. 2018-12-14 14:10:36 -05:00
Thomas Wolf
8809eb6c93
update readme with information on NVIDIA's apex 2018-12-14 16:59:39 +01:00
thomwolf
d821358884 update readme 2018-12-14 15:15:17 +01:00
thomwolf
087798b7fa fix reloading model for evaluation in examples 2018-12-13 14:48:12 +01:00
thomwolf
0f544625f4 fix swag example for work with apex 2018-12-13 13:35:59 +01:00
thomwolf
4946c2c500 run_swag example in readme 2018-12-13 13:02:07 +01:00
Thomas Wolf
91aab2a6d3
Merge pull request #116 from FDecaYed/deyuf/fp16_with_apex
Change to use apex for better fp16 and multi-gpu support
2018-12-13 12:32:37 +01:00
Thomas Wolf
ffe9075f48
Merge pull request #96 from rodgzilla/multiple-choice-code
BertForMultipleChoice and Swag dataset example.
2018-12-13 12:05:11 +01:00
Grégory Châtel
dcb50eaa4b Swag example readme section update with gradient accumulation run. 2018-12-12 18:17:46 +01:00
Deyu Fu
c8ea286048 change to apex for better fp16 and multi-gpu support 2018-12-11 17:13:58 -08:00
Thomas Wolf
a3a3180c86
Bump up requirements to Python 3.6 2018-12-11 11:29:45 +01:00
Grégory Châtel
0876b77f7f Change to the README file to add SWAG results. 2018-12-10 15:34:19 +01:00
Davide Fiocco
c9f67e037c
Adding --do_lower_case for all uncased BERTs
I had missed those, it should make sense to use them
2018-12-07 20:40:56 +01:00
Grégory Châtel
150f3cd9fa Few typos in README.md 2018-12-06 19:22:07 +01:00
Grégory Châtel
4fa7892d64 Wrong line number link to modeling file. 2018-12-06 19:18:29 +01:00
Grégory Châtel
6a26e19ea3 Updating README.md with SWAG example informations. 2018-12-06 19:15:08 +01:00
Grégory Châtel
0a7c8bdcac Fixing badly formatted links. 2018-12-04 13:43:56 +01:00
Grégory Châtel
3113e967db Adding links to examples files. 2018-12-04 13:40:38 +01:00
Davide Fiocco
8a8aa59d8c
Update finetuning example adding --do_lower_case
Should be consistent with the fact that an uncased model is used
2018-12-01 01:00:05 +01:00
thomwolf
f9f3bdd60b update readme 2018-11-30 23:05:18 +01:00
thomwolf
52ff0590ff tup => tpu 2018-11-30 23:01:10 +01:00
thomwolf
296f006132 added BertForTokenClassification model 2018-11-30 13:56:53 +01:00
thomwolf
298107fed7 Added new bert models 2018-11-30 13:56:02 +01:00
Davide Fiocco
ec2c339b53
Updated quick-start example with BertForMaskedLM
As `convert_ids_to_tokens` returns a list, the code in the README currently throws an `AssertionError`, so I propose I quick fix.
2018-11-28 14:53:46 +01:00
thomwolf
05053d163c update cache_dir in readme and examples 2018-11-26 10:45:13 +01:00
thomwolf
029bdc0d50 fixing readme examples 2018-11-26 09:56:41 +01:00
Thomas Wolf
60e01ac427
fix link in readme 2018-11-21 12:08:30 +01:00
Thomas Wolf
fd32ebed81
Merge pull request #42 from weiyumou/master
Fixed UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2
2018-11-20 10:09:50 +01:00
thomwolf
eed255a58d fixing CLI typo in readme 2018-11-20 10:02:57 +01:00
weiyumou
9ff2b7d86d Fixed README typo 2018-11-19 23:13:10 -05:00
Thomas Wolf
da73925f6a
fix typos 2018-11-19 20:58:48 +01:00
Joel Grus
dd56cfd89a
update pip package name 2018-11-19 09:50:34 -08:00
Thomas Wolf
956c917344
fix typos in readme 2018-11-17 23:25:23 +01:00
Thomas Wolf
7c91e51c26
update links in readme 2018-11-17 22:54:15 +01:00
Thomas Wolf
e113101702
fix typos in readme 2018-11-17 12:36:35 +01:00
thomwolf
47a7d4ec14 update examples from master 2018-11-17 12:21:35 +01:00
thomwolf
c8cba67742 clean up readme and examples 2018-11-17 12:19:16 +01:00
thomwolf
757750d6f6 fix tests 2018-11-17 11:58:14 +01:00