Commit Graph

719 Commits

Author SHA1 Message Date
Lysandre
7d9a9d0c72 Release: v4.2.0 2021-01-13 16:01:51 +01:00
Julien Chaumond
247a7b2029
Doc: Update pretrained_models wording (#9545)
* Update pretrained_models.rst

To clarify things cf. this tweet for instance https://twitter.com/RTomMcCoy/status/1349094111505211395

* format
2021-01-13 05:58:05 -05:00
Stas Bekman
2df34f4aba
[trainer] deepspeed integration (#9211)
* deepspeed integration

* style

* add test

* ds wants to do its own backward

* fp16 assert

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* for clarity extract what args are being passed to deepspeed

* introduce the concept of self.wrapped_model

* s/self.wrapped_model/self.model_wrapped/

* complete transition to self.wrapped_model / self.model

* fix

* doc

* give ds its own init

* add custom overrides, handle bs correctly

* fix test

* clean up model_init logic, fix small bug

* complete fix

* collapse --deepspeed_config into --deepspeed

* style

* start adding doc notes

* style

* implement hf2ds optimizer and scheduler configuration remapping

* oops

* call get_num_training_steps absolutely when needed

* workaround broken auto-formatter

* deepspeed_config arg is no longer needed - fixed in deepspeed master

* use hf's fp16 args in config

* clean

* start on the docs

* rebase cleanup

* finish up --fp16

* clarify the supported stages

* big refactor thanks to discovering deepspeed.init_distributed

* cleanup

* revert fp16 part

* add checkpoint-support

* more init ds into integrations

* extend docs

* cleanup

* unfix docs

* clean up old code

* imports

* move docs

* fix logic

* make it clear which file it's referring to

* document nodes/gpus

* style

* wrong format

* style

* deepspeed handles gradient clipping

* easier to read

* major doc rewrite

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* docs

* switch to AdamW optimizer

* style

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* clarify doc

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 19:05:18 -08:00
NielsRogge
e45eba3b1c
Improve LayoutLM (#9476)
* Add LayoutLMForSequenceClassification and integration tests

Improve docs

Add LayoutLM notebook to list of community notebooks

* Make style & quality

* Address comments by @sgugger, @patrickvonplaten and @LysandreJik

* Fix rebase with master

* Reformat in one line

* Improve code examples as requested by @patrickvonplaten

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-01-12 09:26:32 -05:00
Patrick von Platen
7f28613213
[TFBart] Split TF-Bart (#9497)
* make templates ready

* make add_new_model_command_ready

* finish tf bart

* prepare tf mbart

* finish tf bart

* add tf mbart

* add marian

* prep pegasus

* add tf pegasus

* push blenderbot tf

* add blenderbot

* add blenderbot small

* clean-up

* make fix copy

* define blend bot tok

* fix

* up

* make style

* add to docs

* add copy statements

* overwrite changes

* improve

* fix docs

* finish

* fix last slow test

* fix missing git conflict line

* fix blenderbot

* up

* fix blenderbot small

* load changes

* finish copied from

* upload fix
2021-01-12 02:06:32 +01:00
Sylvain Gugger
8d25df2c7a
Make doc styler detect lists on rst (#9488) 2021-01-11 08:53:41 -05:00
Patrick von Platen
9e1ea846bc
[README] Add new models (#9465)
* add new models

* make fix-copies
2021-01-08 05:49:43 -05:00
Patrick von Platen
ae5a32bb0d
up (#9454) 2021-01-07 11:51:02 +01:00
Simon Brandeis
c89f1bc92e
Add flags to return scores, hidden states and / or attention weights in GenerationMixin (#9150)
* Define new output dataclasses for greedy generation

* Add output_[...] flags in greedy generation methods

Added output_attentions, output_hidden_states, output_scores flags in
generate and greedy_search methods in GenerationMixin.

* [WIP] Implement logic and tests for output flags in generation

* Update GreedySearchOutput classes & docstring

* Implement greedy search output accumulation logic

Update greedy_search unittests

Fix generate method return value docstring

Properly init flags with the default config

* Update configuration to add output_scores flag

* Fix test_generation_utils

Sort imports and fix isinstance tests for GreedySearchOutputs

* Fix typo in generation_utils

* Add return_dict_in_generate for backwards compatibility

* Add return_dict_in_generate flag in config

* Fix tyPo in configuration

* Fix handling of attentions and hidden_states flags

* Make style & quality

* first attempt attentions

* some corrections

* improve tests

* special models requires special test

* disable xlm test for now

* clean tests

* fix for tf

* isort

* Add output dataclasses for other generation methods

* Add logic to return dict in sample generation

* Complete test for sample generation

- Pass output_attentions and output_hidden_states flags to encoder in
encoder-decoder models
- Fix import satements order in test_generation_utils file

* Add logic to return dict in sample generation

- Refactor tests to avoid using self.assertTrue, which provides
scarce information when the test fails
- Add tests for the three beam_search methods: vanilla, sample and
grouped

* Style doc

* Fix copy-paste error in generation tests

* Rename logits to scores and refactor

* Refactor group_beam_search for consistency

* make style

* add sequences_scores

* fix all tests

* add docs

* fix beam search finalize test

* correct docstring

* clean some files

* Made suggested changes to the documentation

* Style doc ?

* Style doc using the Python util

* Update src/transformers/generation_utils.py

* fix empty lines

* fix all test

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-06 17:11:42 +01:00
Qbiwan
ecfcac223c
Improve documentation coverage for Phobert (#9427)
* first commit

* change phobert to phoBERT as per author in overview

* v3 and v4 both runs on same code hence there is no need to differentiate them

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-06 10:04:32 -05:00
Qbiwan
be898998bb
Improve documentation coverage for Herbert (#9428)
* first commit

* changed XLMTokenizer to HerbertTokenizer in code example
2021-01-06 09:13:43 -05:00
Patrick von Platen
b972c1bfb0
finalize (#9431) 2021-01-06 14:36:55 +01:00
Sylvain Gugger
bcb55d33ce
Upgrade styler to better handle lists (#9423)
* Add missing lines before a new list.

* Update doc styler and restyle some files.

* Fix docstrings of LED and Longformer
2021-01-06 07:46:17 -05:00
NielsRogge
b7e548976f
Fix URLs to TAPAS notebooks (#9435) 2021-01-06 07:20:41 -05:00
Stas Bekman
d64372fdfc
[docs] outline sharded ddp doc (#9208)
* outline sharded dpp doc

* fix link

* add example

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* narrow the command and remove non-essentials

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-05 17:34:15 -08:00
Patrick von Platen
eef66035a2
[PyTorch Bart] Split Bart into different models (#9343)
* first try

* remove old template

* finish bart

* finish mbart

* delete unnecessary line

* init pegasus

* save intermediate

* correct pegasus

* finish pegasus

* remove cookie cutter leftover

* add marian

* finish blenderbot

* replace in file

* correctly split blenderbot

* delete "old" folder

* correct "add statement"

* adapt config for tf comp

* correct configs for tf

* remove ipdb

* fix more stuff

* fix mbart

* push pegasus fix

* fix mbart

* more fixes

* fix research projects code

* finish docs for bart, mbart, and marian

* delete unnecessary file

* correct attn typo

* correct configs

* remove pegasus for seq class

* correct peg docs

* correct peg docs

* finish configs

* further improve docs

* add copied from statements to mbart

* fix copied from in mbart

* add copy statements to marian

* add copied from to marian

* add pegasus copied from

* finish pegasus

* finish copied from

* Apply suggestions from code review

* make style

* backward comp blenderbot

* apply lysandres and sylvains suggestions

* apply suggestions

* push last fixes

* fix docs

* fix tok tests

* fix imports code style

* fix doc
2021-01-05 22:00:05 +01:00
Patrick von Platen
189387e9b2
LED (#9278)
* create model

* add integration

* save current state

* make integration tests pass

* add one more test

* add explanation to tests

* remove from bart

* add padding

* remove unnecessary test

* make all tests pass

* re-add cookie cutter tests

* finish PyTorch

* fix attention test

* Update tests/test_modeling_common.py

* revert change

* remove unused file

* add string to doc

* save intermediate

* make tf integration tests pass

* finish tf

* fix doc

* fix docs again

* add led to doctree

* add to auto tokenizer

* added tips for led

* make style

* apply jplus statements

* correct tf longformer

* apply lysandres suggestions

* apply sylvains suggestions

* Apply suggestions from code review
2021-01-05 13:14:30 +01:00
Sugeeth
314cca2842
Fix documentation links always pointing to master. (#9217)
* Use extlinks to point hyperlink with the version of code

* Point to version on release and master until then

* Apply style

* Correct links

* Add missing backtick

* Simple missing backtick after all.

Co-authored-by: Raghavendra Sugeeth P S <raghav-5305@raghav-5305.csez.zohocorpin.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2021-01-05 06:18:48 -05:00
Qbiwan
086718ac6e
Improve documentation coverage for Bertweet (#9379)
* bertweet docs coverage

* style doc max len 119

* maxlen style rst

* run main() from style_doc

* changed according to  comments
2021-01-04 13:12:59 -05:00
Patrick von Platen
75ff530551
correct docs (#9378) 2021-01-04 17:27:29 +01:00
Patrick von Platen
52b3a05e83
[Bart doc] Fix outdated statement (#9299)
* fix bart doc

* fix docs
2020-12-24 14:47:53 +01:00
Suraj Patil
88ef8893cd
Add caching mechanism to BERT, RoBERTa (#9183)
* add past_key_values

* add use_cache option

* make mask before cutting ids

* adjust position_ids according to past_key_values

* flatten past_key_values

* fix positional embeds

* fix _reorder_cache

* set use_cache to false when not decoder, fix attention mask init

* add test for caching

* add past_key_values for Roberta

* fix position embeds

* add caching test for roberta

* add doc

* make style

* doc, fix attention mask, test

* small fixes

* adress patrick's comments

* input_ids shouldn't start with pad token

* use_cache only when decoder

* make consistent with bert

* make copies consistent

* add use_cache to encoder

* add past_key_values to tapas attention

* apply suggestions from code review

* make coppies consistent

* add attn mask in tests

* remove copied from longformer

* apply suggestions from code review

* fix bart test

* nit

* simplify model outputs

* fix doc

* fix output ordering
2020-12-23 23:01:32 +05:30
Connor Brinton
bcc87c639f
Minor documentation revisions from copyediting (#9266)
* typo: Revise "checkout" to "check out"

* typo: Change "seemlessly" to "seamlessly"

* typo: Close parentheses in "Using the tokenizer"

* typo: Add closing parenthesis to supported models aside

* docs: Treat ``position_ids`` as plural

Alternatively, the word "argument" could be added to make the subject singular.

* docs: Remove comma, making subordinate clause

* docs: Remove comma separating verb and direct object

* docs: Fix typo ("next" -> "text")

* docs: Reverse phrase order to simplify sentence

* docs: "quicktour" -> "quick tour"

* docs: "to throw" -> "from throwing"

* docs: Remove disruptive newline in padding/truncation section

* docs: "show exemplary" -> "show examples of"

* docs: "much harder as" -> "much harder than"

* docs: Fix typo "seach" -> "search"

* docs: Fix subject-verb disagreement in WordPiece description

* docs: Fix style in preprocessing.rst
2020-12-23 10:15:49 -05:00
Sylvain Gugger
490b39e614
Seq2seq trainer (#9241)
* Add label smoothing in Trainer

* Add options for scheduler and Adafactor in Trainer

* Put Seq2SeqTrainer in the main lib

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments and adapt scripts

* Documentation

* Move test not using script to tests folder

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-22 11:33:44 -05:00
Sylvain Gugger
1fc7119181
Fix script that check objects are documented (#9259) 2020-12-22 11:12:58 -05:00
Suraj Patil
f4432b7e01
add base model classes to bart subclassed models (#9230)
* add base model classes to  bart subclassed models

* add doc
2020-12-21 19:56:46 +05:30
Stas Bekman
3ff5e8955a
[t5 doc] typos (#9199)
* [t5 doc] typos

a few run away backticks

@sgugger

* style
2020-12-18 16:03:26 -08:00
Sylvain Gugger
3e56e2ce04 Fix typo 2020-12-18 10:11:07 -05:00
sandip
467e9158b4
Added TF CTRL Sequence Classification (#9151)
* Added TF CTRL Sequence Classification

* code refactor
2020-12-17 18:10:57 -05:00
Lysandre
bd40345d3e v4.1.1 docs 2020-12-17 11:28:38 -05:00
Lysandre
bfa4ccf77d Release: v4.1.1 2020-12-17 11:25:49 -05:00
Lysandre
e0790cca78 Fix TAPAS doc 2020-12-17 11:25:05 -05:00
Sylvain Gugger
6d2e864db7
Put all models in the constants (#9170)
* Put all models in the constants

* Add Google AI mention in the main README
2020-12-17 11:23:21 -05:00
Lysandre
f83d9c8da7 v4.1.0 docs 2020-12-17 10:16:07 -05:00
Lysandre
f5438ab8a2 Release: v4.1.0 2020-12-17 10:04:55 -05:00
Lysandre
ac2c7e398f Remove erroneous character 2020-12-17 09:47:19 -05:00
Lysandre Debut
1aca3d6afa
Add disclaimer to TAPAS rst file (#9167)
Co-authored-by: sgugger <sylvain.gugger@gmail.com>

Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-12-17 09:34:06 -05:00
Lysandre Debut
1c1a2ffbff
TableQuestionAnsweringPipeline (#9145)
* AutoModelForTableQuestionAnswering

* TableQuestionAnsweringPipeline

* Apply suggestions from Patrick's code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Sylvain and Patrick comments

* Better PyTorch/TF error message

* Add integration tests

* Argument Handler naming

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

* Fix docs to appease the documentation gods

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-16 12:31:50 -05:00
Lysandre Debut
07384baf7a
AutoModelForTableQuestionAnswering (#9154)
* AutoModelForTableQuestionAnswering

* Update src/transformers/models/auto/modeling_auto.py

* Style
2020-12-16 12:14:33 -05:00
Hayden Housen
34334662df
Add message to documentation that longformer doesn't support token_type_ids (#9152)
* Add message to documentation that longformer doesn't support token_type_ids

* Format changes
2020-12-16 11:06:14 -05:00
Patrick von Platen
640e6fe190
[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054)
* save intermediate

* save intermediate

* save intermediate

* correct flax bert model file

* new module / model naming

* make style

* almost finish BERT

* finish roberta

* make fix-copies

* delete keys file

* last refactor

* fixes in run_mlm_flax.py

* remove pooled from run_mlm_flax.py`

* fix gelu | gelu_new

* remove Module from inits

* splits

* dirty print

* preventing warmup_steps == 0

* smaller splits

* make fix-copies

* dirty print

* dirty print

* initial_evaluation argument

* declaration order fix

* proper model initialization/loading

* proper initialization

* run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug

* removed tokenizers warning hack, fixed model re-initialization

* reverted training_args.py changes

* fix flax from pretrained

* improve test in flax

* apply sylvains tips

* update init

* make 0.3.0 compatible

* revert tevens changes

* revert tevens changes 2

* finalize revert

* fix bug

* add docs

* add pretrained to init

* Update src/transformers/modeling_flax_utils.py

* fix copies

* final improvements

Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
2020-12-16 13:03:32 +01:00
NielsRogge
1551e2dc6d
[WIP] Tapas v4 (tres) (#9117)
* First commit: adding all files from tapas_v3

* Fix multiple bugs including soft dependency and new structure of the library

* Improve testing by adding torch_device to inputs and adding dependency on scatter

* Use Python 3 inheritance rather than Python 2

* First draft model cards of base sized models

* Remove model cards as they are already on the hub

* Fix multiple bugs with integration tests

* All model integration tests pass

* Remove print statement

* Add test for convert_logits_to_predictions method of TapasTokenizer

* Incorporate suggestions by Google authors

* Fix remaining tests

* Change position embeddings sizes to 512 instead of 1024

* Comment out positional embedding sizes

* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES

* Added more model names

* Fix truncation when no max length is specified

* Disable torchscript test

* Make style & make quality

* Quality

* Address CI needs

* Test the Masked LM model

* Fix the masked LM model

* Truncate when overflowing

* More much needed docs improvements

* Fix some URLs

* Some more docs improvements

* Test PyTorch scatter

* Set to slow + minify

* Calm flake8 down

* First commit: adding all files from tapas_v3

* Fix multiple bugs including soft dependency and new structure of the library

* Improve testing by adding torch_device to inputs and adding dependency on scatter

* Use Python 3 inheritance rather than Python 2

* First draft model cards of base sized models

* Remove model cards as they are already on the hub

* Fix multiple bugs with integration tests

* All model integration tests pass

* Remove print statement

* Add test for convert_logits_to_predictions method of TapasTokenizer

* Incorporate suggestions by Google authors

* Fix remaining tests

* Change position embeddings sizes to 512 instead of 1024

* Comment out positional embedding sizes

* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES

* Added more model names

* Fix truncation when no max length is specified

* Disable torchscript test

* Make style & make quality

* Quality

* Address CI needs

* Test the Masked LM model

* Fix the masked LM model

* Truncate when overflowing

* More much needed docs improvements

* Fix some URLs

* Some more docs improvements

* Add add_pooling_layer argument to TapasModel

Fix comments by @sgugger and @patrickvonplaten

* Fix issue in docs + fix style and quality

* Clean up conversion script and add task parameter to TapasConfig

* Revert the task parameter of TapasConfig

Some minor fixes

* Improve conversion script and add test for absolute position embeddings

* Improve conversion script and add test for absolute position embeddings

* Fix bug with reset_position_index_per_cell arg of the conversion cli

* Add notebooks to the examples directory and fix style and quality

* Apply suggestions from code review

* Move from `nielsr/` to `google/` namespace

* Apply Sylvain's comments

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2020-12-15 17:08:49 -05:00
sandip
389aba34bf
Added TF OpenAi GPT1 Sequence Classification (#9105)
* TF OpenAI GPT Sequence Classification

* Update src/transformers/models/openai/modeling_tf_openai.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-12-15 11:27:08 -05:00
Julien Plu
df3f4d2aef
Fix T5 and BART for TF (#9063)
* Fix T5 for graphe compilation+execution

* Fix BART

* Fix import

* Fix naming

* fix attribute name

* Oops

* fix import

* fix tests

* fix tests

* Update test

* Add mising import

* Address Patrick's comments

* Style

* Address Patrick's comment
2020-12-14 18:47:00 +01:00
Ahmed Elnaggar
a9c8bff724
Add parallelization support for T5EncoderModel (#9082)
* add model parallelism to T5EncoderModel

add model parallelism to T5EncoderModel

* remove decoder from T5EncoderModel parallelize

* uodate T5EncoderModel docs

* Extend T5ModelTest for T5EncoderModel

* fix T5Stask using range for get_device_map

* fix style

Co-authored-by: Ahmed Elnaggar <elnaggar@rostlab.informatik.tu-muenchen.de>
2020-12-14 12:00:45 -05:00
Stas Bekman
b00eb4fb02
Testing Experimental CI Features (#9070) 2020-12-14 10:34:59 -05:00
Simon Brandeis
74daf1f954
Fixed a broken link in documentation (#9101) 2020-12-14 09:12:27 -05:00
Julien Chaumond
3552d0e0d8
[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)
* rm all model cards

* Update the .rst

@sgugger it is still not super crystal clear/streamlined so let me know if any ideas to make it simpler

* Add a rootlevel README.md with simple instructions/context

* Update docs/source/model_sharing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* rm all model cards

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-12-11 18:24:42 -05:00
Sylvain Gugger
1310e1a758
Enforce all objects in the main init are documented (#9014) 2020-12-10 11:57:12 -05:00
Sylvain Gugger
51e81e5895
MPNet copyright files (#9015) 2020-12-10 09:29:38 -05:00
Patrick von Platen
06971ac4f9
[Bart] Refactor - fix issues, consistency with the library, naming (#8900)
* remove make on the fly linear embedding

* start refactor

* big first refactor

* save intermediate

* save intermediat

* correct mask issue

* save tests

* refactor padding masks

* make all tests pass

* further refactor

* make pegasus test pass

* fix bool if

* fix leftover tests

* continue

* bart renaming

* delete torchscript test hack

* fix imports in tests

* correct shift

* fix docs and repo cons

* re-add fix for FSTM

* typo in test

* fix typo

* fix another typo

* continue

* hot fix 2 for tf

* small fixes

* refactor types linting

* continue

* finish refactor

* fix import in tests

* better bart names

* further refactor and add test

* delete hack

* apply sylvains and lysandres commens

* small perf improv

* further perf improv

* improv perf

* fix typo

* make style

* small perf improv
2020-12-09 20:55:24 +01:00
StillKeepTry
df2af6d8b8 Add MP Net 2 (#9004) 2020-12-09 10:32:43 -05:00
Patrick von Platen
da37a21c89
push (#9008) 2020-12-09 15:14:33 +01:00
Sylvain Gugger
7e1d709e2a
Fix link to stable version in the doc navbar (#9007) 2020-12-09 09:11:39 -05:00
Patrick von Platen
02d0e0355c
Diverse beam search 2 (#9006)
* diverse beam search

* bug fixes

* bug fixes

* bug fix

* separate out diverse_beam_search function

* separate out diverse_beam_search function

* bug fix

* improve code quality

* bug fix

* bug fix

* separate out diverse beam search scorer

* code format

* code format

* code format

* code format

* add test

* code format

* documentation changes

* code quality

* add slow integration tests

* more general name

* refactor into logits processor

* add test

* avoid too much copy paste

* refactor

* add to docs

* fix-copies

* bug fix

* Revert "bug fix"

This reverts commit c99eb5a8dc.

* improve comment

* implement sylvains feedback

Co-authored-by: Ayush Jain <a.jain@sprinklr.com>
Co-authored-by: ayushtiku5 <40797286+ayushtiku5@users.noreply.github.com>
2020-12-09 15:00:37 +01:00
Sylvain Gugger
00aa9dbca2
Copyright (#8970)
* Add copyright everywhere missing

* Style
2020-12-07 18:36:34 -05:00
Navjot
c108d0b5a4
add max_length to showcase the use of truncation (#8975) 2020-12-07 18:35:39 -05:00
sandip
483e13273f
Add TFGPT2ForSequenceClassification based on DialogRPT (#8714)
* Add TFGPT2ForSequenceClassification based on DialogRPT

* Add TFGPT2ForSequenceClassification based on DialogRPT

* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing

* Add TFGPT2ForSequenceClassification based on DialogRPT

* TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing

* code refactor for latest other TF PR

* code refactor

* code refactor

* Update modeling_tf_gpt2.py
2020-12-07 16:58:37 +01:00
Lysandre Debut
0c5615af66
Put Transformers on Conda (#8918)
* conda

* Guide

* correct tag

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/installation.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Sylvain's comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-12-03 14:28:49 -05:00
Julien Chaumond
9ad6194318
Tweak wording + Add badge w/ number of models on the hub (#8914)
* Add badge w/ number of models on the hub

* try to apease @sgugger 😇

* not sure what this `c` was about [ci skip]

* Fix script and move stuff around

* Fix doc styling error

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2020-12-03 10:56:55 -05:00
sandip
f6b44e6190
Transfoxl seq classification (#8868)
* Transfoxl sequence classification

* Transfoxl sequence classification
2020-12-02 10:08:32 -05:00
elk-cloner
4a9e502a36
Ctrl for sequence classification (#8812)
* add CTRLForSequenceClassification

* pass local test

* merge with master

* fix modeling test for sequence classification

* fix deco

* fix assert
2020-12-01 09:49:27 +01:00
LysandreJik
9995a341c9 Update docs 2020-11-30 12:07:52 -05:00
LysandreJik
22b0ff757a Release: v4.0.0 2020-11-30 12:07:43 -05:00
Sylvain Gugger
75f8100fc7
Add a direct link to the big table (#8850) 2020-11-30 10:29:23 -05:00
Ahmed Elnaggar
40ecaf0c2b
Add T5 Encoder for Feature Extraction (#8717)
* Add T5 Encoder class for feature extraction

* fix T5 encoder add_start_docstrings indent

* update init with T5 encoder

* update init with TFT5ModelEncoder

* remove TFT5ModelEncoder

* change T5ModelEncoder order in init

* add T5ModelEncoder to transformers init

* clean T5ModelEncoder

* update init with TFT5ModelEncoder

* add TFModelEncoder for Tensorflow

* update init with TFT5ModelEncoder

* Update src/transformers/models/t5/modeling_t5.py

change output from Seq2SeqModelOutput to BaseModelOutput

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove encoder_outputs

1. remove encoder_outputs from the function call.
2. remove the encoder_outputs If statement.
3. remove isinstance from return_dict.

* Authorize missing decoder keys

* remove unnecessary input parameters

remove pask_key_values and use_cache

* remove use_cache

remove use_cache from the forward method

* add doctoring for T5 encoder

add doctoring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING

* change return_dict to dot access

* add T5_ENCODER_INPUTS_DOCSTRING for TF T5

* change TFT5Encoder output type to BaseModelOutput

* remove unnecessary parameters for TFT5Encoder

* remove unnecessary if statement

* add import BaseModelOutput

* fix BaseModelOutput typo to TFBaseModelOutput

* update T5 doc with T5ModelEncoder

* add T5ModelEncoder to tests

* finish pytorch

* finish docs and mt5

* add mtf to init

* fix init

* remove n_positions

* finish PR

* Update src/transformers/models/mt5/modeling_mt5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_tf_t5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/mt5/modeling_tf_mt5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-30 08:34:40 +01:00
Lysandre Debut
610cb106a2
Migration guide from v3.x to v4.x (#8763)
* Migration guide from v3.x to v4.x

* Better wording

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Sylvain's comments

* Better wording.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-29 20:13:07 -05:00
Guy Rosin
3a08cc1ce7
Minor docs typo fixes (#8797)
* Fix minor typos

* Additional typos

* Style fix

Co-authored-by: guyrosin <guyrosin@assist-561.cs.technion.ac.il>
2020-11-29 11:27:00 -05:00
Stas Bekman
00ea45659f
suggest a numerical limit of 50MB for determining @slow (#8824) 2020-11-27 16:04:54 -05:00
Moussa Kamal Eddine
81fe0bf085
Add barthez model (#8393)
* Add init barthez

* Add barthez model, tokenizer and docs

BARThez is a pre-trained french seq2seq model that uses BART objective.

* Apply suggestions from code review docs typos

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add license

* Change URLs scheme

* Remove barthez model keep tokenizer

* Fix style

* Fix quality

* Update tokenizer

* Add fast tokenizer

* Add fast tokenizer test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-27 12:31:42 -05:00
Patrick von Platen
2a6fbe6a40
[XLNet] Fix mems behavior (#8567)
* fix mems in xlnet

* fix use_mems

* fix use_mem_len

* fix use mems

* clean docs

* fix tf typo

* make xlnet tf for generation work

* fix tf test

* refactor use cache

* add use cache for missing models

* correct use_cache in generate

* correct use cache in tf generate

* fix tf

* correct getattr typo

* make sylvain happy

* change in docs as well

* do not apply to cookie cutter statements

* fix tf test

* make pytorch model fully backward compatible
2020-11-25 16:54:59 -05:00
Sylvain Gugger
4821ea5aeb
Big model table (#8774)
* First draft

* Styling

* With all changes staged

* Update docs/source/index.rst

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Styling

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-11-25 12:02:15 -05:00
Lysandre Debut
02f48b9bfc
Model parallel documentation (#8741)
* Add parallelize methods to the .rst files

* Correct format
2020-11-23 20:14:48 -05:00
Colin Brochtrup
8ffc01a76a
Add early stopping callback to pytorch trainer (#8581)
* Add early stopping patience and minimum threshold metric must improve to prevent early stopping to pytorch trainer

* Add early stopping test

* Set patience counter to 0 if best metric not defined yet

* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.

* Run make style

* make funciton name sensible

* Improve new argument docstring wording and hope that flakey CI test passes.

* Use on_evaluation callback instead of custom. Remove some debug printing

* Move early stopping arguments and state into early stopping callback

* Run make style

* Remove old code

* Fix docs formatting. make style went rogue on me.

* Remove copied attributes and fix variable

* Add assertions on training arguments instead of mutating them. Move comment out of public docs.

* Make separate test for early stopping callback. Add test of invalid arguments.

* Run make style... I remembered before CI this time!

* appease flake8

* Add EarlyStoppingCallback to callback docs

* Make docstring EarlyStoppingCallabck match other callbacks.

* Fix typo in docs
2020-11-23 17:25:35 -05:00
Sylvain Gugger
900024273b
Change default cache path (#8734)
* Change default cache path

* Document changes

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-23 13:56:45 -05:00
Santiago Castro
e1f3156b21
Fix many typos (#8708) 2020-11-21 22:58:10 -05:00
Sylvain Gugger
cb3e5c33f7
Fix a few last paths for the new repo org (#8666) 2020-11-19 11:56:42 -05:00
elk-cloner
5362bb8a6b
Tf longformer for sequence classification (#8231)
* working on LongformerForSequenceClassification

* add TFLongformerForMultipleChoice

* add TFLongformerForTokenClassification

* use add_start_docstrings_to_model_forward

* test TFLongformerForSequenceClassification

* test TFLongformerForMultipleChoice

* test TFLongformerForTokenClassification

* remove test from repo

* add test and doc for TFLongformerForSequenceClassification, TFLongformerForTokenClassification, TFLongformerForMultipleChoice

* add requested classes to modeling_tf_auto.py
update dummy_tf_objects
fix tests
fix bugs in requested classes

* pass all tests except test_inputs_embeds

* sync with master

* pass all tests except test_inputs_embeds

* pass all tests

* pass all tests

* work on test_inputs_embeds

* fix style and quality

* make multi choice work

* fix TFLongformerForTokenClassification signature

* fix TFLongformerForMultipleChoice, TFLongformerForSequenceClassification signature

* fix mult choice

* fix mc hint

* fix input embeds

* fix input embeds

* refactor input embeds

* fix copy issue

* apply sylvains changes and clean more

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-19 10:37:27 -05:00
cronoik
dcc9c64299
Updated the Extractive Question Answering code snippets (#8636)
* Updated the Extractive Question Answering code snippets

The Extractive Question Answering code snippets do not work anymore since the models return task-specific output objects. This commit fixes the pytorch and tensorflow examples but adding `.values()` to the model call.

* Update task_summary.rst
2020-11-18 18:56:47 -05:00
Patrick von Platen
cdfa56afe0
[Tokenizer Doc] Improve tokenizer summary (#8622)
* improve summary

* small fixes

* cleaned line length

* correct "" formatting

* apply sylvains suggestions
2020-11-18 17:14:15 +01:00
Lysandre Debut
3095ee9dab
Tokenizers should be framework agnostic (#8599)
* Tokenizers should be framework agnostic

* Run the slow tests

* Not testing

* Fix documentation

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-17 14:03:03 -05:00
Julien Chaumond
042a6aa777
Tokenizers: ability to load from model subfolder (#8586)
* <small>tiny typo</small>

* Tokenizers: ability to load from model subfolder

* use subfolder for local files as well

* Uniformize model shortcut name => model id

* from s3 => from huggingface.co

Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
2020-11-17 08:58:45 -05:00
Patrick von Platen
5104223552
[MT5] More docs (#8589)
* add docs

* make style
2020-11-17 12:47:57 +01:00
Patrick von Platen
86822a358b
T5 & mT5 (#8552)
* add mt5 and t5v1_1 model

* fix tests

* correct some imports

* add tf model

* finish tf t5

* improve examples

* fix copies

* clean doc
2020-11-17 12:23:09 +01:00
Sylvain Gugger
c89bdfbe72
Reorganize repo (#8580)
* Put models in subfolders

* Styling

* Fix imports in tests

* More fixes in test imports

* Sneaky hidden imports

* Fix imports in doc files

* More sneaky imports

* Finish fixing tests

* Fix examples

* Fix path for copies

* More fixes for examples

* Fix dummy files

* More fixes for example

* More model import fixes

* Is this why you're unhappy GitHub?

* Fix imports in conver command
2020-11-16 21:43:42 -05:00
Sylvain Gugger
1073a2bde5
Switch return_dict to True by default. (#8530)
* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests
2020-11-16 11:43:00 -05:00
Julien Chaumond
725269746b
Model sharing doc: more tweaks (#8520)
* More doc tweaks

* Update model_sharing.rst

* make style

* missing newline

* Add email tip

Co-authored-by: Pierric Cistac <pierric@huggingface.co>
2020-11-13 12:10:26 -05:00
Sylvain Gugger
bb03a14edd Update doc for v3.5.1 2020-11-13 10:29:58 -05:00
Sylvain Gugger
7933054638
Model sharing doc (#8498)
* Model sharing doc

* Style
2020-11-12 11:53:23 -05:00
Chengxi Guo
d65e0bfea3
Fix doc bug (#8500)
* fix doc bug

Signed-off-by: mymusise <mymusise1@gmail.com>

* fix example bug

Signed-off-by: mymusise <mymusise1@gmail.com>
2020-11-12 11:47:23 -05:00
Funtowicz Morgan
a5b682329c
Flax/Jax documentation (#8331)
* First addition of Flax/Jax documentation

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* make style

* Ensure input order match between Bert & Roberta

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Install dependencies "all" when building doc

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* wraps build_doc deps with ""

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing @sgugger comments.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use list to highlight JAX features.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Make style.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Let's not look to much into the future for now.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-11-11 14:53:36 -05:00
Ratthachat (Jung)
026a2ff225
Add TFDPR (#8203)
* Create modeling_tf_dpr.py

* Add TFDPR

* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot

last commit accidentally deleted these 4 lines, so I recover them back

* Add TFDPR

* Add TFDPR

* clean up some comments, add TF input-style doc string

* Add TFDPR

* Make return_dict=False as default

* Fix return_dict bug (in .from_pretrained)

* Add get_input_embeddings()

* Create test_modeling_tf_dpr.py

The current version is already passed all 27 tests!
Please see the test run at : 
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing

* fix quality

* delete init weights

* run fix copies

* fix repo consis

* del config_class, load_tf_weights

They shoud be 'pytorch only'

* add config_class back

after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion

* newline after .. note::

* import tf, np (Necessary for ModelIntegrationTest)

* slow_test from_pretrained with from_pt=True

At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug

* Add simple TFDPRModelIntegrationTest

Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet

* upload correct tf model

* remove position_ids as missing keys

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
2020-11-11 12:28:09 -05:00
Santiago Castro
8fe6629bb4
Add missing tasks to pipeline docstring (#8428) 2020-11-10 13:44:25 -05:00
Stas Bekman
02bdfc0251
using multi_gpu consistently (#8446)
* s|multiple_gpu|multi_gpu|g; s|multigpu|multi_gpu|g'

* doc
2020-11-10 13:23:58 -05:00
Stas Bekman
e21340da7a
[testing utils] get_auto_remove_tmp_dir more intuitive behavior (#8401)
* [testing utils] get_auto_remove_tmp_dir default change

Now that I have been using `get_auto_remove_tmp_dir default change` for a while, I realized that the defaults aren't most optimal.

99% of the time we want the tmp dir to be empty at the beginning of the test - so changing the default to `before=True` - this shouldn't impact any tests since this feature is used only during debug.

* simplify things

* update docs

* fix doc layout

* style

* Update src/transformers/testing_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* better 3-state doc

* style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* s/tmp/temporary/ + style

* correct the statement

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-11-10 11:57:21 -05:00
Sam Shleifer
c314b1fd3b
[docs] improve bart/marian/mBART/pegasus docs (#8421) 2020-11-10 10:18:34 -05:00
Lysandre
aec51e5696 v3.5.0 documentation 2020-11-10 08:58:47 -05:00
Lysandre
818878dc88 Release: v3.5.0 2020-11-10 08:50:43 -05:00
Lysandre Debut
9cebee38ad
Model sharing rst (#8439)
* Update RST

* Finer details

* Re-organize

* Style
2020-11-10 08:35:11 -05:00
Stas Bekman
ef032ddd1e
[docs] [testing] gpu decorators table (#8422)
* gpu decorators table

* whitespace

* Update docs/source/testing.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* whitespace

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-11-09 14:27:42 -05:00
Sam Shleifer
46509d1c19
[docs] remove sshleifer from issue-template :( (#8418) 2020-11-09 12:51:38 -05:00
Patrick von Platen
9c83b96e62
[Tests] Add Common Test for Training + Fix a couple of bugs (#8415)
* add training tests

* correct longformer

* fix docs

* fix some tests

* fix some more train tests

* remove ipdb

* fix multiple edge case model training

* fix funnel and prophetnet

* clean gpt models

* undo renaming of albert
2020-11-09 18:24:41 +01:00
Yossi Synett
bc0d26d1de
[All Seq2Seq model + CLM models that can be used with EncoderDecoder] Add cross-attention weights to outputs (#8071)
* Output cross-attention with decoder attention output

* Update src/transformers/modeling_bert.py

* add cross-attention for t5 and bart as well

* fix tests

* correct typo in docs

* add sylvains and sams comments

* correct typo

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-11-06 19:34:48 +01:00
Leandro von Werra
17450397a7
Docs bart training ref (#8330)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 17:20:57 -05:00
Stas Bekman
d787935a14
[s2s] test_distributed_eval (#8315)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-11-05 16:01:15 -05:00
Guillaume Filion
27b402cab0
Output global_attentions in Longformer models (#7562)
* Output global_attentions in Longformer models

* make style

* small refactoring

* fix tests

* make fix-copies

* add for tf as well

* remove comments in test

* make fix-copies

* make style

* add docs

* make docstring pretty

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-11-05 21:10:43 +01:00
Patrick von Platen
a1bbcf3f6c
Refactoring the generate() function (#6949)
* first draft

* show design proposition for new generate method

* up

* make better readable

* make first version

* gpt2 tests pass

* make beam search for gpt2 work

* add first encoder-decoder code

* delete typo

* make t5 work

* save indermediate

* make bart work with beam search

* finish beam search bart / t5

* add default kwargs

* make more tests pass

* fix no bad words sampler

* some fixes and tests for all distribution processors

* fix test

* fix rag slow tests

* merge to master

* add nograd to generate

* make all slow tests pass

* speed up generate

* fix edge case bug

* small fix

* correct typo

* add type hints and docstrings

* fix typos in tests

* add beam search tests

* add tests for beam scorer

* fix test rag

* finish beam search tests

* move generation tests in seperate file

* fix generation tests

* more tests

* add aggressive generation tests

* fix tests

* add gpt2 sample test

* add more docstring

* add more docs

* finish doc strings

* apply some more of sylvains and sams comments

* fix some typos

* make fix copies

* apply lysandres and sylvains comments

* final corrections on examples

* small fix for reformer
2020-11-03 16:04:22 +01:00
Sam Shleifer
566b083eb1
TFMarian, TFMbart, TFPegasus, TFBlenderbot (#7987)
* Start plumbing

* Marian close

* Small stubs for all children

* Fixed bart

* marian working

* pegasus test is good, but failing

* Checkin tests

* More model files

* Subtle marian, pegasus integration test failures

* Works well

* rm print

* boom boom

* Still failing model2doc

* merge master

* Equivalence test failing, all others fixed

* cleanup

* Fix embed_scale

* Cleanup marian pipeline test

* Undo extra changes

* Smaller delta

* Cleanup model testers

* undo delta

* fix tests import structure

* cross test decorator

* Cleaner set_weights

* Respect authorized_unexpected_keys

* No warnings

* No warnings

* style

* Nest tf import

* black

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* functional dropout

* fixup

* Fixup

* style_doc

* embs

* shape list

* delete slow force_token_id_to_be_generated func

* fixup

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-30 11:23:16 -04:00
Santiago Castro
969859d5f6
Fix doc errors and typos across the board (#8139)
* Fix doc errors and typos across the board

* Fix a typo

* Fix the CI

* Fix more typos

* Fix CI

* More fixes

* Fix CI

* More fixes

* More fixes
2020-10-29 10:33:33 -04:00
Sylvain Gugger
6241c873cd
Document the various LM Auto models (#8118) 2020-10-28 13:41:56 -04:00
Stas Bekman
5423f2a9d4
[testing] port test_trainer_distributed to distributed pytest + TestCasePlus enhancements (#8107)
* move the helper code into testing_utils

* port test_trainer_distributed to work with pytest

* improve docs

* simplify notes

* doc

* doc

* style

* doc

* further improvements

* torch might not be available

* real fix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-28 11:51:32 -04:00
Davide Fiocco
995006eabb
Add AzureML in integrations via dedicated callback (#8062)
* first attempt to add AzureML callbacks

* func arg fix

* var name fix, but still won't fix error...

* fixing as in https://discuss.huggingface.co/t/how-to-integrate-an-azuremlcallback-for-logging-in-azure/1713/2

* Avoid lint check of azureml import

* black compliance

* Make isort happy

* Fix point typo in docs

* Add AzureML to Callbacks docs

* Attempt to make sphinx happy

* Format callback docs

* Make documentation style happy

* Make docs compliant to style

Co-authored-by: Davide Fiocco <davide.fiocco@frontiersin.net>
2020-10-27 14:21:54 -04:00
Sylvain Gugger
08f534d2da
Doc styling (#8067)
* Important files

* Styling them all

* Revert "Styling them all"

This reverts commit 7d029395fd.

* Syling them for realsies

* Fix syntax error

* Fix benchmark_utils

* More fixes

* Fix modeling auto and script

* Remove new line

* Fixes

* More fixes

* Fix more files

* Style

* Add FSMT

* More fixes

* More fixes

* More fixes

* More fixes

* Fixes

* More fixes

* More fixes

* Last fixes

* Make sphinx happy
2020-10-26 18:26:02 -04:00
Sylvain Gugger
04a17f8550
Doc fixes in preparation for the docstyle PR (#8061)
* Fixes in preparation for doc styling

* More fixes

* Better syntax

* Fixes

* Style

* More fixes

* More fixes
2020-10-26 15:01:09 -04:00
Yusuke Mori
a9ac1db276
Minor error fix of 'bart-large-cnn' details in the pretrained_models doc (#8053) 2020-10-26 11:05:16 -04:00
Samuel
fc2d6eac3c
Minor typo fixes to the preprocessing tutorial in the docs (#8046)
* Fix minor typos

Fix minor typos in the docs.

* Update docs/source/preprocessing.rst

Clearer data structure description.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-26 10:22:29 -04:00
noise-field
c48b16b8da
Mlflow integration callback (#8016)
* Add MLflow integration class

Add integration code for MLflow in integrations.py along with the code
that checks that MLflow is installed.

* Add MLflowCallback import

Add import of MLflowCallback in trainer.py

* Handle model argument

Allow the callback to handle model argument and store model config items as hyperparameters.

* Log parameters to MLflow in batches

MLflow cannot log more than a hundred parameters at once.
Code added to split the parameters into batches of 100 items and log the batches one by one.

* Fix style

* Add docs on MLflow callback

* Fix issue with unfinished runs

The "fluent" api used in MLflow integration allows only one run to be active at any given moment. If the Trainer is disposed off and a new one is created, but the training is not finished, it will refuse to log the results when the next trainer is created.

* Add MLflow integration class

Add integration code for MLflow in integrations.py along with the code
that checks that MLflow is installed.

* Add MLflowCallback import

Add import of MLflowCallback in trainer.py

* Handle model argument

Allow the callback to handle model argument and store model config items as hyperparameters.

* Log parameters to MLflow in batches

MLflow cannot log more than a hundred parameters at once.
Code added to split the parameters into batches of 100 items and log the batches one by one.

* Fix style

* Add docs on MLflow callback

* Fix issue with unfinished runs

The "fluent" api used in MLflow integration allows only one run to be active at any given moment. If the Trainer is disposed off and a new one is created, but the training is not finished, it will refuse to log the results when the next trainer is created.
2020-10-26 09:41:58 -04:00
Stas Bekman
101186bc1f
[docs] [testing] distributed training (#7993)
* distributed training

* fix

* fix formatting

* wording
2020-10-26 08:15:05 -04:00
Samuel
9aa2826687
Minor typo fixes to the tokenizer summary (#8045)
Minor typo fixes to the tokenizer summary
2020-10-26 08:08:33 -04:00
Stas Bekman
8348105692
[testing] slow tests should be marked as slow (#7895)
* slow tests should be slow

* exception note

* style

* integrate LysandreJik's notes with some expansions

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* another slow test

* fix link, and prose

* clarify.

* note from Sam

* typo

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-22 06:34:05 -04:00
Sam Shleifer
829842159e
Add TFBartForConditionalGeneration (#5411)
* half done

* doc improvement

* Cp test file

* brokedn

* broken test

* undo some mess

* ckpt

* borked

* Halfway

* 6 passing

* boom boom

* Much progress but still 6

* boom boom

* merged master

* 10 passing

* boom boom

* Style

* no t5 changes

* 13 passing

* Integration test failing, but not gibberish

* Frustrated

* Merged master

* 4 fail

* 4 fail

* fix return_dict

* boom boom

* Still only 4

* prepare method

* prepare method

* before delete classif

* Skip tests to avoid adding boilerplate

* boom boom

* fast tests passing

* style

* boom boom

* Switch to supporting many input types

* remove FIXMENORM

* working

* Fixed past_key_values/decoder_cached_states confusion

* new broken test

* Fix attention mask kwarg name

* undo accidental

* Style and reviewers

* style

* Docs and common tests

* Cleaner assert messages

* copy docs

* style issues

* Sphinx fix

* Simplify caching logic

* test does not require torch

* copy _NoLayerEmbedTokens

* Update src/transformers/modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update tests/test_modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_bart.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Line length and dont document None

* Add pipeline test coverage

* assert msg

* At parity

* Assert messages

* mark slow

* Update compile test

* back in init

* Merge master

* Fix tests

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-21 13:10:16 +02:00
Joe Davison
13842e413c
PPL guide minor code snippet fix (#7938) 2020-10-20 16:17:39 -06:00
Lysandre Debut
96f4828ace
Respect the 119 line chars (#7928) 2020-10-20 11:02:47 -04:00
Lysandre
ef0ac063c9 Docs for v3.4.0 2020-10-20 16:29:00 +02:00
Lysandre
eb0e0ce2ad Release: v3.4.0 2020-10-20 16:22:26 +02:00
Patrick von Platen
ffd675b42c
add summary (#7927) 2020-10-20 10:11:02 -04:00
Lysandre Debut
5547b40b13
labels and decoder_input_ids to Glossary (#7906)
* labels and decoder_input_ids to Glossary

* Formatting fixes

* Update docs/source/glossary.rst

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* sam's comments

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-10-20 09:50:47 -04:00
Stas Bekman
3e31e7f956
[testing] rename skip targets + docs (#7863)
* rename skip targets + docs

* fix quotes

* style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* small improvements

* fix

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-20 04:39:13 -04:00
Patrick von Platen
e3d2bee8d0
fix t5 training docstring (#7911) 2020-10-19 21:49:47 +02:00
Quentin Lhoest
033f29c625
Allow Custom Dataset in RAG Retriever (#7763)
* add CustomHFIndex

* typo in config

* update tests

* add custom dataset example

* clean script

* update test data

* minor in test

* docs

* docs

* style

* fix imports

* allow to pass the indexed dataset directly

* update tests

* use multiset DPR

* address thom and patrick's comments

* style

* update dpr tokenizer

* add output_dir flag in use_own_knowledge_dataset.py

* allow custom datasets in examples/rag/finetune.py

* add test for custom dataset in distributed rag retriever
2020-10-19 19:42:45 +02:00
Weizhen
2422cda01b
ProphetNet (#7157)
* add new model prophetnet

prophetnet modified

modify codes as suggested v1

add prophetnet test files

* still bugs, because of changed output formats of encoder and decoder

* move prophetnet into the latest version

* clean integration tests

* clean tokenizers

* add xlm config to init

* correct typo in init

* further refactoring

* continue refactor

* save parallel

* add decoder_attention_mask

* fix use_cache vs. past_key_values

* fix common tests

* change decoder output logits

* fix xlm tests

* make common tests pass

* change model architecture

* add tokenizer tests

* finalize model structure

* no weight mapping

* correct n-gram stream attention mask as discussed with qweizhen

* remove unused import

* fix index.rst

* fix tests

* delete unnecessary code

* add fast integration test

* rename weights

* final weight remapping

* save intermediate

* Descriptions for Prophetnet Config File

* finish all models

* finish new model outputs

* delete unnecessary files

* refactor encoder layer

* add dummy docs

* code quality

* fix tests

* add model pages to doctree

* further refactor

* more refactor, more tests

* finish code refactor and tests

* remove unnecessary files

* further clean up

* add docstring template

* finish tokenizer doc

* finish prophetnet

* fix copies

* fix typos

* fix tf tests

* fix fp16

* fix tf test 2nd try

* fix code quality

* add test for each model

* merge new tests to branch

* Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update src/transformers/modeling_prophetnet.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Update utils/check_repo.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* apply sams and sylvains comments

* make style

* remove unnecessary code

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/configuration_prophetnet.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* implement lysandres comments

* correct docs

* fix isort

* fix tokenizers

* fix copies

Co-authored-by: weizhen <weizhen@mail.ustc.edu.cn>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-19 17:36:09 +02:00
Stas Bekman
4eb61f8e88
remove USE_CUDA (#7861) 2020-10-19 07:08:34 -04:00
Thomas Wolf
ba8c4d0ac0
[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659)
* splitting fast and slow tokenizers [WIP]

* [WIP] splitting sentencepiece and tokenizers dependencies

* update dummy objects

* add name_or_path to models and tokenizers

* prefix added to file names

* prefix

* styling + quality

* spliting all the tokenizer files - sorting sentencepiece based ones

* update tokenizer version up to 0.9.0

* remove hard dependency on sentencepiece 🎉

* and removed hard dependency on tokenizers 🎉

* update conversion script

* update missing models

* fixing tests

* move test_tokenization_fast to main tokenization tests - fix bugs

* bump up tokenizers

* fix bert_generation

* update ad fix several tokenizers

* keep sentencepiece in deps for now

* fix funnel and deberta tests

* fix fsmt

* fix marian tests

* fix layoutlm

* fix squeezebert and gpt2

* fix T5 tokenization

* fix xlnet tests

* style

* fix mbart

* bump up tokenizers to 0.9.2

* fix model tests

* fix tf models

* fix seq2seq examples

* fix tests without sentencepiece

* fix slow => fast  conversion without sentencepiece

* update auto and bert generation tests

* fix mbart tests

* fix auto and common test without tokenizers

* fix tests without tokenizers

* clean up tests lighten up when tokenizers + sentencepiece are both off

* style quality and tests fixing

* add sentencepiece to doc/examples reqs

* leave sentencepiece on for now

* style quality split hebert and fix pegasus

* WIP Herbert fast

* add sample_text_no_unicode and fix hebert tokenization

* skip FSMT example test for now

* fix style

* fix fsmt in example tests

* update following Lysandre and Sylvain's comments

* Update src/transformers/testing_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/testing_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-18 20:51:24 +02:00
Katarina Slama
dfa4c26bc0
Typo and fix the input of labels to cross_entropy (#7841)
The current version caused some errors. The changes fixed it for me. Hope this is helpful!
2020-10-15 19:36:31 -04:00
Sylvain Gugger
a1d1b332d0
Add predict step accumulation (#7767)
* Add eval_accumulation_step and clean distributed eval

* Add TPU test

* Add TPU stuff

* Fix arg name

* Fix Seq2SeqTrainer

* Fix total_size

* Update src/transformers/trainer_pt_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Doc and add test to TPU

* Add unit test

* Adapt name

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-14 11:41:45 -04:00
Tiger
7e73c12805
fixed lots of typos. (#7758) 2020-10-13 10:00:20 -04:00
Felipe Curti
dcba9ee03b
Gpt1 for sequence classification (#7683)
* Add Documentation for GPT-1 Classification

* Add GPT-1 with Classification head

* Add tests for GPT-1 Classification

* Add GPT-1 For Classification to auto models

* Remove authorized missing keys, change checkpoint to openai-gpt
2020-10-13 05:06:15 -04:00
Sylvain Gugger
2c9e83f7b8
Fix title level in Blenderbot doc (#7687) 2020-10-09 19:24:10 -04:00
Sylvain Gugger
a3cea6a8cc
Better links for models in READMED and doc index (#7680) 2020-10-09 11:17:16 -04:00
sgugger
bc00b37a0d Revert "Better model links in the README and index"
This reverts commit 76e05518bb.
2020-10-09 10:56:13 -04:00
sgugger
76e05518bb Better model links in the README and index 2020-10-09 10:54:40 -04:00
Noah Trenaman
5668fdb09e
Update XLM-RoBERTa details (#7669) 2020-10-09 05:16:58 -04:00
Thomas Wolf
9aeacb58ba
Adding Fast tokenizers for SentencePiece based tokenizers - Breaking: remove Transfo-XL fast tokenizer (#7141)
* [WIP] SP tokenizers

* fixing tests for T5

* WIP tokenizers

* serialization

* update T5

* WIP T5 tokenization

* slow to fast conversion script

* Refactoring to move tokenzier implementations inside transformers

* Adding gpt - refactoring - quality

* WIP adding several tokenizers to the fast world

* WIP Roberta - moving implementations

* update to dev4 switch file loading to in-memory loading

* Updating and fixing

* advancing on the tokenizers - updating do_lower_case

* style and quality

* moving forward with tokenizers conversion and tests

* MBart, T5

* dumping the fast version of transformer XL

* Adding to autotokenizers + style/quality

* update init and space_between_special_tokens

* style and quality

* bump up tokenizers version

* add protobuf

* fix pickle Bert JP with Mecab

* fix newly added tokenizers

* style and quality

* fix bert japanese

* fix funnel

* limite tokenizer warning to one occurence

* clean up file

* fix new tokenizers

* fast tokenizers deep tests

* WIP adding all the special fast tests on the new fast tokenizers

* quick fix

* adding more fast tokenizers in the fast tests

* all tokenizers in fast version tested

* Adding BertGenerationFast

* bump up setup.py for CI

* remove BertGenerationFast (too early)

* bump up tokenizers version

* Clean old docstrings

* Typo

* Update following Lysandre comments

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2020-10-08 11:32:16 +02:00
Sam Shleifer
960faaaf28
Blenderbot (#7418)
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-10-07 19:09:23 -04:00
Sylvain Gugger
08ba4b4902
Trainer callbacks (#7596)
* Initial callback proposal

* Finish various callbacks

* Post-rebase conflicts

* Fix tests

* Don't use something that's not set

* Documentation

* Remove unwanted print.

* Document all models can work

* Add tests + small fixes

* Update docs/source/internal/trainer_utils.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Fix TF tests

* Real fix this time

* This one should work

* Fix typo

* Really fix typo

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-10-07 10:50:21 -04:00
Lysandre Debut
5982431814
Add GPT2ForSequenceClassification based on DialogRPT (#7501)
* Add GPT2ForSequenceClassification based on DialogRPT

* Better documentation

* Code quality
2020-10-06 17:31:21 -04:00
Lysandre Debut
0257992e4a
Fix squeezebert docs (#7587)
* Configuration

* Modeling

* Tokenization

* Obliterate the trailing spaces

* From underlines to long underlines
2020-10-06 06:22:04 -04:00
Lysandre Debut
818c294fdd
The toggle actually sticks (#7586) 2020-10-05 11:23:57 -04:00
Sylvain Gugger
b2b7fc7814
Check and update model list in index.rst automatically (#7527)
* Check and update model list in index.rst automatically

* Check and update model list in index.rst automatically

* Adapt template
2020-10-05 09:40:45 -04:00
Amine Abdaoui
0d79de7322
docs(pretrained_models): fix num parameters (#7575)
* docs(pretrained_models): fix num parameters

* fix(pretrained_models): correct typo

Co-authored-by: Amin <amin.geotrend@gmail.com>
2020-10-05 07:50:56 -04:00
Forrest Iandola
02ef825be2
SqueezeBERT architecture (#7083)
* configuration_squeezebert.py

thin wrapper around bert tokenizer

fix typos

wip sb model code

wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working

set up squeezebert to use BertModelOutput when returning results.

squeezebert documentation

formatting

allow head mask that is an array of [None, ..., None]

docs

docs cont'd

path to vocab

docs and pointers to cloud files (WIP)

line length and indentation

squeezebert model cards

formatting of model cards

untrack modeling_squeezebert_scratchpad.py

update aws paths to vocab and config files

get rid of stub of NSP code, and advise users to pretrain with mlm only

fix rebase issues

redo rebase of modeling_auto.py

fix issues with code formatting

more code format auto-fixes

move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert

tests for squeezebert modeling and tokenization

fix typo

move squeezebert before bert in modeling_auto.py to fix inheritance problem

disable test_head_masking, since squeezebert doesn't yet implement head masking

fix issues exposed by the test_modeling_squeezebert.py

fix an issue exposed by test_tokenization_squeezebert.py

fix issue exposed by test_modeling_squeezebert.py

auto generated code style improvement

issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()

update copyright

resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask

docs

add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli

autogenerated formatting tweaks

integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings

* tiny change to order of imports
2020-10-05 04:25:43 -04:00
Sylvain Gugger
e2c935f561
Cleanup documentation for BART, Marian, MBART and Pegasus (#7523)
* Cleanup documentation for BART, Marian, MBART and Pegasus

* Cleanup documentation for BART, Marian, MBART and Pegasus
2020-10-05 04:22:12 -04:00
Alexandr
9a92afb6d0
Update LayoutLM doc (#7388)
Co-authored-by: Alexandr Maslov <avmaslov3@gmail.com>
2020-10-01 09:11:42 -04:00
Sylvain Gugger
be51c1039d
Add forgotten return_dict argument in the docs (#7483) 2020-10-01 04:41:29 -04:00
Sylvain Gugger
dc7d2daa4c
Alphabetize model lists (#7478) 2020-09-30 10:43:58 -04:00
François REMY
cc4eff8087
Make transformers install check positive (#7473)
When transformers is correctly installed, you should get a positive message ^_^
2020-09-30 07:44:40 -04:00
Pengcheng He
7a0cf0ec93
Add DeBERTa model (#5929)
* Add DeBERTa model

* Remove dependency of deberta

* Address comments

* Patch DeBERTa
Documentation
Style

* Add final tests

* Style

* Enable tests + nitpicks

* position IDs

* BERT -> DeBERTa

* Quality

* Style

* Tokenization

* Last updates.

* @patrickvonplaten's comments

* Not everything can be a copy

* Apply most of @sgugger's review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Last reviews

* DeBERTa -> Deberta

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-30 07:07:30 -04:00
Sylvain Gugger
a1c2ef7bd0 Add documentation for v3.3.1 2020-09-29 14:31:43 -04:00
Sylvain Gugger
1ba08dc221 Release: v3.3.1 2020-09-29 14:17:34 -04:00
Lysandre
16c213820e Update docs to version v3.3.0 2020-09-28 16:32:00 +02:00
Lysandre
0613f05226 Release: v3.3.0 2020-09-28 16:24:43 +02:00
Sylvain Gugger
ca3fc36de3
Reorganize documentation navbar (#7423)
* Reorganize documentation navbar

* Update css to have clear sections
2020-09-28 16:22:58 +02:00
Sylvain Gugger
0611eab5e3
Document RAG again (#7377)
Do not merge before Monday
2020-09-28 08:31:46 -04:00
Boris Dayma
1749ca317e
docs: fix model sharing file names (#5855)
* docs: fix model sharing file names

* Update docs/source/model_sharing.rst

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* docs(model_sharing.rst): fix new line

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-28 08:17:30 -04:00
Sylvain Gugger
a8e7982f84
Remove mentions of RAG from the docs (#7376)
* Remove mentions of  RAG from the docs

* Deactivate check
2020-09-24 17:07:14 -04:00
Lysandre Debut
8d3bb781ee
Formatter (#7368)
* Formatter

* Docs
2020-09-24 10:59:21 -04:00
Sylvain Gugger
0ccb6f5c6d
Clean RAG docs and template docs (#7348)
* Clean RAG docs and template docs

* Fix typo

* Better doc
2020-09-24 09:24:41 -04:00
Sylvain Gugger
3323146e90
Models doc (#7345)
* Clean up model documentation

* Formatting

* Preparation work

* Long lines

* Main work on rst files

* Cleanup all config files

* Syntax fix

* Clean all tokenizers

* Work on first models

* Models beginning

* FaluBERT

* All PyTorch models

* All models

* Long lines again

* Fixes

* More fixes

* Update docs/source/model_doc/bert.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update docs/source/model_doc/electra.rst

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Last fixes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-23 13:20:45 -04:00
Stas Bekman
28cf873036
[testing] skip decorators: docs, tests, bugs (#7334)
* skip decorators: docs, tests, bugs

* another important note

* style

* bloody style

* add @pytest.mark.parametrize

* add note

* no idea what it wants :(
2020-09-23 05:16:19 -04:00
Ola Piktus
c754c41c61
RAG (#6813)
* added rag WIP

* path fix

* Formatting / renaming prior to actual work

* added rag WIP

* path fix

* Formatting / renaming prior to actual work

* added rag WIP

* path fix

* Formatting / renaming prior to actual work

* added rag WIP

* Formatting / renaming prior to actual work

* First commit

* improve comments

* Retrieval evaluation scripts

* refactor to include modeling outputs + MPI retriever

* Fix rag-token model + refactor

* Various fixes + finetuning logic

* use_bos fix

* Retrieval refactor

* Finetuning refactoring and cleanup

* Add documentation and cleanup

* Remove set_up_rag_env.sh file

* Fix retrieval wit HF index

* Fix import errors

* Fix quality errors

* Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867

* fix quality

* Fix RAG Sequence generation

* minor cleanup plus initial tests

* fix test

* fix tests 2

* Comments fix

* post-merge fixes

* Improve readme + post-rebase refactor

* Extra dependencied for tests

* Fix tests

* Fix tests 2

* Refactor test requirements

* Fix tests 3

* Post-rebase refactor

* rename nlp->datasets

* RAG integration tests

* add tokenizer to slow integration test and allow retriever to run on cpu

* add tests; fix position ids warning

* change structure

* change structure

* add from encoder generator

* save working solution

* make all integration tests pass

* add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained

* don't save paths

* delete unnecessary imports

* pass config to AutoTokenizer.from_pretrained for Rag tokenizers

* init wiki_dpr only once

* hardcode legacy index and passages paths (todo: add the right urls)

* finalize config

* finalize retriver api and config api

* LegacyIndex index download refactor

* add dpr to autotokenizer

* make from pretrained more flexible

* fix ragfortokengeneration

* small name changes in tokenizer

* add labels to models

* change default index name

* add retrieval tests

* finish token generate

* align test with previous version and make all tests pass

* add tests

* finalize tests

* implement thoms suggestions

* add first version of test

* make first tests work

* make retriever platform agnostic

* naming

* style

* add legacy index URL

* docstrings + simple retrieval test for distributed

* clean model api

* add doc_ids to retriever's outputs

* fix retrieval tests

* finish model outputs

* finalize model api

* fix generate problem for rag

* fix generate for other modles

* fix some tests

* save intermediate

* set generate to default

* big refactor generate

* delete rag_api

* correct pip faiss install

* fix auto tokenization test

* fix faiss install

* fix test

* move the distributed logic to examples

* model page

* docs

* finish tests

* fix dependencies

* fix import in __init__

* Refactor eval_rag and finetune scripts

* start docstring

* add psutil to test

* fix tf test

* move require torch to top

* fix retrieval test

* align naming

* finish automodel

* fix repo consistency

* test ragtokenizer save/load

* add rag model output docs

* fix ragtokenizer save/load from pretrained

* fix tokenizer dir

* remove torch in retrieval

* fix docs

* fixe finetune scripts

* finish model docs

* finish docs

* remove auto model for now

* add require torch

* remove solved todos

* integrate sylvains suggestions

* sams comments

* correct mistake on purpose

* improve README

* Add generation test cases

* fix rag token

* clean token generate

* fix test

* add note to test

* fix attention mask

* add t5 test for rag

* Fix handling prefix in finetune.py

* don't overwrite index_name

Co-authored-by: Patrick Lewis <plewis@fb.com>
Co-authored-by: Aleksandra Piktus <piktus@devfair0141.h2.fair>
Co-authored-by: Aleksandra Piktus <piktus@learnfair5102.h2.fair>
Co-authored-by: Aleksandra Piktus <piktus@learnfair5067.h2.fair>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
2020-09-22 18:29:58 +02:00
Lysandre
6e21f24220 Documentation version 2020-09-22 18:04:39 +02:00
Lysandre
3ebb1b3a2b Release: v3.2.0 2020-09-22 17:36:51 +02:00
Sylvain Gugger
21ca148090
is_pretokenized -> is_split_into_words (#7236)
* is_pretokenized -> is_split_into_words

* Fix tests
2020-09-22 09:34:35 -04:00
Minghao Li
cd9a0585ea
Add LayoutLM Model (#7064)
* first version

* finish test docs readme model/config/tokenization class

* apply make style and make quality

* fix layoutlm GitHub link

* fix conflict in index.rst and add layoutlm to pretrained_models.rst

* fix bug in test_parents_and_children_in_mappings

* reformat modeling_auto.py and tokenization_auto.py

* fix bug in test_modeling_layoutlm.py

* Update docs/source/model_doc/layoutlm.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/layoutlm.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove inh, add tokenizer fast, and update some doc

* copy and rename necessary class from modeling_bert to modeling_layoutlm

* Update src/transformers/configuration_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/configuration_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/configuration_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/configuration_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_layoutlm.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* add mish to activations.py, import ACT2FN and import logging from utils

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-22 09:28:02 -04:00
Stas Bekman
47ab3e8262
@slow has to be last (#7251)
Found an issue when `@slow` isn't the last decorator (gets ignored!), so documenting this significance.
2020-09-20 09:17:29 -04:00
Stas Bekman
1eeb206bef
[ported model] FSMT (FairSeq MachineTranslation) (#6940)
* ready for PR

* cleanup

* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST

* fix

* perfectionism

* revert change from another PR

* odd, already committed this one

* non-interactive upload workaround

* backup the failed experiment

* store langs in config

* workaround for localizing model path

* doc clean up as in https://github.com/huggingface/transformers/pull/6956

* style

* back out debug mode

* document: run_eval.py --num_beams 10

* remove unneeded constant

* typo

* re-use bart's Attention

* re-use EncoderLayer, DecoderLayer from bart

* refactor

* send to cuda and fp16

* cleanup

* revert (moved to another PR)

* better error message

* document run_eval --num_beams

* solve the problem of tokenizer finding the right files when model is local

* polish, remove hardcoded config

* add a note that the file is autogenerated to avoid losing changes

* prep for org change, remove unneeded code

* switch to model4.pt, update scores

* s/python/bash/

* missing init (but doesn't impact the finetuned model)

* cleanup

* major refactor (reuse-bart)

* new model, new expected weights

* cleanup

* cleanup

* full link

* fix model type

* merge porting notes

* style

* cleanup

* have to create a DecoderConfig object to handle vocab_size properly

* doc fix

* add note (not a public class)

* parametrize

* - add bleu scores integration tests

* skip test if sacrebleu is not installed

* cache heavy models/tokenizers

* some tweaks

* remove tokens that aren't used

* more purging

* simplify code

* switch to using decoder_start_token_id

* add doc

* Revert "major refactor (reuse-bart)"

This reverts commit 226dad15ca.

* decouple from bart

* remove unused code #1

* remove unused code #2

* remove unused code #3

* update instructions

* clean up

* move bleu eval to examples

* check import only once

* move data+gen script into files

* reuse via import

* take less space

* add prepare_seq2seq_batch (auto-tested)

* cleanup

* recode test to use json instead of yaml

* ignore keys not needed

* use the new -y in transformers-cli upload -y

* [xlm tok] config dict: fix str into int to match definition (#7034)

* [s2s] --eval_max_generate_length (#7018)

* Fix CI with change of name of nlp (#7054)

* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last

* extending to support allen_nlp wmt models

- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models

* sync with changes

* s/fsmt-wmt/wmt/ in model names

* s/fsmt-wmt/wmt/ in model names (p2)

* s/fsmt-wmt/wmt/ in model names (p3)

* switch to a better checkpoint

* typo

* make non-optional args such - adjust tests where possible or skip when there is no other choice

* consistency

* style

* adjust header

* cards moved (model rename)

* use best custom hparams

* update info

* remove old cards

* cleanup

* s/stas/facebook/

* update scores

* s/allen_nlp/allenai/

* url maps aren't needed

* typo

* move all the doc / build /eval generators to their own scripts

* cleanup

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix indent

* duplicated line

* style

* use the correct add_start_docstrings

* oops

* resizing can't be done with the core approach, due to 2 dicts

* check that the arg is a list

* style

* style

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-17 11:31:29 -04:00
Stas Bekman
f8590c56e6
[doc] improve/expand the Parametrization section (#7156) 2020-09-16 08:45:50 -04:00
Stas Bekman
b00cafbde5
[docs] add testing documentation (#7101)
* [docs] add testing documentation

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* tweaks as suggested

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* tweaks

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more tweaks

* suggestions from @LysandreJik

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-15 19:25:25 -04:00
sgugger
5636cbb25d Extra ) 2020-09-14 09:37:55 -04:00
Sylvain Gugger
ccc8e30c8a
Clean up autoclass doc (#7081) 2020-09-14 09:26:41 -04:00
Bartosz Telenczuk
15d18e0307
fix link to paper (#7116) 2020-09-14 07:43:40 -04:00
Sylvain Gugger
4cbd50e611
Compute loss method (#7074) 2020-09-11 12:06:31 -04:00
Sylvain Gugger
e841b75dec
Automate the lists in auto-xxx docs (#7061)
* More readable dict

* More nlp -> datasets

* Revert "More nlp -> datasets"

This reverts commit 3cd1883d22.

* Automate the lists in auto-xxx docs

* More readable dict

* Revert "More nlp -> datasets"

This reverts commit 3cd1883d22.

* Automate the lists in auto-xxx docs

* nlp -> datasets

* Fix new key
2020-09-11 10:42:09 -04:00
Patrick von Platen
db38f7ce29
[BertGeneration, Docs] Fix another old name in docs (#7050)
* correct docs for bert generation

* upload
2020-09-10 17:12:33 +02:00
Patrick von Platen
3bd95b0faf
correct docs for bert generation (#7048) 2020-09-10 17:08:40 +02:00
Sylvain Gugger
15a189049e
Add TF Funnel Transformer (#7029)
* Add TF Funnel Transformer

* Proper dummy input

* Formatting

* Update src/transformers/modeling_tf_funnel.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* One review comment forgotten

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-10 10:41:56 -04:00
Patrick von Platen
7fd1febf38
Add "Leveraging Pretrained Checkpoints for Generation" Seq2Seq models. (#6594)
* add conversion script

* improve conversion script

* make style

* add tryout files

* fix

* update

* add causal bert

* better names

* add tokenizer file as well

* finish causal_bert

* fix small bugs

* improve generate

* change naming

* renaming

* renaming

* renaming

* remove leftover files

* clean files

* add fix tokenizer

* finalize

* correct slow test

* update docs

* small fixes

* fix link

* adapt check repo

* apply sams and sylvains recommendations

* fix import

* implement Lysandres recommendations

* fix logger warn
2020-09-10 16:40:51 +02:00
Stas Bekman
4ee1053dcf
add -y to bypass prompt for transformers-cli upload (#7035) 2020-09-10 04:58:29 -04:00
Stas Bekman
d0963486c1
adding TRANSFORMERS_VERBOSITY env var (#6961)
* introduce TRANSFORMERS_VERBOSITY env var + test + test helpers

* cleanup

* remove helper function
2020-09-09 04:08:01 -04:00
Sam Shleifer
f0fc0aea6b
pegasus.rst: fix expected output (#7017) 2020-09-08 13:29:16 -04:00
Sylvain Gugger
d155b38d6e
Funnel transformer (#6908)
* Initial model

* Fix upsampling

* Add special cls token id and test

* Formatting

* Test and fist FunnelTokenizerFast

* Common tests

* Fix the check_repo script and document Funnel

* Doc fixes

* Add all models

* Write doc

* Fix test

* Initial model

* Fix upsampling

* Add special cls token id and test

* Formatting

* Test and fist FunnelTokenizerFast

* Common tests

* Fix the check_repo script and document Funnel

* Doc fixes

* Add all models

* Write doc

* Fix test

* Fix copyright

* Forgot some layers can be repeated

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/modeling_funnel.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Update src/transformers/modeling_funnel.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments

* Update src/transformers/modeling_funnel.py

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

* Slow integration test

* Make small integration test

* Formatting

* Add checkpoint and separate classification head

* Formatting

* Expand list, fix link and add in pretrained models

* Styling

* Add the model in all summaries

* Typo fixes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-08 08:08:08 -04:00
Antonio V Mendoza
ea2c6f1afc
Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793)
* added template files for LXMERT and competed the configuration_lxmert.py

* added modeling, tokization, testing, and finishing touched for lxmert [yet to be tested]

* added model card for lxmert

* cleaning up lxmert code

* Update src/transformers/modeling_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_lxmert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* tested torch lxmert, changed documtention, updated outputs, and other small fixes

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* renaming, other small issues, did not change TF code in this commit

* added lxmert question answering model in pytorch

* added capability to edit number of qa labels for lxmert

* made answer optional for lxmert question answering

* add option to return hidden_states for lxmert

* changed default qa labels for lxmert

* changed config archive path

* squshing 3 commits: merged UI + testing improvments + more UI and testing

* changed some variable names for lxmert

* TF LXMERT

* Various fixes to LXMERT

* Final touches to LXMERT

* AutoTokenizer order

* Add LXMERT to index.rst and README.md

* Merge commit test fixes + Style update

* TensorFlow 2.3.0 sequential model changes variable names

Remove inherited test

* Update src/transformers/modeling_tf_pytorch_utils.py

* Update docs/source/model_doc/lxmert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/lxmert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_lxmert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* added suggestions

* Fixes

* Final fixes for TF model

* Fix docs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-03 04:02:25 -04:00
Suraj Patil
4230d30f77
[pipelines] Text2TextGenerationPipeline (#6744)
* add Text2TextGenerationPipeline

* remove max length warning

* remove comments

* remove input_length

* fix typo

* add tests

* use TFAutoModelForSeq2SeqLM

* doc

* typo

* add the doc below TextGenerationPipeline

* doc nit

* style

* delete comment
2020-09-02 07:34:35 -04:00
Harry Wang
ee1bff06f8
minor docs grammar fixes (#6889) 2020-09-02 06:45:19 -04:00
Patrick von Platen
4d1a3ffde8
[EncoderDecoder] Add xlm-roberta to encoder decoder (#6878)
* finish xlm-roberta

* finish docs

* expose XLMRobertaForCausalLM
2020-09-01 21:56:39 +02:00
Lysandre Debut
1461aac8d7
Update docs stable version 2020-09-01 11:02:24 -04:00
Lysandre
3726754a6c v3.1.0 documentation 2020-09-01 14:39:07 +02:00
Lysandre
4b3ee9cbc5 Release: v3.1.0 2020-09-01 14:27:52 +02:00
Patrick von Platen
afc4ece462
[Generate] Facilitate PyTorch generate using ModelOutputs (#6735)
* fix generate for GPT2 Double Head

* fix gpt2 double head model

* fix  bart / t5

* also add for no beam search

* fix no beam search

* fix encoder decoder

* simplify t5

* simplify t5

* fix t5 tests

* fix BART

* fix transfo-xl

* fix conflict

* integrating sylvains and sams comments

* fix tf past_decoder_key_values

* fix enc dec test
2020-09-01 12:38:25 +02:00
Sylvain Gugger
d5f1ffa0d8
Logging doc (#6852)
* Add logging doc

* Foamtting

* Update docs/source/main_classes/logging.rst

* Update src/transformers/utils/logging.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-01 03:16:34 -04:00