Commit Graph

637 Commits

Author SHA1 Message Date
Sylvain Gugger
e8246f78f9
Add auto_wrap option in fairscale integration (#10673)
* Add auto_wrap option in fairscale integration

* Style
2021-03-12 07:50:20 -05:00
Nicolas Patry
543d0549f8
Adding new parameter to generate: max_time. (#9846)
* [WIP] Adding new parameter to `generate`:  `max_time`.

Generation by tokens number is sometimes a bit clunky because we don't
know how many tokens are good enough or even how many tokens are in
the payload (for pipelines users for instance). This leads to hard
to understand behavior.

This PR proposes a new argument `max_time` which is a float of seconds
for the allowed time for `generate` to run on.
Ideally combinations of `max_tokens=None`, `max_time=2` could be used to
generate as many tokens as possible within time budget.

NB: Another possible approach consists of passing a callback to `generate`
  putting the caller in charge of the actual decision of when to stop
  generating tokens. It opens the door to 'which args should we pass'
  to this callback. It's hard to imagine other use-cases for this
  early stopping behavior than time (that are not already covered by
  parameters of generate)

* Revamp with StoppingCriteria

* Removing deprecated mentions.

* Forgot arguments to stopping criteria.

* Readding max_length it's not just used as a stopping criteria.

* Default value for `stopping_criteria`.

* Address @patrickvonplaten comments.

- More docstrings
- Actual doc
- Include in global namespace
- Remove TF work.

* Put back `max_length` (deprecation different PR).

* Doc quality.

* Fixing old behavior without `stopping_criteria` but with `max_length`.

Making sure we don't break that in the future.

* Adding more tests for possible inconsistencies between

`max_length` and `stopping_criteria`.

* Fixing the torch imports.
2021-03-12 10:11:50 +01:00
WybeKoper
2f8485199c
Fix broken link (#10656)
* Fixed broken link

* fixed max length violation

Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-11 14:29:02 -05:00
Suraj Patil
055ed78f52
[S2T] fix example in docs (#10667) 2021-03-11 22:43:37 +05:30
Patrick von Platen
602d63f05c
[XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models (#10648)
* add conversion script

* add wav2vec2 xslr models

* finish

* Update docs/source/model_doc/xlsr_wav2vec2.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-11 17:44:18 +03:00
Sylvain Gugger
26a33cfd8c
Document Trainer limitation on custom models (#10635) 2021-03-10 14:58:22 -05:00
Suraj Patil
d26b37e744
Speech2TextTransformer (#10175)
* s2t

* fix config

* conversion script

* fix import

* add tokenizer

* fix tok init

* fix tokenizer

* first version working

* fix embeds

* fix lm head

* remove extra heads

* fix convert script

* handle encoder attn mask

* style

* better enc attn mask

* override _prepare_attention_mask_for_generation

* handle attn_maks in encoder and decoder

* input_ids => input_features

* enable use_cache

* remove old code

* expand embeddings if needed

* remove logits bias

* masked_lm_loss => loss

* hack tokenizer to support feature processing

* fix model_input_names

* style

* fix error message

* doc

* remove inputs_embeds

* remove input_embeds

* remove unnecessary docstring

* quality

* SpeechToText => Speech2Text

* style

* remove shared_embeds

* subsample => conv

* remove Speech2TextTransformerDecoderWrapper

* update output_lengths formula

* fix table

* remove max_position_embeddings

* update conversion scripts

* add possibility to do upper case for now

* add FeatureExtractor and Processor

* add tests for extractor

* require_torch_audio => require_torchaudio

* add processor test

* update import

* remove classification head

* attention mask is now 1D

* update docstrings

* attention mask should be of type long

* handle attention mask from generate

* alwyas return attention_mask

* fix test

* style

* doc

* Speech2TextTransformer => Speech2Text

* Speech2TextTransformerConfig => Speech2TextConfig

* remove dummy_inputs

* nit

* style

* multilinguial tok

* fix tokenizer

* add tgt_lang setter

* save lang_codes

* fix tokenizer

* add forced_bos_token_id to tokenizer

* apply review suggestions

* add torchaudio to extra deps

* add speech deps to CI

* fix dep

* add libsndfile to ci

* libsndfile1

* add speech to extras all

* libsndfile1 -> libsndfile1

* libsndfile

* libsndfile1-dev

* apt update

* add sudo to install

* update deps table

* install libsndfile1-dev on CI

* tuple to list

* init conv layer

* add model tests

* quality

* add integration tests

* skip_special_tokens

* add speech_to_text_transformer in toctree

* fix tokenizer

* fix fp16 tests

* add tokenizer tests

* fix copyright

* input_values => input_features

* doc

* add model in readme

* doc

* change checkpoint names

* fix copyright

* fix code example

* add max_model_input_sizes in tokenizer

* fix integration tests

* add do_lower_case to tokenizer

* remove clamp trick

* fix "Add modeling imports here"

* fix copyrights

* fix tests

* SpeechToTextTransformer => SpeechToText

* fix naming

* fix table formatting

* fix typo

* style

* fix typos

* remove speech dep from extras[testing]

* fix copies

* rename doc file,

* put imports under is_torch_available

* run feat extract tests when torch is available

* dummy objects for processor and extractor

* fix imports in tests

* fix import in modeling test

* fxi imports

* fix torch import

* fix imports again

* fix positional embeddings

* fix typo in import

* adapt new extractor refactor

* style

* fix torchscript test

* doc

* doc

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix docs, copied from, style

* fix docstring

* handle imports

* remove speech from all extra deps

* remove s2t from seq2seq lm mapping

* better names

* skip training tests

* add install instructions

* List => Tuple

* doc

* fix conversion script

* fix urls

* add instruction for libsndfile

* fix fp16 test

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-10 21:42:04 +05:30
Patrick von Platen
9a06b6b11b
[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor (#10594)
* save first version

* finish refactor

* finish refactor

* correct naming

* correct naming

* shorter names

* Update src/transformers/feature_extraction_common_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* change name

* finish

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-09 12:16:59 +03:00
Ratthachat (Jung)
696e8a4365
Add TFRag (#9002)
* Create modeling_tf_dpr.py

* Add TFDPR

* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot

last commit accidentally deleted these 4 lines, so I recover them back

* Add TFDPR

* Add TFDPR

* clean up some comments, add TF input-style doc string

* Add TFDPR

* Make return_dict=False as default

* Fix return_dict bug (in .from_pretrained)

* Add get_input_embeddings()

* Create test_modeling_tf_dpr.py

The current version is already passed all 27 tests!
Please see the test run at : 
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing

* fix quality

* delete init weights

* run fix copies

* fix repo consis

* del config_class, load_tf_weights

They shoud be 'pytorch only'

* add config_class back

after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion

* newline after .. note::

* import tf, np (Necessary for ModelIntegrationTest)

* slow_test from_pretrained with from_pt=True

At the moment we don't have TF weights (since we don't have official official TF model)
Previously, I did not run slow test, so I missed this bug

* Add simple TFDPRModelIntegrationTest

Note that this is just a test that TF and Pytorch gives approx. the same output.
However, I could not test with the official DPR repo's output yet

* upload correct tf model

* remove position_ids as missing keys

* create modeling_tf_rag

* add tests for tf

* add tf tests

* revert wrong pt commit

* further refactor

* further refactor

* refactor

* Update modeling_tf_rag.py

- input_processing
- fix prepare_input_for_generation (mostly fix generate bug)
- bring back from_pretrained hack in order to test generate

* delete colab pieces of code

* Show case of greedy "generate"

Temporarily change from beam_search test to greedy_search test to show case that TF and PT do get equivalent output.

* cosmetic update

* correct typos

* update

* push some progress

* make easy check

* fix rag save from pretrained

* Update src/transformers/modeling_tf_utils.py

* remove commented out lines

* delete unnecessary lines

* add simple test case for nq_checkpoint

Add nq_checkpoint test to show that current version without hack still fails

* temporarily put ugly hack back again

* Add TFRagSequenceForGeneration!!

* __init__.py , import TFRagSequenceForGeneration

* Add TFRagSequence tests!

* rag init.py - add TFRagSequenceForGeneration

* fix from_pretrained

* fix prepare_inputs_for_generation

* Beam search for RagToken!

* minor clean up

* add tf.cast in TFRagModel

* More tf.cast

* Add all remaining tests (still have issues)

* delete all T5 related

* make style

* fix load weight prefix

* fix bart

* fix return_dict for tf_rag

make all tests pass .. Hooray

* fix some tests

* fix code quality

* fix qualtiy check

* finish tests tf rag

* add tf rag to docs

* remove TFT5 from docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove TFT5 from docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Delete outdated comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* improve doc strings

* add generative model classes

* fix adjust token logic

* refactor generate for TFRag

* using shape_list, not _get_shape

Co-authored-by: Julien Plu <plu.julien@gmail.com>

* axis=[1]->axis=1

* delete NEED_HELP comment

* improve readability

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve readability

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve readability

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Indicating model is in a developing state in docstrings

As suggested by Julien

* small last changes

* apply sylvains suggestions

* finish tf rag

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-09 00:49:51 +03:00
Sylvain Gugger
5469369480
Fix version control with anchors (#10595)
* Fix version control with anchors

* Simplify
2021-03-08 10:19:22 -05:00
Suraj Patil
f6e74a63ca
Add m2m100 (#10236)
* m2m_100

* no layernorm_embedding

* sinusoidal positional embeddings

* update pos embeddings

* add default config values

* tokenizer

* add conversion script

* fix config

* fix pos embed

* remove _float_tensor

* update tokenizer

* update lang codes

* handle lang codes

* fix pos embeds

* fix spm key

* put embedding weights on device

* remove qa and seq classification heads

* fix convert script

* lang codes pn one line

* fix embeds

* fix tokenizer

* fix tokenizer

* add fast tokenizer

* style

* M2M100MT => M2M100

* fix copyright, style

* tokenizer converter

* vocab file

* remove fast tokenizer

* fix embeds

* fix tokenizer

* fix tests

* add tokenizer tests

* add integration test

* quality

* fix model name

* fix test

* doc

* doc

* fix doc

* add copied from statements

* fix tokenizer tests

* apply review suggestions

* fix urls

* fix shift_tokens_right

* apply review suggestions

* fix

* fix doc

* add lang code to id

* remove unused function

* update checkpoint names

* fix copy

* fix tokenizer

* fix checkpoint names

* fix merge issue

* style
2021-03-06 22:14:16 +05:30
Stas Bekman
88a951e3cc
offline mode for firewalled envs (#10407)
* offline mode start

* add specific values

* fix fallback

* add test

* better values check and range

* test that actually works

* document the offline mode

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more strict check

* cleaner test

* pt-only test

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
lewtun
12b66215cf
Fix example of custom Trainer to reflect signature of compute_loss (#10537) 2021-03-05 07:44:53 -05:00
Sylvain Gugger
948b730f97
Remove unsupported methods from ModelOutput doc (#10505) 2021-03-03 14:55:18 -05:00
Jeff Yang
39f70a4058
feat(docs): navigate with left/right arrow keys (#10481)
* feat(docs): navigate with left/right arrow keys

* fix: add missing comma
2021-03-03 11:17:12 -05:00
Lysandre Debut
0c2325198f
Add I-BERT to README (#10462) 2021-03-01 12:12:31 -05:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 (#10145)
* add encode labels function to tokenizer

* start adding finetuning

* init dropout

* upload

* correct convert script

* apply changes

* fix second typo

* make first dummy training run

* adapt convert script

* push confg for comparison

* remove conf

* finish training

* adapt data collator

* add research folder

* update according to fairseq feedback

* some minor corrections

* refactor masking indices a bit

* some minor changes

* clean tokenizer

* finish clean-up

* remove previous logic

* update run script

* correct training

* finish changes

* finish model

* correct bug

* fix training a bit more

* add some tests

* finish gradient checkpointing

* finish example

* correct gradient checkpointing

* improve tokenization method

* revert changes in tokenizer

* revert general change

* adapt fine-tuning

* update

* save intermediate test

* Update README.md

* finish finetuning

* delete conversion script

* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* finish wav2vec2 script

* finish wav2vec2 fine-tuning

* finalize test

* correct test

* adapt tests

* finish

* remove test file

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Patrick von Platen
3c733f3208
Update ibert.rst (#10445) 2021-02-28 19:03:49 +03:00
Darigov Research
aeba4f95bb
Adds terms to Glossary (#10443)
* feat: Adds three definitions to glossary from @cronoik

Needed a definition for transformer which in turn needed 2 more definitions

To do with issue https://github.com/huggingface/transformers/issues/9078

* fix: Adjusts definition of neural network to make it easier to read
2021-02-28 08:27:54 -05:00
Tanmay Garg
256482ac92
Introduce save_strategy training argument (#10286)
* Introduce save_strategy training argument

* deprecate EvaluationStrategy

* collapse EvaluationStrategy and LoggingStrategy into a single
  IntervalStrategy enum

* modify tests to use modified enum
2021-02-27 19:34:22 -05:00
Andrea Bacciu
b040e6efc1
Fix None in add_token_positions - issue #10210 (#10374)
* Fix None in add_token_positions - issue #10210

Fix None in add_token_positions related to the issue #10210

* add_token_positions fix None values in end_positions vector

add_token_positions fix None in end_positions vector as proposed by @joeddav
2021-02-25 09:18:33 -07:00
Sylvain Gugger
9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale (#10354)
* Ass support for ZeRO-2/3 and ZeRO-offload in fairscale

* Quality

* Rework from review comments

* Add doc

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
Sehoon Kim
63645b3b11
I-BERT model support (#10153)
* IBertConfig, IBertTokentizer added

* IBert Model names moified

* tokenizer bugfix

* embedding -> QuantEmbedding

* quant utils added

* quant_mode added to configuration

* QuantAct added, Embedding layer + QuantAct addition

* QuantAct added

* unused path removed, QKV quantized

* self attention layer all quantized, except softmax

* temporarl commit

* all liner layers quantized

* quant_utils bugfix

* bugfix: requantization missing

* IntGELU added

* IntSoftmax added

* LayerNorm implemented

* LayerNorm implemented all

* names changed: roberta->ibert

* config not inherit from ROberta

* No support for CausalLM

* static quantization added, quantize_model.py removed

* import modules uncommented

* copyrights fixed

* minor bugfix

* quant_modules, quant_utils merged as one file

* import * fixed

* unused runfile removed

* make style run

* configutration.py docstring fixed

* refactoring: comments removed, function name fixed

* unused dependency removed

* typo fixed

* comments(Copied from), assertion string added

* refactoring: super(..) -> super(), etc.

* refactoring

* refarctoring

* make style

* refactoring

* cuda -> to(x.device)

* weight initialization removed

* QuantLinear set_param removed

* QuantEmbedding set_param removed

* IntLayerNorm set_param removed

* assert string added

* assertion error message fixed

* is_decoder removed

* enc-dec arguments/functions removed

* Converter removed

* quant_modules docstring fixed

* conver_slow_tokenizer rolled back

* quant_utils docstring fixed

* unused aruments e.g. use_cache removed from config

* weight initialization condition fixed

* x_min, x_max initialized with small values to avoid div-zero exceptions

* testing code for ibert

* test emb, linear, gelu, softmax added

* test ln and act added

* style reformatted

* force_dequant added

* error tests overrided

* make style

* Style + Docs

* force dequant tests added

* Fix fast tokenizer in init

* Fix doc

* Remove space

* docstring, IBertConfig, chunk_size

* test_modeling_ibert refactoring

* quant_modules.py refactoring

* e2e integration test added

* tokenizers removed

* IBertConfig added to tokenizer_auto.py

* bugfix

* fix docs & test

* fix style num 2

* final fixes

Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-25 10:06:42 -05:00
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
* push to show

* small improvement

* small improvement

* Update src/transformers/feature_extraction_utils.py

* Update src/transformers/feature_extraction_utils.py

* implement base

* add common tests

* make all tests pass for wav2vec2

* make padding work & add more tests

* finalize feature extractor utils

* add call method to feature extraction

* finalize feature processor

* finish tokenizer

* finish general processor design

* finish tests

* typo

* remove bogus file

* finish docstring

* add docs

* finish docs

* small fix

* correct docs

* save intermediate

* load changes

* apply changes

* apply changes to doc

* change tests

* apply surajs recommend

* final changes

* Apply suggestions from code review

* fix typo

* fix import

* correct docstring
2021-02-25 17:42:46 +03:00
abhishek thakur
9dc7825744
Remove unused variable in example for Q&A (#10392) 2021-02-25 09:18:47 -05:00
Lysandre
3591844306 v4.3.3 docs 2021-02-24 15:19:01 -05:00
Stas Bekman
eab0afc19c
[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration (#10310)
* implement gradient_accumulation_steps support in DeepSpeed integration

* typo

* cleanup

* cleanup
2021-02-22 11:15:59 -08:00
Sylvain Gugger
9e147d31f6
Deprecate prepare_seq2seq_batch (#10287)
* Deprecate prepare_seq2seq_batch

* Fix last tests

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* More review comments

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-02-22 12:36:16 -05:00
Lysandre Debut
cd8c4c3fc2
DeBERTa-v2 fixes (#10328)
Co-authored-by: Pengcheng He <penhe@microsoft.com>

Co-authored-by: Pengcheng He <penhe@microsoft.com>
2021-02-22 07:45:18 -05:00
Pengcheng He
9a7e63729f
Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018)
* Integrate DeBERTa v2(the 1.5B model surpassed human performance on SuperGLUE); Add DeBERTa v2 900M,1.5B models;

* DeBERTa-v2

* Fix v2 model loading issue (#10129)

* Doc members

* Update src/transformers/models/deberta/modeling_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Address Sylvain's comments

* Address Patrick's comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-19 18:34:44 -05:00
Sylvain Gugger
f6e53e3c2b
Fix example links in the task summary (#10291) 2021-02-19 18:04:15 -05:00
Stas Bekman
5da7c78ed8
update to new script; notebook notes (#10241) 2021-02-17 15:58:08 -08:00
Joe Davison
4210cd96fc
fix add_token_positions fn (#10217) 2021-02-16 14:00:05 -05:00
Suraj Patil
6fc940ed09
Add mBART-50 (#10154)
* add tokenizer for mBART-50

* update tokenizers

* make src_lang and tgt_lang optional

* update tokenizer test

* add setter

* update docs

* update conversion script

* update docs

* update conversion script

* update tokenizer

* update test

* update docs

* doc

* address Sylvain's suggestions

* fix test

* fix formatting

* nits
2021-02-15 20:58:54 +05:30
Sylvain Gugger
803498318c
[Doc] Fix version control in internal pages (#10124) 2021-02-13 08:52:30 -05:00
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab (#10130)
* init devices/setup explicitly

* docs + test

* simplify

* cleanup

* cleanup

* cleanup

* correct the required dist setup

* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
Tanmay Thakur
2f3b5f4dcc
Add new community notebook - Blenderbot (#10126)
* Update:community.md, new nb add

* feat: updated grammar on  nb description

* Update: Train summarizer for BlenderBotSmall
2021-02-11 12:53:40 +03:00
Stas Bekman
7c07a47dfb
[DeepSpeed docs] new information (#9610)
* how to specify a specific gpu

* new paper

* expand on buffer sizes

* style

* where to find config examples

* specific example

* small updates
2021-02-09 22:16:20 -08:00
Boris Dayma
7c7962ba89
doc: update W&B related doc (#10086)
* doc: update W&B related doc

* doc(wandb): mention report_to

* doc(wandb): commit suggestion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* doc(wandb): fix typo

* doc(wandb): remove WANDB_DISABLED

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
Sylvain Gugger
0c3d23dff7 Add patch releases to the doc 2021-02-09 14:17:09 -05:00
Lysandre Debut
78f4a0e7e5
Logging propagation (#10092)
* Enable propagation by default

* Document enable/disable default handler
2021-02-09 10:27:49 -05:00
Patrick von Platen
b972125ced
Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC (#10089)
* add wav2vec2CTC and deprecate for maskedlm

* remove from docs
2021-02-09 03:49:02 -05:00
Juan Cruz-Benito
e4bf9910dc
Removing run_pl_glue.py from text classification docs, include run_xnli.py & run_tf_text_classification.py (#10066)
* Removing run_pl_glue.py from seq classification docs

* Adding run_tf_text_classification.py

* Using :prefix_link: to refer local files

* Applying "make style" to the branch

* Update docs/source/task_summary.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removing last underscores

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-08 13:04:21 -05:00
Lysandre
0dd579c9cf Docs for v4.3.0 2021-02-08 18:53:24 +01:00
Sylvain Gugger
45aaf5f7ab
A few fixes in the documentation (#10033) 2021-02-08 05:02:01 -05:00
Patrick von Platen
89be094e29
[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921)
* add big bird

* change teacher to mentor

* add proposal template

* adapt template

* delete old template

* correct some links

* finish template

* create big bird from template

* add big bird

* improve boxes

* finish boxes

* add pointers for BigBird

* finish big bird

* up

* up

* up

* up

* apply lysandres and sylvains suggestions

* delete bogus file

* correct markdown

* try different style

* try different style

* finalize
2021-02-05 15:47:54 +03:00
Sylvain Gugger
3be965c5db
Update doc for pre-release (#10014)
* Update doc for pre-release

* Use stable as default

* Use the right commit :facepalms:
2021-02-04 16:52:27 -05:00
Sylvain Gugger
b72f16b3ec Fix doc for TFConverBertModel 2021-02-04 10:14:46 -05:00
demSd
00031785a8
BartForCausalLM analogs to ProphetNetForCausalLM (#9128)
* initiliaze bart4causalLM

* create BartDecoderWrapper, setters/getters

* delete spaces

* forward and additional methods

* update cache function, loss function, remove ngram* params in data class.

* add bartcausallm, bartdecoder testing

* correct bart for causal lm

* remove at

* add mbart as well

* up

* fix typo

* up

* correct

* add pegasusforcausallm

* add blenderbotforcausallm

* add blenderbotsmallforcausallm

* add marianforcausallm

* add test for MarianForCausalLM

* add Pegasus test

* add BlenderbotSmall test

* add blenderbot test

* fix a fail

* fix an import fail

* a fix

* fix

* Update modeling_pegasus.py

* fix models

* fix inputs_embeds setting getter

* adapt tests

* correct repo utils check

* finish test improvement

* fix tf models as well

* make style

* make fix-copies

* fix copies

* run all tests

* last changes

* fix all tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-04 11:56:12 +03:00
yylun
5442a11f5f
fix steps_in_epoch variable in trainer when using max_steps (#9969)
* fix steps_in_epoch variable when using max_steps

* redundant sentence

* Revert "redundant sentence"

This reverts commit ad5c0e9b6e.

* remove redundant sentence

Co-authored-by: wujindou <wujindou@sogou-inc.com>
2021-02-03 09:30:37 -05:00