Commit Graph

627 Commits

Author SHA1 Message Date
Suraj Patil
f6e74a63ca
Add m2m100 (#10236)
* m2m_100

* no layernorm_embedding

* sinusoidal positional embeddings

* update pos embeddings

* add default config values

* tokenizer

* add conversion script

* fix config

* fix pos embed

* remove _float_tensor

* update tokenizer

* update lang codes

* handle lang codes

* fix pos embeds

* fix spm key

* put embedding weights on device

* remove qa and seq classification heads

* fix convert script

* lang codes pn one line

* fix embeds

* fix tokenizer

* fix tokenizer

* add fast tokenizer

* style

* M2M100MT => M2M100

* fix copyright, style

* tokenizer converter

* vocab file

* remove fast tokenizer

* fix embeds

* fix tokenizer

* fix tests

* add tokenizer tests

* add integration test

* quality

* fix model name

* fix test

* doc

* doc

* fix doc

* add copied from statements

* fix tokenizer tests

* apply review suggestions

* fix urls

* fix shift_tokens_right

* apply review suggestions

* fix

* fix doc

* add lang code to id

* remove unused function

* update checkpoint names

* fix copy

* fix tokenizer

* fix checkpoint names

* fix merge issue

* style
2021-03-06 22:14:16 +05:30
Stas Bekman
88a951e3cc
offline mode for firewalled envs (#10407)
* offline mode start

* add specific values

* fix fallback

* add test

* better values check and range

* test that actually works

* document the offline mode

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more strict check

* cleaner test

* pt-only test

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
lewtun
12b66215cf
Fix example of custom Trainer to reflect signature of compute_loss (#10537) 2021-03-05 07:44:53 -05:00
Sylvain Gugger
948b730f97
Remove unsupported methods from ModelOutput doc (#10505) 2021-03-03 14:55:18 -05:00
Jeff Yang
39f70a4058
feat(docs): navigate with left/right arrow keys (#10481)
* feat(docs): navigate with left/right arrow keys

* fix: add missing comma
2021-03-03 11:17:12 -05:00
Lysandre Debut
0c2325198f
Add I-BERT to README (#10462) 2021-03-01 12:12:31 -05:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 (#10145)
* add encode labels function to tokenizer

* start adding finetuning

* init dropout

* upload

* correct convert script

* apply changes

* fix second typo

* make first dummy training run

* adapt convert script

* push confg for comparison

* remove conf

* finish training

* adapt data collator

* add research folder

* update according to fairseq feedback

* some minor corrections

* refactor masking indices a bit

* some minor changes

* clean tokenizer

* finish clean-up

* remove previous logic

* update run script

* correct training

* finish changes

* finish model

* correct bug

* fix training a bit more

* add some tests

* finish gradient checkpointing

* finish example

* correct gradient checkpointing

* improve tokenization method

* revert changes in tokenizer

* revert general change

* adapt fine-tuning

* update

* save intermediate test

* Update README.md

* finish finetuning

* delete conversion script

* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* finish wav2vec2 script

* finish wav2vec2 fine-tuning

* finalize test

* correct test

* adapt tests

* finish

* remove test file

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Patrick von Platen
3c733f3208
Update ibert.rst (#10445) 2021-02-28 19:03:49 +03:00
Darigov Research
aeba4f95bb
Adds terms to Glossary (#10443)
* feat: Adds three definitions to glossary from @cronoik

Needed a definition for transformer which in turn needed 2 more definitions

To do with issue https://github.com/huggingface/transformers/issues/9078

* fix: Adjusts definition of neural network to make it easier to read
2021-02-28 08:27:54 -05:00
Tanmay Garg
256482ac92
Introduce save_strategy training argument (#10286)
* Introduce save_strategy training argument

* deprecate EvaluationStrategy

* collapse EvaluationStrategy and LoggingStrategy into a single
  IntervalStrategy enum

* modify tests to use modified enum
2021-02-27 19:34:22 -05:00
Andrea Bacciu
b040e6efc1
Fix None in add_token_positions - issue #10210 (#10374)
* Fix None in add_token_positions - issue #10210

Fix None in add_token_positions related to the issue #10210

* add_token_positions fix None values in end_positions vector

add_token_positions fix None in end_positions vector as proposed by @joeddav
2021-02-25 09:18:33 -07:00
Sylvain Gugger
9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale (#10354)
* Ass support for ZeRO-2/3 and ZeRO-offload in fairscale

* Quality

* Rework from review comments

* Add doc

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
Sehoon Kim
63645b3b11
I-BERT model support (#10153)
* IBertConfig, IBertTokentizer added

* IBert Model names moified

* tokenizer bugfix

* embedding -> QuantEmbedding

* quant utils added

* quant_mode added to configuration

* QuantAct added, Embedding layer + QuantAct addition

* QuantAct added

* unused path removed, QKV quantized

* self attention layer all quantized, except softmax

* temporarl commit

* all liner layers quantized

* quant_utils bugfix

* bugfix: requantization missing

* IntGELU added

* IntSoftmax added

* LayerNorm implemented

* LayerNorm implemented all

* names changed: roberta->ibert

* config not inherit from ROberta

* No support for CausalLM

* static quantization added, quantize_model.py removed

* import modules uncommented

* copyrights fixed

* minor bugfix

* quant_modules, quant_utils merged as one file

* import * fixed

* unused runfile removed

* make style run

* configutration.py docstring fixed

* refactoring: comments removed, function name fixed

* unused dependency removed

* typo fixed

* comments(Copied from), assertion string added

* refactoring: super(..) -> super(), etc.

* refactoring

* refarctoring

* make style

* refactoring

* cuda -> to(x.device)

* weight initialization removed

* QuantLinear set_param removed

* QuantEmbedding set_param removed

* IntLayerNorm set_param removed

* assert string added

* assertion error message fixed

* is_decoder removed

* enc-dec arguments/functions removed

* Converter removed

* quant_modules docstring fixed

* conver_slow_tokenizer rolled back

* quant_utils docstring fixed

* unused aruments e.g. use_cache removed from config

* weight initialization condition fixed

* x_min, x_max initialized with small values to avoid div-zero exceptions

* testing code for ibert

* test emb, linear, gelu, softmax added

* test ln and act added

* style reformatted

* force_dequant added

* error tests overrided

* make style

* Style + Docs

* force dequant tests added

* Fix fast tokenizer in init

* Fix doc

* Remove space

* docstring, IBertConfig, chunk_size

* test_modeling_ibert refactoring

* quant_modules.py refactoring

* e2e integration test added

* tokenizers removed

* IBertConfig added to tokenizer_auto.py

* bugfix

* fix docs & test

* fix style num 2

* final fixes

Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-25 10:06:42 -05:00
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
* push to show

* small improvement

* small improvement

* Update src/transformers/feature_extraction_utils.py

* Update src/transformers/feature_extraction_utils.py

* implement base

* add common tests

* make all tests pass for wav2vec2

* make padding work & add more tests

* finalize feature extractor utils

* add call method to feature extraction

* finalize feature processor

* finish tokenizer

* finish general processor design

* finish tests

* typo

* remove bogus file

* finish docstring

* add docs

* finish docs

* small fix

* correct docs

* save intermediate

* load changes

* apply changes

* apply changes to doc

* change tests

* apply surajs recommend

* final changes

* Apply suggestions from code review

* fix typo

* fix import

* correct docstring
2021-02-25 17:42:46 +03:00
abhishek thakur
9dc7825744
Remove unused variable in example for Q&A (#10392) 2021-02-25 09:18:47 -05:00
Lysandre
3591844306 v4.3.3 docs 2021-02-24 15:19:01 -05:00
Stas Bekman
eab0afc19c
[Trainer] implement gradient_accumulation_steps support in DeepSpeed integration (#10310)
* implement gradient_accumulation_steps support in DeepSpeed integration

* typo

* cleanup

* cleanup
2021-02-22 11:15:59 -08:00
Sylvain Gugger
9e147d31f6
Deprecate prepare_seq2seq_batch (#10287)
* Deprecate prepare_seq2seq_batch

* Fix last tests

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* More review comments

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-02-22 12:36:16 -05:00
Lysandre Debut
cd8c4c3fc2
DeBERTa-v2 fixes (#10328)
Co-authored-by: Pengcheng He <penhe@microsoft.com>

Co-authored-by: Pengcheng He <penhe@microsoft.com>
2021-02-22 07:45:18 -05:00
Pengcheng He
9a7e63729f
Integrate DeBERTa v2(the 1.5B model surpassed human performance on Su… (#10018)
* Integrate DeBERTa v2(the 1.5B model surpassed human performance on SuperGLUE); Add DeBERTa v2 900M,1.5B models;

* DeBERTa-v2

* Fix v2 model loading issue (#10129)

* Doc members

* Update src/transformers/models/deberta/modeling_deberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Address Sylvain's comments

* Address Patrick's comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-19 18:34:44 -05:00
Sylvain Gugger
f6e53e3c2b
Fix example links in the task summary (#10291) 2021-02-19 18:04:15 -05:00
Stas Bekman
5da7c78ed8
update to new script; notebook notes (#10241) 2021-02-17 15:58:08 -08:00
Joe Davison
4210cd96fc
fix add_token_positions fn (#10217) 2021-02-16 14:00:05 -05:00
Suraj Patil
6fc940ed09
Add mBART-50 (#10154)
* add tokenizer for mBART-50

* update tokenizers

* make src_lang and tgt_lang optional

* update tokenizer test

* add setter

* update docs

* update conversion script

* update docs

* update conversion script

* update tokenizer

* update test

* update docs

* doc

* address Sylvain's suggestions

* fix test

* fix formatting

* nits
2021-02-15 20:58:54 +05:30
Sylvain Gugger
803498318c
[Doc] Fix version control in internal pages (#10124) 2021-02-13 08:52:30 -05:00
Stas Bekman
b54cb0bd82
[DeepSpeed in notebooks] Jupyter + Colab (#10130)
* init devices/setup explicitly

* docs + test

* simplify

* cleanup

* cleanup

* cleanup

* correct the required dist setup

* derive local_rank from env LOCAL_RANK
2021-02-11 14:02:05 -08:00
Tanmay Thakur
2f3b5f4dcc
Add new community notebook - Blenderbot (#10126)
* Update:community.md, new nb add

* feat: updated grammar on  nb description

* Update: Train summarizer for BlenderBotSmall
2021-02-11 12:53:40 +03:00
Stas Bekman
7c07a47dfb
[DeepSpeed docs] new information (#9610)
* how to specify a specific gpu

* new paper

* expand on buffer sizes

* style

* where to find config examples

* specific example

* small updates
2021-02-09 22:16:20 -08:00
Boris Dayma
7c7962ba89
doc: update W&B related doc (#10086)
* doc: update W&B related doc

* doc(wandb): mention report_to

* doc(wandb): commit suggestion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* doc(wandb): fix typo

* doc(wandb): remove WANDB_DISABLED

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-09 14:47:52 -05:00
Sylvain Gugger
0c3d23dff7 Add patch releases to the doc 2021-02-09 14:17:09 -05:00
Lysandre Debut
78f4a0e7e5
Logging propagation (#10092)
* Enable propagation by default

* Document enable/disable default handler
2021-02-09 10:27:49 -05:00
Patrick von Platen
b972125ced
Deprecate Wav2Vec2ForMaskedLM and add Wav2Vec2ForCTC (#10089)
* add wav2vec2CTC and deprecate for maskedlm

* remove from docs
2021-02-09 03:49:02 -05:00
Juan Cruz-Benito
e4bf9910dc
Removing run_pl_glue.py from text classification docs, include run_xnli.py & run_tf_text_classification.py (#10066)
* Removing run_pl_glue.py from seq classification docs

* Adding run_tf_text_classification.py

* Using :prefix_link: to refer local files

* Applying "make style" to the branch

* Update docs/source/task_summary.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removing last underscores

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-08 13:04:21 -05:00
Lysandre
0dd579c9cf Docs for v4.3.0 2021-02-08 18:53:24 +01:00
Sylvain Gugger
45aaf5f7ab
A few fixes in the documentation (#10033) 2021-02-08 05:02:01 -05:00
Patrick von Platen
89be094e29
[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921)
* add big bird

* change teacher to mentor

* add proposal template

* adapt template

* delete old template

* correct some links

* finish template

* create big bird from template

* add big bird

* improve boxes

* finish boxes

* add pointers for BigBird

* finish big bird

* up

* up

* up

* up

* apply lysandres and sylvains suggestions

* delete bogus file

* correct markdown

* try different style

* try different style

* finalize
2021-02-05 15:47:54 +03:00
Sylvain Gugger
3be965c5db
Update doc for pre-release (#10014)
* Update doc for pre-release

* Use stable as default

* Use the right commit :facepalms:
2021-02-04 16:52:27 -05:00
Sylvain Gugger
b72f16b3ec Fix doc for TFConverBertModel 2021-02-04 10:14:46 -05:00
demSd
00031785a8
BartForCausalLM analogs to ProphetNetForCausalLM (#9128)
* initiliaze bart4causalLM

* create BartDecoderWrapper, setters/getters

* delete spaces

* forward and additional methods

* update cache function, loss function, remove ngram* params in data class.

* add bartcausallm, bartdecoder testing

* correct bart for causal lm

* remove at

* add mbart as well

* up

* fix typo

* up

* correct

* add pegasusforcausallm

* add blenderbotforcausallm

* add blenderbotsmallforcausallm

* add marianforcausallm

* add test for MarianForCausalLM

* add Pegasus test

* add BlenderbotSmall test

* add blenderbot test

* fix a fail

* fix an import fail

* a fix

* fix

* Update modeling_pegasus.py

* fix models

* fix inputs_embeds setting getter

* adapt tests

* correct repo utils check

* finish test improvement

* fix tf models as well

* make style

* make fix-copies

* fix copies

* run all tests

* last changes

* fix all tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-04 11:56:12 +03:00
yylun
5442a11f5f
fix steps_in_epoch variable in trainer when using max_steps (#9969)
* fix steps_in_epoch variable when using max_steps

* redundant sentence

* Revert "redundant sentence"

This reverts commit ad5c0e9b6e.

* remove redundant sentence

Co-authored-by: wujindou <wujindou@sogou-inc.com>
2021-02-03 09:30:37 -05:00
Patrick von Platen
d6217fb30c
Wav2Vec2 (#9659)
* add raw scaffold

* implement feat extract layers

* make style

* remove +

* correctly convert weights

* make feat extractor work

* make feature extraction proj work

* run forward pass

* finish forward pass

* Succesful decoding example

* remove unused files

* more changes

* add wav2vec tokenizer

* add new structure

* fix run forward

* add other layer norm architecture

* finish 2nd structure

* add model tests

* finish tests for tok and model

* clean-up

* make style

* finish docstring for model and config

* make style

* correct docstring

* correct tests

* change checkpoints to fairseq

* fix examples

* finish wav2vec2

* make style

* apply sylvains suggestions

* apply lysandres suggestions

* change print to log.info

* re-add assert statement

* add input_values as required input name

* finish wav2vec2 tokenizer

* Update tests/test_tokenization_wav2vec2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* apply sylvains suggestions

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-02 15:52:10 +03:00
Sylvain Gugger
de38a6e4d2
Fix 9918 (#9932)
* Initial work

* Fix doc styler and other models
2021-02-02 05:22:20 -05:00
Patrick von Platen
0e3be1ac8f
Add new model docs (#9667)
* add new model logic

* fix docs

* change structure

* improve add_new_model

* push new changes

* up

* up

* correct spelling

* improve docstring

* correct line length

* update readme

* correct links

* correct typos

* only add rst file for now

* Apply suggestions from code review 1

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>

* Apply suggestions from code review

Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>

* finish adding all suggestions

* make style

* apply Niels feedback

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply sylvains suggestions

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-02-01 17:55:10 +03:00
Stas Bekman
40cfc355f1
[doc] nested markup is invalid in rst (#9898)
Apparently nested markup in RST is invalid: https://docutils.sourceforge.io/FAQ.html#is-nested-inline-markup-possible

So currently this line doesn't get rendered properly, leaving inner markdown unrendered, resulting in:
```
https://docutils.sourceforge.io/FAQ.html#is-nested-inline-markup-possible
```

This PR removes the bold which fixes the link.
2021-01-30 09:59:19 -05:00
Stas Bekman
15e4ce353a
[docs] expand install instructions (#9817)
* expand install instructions

* fix

* white space

* rewrite as discussed in the PR

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* change the wording to encourage issue report

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-28 09:36:46 -08:00
Joe Davison
caddf9126b
tutorial typo 2021-01-28 09:21:58 -05:00
Stefan Schweter
5ed5a54684
ADD BORT (#9813)
* tests: add integration tests for new Bort model

* bort: add conversion script from Gluonnlp to Transformers 🚀

* bort: minor cleanup (BORT -> Bort)

* add docs

* make fix-copies

* clean doc a bit

* correct docs

* Update docs/source/model_doc/bort.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/bort.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* correct dialogpt doc

* correct link

* Update docs/source/model_doc/bort.rst

* Update docs/source/model_doc/dialogpt.rst

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-27 21:25:11 +03:00
abhishek thakur
f617490e71
ConvBERT Model (#9717)
* finalize convbert

* finalize convbert

* fix

* fix

* fix

* push

* fix

* tf image patches

* fix torch model

* tf tests

* conversion

* everything aligned

* remove print

* tf tests

* fix tf

* make tf tests pass

* everything works

* fix init

* fix

* special treatment for sepconv1d

* style

* 🙏🏽

* add doc and cleanup

* add electra test again

* fix doc

* fix doc again

* fix doc again

* Update src/transformers/modeling_tf_pytorch_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/conv_bert/configuration_conv_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update docs/source/model_doc/conv_bert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/conv_bert/configuration_conv_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* conv_bert -> convbert

* more fixes from review

* add conversion script

* dont use pretrained embed

* unused config

* suggestions from julien

* some more fixes

* p -> param

* fix copyright

* fix doc

* Update src/transformers/models/convbert/configuration_convbert.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* comments from reviews

* fix-copies

* fix style

* revert shape_list

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-01-27 03:20:09 -05:00
Yusuke Mori
cb73ab5a38
Fix broken links in the converting tf ckpt document (#9791)
* Fix broken links in the converting tf ckpt document

* Update docs/source/converting_tensorflow_models.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Reflect the review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-01-26 03:37:57 -05:00
Sylvain Gugger
7acfa95afb Add missing new line 2021-01-20 14:13:16 -05:00