9639 Commits

Author SHA1 Message Date
Krishna Sirumalla
aaee4038c3
Add onnx config for RoFormer (#16861)
* add roformer onnx config
2022-04-26 16:51:15 +02:00
Ahmed Elnaggar
8afaaa26f5
Fix Iterations for decoder (#16934)
Fix Iterations for decoder
2022-04-26 12:54:14 +02:00
Manuel
fa32247406
apply torch int div to layoutlmv2 (#15457)
* apply torch int div

* black linting fixup

* update path to torch_int_div

* clarify imports
2022-04-26 10:07:51 +02:00
Sylvain Gugger
344b9fb0c6
Limit the use of PreTrainedModel.device (#16935)
* Limit the use of PreTrainedModel.device

* Fix
2022-04-25 20:58:50 -04:00
code-review-doctor
6568752039
Fix issue probably-meant-fstring found at https://codereview.doctor (#16913)
2022-04-25 15:15:00 -04:00
Sanchit Gandhi
fea94d6790
Replace deprecated logger.warn with warning (#16876)
2022-04-25 15:12:51 -04:00
Joao Gante
e03966e404
TF: XLA stable softmax (#16892)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-25 20:10:51 +01:00
Rushi Chaudhari
8246caf3eb
added deit onnx config (#16887)
* added deit onnx config
2022-04-25 20:50:45 +02:00
Joao Gante
9331b37967
TF: XLA Logits Warpers (#16899)
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-04-25 19:48:08 +01:00
Joao Gante
809dac48f9
TF: XLA logits processors - minimum length, forced eos, and forced bos (#16912)
* XLA min len, forced eos, and forced bos

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-04-25 19:27:53 +01:00
Yih-Dar
f6210c49e2
Fix RemBertTokenizerFast (#16933)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-25 19:51:50 +02:00
Yih-Dar
32adbb26d6
Fix PyTorch RAG tests GPU OOM (#16881)
* add torch.cuda.empty_cache in some PT RAG tests

* torch.cuda.empty_cache in tearDownModule()

* tearDown()

* add gc.collect()

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-25 17:33:56 +02:00
Yih-Dar
3e47d19cfc
Add missing ckpt in config docs (#16900)
* add missing ckpt in config docs

* add more missing ckpt in config docs

* fix wrong ckpts

* fix realm ckpt

* fix s2t2

* fix xlm_roberta ckpt

* Fix for deberta v2

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* use only one checkpoint for DPR

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-04-25 17:31:45 +02:00
Patrick von Platen
3a71e94a92
Fix doc test quicktour dataset (#16929)
* fix doc test

* fix doc test

Co-authored-by: Patrick <patrick@pop-os.localdomain>
2022-04-25 16:26:59 +02:00
Thomas Chaigneau
508baf1943
add bigbird typo fixes (#16897)
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
2022-04-25 11:32:06 +02:00
Patrick von Platen
72728be3db
[DocTests] Fix some doc tests (#16889)
* [DocTests] Fix some doc tests

* hacky fix

* correct
2022-04-23 08:40:14 +02:00
cavdard
22fc93c4d9
Changes in create_optimizer to support tensor parallelism with SMP (#16880)
* changes in create optimizer to support tensor parallelism with SMP

* Update src/transformers/trainer.py

Convert if check to one line.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Cavdar <dcavdar@a07817b12d7e.ant.amazon.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-22 15:24:38 -04:00
Joao Gante
99c8226b12
TF: XLA repetition penalty (#16879)
2022-04-22 18:29:32 +01:00
Thomas Chaigneau
ec81c11a18
Add OnnxConfig for ConvBERT (#16859)
* add OnnxConfig for ConvBert

Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
2022-04-22 18:19:15 +02:00
Minh Chien Vu
0d1cff1195
Add doc tests for Albert and Bigbird (#16774)
* Add doctest BERT

* make fixup

* fix typo

* change checkpoints

* make fixup

* define doctest output value, update doctest for mobilebert

* solve fix-copies

* update QA target start index and end index

* change checkpoint for docs and reuse defined variable

* Update src/transformers/models/bert/modeling_tf_bert.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* make fixup

* Add Doctest for Albert and Bigbird

* make fixup

* overwrite examples for Albert and Bigbird

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update longer examples for Bigbird

* using examples from squad_v2

* print out example text

* change name token-classification-big-bird checkpoint to random

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-22 18:07:16 +02:00
Mario Šaško
9fa88172c2
Minor fixes/improvements in convert_file_size_to_int (#16891)
* Minor improvements to `convert_file_size_to_int`

* Add <unit>bit version to kilos and megas

* Minor fix
2022-04-22 16:54:20 +02:00
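
The entry above mentions `convert_file_size_to_int` and a "<unit>bit" variant for kilos and megas, but the log does not show the code. As a rough illustration only, a size-string parser of this kind might look like the sketch below. The name `parse_file_size`, the supported suffixes, and the convention that a lowercase trailing `b` means bits are assumptions made for this example, not the library's confirmed behavior.

```python
# Hypothetical sketch: NOT the implementation from the commit above.
# Multipliers for binary (KiB/MiB/GiB) and decimal (KB/MB/GB) units.
_UNITS = {
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30,
    "KB": 10**3, "MB": 10**6, "GB": 10**9,
}


def parse_file_size(size):
    """Convert a size such as "5MB" or "1MiB" to a number of bytes.

    Assumption: a lowercase trailing "b" (e.g. "1Mb") denotes bits,
    so the result is divided by 8.
    """
    if isinstance(size, int):
        return size  # already a byte count
    text = size.strip()
    upper = text.upper()
    # Try the three-letter binary suffixes before the two-letter decimal ones.
    for suffix in ("GIB", "MIB", "KIB", "GB", "MB", "KB"):
        if upper.endswith(suffix):
            value = int(text[: -len(suffix)]) * _UNITS[suffix]
            return value // 8 if text.endswith("b") else value
    raise ValueError(f"unrecognized size string: {size!r}")


print(parse_file_size("5MB"))   # decimal megabytes
print(parse_file_size("1MiB"))  # binary mebibytes
print(parse_file_size("1Mb"))   # megabits, converted to bytes
```

Integers pass through unchanged, so callers can accept either an `int` or a string.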
Joao Gante
6d90d76f5d
TF: rework XLA generate tests (#16866)
2022-04-22 12:38:08 +01:00
Yih-Dar
3b1bbefc47
Add missing entries in mappings (#16857)
* add missing entries in some mappings

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-22 10:53:24 +02:00
Loubna Ben Allal
d91841315a
New features for CodeParrot training script (#16851)
* add tflops logging and fix grad accumulation

* add accelerate tracking and checkpointing

* scale loss of last batch correctly

* fix typo

* compress loss computation

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add resume from checkpoint argument

* add load_state accelerate from checkpoint, register lr scheduler and add tflops function

* reformat code

* reformat code

* add condition on path for resume checkpoint

* combine if conditions

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add source for tflops formula

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
2022-04-21 18:43:46 +02:00
Yih-Dar
eef2422e96
Fix doctest list (#16878)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-21 18:12:14 +02:00
Thomas Chaigneau
0b1e0fcf7a
Fix GPT-J onnx conversion (#16780)
* add gptj to TOKENIZER_MAPPING_NAMES

* fix int32 to float to avoid problem in onnx

* Update src/transformers/models/gptj/modeling_gptj.py

Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2022-04-21 15:55:30 +02:00
Eldar Kurtic
bae9b6458c
Use ACT2FN to fetch ReLU activation (#16874)
- all activations should be fetched through ACT2FN
- it returns ReLU as `nn.Module`, which allows attaching hooks on the activation function and makes it show up in the output of `print(model)`
2022-04-21 09:33:29 -04:00
Sylvain Gugger
cb555af2c7
Return input_ids in ImageGPT feature extractor (#16872)
2022-04-21 09:09:00 -04:00
Nicolas Patry
e789418ebe
Adding support for array key in raw dictionaries in ASR pipeline. (#16827)
* Adding support for `array` key in raw dictionaries in ASR pipeline.

* ES .

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Making it work by not popping `array` first.

* Black 22.3

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-21 14:39:10 +02:00
ghlai9665
daf520b033
tiny tweak to allow BatchEncoding.token_to_char when token doesn't correspond to chars (#15901)
* tweak to allow BatchEncoding.char_to_token(0)

* update docstring

* remove trailing whitespace

* make fixup

* make value checking for span_indices explicit

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-21 08:07:54 -04:00
Stefan Schweter
cb7e166428
t5: add conversion script for T5X to FLAX (#16853)
* t5: add conversion script for T5X to FLAX

* t5: make flake happy

* t5: add copyright message to t5x conversion script

* t5: fix lm head for v1.0 checkpoints
2022-04-21 13:00:35 +02:00
Nicolas Patry
6620f60c0a
Long QuestionAnsweringPipeline fix. (#16778)
* Temporary commit with the long QA fix.

* Adding slow tests covering this fix.

* Removing fast test as it doesn't fail anyway.
2022-04-21 09:59:25 +02:00
Zachary Mueller
705d65368f
Fix multiproc metrics in no_trainer examples (#16865)
2022-04-20 17:26:27 -04:00
Sylvain Gugger
175da8d182
Fix custom init sorting script (#16864)
2022-04-20 17:05:39 -04:00
Stas Bekman
67ed0e43dc
[docs] fix url (#16860)
2022-04-20 11:01:24 -07:00
Stas Bekman
afa1ef0992
[modeling_utils] use less cpu memory with sharded checkpoint loading (#16844)
* less cpu memory with sharded checkpoint loading

* Trigger CI

* Trigger CI
2022-04-20 07:44:37 -07:00
Nicolas Patry
e13a91fe60
Fixing return type tensor with num_return_sequences>1. (#16828)
* Fixing return type tensor with `num_return_sequences>1`.

* Nit.
2022-04-20 16:11:51 +02:00
Yang Ming
ff06b17791
add DebertaV2 fast tokenizer (#15529)
Co-authored-by: alcinos <carion.nicolas@gmail.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-20 10:26:51 +02:00
Patrick von Platen
e1c153cbaa
[Typo] Fix typo in modeling utils (#16840)
2022-04-19 23:09:03 +02:00
Manuel R. Ciosici
3104036e7f
Add support for bitsandbytes (#15622)
* Add initial BNB integration

* fixup! Add initial BNB integration

* Add bnb test decorator

* Update Adamw8bit option name

* Use the full bnb package name

* Override bnb for all embedding layers

* Fix package name

* Formatting

* Remove unnecessary import

* Update src/transformers/trainer.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Rename AdamwBNB optimizer option

* Add training test checking that bnb memory utilization is lower

* fix merge

* fix merge; fix + extend new test

* cleanup

* expand bnb

* move all require_* candidates to testing_utils.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2022-04-19 16:01:29 -04:00
Yih-Dar
e6d23a4b9b
Improve test_pt_tf_model_equivalence on PT side (#16731)
* Update test_pt_tf_model_equivalence on PT side

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-19 21:13:27 +02:00
Dahlbomii
3dd57b15c5
Type hints added to Speech to Text (#16506)
* Type hints added

* return hints added

* Update src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-04-19 17:58:08 +01:00
SaulLu
1efca4e6c8
replace Speech2TextTokenizer by Speech2TextFeatureExtractor in some docstrings (#16835)
* replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in docstring

* quality
2022-04-19 18:32:22 +02:00
Jeevesh Juneja
b5c6a63ed9
Correct Logging of Eval metric to Tensorboard (#16825)
* Correct Logging of Eval metric to Tensorboard

An empty dictionary ``eval_metrics`` was being logged; it is replaced by ``eval_metric``, which is the output dictionary of ``metric.compute()``.

* Remove unused variable
2022-04-19 17:27:54 +02:00
Joao Gante
f09c45e067
TF: Add sigmoid activation function (#16819)
2022-04-19 16:13:08 +01:00
wiio12
74814574ae
Add doc about attention_mask on gpt2 (#16829)
* Add doc about `attention_mask` on gpt2

Add a simple sentence describing how `attention_mask` needs to be constructed when `past_key_values` is used.

* Add doc about attention_mask on gpt2_tf

* clean up style

* remove empty line white spaces

* remove whitespace in empty line
2022-04-19 16:32:26 +02:00
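
The entry above concerns how `attention_mask` should be built when `past_key_values` is used; the log does not quote the added sentence. The sketch below assumes the usual convention that the mask must cover the cached positions as well as the new tokens, so its length is the cached length plus the number of new input ids. Plain Python lists stand in for tensors, and `build_attention_mask` is a hypothetical helper, not a library function.

```python
# Hypothetical sketch; lists stand in for tensors, and the helper name is
# an assumption for illustration only.
def build_attention_mask(past_mask, num_new_tokens):
    """Extend the mask used for the cached positions to cover new tokens.

    past_mask: 0/1 values for the positions already held in the cache.
    num_new_tokens: how many new input ids are fed in this step.
    """
    # Keep the masking of the cached positions and attend to every new token.
    return list(past_mask) + [1] * num_new_tokens


past_mask = [0, 1, 1]  # one padded position and two real cached tokens
mask = build_attention_mask(past_mask, 1)
print(mask, len(mask))  # length = cached positions + new tokens
```

The point of the sketch is only the length bookkeeping: passing a mask that covers just the new tokens would drop the padding information for the cached positions.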
NielsRogge
b96e82c80a
Add image classification script, no trainer (#16727)
* Add first draft

* Improve README and run fixup

* Make script aligned with other scripts, improve README

* Improve script and add test

* Remove print statement

* Apply suggestions from code review

* Add num_labels to make test pass

* Improve README
2022-04-19 16:32:08 +02:00
Patrick von Platen
db9f189121
[ASR Pipeline] Correct init docs (#16833)
* correct

* up
2022-04-19 16:12:36 +02:00
Ella Charlaix
77de8d6c31
Add onnx export of models with a multiple choice classification head (#16758)
* Add export of models with a multiple-choice classification head
2022-04-19 15:51:51 +02:00
Wonjae Kim
b74a955325
fix run_clm.py seeking text column name twice (#16624)
2022-04-19 14:38:25 +01:00