Stas Bekman
67ed0e43dc
[docs] fix url ( #16860 )
2022-04-20 11:01:24 -07:00
Stas Bekman
afa1ef0992
[modeling_utils] use less cpu memory with sharded checkpoint loading ( #16844 )
...
* less cpu memory with sharded checkpoint loading
* Trigger CI
* Trigger CI
2022-04-20 07:44:37 -07:00
Nicolas Patry
e13a91fe60
Fixing return type tensor with `num_return_sequences>1`. ( #16828 )
...
* Fixing return type tensor with `num_return_sequences>1`.
* Nit.
2022-04-20 16:11:51 +02:00
Yang Ming
ff06b17791
add DebertaV2 fast tokenizer ( #15529 )
...
Co-authored-by: alcinos <carion.nicolas@gmail.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Nicolas Carion <carion.nicolas@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-20 10:26:51 +02:00
Patrick von Platen
e1c153cbaa
[Typo] Fix typo in modeling utils ( #16840 )
2022-04-19 23:09:03 +02:00
Manuel R. Ciosici
3104036e7f
Add support for bitsandbytes ( #15622 )
...
* Add initial BNB integration
* fixup! Add initial BNB integration
* Add bnb test decorator
* Update Adamw8bit option name
* Use the full bnb package name
* Override bnb for all embedding layers
* Fix package name
* Formatting
* Remove unnecessary import
* Update src/transformers/trainer.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Rename AdamwBNB optimizer option
* Add training test checking that bnb memory utilization is lower
* fix merge
* fix merge; fix + extend new test
* cleanup
* expand bnb
* move all require_* candidates to testing_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2022-04-19 16:01:29 -04:00
Yih-Dar
e6d23a4b9b
Improve test_pt_tf_model_equivalence on PT side ( #16731 )
...
* Update test_pt_tf_model_equivalence on PT side
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-19 21:13:27 +02:00
Dahlbomii
3dd57b15c5
Type hints added to Speech to Text ( #16506 )
...
* Type hints added
* return hints added
* Update src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-04-19 17:58:08 +01:00
SaulLu
1efca4e6c8
replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in some docstrings ( #16835 )
...
* replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in docstring
* quality
2022-04-19 18:32:22 +02:00
Jeevesh Juneja
b5c6a63ed9
Correct Logging of Eval metric to Tensorboard ( #16825 )
...
* Correct Logging of Eval metric to Tensorboard
An empty dictionary ``eval_metrics`` was being logged; it is replaced by ``eval_metric``, which is the output dictionary of ``metric.compute()``.
* Remove unused variable
2022-04-19 17:27:54 +02:00
Joao Gante
f09c45e067
TF: Add sigmoid activation function ( #16819 )
2022-04-19 16:13:08 +01:00
wiio12
74814574ae
Add doc about `attention_mask` on gpt2 ( #16829 )
...
* Add doc about `attention_mask` on gpt2
Add a simple sentence describing how `attention_mask` needs to be constructed when `past_key_values` is used.
* Add doc about attention_mask on gpt2_tf
* clean up style
* remove empty line white spaces
* remove whitespace in empty line
2022-04-19 16:32:26 +02:00
NielsRogge
b96e82c80a
Add image classification script, no trainer ( #16727 )
...
* Add first draft
* Improve README and run fixup
* Make script aligned with other scripts, improve README
* Improve script and add test
* Remove print statement
* Apply suggestions from code review
* Add num_labels to make test pass
* Improve README
2022-04-19 16:32:08 +02:00
Patrick von Platen
db9f189121
[ASR Pipeline] Correct init docs ( #16833 )
...
* correct
* up
2022-04-19 16:12:36 +02:00
Ella Charlaix
77de8d6c31
Add onnx export of models with a multiple choice classification head ( #16758 )
...
* Add export of models with a multiple-choice classification head
2022-04-19 15:51:51 +02:00
Wonjae Kim
b74a955325
fix `run_clm.py` seeking text column name twice ( #16624 )
2022-04-19 14:38:25 +01:00
Dahlbomii
3663fca41b
Type hints added for TFMobileBert ( #16505 )
...
* Type hints added
* make style
* Return type hints added
* fixed typo
Co-authored-by: matt <rocketknight1@gmail.com>
2022-04-19 14:37:03 +01:00
code-review-doctor
a2392415e9
Fix some tests misusing assertTrue for comparisons ( #16771 )
...
* Fix issue avoid-misusing-assert-true found at https://codereview.doctor
* fix tests
* fix tf
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-19 14:44:08 +02:00
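For context on the `avoid-misusing-assert-true` fix above: `assertTrue(a, b)` treats `b` as the failure message, so the call passes whenever `a` is truthy. The sketch below illustrates this with made-up values; the test class, helper name, and numbers are hypothetical and are not taken from the PR.

```python
import unittest


class AssertTrueMisuse(unittest.TestCase):
    def test_misused_assert_true_passes(self):
        # The second argument is the failure message, not an expected value,
        # so this passes even though 3 != 4.
        self.assertTrue(3, 4)

    def test_assert_equal_catches_mismatch(self):
        # assertEqual actually compares the two values and raises on mismatch.
        with self.assertRaises(AssertionError):
            self.assertEqual(3, 4)


def run_suite():
    """Run the illustrative tests quietly and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertTrueMisuse)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Both tests succeed, which is the problem: the first one can never fail on a value mismatch, so it verifies nothing about the comparison.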
Suraj Patil
d3bd9ac728
[Flax] improve large model init and loading ( #16148 )
...
* begin do_init
* add params_shape_tree
* raise error if params are accessed when do_init is False
* don't allow do_init=False when keys are missing
* make shape tree a property
* assign self._params at the end
* add test for do_init
* add do_init arg to all flax models
* fix param setting
* disable do_init for composite models
* update test
* add do_init in FlaxBigBirdForMultipleChoice
* better names and errors
* improve test
* style
* add a warning when do_init=False
* remove extra if
* set params after _required_params
* add test for from_pretrained
* do_init => _do_init
* change warning to info
* fix typo
* add params in init_weights
* add params to gpt neo init
* add params to init_weights
* update do_init test
* Trigger CI
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update template
* trigger CI
* style
* style
* fix template
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-19 14:19:55 +02:00
Arthur
6de4ee61a0
Wav2 vec2 phoneme ctc tokenizer optimisation ( #16817 )
...
* Solved href rendering issue in heading
Markdown references in headings such as '####' don't render well.
Replaced it with <h4>...<a></a></h> banners.
* PhonemeTokenizer optimization using phonemizer lib
The backend should only be initialized once, otherwise it is reloaded.
Added `init_backend` function, which initializes a backend attribute.
Phonemize re-uses self.backend.
Should give ~10 times faster phonemization.
* formatted file with make style
* Documentation suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update /tokenization_wav2vec2_phoneme.py based on PR suggestion
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update CONTRIBUTING.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-19 07:39:04 -04:00
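The phoneme tokenizer commit above describes initializing the backend once and reusing `self.backend` on every call. A minimal sketch of that pattern follows, assuming a hypothetical `SlowBackend` class in place of the real phonemizer backend; only the `init_backend` name and the `backend` attribute come from the commit message, everything else is illustrative.

```python
class SlowBackend:
    """Hypothetical stand-in for an expensive-to-load phonemizer backend."""

    load_count = 0  # counts how many times a backend was constructed

    def __init__(self, language):
        SlowBackend.load_count += 1
        self.language = language

    def phonemize(self, text):
        # Placeholder transformation; a real backend would return phonemes.
        return f"[{self.language}] {text}"


class PhonemeTokenizerSketch:
    """Illustrative tokenizer that builds its backend a single time."""

    def __init__(self, language="en-us"):
        self.language = language
        self.backend = None
        self.init_backend(language)

    def init_backend(self, language):
        # Build the backend once and keep it on the instance.
        self.backend = SlowBackend(language)

    def phonemize(self, text):
        # Reuse self.backend instead of constructing a new one per call.
        return self.backend.phonemize(text)
```

With this layout, repeated `phonemize` calls on one tokenizer construct the backend only once, which is where a speedup of the kind the commit mentions would come from.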
Li-Huai (Allan) Lin
306c9ee966
Fix `LayoutLMv2` tokenization docstrings ( #16187 )
...
* Fix docstrings
* Fix up
* Fix
2022-04-19 12:14:51 +02:00
NielsRogge
7db7aab439
Add semantic script no trainer, v2 ( #16788 )
...
* Add first draft from previous PR
* First draft
* Improve README and remove num_labels
* Make script more aligned with other scripts
* Improve README and apply suggestion from code review
2022-04-19 09:07:29 +02:00
NielsRogge
494c2a8c4d
Clean up semantic segmentation tests ( #16801 )
...
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-19 09:02:19 +02:00
David Hall
989a15d173
fix _setup_devices in case where there is no torch.distributed package in build ( #16821 )
...
* fix _setup_devices in case where there is not torch.distributed
* in training_args_sm.py as well
2022-04-18 18:36:46 -04:00
Lysandre Debut
c11a49573f
Refactor issues with yaml ( #16772 )
...
* Refactor issues with yaml
* Update .github/ISSUE_TEMPLATE/bug-report.yml
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Update .github/ISSUE_TEMPLATE/bug-report.yml
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Update .github/ISSUE_TEMPLATE/feature-request.yml
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update .github/ISSUE_TEMPLATE/bug-report.yml
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update .github/ISSUE_TEMPLATE/bug-report.yml
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-18 16:43:21 -04:00
jsnfly
51e0ebedcb
Allow passing encoder_outputs as tuple to EncoderDecoder Models ( #16814 )
...
* Add passing encoder_outputs as tuple to existing test
* Add check for tuple
* Add check for tuple also for speech and vision
Co-authored-by: jsnfly <jsnfly@gmx.de>
2022-04-18 19:49:58 +02:00
Nicholas Broad
51fa7191b1
use base_version to check torch version in torch_less_than_1_11 ( #16806 )
...
* use base_version
* make is_torch_less_than_1_8 match 1_11
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
2022-04-18 13:02:00 -04:00
Patrick von Platen
8d3f952adb
[Data2Vec] Add data2vec vision ( #16760 )
...
* save intermediate
* add vision
* add vision
* save
* finish models
* finish models
* continue
* finish
* up
* up
* up
* tests all pass
* clean up
* up
* up
* fix bugs in beit
* correct docs
* finish
* finish docs
* make style
* up
* more fixes
* fix type hint
* make style
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/data2vec/test_modeling_data2vec_vision.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix test
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-18 17:52:13 +02:00
Zhengqiang Yin
33cd4be576
fix megatron bert convert state dict naming ( #15820 )
2022-04-18 11:34:36 -04:00
Patrick von Platen
9a2995ee39
[Quicktour Audio] Improve && remove ffmpeg dependency ( #16723 )
...
* [Quicktour Audio] Improve && remove ffmpeg dependency
* final fix
* final touches
2022-04-18 16:50:13 +02:00
NielsRogge
d3c9d0e55f
[ViT, BEiT, DeiT, DPT] Improve code ( #16799 )
...
* Improve code
* Fix bugs
* Fix another bug
* Clean up DPT as well
* Update DPT model outputs
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-18 09:25:08 -04:00
Sylvain Gugger
3785f4665a
Fix syntax error in TorchHub workflow
2022-04-18 07:54:00 -04:00
Joao Gante
6984848ed0
Create empty venv on cache miss ( #16816 )
2022-04-18 07:49:31 -04:00
Allan Jie
438144832e
Raise error and suggestion when using custom optimizer with Fairscale or Deepspeed ( #16786 )
...
* optimizer issues related to saving
* remove the "optimizer saving" option
* reformat using make style
2022-04-18 07:47:21 -04:00
Joao Gante
b4ddd2677c
TF generate refactor - XLA sample ( #16713 )
2022-04-18 10:58:24 +01:00
Joao Gante
02de7a8e7f
CI: non-remote GH Actions now use a python venv ( #16789 )
2022-04-18 09:47:38 +01:00
Sylvain Gugger
dee6f01636
Pin Jax to last working release ( #16808 )
...
* Pin Jax to last working release
* Try lower
* Try lower
2022-04-16 21:15:19 -04:00
NielsRogge
78f346c2b5
Update README.md ( #16797 )
2022-04-15 14:10:16 +02:00
Yih-Dar
ee209d4d01
Fix PT TF ViTMAE ( #16766 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-15 06:37:10 +02:00
Stas Bekman
5da33f8729
[modeling utils] revamp `from_pretrained(..., low_cpu_mem_usage=True)` + tests ( #16657 )
...
* add low_cpu_mem_usage tests
* wip: revamping
* wip
* install /usr/bin/time
* wip
* cleanup
* cleanup
* cleanup
* cleanup
* cleanup
* fix assert
* put the wrapper back
* cleanup; switch to bert-base-cased
* Trigger CI
* Trigger CI
2022-04-14 18:10:05 -07:00
Stas Bekman
ce2fef2ad2
[trainer / deepspeed] fix hyperparameter_search ( #16740 )
...
* [trainer / deepspeed] fix hyperparameter_search
* require optuna
* style
* oops
* add dep in the right place
* create deepspeed-testing dep group
* Trigger CI
2022-04-14 17:24:38 -07:00
code-review-doctor
1b7de41a07
Fix issue avoid-missing-comma found at https://codereview.doctor ( #16768 )
2022-04-14 16:42:27 -04:00
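For context on the `avoid-missing-comma` fix above: Python silently joins adjacent string literals, so a dropped comma inside a list merges two entries into one without raising an error. The lists below are made-up examples, not the code touched by the PR.

```python
# Missing comma after "gpt2": Python concatenates the adjacent string
# literals, so the list ends up with two elements instead of three.
buggy = [
    "bert",
    "gpt2"
    "t5",
]

# With the comma restored, each name is its own element.
fixed = [
    "bert",
    "gpt2",
    "t5",
]

print(buggy)
print(fixed)
```

Because the buggy list is still valid Python, this kind of mistake usually only surfaces through a linter or a test that checks the list's contents.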
Sanchit Gandhi
de8b06f9bf
[SpeechEncoderDecoderModel] Fix bug in reshaping labels ( #16748 )
2022-04-14 19:02:40 +01:00
NielsRogge
048443db86
Improve image classification example ( #16585 )
...
* Improve README
* Make dataset_name argument optional
* Improve local data
* Fix bug
* Improve README some more
* Apply suggestions from code review
* Improve README
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-14 18:10:52 +02:00
Sylvain Gugger
3e4eec47f5
Kill async pushes when calling push_to_hub with blocking=True ( #16755 )
2022-04-14 10:02:29 -04:00
Stas Bekman
c21e1071a7
[deepspeed / m2m_100] make deepspeed zero-3 work with layerdrop ( #16717 )
...
* [deepspeed / m2m_100] make deepspeed 3 work with layerdrop
* fix
* revert last
2022-04-14 06:51:55 -07:00
Zachary Mueller
89293a0f6b
Make nightly install dev accelerate ( #16783 )
2022-04-14 09:41:02 -04:00
Sylvain Gugger
b151ddb9b9
Fix batch size in evaluation loop ( #16763 )
...
* Fix batch size in evaluation loop
* remove debug statement
2022-04-14 09:22:54 -04:00
Sanchit Gandhi
d8269eb4d5
[Flax `.from_pretrained`] Raise a warning if model weights are not in float32 ( #16762 )
...
* [Flax] Raise a warning if model weights are not in float32
* apply suggestions and few small changes
* reorder wording for better readability
2022-04-14 11:52:15 +02:00
Nicolas Patry
195fbbb6cf
Enabling `Tapex` in table question answering pipeline. ( #16663 )
...
* Enabling `Tapex` in table question answering pipeline.
* Questions are independent for Tapex, making the test respect that.
* Missing extra space.
2022-04-14 09:06:14 +02:00