Commit Graph

15053 Commits

Author SHA1 Message Date
Sanchit Gandhi
c9cf337772
[Whisper Tokenizer] Skip special tokens when decoding with timestamps (#23945) 2023-06-02 16:26:59 +02:00
Claudius Kienle
8940d315aa
Trainer: fixed evaluate raising KeyError for ReduceLROnPlateau (#23952)
Trainer: fixed KeyError on evaluate for ReduceLROnPlateau

Co-authored-by: Claudius Kienle <claudius.kienle@artiminds.com>
2023-06-02 08:53:48 -04:00
Kihoon Son
2fdba73a99
🌐 [i18n-KO] Translated object_detection.mdx to Korean (#23164)
* translated object_detection.mdx

Co-Authored-By: Hyeonseo Yun <0525_hhgus@naver.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: simso <3035487+simso@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>

---------

Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: simso <3035487+simso@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
2023-06-02 07:43:55 -04:00
Patrick von Platen
dcb5e18c9e
add new mms functions to doc (#23954) 2023-06-02 11:35:52 +01:00
Shehan Munasinghe
07c54413ac
Add MobileViTv2 (#22820)
* generated code from add-new-model-like

* Add code for modeling, config, and weight conversion

* add tests for image-classification, update modeling and config

* add code, tests for semantic-segmentation

* make style, make quality, make fix-copies

* make fix-copies

* Update modeling_mobilevitv2.py

fix bugs

* Update _toctree.yml

* update modeling, config

fix bugs

* Edit docs - fix bug MobileViTv2v2 -> MobileViTv2

* Update mobilevitv2.mdx

* update docstrings

* Update configuration_mobilevitv2.py

make style

* Update convert_mlcvnets_to_pytorch.py

remove unused options

* Update convert_mlcvnets_to_pytorch.py

make style

* Add suggestions from code review

Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make style, make quality

* Add suggestions from code review

Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add suggestions from code review

Remove MobileViTv2ImageProcessor

Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make style

* Add suggestions from code review

Rename MobileViTv2 -> MobileViTV2

Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add suggestions from code review

Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update modeling_mobilevitv2.py

make style

* Update serialization.mdx

* Update modeling_mobilevitv2.py

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-06-02 10:37:02 +01:00
Patrick von Platen
5dfd407b37
[MMS] Scaling Speech Technology to 1,000+ Languages | Add attention adapter to Wav2Vec2 (#23813)
* add fine-tuned with adapter layer

* Add set_target_lang to tokenizer

* Implement load adapter

* add tests

* make style

* Apply suggestions from code review

* Update src/transformers/models/wav2vec2/tokenization_wav2vec2.py

* make fix-copies

* Apply suggestions from code review

* make fix-copies

* make style again

* make style again

* fix doc string

* Update tests/models/wav2vec2/test_tokenization_wav2vec2.py

* Apply suggestions from code review

* fix

* Correct wav2vec2 adapter

* make style

* Update src/transformers/models/wav2vec2/modeling_wav2vec2.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* add more nice docs

* finish

* finish

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Apply suggestions from code review

* all finish

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-06-02 10:30:24 +01:00
wasupandceacar
f49a3453ca
Fix ReduceLROnPlateau object has no attribute 'get_last_lr' (#23944)
* Fix 'ReduceLROnPlateau' object has no attribute 'get_last_lr'

* fix style
2023-06-01 16:10:52 -04:00
Kashif Rasul
c62b01d0b0
use _make_causal_mask in clip/vit models (#23942)
use _make_causal_mask in clip models
2023-06-01 16:10:24 -04:00
Marc Sun
e03a9cc0cd
Modify device_map behavior when loading a model using from_pretrained (#23922)
* Modify device map behavior for 4/8 bits model

* Remove device_map arg for training 4/8 bit model

* Remove index

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add Exceptions

* Modify comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix formatting

* Get current device with accelerate

* Revert "Get current device with accelerate"

This reverts commit 46f0079910.

* Fix Exception

* Modify quantization doc

* Fix error

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-06-01 13:21:22 -04:00
Brendon Soong
d1fa349e78
#23675 Registering Malay language (#23689)
* #23675 Registering Malay language

* removing untranslated files

* some translate

* more updates to toctree

* inc index

* additional translations for toctree

* translations of more sections

* removing untranslated file

* translated index.mdx to malay
2023-06-01 13:17:27 -04:00
Lysandre Debut
dc67da0182
Revert "Update stale.yml to use HuggingFaceBot" (#23943)
Revert "Update stale.yml to use HuggingFaceBot (#23941)"

This reverts commit 5929f86ebb.
2023-06-01 11:58:11 -04:00
Matt
8088ca4185
Make TF ESM inv_freq non-trainable like PyTorch (#23940)
Make TF inv_freq non-trainable like PyTorch
2023-06-01 16:15:00 +01:00
Lysandre Debut
5929f86ebb
Update stale.yml to use HuggingFaceBot (#23941) 2023-06-01 10:54:50 -04:00
Adam Lewis
857d4e1c87
rename DocumentQuestionAnsweringTool parameter input to match docstring (#23939)
rename encode input to match docstring
2023-06-01 10:54:01 -04:00
Sylvain Gugger
9193188276
Pin rhoknp (#23937) 2023-06-01 10:25:43 -04:00
Sheon Han
af2c36793f
Fix doc string nits (#23929) 2023-06-01 10:10:15 -04:00
fxmarty
9a35a7b9e1
Effectively allow encoder_outputs input to be a tuple in pix2struct (#23932)
consistency
2023-06-01 09:07:57 -04:00
Sanchit Gandhi
9603ef890a
[Flax Whisper] Update decode docstring (#23908) 2023-06-01 14:36:45 +02:00
Sylvain Gugger
fabe17a726
Skip device placement for past key values in decoder models (#23919) 2023-05-31 15:32:21 -04:00
NielsRogge
6affd9cd7c
[PushToHub] Make it possible to upload folders (#23920)
Add first draft
2023-05-31 15:31:28 -04:00
Sylvain Gugger
4aa13224a5
Update the update metadata job to use upload_folder (#23917) 2023-05-31 14:10:14 -04:00
Sylvain Gugger
3ff443a6d9
Re-enable squad test (#23912)
* Re-enable squad test

* [all-test]

* [all-test] Fix all test command

* Fix the all-test
2023-05-31 13:44:26 -04:00
Sourab Mangrulkar
d13021e35f
remove the extra accelerator.prepare (#23914)
remove the extra `accelerator.prepare` that slipped in with multiple updates from main 😅
2023-05-31 23:04:55 +05:30
amyeroberts
c608b8fc93
Bug fix - flip_channel_order for channels first images (#23701)
Bug fix - flip_channel_order for channels_first
2023-05-31 17:12:27 +01:00
Sylvain Gugger
0b3d092f63
Empty circleci config (#23913)
* Try easy first

* Add an empty job

* Fix name

* Fix method
2023-05-31 12:02:05 -04:00
amyeroberts
8714b964ee
Raise error if loss can't be calculated - ViT MIM (#23872)
Raise error if loss can't be calculated
2023-05-31 17:01:53 +01:00
Hari
404d925384
add conditional statement for auxiliary loss calculation (#23899)
* add conditional statement for auxiliary loss calculation

* fix style and copies
2023-05-31 16:40:23 +01:00
Younes Belkada
c63bfc3023
[RWKV] Fix RWKV 4bit (#23910)
fix RWKV 4bit
2023-05-31 17:36:56 +02:00
Zachary Mueller
55451c66ce
Upgrade safetensors version (#23911)
* Upgrade safetensors

* Second table
2023-05-31 11:30:39 -04:00
Connor Henderson
7adce8b532
fix: Replace add_prefix_space in get_prompt_ids with manual space for FastTokenizer compatibility (#23796)
* add ' ' replacement for add_prefix_space

* add fast tokenizer test
2023-05-31 10:52:35 -04:00
Zachary Mueller
84bac652f3
Move import check to before state reset (#23906)
* Move import check to before state reset

* Guard better
2023-05-31 10:49:43 -04:00
Younes Belkada
e42869b091
[bnb] add warning when no linear (#23894)
* add warning for gpt2-like models

* more details

* adapt from suggestions
2023-05-31 16:40:07 +02:00
Sanchit Gandhi
8f915c450d
Unpin numba (#23162)
* fix for ragged list

* unpin numba

* make style

* np.object -> object

* propagate changes to tokenizer as well

* np.long -> "long"

* revert tokenization changes

* check with tokenization changes

* list/tuple logic

* catch numpy

* catch else case

* clean up

* up

* better check

* trigger ci

* Empty commit to trigger CI
2023-05-31 14:59:30 +01:00
Xinyu Yang
d99f11e898
ensure banned_mask and indices in same device (#23901)
* ensure banned_mask and indices in same device

* ensure banned_mask and indices in same device

switch the order in which indices and banned_mask are created and create banned_mask on the proper device
2023-05-31 09:47:46 -04:00
Thomas Wang
d68d6665f9
Support shared tensors (#23871)
* Support shared storage

* Really be sure we have the same storage

* Make style

* - Refactor storage identifier mechanism
 - Group everything into a single for loop

* Make style

* PR

* make style

* Update src/transformers/pytorch_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-05-31 09:42:30 -04:00
Sylvain Gugger
68d53bc717
Fix Trainer when model is loaded on a different GPU (#23792) 2023-05-31 07:54:26 -04:00
Calico
0963a2508b
fix(configuration_llama): add keys_to_ignore_at_inference to LlamaConfig (#23891) 2023-05-31 07:39:51 -04:00
Sylvain Gugger
00f6ba0e7e
Skip failing test for now 2023-05-31 06:31:33 -04:00
Sourab Mangrulkar
a73b1d59a3
accelerate deepspeed and gradient accumulation integrate (#23236)
* mixed precision support via accelerate

* fix issues

* fix for the sharded ddp case

* fix flax and tf failing tests

* refactor the place to create `Accelerator` object

* move ddp prep to accelerate

* fix 😅

* resolving comments

* move fsdp handling to accelerate

* fixes

* fix saving

* shift torch dynamo handling to accelerate

* shift deepspeed integration and save & load utils to accelerate

* fix accelerate launcher support

* oops

* fix 🐛

* save ckpt fix

* Trigger CI

* nasty 🐛 😅

* as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate

* make tests happy

* quality 

* loss tracked needs to account for grad_acc

* fixing the deepspeed tests

* quality 

* 😅😅😅

* tests 😡

* quality 

* Trigger CI

* resolve comments and fix the issue with the previous merge from branch

* Trigger CI

* accelerate took over deepspeed integration

---------

Co-authored-by: Stas Bekman <stas@stason.org>
2023-05-31 15:16:22 +05:30
Denisa Roberts
88f50a1e89
Add TensorFlow implementation of EfficientFormer (#22620)
* Add tf code for efficientformer

* Fix return dict bug - return last hidden state after last stage

* Fix corresponding return dict bug

* Override test tol

* Change default values of training to False

* Set training to default False X3

* Rm axis from ln

* Set init in dense projection

* Rm debug stuff

* Make style; all tests pass.

* Modify year to 2023

* Fix attention biases codes

* Update the shape list logic

* Add a batch norm eps config

* Remove extract comments in test files

* Add conditional attn and hidden states return for serving output

* Change channel dim checking logic

* Add exception for withteacher model in training mode

* Revert layer count for now

* Add layer count for conditional layer naming

* Transpose for conv happens only in main layer

* Make tests smaller

* Make style

* Update doc

* Rm from_pt

* Change to actual expect image class label

* Remove stray print in tests

* Update image processor test

* Remove the old serving output logic

* Make style

* Make style

* Complete test
2023-05-31 10:43:12 +01:00
Sylvain Gugger
9fea71b465
Fix last instances of kbit -> quantized (#23797) 2023-05-31 11:38:20 +02:00
Sam Passaglia
38dbbc2640
Fix bug leading to missing token in GPTSanJapaneseTokenizer (#23883)
* add \n

* removed copied from header
2023-05-31 11:32:27 +02:00
Sourab Mangrulkar
03db591047
shift torch dynamo handling to accelerate (#23168)
* mixed precision support via accelerate

* fix issues

* fix for the sharded ddp case

* fix flax and tf failing tests

* refactor the place to create `Accelerator` object

* move ddp prep to accelerate

* fix 😅

* resolving comments

* move fsdp handling to accelerate

* fixes

* fix saving

* shift torch dynamo handling to accelerate
2023-05-31 14:42:07 +05:30
Sourab Mangrulkar
0b774074a5
move fsdp handling to accelerate (#23158)
* mixed precision support via accelerate

* fix issues

* fix for the sharded ddp case

* fix flax and tf failing tests

* refactor the place to create `Accelerator` object

* move ddp prep to accelerate

* fix 😅

* resolving comments

* move fsdp handling to accelerate

* fixes

* fix saving
2023-05-31 14:10:46 +05:30
Sohyun Sim
015829e6c4
🌐 [i18n-KO] Translated pad_truncation.mdx to Korean (#23823)
* docs: ko: pad_truncation.mdx

* feat: manual draft

* fix: resolve suggestions

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
2023-05-31 10:23:59 +02:00
Sourab Mangrulkar
1cf148a6aa
Smangrul/accelerate ddp integrate (#23151)
* mixed precision support via accelerate

* fix issues

* fix for the sharded ddp case

* fix flax and tf failing tests

* refactor the place to create `Accelerator` object

* move ddp prep to accelerate

* fix 😅

* resolving comments
2023-05-31 13:42:49 +05:30
Sourab Mangrulkar
9f0646a555
Smangrul/accelerate mp integrate (#23148)
* mixed precision support via accelerate

* fix issues

* fix for the sharded ddp case

* fix flax and tf failing tests

* refactor the place to create `Accelerator` object

* address comments by removing debugging print statements
2023-05-31 12:27:51 +05:30
Abhinav Patil
de9255de27
Adds AutoProcessor.from_pretrained support for MCTCTProcessor (#23856)
Adds support for AutoProcessor.from_pretrained to MCTCTProcessor models
2023-05-30 14:36:18 -04:00
George
6451ad0471
Editing issue with pickle def with lambda function (#23869)
* Editing issue with pickle def with lambda function

* fix type

* Made helper function private

* delete tab

---------

Co-authored-by: georgebredis <9454-georgebredis@users.noreply.gitlab.aicrowd.com>
2023-05-30 13:26:37 -04:00
Arthur
af2aac51fc
[from_pretrained] improve the error message when _no_split_modules is not defined (#23861)
* Better warning

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* format line

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-05-30 17:12:14 +02:00