Commit Graph

15053 Commits

Author SHA1 Message Date
Sebastian
3e142cb0f5
fix overflow when training mDeberta in fp16 (#24116)
* Porting changes from https://github.com/microsoft/DeBERTa/ that hopefully allow for fp16 training of mDeBERTa

* Updates to deberta modeling from microsoft repo

* Performing some cleanup

* Undoing changes that weren't necessary

* Undoing float calls

* Minimally change the p2c block

* Fix error

* Minimally changing the c2p block

* Switch to torch sqrt

* Remove math

* Adding back the `.to()` calls to scale

* Undoing attention_scores change

* Removing commented out code

* Updating modeling_sew_d.py to satisfy utils/check_copies.py

* Missed change

* Further reduce changes needed to get fp16 working

* Reverting changes to modeling_sew_d.py

* Make same change in TF
2023-06-13 15:04:27 +01:00
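A minimal sketch of the fp16-safe scaling pattern the bullets above describe (switching from math.sqrt to torch.sqrt and applying the scale before the matmul); the function and variable names are illustrative, not the actual DeBERTa modeling code:

```python
import torch

def scaled_attention_scores(query_layer, key_layer, scale_factor):
    # Compute the scale as a tensor in the query's dtype (torch.sqrt rather than
    # math.sqrt), and divide the query *before* the matmul so the intermediate
    # product stays inside fp16 range instead of overflowing and being rescaled later.
    scale = torch.sqrt(torch.tensor(query_layer.size(-1), dtype=query_layer.dtype) * scale_factor)
    return torch.bmm(query_layer / scale.to(query_layer.device), key_layer.transpose(-1, -2))
```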
amyeroberts
f91810da88
Safely import pytest in testing_utils.py (#24241) 2023-06-13 14:28:08 +01:00
Nicolas Patry
fdd78d9153
Improving error message when using use_safetensors=True. (#24232) 2023-06-13 15:07:00 +02:00
Yih-Dar
74b846cacf
Update (TF)SamModelIntegrationTest (#24199)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-13 14:28:14 +02:00
yuanwu2017
d7389cd201
fix: TextIteratorStreamer cannot work with pipeline (#23641)
* fix: TextIteratorStreamer cannot work with pipeline

Deepcopying the TextIteratorStreamer object causes an exception.

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Update src/transformers/pipelines/text_generation.py

Got it. I will update the patch.

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/pipelines/text_generation.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update text_generation.py

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-06-13 10:42:41 +01:00
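A hedged usage sketch of the combination this fix enables, a TextIteratorStreamer passed straight to a text-generation pipeline; the checkpoint and prompt below are placeholders:

```python
from threading import Thread
from transformers import AutoTokenizer, TextIteratorStreamer, pipeline

model_id = "gpt2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
generator = pipeline("text-generation", model=model_id, tokenizer=tokenizer)

# Run generation in a background thread and consume text chunks as they arrive.
thread = Thread(
    target=generator,
    args=("Once upon a time",),
    kwargs={"streamer": streamer, "max_new_tokens": 20},
)
thread.start()
for new_text in streamer:
    print(new_text, end="", flush=True)
thread.join()
```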
Sylvain Gugger
70c7994095
Fix README copies 2023-06-12 16:24:27 -04:00
Yih-Dar
41a8fa4e14
Add the number of model test failures to slack CI report (#24207)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 21:27:10 +02:00
Zach Mueller
4da84008dc
Finish dataloader integration (#24201) 2023-06-12 13:26:17 -04:00
Yih-Dar
0675600a60
Update WhisperForAudioClassification doc example (#24188)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 19:10:31 +02:00
fxmarty
e5dd7432e7
Remove unnecessary aten::to overhead in llama (#24203)
* fix dtype init

* fix copies

* fix fixcopies mess

* edit forward as well

* copy
2023-06-12 12:18:04 -04:00
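An illustrative sketch of the kind of change behind "fix dtype init": allocate constants directly in the target dtype and device rather than creating them in default precision and converting with .to() afterwards. The helper below is hypothetical, not the actual modeling_llama.py code:

```python
import torch

def make_causal_mask(seq_len, dtype, device):
    # Allocating in the final dtype/device up front avoids an extra aten::to copy
    # on every forward pass.
    mask = torch.full((seq_len, seq_len), torch.finfo(dtype).min, dtype=dtype, device=device)
    return torch.triu(mask, diagonal=1)
```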
Yih-Dar
4fe9716a79
Skip RWKV test in past CI (#24204)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 18:14:15 +02:00
Ethan
f7d80cb3d2
Fix step bugs in no trainer examples (#24197)
Fix step bugs in no trainer + load checkpoint + grad acc
2023-06-12 11:49:55 -04:00
Marc Sun
08ae37c820
Fix _load_pretrained_model (#24200)
Fix test
2023-06-12 11:31:06 -04:00
Zach Mueller
ebd94b0f6f
🚨🚨🚨 Replace DataLoader logic for Accelerate in Trainer, remove unneeded tests 🚨🚨🚨 (#24028)
* Working integration

* Fix failing test

* Revert label host logic

* Bring it back!
2023-06-12 11:23:37 -04:00
Kihoon Son
dc42a9d76f
🌐 [i18n-KO] Translated tasks_summary.mdx to Korean (#23977)
* 🌐 [i18n-KO] Translated tasks_summary.mdx to Korean

Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>

* Apply suggestions from code review

Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* Update _toctree.yml

* Delete generation_strategies.mdx

* Delete tasks_explained.mdx

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
2023-06-12 11:07:15 -04:00
Joao Gante
60b69f7de2
Generate: detect special architectures when loaded from PEFT (#24198) 2023-06-12 16:06:20 +01:00
Jacob
97527898da
typo: fix typos in CONTRIBUTING.md and deepspeed.mdx (#24184)
* typo: fix typos in CONTRIBUTING.md and deepspeed.mdx

* Update CONTRIBUTING.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-06-12 15:43:58 +01:00
Yih-Dar
dadc9fb427
Update GPTNeoXLanguageGenerationTest (#24193)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 15:37:12 +02:00
Yih-Dar
a9cdb059a8
Fix device issue in OpenLlamaModelTest::test_model_parallelism (#24195)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 15:21:27 +02:00
Joao Gante
9f81f4f6dd
Generate: force caching on the main model, in assisted generation (#24177) 2023-06-12 14:10:49 +01:00
Kihoon Son
535f92aea3
[i18n]Translated "attention.mdx" to korean (#23878)
* [i18n]Translated "attention.mdx" to korean

Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* Update _toctree.yml

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2023-06-12 08:59:18 -04:00
AinL
ba64ec07bb
Change ProgressCallback to use dynamic_ncols=True (#24101)
* Change ProgressCallback to use dynamic_ncols=True

* style: make style

* Revert "style: make style"

This reverts commit dee484904c.

* run make style only trainer_callback
2023-06-12 08:56:48 -04:00
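The change above amounts to creating the trainer's tqdm bars with dynamic_ncols=True so they resize with the terminal. A simplified sketch of that pattern (not the full ProgressCallback, only the relevant call):

```python
from tqdm.auto import tqdm
from transformers import TrainerCallback

class ResizableProgressCallback(TrainerCallback):  # illustrative name
    def on_train_begin(self, args, state, control, **kwargs):
        if state.is_world_process_zero:
            # dynamic_ncols=True lets tqdm adapt the bar width to the terminal.
            self.training_bar = tqdm(total=state.max_steps, dynamic_ncols=True)
```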
NielsRogge
93f73a3848
Fix push to hub (#24187)
Add fix
2023-06-12 08:51:09 -04:00
Yih-Dar
e26c6f03be
Fix Wav2Vec2 CI OOM (#24190)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 11:39:04 +02:00
Yih-Dar
8f093fb799
Avoid OOM in doctest CI (#24139)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-10 09:47:38 +02:00
Stas Bekman
0d217f428f
[tests] fix bitsandbytes import issue (#24151)
fix bitsandbytes import issue
2023-06-09 21:53:11 -07:00
Lysandre Debut
deff5979fe
Tool types (#24032)
* Tool types

* Tests + fixes

* Isolate types

* Oops

* Review comments + docs

* Tests + docs

* soundfile -> vision
2023-06-09 13:34:07 -04:00
Freddie Vargus
061580c82c
Fix typo in streamers.py (#24144) 2023-06-09 17:27:46 +01:00
LiamSwayne
12bb853ccd
[documentation] grammatical fixes in image_classification.mdx (#24141)
Update image_classification.mdx
2023-06-09 16:59:44 +01:00
Yih-Dar
d0d1632958
Fix Pipeline CI OOM issue (#24124)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 16:49:02 +02:00
Arthur
a7501f6fc6
[BlenderBotSmall] Update doc example (#24092)
* small tokenizer uses `__start__` and `__end__`

* fix PR doctest
2023-06-09 16:31:57 +02:00
Arthur
5af3a1aa48
[LlamaTokenizerFast] Update documentation (#24132)
* Update documentation

* nits
2023-06-09 16:30:20 +02:00
Younes Belkada
62fe753325
[SAM] Fix sam slow test (#24140)
* fix sam test

* update pipeline typehint
2023-06-09 16:22:09 +02:00
Yih-Dar
847b47c0ee
Fix XGLM OOM on CI (#24123)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 15:20:59 +02:00
Yih-Dar
b8fe259f16
Fix SAM OOM issue on CI (#24125)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 15:07:08 +02:00
Yih-Dar
707023d155
Fix TF Rag OOM issue (#24122)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 15:03:11 +02:00
Sourab Mangrulkar
f2b918356c
fix bugs with trainer (#24134)
* fix the deepspeed test failures

* apex fix

* FSDP save ckpt fix

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-06-09 17:54:53 +05:30
Joao Gante
be10092e63
Generate: PT's top_p enforces min_tokens_to_keep when it is 1 (#24111) 2023-06-09 13:20:05 +01:00
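A standalone sketch of the nucleus (top-p) filtering rule the commit title refers to, in which at least min_tokens_to_keep tokens always survive the filter, even when it is 1; this is a simplified illustration, not the library's TopPLogitsWarper:

```python
import torch

def top_p_filter(scores, top_p, min_tokens_to_keep=1):
    # Sort ascending so the highest-probability tokens sit at the end.
    sorted_logits, sorted_indices = torch.sort(scores, descending=False)
    cumulative_probs = sorted_logits.softmax(dim=-1).cumsum(dim=-1)
    # Drop tokens whose cumulative probability stays below 1 - top_p ...
    sorted_indices_to_remove = cumulative_probs <= (1 - top_p)
    # ... but always keep at least min_tokens_to_keep of the top tokens.
    sorted_indices_to_remove[..., -min_tokens_to_keep:] = False
    indices_to_remove = sorted_indices_to_remove.scatter(-1, sorted_indices, sorted_indices_to_remove)
    return scores.masked_fill(indices_to_remove, float("-inf"))
```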
Matt
03585f3734
Correctly build models and import call_context for older TF versions (#24138) 2023-06-09 13:11:01 +01:00
Younes Belkada
a6d05d55f6
[bnb] Fix bnb config json serialization (#24137)
* fix bnb config json serialization

* forward contrib credits from discussions

---------

Co-authored-by: Andrechang <Andrechang@users.noreply.github.com>
2023-06-09 13:41:14 +02:00
Elliott Wang
e2972dffdd
PLAM => PaLM (#24129) 2023-06-09 12:32:16 +01:00
Arthur
535542d38d
[Llama] Update tokenization code to ensure parsing of the special tokens [core] (#24042)
* prevent llama fast from returning token type ids

* remove type hints

* normalised False
2023-06-09 09:36:19 +02:00
Yih-Dar
2e2088f24b
Avoid GPT-2 daily CI job OOM (in TF tests) (#24106)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-08 18:21:09 +02:00
Serge Panev
9322c24476
Fix typo in Llama docstrings (#24020)
* Fix typo in Llama docstrings

Signed-off-by: Serge Panev <spanev@nvidia.com>

* Update

Signed-off-by: Serge Panev <spanev@nvidia.com>

* make style

Signed-off-by: Serge Panev <spanev@nvidia.com>

---------

Signed-off-by: Serge Panev <spanev@nvidia.com>
2023-06-08 17:19:07 +01:00
Radamés Ajna
a73883ae9e
add trust_remote_code option to CLI download cmd (#24097)
* add trust_remote_code option

* require_torch
2023-06-08 11:13:57 -04:00
Younes Belkada
8b169142f8
[GPT2] Add correct keys on _keys_to_ignore_on_load_unexpected on all child classes of GPT2PreTrainedModel (#24113)
* add correct keys on `_keys_to_ignore_on_load_unexpected`

* oops
2023-06-08 10:21:42 -04:00
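An illustrative sketch of the mechanism this commit touches: child classes of GPT2PreTrainedModel list checkpoint keys that may safely be ignored when they show up unexpectedly at load time. The subclass and regexes below are examples, not the exact additions from the PR:

```python
from transformers import GPT2PreTrainedModel

class MyGPT2Head(GPT2PreTrainedModel):  # hypothetical subclass
    # Buffers such as the causal-attention bias are recomputed at init, so matching
    # checkpoint entries can be ignored without warning the user.
    _keys_to_ignore_on_load_unexpected = [r"h\.\d+\.attn\.bias", r"h\.\d+\.attn\.masked_bias"]
```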
Marc Sun
71a114d3e0
fix get_keys_to_not_convert function (#24095)
* fix get_keys_to_not_convert funct

* Fix style
2023-06-08 10:14:27 -04:00
Sylvain Gugger
8c5f306719
Update the pin on Accelerate (#24110) 2023-06-08 10:11:01 -04:00
Younes Belkada
2200bf7a45
[Trainer] Correct behavior of _load_best_model for PEFT models (#24103)
* v1

* some refactor

- add ST format as well

* fix

* add `ADAPTER_WEIGHTS_NAME` & `ADAPTER_SAFE_WEIGHTS_NAME`
2023-06-08 15:38:30 +02:00
Sourab Mangrulkar
0f23605094
reset accelerate env variables after each test (#24107) 2023-06-08 09:19:07 -04:00