Sebastian
3e142cb0f5
fix overflow when training mDeberta in fp16 ( #24116 )
...
* Porting changes from https://github.com/microsoft/DeBERTa/ that hopefully allow for fp16 training of mdeberta
* Updates to deberta modeling from microsoft repo
* Performing some cleanup
* Undoing changes that weren't necessary
* Undoing float calls
* Minimally change the p2c block
* Fix error
* Minimally changing the c2p block
* Switch to torch sqrt
* Remove math
* Adding back the `.to` calls to scale
* Undoing attention_scores change
* Removing commented out code
* Updating modeling_sew_d.py to satisfy utils/check_copies.py
* Missed change
* Further reduce changes needed to get fp16 working
* Reverting changes to modeling_sew_d.py
* Make same change in TF
2023-06-13 15:04:27 +01:00
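The kind of overflow this commit targets can be illustrated without the model. Half precision tops out at 65504, so an attention dot product can leave the representable range before the scale factor is applied. The sketch below uses only the standard library; `to_fp16`, `dot_fp16`, and the sample values are hypothetical, every intermediate is rounded to fp16 for simplicity, and applying the scale to an operand first is shown as one common remedy, not necessarily the exact change made in the DeBERTa code.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round x to half precision; return inf where fp16 would overflow."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def dot_fp16(a, b):
    """Dot product that rounds every intermediate result to fp16."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_fp16(acc + to_fp16(x * y))
    return acc


head_dim = 64
scale = math.sqrt(head_dim)  # hypothetical scale factor
q = [40.0] * head_dim
k = [40.0] * head_dim

# Scale after the product: the running sum passes FP16_MAX and becomes inf.
late = to_fp16(dot_fp16(q, k) / scale)

# Scale the query first: every intermediate stays inside the fp16 range.
early = dot_fp16([to_fp16(x / scale) for x in q], k)

print(late, early)
```

Both orderings are mathematically equivalent; only the second keeps the intermediate values finite in half precision.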
amyeroberts
f91810da88
Safely import pytest in testing_utils.py ( #24241 )
2023-06-13 14:28:08 +01:00
Nicolas Patry
fdd78d9153
Improving error message when using use_safetensors=True. ( #24232 )
2023-06-13 15:07:00 +02:00
Yih-Dar
74b846cacf
Update (TF)SamModelIntegrationTest ( #24199 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-13 14:28:14 +02:00
yuanwu2017
d7389cd201
fix: TextIteratorStreamer cannot work with pipeline ( #23641 )
...
* fix: TextIteratorStreamer cannot work with pipeline
Deepcopying the TextIteratorStreamer object causes the exception.
Signed-off-by: yuanwu <yuan.wu@intel.com>
* Update src/transformers/pipelines/text_generation.py
Got it. I will update the patch.
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/pipelines/text_generation.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update text_generation.py
---------
Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-06-13 10:42:41 +01:00
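The failure described in this commit can be reproduced with a stand-in object: anything that owns a `queue.Queue` also owns thread locks, and those cannot be deep-copied. The sketch below is an assumption-laden illustration, not the patch itself; `QueueBackedStreamer` and `split_generate_kwargs` are hypothetical names, and the actual change in `text_generation.py` may differ.

```python
import copy
import queue


class QueueBackedStreamer:
    """Hypothetical stand-in for a streamer that hands text to another thread."""

    def __init__(self):
        self.text_queue = queue.Queue()  # holds thread locks internally


def split_generate_kwargs(kwargs: dict) -> dict:
    """Deep-copy plain settings but pass the streamer through by reference."""
    streamer = kwargs.get("streamer")
    rest = {k: v for k, v in kwargs.items() if k != "streamer"}
    copied = copy.deepcopy(rest)
    if streamer is not None:
        copied["streamer"] = streamer
    return copied


streamer = QueueBackedStreamer()
kwargs = {"max_new_tokens": 20, "streamer": streamer}

try:
    copy.deepcopy(kwargs)
    deepcopy_failed = False
except TypeError:
    deepcopy_failed = True  # the locks inside queue.Queue cannot be copied

safe_kwargs = split_generate_kwargs(kwargs)
```

Passing the streamer by reference also matches how it is consumed: the caller iterates over the same object that generation writes to.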
Sylvain Gugger
70c7994095
Fix README copies
2023-06-12 16:24:27 -04:00
Yih-Dar
41a8fa4e14
Add the number of model test failures to slack CI report ( #24207 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 21:27:10 +02:00
Zach Mueller
4da84008dc
Finish dataloader integration ( #24201 )
2023-06-12 13:26:17 -04:00
Yih-Dar
0675600a60
Update WhisperForAudioClassification doc example ( #24188 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 19:10:31 +02:00
fxmarty
e5dd7432e7
Remove unnecessary aten::to overhead in llama ( #24203 )
...
* fix dtype init
* fix copies
* fix fixcopies mess
* edit forward as well
* copy
2023-06-12 12:18:04 -04:00
Yih-Dar
4fe9716a79
Skip RWKV test in past CI ( #24204 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 18:14:15 +02:00
Ethan
f7d80cb3d2
Fix steps bugs in no trainer examples ( #24197 )
...
Fix step bugs in no trainer + load checkpoint + grad acc
2023-06-12 11:49:55 -04:00
Marc Sun
08ae37c820
Fix _load_pretrained_model ( #24200 )
...
Fix test
2023-06-12 11:31:06 -04:00
Zach Mueller
ebd94b0f6f
🚨 🚨 🚨 Replace DataLoader logic for Accelerate in Trainer, remove unneeded tests 🚨 🚨 🚨 ( #24028 )
...
* Working integration
* Fix failing test
* Revert label host logic
* Bring it back!
2023-06-12 11:23:37 -04:00
Kihoon Son
dc42a9d76f
🌐 [i18n-KO] Translated tasks_summary.mdx to Korean ( #23977 )
...
* 🌐 [i18n-KO] Translated tasks_summary.mdx to Korean
Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
* Apply suggestions from code review
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* Update _toctree.yml
* Delete generation_strategies.mdx
* Delete tasks_explained.mdx
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
2023-06-12 11:07:15 -04:00
Joao Gante
60b69f7de2
Generate: detect special architectures when loaded from PEFT ( #24198 )
2023-06-12 16:06:20 +01:00
Jacob
97527898da
typo: fix typos in CONTRIBUTING.md and deepspeed.mdx ( #24184 )
...
* typo: fix typos in CONTRIBUTING.md and deepspeed.mdx
* Update CONTRIBUTING.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-06-12 15:43:58 +01:00
Yih-Dar
dadc9fb427
Update GPTNeoXLanguageGenerationTest ( #24193 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 15:37:12 +02:00
Yih-Dar
a9cdb059a8
Fix device issue in OpenLlamaModelTest::test_model_parallelism ( #24195 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 15:21:27 +02:00
Joao Gante
9f81f4f6dd
Generate: force caching on the main model, in assisted generation ( #24177 )
2023-06-12 14:10:49 +01:00
Kihoon Son
535f92aea3
[i18n] Translated "attention.mdx" to Korean ( #23878 )
...
* [i18n] Translated "attention.mdx" to Korean
Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* Update _toctree.yml
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2023-06-12 08:59:18 -04:00
AinL
ba64ec07bb
Change ProgressCallback to use dynamic_ncols=True ( #24101 )
...
* Change ProgressCallback to use dynamic_ncols=True
* style: make style
* Revert "style: make style"
This reverts commit dee484904c.
* run make style only trainer_callback
2023-06-12 08:56:48 -04:00
NielsRogge
93f73a3848
Fix push to hub ( #24187 )
...
Add fix
2023-06-12 08:51:09 -04:00
Yih-Dar
e26c6f03be
Fix Wav2Vec2 CI OOM ( #24190 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 11:39:04 +02:00
Yih-Dar
8f093fb799
Avoid OOM in doctest CI ( #24139 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-10 09:47:38 +02:00
Stas Bekman
0d217f428f
[tests] fix bitsandbytes import issue ( #24151 )
...
fix bitsandbytes import issue
2023-06-09 21:53:11 -07:00
Lysandre Debut
deff5979fe
Tool types ( #24032 )
...
* Tool types
* Tests + fixes
* Isolate types
* Oops
* Review comments + docs
* Tests + docs
* soundfile -> vision
2023-06-09 13:34:07 -04:00
Freddie Vargus
061580c82c
Fix typo in streamers.py ( #24144 )
2023-06-09 17:27:46 +01:00
LiamSwayne
12bb853ccd
[documentation] grammatical fixes in image_classification.mdx ( #24141 )
...
Update image_classification.mdx
2023-06-09 16:59:44 +01:00
Yih-Dar
d0d1632958
Fix Pipeline CI OOM issue ( #24124 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 16:49:02 +02:00
Arthur
a7501f6fc6
[BlenderBotSmall] Update doc example ( #24092 )
...
* small tokenizer uses `__start__` and `__end__`
* fix PR doctest
2023-06-09 16:31:57 +02:00
Arthur
5af3a1aa48
[LlamaTokenizerFast] Update documentation ( #24132 )
...
* Update documentation
* nits
2023-06-09 16:30:20 +02:00
Younes Belkada
62fe753325
[SAM] Fix sam slow test ( #24140 )
...
* fix sam test
* update pipeline typehint
2023-06-09 16:22:09 +02:00
Yih-Dar
847b47c0ee
Fix XGLM OOM on CI ( #24123 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 15:20:59 +02:00
Yih-Dar
b8fe259f16
Fix SAM OOM issue on CI ( #24125 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 15:07:08 +02:00
Yih-Dar
707023d155
Fix TF Rag OOM issue ( #24122 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 15:03:11 +02:00
Sourab Mangrulkar
f2b918356c
fix bugs with trainer ( #24134 )
...
* fix the deepspeed test failures
* apex fix
* FSDP save ckpt fix
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-06-09 17:54:53 +05:30
Joao Gante
be10092e63
Generate: PT's top_p enforces min_tokens_to_keep when it is 1 ( #24111 )
2023-06-09 13:20:05 +01:00
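To make the interaction between `top_p` and `min_tokens_to_keep` concrete, here is a minimal pure-Python sketch of nucleus filtering. It is a simplified variant written for illustration, not the library's implementation; `top_p_filter` and the sample logits are hypothetical. The point it shows is that even a very small `top_p` must leave at least `min_tokens_to_keep` candidates, including when that minimum is 1.

```python
import math


def top_p_filter(logits, top_p, min_tokens_to_keep=1):
    """Mask logits outside the smallest set whose probability mass reaches top_p."""
    # Softmax over the raw logits (shifted by the max for numerical stability).
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Visit tokens from most to least likely, keeping them until the mass is
    # covered, but never stopping before min_tokens_to_keep tokens are kept.
    order = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)
    keep = set()
    mass = 0.0
    for rank, i in enumerate(order):
        if rank >= min_tokens_to_keep and mass >= top_p:
            break
        keep.add(i)
        mass += probs[i]

    return [x if i in keep else -math.inf for i, x in enumerate(logits)]


logits = [2.0, 1.0, 0.5, -1.0]
filtered = top_p_filter(logits, top_p=0.0, min_tokens_to_keep=1)
```

With `top_p=0.0` the most likely token still survives, so sampling always has something to draw from.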
Matt
03585f3734
Correctly build models and import call_context for older TF versions ( #24138 )
2023-06-09 13:11:01 +01:00
Younes Belkada
a6d05d55f6
[bnb] Fix bnb config json serialization ( #24137 )
...
* fix bnb config json serialization
* forward contrib credits from discussions
---------
Co-authored-by: Andrechang <Andrechang@users.noreply.github.com>
2023-06-09 13:41:14 +02:00
Elliott Wang
e2972dffdd
PLAM => PaLM ( #24129 )
2023-06-09 12:32:16 +01:00
Arthur
535542d38d
[Llama] Update tokenization code to ensure parsing of the special tokens [core] ( #24042 )
...
* prevent llama fast from returning token type ids
* remove type hints
* normalised False
2023-06-09 09:36:19 +02:00
Yih-Dar
2e2088f24b
Avoid GPT-2 daily CI job OOM (in TF tests) ( #24106 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-08 18:21:09 +02:00
Serge Panev
9322c24476
Fix typo in Llama docstrings ( #24020 )
...
* Fix typo in Llama docstrings
Signed-off-by: Serge Panev <spanev@nvidia.com>
* Update
Signed-off-by: Serge Panev <spanev@nvidia.com>
* make style
Signed-off-by: Serge Panev <spanev@nvidia.com>
---------
Signed-off-by: Serge Panev <spanev@nvidia.com>
2023-06-08 17:19:07 +01:00
Radamés Ajna
a73883ae9e
add trust_remote_code option to CLI download cmd ( #24097 )
...
* add trust_remote_code option
* require_torch
2023-06-08 11:13:57 -04:00
Younes Belkada
8b169142f8
[GPT2] Add correct keys on _keys_to_ignore_on_load_unexpected on all child classes of GPT2PreTrainedModel ( #24113 )
...
* add correct keys on `_keys_to_ignore_on_load_unexpected`
* oops
2023-06-08 10:21:42 -04:00
Marc Sun
71a114d3e0
fix get_keys_to_not_convert function ( #24095 )
...
* fix get_keys_to_not_convert function
* Fix style
2023-06-08 10:14:27 -04:00
Sylvain Gugger
8c5f306719
Update the pin on Accelerate ( #24110 )
2023-06-08 10:11:01 -04:00
Younes Belkada
2200bf7a45
[Trainer] Correct behavior of _load_best_model for PEFT models ( #24103 )
...
* v1
* some refactor
- add ST format as well
* fix
* add `ADAPTER_WEIGHTS_NAME` & `ADAPTER_SAFE_WEIGHTS_NAME`
2023-06-08 15:38:30 +02:00
Sourab Mangrulkar
0f23605094
reset accelerate env variables after each test ( #24107 )
2023-06-08 09:19:07 -04:00