Commit Graph

15053 Commits

Author SHA1 Message Date
Sebastian
3e142cb0f5
fix overflow when training mDeberta in fp16 (#24116)
* Porting changes from https://github.com/microsoft/DeBERTa/ that hopefully allow for fp16 training of mDeBERTa

* Updates to deberta modeling from microsoft repo

* Performing some cleanup

* Undoing changes that weren't necessary

* Undoing float calls

* Minimally change the p2c block

* Fix error

* Minimally changing the c2p block

* Switch to torch sqrt

* Remove math

* Adding back the `.to()` calls to scale

* Undoing attention_scores change

* Removing commented out code

* Updating modeling_sew_d.py to satisfy utils/check_copies.py

* Missed change

* Further reduce changes needed to get fp16 working

* Reverting changes to modeling_sew_d.py

* Make same change in TF
2023-06-13 15:04:27 +01:00
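A minimal sketch of the fp16-safe scaling pattern the bullets above describe (switching from math.sqrt to torch.sqrt and applying the scale before the matmul); the function and variable names are illustrative, not the actual DeBERTa modeling code:

```python
import torch

def scaled_attention_scores(query_layer, key_layer, scale_factor):
    # Compute the scale as a tensor in the query's dtype (torch.sqrt rather than
    # math.sqrt), and divide the query *before* the matmul so the intermediate
    # product stays inside fp16 range instead of overflowing and being rescaled later.
    scale = torch.sqrt(torch.tensor(query_layer.size(-1), dtype=query_layer.dtype) * scale_factor)
    return torch.bmm(query_layer / scale.to(query_layer.device), key_layer.transpose(-1, -2))
```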
amyeroberts
f91810da88
Safely import pytest in testing_utils.py (#24241) 2023-06-13 14:28:08 +01:00
Nicolas Patry
fdd78d9153
Improving error message when using use_safetensors=True. (#24232) 2023-06-13 15:07:00 +02:00
Yih-Dar
74b846cacf
Update (TF)SamModelIntegrationTest (#24199)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-13 14:28:14 +02:00
yuanwu2017
d7389cd201
fix: TextIteratorStreamer cannot work with pipeline (#23641)
* fix: TextIteratorStreamer cannot work with pipeline

Deepcopying the TextIteratorStreamer object causes an exception.

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Update src/transformers/pipelines/text_generation.py

Got it. I will update the patch.

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/pipelines/text_generation.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update text_generation.py

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-06-13 10:42:41 +01:00
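A hedged usage sketch of the combination this fix enables, a TextIteratorStreamer passed straight to a text-generation pipeline; the checkpoint and prompt below are placeholders:

```python
from threading import Thread
from transformers import AutoTokenizer, TextIteratorStreamer, pipeline

model_id = "gpt2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
generator = pipeline("text-generation", model=model_id, tokenizer=tokenizer)

# Run generation in a background thread and consume text chunks as they arrive.
thread = Thread(
    target=generator,
    args=("Once upon a time",),
    kwargs={"streamer": streamer, "max_new_tokens": 20},
)
thread.start()
for new_text in streamer:
    print(new_text, end="", flush=True)
thread.join()
```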
Sylvain Gugger
70c7994095
Fix README copies 2023-06-12 16:24:27 -04:00
Yih-Dar
41a8fa4e14
Add the number of model test failures to slack CI report (#24207)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 21:27:10 +02:00
Zach Mueller
4da84008dc
Finish dataloader integration (#24201) 2023-06-12 13:26:17 -04:00
Yih-Dar
0675600a60
Update WhisperForAudioClassification doc example (#24188)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 19:10:31 +02:00
fxmarty
e5dd7432e7
Remove unnecessary aten::to overhead in llama (#24203)
* fix dtype init

* fix copies

* fix fixcopies mess

* edit forward as well

* copy
2023-06-12 12:18:04 -04:00
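An illustrative sketch of the kind of change behind "fix dtype init": allocate constants directly in the target dtype and device rather than creating them in default precision and converting with .to() afterwards. The helper below is hypothetical, not the actual modeling_llama.py code:

```python
import torch

def make_causal_mask(seq_len, dtype, device):
    # Allocating in the final dtype/device up front avoids an extra aten::to copy
    # on every forward pass.
    mask = torch.full((seq_len, seq_len), torch.finfo(dtype).min, dtype=dtype, device=device)
    return torch.triu(mask, diagonal=1)
```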
Yih-Dar
4fe9716a79
Skip RWKV test in past CI (#24204)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 18:14:15 +02:00
Ethan
f7d80cb3d2
Fix step bugs in no trainer examples (#24197)
Fix step bugs in no trainer + load checkpoint + grad acc
2023-06-12 11:49:55 -04:00
Marc Sun
08ae37c820
Fix _load_pretrained_model (#24200)
Fix test
2023-06-12 11:31:06 -04:00
Zach Mueller
ebd94b0f6f
🚨🚨🚨 Replace DataLoader logic for Accelerate in Trainer, remove unneeded tests 🚨🚨🚨 (#24028)
* Working integration

* Fix failing test

* Revert label host logic

* Bring it back!
2023-06-12 11:23:37 -04:00
Kihoon Son
dc42a9d76f
🌐 [i18n-KO] Translated tasks_summary.mdx to Korean (#23977)
* 🌐 [i18n-KO] Translated tasks_summary.mdx to Korean

Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>

* Apply suggestions from code review

Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* Update _toctree.yml

* Delete generation_strategies.mdx

* Delete tasks_explained.mdx

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
2023-06-12 11:07:15 -04:00
Joao Gante
60b69f7de2
Generate: detect special architectures when loaded from PEFT (#24198) 2023-06-12 16:06:20 +01:00
Jacob
97527898da
typo: fix typos in CONTRIBUTING.md and deepspeed.mdx (#24184)
* typo: fix typos in CONTRIBUTING.md and deepspeed.mdx

* Update CONTRIBUTING.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-06-12 15:43:58 +01:00
Yih-Dar
dadc9fb427
Update GPTNeoXLanguageGenerationTest (#24193)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 15:37:12 +02:00
Yih-Dar
a9cdb059a8
Fix device issue in OpenLlamaModelTest::test_model_parallelism (#24195)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 15:21:27 +02:00
Joao Gante
9f81f4f6dd
Generate: force caching on the main model, in assisted generation (#24177) 2023-06-12 14:10:49 +01:00
Kihoon Son
535f92aea3
[i18n]Translated "attention.mdx" to korean (#23878)
* [i18n]Translated "attention.mdx" to korean

Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>

* Update _toctree.yml

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
2023-06-12 08:59:18 -04:00
AinL
ba64ec07bb
Change ProgressCallback to use dynamic_ncols=True (#24101)
* Change ProgressCallback to use dynamic_ncols=True

* style: make style

* Revert "style: make style"

This reverts commit dee484904c.

* run make style only trainer_callback
2023-06-12 08:56:48 -04:00
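The change above amounts to creating the trainer's tqdm bars with dynamic_ncols=True so they resize with the terminal. A simplified sketch of that pattern (not the full ProgressCallback, only the relevant call):

```python
from tqdm.auto import tqdm
from transformers import TrainerCallback

class ResizableProgressCallback(TrainerCallback):  # illustrative name
    def on_train_begin(self, args, state, control, **kwargs):
        if state.is_world_process_zero:
            # dynamic_ncols=True lets tqdm adapt the bar width to the terminal.
            self.training_bar = tqdm(total=state.max_steps, dynamic_ncols=True)
```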
NielsRogge
93f73a3848
Fix push to hub (#24187)
Add fix
2023-06-12 08:51:09 -04:00
Yih-Dar
e26c6f03be
Fix Wav2Vec2 CI OOM (#24190)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-12 11:39:04 +02:00
Yih-Dar
8f093fb799
Avoid OOM in doctest CI (#24139)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-10 09:47:38 +02:00
Stas Bekman
0d217f428f
[tests] fix bitsandbytes import issue (#24151)
fix bitsandbytes import issue
2023-06-09 21:53:11 -07:00
Lysandre Debut
deff5979fe
Tool types (#24032)
* Tool types

* Tests + fixes

* Isolate types

* Oops

* Review comments + docs

* Tests + docs

* soundfile -> vision
2023-06-09 13:34:07 -04:00
Freddie Vargus
061580c82c
Fix typo in streamers.py (#24144) 2023-06-09 17:27:46 +01:00
LiamSwayne
12bb853ccd
[documentation] grammatical fixes in image_classification.mdx (#24141)
Update image_classification.mdx
2023-06-09 16:59:44 +01:00
Yih-Dar
d0d1632958
Fix Pipeline CI OOM issue (#24124)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 16:49:02 +02:00
Arthur
a7501f6fc6
[BlenderBotSmall] Update doc example (#24092)
* small tokenizer uses `__start__` and `__end__`

* fix PR doctest
2023-06-09 16:31:57 +02:00
Arthur
5af3a1aa48
[LlamaTokenizerFast] Update documentation (#24132)
* Update documentation

* nits
2023-06-09 16:30:20 +02:00
Younes Belkada
62fe753325
[SAM] Fix sam slow test (#24140)
* fix sam test

* update pipeline typehint
2023-06-09 16:22:09 +02:00
Yih-Dar
847b47c0ee
Fix XGLM OOM on CI (#24123)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 15:20:59 +02:00
Yih-Dar
b8fe259f16
Fix SAM OOM issue on CI (#24125)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 15:07:08 +02:00
Yih-Dar
707023d155
Fix TF Rag OOM issue (#24122)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-09 15:03:11 +02:00
Sourab Mangrulkar
f2b918356c
fix bugs with trainer (#24134)
* fix the deepspeed test failures

* apex fix

* FSDP save ckpt fix

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-06-09 17:54:53 +05:30
Joao Gante
be10092e63
Generate: PT's top_p enforces min_tokens_to_keep when it is 1 (#24111) 2023-06-09 13:20:05 +01:00
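A standalone sketch of the nucleus (top-p) filtering rule the commit title refers to, in which at least min_tokens_to_keep tokens always survive the filter, even when it is 1; this is a simplified illustration, not the library's TopPLogitsWarper:

```python
import torch

def top_p_filter(scores, top_p, min_tokens_to_keep=1):
    # Sort ascending so the highest-probability tokens sit at the end.
    sorted_logits, sorted_indices = torch.sort(scores, descending=False)
    cumulative_probs = sorted_logits.softmax(dim=-1).cumsum(dim=-1)
    # Drop tokens whose cumulative probability stays below 1 - top_p ...
    sorted_indices_to_remove = cumulative_probs <= (1 - top_p)
    # ... but always keep at least min_tokens_to_keep of the top tokens.
    sorted_indices_to_remove[..., -min_tokens_to_keep:] = False
    indices_to_remove = sorted_indices_to_remove.scatter(-1, sorted_indices, sorted_indices_to_remove)
    return scores.masked_fill(indices_to_remove, float("-inf"))
```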
Matt
03585f3734
Correctly build models and import call_context for older TF versions (#24138) 2023-06-09 13:11:01 +01:00
Younes Belkada
a6d05d55f6
[bnb] Fix bnb config json serialization (#24137)
* fix bnb config json serialization

* forward contrib credits from discussions

---------

Co-authored-by: Andrechang <Andrechang@users.noreply.github.com>
2023-06-09 13:41:14 +02:00
Elliott Wang
e2972dffdd
PLAM => PaLM (#24129) 2023-06-09 12:32:16 +01:00
Arthur
535542d38d
[Llama] Update tokenization code to ensure parsing of the special tokens [core] (#24042)
* prevent llama fast from returning token type ids

* remove type hints

* normalised False
2023-06-09 09:36:19 +02:00
Yih-Dar
2e2088f24b
Avoid GPT-2 daily CI job OOM (in TF tests) (#24106)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-06-08 18:21:09 +02:00
Serge Panev
9322c24476
Fix typo in Llama docstrings (#24020)
* Fix typo in Llama docstrings

Signed-off-by: Serge Panev <spanev@nvidia.com>

* Update

Signed-off-by: Serge Panev <spanev@nvidia.com>

* make style

Signed-off-by: Serge Panev <spanev@nvidia.com>

---------

Signed-off-by: Serge Panev <spanev@nvidia.com>
2023-06-08 17:19:07 +01:00
Radamés Ajna
a73883ae9e
add trust_remote_code option to CLI download cmd (#24097)
* add trust_remote_code option

* require_torch
2023-06-08 11:13:57 -04:00
Younes Belkada
8b169142f8
[GPT2] Add correct keys on _keys_to_ignore_on_load_unexpected on all child classes of GPT2PreTrainedModel (#24113)
* add correct keys on `_keys_to_ignore_on_load_unexpected`

* oops
2023-06-08 10:21:42 -04:00
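An illustrative sketch of the mechanism this commit touches: child classes of GPT2PreTrainedModel list checkpoint keys that may safely be ignored when they show up unexpectedly at load time. The subclass and regexes below are examples, not the exact additions from the PR:

```python
from transformers import GPT2PreTrainedModel

class MyGPT2Head(GPT2PreTrainedModel):  # hypothetical subclass
    # Buffers such as the causal-attention bias are recomputed at init, so matching
    # checkpoint entries can be ignored without warning the user.
    _keys_to_ignore_on_load_unexpected = [r"h\.\d+\.attn\.bias", r"h\.\d+\.attn\.masked_bias"]
```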
Marc Sun
71a114d3e0
fix get_keys_to_not_convert function (#24095)
* fix get_keys_to_not_convert funct

* Fix style
2023-06-08 10:14:27 -04:00
Sylvain Gugger
8c5f306719
Update the pin on Accelerate (#24110) 2023-06-08 10:11:01 -04:00
Younes Belkada
2200bf7a45
[Trainer] Correct behavior of _load_best_model for PEFT models (#24103)
* v1

* some refactor

- add ST format as well

* fix

* add `ADAPTER_WEIGHTS_NAME` & `ADAPTER_SAFE_WEIGHTS_NAME`
2023-06-08 15:38:30 +02:00
Sourab Mangrulkar
0f23605094
reset accelerate env variables after each test (#24107) 2023-06-08 09:19:07 -04:00