transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-08-03 03:31:05 +06:00

Author	SHA1	Message	Date
Arthur	9cccb3a838	[`Persimmon`] Add support for persimmon (#26042 ) * intiial commit * updates * nits * update conversion script * update conversion script * use path to load * add tips etc * some modeling logic * modeling update * more nits * nits * normal layer norm * update config and doc * nits * update doc remove unused * update * fix inits and stuff * fixup * revert wrong changes * updates * more nits * add default config values to the configuration file * fixup happy * update * 2 tests left * update readmes * more nits * slow test and more documentation * update readme * fix licences * styling * use fast if possible when saving tokenizer * remove todo * remove tokenization tests * small last nits * Apply suggestions from code review Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * nits to skip the timout doctest * fix integration test * fix test * update eos token * update to allow fast tokenization * styling * fix codeLlama as well for the update post processor * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more copied from statements * update * doc passes doctest * remove `# final layer norm?` * change docstring prompot * update * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * don't doctest the conversion script as it requires more packages * don't init a model in the config * oups * fix doctest --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-09-12 11:33:27 +02:00
Phuc Van Phan	5af2c62696	docs: add space to docs (#26067 ) * docs: add space to docs * docs: remove reduntant space	2023-09-11 22:03:26 +01:00
Patrick von Platen	ce2e7ef3d9	[Core] Add lazy import structure to imports (#26090 ) * improve import time * Update src/transformers/integrations/__init__.py * sort import	2023-09-11 17:20:29 +02:00
Phuc Van Phan	9cebae64ad	docs: update link huggingface map (#26077 )	2023-09-11 12:57:04 +01:00
Hang	7fd2d68613	only main process should call _save on deepspeed zero3 (#25959 ) only main process should call _save when deepspeed zero3	2023-09-11 12:56:36 +01:00
Arthur	95b374952d	[`CITests`] skip failing tests until #26054 is merged (#26063 ) * skip failing tests until #26054 is merged * fixup	2023-09-09 05:43:26 +02:00
Arthur	09b2de6eb7	[`CodeLlamaTokenizerFast`] Fix fix `set_infilling_processor` to properly reset (#26041 ) * fix `set_infilling_processor` to properly reset * Add docstring! * fixups * more details in the docuemtation about the tokenization * styl;e	2023-09-08 22:03:09 +02:00
Harheem Kim	d53606031f	🌐 [i18n-KO] Translated `llama.md` to Korean (#26044 ) * docs: ko-llama.md * fix: chatgpt draft * feat: manual edits * fix: resolve suggestions	2023-09-08 12:38:41 -07:00
Angela Yi	6c26faa159	Skip warning if tracing with dynamo (#25581 ) * Ignore warning if tracing with dynamo * fix import error * separate to function * add test	2023-09-08 21:13:33 +02:00
Thien Tran	18ee1fe762	Update missing docs on `activation_dropout` and fix DropOut docs for SEW-D (#26031 ) * add missing doc for activation dropout * fix doc for SEW-D dropout * deprecate hidden_dropout for SEW-D	2023-09-08 14:51:54 +01:00
Alexander Krauck	0c67a72c9a	Fix Dropout Implementation in Graphormer (#24817 ) This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation and improving performance. Specifically: 1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used. This has been corrected to use `attention_dropout` instead of the regular `dropout`. 2. The `activation_dropout` for the activations in the feed-forward layers was missing. Instead, the regular `dropout` was used. This commit adds `activation_dropout` to the feed-forward layers. These changes ensure the dropout implementation matches the original Graphormer and delivers empirically better performance.	2023-09-08 12:49:39 +01:00
dumpmemory	fb7d246951	Try to fix training Loss inconsistent after resume from old checkpoint (#25872 ) * fix loss inconsistent after resume #25340 * fix typo * clean code * reformatted code * adjust code according to comments * adjust check_dataloader_randomsampler location * return sampler only * handle sampler is None * Update src/transformers/trainer_pt_utils.py thanks @amyeroberts Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-07 20:00:22 +01:00
MyungHa Kwon	c5e66a40a4	Punctuation fix (#26025 ) fix typo	2023-09-07 19:54:52 +01:00
raghavanone	00efd64e51	Fix vilt config docstring parameter to match value in init (#26017 ) * Fix vilt config init parameter to match the ones in documentation * Fix the documentation	2023-09-07 19:53:43 +01:00
Muskan Kumar	02c4a77f57	Added HerBERT to README.md (#26020 ) * Added HerBERT to README.md * Update README.md to contain HerBERT (#26016) * Resolved #26016: Updated READMEs and index.md to contain Herbert Updated READMEs and ran make fix-copies	2023-09-07 19:51:45 +01:00
Sanchit Gandhi	2af87d018e	[VITS] Fix nightly tests (#25986 ) * fix tokenizer * make bs even * fix multi gpu test * style * model forward * fix torch import * revert tok pin	2023-09-07 17:49:14 +01:00
CokeDong	3744126c87	Add `tgs` speed metrics (#25858 ) * Add tgs metrics * bugfix and black formatting * workaround for tokens counting * formating and bugfix * Fix * Add opt-in for tgs metrics * make style and fix error * Fix doc * fix docbuild * hf-doc-build * fix * test * Update src/transformers/training_args.py renaming Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Update src/transformers/training_args.py renaming Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Fix some symbol * test * Update src/transformers/trainer_utils.py match nameing patterns Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/trainer.py nice Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix reviews * Fix * Fix black --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-07 17:17:30 +01:00
Yih-Dar	0188739a74	Fix CircleCI config (#26023 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-09-07 14:51:35 +02:00
Kai	df04959e55	fix _resize_token_embeddings will set lm head size to 0 when enabled deepspeed zero3 (#26024 )	2023-09-07 10:10:40 +01:00
Zach Mueller	e3a9716384	Fix err with FSDP (#25991 ) * Fix err * Use version check	2023-09-07 09:52:53 +05:30
Marc Sun	fa6107c97e	modify context length for GPTQ + version bump (#25899 ) * add new arg for gptq * add tests * add min version autogptq * fix order * skip test * fix * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style * change model path --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-09-06 11:45:47 -04:00
Matt	300d6a4a62	Remove Falcon from undocumented list (#26008 ) Remove falcon from undocumented list	2023-09-06 15:49:04 +01:00
Harheem Kim	fa522d8d7b	🌐[i18n-KO] Translated `llm_tutorial.md` to Korean (#25791 ) * docs: ko: llm_tutoroal.md * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions * fix: resolve suggestions	2023-09-06 07:40:03 -07:00
zspo	3e203f92be	Fix small typo README.md (#25934 ) * fix some samll bugs in readme * Update docs/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-06 14:07:29 +01:00
Matt	842e99f1b9	TF-OPT attention mask fixes (#25238 ) * stash commit * More OPT updates * Update src/transformers/models/opt/modeling_tf_opt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-06 13:37:27 +01:00
Lysandre Debut	f6301b9a13	Falcon: fix revision propagation (#26006 ) * Fix revision propagation * Cleaner	2023-09-06 07:21:00 -04:00
Nino Risteski	f6295c6c53	Update README.md (#26003 ) fixed a typo	2023-09-06 10:55:11 +01:00
tju_skywalker	172f42c512	save space when converting hf model to megatron model. (#25950 ) * fix convert megatron model too large * fix convert megatron model too large	2023-09-05 16:47:48 -04:00
Tanay Mehta	b8def68934	Fix Mega chunking error when using decoder-only model (#25765 ) * add: potential fix to mega chunking in decoder only model bug * add: decoder with chunking test * add: input_mask passed with input_ids	2023-09-05 21:50:14 +02:00
Arthur	4fa0aff21e	[`VITS`] tokenizer integration test: fix revision did not exist (#25996 ) * revision did not exist * correct revision	2023-09-05 21:21:33 +02:00
Arthur	d0354e5e86	[`CI`] Fix red CI and ERROR failed should show (#25995 ) * start with error too * fix ? * start with nit * one more path * use `job_name` * mark pipeline test as slow	2023-09-05 20:16:00 +02:00
Injin Paek	6206f599e1	Add LLaMA resources (#25859 ) * docs: feat: model resources for llama * fix: resolve suggestion Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>	2023-09-05 10:50:08 -07:00
Sanchit Gandhi	8d518013ef	[Wav2Vec2 Conformer] Fix inference float16 (#25985 ) * [Wav2Vec2 Conformer] Fix inference float16 * fix test * fix test more * clean pipe test	2023-09-05 18:26:06 +01:00
Sourab Mangrulkar	6bc517ccd4	deepspeed resume from ckpt fixes and adding support for deepspeed optimizer and HF scheduler (#25863 ) * Add support for deepspeed optimizer and HF scheduler * fix bug * fix the import * fix issue with deepspeed scheduler saving for hf optim + hf scheduler scenario * fix loading of hf scheduler when loading deepspeed checkpoint * fix import of `DeepSpeedSchedulerWrapper` * add tests * add the comment and skip the failing tests * address comment	2023-09-05 22:31:20 +05:30
raghavanone	1110b565d6	Add TFDebertaV2ForMultipleChoice (#25932 ) * Add TFDebertaV2ForMultipleChoice * Import newer model in main init * Fix import issues * Fix copies * Add doc * Fix tests * Fix copies * Fix docstring	2023-09-05 17:13:06 +01:00
andreeahedes	da1af21dbb	PegasusX add _no_split_modules (#25933 ) * no_split_modules * no_split_modules * inputs_embeds+pos same device * update _no_split_modules * update _no_split_modules	2023-09-05 16:34:34 +01:00
Abhilash Majumder	70a98024b1	Patch with accelerate xpu (#25714 ) * patch with accelerate xpu * patch with accelerate xpu * formatting * fix tests * revert ruff unrelated fixes * revert ruff unrelated fixes * revert ruff unrelated fixes * fix test * review fixes * review fixes * black fixed * review commits * review commits * style fix * use pytorch_utils * revert markuplm test	2023-09-05 15:41:42 +01:00
Yih-Dar	aa5c94d38d	Show failed tests on CircleCI layout in a better way (#25895 ) * update * update * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-09-05 15:49:33 +02:00
Joao Gante	9a70d6e56f	Trainer: delegate default generation values to `generation_config` (#25987 )	2023-09-05 14:47:00 +01:00
Sahel Sharify	aea761499f	Update training_args.py to remove the runtime error (#25920 ) This cl iterates through a list of keys rather than dict items while updating the dict elements. Fixes the following error: File "..../transformers/training_args.py", line 1544, in post_init for k, v in self.fsdp_config.items(): RuntimeError: dictionary keys changed during iteration	2023-09-05 12:43:51 +01:00
Traun Leyden	7011cd8667	Update RAG README.md with correct path to examples/seq2seq (#25953 ) Update README.md with correct path to examples/seq2seq	2023-09-05 12:31:59 +01:00
Julien Chaumond	6316ce8d27	[doc] Always call it Agents for consistency (#25958 )	2023-09-05 12:27:20 +01:00
Yih-Dar	391f26459a	Use main in conversion script (#25973 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-09-05 13:04:49 +02:00
Kai	6f125aaa48	fix typo (#25981 ) rename doanloading to downloading	2023-09-05 11:13:06 +01:00
Susnato Dhar	52a46dc57b	Add `Pop2Piano` space demo. (#25975 ) Update pop2piano.md	2023-09-05 11:07:02 +01:00
Huazhong Ji	1cc3bc22fe	nn.Identity is not required to be compatible with PyTorch < 1.1.0 as the minimum PyTorch version we currently support is 1.10.0 (#25974 ) nn.Identity is not required to be compatible with PyTorch < 1.1.0 as the minimum PyTorch version we currently support is 1.10.0	2023-09-05 11:37:54 +02:00
Yih-Dar	fbbe1b8a40	Fix `test_load_img_url_timeout` (#25976 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-09-05 11:34:28 +02:00
Yih-Dar	feec56959a	Fix Detr CI (#25972 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-09-05 11:19:56 +02:00
Susnato Dhar	404ff8fc17	Fix typo (#25966 ) * Update feature_extraction_clap.py * changed all lenght to length	2023-09-05 10:12:25 +02:00
Lysandre	d8e13b3e04	v4.34.dev.0	2023-09-04 15:12:11 -04:00
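
Note on aea761499f (#25920): the message describes iterating over a list of keys instead of the dict's items while the dict is being updated. A minimal sketch of that pattern follows. It is not the code from `training_args.py`; the helper name `strip_prefix`, the `fsdp_` prefix handling, and the sample keys are assumptions chosen for illustration.

```python
def strip_prefix(config: dict, prefix: str = "fsdp_") -> dict:
    """Rename keys that start with `prefix`, mutating `config` in place.

    Hypothetical helper for illustration only.
    """
    # list(...) takes a snapshot of the keys, so adding and removing
    # entries inside the loop does not disturb the iteration.
    # Looping over config.items() directly while renaming keys can raise
    # "RuntimeError: dictionary keys changed during iteration".
    for key in list(config.keys()):
        if key.startswith(prefix):
            config[key[len(prefix):]] = config.pop(key)
    return config


# Sample keys are made up for the example.
cfg = strip_prefix({"fsdp_min_num_params": 0, "xla": False})
```

Snapshotting the keys costs one extra list the size of the dict, which is negligible for a small configuration mapping.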

14144 Commits