transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Matt	866df66fe4	Overhaul Conversation class and prompt templating (#25323 ) * First commit while I figure this out * make fixup * Remove unused method * Store prompt attrib * Fix prompt argument for tests * Make same changes in fast tokenizer * Remove global prompts from fast tokenizer too * stash commit * stash commit * Migrate PromptConfig to its True Final Location * Replace Conversation entirely with the new class * Import/dependency fixes * Import/dependency fixes * Change format for lots of default prompts * More default prompt fixups * Revert llama old methods so we can compare * Fix some default configs * Fix some default configs * Fix misspelled kwarg * Fixes for Blenderbot * make fixup * little rebase cleanup * Add basic documentation * Quick doc fix * Truncate docstring for now * Add handling for the case when messages is a single string * Quick llama merges * Update conversational pipeline and tests * Add a couple of legacy properties for backward compatibility * More legacy handling * Add docstring for build_conversation_input_ids * Restructure PromptConfig * Let's start T E M P L A T I N G * Refactor all default configs to use templates instead * Revert changes to the special token properties since we don't need them anymore * More class templates * Make the sandbox even sandier * Everything replaced with pure templating * Remove docs for PromptConfig * Add testing and optional requirement boilerplate * Fix imports and make fixup * Fix LLaMA tests and add Conversation docstring * Finally get LLaMA working with the template system * Finally get LLaMA working with the template system * make fixup * make fixup * fmt-off for the long lists of test tokens * Rename method to apply_chat_template for now * Start on documentation * Make chat_template a property that reads through to the default if it's not set * Expand docs * Expand chat templating doc some more * trim/lstrip blocks by default and update doc * Few doc tweaks * rebase cleanup * Clarify 
docstring * rebase cleanup * rebase cleanup * make fixup * Quick doc edit * Reformat the standard template to match ChatML * Re-add PEFT check * Update docs/source/en/chat_templating.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Add apply_chat_template to the tokenizer doc * make fixup * Add doc links * Fix chat links * Fix chat links * Explain system messages in the doc * Add chat template test * Proper save-loading for chat template attribute * Add test skips for layout models * Remove _build_conversation_input_ids, add default_chat_template to code_llama * Make sure all LLaMA models are using the latest template * Remove default_system_prompt block in code_llama because it has no default prompt * Update ConversationPipeline preprocess * Add correct #Copied from links to the default_chat_templates * Remove unneeded type checking line * Add a dummy mark_processsed method * Reorganize Conversation to have *deprecated_kwargs Update chat_templating.md * Quick fix to LLAMA tests * Small doc tweaks * Add proper docstrings and "copied from" statements to all default chat templates * Merge use_default_system_prompt support for code_llama too * Improve clarity around self.chat_template * Docstring fix * Fix blenderbot default template * More doctest fix * Break out some tokenizer kwargs * Update doc to explain default templates * Quick tweaks to tokenizer args * Cleanups for tokenizer args * Add note about cacheing * Quick tweak to the chat-templating doc * Update the LLaMA template with error checking and correct system message embedding * make fixup * make fixup * add requires_jinja * Cleanup to expected output formatting * Add cacheing * Fix typo in llama default template * Update LLaMA tests * Update documentation * Improved legacy handling in the Conversation class * Update Jinja template with proper error handling * Quick bugfix * Proper exception raising * Change cacheing behaviour so it doesn't try to pickle an entire Jinja env * make fixup 
* rebase cleanup --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2023-09-14 15:10:34 +01:00
Younes Belkada	7c63e6fc8c	[`PEFT`] Fix PEFT + gradient checkpointing (#25846 ) * fix PEFT + gradient checkpointing * add disable RG * polish tests * fix comment * Revert "fix comment" This reverts commit `b85386f50d`. * final explanations and tests	2023-09-14 13:01:58 +02:00
Sanchit Gandhi	ac957f69cc	[Whisper Tokenizer] Encode timestamps (#26054 ) * [Whisper Tokenizer] Fix tests after adding timestamps * fix s2t tokenizer tests * fix vocab test * backwards comp * fix tests * comment * style * fix last test * fix fast * make faster * move logic to decode * remove skip test * fix decode with offsets * fix special tokens * empty commit to re-trigger ci * use lru cache	2023-09-14 12:00:43 +01:00
Sam Denton	6d49b9dcbf	Fix eval accumulation when `accelerate` > 0.20.3 (#26060 ) As mentioned in: https://github.com/huggingface/transformers/issues/25641 Eval accumulation will never happen with `accelerate > 0.20.3`, so this change ensures that `sync_gradients` is ignored if accelerate is > 0.20.3	2023-09-14 10:57:47 +01:00
Craig Chan	d7bd325b5a	Add missing Maskformer dataclass decorator, add dataclass check in ModelOutput for subclasses (#25638 ) * Add @dataclass to MaskFormerPixelDecoderOutput * Add dataclass check if subclass of ModelOutput * Use unittest assertRaises rather than pytest per contribution doc * Update src/transformers/utils/generic.py per suggested change Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-14 10:30:49 +01:00
Abhilash Majumder	05de038f3d	Flex xpu bug fix (#26135 ) flex gpu bug fix	2023-09-13 21:03:52 +01:00
Maria Khalusova	9709ab116c	[docs] last hidden state vs hidden_states[-1] (#26142 ) * last hidden state clarification * feedback addressed	2023-09-13 14:35:42 -04:00
Serizao	e52f1cb669	Update training_args.py - addition of self.distributed_state when using XPU (#25999 ) * Update training_args.py Missing distributed state, so lines 1813-1814 failed because the value is undefined * Update training_args.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>	2023-09-13 19:21:46 +01:00
BakerBunker	0fced06788	Fix `beam_scores` shape when token scores shape changes after `logits_processor` (#25980 )	2023-09-13 19:12:47 +01:00
Joao Gante	a796f7eea6	Falcon: batched generation (#26137 )	2023-09-13 17:00:52 +01:00
Yih-Dar	95a904104e	Fix `test_finetune_bert2bert` (#25984 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-09-13 16:53:43 +01:00
Joao Gante	86ffef87b6	Generate: ignore warning when `generation_config.max_length` is set to `None` (#26147 )	2023-09-13 16:50:58 +01:00
김준재_T3056	a6ae2bd059	docs: feat: add llama2 notebook resources from OSSCA community (#26076 )	2023-09-13 08:27:41 -07:00
Younes Belkada	7ccac73f74	[`RWKV`] Final fix RWKV 4bit (#26134 ) * Final fix RWKV 4bit * fixup * add a test * add more clarifications	2023-09-13 16:30:20 +02:00
Vaibhav Srivastav	32ec7345f2	Update spectrogram and waveform model mapping for TTS/A pipeline (#26114 ) update names mapping for spectrogram and waveform models	2023-09-13 09:05:11 -04:00
Juarez Bochi	a9b63ca989	Add missing space in generation/utils.py (#26121 ) Add missing space in utils.py Warning now reads as "... to control thegeneration length. We ..."	2023-09-13 13:45:55 +01:00
Younes Belkada	c8b26096d4	[`core`] fix 4bit `num_parameters` (#26132 ) * fix 4bit `num_parameters` * stronger check	2023-09-13 14:12:35 +02:00
amyeroberts	7db1ad63d9	Fix AutoTokenizer docstring typo (#26117 ) Fix docstring typo	2023-09-13 11:12:27 +01:00
Sourab Mangrulkar	b477327394	fix the deepspeed tests (#26021 ) * fix the deepspeed tests * resolve comment	2023-09-13 10:26:53 +05:30
Sourab Mangrulkar	73b13ac099	safeguard torch distributed check (#26056 )	2023-09-13 10:26:37 +05:30
Tanay Mehta	12f043eaea	Fix `MarianTokenizer` to remove metaspace character in `decode` (#26091 ) * add: check to remove metaspace from marian tokenizer * fix: metaspace character being removed from everywhere * fix: remove redundant check at top * add: test for marian tokenizer decode fix * fix: simplified the test	2023-09-12 21:53:31 +02:00
Joao Gante	03e309d58e	Text2text pipeline: don't parameterize from the config (#26118 )	2023-09-12 18:40:45 +01:00
Phuc Van Phan	4fb64e285a	chore: correct update_step and correct gradient_accumulation_steps (#26068 )	2023-09-12 18:31:23 +01:00
Wang, Yi	8f609ab9e0	enable optuna multi-objectives feature (#25969 ) * enable optuna multi-objectives feature Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update hpo doc * update docstring Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * extend direction to List[str] type Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> * Update src/transformers/integrations/integration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-12 18:01:22 +01:00
MinJae Kang	92f2fbad50	🌐 [i18n-KO] Translated `contributing.md` to Korean (#25877 ) * docs: ko-contributing.md * feat: chatGPT draft * feat: manual edits * feat: change linked document * fix: resolve suggestion Co-authored-by: Haewon Kim <ehdvkf02@naver.com> * fix: resolve suggestion Co-authored-by: Haewon Kim <ehdvkf02@naver.com> * fix: resolve suggestion Co-authored-by: Haewon Kim <ehdvkf02@naver.com> * fix: resolve suggestion Co-authored-by: Haewon Kim <ehdvkf02@naver.com> * fix: resolve suggestion Co-authored-by: Haewon Kim <ehdvkf02@naver.com> * fix: resolve suggestion Co-authored-by: Haewon Kim <ehdvkf02@naver.com> * fix: resolve suggestion Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> * fix: resolve suggestion Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> * fix: resolve suggestion Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com> * fix: resolve suggestion * fix: resolve suggestion * feat: delete file to resolve error --------- Co-authored-by: Haewon Kim <ehdvkf02@naver.com> Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>	2023-09-12 08:35:29 -07:00
Maria Khalusova	1fe7ce48f1	[docs] Updates to TTS task guide with regards to the new TTS pipeline (#26095 ) * tts guide updates with a pipeline * Apply suggestions from code review Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * Update docs/source/en/tasks/text-to-speech.md Co-authored-by: Vaibhav Srivastav <vaibhavs10@gmail.com> --------- Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> Co-authored-by: Vaibhav Srivastav <vaibhavs10@gmail.com>	2023-09-12 11:29:06 -04:00
MinJae Kang	be9438ed43	🌐 [i18n-KO] Translated `llama2.md` to Korean (#26047 ) * docs: ko-llama2.md * feat: chatGPT draft and manual edits * feat: added inline TOC * fix: inline TOC * fix: resolve suggestions Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> * fix: resolve suggestion Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> * fix: resolve suggestion Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com> --------- Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-09-12 08:04:26 -07:00
pokjay	6acc27eea8	Fix ExponentialDecayLengthPenalty negative logits issue (#25594 ) * Fix issues in test_exponential_decay_length_penalty Fix tests which were broken and add validation of negative scores. Current test didn't take into account that ExponentialDecayLengthPenalty updates the score inplace, resulting in updates to base tested Tensor. In addition, the gt assert had empty Tensors due to indexing along the batch dimension. Test is currently expected to fail to show ExponentialDecayLengthPenalty issues with negative scores * Fix ExponentialDecayLengthPenalty negative logits issue In cases where the scores are negative, ExponentialDecayLengthPenalty decreases the score of eos_token_id instead of increasing it. To fix this issue we compute the penalty of the absolute value and add it to the original score. * Add examples for ExponentialDecayLengthPenalty * Fix styling issue in ExponentialDecayLengthPenalty doc * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Style and quality fix * Fix example outputs --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-09-12 12:50:41 +01:00
larekrow	d65c4a4fed	Update logits_process.py docstrings (#25971 )	2023-09-12 12:36:31 +01:00
Joao Gante	3319eb5490	Generate: legacy mode is only triggered when `generation_config` is untouched (#25962 )	2023-09-12 12:08:17 +01:00
Younes Belkada	18abc756c5	[`core`] Import tensorflow inside relevant methods in `trainer_utils` (#26106 ) import tensorflow inside relevant methods in trainer_utils	2023-09-12 11:49:06 +02:00
Arthur	9cccb3a838	[`Persimmon`] Add support for persimmon (#26042 ) * intiial commit * updates * nits * update conversion script * update conversion script * use path to load * add tips etc * some modeling logic * modeling update * more nits * nits * normal layer norm * update config and doc * nits * update doc remove unused * update * fix inits and stuff * fixup * revert wrong changes * updates * more nits * add default config values to the configuration file * fixup happy * update * 2 tests left * update readmes * more nits * slow test and more documentation * update readme * fix licences * styling * use fast if possible when saving tokenizer * remove todo * remove tokenization tests * small last nits * Apply suggestions from code review Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * nits to skip the timout doctest * fix integration test * fix test * update eos token * update to allow fast tokenization * styling * fix codeLlama as well for the update post processor * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more copied from statements * update * doc passes doctest * remove `# final layer norm?` * change docstring prompot * update * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * don't doctest the conversion script as it requires more packages * don't init a model in the config * oups * fix doctest --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-09-12 11:33:27 +02:00
Phuc Van Phan	5af2c62696	docs: add space to docs (#26067 ) * docs: add space to docs * docs: remove redundant space	2023-09-11 22:03:26 +01:00
Patrick von Platen	ce2e7ef3d9	[Core] Add lazy import structure to imports (#26090 ) * improve import time * Update src/transformers/integrations/__init__.py * sort import	2023-09-11 17:20:29 +02:00
Phuc Van Phan	9cebae64ad	docs: update link huggingface map (#26077 )	2023-09-11 12:57:04 +01:00
Hang	7fd2d68613	only main process should call _save on deepspeed zero3 (#25959 ) only main process should call _save when deepspeed zero3	2023-09-11 12:56:36 +01:00
Arthur	95b374952d	[`CITests`] skip failing tests until #26054 is merged (#26063 ) * skip failing tests until #26054 is merged * fixup	2023-09-09 05:43:26 +02:00
Arthur	09b2de6eb7	[`CodeLlamaTokenizerFast`] Fix `set_infilling_processor` to properly reset (#26041 ) * fix `set_infilling_processor` to properly reset * Add docstring! * fixups * more details in the documentation about the tokenization * style	2023-09-08 22:03:09 +02:00
Harheem Kim	d53606031f	🌐 [i18n-KO] Translated `llama.md` to Korean (#26044 ) * docs: ko-llama.md * fix: chatgpt draft * feat: manual edits * fix: resolve suggestions	2023-09-08 12:38:41 -07:00
Angela Yi	6c26faa159	Skip warning if tracing with dynamo (#25581 ) * Ignore warning if tracing with dynamo * fix import error * separate to function * add test	2023-09-08 21:13:33 +02:00
Thien Tran	18ee1fe762	Update missing docs on `activation_dropout` and fix DropOut docs for SEW-D (#26031 ) * add missing doc for activation dropout * fix doc for SEW-D dropout * deprecate hidden_dropout for SEW-D	2023-09-08 14:51:54 +01:00
Alexander Krauck	0c67a72c9a	Fix Dropout Implementation in Graphormer (#24817 ) This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation and improving performance. Specifically: 1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used. This has been corrected to use `attention_dropout` instead of the regular `dropout`. 2. The `activation_dropout` for the activations in the feed-forward layers was missing. Instead, the regular `dropout` was used. This commit adds `activation_dropout` to the feed-forward layers. These changes ensure the dropout implementation matches the original Graphormer and delivers empirically better performance.	2023-09-08 12:49:39 +01:00
dumpmemory	fb7d246951	Try to fix training Loss inconsistent after resume from old checkpoint (#25872 ) * fix loss inconsistent after resume #25340 * fix typo * clean code * reformatted code * adjust code according to comments * adjust check_dataloader_randomsampler location * return sampler only * handle sampler is None * Update src/transformers/trainer_pt_utils.py thanks @amyeroberts Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-07 20:00:22 +01:00
MyungHa Kwon	c5e66a40a4	Punctuation fix (#26025 ) fix typo	2023-09-07 19:54:52 +01:00
raghavanone	00efd64e51	Fix vilt config docstring parameter to match value in init (#26017 ) * Fix vilt config init parameter to match the ones in documentation * Fix the documentation	2023-09-07 19:53:43 +01:00
Muskan Kumar	02c4a77f57	Added HerBERT to README.md (#26020 ) * Added HerBERT to README.md * Update README.md to contain HerBERT (#26016) * Resolved #26016: Updated READMEs and index.md to contain Herbert Updated READMEs and ran make fix-copies	2023-09-07 19:51:45 +01:00
Sanchit Gandhi	2af87d018e	[VITS] Fix nightly tests (#25986 ) * fix tokenizer * make bs even * fix multi gpu test * style * model forward * fix torch import * revert tok pin	2023-09-07 17:49:14 +01:00
CokeDong	3744126c87	Add `tgs` speed metrics (#25858 ) * Add tgs metrics * bugfix and black formatting * workaround for tokens counting * formating and bugfix * Fix * Add opt-in for tgs metrics * make style and fix error * Fix doc * fix docbuild * hf-doc-build * fix * test * Update src/transformers/training_args.py renaming Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Update src/transformers/training_args.py renaming Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Fix some symbol * test * Update src/transformers/trainer_utils.py match nameing patterns Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/trainer.py nice Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix reviews * Fix * Fix black --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-07 17:17:30 +01:00
Yih-Dar	0188739a74	Fix CircleCI config (#26023 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-09-07 14:51:35 +02:00
Kai	df04959e55	fix _resize_token_embeddings will set lm head size to 0 when enabled deepspeed zero3 (#26024 )	2023-09-07 10:10:40 +01:00

13975 Commits
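
Note on commit 6acc27eea8 (#25594): the message says that for negative scores the old ExponentialDecayLengthPenalty lowered the eos_token_id score instead of raising it, and that the fix computes the penalty from the absolute value and adds it to the original score. The arithmetic can be sketched in plain Python as below. This is only an illustration of that description, not the library's code: the function names `old_eos_score` and `fixed_eos_score`, the scalar inputs, and the exact form of the old multiplicative update are assumptions made for this example.

```python
def old_eos_score(score: float, factor: float, steps: int) -> float:
    """Assumed pre-fix behaviour: scale the EOS score by factor ** steps."""
    return score * factor ** steps


def fixed_eos_score(score: float, factor: float, steps: int) -> float:
    """Post-fix behaviour as described in the commit message:
    compute the penalty on the absolute value, then add it to the score."""
    penalty = abs(score) * (factor ** steps - 1)
    return score + penalty


if __name__ == "__main__":
    factor, steps = 1.5, 2  # hypothetical regulation factor and steps past the start index
    for score in (2.0, -2.0):
        print(score, old_eos_score(score, factor, steps), fixed_eos_score(score, factor, steps))
```

For a positive score the two versions agree. For a negative score the multiplicative version pushes the value further below zero, while the additive version raises it, which is the direction the commit message says is intended.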