transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Yih-Dar	77db28dc52	Update some torchscript tests after #24505 (#24566 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-29 16:05:24 +02:00
Sanchit Gandhi	1c1c90756d	Add Musicgen (#24109 ) * Add Audiocraft * add cross attention * style * add for lm * convert and verify * introduce t5 * split configs * load t5 + lm * clean conversion * copy from t5 * style * start pattern provider * make generation work * style * fix pos embs * propagate shape changes * propagate shape changes * style * delay pattern: pad tokens at end * audiocraft -> musicgen * fix inits * add mdx * style * fix pad token in processor * override generate and add todos * add init to test * undo pattern delay mask after gen * remove cfg logits processor * remove cfg logits processor * remove logits processor in favour of mask * clean pos embs * make fix copies * update readmes * clean pos emb * refactor encoder/decoder * make fix copies * update conversion * fix config imports * update config docs * make style * send pattern mask to device * pattern mask with delay * recover prompted audio tokens * fix docstrings * laydown test file * pattern edge case * remove t5 ref * add processing class * config refactor * better pattern comment * check if mask is not present * check if mask is not present * refactor to auto class * remove encoder configs * fix processor * processor import * start updating conversion * start updating tests * make style * convert t5, encodec, lm * convert as composite * also convert processor * run generate * classifier free gen * comments and clean up * make style * docs for logit proc * docstring for uncond gen * start lm tests * work tests * let the lm generate * refactor: reshape inside forward * undo greedy loop changes * from_enc_dec -> from_sub_model * fix input id shapes in docstrings * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * undo generate changes * from sub model config * Update src/transformers/models/musicgen/modeling_musicgen.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * make generate work again * generate uncond -> get uncond inputs * remove prefix allowed tokens fn * better error message * logit proc checks * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * make decoder only tests work * composite fast tests * make style * uncond generation * feat extr padding * make audio prompt work * fix inputs docstrings * unconditional inputs: dict -> model output * clean up tests * more clean up tests * make style * t5 encoder -> auto text encoder * remove comments * deal with frames * fix auto text * slow tests * nice mdx * remove can generate * todo - hub id * convert m/l * make fix copies * only import generation with torch * ignore decoder from tests * don't wrap uncond inputs * make style * cleaner uncond inputs * add example to musicgen forward * fix docs * ignore MusicGen Model/ForConditionalGeneration in auto mapping * add doc section to toctree * add to doc tests * add processor tests * fix push to hub in conversion * tips for decoder only loading * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix conversion for s / m / l checkpoints * import stopping criteria from module * remove from pipeline tests * fix uncond docstring * decode audio method * fix docs * org: sanchit-gandhi -> facebook * fix max pos embeddings * remove auto doc (not compatible with shapes) * bump max pos emb * make style * fix doc * fix config doc * fix config doc * ignore musicgen config from docstring * make style * fix config * fix config for doctest * consistent from_sub_models * don't automap decoder * fix mdx save audio file * fix mdx save audio file * processor batch decode for audio * remove keys to ignore * update doc md * update generation config * allow changes for default generation config * update tests * make style * fix docstring for uncond * fix processor test * fix processor test --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-06-29 14:48:59 +01:00
Sylvain Gugger	2dc5e1a120	Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" (#24574 ) Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)" This reverts commit `c5e29d4381`.	2023-06-29 08:14:43 -04:00
Joao Gante	4f1b31c2ee	Docs: 4 bit doc corrections (#24572 ) 4 bit doc corrections	2023-06-29 13:13:20 +01:00
MS Kim(tony9402)	1fd52e6e60	Fix annotations (#24571 ) * fix annotations * fix copies	2023-06-29 08:05:19 -04:00
MS Kim(tony9402)	63cc30e71b	Fix Typo (#24559 )	2023-06-29 08:04:07 -04:00
amyeroberts	ae454f41d4	Update old existing feature extractor references (#24552 ) * Update old existing feature extractor references * Typo * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Address comments from review - update 'feature extractor' Co-authored by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-06-29 10:17:36 +01:00
Pasquale De Marinis	10c2ac7bc6	Fixed OwlViTModel inplace operations (#24529 ) * fixed OwlViTModel inplace operations * fixed operands order in owlvit	2023-06-29 10:17:26 +02:00
condor-cp	66954ea25e	Update masked_language_modeling.md (#24560 ) See https://github.com/huggingface/transformers/issues/24546	2023-06-28 17:54:20 -04:00
Yih-Dar	fd6735102a	Make PT/Flax tests could be run on GPU (#24557 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 20:11:01 +02:00
Yih-Dar	faae8d8255	Update PT/Flax weight conversion after #24030 (#24556 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 19:44:31 +02:00
Younes Belkada	33b5ef5cdf	[`InstructBlip`] Add instruct blip int8 test (#24555 ) * add 8bit instructblip test * update tests	2023-06-28 19:06:30 +02:00
amyeroberts	c70c88a268	Fix processor __init__ bug if image processor undefined (#24554 ) Make sure feature_extractor is defined in all cases	2023-06-28 17:17:27 +01:00
Younes Belkada	903b97d8df	[`gpt2-int8`] Add gpt2-xl int8 test (#24543 ) add gpt2-xl test	2023-06-28 18:02:13 +02:00
Yih-Dar	b0651655be	Update `EncodecIntegrationTest` (#24553 ) * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 18:01:41 +02:00
Yih-Dar	6c57ce1558	Update PT/TF weight conversion after #24030 (#24547 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 16:36:57 +02:00
Max Ryabinin	c5e29d4381	Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549 ) * Fix typing annotations for FSDP and DeepSpeed in TrainingArguments * Change dict to Dict	2023-06-28 10:36:17 -04:00
Frank995	daccde143d	Allow for warn_only selection in enable_full_determinism (#24496 ) * Warn only in enable full determinism * Add option in the function definition	2023-06-28 08:54:36 -04:00
Yih-Dar	11cb6e0f7e	Unpin DeepSpeed and require DS >= 0.9.3 (#24541 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 14:01:22 +02:00
Yih-Dar	e84bf1f734	⚠️ Time to say goodbye to py37 (#24091 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 07:22:39 +02:00
Dario Sučić	12240925cf	Add bitsandbytes support for gpt2 models (#24504 ) * Add bitsandbytes support for gpt2 models * Guard Conv1D import to pass tensorflow test * Appease ruff linter * Fix 4bit test and remove int8 test boilerplate * Update tests/bnb/test_mixed_int8.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-06-28 05:55:32 +02:00
Sylvain Gugger	89b6ee49fd	Finishing tidying keys to ignore on load (#24535 )	2023-06-27 21:35:15 -04:00
MS Kim(tony9402)	04f46a22d8	Fix Typo (#24530 ) * Fix Typo * Fix all copies	2023-06-27 15:38:14 -04:00
amyeroberts	462f77cbce	Allow backbones not in backbones_supported - Maskformer Mask2Former (#24532 ) Allow backbones not in backbones_supported	2023-06-27 20:34:36 +01:00
Sylvain Gugger	8e5d1619b3	Clean load keys (#24505 ) * Preliminary work on some models * Fix test load missing and make sure nonpersistent buffers are tested * Always ignore nonpersistent buffers if in state_dict * Treat models * More models * Treat remaining models * Fix quality * Fix tests * Remove draft * This test is not needed anymore * Fix copies * Fix last test * Newly added models * Fix last tests * Address review comments	2023-06-27 14:45:40 -04:00
NielsRogge	53194991e9	[Mask2Former] Remove SwinConfig (#24259 ) Remove SwinConfig	2023-06-27 13:33:55 -04:00
Zach Mueller	fb6a62762f	Fix LR scheduler based on bs from auto bs finder (#24521 ) * One solution * args -> self	2023-06-27 13:28:26 -04:00
Sylvain Gugger	38db04ece0	Find module name in an OS-agnostic fashion (#24526 ) * Find module name in an OS-agnostic fashion * address review comment	2023-06-27 13:21:19 -04:00
Yih-Dar	7d150d68ff	Update `huggingface_hub` commit sha (#24527 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-27 17:41:55 +02:00
Wang, Yi	4e8929dcbb	set model to training mode before accelerate.prepare (#24520 )	2023-06-27 10:09:38 -04:00
Sebastian	06910f5a76	[`T5`] Add T5ForQuestionAnswering and MT5ForQuestionAnswering (#24481 ) * Adding T5ForQuestionAnswering * Changed weight initialization that results in better initial loss when fine-tuning * Update to class variables * Running make fixup * Running make fix-copies * Remove model_parallel * Adding MT5ForQuestionAnswering * Adding docs * Fix wrong doc * Update src/transformers/models/mt5/modeling_mt5.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/t5/modeling_t5.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * File formatting * Undoing change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-06-27 10:07:06 -04:00
Sourab Mangrulkar	bcf02ec701	Update hyperparameter_search.py (#24515 ) * Update hyperparameter_search.py * resolve comments	2023-06-27 18:42:15 +05:30
Wang, Yi	6fe8d198e3	use accelerate autocast in jit eval path, since mix precision logic is… (#24460 ) use accelerate autocast in jit eval path, since mix precision logic is in accelerator currently Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2023-06-27 08:33:21 -04:00
Hyeonseo Yun	0863436b6c	🌐 [i18n-KO] Translated `tflite.mdx` to Korean (#24435 ) * docs: ko: tflite.mdx * feat: nmt and manual edit `tflite.mdx` * revised: resolve suggestions tflite.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * revised: resolve suggestions and new line tflite.mdx Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> Co-Authored-By: Kihoon Son <75935546+KIHOON71@users.noreply.github.com> Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com> Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com> --------- Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Kihoon Son <75935546+KIHOON71@users.noreply.github.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Nayeon Han <nayeon2.han@gmail.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-06-27 08:18:42 -04:00
Yih-Dar	4abd3ee479	Fix poor past ci (#24485 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-27 14:14:17 +02:00
Xiaoli Wang	239ace152b	Fix TypeError: Object of type int64 is not JSON serializable (#24340 ) * Fix TypeError: Object of type int64 is not JSON serializable * Convert numpy.float64 and numpy.int64 to float and int for json serialization * Black reformatted examples/pytorch/token-classification/run_ner_no_trainer.py * * make style	2023-06-27 12:15:49 +01:00
Joao Gante	ac19871ce2	Generate: `min_tokens_to_keep` has to be `>= 1` (#24453 )	2023-06-27 11:48:23 +01:00
Joao Gante	5f3efdf762	Generate: `group_beam_search` requires `diversity_penalty>0.0` (#24456 ) * add exception * update docs	2023-06-27 10:46:39 +01:00
hukuda222	43479ef98f	🚨🚨 Fix group beam search (#24407 ) * group_beam_search now works correctly * add argument descriptions * add a comment * format * make style * change comment * Update src/transformers/generation/beam_search.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> --------- Co-authored-by: shogo.fujita <shogo.fujita@legalontech.jp> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-06-27 10:43:10 +01:00
Gema Parreño	68c92981ff	Fix link in utils (#24501 ) * fix link * new link --------- Co-authored-by: Gema <gema@mbp-de-gema-2.lan>	2023-06-26 14:26:09 -04:00
Yih-Dar	7b4e3b5b40	Compute `dropout_probability` only in training mode (SpeechT5) (#24498 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 19:43:06 +02:00
Tomoko Uchida	c9fd49853f	Fix 'local_rank' AttiributeError in Trainer class (#24297 ) fix attribute error	2023-06-26 13:38:29 -04:00
Yih-Dar	850cf4af0c	Compute `dropout_probability` only in training mode (#24486 ) * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 18:36:47 +02:00
Younes Belkada	9895670e95	[`InstructBlip`] Add accelerate support for instructblip (#24488 ) * add accelerate support for instructblip * add `_keep_in_fp32_modules` * dynamically adapt `_no_split_modules` * better fix * same logic for `_keep_in_fp32_modules`	2023-06-26 18:36:27 +02:00
Sylvain Gugger	5757923888	Add support for for loops in python interpreter (#24429 ) Add support for for loops	2023-06-26 09:58:14 -04:00
condor-cp	c2aa5e17e4	Update token_classification.md (#24484 ) Add link to pytorch CrossEntropyLoss so that one understand why '-100' is ignore by the loss function.	2023-06-26 08:42:38 -04:00
Yih-Dar	3ca022238b	Update `InstructBlipModelIntegrationTest` (#24490 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 14:37:12 +02:00
Sourab Mangrulkar	195a9e5bdb	deepspeed z1/z2 state dict fix (#24489 ) * deepspeed z2/z1 state_dict bloating fix * update * version check	2023-06-26 17:45:37 +05:30
Wang, Yi	c8aff1d3e6	when resume from peft checkpoint, the model should be trainable (#24463 )	2023-06-26 08:07:27 -04:00
Younes Belkada	914289ac4b	[`pipeline`] Fix str device issue (#24396 ) * fix str device issue * fixup * adapt from suggestions * forward contrib credits from suggestions * better fix * added backward compatibility for older PT versions * final fixes * oops * Attempting something with less branching. --------- Co-authored-by: amyeroberts <amyeroberts@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-26 13:58:36 +02:00

13322 Commits