transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-15 10:38:23 +06:00

Author	SHA1	Message	Date
Yih-Dar	e84bf1f734	⚠️ Time to say goodbye to py37 (#24091 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-28 07:22:39 +02:00
Dario Sučić	12240925cf	Add bitsandbytes support for gpt2 models (#24504 ) * Add bitsandbytes support for gpt2 models * Guard Conv1D import to pass tensorflow test * Appease ruff linter * Fix 4bit test and remove int8 test boilerplate * Update tests/bnb/test_mixed_int8.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-06-28 05:55:32 +02:00
Sylvain Gugger	89b6ee49fd	Finishing tidying keys to ignore on load (#24535 )	2023-06-27 21:35:15 -04:00
MS Kim(tony9402)	04f46a22d8	Fix Typo (#24530 ) * Fix Typo * Fix all copies	2023-06-27 15:38:14 -04:00
amyeroberts	462f77cbce	Allow backbones not in backbones_supported - Maskformer Mask2Former (#24532 ) Allow backbones not in backbones_supported	2023-06-27 20:34:36 +01:00
Sylvain Gugger	8e5d1619b3	Clean load keys (#24505 ) * Preliminary work on some models * Fix test load missing and make sure nonpersistent buffers are tested * Always ignore nonpersistent buffers if in state_dict * Treat models * More models * Treat remaining models * Fix quality * Fix tests * Remove draft * This test is not needed anymore * Fix copies * Fix last test * Newly added models * Fix last tests * Address review comments	2023-06-27 14:45:40 -04:00
NielsRogge	53194991e9	[Mask2Former] Remove SwinConfig (#24259 ) Remove SwinConfig	2023-06-27 13:33:55 -04:00
Zach Mueller	fb6a62762f	Fix LR scheduler based on bs from auto bs finder (#24521 ) * One solution * args -> self	2023-06-27 13:28:26 -04:00
Sylvain Gugger	38db04ece0	Find module name in an OS-agnostic fashion (#24526 ) * Find module name in an OS-agnostic fashion * address review comment	2023-06-27 13:21:19 -04:00
Yih-Dar	7d150d68ff	Update `huggingface_hub` commit sha (#24527 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-27 17:41:55 +02:00
Wang, Yi	4e8929dcbb	set model to training mode before accelerate.prepare (#24520 )	2023-06-27 10:09:38 -04:00
Sebastian	06910f5a76	[`T5`] Add T5ForQuestionAnswering and MT5ForQuestionAnswering (#24481 ) * Adding T5ForQuestionAnswering * Changed weight initialization that results in better initial loss when fine-tuning * Update to class variables * Running make fixup * Running make fix-copies * Remove model_parallel * Adding MT5ForQuestionAnswering * Adding docs * Fix wrong doc * Update src/transformers/models/mt5/modeling_mt5.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/t5/modeling_t5.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * File formatting * Undoing change --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-06-27 10:07:06 -04:00
Sourab Mangrulkar	bcf02ec701	Update hyperparameter_search.py (#24515 ) * Update hyperparameter_search.py * resolve comments	2023-06-27 18:42:15 +05:30
Wang, Yi	6fe8d198e3	use accelerate autocast in jit eval path, since mix precision logic is… (#24460 ) use accelerate autocast in jit eval path, since mix precision logic is in accelerator currently Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>	2023-06-27 08:33:21 -04:00
Hyeonseo Yun	0863436b6c	🌐 [i18n-KO] Translated `tflite.mdx` to Korean (#24435 ) * docs: ko: tflite.mdx * feat: nmt and manual edit `tflite.mdx` * revised: resolve suggestions tflite.mdx Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> * revised: resolve suggestions and new line tflite.mdx Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com> Co-Authored-By: Kihoon Son <75935546+KIHOON71@users.noreply.github.com> Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com> Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com> Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com> --------- Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by: Kihoon Son <75935546+KIHOON71@users.noreply.github.com> Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by: Nayeon Han <nayeon2.han@gmail.com> Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>	2023-06-27 08:18:42 -04:00
Yih-Dar	4abd3ee479	Fix poor past ci (#24485 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-27 14:14:17 +02:00
Xiaoli Wang	239ace152b	Fix TypeError: Object of type int64 is not JSON serializable (#24340 ) * Fix TypeError: Object of type int64 is not JSON serializable * Convert numpy.float64 and numpy.int64 to float and int for json serialization * Black reformatted examples/pytorch/token-classification/run_ner_no_trainer.py * * make style	2023-06-27 12:15:49 +01:00
Joao Gante	ac19871ce2	Generate: `min_tokens_to_keep` has to be `>= 1` (#24453 )	2023-06-27 11:48:23 +01:00
Joao Gante	5f3efdf762	Generate: `group_beam_search` requires `diversity_penalty>0.0` (#24456 ) * add exception * update docs	2023-06-27 10:46:39 +01:00
hukuda222	43479ef98f	🚨🚨 Fix group beam search (#24407 ) * group_beam_search now works correctly * add argument descriptions * add a comment * format * make style * change comment * Update src/transformers/generation/beam_search.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> --------- Co-authored-by: shogo.fujita <shogo.fujita@legalontech.jp> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-06-27 10:43:10 +01:00
Gema Parreño	68c92981ff	Fix link in utils (#24501 ) * fix link * new link --------- Co-authored-by: Gema <gema@mbp-de-gema-2.lan>	2023-06-26 14:26:09 -04:00
Yih-Dar	7b4e3b5b40	Compute `dropout_probability` only in training mode (SpeechT5) (#24498 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 19:43:06 +02:00
Tomoko Uchida	c9fd49853f	Fix 'local_rank' AttiributeError in Trainer class (#24297 ) fix attribute error	2023-06-26 13:38:29 -04:00
Yih-Dar	850cf4af0c	Compute `dropout_probability` only in training mode (#24486 ) * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 18:36:47 +02:00
Younes Belkada	9895670e95	[`InstructBlip`] Add accelerate support for instructblip (#24488 ) * add accelerate support for instructblip * add `_keep_in_fp32_modules` * dynamically adapt `_no_split_modules` * better fix * same logic for `_keep_in_fp32_modules`	2023-06-26 18:36:27 +02:00
Sylvain Gugger	5757923888	Add support for for loops in python interpreter (#24429 ) Add support for for loops	2023-06-26 09:58:14 -04:00
condor-cp	c2aa5e17e4	Update token_classification.md (#24484 ) Add link to pytorch CrossEntropyLoss so that one understand why '-100' is ignore by the loss function.	2023-06-26 08:42:38 -04:00
Yih-Dar	3ca022238b	Update `InstructBlipModelIntegrationTest` (#24490 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 14:37:12 +02:00
Sourab Mangrulkar	195a9e5bdb	deepspeed z1/z2 state dict fix (#24489 ) * deepspeed z2/z1 state_dict bloating fix * update * version check	2023-06-26 17:45:37 +05:30
Wang, Yi	c8aff1d3e6	when resume from peft checkpoint, the model should be trainable (#24463 )	2023-06-26 08:07:27 -04:00
Younes Belkada	914289ac4b	[`pipeline`] Fix str device issue (#24396 ) * fix str device issue * fixup * adapt from suggestions * forward contrib credits from suggestions * better fix * added backward compatibility for older PT versions * final fixes * oops * Attempting something with less branching. --------- Co-authored-by: amyeroberts <amyeroberts@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-26 13:58:36 +02:00
amyeroberts	892399c5ff	Update AlbertModel type annotation (#24450 ) Update type annotation	2023-06-26 10:59:42 +01:00
Meghan Cowan	be2d9f2e47	Fix tpu_metrics_debug (#24452 ) fix for tpu metrics debugs string	2023-06-26 10:59:07 +01:00
Matthijs Hollemans	3b84d86b57	add missing alignment_heads to Whisper integration test (#24487 ) add missing alignment heads	2023-06-26 11:50:10 +02:00
NielsRogge	868363abb9	Add InstructBLIP (#23460 ) * Squash 88 commits * Use markdown * Remove mdx files due to bad rebase * Fix modeling files due to bad rebase * Fix style * Update comment * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 11:23:57 +02:00
Matt	8e164c5400	Improved keras imports (#24448 ) * An end to accursed version-specific imports * No more K.is_keras_tensor() either * Update dependency tables * Use a cleaner call context function getter * Add a cap to <2.14 * Add cap to examples requirements too	2023-06-23 19:09:34 +01:00
Yih-Dar	1e9da2b0a6	Update `JukeboxConfig.from_pretrained` (#24443 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-23 15:00:52 +02:00
Sanchit Gandhi	8767958fc1	Allow dict input for audio classification pipeline (#23445 ) * Allow dict input for audio classification pipeline * make style * Empty commit to trigger CI * Empty commit to trigger CI * check for torchaudio * add pip instructions Co-authored-by: Sylvain <sylvain.gugger@gmail.com> * Update src/transformers/pipelines/audio_classification.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * asr -> audio class * asr -> audio class --------- Co-authored-by: Sylvain <sylvain.gugger@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-23 13:50:37 +01:00
Sourab Mangrulkar	a6f37f8879	fixes issue when saving fsdp via accelerate's FSDP plugin (#24446 )	2023-06-23 18:03:57 +05:30
Yih-Dar	2898fd3968	Fix some `TFWhisperModelIntegrationTests` (#24428 ) * fix * fix * fix * fix * fix * fix * fix * fix * fix * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-23 14:27:49 +02:00
Moon Gi Cho	5e9f6752ee	Fix typo (#24440 )	2023-06-23 08:21:08 -04:00
Bowen Bao	a28325e25e	Replace python random with torch.rand to enable dynamo.export (#24434 ) * Replace python random with torch.rand to enable dynamo.export * revert changes to flax model code * Remove unused random import * Fix torch template * Move torch.manual_seed(0) to right location	2023-06-23 08:17:21 -04:00
Sourab Mangrulkar	c036c814f4	fix the grad_acc issue at epoch boundaries (#24415 ) * fix the grad_acc issue at epoch boundaries Co-Authored-By: Zach Mueller <7831895+muellerzr@users.noreply.github.com> * add contributors. Co-authored-by: sumpster * address comments --------- Co-authored-by: Zach Mueller <7831895+muellerzr@users.noreply.github.com>	2023-06-23 17:43:07 +05:30
Younes Belkada	468aed39af	[`Trainer`] Fix `.to` call on 4bit models (#24444 ) * fix `.to` call on 4bit models * better check	2023-06-23 13:35:04 +02:00
Sanchit Gandhi	ea91c2adca	[AutoModel] Add AutoModelForTextEncoding (#24305 ) * [AutoModel] Add AutoModelForTextEncoding * add mt5 * add other models * add to docs * fix tf imports * add tf to docs / init * up * fix inits * add to dummy objects	2023-06-23 10:01:37 +01:00
Weiming Zhao	feb83521ec	[llama] Fix comments in weights converter (#24436 ) Explain the reason to clone tensor	2023-06-22 20:38:53 -04:00
Yih-Dar	2c977e4a90	Save `site-packages` as cache in CircleCI job (#24424 ) * fix * fix * Upgrade complete! --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-22 23:16:35 +02:00
Sylvain Gugger	2834c17ad2	Clarify batch size displayed when using DataParallel (#24430 )	2023-06-22 14:46:20 -04:00
Alex Hall	b6295b26c5	Refactor hyperparameter search backends (#24384 ) * Refactor hyperparameter search backends * Simpler refactoring without abstract base class * black * review comments: specify name in class use methods instead of callable class attributes name constant better * review comments: safer bool checking, log multiple available backends * test ALL_HYPERPARAMETER_SEARCH_BACKENDS vs HPSearchBackend in unit test, not module. format with black. * copyright	2023-06-22 14:28:25 -04:00
Matt	a1c4b63076	TF CI fix for Segformer (#24426 ) Fix segformer so compilation can figure out the channel dim	2023-06-22 15:49:13 +01:00

15053 Commits