transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Yih-Dar	850cf4af0c	Compute `dropout_probability` only in training mode (#24486 ) * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 18:36:47 +02:00
Younes Belkada	9895670e95	[`InstructBlip`] Add accelerate support for instructblip (#24488 ) * add accelerate support for instructblip * add `_keep_in_fp32_modules` * dynamically adapt `_no_split_modules` * better fix * same logic for `_keep_in_fp32_modules`	2023-06-26 18:36:27 +02:00
Sylvain Gugger	5757923888	Add support for for loops in python interpreter (#24429 ) Add support for for loops	2023-06-26 09:58:14 -04:00
condor-cp	c2aa5e17e4	Update token_classification.md (#24484 ) Add link to PyTorch CrossEntropyLoss so that one understands why '-100' is ignored by the loss function.	2023-06-26 08:42:38 -04:00
Yih-Dar	3ca022238b	Update `InstructBlipModelIntegrationTest` (#24490 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 14:37:12 +02:00
Sourab Mangrulkar	195a9e5bdb	deepspeed z1/z2 state dict fix (#24489 ) * deepspeed z2/z1 state_dict bloating fix * update * version check	2023-06-26 17:45:37 +05:30
Wang, Yi	c8aff1d3e6	when resume from peft checkpoint, the model should be trainable (#24463 )	2023-06-26 08:07:27 -04:00
Younes Belkada	914289ac4b	[`pipeline`] Fix str device issue (#24396 ) * fix str device issue * fixup * adapt from suggestions * forward contrib credits from suggestions * better fix * added backward compatibility for older PT versions * final fixes * oops * Attempting something with less branching. --------- Co-authored-by: amyeroberts <amyeroberts@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-26 13:58:36 +02:00
amyeroberts	892399c5ff	Update AlbertModel type annotation (#24450 ) Update type annotation	2023-06-26 10:59:42 +01:00
Meghan Cowan	be2d9f2e47	Fix tpu_metrics_debug (#24452 ) fix for tpu metrics debugs string	2023-06-26 10:59:07 +01:00
Matthijs Hollemans	3b84d86b57	add missing alignment_heads to Whisper integration test (#24487 ) add missing alignment heads	2023-06-26 11:50:10 +02:00
NielsRogge	868363abb9	Add InstructBLIP (#23460 ) * Squash 88 commits * Use markdown * Remove mdx files due to bad rebase * Fix modeling files due to bad rebase * Fix style * Update comment * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-26 11:23:57 +02:00
Matt	8e164c5400	Improved keras imports (#24448 ) * An end to accursed version-specific imports * No more K.is_keras_tensor() either * Update dependency tables * Use a cleaner call context function getter * Add a cap to <2.14 * Add cap to examples requirements too	2023-06-23 19:09:34 +01:00
Yih-Dar	1e9da2b0a6	Update `JukeboxConfig.from_pretrained` (#24443 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-23 15:00:52 +02:00
Sanchit Gandhi	8767958fc1	Allow dict input for audio classification pipeline (#23445 ) * Allow dict input for audio classification pipeline * make style * Empty commit to trigger CI * Empty commit to trigger CI * check for torchaudio * add pip instructions Co-authored-by: Sylvain <sylvain.gugger@gmail.com> * Update src/transformers/pipelines/audio_classification.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * asr -> audio class * asr -> audio class --------- Co-authored-by: Sylvain <sylvain.gugger@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-23 13:50:37 +01:00
Sourab Mangrulkar	a6f37f8879	fixes issue when saving fsdp via accelerate's FSDP plugin (#24446 )	2023-06-23 18:03:57 +05:30
Yih-Dar	2898fd3968	Fix some `TFWhisperModelIntegrationTests` (#24428 ) * fix * fix * fix * fix * fix * fix * fix * fix * fix * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-06-23 14:27:49 +02:00
Moon Gi Cho	5e9f6752ee	Fix typo (#24440 )	2023-06-23 08:21:08 -04:00
Bowen Bao	a28325e25e	Replace python random with torch.rand to enable dynamo.export (#24434 ) * Replace python random with torch.rand to enable dynamo.export * revert changes to flax model code * Remove unused random import * Fix torch template * Move torch.manual_seed(0) to right location	2023-06-23 08:17:21 -04:00
Sourab Mangrulkar	c036c814f4	fix the grad_acc issue at epoch boundaries (#24415 ) * fix the grad_acc issue at epoch boundaries Co-Authored-By: Zach Mueller <7831895+muellerzr@users.noreply.github.com> * add contributors. Co-authored-by: sumpster * address comments --------- Co-authored-by: Zach Mueller <7831895+muellerzr@users.noreply.github.com>	2023-06-23 17:43:07 +05:30
Younes Belkada	468aed39af	[`Trainer`] Fix `.to` call on 4bit models (#24444 ) * fix `.to` call on 4bit models * better check	2023-06-23 13:35:04 +02:00
Sanchit Gandhi	ea91c2adca	[AutoModel] Add AutoModelForTextEncoding (#24305 ) * [AutoModel] Add AutoModelForTextEncoding * add mt5 * add other models * add to docs * fix tf imports * add tf to docs / init * up * fix inits * add to dummy objects	2023-06-23 10:01:37 +01:00
Weiming Zhao	feb83521ec	[llama] Fix comments in weights converter (#24436 ) Explain the reason to clone tensor	2023-06-22 20:38:53 -04:00
Yih-Dar	2c977e4a90	Save `site-packages` as cache in CircleCI job (#24424 ) * fix * fix * Upgrade complete! --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-22 23:16:35 +02:00
Sylvain Gugger	2834c17ad2	Clarify batch size displayed when using DataParallel (#24430 )	2023-06-22 14:46:20 -04:00
Alex Hall	b6295b26c5	Refactor hyperparameter search backends (#24384 ) * Refactor hyperparameter search backends * Simpler refactoring without abstract base class * black * review comments: specify name in class use methods instead of callable class attributes name constant better * review comments: safer bool checking, log multiple available backends * test ALL_HYPERPARAMETER_SEARCH_BACKENDS vs HPSearchBackend in unit test, not module. format with black. * copyright	2023-06-22 14:28:25 -04:00
Matt	a1c4b63076	TF CI fix for Segformer (#24426 ) Fix segformer so compilation can figure out the channel dim	2023-06-22 15:49:13 +01:00
Josh	754f61ca05	Update RayTune doc link for Hyperparameter tuning (#24422 ) Update outdated hyperlink hpo_train.md Link to RayTune search space API docs was outdated - have provided correct new link for docs. Co-authored-by: Joshua Samuel <66880119+Joshsamuel101@users.noreply.github.com>	2023-06-22 10:38:01 -04:00
Yih-Dar	8f2ef52fb6	Fix `save_cache` version in `config.yml` (#24419 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-22 16:18:16 +02:00
Younes Belkada	3ce3385c47	Revert "Fix gradient checkpointing + fp16 autocast for most models" (#24420 ) Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)" This reverts commit `285a48011d`.	2023-06-22 16:11:27 +02:00
Younes Belkada	ebb62e8880	[`bnb`] Fix bnb serialization issue with new release (#24416 ) * fix bnb issue * fixup * revert and do simple patching instead * add more details	2023-06-22 15:40:38 +02:00
Yih-Dar	652ece0710	Skip `test_conditional_generation_pt_pix2struct` in Past CI (torch < 1.11) (#24417 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-22 15:34:13 +02:00
Matt	22fe73c378	TF safetensors reduced mem usage (#24404 ) * Slight comment cleanup * Reduce peak mem usage when loading TF-format safetensor weights * Tweak the PyTorch loading code to support lazy loading from safetensors * Pass safe_open objects to the PyTorch loading function * Do GPU transposes for speed * One more tweak to reduce peak usage further * One-line hasattr * Fix bug when there's a shape mismatch * Rename state_dict in the loading code to be clearer * Use TF format everywhere for consistency	2023-06-22 14:06:16 +01:00
Sanchit Gandhi	7e03e46934	[ASR pipeline] Check for torchaudio (#23953 ) * [ASR pipeline] Check for torchaudio * add pip instructions Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> --------- Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>	2023-06-22 13:48:49 +01:00
Yih-Dar	6ce6d62b6f	Explicit arguments in `from_pretrained` (#24306 ) * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-21 19:24:11 +02:00
Zach Mueller	127e81c272	Remove redundant code from TrainingArgs (#24401 ) Remove redundant code	2023-06-21 11:51:27 -04:00
Matthijs Hollemans	cd927a4736	add word-level timestamps to Whisper (#23205 ) * let's go! * initial implementation of token-level timestamps * only return a single timestamp per token * remove token probabilities * fix return type * fix doc comment * strip special tokens * rename * revert to not stripping special tokens * only support models that have alignment_heads * add integration test * consistently name it token-level timestamps * small DTW tweak * initial support for ASR pipeline * fix pipeline doc comments * resolve token timestamps in pipeline with chunking * change warning when no final timestamp is found * return word-level timestamps * fixup * fix bug that skipped final word in each chunk * fix failing unit tests * merge punctuations into the words * also return word tokens * also return token indices * add (failing) unit test for combine_tokens_into_words * make combine_tokens_into_words private * restore OpenAI's punctuation rules * add pipeline tests * make requested changes * PR review changes * fix failing pipeline test * small stuff from PR * only return words and their timestamps, not segments * move alignment_heads into generation config * forgot to set alignment_heads in pipeline tests * tiny comment fix * grr	2023-06-21 17:48:21 +02:00
Yih-Dar	0f968ddaa3	Check auto mappings could be imported via `from transformers` (#24400 ) * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-21 17:31:57 +02:00
Zach Mueller	1a6fb930fb	Clean up dist import (#24402 )	2023-06-21 11:19:42 -04:00
Younes Belkada	285a48011d	Fix gradient checkpointing + fp16 autocast for most models (#24247 ) * fix gc bug * continue PoC on OPT * fixes * 🤯 * fix tests * remove pytest.mark * fixup * forward contrib credits from discussions * forward contrib credits from discussions * reverting changes on untouched files. --------- Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com> Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>	2023-06-21 17:04:59 +02:00
Meghan Cowan	1815d1865e	[Trainer] Fix optimizer step on PyTorch TPU (#24389 ) * update optimizer step for tpu * add comment	2023-06-21 07:24:41 -04:00
Bearnardd	4c6e429589	fix type annotation for debug arg (#24033 ) * fix type annotation for debug arg * fix TypeError	2023-06-21 11:42:21 +01:00
Yih-Dar	16c7b16a0a	byebye Hub connection timeout - Recast (#24399 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-21 12:36:34 +02:00
Joao Gante	5f0801d174	Generate: add SequenceBiasLogitsProcessor (#24334 )	2023-06-21 11:14:41 +01:00
Yih-Dar	45f71d793d	Add `ffmpeg` for `doc_test_job` on CircleCI (#24397 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-21 11:12:38 +02:00
Steven Liu	ad78d9597b	[docs] Fix NLLB-MoE links (#24388 ) fix broken links	2023-06-20 17:34:20 -07:00
Sergii Dymchenko	cb8f675510	Update deprecated torch.ger (#24387 )	2023-06-20 20:21:13 -04:00
Sylvain Gugger	eb849f6604	Migrate doc files to Markdown. (#24376 ) * Rename index.mdx to index.md * With saved modifs * Address review comment * Treat all files * .mdx -> .md * Remove special char * Update utils/tests_fetcher.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2023-06-20 18:07:47 -04:00
Patrick von Platen	b0513b013b	[Wav2Vec2 - MMS] Correct directly loading adapters weights (#24335 ) * Correct direct lang loading * correct more * revert black * Use tie weights instead * add tests * add tests * make style	2023-06-20 19:39:52 +02:00
Arthur	e5c760d636	[GPTNeoX] Nit in config (#24349 ) * add raise value error for attention size * nits to fix test_config * style	2023-06-20 19:19:19 +02:00

13280 Commits