transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Jonathan Tow	2f12e40822	[`StableLm`] Add QK normalization and Parallel Residual Support (#29745 ) * init: add StableLm 2 support * add integration test for parallel residual and qk layernorm * update(modeling): match qk norm naming for consistency with phi/persimmon * fix(tests): run fwd/bwd on random init test model to jitter norm weights off identity * `use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward` * refactor: rename head states var in `StableLmLayerNormPerHead` * tests: update test model and add generate check	2024-04-08 23:51:58 +02:00
Felix Hirwa Nshuti	8c00b53eb0	Adding `mps` as device for `Pipeline` class (#30080 ) * adding env variable for mps and is_torch_mps_available for Pipeline * fix linting errors * Remove environment overide Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-08 18:07:30 +01:00
DrAnaximandre	7afade2086	Fix typo at ImportError (#30090 ) fix typo at ImportError	2024-04-08 17:45:21 +01:00
fxmarty	ef38e2a7e5	Make vitdet jit trace complient (#30065 ) * remove controlflows * style * rename patch_ to padded_ following review comment * style	2024-04-08 23:10:06 +08:00
Younes Belkada	a71def025c	Trainer / Core : Do not change init signature order (#30126 ) * Update trainer.py * fix copies	2024-04-08 16:57:38 +02:00
fxmarty	1897874edc	Fix falcon with SDPA, alibi but no passed mask (#30123 ) * fix falcon without attention_mask & alibi * add test * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-08 22:25:07 +08:00
Anton Vlasjuk	1773afcec3	fix learning rate display in trainer when using galore optimizer (#30085 ) fix learning rate display issue in galore optimizer	2024-04-08 14:54:12 +01:00
Nick Doiron	08c8443307	Accept token in trainer.push_to_hub() (#30093 ) * pass token to trainer.push_to_hub * fmt * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * pass token to create_repo, update_folder --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-08 14:51:11 +01:00
Utkarsha Gupte	0201f6420b	[#29174 ] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888 ) * ImportError: Trainer with PyTorch requires accelerate>=0.20.1 Fix Adding the evaluate and accelerate installs at the beginning of the cell to fix the issue * ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 * Import Error Fix * Update installation.md * Update quicktour.md * rollback other lang changes * Update _config.py * updates for other languages * fixing error * Tutorial Update * Update tokenization_utils_base.py * Just use an optimizer string to pass the doctest? --------- Co-authored-by: Matt <rocketknight1@gmail.com>	2024-04-08 14:21:16 +01:00
amyeroberts	7f9aff910b	Patch fix - don't use safetensors for TF models (#30118 ) * Patch fix - don't use safetensors for TF models * Skip test for TF for now * Update for another test	2024-04-08 13:29:20 +01:00
JINO ROHIT	f5658732d5	fixing issue 30034 - adding data format for run_ner.py (#30088 )	2024-04-08 12:49:59 +01:00
Fanli Lin	d16f0abc3f	[tests] add `require_bitsandbytes` marker (#30116 ) * add bnb flag * move maker * add accelerator maker	2024-04-08 12:49:31 +01:00
Haz Sameen Shahgir	5e673ed2dc	updated examples/pytorch/language-modeling scripts and requirements.txt to require datasets>=2.14.0 (#30120 ) updated requirements.txt and require_version() calls in examples/pytorch/language-modeling to require datasets>=2.14.0	2024-04-08 12:41:28 +01:00
Howard Liberty	836e88caee	Make MLFlow version detection more robust and handles mlflow-skinny (#29957 ) * Make MLFlow version detection more robust and handles mlflow-skinny * Make function name more clear and refactor the logic * Further refactor	2024-04-08 12:20:02 +02:00
Xu Song	a907a903d6	Change log level to warning for num_train_epochs override (#30014 )	2024-04-08 10:36:53 +02:00
vaibhavagg303	1ed93be48a	[Whisper] Computing features on GPU in batch mode for whisper feature extractor. (#29900 ) * add _torch_extract_fbank_features_batch function in feature_extractor_whisper * reformat feature_extraction_whisper.py file * handle batching in single function * add gpu test & doc * add batch test & device in each __call__ * add device arg in doc string --------- Co-authored-by: vaibhav.aggarwal <vaibhav.aggarwal@sprinklr.com>	2024-04-08 10:36:25 +02:00
Cylis	1fc34aa666	doc: Correct spelling mistake (#30107 )	2024-04-08 08:44:05 +01:00
Raushan Turganbay	76fa17c166	Fix whisper kwargs and generation config (#30018 ) * clean-up whisper kwargs * failing test	2024-04-05 21:28:58 +05:00
Yih-Dar	9b5a6450d4	Fix auto tests (#30067 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 17:49:46 +02:00
Kola	d9fa13ce62	Add docstrings and types for MambaCache (#30023 ) * Add docstrings and types for MambaCache * Update src/transformers/models/mamba/modeling_mamba.py * Update src/transformers/models/mamba/modeling_mamba.py * Update src/transformers/models/mamba/modeling_mamba.py * make fixup * import copy in generation_whisper * ruff * Revert "make fixup" This reverts commit c4fedd6f60e3b0f11974a11433bc130478829a5c.	2024-04-05 16:19:54 +02:00
Yih-Dar	b17b54d3dd	Refactor daily CI workflow (#30012 ) * separate jobs * separate jobs * use channel name directly instead of ID * use channel name directly instead of ID * use channel name directly instead of ID --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 15:49:51 +02:00
Michael Benayoun	17cd7a9d28	Fix `torch.fx` symbolic tracing for LLama (#30047 ) * [WIP] fix fx * [WIP] fix fx * [WIP] fix fx * [WIP] fix fx * [WIP] fix fx * Apply changes to other models	2024-04-05 15:14:09 +02:00
Yih-Dar	48795317a2	[test fetcher] Always include the directly related test files (#30050 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 14:30:36 +02:00
miRx923	de11d0bdf0	Update quantizer_bnb_4bit.py: In the ValueError string there should be "....you need to set `llm_int8_enable_fp32_cpu_offload=True`...." instead of "`load_in_8bit_fp32_cpu_offload=True`". (#30013 ) * Update quantizer_bnb_4bit.py There is an mistake in ValueError on line 86 of quantizer_bnb_4bit.py. In the error string there should be "....you need to set `llm_int8_enable_fp32_cpu_offload=True`...." instead of "load_in_8bit_fp32_cpu_offload=True". I think you updated the BitsAndBytesConfig() arguments, but forgot to change the ValueError in quantizer_bnb_4bit.py. * Update quantizer_bnb_4bit.py Changed ValueError string "...you need to set load_in_8bit_fp32_cpu_offload=True..." to "....you need to set llm_int8_enable_fp32_cpu_offload=True...."	2024-04-05 14:04:50 +02:00
Marc Sun	4207a4076d	[bnb] Fix offload test (#30039 ) fix bnb test	2024-04-05 13:11:28 +02:00
NielsRogge	1ab7136488	[Trainer] Allow passing image processor (#29896 ) * Add image processor to trainer * Replace tokenizer=image_processor everywhere	2024-04-05 10:10:44 +02:00
Adam Louly	d704c0b698	Fix mixtral ONNX Exporter Issue. (#29858 ) * fix mixtral onnx export * fix qwen model	2024-04-05 09:49:42 +02:00
Wang, Yi	79d62b2da2	if output is tuple like facebook/hf-seamless-m4t-medium, waveform is … (#29722 ) * if output is tuple like facebook/hf-seamless-m4t-medium, waveform is the first element Signed-off-by: Wang, Yi <yi.a.wang@intel.com> * add test and fix batch issue Signed-off-by: Wang, Yi <yi.a.wang@intel.com> * add dict output support for seamless_m4t Signed-off-by: Wang, Yi <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi <yi.a.wang@intel.com>	2024-04-05 09:26:44 +02:00
Yih-Dar	8b52fa6b42	skip `test_encode_decode_fast_slow_all_tokens` for now (#30044 ) skip test_encode_decode_fast_slow_all_tokens for now Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 09:07:41 +02:00
Yih-Dar	24d787ce9d	Add `whisper` to `IMPORTANT_MODELS` (#30046 ) Add whisper Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 09:06:40 +02:00
Saurabh Dash	517a3e670d	Refactor Cohere Model (#30027 ) * changes * addressing comments * smol fix	2024-04-04 12:46:20 +02:00
byi8220	75b76a5ea4	[`ProcessingIdefics`] Attention mask bug with padding (#29449 ) * Defaulted IdeficsProcessor padding to 'longest', removed manual padding * make fixup * Defaulted processor call to padding=False * Add padding to processor call in IdeficsModelIntegrationTest as well * Defaulted IdeficsProcessor padding to 'longest', removed manual padding * make fixup * Defaulted processor call to padding=False * Add padding to processor call in IdeficsModelIntegrationTest as well * redefaulted padding=longest again * fixup/doc	2024-04-04 10:11:09 +01:00
byi8220	4e6c5eb045	Add a converter from mamba_ssm -> huggingface mamba (#29705 ) * implement convert_mamba_ssm_checkpoint_to_pytorch * Add test test_model_from_mamba_ssm_conversion * moved convert_ssm_config_to_hf_config to inside mamba_ssm_available check * fix skipif clause * moved skips to inside test since skipif decorator isn't working for some reason * Added validation * removed test * fixup * only compare logits * remove weight rename * Update src/transformers/models/mamba/convert_mamba_ssm_checkpoint_to_pytorch.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-04 09:29:32 +01:00
Jacky Lee	03732dea60	Enable multi-device for efficientnet (#29989 ) feat: enable mult-idevice for efficientnet	2024-04-03 20:54:34 +01:00
Zach Mueller	863e2562d8	Make clearer about zero_init requirements (#29879 ) * Docstring to note about zero init * Check for accelerate * Change conditional return * Tweak * Add new accelerate-specific zero3 check * Fix import * Revert to RTFM * Update src/transformers/modeling_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-03 13:37:52 -04:00
Arthur	695d823323	[`Main CIs`] Fix the red cis (#30022 ) * fix * sort imports	2024-04-03 19:34:39 +02:00
Raushan Turganbay	c10b5dd25e	Superpoint imports fix (#29898 ) quick fix	2024-04-03 18:32:01 +01:00
Steven Liu	34bfe95af5	[docs] Fix audio file (#30006 ) new audio file	2024-04-03 10:05:15 -07:00
Raushan Turganbay	cc75f1ac73	Fix vipllava for generation (#29874 ) * fix vipllava generation * consistent llava code * revert llava tests changes	2024-04-03 17:00:08 +01:00
Ondřej Cífka	240e10626b	Fix probability computation in `WhisperNoSpeechDetection` when recomputing scores (#29248 ) * Fix is_scores_logprobs in WhisperNoSpeechDetection * Add test_whisper_longform_no_speech_detection * Fix typo	2024-04-03 17:53:07 +02:00
Ondřej Cífka	bcd42c4af9	Fix `kwargs` handling in `generate_with_fallback` (#29225 ) * Fix generate_with_fallback *kwargs Change pop to get * Delete keys from kwargs to prevent overriding generation_config * Revert to passing kwargs by reference, but make a (shallow) copy * dict -> copy.copy * Add test_whisper_longform_multi_batch_beam	2024-04-03 17:51:03 +02:00
Ren Xuancheng	851f253f4d	Fix Qwen2Tokenizer (#29929 ) qwen2: fixed tokens starting with # in slow tokenizer; add tests Co-authored-by: jklj077 <17811943+jklj077@users.noreply.github.com>	2024-04-03 17:42:43 +02:00
Miguel Almeida	17b06e2c66	Fix Swinv2ForImageClassification NaN output (#29981 ) To address the issue of NaN logit outputs for certain combinations of the `image_size`, `patch_size` and `depths` configuration parameters, an assertion was made to ensure that the resulting `window_size` field in the model's Self Attention class is greater than 1, preventing divisions by zero in the normalization of `relative_coords_table`. Fix: #28675	2024-04-03 14:54:45 +01:00
fxmarty	81642d2b51	Make EncodecModel.decode ONNX exportable (#29913 ) * fix encodec onnx export for musicgen * simplification * fix quality * better style	2024-04-03 17:11:01 +08:00
Yih-Dar	b44df05bc0	Update `tests/utils/tiny_model_summary.json` (#29941 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-03 09:25:01 +02:00
Mario Šaško	fce52cefa7	Fix `remove_columns` in `text-classification` example (#29351 )	2024-04-02 19:15:27 +02:00
Joao Gante	5080ab12c8	Generate: fix logits processors doctests (#29718 ) * fix norm * fix logits processors doctests	2024-04-02 17:18:31 +01:00
Nicolas Patry	9b0a8ea7d1	Hard error when ignoring tensors. (#27484 ) (#29906 ) * Hard error when ignoring tensors. (#27484) * [WIP] Hard error when ignoring tensors. * Better selection/error when saving a checkpoint. - Find all names we should normally drop (those are in the transformers config) - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving) - Clone those disjoint tensors getting rid of the issue - Find all identical names (those should be declared in the config but we try to find them all anyway.) - For all identical names: - If they are in the config, just ignore them everything is fine - If they are not, warn about them. - For all remainder tensors which are shared yet neither identical NOR disjoint. raise a hard error. * Adding a failing test on `main` that passes here. * We don't need to keep the subfolder logic in this test. * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add small tests. * Dead variable. * Fixup. * Fixing tied_Weights_keys on generic models. * Fixup + T5 encoder/decoder tying (with different layers) * Code quality. * Dynamic member. * trigger * Fixing encoder name for other types of encoder/decoder combos. * Fix scoping. * Update .github/workflows/self-scheduled.yml Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fixing the tied_weights after the call. --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-02 16:59:05 +02:00
Minsub Lee (Matt)	15cd68713d	Fix `skip_special_tokens` for `Wav2Vec2CTCTokenizer._decode` (#29311 ) * Fix skip_special_tokens process for Wav2Vec2CTCTokenizer._decode * Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode * Exclude pad_token filtering since it is used as CTC-blank token * Add small test for skip_special_tokens * Update decoding test for added new token	2024-04-02 16:55:11 +02:00
Michael	cb5927ca8f	[Docs] Make an ordered list prettier in add_tensorflow_model.md (#29949 )	2024-04-02 12:37:56 +01:00

15542 Commits