transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 12:08:22 +06:00

Author	SHA1	Message	Date
Matthew Hoffman	b7d002bdff	Add str to TrainingArguments report_to type hint (#30078 ) * Add str to TrainingArguments report_to type hint * Swap order in Union * Merge Optional into Union https://github.com/huggingface/transformers/pull/30078#issuecomment-2042227546	2024-04-10 14:42:00 +01:00
Fanli Lin	185463784e	[tests] make 2 tests device-agnostic (#30008 ) add torch device	2024-04-10 14:46:39 +02:00
Marc Sun	bb76f81e40	[CI] Quantization workflow fix (#30158 ) * fix workflow * call ci * Update .github/workflows/self-scheduled-caller.yml Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-04-10 11:51:06 +02:00
Pavel Iakubovskii	56d001b26f	Fix and simplify semantic-segmentation example (#30145 ) * Remove unused augmentation * Fix pad_if_smaller() and remove unused augmentation * Add indentation * Fix requirements * Update dataset use instructions * Replace transforms with albumentations * Replace identity transform with None * Fixing formatting * Fixed comment place	2024-04-10 09:10:52 +01:00
Raushan Turganbay	41579763ee	Fix length related warnings in speculative decoding (#29585 ) * avoid generation length warning * add tests * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * add tests and minor fixes * refine `min_new_tokens` * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * add method to prepare length arguments * add test for min length * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * fix variable naming * empty commit for tests * trigger tests (empty) --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2024-04-10 12:45:07 +05:00
Marc Sun	6cdbd73e01	[CI] Fix setup (#30147 ) * [CI] fix setup * fix * test * Revert "test" This reverts commit `7df416d450`.	2024-04-09 18:10:00 +02:00
Steven Liu	21e23ffca7	[docs] Fix image segmentation guide (#30132 ) fixes	2024-04-09 09:08:37 -07:00
Marc Sun	58a939c6b7	Fix quantization tests (#29914 ) * revert back to torch 2.1.1 * run test * switch to torch 2.2.1 * udapte dockerfile * fix awq tests * fix test * run quanto tests * update tests * split quantization tests * fix * fix again * final fix * fix report artifact * build docker again * Revert "build docker again" This reverts commit `399a5f9d93`. * debug * revert * style * new notification system * testing notfication * rebuild docker * fix_prev_ci_results * typo * remove warning * fix typo * fix artifact name * debug * issue fixed * debug again * fix * fix time * test notif with faling test * typo * issues again * final fix ? * run all quantization tests again * remove name to clear space * revert modfiication done on workflow * fix * build docker * build only quant docker * fix quantization ci * fix * fix report * better quantization_matrix * add print * revert to the basic one	2024-04-09 17:10:29 +02:00
Yih-Dar	6487e9b370	Send headers when converting safetensors (#30144 ) Co-authored-by: Wauplin <lucainp@gmail.com>	2024-04-09 17:03:36 +02:00
Yih-Dar	08a194fcd6	Fix slow tests for important models to be compatible with A10 runners (#29905 ) * fix mistral and mixtral * add pdb * fix mixtral tesst * fix * fix mistral ? * add fix gemma * fix mistral * fix * test * anoter test * fix * fix * fix mistral tests * fix them again * final fixes for mistral * fix padding right * fix whipser fa2 * fix * fix * fix gemma * test * fix llama * fix * fix * fix llama gemma * add class attribute * fix CI * clarify whisper * compute_capability * rename names in some comments * Add # fmt: skip * make style * Update tests/models/mistral/test_modeling_mistral.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update * update --------- Co-authored-by: Younes Belkada <younesbelkada@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-04-09 13:28:54 +02:00
NielsRogge	e9c23fa056	[Trainer] Undo #29896 (#30129 ) * Undo * Use tokenizer * Undo data collator	2024-04-09 12:55:42 +02:00
NielsRogge	ba1b24e07b	[Trainer] Fix default data collator (#30142 ) * Fix data collator * Support feature extractors as well	2024-04-09 12:52:50 +02:00
Matt	ec59a42192	Revert workaround for TF safetensors loading (#30128 ) * See if we can get tests to pass with the fixed weights * See if we can get tests to pass with the fixed weights * Replace the revisions now that we don't need them anymore	2024-04-09 11:04:18 +01:00
Raushan Turganbay	841e87ef4f	Fix docs Pop2Piano (#30140 ) fix copies	2024-04-09 14:58:02 +05:00
Matthew Hoffman	af4c02622b	Add datasets.Dataset to Trainer's train_dataset and eval_dataset type hints (#30077 ) * Add datasets.Dataset to Trainer's train_dataset and eval_dataset type hints * Add is_datasets_available check for importing datasets under TYPE_CHECKING guard https://github.com/huggingface/transformers/pull/30077/files#r1555939352	2024-04-09 09:26:15 +01:00
Sourab Mangrulkar	4e3490f79b	Fix failing DeepSpeed model zoo tests (#30112 ) * fix sequence length errors * fix label column name error for vit * fix the lm_head embedding!=linear layer mismatches for Seq2Seq models	2024-04-09 12:01:47 +05:30
Jonathan Tow	2f12e40822	[`StableLm`] Add QK normalization and Parallel Residual Support (#29745 ) * init: add StableLm 2 support * add integration test for parallel residual and qk layernorm * update(modeling): match qk norm naming for consistency with phi/persimmon * fix(tests): run fwd/bwd on random init test model to jitter norm weights off identity * `use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward` * refactor: rename head states var in `StableLmLayerNormPerHead` * tests: update test model and add generate check	2024-04-08 23:51:58 +02:00
Felix Hirwa Nshuti	8c00b53eb0	Adding `mps` as device for `Pipeline` class (#30080 ) * adding env variable for mps and is_torch_mps_available for Pipeline * fix linting errors * Remove environment overide Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-08 18:07:30 +01:00
DrAnaximandre	7afade2086	Fix typo at ImportError (#30090 ) fix typo at ImportError	2024-04-08 17:45:21 +01:00
fxmarty	ef38e2a7e5	Make vitdet jit trace complient (#30065 ) * remove controlflows * style * rename patch_ to padded_ following review comment * style	2024-04-08 23:10:06 +08:00
Younes Belkada	a71def025c	Trainer / Core : Do not change init signature order (#30126 ) * Update trainer.py * fix copies	2024-04-08 16:57:38 +02:00
fxmarty	1897874edc	Fix falcon with SDPA, alibi but no passed mask (#30123 ) * fix falcon without attention_mask & alibi * add test * Update tests/models/falcon/test_modeling_falcon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-08 22:25:07 +08:00
Anton Vlasjuk	1773afcec3	fix learning rate display in trainer when using galore optimizer (#30085 ) fix learning rate display issue in galore optimizer	2024-04-08 14:54:12 +01:00
Nick Doiron	08c8443307	Accept token in trainer.push_to_hub() (#30093 ) * pass token to trainer.push_to_hub * fmt * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * pass token to create_repo, update_folder --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-08 14:51:11 +01:00
Utkarsha Gupte	0201f6420b	[#29174 ] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888 ) * ImportError: Trainer with PyTorch requires accelerate>=0.20.1 Fix Adding the evaluate and accelerate installs at the beginning of the cell to fix the issue * ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 * Import Error Fix * Update installation.md * Update quicktour.md * rollback other lang changes * Update _config.py * updates for other languages * fixing error * Tutorial Update * Update tokenization_utils_base.py * Just use an optimizer string to pass the doctest? --------- Co-authored-by: Matt <rocketknight1@gmail.com>	2024-04-08 14:21:16 +01:00
amyeroberts	7f9aff910b	Patch fix - don't use safetensors for TF models (#30118 ) * Patch fix - don't use safetensors for TF models * Skip test for TF for now * Update for another test	2024-04-08 13:29:20 +01:00
JINO ROHIT	f5658732d5	fixing issue 30034 - adding data format for run_ner.py (#30088 )	2024-04-08 12:49:59 +01:00
Fanli Lin	d16f0abc3f	[tests] add `require_bitsandbytes` marker (#30116 ) * add bnb flag * move maker * add accelerator maker	2024-04-08 12:49:31 +01:00
Haz Sameen Shahgir	5e673ed2dc	updated examples/pytorch/language-modeling scripts and requirements.txt to require datasets>=2.14.0 (#30120 ) updated requirements.txt and require_version() calls in examples/pytorch/language-modeling to require datasets>=2.14.0	2024-04-08 12:41:28 +01:00
Howard Liberty	836e88caee	Make MLFlow version detection more robust and handles mlflow-skinny (#29957 ) * Make MLFlow version detection more robust and handles mlflow-skinny * Make function name more clear and refactor the logic * Further refactor	2024-04-08 12:20:02 +02:00
Xu Song	a907a903d6	Change log level to warning for num_train_epochs override (#30014 )	2024-04-08 10:36:53 +02:00
vaibhavagg303	1ed93be48a	[Whisper] Computing features on GPU in batch mode for whisper feature extractor. (#29900 ) * add _torch_extract_fbank_features_batch function in feature_extractor_whisper * reformat feature_extraction_whisper.py file * handle batching in single function * add gpu test & doc * add batch test & device in each __call__ * add device arg in doc string --------- Co-authored-by: vaibhav.aggarwal <vaibhav.aggarwal@sprinklr.com>	2024-04-08 10:36:25 +02:00
Cylis	1fc34aa666	doc: Correct spelling mistake (#30107 )	2024-04-08 08:44:05 +01:00
Raushan Turganbay	76fa17c166	Fix whisper kwargs and generation config (#30018 ) * clean-up whisper kwargs * failing test	2024-04-05 21:28:58 +05:00
Yih-Dar	9b5a6450d4	Fix auto tests (#30067 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 17:49:46 +02:00
Kola	d9fa13ce62	Add docstrings and types for MambaCache (#30023 ) * Add docstrings and types for MambaCache * Update src/transformers/models/mamba/modeling_mamba.py * Update src/transformers/models/mamba/modeling_mamba.py * Update src/transformers/models/mamba/modeling_mamba.py * make fixup * import copy in generation_whisper * ruff * Revert "make fixup" This reverts commit c4fedd6f60e3b0f11974a11433bc130478829a5c.	2024-04-05 16:19:54 +02:00
Yih-Dar	b17b54d3dd	Refactor daily CI workflow (#30012 ) * separate jobs * separate jobs * use channel name directly instead of ID * use channel name directly instead of ID * use channel name directly instead of ID --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 15:49:51 +02:00
Michael Benayoun	17cd7a9d28	Fix `torch.fx` symbolic tracing for LLama (#30047 ) * [WIP] fix fx * [WIP] fix fx * [WIP] fix fx * [WIP] fix fx * [WIP] fix fx * Apply changes to other models	2024-04-05 15:14:09 +02:00
Yih-Dar	48795317a2	[test fetcher] Always include the directly related test files (#30050 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 14:30:36 +02:00
miRx923	de11d0bdf0	Update quantizer_bnb_4bit.py: In the ValueError string there should be "....you need to set `llm_int8_enable_fp32_cpu_offload=True`...." instead of "`load_in_8bit_fp32_cpu_offload=True`". (#30013 ) * Update quantizer_bnb_4bit.py There is an mistake in ValueError on line 86 of quantizer_bnb_4bit.py. In the error string there should be "....you need to set `llm_int8_enable_fp32_cpu_offload=True`...." instead of "load_in_8bit_fp32_cpu_offload=True". I think you updated the BitsAndBytesConfig() arguments, but forgot to change the ValueError in quantizer_bnb_4bit.py. * Update quantizer_bnb_4bit.py Changed ValueError string "...you need to set load_in_8bit_fp32_cpu_offload=True..." to "....you need to set llm_int8_enable_fp32_cpu_offload=True...."	2024-04-05 14:04:50 +02:00
Marc Sun	4207a4076d	[bnb] Fix offload test (#30039 ) fix bnb test	2024-04-05 13:11:28 +02:00
NielsRogge	1ab7136488	[Trainer] Allow passing image processor (#29896 ) * Add image processor to trainer * Replace tokenizer=image_processor everywhere	2024-04-05 10:10:44 +02:00
Adam Louly	d704c0b698	Fix mixtral ONNX Exporter Issue. (#29858 ) * fix mixtral onnx export * fix qwen model	2024-04-05 09:49:42 +02:00
Wang, Yi	79d62b2da2	if output is tuple like facebook/hf-seamless-m4t-medium, waveform is … (#29722 ) * if output is tuple like facebook/hf-seamless-m4t-medium, waveform is the first element Signed-off-by: Wang, Yi <yi.a.wang@intel.com> * add test and fix batch issue Signed-off-by: Wang, Yi <yi.a.wang@intel.com> * add dict output support for seamless_m4t Signed-off-by: Wang, Yi <yi.a.wang@intel.com> --------- Signed-off-by: Wang, Yi <yi.a.wang@intel.com>	2024-04-05 09:26:44 +02:00
Yih-Dar	8b52fa6b42	skip `test_encode_decode_fast_slow_all_tokens` for now (#30044 ) skip test_encode_decode_fast_slow_all_tokens for now Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 09:07:41 +02:00
Yih-Dar	24d787ce9d	Add `whisper` to `IMPORTANT_MODELS` (#30046 ) Add whisper Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-05 09:06:40 +02:00
Saurabh Dash	517a3e670d	Refactor Cohere Model (#30027 ) * changes * addressing comments * smol fix	2024-04-04 12:46:20 +02:00
byi8220	75b76a5ea4	[`ProcessingIdefics`] Attention mask bug with padding (#29449 ) * Defaulted IdeficsProcessor padding to 'longest', removed manual padding * make fixup * Defaulted processor call to padding=False * Add padding to processor call in IdeficsModelIntegrationTest as well * Defaulted IdeficsProcessor padding to 'longest', removed manual padding * make fixup * Defaulted processor call to padding=False * Add padding to processor call in IdeficsModelIntegrationTest as well * redefaulted padding=longest again * fixup/doc	2024-04-04 10:11:09 +01:00
byi8220	4e6c5eb045	Add a converter from mamba_ssm -> huggingface mamba (#29705 ) * implement convert_mamba_ssm_checkpoint_to_pytorch * Add test test_model_from_mamba_ssm_conversion * moved convert_ssm_config_to_hf_config to inside mamba_ssm_available check * fix skipif clause * moved skips to inside test since skipif decorator isn't working for some reason * Added validation * removed test * fixup * only compare logits * remove weight rename * Update src/transformers/models/mamba/convert_mamba_ssm_checkpoint_to_pytorch.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-04 09:29:32 +01:00
Jacky Lee	03732dea60	Enable multi-device for efficientnet (#29989 ) feat: enable mult-idevice for efficientnet	2024-04-03 20:54:34 +01:00

16108 Commits