16108 Commits

Author SHA1 Message Date
Matthew Hoffman
b7d002bdff
Add str to TrainingArguments report_to type hint (#30078)
* Add str to TrainingArguments report_to type hint

* Swap order in Union

* Merge Optional into Union

https://github.com/huggingface/transformers/pull/30078#issuecomment-2042227546
2024-04-10 14:42:00 +01:00
Fanli Lin
185463784e
[tests] make 2 tests device-agnostic (#30008)
add torch device
2024-04-10 14:46:39 +02:00
Marc Sun
bb76f81e40
[CI] Quantization workflow fix (#30158)
* fix workflow

* call ci

* Update .github/workflows/self-scheduled-caller.yml

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-04-10 11:51:06 +02:00
Pavel Iakubovskii
56d001b26f
Fix and simplify semantic-segmentation example (#30145)
* Remove unused augmentation

* Fix pad_if_smaller() and remove unused augmentation

* Add indentation

* Fix requirements

* Update dataset use instructions

* Replace transforms with albumentations

* Replace identity transform with None

* Fixing formatting

* Fixed comment place
2024-04-10 09:10:52 +01:00
Raushan Turganbay
41579763ee
Fix length related warnings in speculative decoding (#29585)
* avoid generation length warning

* add tests

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* add tests and minor fixes

* refine `min_new_tokens`

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* add method to prepare length arguments

* add test for min length

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* fix variable naming

* empty commit for tests

* trigger tests (empty)

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-04-10 12:45:07 +05:00
Marc Sun
6cdbd73e01
[CI] Fix setup (#30147)
* [CI] fix setup

* fix

* test

* Revert "test"

This reverts commit 7df416d450.
2024-04-09 18:10:00 +02:00
Steven Liu
21e23ffca7
[docs] Fix image segmentation guide (#30132)
fixes
2024-04-09 09:08:37 -07:00
Marc Sun
58a939c6b7
Fix quantization tests (#29914)
* revert back to torch 2.1.1

* run test

* switch to torch 2.2.1

* update dockerfile

* fix awq tests

* fix test

* run quanto tests

* update tests

* split quantization tests

* fix

* fix again

* final fix

* fix report artifact

* build docker again

* Revert "build docker again"

This reverts commit 399a5f9d93.

* debug

* revert

* style

* new notification system

* testing notification

* rebuild docker

* fix_prev_ci_results

* typo

* remove warning

* fix typo

* fix artifact name

* debug

* issue fixed

* debug again

* fix

* fix time

* test notif with failing test

* typo

* issues again

* final fix ?

* run all quantization tests again

* remove name to clear space

* revert modification done on workflow

* fix

* build docker

* build only quant docker

* fix quantization ci

* fix

* fix report

* better quantization_matrix

* add print

* revert to the basic one
2024-04-09 17:10:29 +02:00
Yih-Dar
6487e9b370
Send headers when converting safetensors (#30144)
Co-authored-by: Wauplin <lucainp@gmail.com>
2024-04-09 17:03:36 +02:00
Yih-Dar
08a194fcd6
Fix slow tests for important models to be compatible with A10 runners (#29905)
* fix mistral and mixtral

* add pdb

* fix mixtral test

* fix

* fix mistral ?

* add fix gemma

* fix mistral

* fix

* test

* another test

* fix

* fix

* fix mistral tests

* fix them again

* final fixes for mistral

* fix padding right

* fix whisper fa2

* fix

* fix

* fix gemma

* test

* fix llama

* fix

* fix

* fix llama gemma

* add class attribute

* fix CI

* clarify whisper

* compute_capability

* rename names in some comments

* Add   # fmt: skip

* make style

* Update tests/models/mistral/test_modeling_mistral.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update

* update

---------

Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-04-09 13:28:54 +02:00
NielsRogge
e9c23fa056
[Trainer] Undo #29896 (#30129)
* Undo

* Use tokenizer

* Undo data collator
2024-04-09 12:55:42 +02:00
NielsRogge
ba1b24e07b
[Trainer] Fix default data collator (#30142)
* Fix data collator

* Support feature extractors as well
2024-04-09 12:52:50 +02:00
Matt
ec59a42192
Revert workaround for TF safetensors loading (#30128)
* See if we can get tests to pass with the fixed weights

* See if we can get tests to pass with the fixed weights

* Replace the revisions now that we don't need them anymore
2024-04-09 11:04:18 +01:00
Raushan Turganbay
841e87ef4f
Fix docs Pop2Piano (#30140)
fix copies
2024-04-09 14:58:02 +05:00
Matthew Hoffman
af4c02622b
Add datasets.Dataset to Trainer's train_dataset and eval_dataset type hints (#30077)
* Add datasets.Dataset to Trainer's train_dataset and eval_dataset type hints

* Add is_datasets_available check for importing datasets under TYPE_CHECKING guard

https://github.com/huggingface/transformers/pull/30077/files#r1555939352
2024-04-09 09:26:15 +01:00
Sourab Mangrulkar
4e3490f79b
Fix failing DeepSpeed model zoo tests (#30112)
* fix sequence length errors

* fix label column name error for vit

* fix the lm_head embedding!=linear layer mismatches for Seq2Seq models
2024-04-09 12:01:47 +05:30
Jonathan Tow
2f12e40822
[StableLm] Add QK normalization and Parallel Residual Support (#29745)
* init: add StableLm 2 support

* add integration test for parallel residual and qk layernorm

* update(modeling): match qk norm naming for consistency with phi/persimmon

* fix(tests): run fwd/bwd on random init test model to jitter norm weights off identity

* `use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward`

* refactor: rename head states var in `StableLmLayerNormPerHead`

* tests: update test model and add generate check
2024-04-08 23:51:58 +02:00
Felix Hirwa Nshuti
8c00b53eb0
Adding mps as device for Pipeline class (#30080)
* adding env variable for mps and is_torch_mps_available for Pipeline

* fix linting errors

* Remove environment override

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-08 18:07:30 +01:00
DrAnaximandre
7afade2086
Fix typo at ImportError (#30090)
fix typo at ImportError
2024-04-08 17:45:21 +01:00
fxmarty
ef38e2a7e5
Make vitdet jit trace complient (#30065)
* remove controlflows

* style

* rename patch_ to padded_ following review comment

* style
2024-04-08 23:10:06 +08:00
Younes Belkada
a71def025c
Trainer / Core : Do not change init signature order (#30126)
* Update trainer.py

* fix copies
2024-04-08 16:57:38 +02:00
fxmarty
1897874edc
Fix falcon with SDPA, alibi but no passed mask (#30123)
* fix falcon without attention_mask & alibi

* add test

* Update tests/models/falcon/test_modeling_falcon.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-08 22:25:07 +08:00
Anton Vlasjuk
1773afcec3
fix learning rate display in trainer when using galore optimizer (#30085)
fix learning rate display issue in galore optimizer
2024-04-08 14:54:12 +01:00
Nick Doiron
08c8443307
Accept token in trainer.push_to_hub() (#30093)
* pass token to trainer.push_to_hub

* fmt

* Update src/transformers/trainer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* pass token to create_repo, update_folder

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-08 14:51:11 +01:00
Utkarsha Gupte
0201f6420b
[#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888)
* ImportError: Trainer with PyTorch requires accelerate>=0.20.1 Fix

Adding the evaluate and accelerate installs at the beginning of the cell to fix the issue

* ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1

* Import Error Fix

* Update installation.md

* Update quicktour.md

* rollback other lang changes

* Update _config.py

* updates for other languages

* fixing error

* Tutorial Update

* Update tokenization_utils_base.py

* Just use an optimizer string to pass the doctest?

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2024-04-08 14:21:16 +01:00
amyeroberts
7f9aff910b
Patch fix - don't use safetensors for TF models (#30118)
* Patch fix - don't use safetensors for TF models

* Skip test for TF for now

* Update for another test
2024-04-08 13:29:20 +01:00
JINO ROHIT
f5658732d5
fixing issue 30034 - adding data format for run_ner.py (#30088)
2024-04-08 12:49:59 +01:00
Fanli Lin
d16f0abc3f
[tests] add require_bitsandbytes marker (#30116)
* add bnb flag

* move marker

* add accelerator marker
2024-04-08 12:49:31 +01:00
Haz Sameen Shahgir
5e673ed2dc
updated examples/pytorch/language-modeling scripts and requirements.txt to require datasets>=2.14.0 (#30120)
updated requirements.txt and require_version() calls in examples/pytorch/language-modeling to require datasets>=2.14.0
2024-04-08 12:41:28 +01:00
Howard Liberty
836e88caee
Make MLFlow version detection more robust and handles mlflow-skinny (#29957)
* Make MLFlow version detection more robust and handles mlflow-skinny

* Make function name more clear and refactor the logic

* Further refactor
2024-04-08 12:20:02 +02:00
Xu Song
a907a903d6
Change log level to warning for num_train_epochs override (#30014)
2024-04-08 10:36:53 +02:00
vaibhavagg303
1ed93be48a
[Whisper] Computing features on GPU in batch mode for whisper feature extractor. (#29900)
* add _torch_extract_fbank_features_batch function in feature_extractor_whisper

* reformat feature_extraction_whisper.py file

* handle batching in single function

* add gpu test & doc

* add batch test & device in each __call__

* add device arg in doc string

---------

Co-authored-by: vaibhav.aggarwal <vaibhav.aggarwal@sprinklr.com>
2024-04-08 10:36:25 +02:00
Cylis
1fc34aa666
doc: Correct spelling mistake (#30107)
2024-04-08 08:44:05 +01:00
Raushan Turganbay
76fa17c166
Fix whisper kwargs and generation config (#30018)
* clean-up whisper kwargs

* failing test
2024-04-05 21:28:58 +05:00
Yih-Dar
9b5a6450d4
Fix auto tests (#30067)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-05 17:49:46 +02:00
Kola
d9fa13ce62
Add docstrings and types for MambaCache (#30023)
* Add docstrings and types for MambaCache

* Update src/transformers/models/mamba/modeling_mamba.py

* Update src/transformers/models/mamba/modeling_mamba.py

* Update src/transformers/models/mamba/modeling_mamba.py

* make fixup

* import copy in generation_whisper

* ruff

* Revert "make fixup"

This reverts commit c4fedd6f60e3b0f11974a11433bc130478829a5c.
2024-04-05 16:19:54 +02:00
Yih-Dar
b17b54d3dd
Refactor daily CI workflow (#30012)
* separate jobs

* separate jobs

* use channel name directly instead of ID

* use channel name directly instead of ID

* use channel name directly instead of ID

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-05 15:49:51 +02:00
Michael Benayoun
17cd7a9d28
Fix torch.fx symbolic tracing for LLama (#30047)
* [WIP] fix fx

* [WIP] fix fx

* [WIP] fix fx

* [WIP] fix fx

* [WIP] fix fx

* Apply changes to other models
2024-04-05 15:14:09 +02:00
Yih-Dar
48795317a2
[test fetcher] Always include the directly related test files (#30050)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-05 14:30:36 +02:00
miRx923
de11d0bdf0
Update quantizer_bnb_4bit.py: In the ValueError string there should be "....you need to set llm_int8_enable_fp32_cpu_offload=True...." instead of "load_in_8bit_fp32_cpu_offload=True". (#30013)
* Update quantizer_bnb_4bit.py

There is a mistake in the ValueError on line 86 of quantizer_bnb_4bit.py. In the error string there should be "....you need to set `llm_int8_enable_fp32_cpu_offload=True`...." instead of "load_in_8bit_fp32_cpu_offload=True". I think you updated the BitsAndBytesConfig() arguments, but forgot to change the ValueError in quantizer_bnb_4bit.py.

* Update quantizer_bnb_4bit.py

Changed ValueError string "...you need to set load_in_8bit_fp32_cpu_offload=True..." to "....you need to set llm_int8_enable_fp32_cpu_offload=True...."
2024-04-05 14:04:50 +02:00
Marc Sun
4207a4076d
[bnb] Fix offload test (#30039)
fix bnb test
2024-04-05 13:11:28 +02:00
NielsRogge
1ab7136488
[Trainer] Allow passing image processor (#29896)
* Add image processor to trainer

* Replace tokenizer=image_processor everywhere
2024-04-05 10:10:44 +02:00
Adam Louly
d704c0b698
Fix mixtral ONNX Exporter Issue. (#29858)
* fix mixtral onnx export

* fix qwen model
2024-04-05 09:49:42 +02:00
Wang, Yi
79d62b2da2
if output is tuple like facebook/hf-seamless-m4t-medium, waveform is … (#29722)
* if output is tuple like facebook/hf-seamless-m4t-medium, waveform is the first element

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* add test and fix batch issue

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* add dict output support for seamless_m4t

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
2024-04-05 09:26:44 +02:00
Yih-Dar
8b52fa6b42
skip test_encode_decode_fast_slow_all_tokens for now (#30044)
skip test_encode_decode_fast_slow_all_tokens for now

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-05 09:07:41 +02:00
Yih-Dar
24d787ce9d
Add whisper to IMPORTANT_MODELS (#30046)
Add whisper

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-05 09:06:40 +02:00
Saurabh Dash
517a3e670d
Refactor Cohere Model (#30027)
* changes

* addressing comments

* smol fix
2024-04-04 12:46:20 +02:00
byi8220
75b76a5ea4
[ProcessingIdefics] Attention mask bug with padding (#29449)
* Defaulted IdeficsProcessor padding to 'longest', removed manual padding

* make fixup

* Defaulted processor call to padding=False

* Add padding to processor call in IdeficsModelIntegrationTest as well

* Defaulted IdeficsProcessor padding to 'longest', removed manual padding

* make fixup

* Defaulted processor call to padding=False

* Add padding to processor call in IdeficsModelIntegrationTest as well

* redefaulted padding=longest again

* fixup/doc
2024-04-04 10:11:09 +01:00
byi8220
4e6c5eb045
Add a converter from mamba_ssm -> huggingface mamba (#29705)
* implement convert_mamba_ssm_checkpoint_to_pytorch

* Add test test_model_from_mamba_ssm_conversion

* moved convert_ssm_config_to_hf_config to inside mamba_ssm_available check

* fix skipif clause

* moved skips to inside test since skipif decorator isn't working for some reason

* Added validation

* removed test

* fixup

* only compare logits

* remove weight rename

* Update src/transformers/models/mamba/convert_mamba_ssm_checkpoint_to_pytorch.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* nits

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-04 09:29:32 +01:00
Jacky Lee
03732dea60
Enable multi-device for efficientnet (#29989)
feat: enable multi-device for efficientnet
2024-04-03 20:54:34 +01:00