Commit Graph

14923 Commits

Author SHA1 Message Date
Arthur
8189977885
[Core Tokenization] Support a fix for spm fast models (#26678)
* fix

* last attempt

* current work

* fix forward compatibility

* save all special tokens

* current state

* revert additional changes

* updates

* remove tokenizer.model

* add a test and the fix

* nit

* revert one more break

* fix typefield issue

* quality

* more tests

* fix fields for FC

* more nits?

* new additional changes

* how

* some updates

* the fix

* where do we stand

* nits

* nits

* revert unrelated changes

* nits nits nits

* styling

* don't break llama just yet

* revert llama changes

* safe arg check

* fixup

* Add a test for T5

* Necessary changes

* Tests passing, added tokens need to not be normalized. If the added tokens are normalized, it will trigger the stripping, which seems to be unwanted for normal functioning

* Add even more tests, when normalization is set to True (which does not work 😓 )

* Add even more tests, when normalization is set to True (which does not work 😓 )

* Update to main

* nits

* fmt

* more and more test

* comments

* revert change as tests are failing

* make the test more readable

* nits

* refactor the test

* nit

* updates

* simplify

* style

* style

* style convert slow

* Update src/transformers/convert_slow_tokenizer.py
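For the "added tokens need to not be normalized" point above, a minimal illustration (not the PR's code) of the flag in question, assuming the usual `tokenizers.AddedToken` interface; the model id is just an example:

```
from tokenizers import AddedToken
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
# normalized=False keeps the normalizer (e.g. SentencePiece prefix-space handling /
# stripping) away from the added token, which is the behaviour the tests above expect.
tokenizer.add_tokens([AddedToken("<new_token>", normalized=False)])
print(tokenizer.tokenize("hello <new_token> world"))
```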
2024-01-18 12:31:54 +01:00
Yih-Dar
a1668cc72e
Use weights_only only if torch >= 1.13 (#28506)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
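A minimal sketch of the version gate the title describes, assuming the goal is to pass `weights_only` to `torch.load` only when the installed torch supports it (the PR's actual call site is not shown here):

```
import torch
from packaging import version

load_kwargs = {}
if version.parse(torch.__version__) >= version.parse("1.13"):
    # torch.load only accepts weights_only from PyTorch 1.13 onwards
    load_kwargs["weights_only"] = True
state_dict = torch.load("pytorch_model.bin", map_location="cpu", **load_kwargs)
```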
2024-01-18 10:55:29 +00:00
Yih-Dar
3005f96552
Save Processor (#27761)
* save processor

* Update tests/models/auto/test_processor_auto.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/test_processing_common.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-18 10:21:45 +00:00
Ahmed Elnaggar
98dda8ed03
Fix Switch Transformers When sparse_step = 1 (#28564)
Fix sparse_step = 1

In case sparse_step = 1, the current code will not work.
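A small arithmetic illustration of why sparse_step = 1 breaks a modulo-based layer selection; the exact expression in modeling_switch_transformers.py may differ, so treat this as a hedged sketch:

```
sparse_step = 1
num_layers = 6

# i % 1 is always 0, so "i % sparse_step == 1" never marks a layer as sparse.
broken = [i % sparse_step == 1 for i in range(num_layers)]
print(broken)  # [False, False, False, False, False, False]

# Special-casing sparse_step == 1 makes every layer sparse, as intended.
fixed = [(i % sparse_step == 1) or sparse_step == 1 for i in range(num_layers)]
print(fixed)   # [True, True, True, True, True, True]
```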
2024-01-17 21:26:21 +00:00
Lucas Thompson
fa6d12f74f
Allow to train dinov2 with different dtypes like bf16 (#28504)
I want to train dinov2 with bf16 but I get the following error in bc72b4e2cd/src/transformers/models/dinov2/modeling_dinov2.py (L635):

```
RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same
```

Since the input dtype is torch.float32, the parameter dtype has to be torch.float32...

@LZHgrla and I checked the code of clip vision encoder and found there is an automatic dtype transformation (bc72b4e2cd/src/transformers/models/clip/modeling_clip.py (L181-L182)).

So I added a similar automatic dtype transformation to modeling_dinov2.py.
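A sketch of the CLIP-style cast referenced above, applied to a patch-embedding forward; the class and attribute names are illustrative and the actual dinov2 change may differ in detail:

```
import torch
import torch.nn as nn

class PatchEmbeddings(nn.Module):
    def __init__(self, num_channels=3, hidden_size=768, patch_size=14):
        super().__init__()
        self.projection = nn.Conv2d(num_channels, hidden_size, kernel_size=patch_size, stride=patch_size)

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        # Cast the input to the weight dtype (e.g. bfloat16) so Conv2d does not raise
        # "Input type (float) and bias type (c10::BFloat16) should be the same".
        target_dtype = self.projection.weight.dtype
        embeddings = self.projection(pixel_values.to(dtype=target_dtype))
        return embeddings.flatten(2).transpose(1, 2)
```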
2024-01-17 19:03:08 +00:00
fxmarty
2c1eebc121
Fix SDPA tests (#28552)
* skip bf16 test if not supported by device

* fix

* fix bis

* use is_torch_bf16_available_on_device

* use is_torch_fp16_available_on_device

* fix & use public llama

* use 1b model

* fix flaky test

---------

Co-authored-by: Your Name <you@example.com>
2024-01-17 17:29:18 +01:00
Junyang Lin
d6ffe74dfa
Add qwen2 (#28436)
* add config, modeling, and tokenization

* add auto and init

* update readme

* update readme

* update team name

* fixup

* fixup

* update config

* update code style

* update for fixup

* update for fixup

* update for fixup

* update for testing

* update for testing

* fix bug for config and tokenization

* fix bug for bos token

* not doctest

* debug tokenizer

* not doctest

* debug tokenization

* debug init for tokenizer

* fix style

* update init

* delete if in token auto

* add tokenizer doc

* add tokenizer in init

* Update dummy_tokenizers_objects.py

* update

* update

* debug

* Update tokenization_qwen2.py

* debug

* Update convert_slow_tokenizer.py

* add copies

* add copied from and make style

* update files map

* update test

* fix style

* fix merge reading and update tests

* fix tests

* fix tests

* fix style

* debug a variable in readme

* Update src/transformers/models/qwen2/configuration_qwen2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update test and copied from

* fix style

* update qwen2 tokenization  and tests

* Update tokenization_qwen2.py

* delete the copied from after property

* fix style

* update tests

* update tests

* add copied from

* fix bugs

* update doc

* add warning for sliding window attention

* update qwen2 tokenization

* fix style

* Update src/transformers/models/qwen2/modeling_qwen2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix tokenizer fast

---------

Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-17 16:02:22 +01:00
Gustavo de Rosa
d93ef7d751
Fixes default value of softmax_scale in PhiFlashAttention2. (#28537)
* fix(phi): Phi does not use softmax_scale in Flash-Attention.

* chore(docs): Update Phi docs.
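For context, a hedged sketch of what "does not use softmax_scale" means at the flash-attn call site: passing softmax_scale=None makes the kernel fall back to the standard 1/sqrt(head_dim) scaling (toy tensors below; requires the flash-attn package and a CUDA GPU):

```
import torch
from flash_attn import flash_attn_func

# toy shapes: (batch, seq_len, num_heads, head_dim)
q = torch.randn(1, 8, 4, 64, dtype=torch.bfloat16, device="cuda")
k = torch.randn(1, 8, 4, 64, dtype=torch.bfloat16, device="cuda")
v = torch.randn(1, 8, 4, 64, dtype=torch.bfloat16, device="cuda")

# softmax_scale=None -> 1 / sqrt(head_dim) inside the kernel
out = flash_attn_func(q, k, v, dropout_p=0.0, softmax_scale=None, causal=True)
```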
2024-01-17 14:22:44 +01:00
fxmarty
a6adc05e6b
symbolic_trace: add past_key_values, llama, sdpa support (#28447)
* torch.fx: add pkv, llama, sdpa support

* Update src/transformers/models/opt/modeling_opt.py

* remove spaces

* trigger ci

* use explicit variable names
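A hedged usage sketch of the tracing path the title describes; the tiny model id and the inclusion of past_key_values among input_names are assumptions for illustration:

```
from transformers import AutoModelForCausalLM
from transformers.utils.fx import symbolic_trace

model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-LlamaForCausalLM")
# trace with past_key_values among the inputs, per the support added in this PR
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask", "past_key_values"])
print(type(traced))  # a torch.fx.GraphModule
```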
2024-01-17 11:50:53 +01:00
Patrick von Platen
09eb11a1bd
[Makefile] Exclude research projects from format (#28551) 2024-01-17 11:59:40 +02:00
Joao Gante
f4f57f9dfa
Config: warning when saving generation kwargs in the model config (#28514) 2024-01-16 18:31:01 +00:00
inisis
7142bdfa90
Add is_model_supported for fx (#28521)
* modify check_if_model_is_supported to return bool

* add is_model_supported and have check_if_model_is_supported use that

* Update src/transformers/utils/fx.py

Fantastic

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
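The refactor reads as a boolean predicate wrapped by a raising checker; a generic sketch of that pattern (the real transformers.utils.fx internals differ, and the registry below is hypothetical):

```
_SUPPORTED_MODELS = ("LlamaForCausalLM", "OPTForCausalLM")  # hypothetical registry

def is_model_supported(model) -> bool:
    return model.__class__.__name__ in _SUPPORTED_MODELS

def check_if_model_is_supported(model) -> None:
    if not is_model_supported(model):
        raise NotImplementedError(f"{model.__class__.__name__} is not supported for FX tracing.")
```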
2024-01-16 17:52:44 +00:00
fxmarty
02f8738ef8
Clearer error for SDPA when explicitly requested (#28006)
* clearer error for sdpa

* better message
2024-01-16 16:10:44 +00:00
Arthur
fe23256b73
[SpeechT5Tokenization] Add copied from and fix the convert_tokens_to_string to match the fast decoding scheme (#28522)
* Add copied from and fix the `convert_tokens_to_string` to match the fast decoding scheme

* fixup

* add a small test

* style test file

* nits
2024-01-16 16:50:02 +01:00
Arthur
96d0883103
[TokenizationRoformerFast] Fix the save and loading (#28527)
* cleanup

* add a test

* update the test

* style

* revert part that allows to pickle the tokenizer
2024-01-16 16:37:15 +01:00
Arthur
716df5fb7e
[TokenizationUtils] Fix add_special_tokens when the token is already there (#28520)
* fix adding special tokens when the token is already there.

* add a test

* add a test

* nit

* fix the test: make sure the order is preserved

* Update tests/test_tokenization_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-16 16:36:29 +01:00
Nima Yaqmuri
07ae53e6e7
Fix/speecht5 bug (#28481)
* Fix bug in SpeechT5 speech decoder prenet's forward method

- Removed redundant `repeat` operation on speaker_embeddings in the forward method. This line was erroneously duplicating the embeddings, leading to incorrect input size for concatenation and performance issues.
- Maintained original functionality of the method, ensuring the integrity of the speech decoder prenet's forward pass remains intact.
- This change resolves a critical bug affecting the model's performance in handling speaker embeddings.

* Refactor SpeechT5 text to speech integration tests

- Updated SpeechT5ForTextToSpeechIntegrationTests to accommodate the variability in sequence lengths due to dropout in the speech decoder pre-net. This change ensures that our tests are robust against random variations in generated speech, enhancing the reliability of our test suite.
- Removed hardcoded dimensions in test assertions. Replaced with dynamic checks based on model configuration and seed settings, ensuring tests remain valid across different runs and configurations.
- Added new test cases to thoroughly validate the shapes of generated spectrograms and waveforms. These tests leverage seed settings to ensure consistent and predictable behavior in testing, addressing potential issues in speech generation and vocoder processing.
- Fixed existing test cases where incorrect assumptions about output shapes led to potential errors.

* Fix bug in SpeechT5 speech decoder prenet's forward method

- Removed redundant `repeat` operation on speaker_embeddings in the forward method. This line was erroneously duplicating the embeddings, leading to incorrect input size for concatenation and performance issues.
- Maintained original functionality of the method, ensuring the integrity of the speech decoder prenet's forward pass remains intact.
- This change resolves a critical bug affecting the model's performance in handling speaker embeddings.

* Refactor SpeechT5 text to speech integration tests

- Updated SpeechT5ForTextToSpeechIntegrationTests to accommodate the variability in sequence lengths due to dropout in the speech decoder pre-net. This change ensures that our tests are robust against random variations in generated speech, enhancing the reliability of our test suite.
- Removed hardcoded dimensions in test assertions. Replaced with dynamic checks based on model configuration and seed settings, ensuring tests remain valid across different runs and configurations.
- Added new test cases to thoroughly validate the shapes of generated spectrograms and waveforms. These tests leverage seed settings to ensure consistent and predictable behavior in testing, addressing potential issues in speech generation and vocoder processing.
- Fixed existing test cases where incorrect assumptions about output shapes led to potential errors.

* Enhance handling of speaker embeddings in SpeechT5

- Refined the generate and generate_speech functions in the SpeechT5 class to robustly handle two scenarios for speaker embeddings: matching the batch size (one embedding per sample) and one-to-many (a single embedding for all samples in the batch).
- The update includes logic to repeat the speaker embedding when a single embedding is provided for multiple samples, and a ValueError is raised for any mismatched dimensions.
- Also added corresponding test cases to validate both scenarios, ensuring complete coverage and functionality for diverse speaker embedding situations.

* Improve Test Robustness with Randomized Speaker Embeddings
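A hedged sketch (not the actual generate_speech code) of the two speaker-embedding cases described above: one embedding per sample, or a single embedding broadcast to the batch, with a ValueError otherwise:

```
import torch

def align_speaker_embeddings(speaker_embeddings: torch.Tensor, batch_size: int) -> torch.Tensor:
    if speaker_embeddings.size(0) == batch_size:
        return speaker_embeddings                        # one embedding per sample
    if speaker_embeddings.size(0) == 1:
        return speaker_embeddings.repeat(batch_size, 1)  # broadcast a single embedding
    raise ValueError(
        f"speaker_embeddings batch dim ({speaker_embeddings.size(0)}) must be 1 or {batch_size}."
    )

print(align_speaker_embeddings(torch.randn(1, 512), batch_size=4).shape)  # torch.Size([4, 512])
```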
2024-01-16 14:14:28 +00:00
fxmarty
66db33ddc8
Fix mismatching loading in from_pretrained with/without accelerate (#28414)
* fix mismatching behavior in from_pretrained with/without accelerate

* meaningful refactor

* remove added space

* add test

* fix model on the hub

* comment

* use tiny model

* style
2024-01-16 14:29:51 +01:00
Hamza FILALI
002566f398
Improving Training Performance and Scalability Documentation (#28497)
* Improving the Training Performance and Scalability documentation by adding PEFT techniques to the suggestions for reducing memory requirements during training

* Update docs/source/en/perf_train_gpu_one.md

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-01-16 11:30:26 +01:00
regisss
0cdcd7a2b3
Remove task arg in load_dataset in image-classification example (#28408)
* Remove `task` arg in `load_dataset` in image-classification example

* Manage case where "train" is not in dataset

* Add new args to manage image and label column names

* Similar to audio-classification example

* Fix README

* Update tests
2024-01-16 08:04:08 +01:00
amyeroberts
edb170238f
SiLU activation wrapper for safe importing (#28509)
Add back in wrapper for safe importing
2024-01-15 19:36:59 +00:00
Timothy Cronin
ff86bc364d
improve dev setup comments and hints (#28495)
* improve dev setup comments and hints

* fix tests for new dev setup hints
2024-01-15 18:36:40 +00:00
Boris Dayma
735968b61c
fix: sampling in flax keeps EOS (#28378) 2024-01-15 18:12:09 +00:00
Joao Gante
7e0ddf89f4
Generate: consolidate output classes (#28494) 2024-01-15 17:04:08 +00:00
Matt
72db39c065
Add a use_safetensors arg to TFPreTrainedModel.from_pretrained() (#28511)
* Add a use_safetensors arg to TFPreTrainedModel.from_pretrained()

* One more catch!

* One more one more catch
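A hedged usage example of the new argument named in the title; the model id is illustrative, and whether safetensors weights exist for a given repo varies:

```
from transformers import TFAutoModel

# Ask from_pretrained to prefer safetensors weights when loading.
model = TFAutoModel.from_pretrained("bert-base-uncased", use_safetensors=True)
```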
2024-01-15 17:00:54 +00:00
Rishit Ratna
78d767e3c8
Fixed minor typos (#28489) 2024-01-15 16:45:15 +00:00
Marc Sun
7c8dd88d13
[GPTQ] Fix test (#28018)
* fix test

* reduce length

* smaller model
2024-01-15 11:22:54 -05:00
thedamnedrhino
366c03271e
Tokenizer kwargs in text generation pipe (#28362)
* added args to the pipeline

* added test

* more sensical tests

* fixup

* docs

* typo
;

* docs

* made changes to support named args

* fixed test

* docs update

* styles

* docs

* docs
2024-01-15 16:52:18 +01:00
yuanwu2017
a573ac74fd
Add the XPU device check for pipeline mode (#28326)
* Add the XPU check for pipeline mode

When setting the XPU device for a pipeline, we need to use is_torch_xpu_available to load IPEX and determine whether the device is available.

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Don't move model to device when hf_device_map isn't None

1. Don't move model to device when hf_device_map is not None
2. The device string may include the device index, so use 'in' instead of an equality check

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Raise the error when xpu is not available

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Update src/transformers/pipelines/base.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/pipelines/base.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Modify the error message

Signed-off-by: yuanwu <yuan.wu@intel.com>

* Change message format.

Signed-off-by: yuanwu <yuan.wu@intel.com>

---------

Signed-off-by: yuanwu <yuan.wu@intel.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
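A hedged sketch combining the two checks described in this message (the actual pipelines/base.py internals differ): skip moving the model when accelerate already placed it via hf_device_map, match the device string with `in` so an index like "xpu:0" still matches, and raise when XPU is requested but unavailable.

```
import torch
from transformers.utils import is_torch_xpu_available

def resolve_pipeline_device(device: str, hf_device_map=None):
    if hf_device_map is not None:
        return None  # accelerate already dispatched the model; don't move it again
    # the device string may carry an index (e.g. "xpu:0"), so check membership, not equality
    if "xpu" in device and not is_torch_xpu_available():
        raise RuntimeError("XPU was requested but no XPU device (IPEX) is available.")
    return torch.device(device)

print(resolve_pipeline_device("cpu"))
```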
2024-01-15 15:39:11 +00:00
Younes Belkada
1b9a2e4c80
[core/ FEAT] Add the possibility to push custom tags using PreTrainedModel itself (#28405)
* v1 tags

* remove unneeded conversion

* v2

* rm unneeded warning

* add more utility methods

* Update src/transformers/utils/hub.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/utils/hub.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/utils/hub.py

Co-authored-by: Lucain <lucainp@gmail.com>

* more enhancements

* oops

* merge tags

* clean up

* revert unneeded change

* add extensive docs

* more docs

* more kwargs

* add test

* oops

* fix test

* Update src/transformers/modeling_utils.py

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update src/transformers/utils/hub.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/modeling_utils.py

* Update src/transformers/trainer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add more conditions

* more logic

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
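A hedged usage sketch of the feature in the title, assuming the helper added here is the add_model_tags method on PreTrainedModel (the exact API surface may differ):

```
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
model.add_model_tags(["my-custom-tag"])        # tags end up in the model card metadata
# model.push_to_hub("my-user/gpt2-tagged")     # requires Hub credentials
```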
2024-01-15 14:48:07 +01:00
Yih-Dar
64bdbd888c
Don't set finetuned_from if it is a local path (#28482)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-15 11:38:20 +01:00
Tom Aarsen
881e966ace
[chore] Update warning text, a word was missing (#28017)
Update warning, a word was missing
2024-01-15 10:08:03 +01:00
Francisco Kurucz
121641cab1
Fix paths to AI Sweden Models reference and model loading (#28423)
Fix URL to AI Sweden Models reference and model loading
2024-01-15 09:09:22 +01:00
Joao Gante
bc72b4e2cd
Generate: fix candidate device placement (#28493)
* fix candidate device

* this line shouldn't have been in
2024-01-13 21:31:25 +01:00
Apoorv Saxena
e304f9769c
Adding Prompt lookup decoding (#27775)
* MVP

* fix ci

* more ci

* remove redundant kwarg

* added and wired up PromptLookupCandidateGenerator

* rebased with main, working

* removed print

* style fixes

* fix test

* fixed tests

* added test for prompt lookup decoding

* fixed circleci

* fixed test issue

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/candidate_generator.py

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
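Independent of the transformers CandidateGenerator API, a standalone sketch of the prompt-lookup idea: propose the tokens that followed the most recent earlier occurrence of the trailing n-gram as draft candidates for assisted generation.

```
def prompt_lookup_candidates(input_ids, max_ngram=3, num_candidates=5):
    for ngram_size in range(max_ngram, 0, -1):
        tail = input_ids[-ngram_size:]
        # search backwards for an earlier occurrence of the trailing n-gram
        for start in range(len(input_ids) - ngram_size - 1, -1, -1):
            if input_ids[start:start + ngram_size] == tail:
                follow = input_ids[start + ngram_size:start + ngram_size + num_candidates]
                if follow:
                    return follow
    return []

tokens = [5, 9, 2, 7, 3, 9, 2]           # toy token ids
print(prompt_lookup_candidates(tokens))  # [7, 3, 9, 2] -> continuation after the earlier "9 2"
```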
2024-01-13 17:15:58 +00:00
Siddartha Naidu
29a2b14206
Change progress logging to once across all nodes (#28373) 2024-01-12 15:01:21 -05:00
Matt
2382706a1c
Fix docstrings and update docstring checker error message (#28460)
* Fix TF Regnet docstring

* Fix TF Regnet docstring

* Make a change to the PyTorch Regnet too to make sure the CI is checking it

* Add skips for TFRegnet

* Update error message for docstring checker
2024-01-12 17:54:11 +00:00
Joao Gante
4fb3d3a0f6
TF: purge TFTrainer (#28483) 2024-01-12 16:56:34 +00:00
Joao Gante
afc45b13ca
Generate: refuse to save bad generation config files (#28477) 2024-01-12 16:01:17 +00:00
Joao Gante
dc01cf9c5e
Docs: add model paths (#28475) 2024-01-12 15:25:43 +00:00
Joao Gante
d026498830
Generate: deprecate old public functions (#28478) 2024-01-12 15:21:15 +00:00
sungho-ham
edb314ae2b
Fix torch.ones usage in xlnet (#28471)
Fix xlnet torch.ones usage

Co-authored-by: sungho-ham <sungho.ham@linecorp.com>
2024-01-12 15:31:00 +01:00
dependabot[bot]
c45ef1c0d1
Bump jinja2 from 2.11.3 to 3.1.3 in /examples/research_projects/decision_transformer (#28457)
Bump jinja2 in /examples/research_projects/decision_transformer

Bumps [jinja2](https://github.com/pallets/jinja) from 2.11.3 to 3.1.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/2.11.3...3.1.3)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-12 15:28:55 +01:00
Younes Belkada
266c67b06a
[Mixtral / Awq] Add mixtral fused modules for Awq (#28240)
* add mixtral fused modules

* add changes from modeling utils

* add test

* fix test + rope theta issue

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-01-12 14:29:35 +01:00
amyeroberts
666a6f078c
Update metadata loading for oneformer (#28398)
* Update metadata loading for oneformer

* Enable loading from a model repo

* Update docstrings

* Fix tests

* Update tests

* Clarify repo_path behaviour
2024-01-12 12:35:31 +00:00
amyeroberts
4e36a6cd00
Mark two logger tests as flaky (#28458)
* Mark two logger tests as flaky

* Add description to is_flaky
2024-01-12 11:58:59 +00:00
Younes Belkada
07bdbebb48
[Awq] Add llava fused modules support (#28239)
* add llava + fused modules

* Update src/transformers/models/llava/modeling_llava.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-12 06:55:54 +01:00
Hankyeol Kyung
995a7ce9a8
Fix broken link on page (#28451)
* [docs] Fix broken link

Signed-off-by: Hankyeol Kyung <kghnkl0103@gmail.com>

* [docs] Use shorter domain

Signed-off-by: Hankyeol Kyung <kghnkl0103@gmail.com>

---------

Signed-off-by: Hankyeol Kyung <kghnkl0103@gmail.com>
2024-01-11 09:26:13 -08:00
Matt
143451355c
Fix docstring checker issues with PIL enums (#28450) 2024-01-11 17:23:41 +00:00
jiqing-feng
19e83d174c
Doc (#28431)
* update version for cpu training

* update docs for cpu training

* fix readme

* fix readme
2024-01-11 08:55:48 -08:00