Commit Graph

16108 Commits

Author SHA1 Message Date
Fanli Lin
63a0c8f1cb
[tests] enable benchmark unit tests on XPU (#29284)
* add xpu for benchmark

* no auto_map

* use require_torch_gpu

* use gpu

* revert

* revert

* fix style
2024-02-27 09:44:48 +00:00
fxmarty
6d3b643e2a
Fix attn_implementation documentation (#29295)
fix
2024-02-27 10:43:01 +01:00
Merve Noyan
83e366bfd4
Image Feature Extraction docs (#28973)
* Image Feature Extraction docs

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update image_feature_extraction.md

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Address comments

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/image_feature_extraction.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update image_feature_extraction.md

* Update image_feature_extraction.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
2024-02-27 09:39:58 +00:00
Andrei Panferov
e3fc90ae68
Cleaner Cache dtype and device extraction for CUDA graph generation for quantizers compatibility (#29079)
* input_layernorm as the beacon of hope

* cleaner dtype extraction

* AQLM + CUDA graph test

* is available check

* shorter text test
2024-02-27 09:32:39 +01:00
regisss
a3f9221a44
Add generate kwargs to VQA pipeline (#29134) 2024-02-27 03:03:00 +01:00
FredericOdermatt
871ba71dfa
GenerationConfig validate both constraints and force_words_ids (#29163)
GenerationConfig validate both options for constrained decoding: constraints and force_words_ids
2024-02-27 01:43:52 +01:00
Eduardo Pacheco
3fcfbe7549
Adding SegGPT (#27735)
* First commit

* Improvements

* More improvements

* Converted original checkpoint to HF checkpoint

* Fix style

* Fixed forward

* More improvements

* More improvements

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Remove asserts

* Remove unnecessary attributes

* Changed model name to camel case

* Improve forward doc

* Improve tests

* More improvements

* Fix copies

* Fix doc

* Make SegGptImageProcessor more flexible

* Added few-shot test

* Fix style

* Update READMEs and docs

* Update READMEs

* Make inputs required

* Add SegGptForImageSegmentation

* Make tests pass

* Rename to out_indicies

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fixed naming convention

* Copying SegGptMlp from modeling_sam.py

* Some minor improvements

* Remove mlp_ratio

* Fix docstrings

* Fixed docstring match

* Objects defined before use

* Storing only patch_size and beta for SegGptLoss

* removed _prepare_inputs method

* Removed modified from headers

* Renamed to output_indicies

* Removed unnecessary einsums

* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fixing issues

* Raise error as soon as possible

* More fixes

* Fix merge

* Added palette to SegGptImageProcessor

* Fixed typo

* Fixed shape typo

* Added permute before doing palette to class mapping

* Fixed style

* Fixed and added tests

* Fixed docstrings

* Matching SegFormer API for post_processing_semantic_segmentation

* Fixed copies

* Fixed SegGptImageProcessor to handle both binary and RGB masks

* Updated docstrings of SegGptImageProcessor

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/seggpt.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/convert_seggpt_to_hf.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/seggpt/test_image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/seggpt/test_modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Object definitions above & fix style

* Renamed output_indices to intermediate_feature_indices

* Removed unnecessary check on bool_masked_pos

* Loss first in the outputs

* Added validation for do_normalize

* Improved SegGptImageProcessor and added new tests

* Added comment

* Added docstrings to SegGptLoss

* Reimplemented ensemble condition logic in SegGptEncoder

* Update src/transformers/models/seggpt/__init__.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/seggpt/convert_seggpt_to_hf.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Updated docstrings to use post_process_semantic_segmentation

* Fixed typo on docstrings

* moved pixel values test to test_image_processing_seggpt

* Addressed comments

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/image_processing_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Updated docstrings for SegGptLoss

* Address comments

* Added SegGpt example to model docs

* Update src/transformers/models/seggpt/modeling_seggpt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* moved patchify and unpatchify

* Rename checkpoint

* Renamed intermediate_features to intermediate_hidden_states for consistency

* Update src/transformers/models/seggpt/configuration_seggpt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Replaced post_process_masks for post_process_semantic_segmentation in the docs

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-26 18:17:19 +00:00
Eduardo Pacheco
3b8c053631
Fixed Deformable Detr typo when loading cuda kernels for MSDA (#29294) 2024-02-26 17:24:30 +00:00
Michael
a44d2dc3a9
[i18n-zh] Translated task/asr.md into Chinese (#29233)
* [zh] Translate a task: asr.md

Signed-off-by: windsonsea <haifeng.yao@daocloud.io>

* apply suggestions from Fan-Lin

---------

Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2024-02-26 08:53:05 -08:00
David Nguyen
c29135046a
[i18n-vi] Translate README.md to Vietnamese (#29229)
* Add Tiếng Việt language support

* Add Vietnamese translation link to README.md

* update README_vi.md
2024-02-26 08:42:46 -08:00
Ming Xu (徐明)
734eb25476
🌐 [i18n-ZH] Translate chat_templating.md into Chinese (#28790)
* [Pix2struct] Simplify generation (#22527)

* Add model to doc tests

* Remove generate and replace by prepare_inputs_for_generation

* More fixes

* Remove print statements

* Update integration tests

* Fix generate

* Remove model from auto mapping

* Use auto processor

* Fix integration tests

* Fix test

* Add inference code snippet

* Remove is_encoder_decoder

* Update docs

* Remove notebook link

* Release: v4.28.0

* Revert (for now) the change on `Deta` in #22437 (#22750)

fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Patch release: v4.28.1

* update zh chat template.

* Update docs/source/zh/chat_templating.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/zh/_toctree.yml

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

* Update docs/source/zh/chat_templating.md

Co-authored-by: Michael <haifeng.yao@daocloud.io>

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Michael <haifeng.yao@daocloud.io>
2024-02-26 08:42:24 -08:00
Michael
b43340455d
[i18n-zh] Translated torchscript.md into Chinese (#29234)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2024-02-26 08:27:47 -08:00
Aaron Jimenez
9f7535bda8
[docs] Spanish translation of tasks_explained.md (#29224)
* Add tasks_explained.md to es/

* Fix little typo in en/ version

* translate speach/audio section

* translate part of vision computer section | fix little typo in en/

* Fix little typo in en/

* Translate vision computer section | remove ** ** to * * in both files

* Translate NLP section | fix link to task/translation in en/

* Updete link in es/tasks_summary.md

* Fix task_summary title link
2024-02-26 08:18:15 -08:00
Raushan Turganbay
8f2f0f0f85
Track each row separately for stopping criteria (#29116) 2024-02-26 16:06:16 +00:00
Joao Gante
ece1b62b93
Generate: v4.38 removals and related updates (#29171) 2024-02-26 13:36:12 +00:00
fxmarty
24d59c7969
Use torch.bool instead of torch.int64 for non-persistant causal mask buffer (#29241)
use torch.bool instead of torch.int64
2024-02-26 14:06:43 +01:00
Merve Noyan
7c4995f93d
Add feature extraction mapping for automatic metadata update (#28944)
* add feature extraction mapping

* added prefix

* ruff check

* minor fix

* Update modeling_auto.py

* fix typo

* remove prefix to make variable public/importable

* Update src/transformers/models/auto/modeling_auto.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fixes

* addressed comments

* nit

* fix-copies

* remove from tests

* this should fix

* Update tests/models/convnextv2/test_modeling_convnextv2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* nits

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-26 10:35:37 +00:00
fxmarty
2a7746c4d1
Add non_device_test pytest mark to filter out non-device tests (#29213)
* add conftest

* fix

* remove deselected
2024-02-26 11:05:49 +01:00
Yih-Dar
93f8617afd
Use DS_DISABLE_NINJA=1 (#29290)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-26 17:41:01 +08:00
Benjamin Muskalla
9fe360883e
Cache is_vision_available result (#29280)
Cache `is_vision_available`

This check is used quite often during process in image models and can take up a serious amount of time compared to the other processing steps.
2024-02-26 09:01:45 +00:00
Yih-Dar
c8d98405a8
Use torch 2.2 for daily CI (model tests) (#29208)
* Use torch 2.2 for daily CI (model tests)

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-23 21:37:08 +08:00
Matt
371b572e55
Allow remote code repo names to contain "." (#29175)
* stash commit

* stash commit

* It works!

* Remove unnecessary change

* We don't actually need the cache_dir!

* Update docstring

* Add test

* Add test with custom cache dir too

* Update model repo path
2024-02-23 12:46:31 +00:00
Arthur
89c64817ce
[Doc] update model doc qwen2 (#29238)
* update model doc qwen2

* Update docs/source/en/model_doc/qwen2.md

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-23 10:43:31 +01:00
Alessandro Palla
3f60d11a87
Improve _update_causal_mask performance (#29210)
* Fix issue 29206

* Fix style
2024-02-23 10:40:44 +01:00
Amin
75ed76ecea
Fix missing translation in README_ru (#29054)
* Fix missing translation in README_ru

* Update README_ru.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

---------

Co-authored-by: Maria Khalusova <kafooster@gmail.com>
2024-02-23 09:26:21 +01:00
cchen-dialpad
4524494072
fix(mlflow): check mlflow version to use the synchronous flag (#29195)
* fix(mlflow): check mlflow version to use the  flag

* fix indent

* add log_params async and fix quality
2024-02-23 09:19:51 +01:00
fxmarty
2cc8cf6ce7
Fix torch.compile with fullgraph=True when attention_mask input is used (#29211)
* fix torch.export.export for llama

* do not change doc title

* make fix copies
2024-02-22 16:40:06 +01:00
NielsRogge
dabe855668
[Mistral, Mixtral] Improve docs (#29084)
* Improve docs

* Improve chat template
2024-02-22 11:48:01 +01:00
Sanchit Gandhi
2a9b1f80c4
[Gemma] Fix eager attention (#29187)
* fix modelling code

* add tests

* fix tests

* add some logit tests

* style

* fix fix
2024-02-22 01:07:52 +01:00
Andrei Panferov
fc37f38915
Add training version check for AQLM quantizer. (#29142)
* training version check

* warn old aqlm

* aqlm 1.0.2 real

* docs
2024-02-21 17:09:36 +01:00
Younes Belkada
ae49b218c3
FIX [Gemma] Fix bad rebase with transformers main (#29170)
fix bad rebase
2024-02-21 14:56:34 +01:00
Arthur
594c1277b2
[ gemma] Adds support for Gemma 💎 (#29167)
* inital commit

* update

* update conversion checkpoint

* update conversion script

* nits

* some fixes

* nits

* merge

* fix permute

* nits

* fix

* nits

* nits

* nits

* fix rope

* fix both rope

* nites

* style

* make sure flax works

* fix flax init code

* fix foward

* nits

* print flax generation out

* current code

* nits

* SIIIIIIIIIIIIIIIIIII

* update

* add new tokenizer

* correct fast tokenizer

* fix conversion

* more comments

* fix modeling and conversion

* nits and nits

* nits testing

* add some tokenization tests

* add some edge cases

* add slow tests and fix them

* fixup

* fix copies for modeling

* fix copies

* add 7B slow tests

* fix

* fix

* fix tests

* make tokenizer cis go green

* styling

* last tokenizer nits

* update jax tests

* fix flax for 7b

* add jit testing 🤗

* cleanups

* isolated nit, inv_freq for rotary_emb.inv_freq

* propagate to jax

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* adjust test

* fix conversion script

* change name

* correct file names

* update conversion script

* Fix bos and eos token ids in the model configuration (#3)

* update modelling

* update conversion script

* add static cache for gemma

* fix sdpa generate

* fix batched

* multiple fixes

* fix FA2

* final fix

* Rename a few missing strings and filenames (#4)

* merge with upstream main

* fix copies

* fix copies

* fix fixup

* fix fixup

* fix

* fix

* final tests

* fix fx gemma tests

* fix fx bf16/fp16 tests

* update slow fx tests

* fx slow tests: one logits, one generation

* move jit test standalone

* Apply suggestions from code review

* nits

* tokenizer updates

* more tokenization updates: custom GemmaSentencepieceExtrator

* style

* Update src/transformers/cache_utils.py

* Update src/transformers/models/gemma/__init__.py

* Update tests/models/gemma/test_modeling_flax_gemma.py

* small nits

* style

* update tokenization test

* fix the rotary embedding

* with style

* fix slow tests

* WARNING this commit might be very important for precisions

* Update tests/models/gemma/test_modeling_flax_gemma.py

* Update src/transformers/models/gemma/configuration_gemma.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/models/gemma/modeling_flax_gemma.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* small nits here and there!

* forgotten nit

* remove on the fly computation of inv_freq

* revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_flax_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_tokenization_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_tokenization_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_tokenization_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_tokenization_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update tests/models/gemma/test_modeling_gemma.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* nit conversion script link

* fix some tests

* add not doctest and pr doctest

* repo consistency

* fix last CIs 🚀

* update all readmes

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-02-21 14:21:28 +01:00
amyeroberts
58245ba6fb
[Maskformer] safely get backbone config (#29166)
Safe getattr
2024-02-21 13:51:15 +01:00
Ekaterina Aidova
1d0ea7abe0
support SDPA Attention in stablelm (#29106)
* support SDPA Attention in stablelm

* add integration test

* add fallback for output_attentions

* Update src/transformers/models/stablelm/modeling_stablelm.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/stablelm/test_modeling_stablelm.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/models/stablelm/modeling_stablelm.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* handle non-contiguous states

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-21 13:12:49 +01:00
fxmarty
cc4a664baa
torch.compile compatibility with generate + static cache (#29114)
* fix compatibility

* working version

* cleanup

* sanity checks

* more sanity

* working version WITH refactor

* working without API change

* cleanup & tests pass

* more cleaning

* fix test

* fix tests

* Update src/transformers/generation/utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* smaller comment

* update comment

* update comment

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-21 12:19:30 +01:00
Joao Gante
3994fa5baf
🚨 Llama: update rope scaling to match static cache changes (#29143) 2024-02-21 09:47:41 +00:00
Arthur Zucker
1a77f07f65 v4.39.dev.0 2024-02-21 15:23:22 +09:00
amyeroberts
e770f0316d
[pipeline] Add pool option to image feature extraction pipeline (#28985)
* Add pool option

* PR comments - error message and exact outputs check
2024-02-20 20:22:08 +00:00
Fernando Pérez-García
c47576ca6e
Fix drop path being ignored in DINOv2 (#29147)
Fix drop path not being used
2024-02-20 17:31:59 +00:00
Gustavo Isturiz
3c00b885b9
Added image_captioning version in es and included in toctree file (#29104)
added image_captioning version in es and included in toctree file
2024-02-20 09:13:15 -08:00
Joao Gante
857fd8eaab
Generate: missing generation config eos token setting in encoder-decoder tests (#29146) 2024-02-20 16:17:51 +00:00
Pablo Montalvo
1c81132e80
Raise unused kwargs image processor (#29063)
* draft processor arg capture

* add missing vivit model

* add new common test for image preprocess signature

* fix quality

* fix up

* add back missing validations

* quality

* move info level to warning for unused kwargs
2024-02-20 16:20:20 +01:00
JB (Don)
b8b16475d4
[Phi] Add support for sdpa (#29108) 2024-02-20 14:33:12 +01:00
Yih-Dar
7688d8df84
Save (circleci) cache at the end of a job (#29141)
nice job

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-20 21:31:36 +08:00
Taylor Jackle Spriggs
ee3af60be0
Add support for fine-tuning CLIP-like models using contrastive-image-text example (#29070)
* add support for siglip and chinese-clip model training with contrastive-image-text example

* codebase fixups
2024-02-20 12:08:31 +00:00
amyeroberts
0996a10077
Revert low cpu mem tie weights (#29135)
* Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)"

This reverts commit 725f4ad1cc.

* Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)"

This reverts commit 4156f517ce.
2024-02-20 12:06:46 +00:00
Arthur
15cfe38942
[Core tokenization] add_dummy_prefix_space option to help with latest issues (#28010)
* add add_dummy_prefix_space option to slow

* checking kwargs might be better. Should be there for all spm tokenizer IMO

* nits

* fix copies

* more copied

* nits

* add prefix space

* nit

* nits

* Update src/transformers/convert_slow_tokenizer.py

* fix inti

* revert wrong styling

* fix

* nits

* style

* updates

* make sure we use slow tokenizer for conversion instead of looking for the decoder

* support llama ast well

* update llama tokenizer fast

* nits

* nits nits nits

* update the doc

* update

* update to fix tests

* skip unrelated tailing test

* Update src/transformers/convert_slow_tokenizer.py

* add proper testing

* test decode as well

* more testing

* format

* fix llama test

* Apply suggestions from code review
2024-02-20 12:50:31 +01:00
Younes Belkada
efdd436663
FIX [PEFT / Trainer ] Handle better peft + quantized compiled models (#29055)
* handle peft + compiled models

* add tests

* fixup

* adapt from suggestions

* clarify comment
2024-02-20 12:45:08 +01:00
Arthur
5e95dcabe1
[cuda kernels] only compile them when initializing (#29133)
* only compile when needed

* fix mra as well

* fix yoso as well

* update

* rempve comment

* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py

* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py

* opps

* Update src/transformers/models/deta/modeling_deta.py

* nit
2024-02-20 12:38:59 +01:00
Joao Gante
a7755d2409
Generate: unset GenerationConfig parameters do not raise warning (#29119) 2024-02-20 11:34:31 +00:00