Aymeric Roucher
84c4b72ee9
Redirect transformers_agents doc to agents ( #31054 )
2024-05-27 10:34:14 +02:00
Pablo Montalvo
bdb9106f24
Paligemma- fix devices and dtype assignments ( #31008 )
...
* fix devices and dtype assignments
* [run-slow]paligemma
2024-05-24 19:02:55 +02:00
Ita Zaporozhets
deba7655e6
Add split special tokens ( #30772 )
...
* seems like `split_special_tokens` is used here
* split special token
* add new line at end of file
* moving split special token test to common tests
* added assertions
* test
* fixup
* add co-author
* passing rest of args to gptsan_japanese, fixing tests
* removing direct comparison of fast and slow models
* adding test support for UDOP and LayoutXLM
* ruff fix
* readd check if slow tokenizer
* modify test to handle bos tokens
* removing commented function
* trigger build
* applying review feedback - updated docstrings, var names, and simplified tests
* ruff fixes
* Update tests/test_tokenization_common.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* applying feedback, comments
* shutil temp directory fix
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain>
Co-authored-by: itazap <itazap@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MacBook-Pro.local>
2024-05-24 08:38:58 -07:00
BHUVAN M
e5103a76cc
added interpolation for vitmae model in pytorch as well as tf. ( #30732 )
...
* added interpolation for vitmae model in pytorch as well as tf.
* Update modeling_vit_mae.py
irreugalr import fixed
* small changes and proper formatting
* changes suggested in review.
* modified decoder interpolate_func
* arguments and docstring fix
* Apply suggestions from code review
doc fixes
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-24 16:20:09 +01:00
Yih-Dar
a3cdff417b
save the list of new model failures ( #31013 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 15:20:25 +02:00
Younes Belkada
658b849aeb
Quantization / TST: Fix remaining quantization tests ( #31000 )
...
* Fix remaining quant tests
* Update test_quanto.py
2024-05-24 14:35:59 +02:00
Lucain
fd3c128040
Fix resume_download future warning ( #31007 )
...
* Fix resume_download future warning
* better like this
* Add regression test
2024-05-24 14:35:40 +02:00
Yih-Dar
acbfaf69cc
allow multi-gpu ( #31011 )
...
* allow multi-gpu
* allow multi-gpu
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 14:20:06 +02:00
Marc Sun
ae87f9797b
FIX / TST: Fix expected results on Mistral AWQ test ( #30971 )
...
fix awq mistral test
2024-05-24 14:06:31 +02:00
Fanli Lin
04c7c176d7
[tests] make test_model_parallelism
device-agnostic ( #30844 )
...
* enable on xpu
* fix style
* add comment and mps
2024-05-24 11:51:51 +01:00
Yixiang Gao
42d8dd8716
Perceiver interpolate position embedding ( #30979 )
...
* add test that currently fails
* test passed
* all perceiver passed
* fixup, style, quality, repo-consistency, all passed
* Apply suggestions from code review: default to False + compute sqrt once only
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix a minor bracket
* replace dim with self._num_channels
* add arguments to the rest preprocessors
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-24 11:13:58 +01:00
Yih-Dar
5855afd1f3
pin uv==0.1.45
( #31006 )
...
* fix
* [push-ci-image]
* run with latest
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 12:00:50 +02:00
Lucain
03935d300d
Do not trigger autoconversion if local_files_only ( #31004 )
2024-05-24 11:00:59 +02:00
Kevin Koehncke
21e259d8c5
Fix training speed regression introduced by "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 ( #26139 )" ( #30988 )
...
* Revert "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (#26139 )"
This reverts commit a7e0ed829c
.
* Instead of reverting commit, wrap indexing in torch.no_grad context
* Apply wrapping in LayoutLMv2
* Add comments explaining reason for no_grad
* Fix code format
---------
Co-authored-by: Kevin Koehncke <kevin.koehncke@uipath.com>
2024-05-24 10:43:44 +02:00
Ita Zaporozhets
7f6e87413f
add prefix space ignored in llama #29625 ( #30964 )
...
* add prefix space ignored in llama #29625
* adding test with add_prefix_space=False
* ruff
---------
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain>
2024-05-24 01:03:00 -07:00
Matthias Gerstgrasser
6657fb5fed
Bugfix: WandbCallback uploads initial model checkpoint ( #30897 )
...
* fix wandb always uploading initial model
* Update comment.
* Optionally log initial model
* Revert "Optionally log initial model"
This reverts commit 9602cc1fad3feaf218f82a7339a194d3d2fbb946.
2024-05-23 20:29:00 +01:00
Yasmin Moslem
6d3d5b1039
Remove deprecated properties in tokenization_nllb.py and tokenization_nllb_fast.py ( #29834 )
...
* Fix typo in tokenization_nllb.py
Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
* Fix typo in tokenization_nllb_fast.py
Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
* Remove deprecated attributes in tokenization_nllb.py
Remove deprecated attributes: `lang_code_to_id`, `fairseq_tokens_to_ids`, `id_to_lang_code`, and `fairseq_ids_to_tokens`
* Remove deprecated attribute in tokenization_nllb_fast.py
Remove deprecated attribute `lang_code_to_id`
* Remove deprecated properties in tokenization_nllb.py
Remove deprecated properties - fix format
* Remove deprecated properties in tokenization_nllb_fast.py
Remove deprecated properties - fix format
* Update test_tokenization_nllb.py
* update test_tokenization_nllb.py
* Update tokenization_nllb.py
* Update test_tokenization_seamless_m4t.py
* Update test_tokenization_seamless_m4t.py
2024-05-23 18:53:26 +02:00
Aritra Roy Gosthipaty
965e98dc54
[Port] TensorFlow implementation of Mistral ( #29708 )
...
* chore: initial commit
* chore: adding imports and inits
* chore: adding the causal and classification code
* chore: adding names to the layers
* chore: using single self attn layer
* chore: built the model and layers
* chore: start with testing
* chore: docstring change, transpose fix
* fix: rotary embedding
* chore: adding cache implementation
* remove unused torch
* chore: fixing the indexing issue
* make fix-copies
* Use modeling_tf_utils.keras
* make fixup
* chore: fixing tests
* chore: adding past key value logic
* chore: adding multi label classfication test
* fix: switching on the built parameters in the layers
* fixing repo consistency
* ruff formats
* style changes
* fix: tf and pt equivalence
* removing returns from docstrings
* fix docstrings
* fix docstrings
* removing todos
* fix copies
* fix docstring
* fix docstring
* chore: using easier rotate_half
* adding integration tests
* chore: addressing review related to rotary embedding layer
* review changes
* [run-slow] mistral
* skip: test save load after resize token embedding
* style
---------
Co-authored-by: Matt <rocketknight1@gmail.com>
2024-05-23 17:48:49 +01:00
Yih-Dar
2a89673fe5
Update 4 MptIntegrationTests
expected outputs ( #30989 )
...
* fix
* fix
* fix
* fix
* fix
* [run-slow] mpt
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-23 18:27:54 +02:00
Yasmin Moslem
892b13d3cf
Add a check that warmup_setps is either 0 or >= 1 ( #30764 )
...
* Add a check that warmup_setps is either 0 or >= 1
Update training_args.py to add a check that warmup_setps is either 0 or >= 1. Otherwise, raise an error.
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-23 17:23:59 +01:00
Fanli Lin
21339a5213
[tests] add torch.use_deterministic_algorithms
for XPU ( #30774 )
...
* add xpu check
* add marker
* add documentation
* update doc
* fix ci
* remove from global init
* fix
2024-05-23 16:53:07 +01:00
Marc Sun
8366b57241
Fix accelerate failing tests ( #30836 )
...
* Fix accelerate tests
* fix clip
* skip dbrx tests
* fix GPTSan
* fix M2M100Model
* same fix as jamba
* fix mt5
* Fix T5Model
* Fix umt5 model
* fix switch_transformers
* fix whisper
* fix gptsan again
* fix siglip recent test
* skip siglip tests
* wrong place fixed
2024-05-23 17:18:58 +02:00
Younes Belkada
5a74ae6dbe
FIX / Docs: Minor changes in quantization docs ( #30985 )
...
* Change in quantization docs
* Update overview.md
* Update docs/source/en/quantization/overview.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-05-23 16:36:49 +02:00
Benjamin Warner
046c2ad792
Finish adding support for torch.compile dynamic shapes ( #30919 )
...
add torch.compile dynamic support
2024-05-23 16:01:29 +02:00
Poedator
6739e1d261
test_custom_4d_attention_mask skip with sliding window attn ( #30833 )
2024-05-23 15:22:10 +02:00
Younes Belkada
87a351818e
Docs / Quantization: refactor quantization documentation ( #30942 )
...
* refactor quant docs
* delete file
* rename to overview
* fix
* fix table
* fix
* add content
* fix library versions
* fix table
* fix table
* fix table
* fix table
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* replace to quantization_config
* fix aqlm snippet
* add DLAI courses
* fix
* fix table
* fix bulet points
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-05-23 14:31:52 +02:00
Raushan Turganbay
d583f1317b
Quantized KV Cache ( #30483 )
...
* clean-up
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* more suggestions
* mapping if torch available
* run tests & add 'support_quantized' flag
* fix jamba test
* revert, will be fixed by another PR
* codestyle
* HQQ and versatile cache classes
* final update
* typo
* make tests happy
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-05-23 17:25:20 +05:00
dependabot[bot]
e05baad861
Bump requests from 2.31.0 to 2.32.2 in /examples/research_projects/visual_bert ( #30983 )
...
Bump requests in /examples/research_projects/visual_bert
Bumps [requests](https://github.com/psf/requests ) from 2.31.0 to 2.32.2.
- [Release notes](https://github.com/psf/requests/releases )
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md )
- [Commits](https://github.com/psf/requests/compare/v2.31.0...v2.32.2 )
---
updated-dependencies:
- dependency-name: requests
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-23 12:38:00 +01:00
Arthur
4ef85fee71
Push ci image ( #30982 )
...
* [build-ci-image]
* correct branch
* push ci image
* [build-ci-image]
* update scheduled as well
* [push-ci-image]
* [build-ci-image]
* [push-ci-image]
* update deps
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* oups [build-ci-image]
* [push-ci-image]
* fix
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* updated
* [build-ci-image] update tag
* [build-ci-image]
* [build-ci-image]
* fix tag
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* github name
* commit_title?
* fetch
* update
* it not found
* dev
* dev
* [push-ci-image]
* dev
* dev
* update
* dev
* dev print dev commit message dev
* dev ? dev
* dev
* dev
* dev
* dev
* [build-ci-image]
* [build-ci-image]
* [push-ci-image]
* revert unwanted
* revert convert as well
* no you are not important
* [build-ci-image]
* Update .circleci/config.yml
* pin tf probability dev
* [push-ci-image] skip
* [push-ci-image] test
* [push-ci-image]
* fix
* device
2024-05-23 11:45:31 +02:00
Kamil Akesbi
eb1a77bbb0
Using assistant in AutomaticSpeechRecognitionPipeline with different encoder size ( #30637 )
...
* fiw input to generate in pipeline
* fixup
* pass input_features to generate with assistant
* error if model and assistant with different enc size
* fix
* apply review suggestions
* use self.config.is_encoder_decoder
* pass inputs to generate directly
* add slow tests
* Update src/transformers/generation/utils.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* apply review
* Update src/transformers/generation/utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* apply code review
* update attributes encoder_xyz to check
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add slow test
* solve conflicts
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-05-23 09:59:38 +01:00
Pavel Iakubovskii
15585b81a5
Update object detection with latest resize and pad strategies ( #30955 )
...
* Update with new resizing and pad strategy
* Return pixel mask param
* Update inference in guide
* Fix empty compose
* Update guide
2024-05-23 00:13:56 +01:00
Pablo Montalvo
a25f7d3c12
Paligemma causal attention mask ( #30967 )
...
* PaliGemma working causal attention
* Formatting
* Style
* Docstrings + remove commented code
* Update docstring for PaliGemma Config
* PaliGemma - add separator ind to model/labels
* Refactor + docstring paligemma processor method
* Style
* return token type ids when tokenizing labels
* use token type ids when building causal mask
* add token type ids to tester
* remove separator from config
* fix style
* don't ignore separator
* add processor documentation
* simplify tokenization
* fix causal mask
* style
* fix label propagation, revert suffix naming
* fix style
* fix labels tokenization
* [run-slow]paligemma
* add eos if suffixes are present
* [run-slow]paligemma
* [run-slow]paligemma
* add misssing tokens to fast version
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix style
* [run-slow]paligemma
---------
Co-authored-by: Peter Robicheaux <peter@roboflow.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-05-22 19:37:15 +02:00
Jun
d44e1ae036
Fix link in Pipeline documentation ( #30948 )
...
fix documentation as suggested by stevhliu
Co-authored-by: Jun <jun@reliant.ai>
2024-05-22 09:39:46 -07:00
Sanchit Gandhi
0948c827de
[Whisper] Strip prompt before finding common subsequence ( #27836 )
2024-05-22 17:25:47 +01:00
Raushan Turganbay
b1065aa08a
Generation: get special tokens from model config ( #30899 )
...
* fix
* let's do this way?
* codestyle
* update
* add tests
2024-05-22 18:15:41 +02:00
Arthur
1d568dfab2
legacy to init the slow tokenizer when converting from slow was wrong ( #30972 )
2024-05-22 18:06:50 +02:00
Yih-Dar
1432f641b8
Finally fix the missing new model failure CI report ( #30968 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-22 17:48:26 +02:00
amyeroberts
dff54ad2d9
🚨 out_indices always a list ( #30941 )
...
* out_indices always a list
* Update src/transformers/utils/backbone_utils.py
* Update src/transformers/utils/backbone_utils.py
* Move type casting
* nit
2024-05-22 15:23:04 +01:00
Pablo Montalvo
250ae9f746
Paligemma - fix slow tests, add bf16 and f16 slow tests ( #30851 )
...
* fix slow tests, add bf16 and f16 slow tests
* few fixes
* [run-slow]paligemma
* add gate decorator
* [run-slow]paligemma
* add missing gating
* [run-slow]paligemma
* [run-slow]paligemma
2024-05-22 16:20:07 +02:00
Sanchit Gandhi
ada86f973c
[whisper] only trigger forced ids warning once ( #30966 )
2024-05-22 15:06:51 +01:00
Jonatan Kłosko
1518508467
Avoid extra chunk in speech recognition ( #29539 )
2024-05-22 14:07:51 +01:00
Vaibhav Srivastav
24d2a5e1a3
[doc] Add references to the fine-tuning blog and distil-whisper to Whisper. ( #30938 )
...
[doc] Add references to the fine-tuning blog and distil-whisper to Whisper doc.
2024-05-22 14:06:09 +01:00
Marc Sun
5c186003b8
Fix low cpu mem usage tests ( #30808 )
...
* Fix tests
* fix udop failing test
* remove skip
* style
2024-05-22 14:09:01 +02:00
Raushan Turganbay
934e1b84e9
Update video-llava docs ( #30935 )
...
* update video-llava
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-22 16:56:41 +05:00
dependabot[bot]
edb14eba64
Bump requests from 2.31.0 to 2.32.2 in /examples/research_projects/lxmert ( #30956 )
...
---
updated-dependencies:
- dependency-name: requests
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-22 11:27:41 +01:00
Arthur
8e8786e5f0
Update build ci image [push-ci-image] ( #30933 )
...
* [build-ci-image]
* correct branch
* push ci image
* [build-ci-image]
* update scheduled as well
* [push-ci-image]
* [build-ci-image]
* [push-ci-image]
* update deps
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* oups [build-ci-image]
* [push-ci-image]
* fix
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* updated
* [build-ci-image] update tag
* [build-ci-image]
* [build-ci-image]
* fix tag
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* [build-ci-image]
* github name
* commit_title?
* fetch
* update
* it not found
* dev
* dev
* [push-ci-image]
* dev
* dev
* update
* dev
* dev print dev commit message dev
* dev ? dev
* dev
* dev
* dev
* dev
* [build-ci-image]
* [build-ci-image]
* [push-ci-image]
* revert unwanted
* revert convert as well
* no you are not important
* [build-ci-image]
* Update .circleci/config.yml
* pin tf probability dev
2024-05-22 10:52:59 +02:00
Arthur
673440d073
update ruff version ( #30932 )
...
* update ruff version
* fix research projects
* Empty
* Fix errors
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-05-22 06:40:15 +02:00
NielsRogge
60bb571e99
🚨 [Idefics2] Update ignore index ( #30898 )
...
* Update ignore index
* Update docs
* Update docs
2024-05-21 19:38:02 +02:00
Lu Teng
5bf9caa06d
Fix inhomogeneous shape error in example ( #30434 )
...
Fix inhomogeneous shape error in example.
2024-05-21 18:14:11 +01:00
amyeroberts
d24097e022
Fix swin embeddings interpolation ( #30936 )
2024-05-21 15:40:19 +01:00