Jonny Li
476890e9ae
Fix DeepSpeed compatibility with weight_norm ( #30881 ) ( #31018 )
2024-05-28 17:25:15 +01:00
Albert Villanova del Moral
aada568f73
Fix PretrainedConfig docstring with deprecated resume_download ( #31014 )
2024-05-28 17:47:35 +02:00
Yih-Dar
3af7bf30ad
skip test_multi_gpu_data_parallel_forward
for vit
and deit
( #31086 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-28 17:44:52 +02:00
Younes Belkada
ab19f907fd
FIX / OPT: Fix OPT multi-GPU training for OPTForQuestionAnswering
( #31092 )
...
Update modeling_opt.py
2024-05-28 17:06:00 +02:00
Younes Belkada
94d416f018
FIX: Add accelerate
as a hard requirement ( #31090 )
...
add accelerate
2024-05-28 17:05:44 +02:00
Sigbjørn Skjæret
22dab246c5
Render chat template tojson filter as unicode ( #31041 )
...
* Render chat template tojson filter as unicode
* ruff--
2024-05-28 15:02:51 +01:00
Younes Belkada
4f98b14465
Docs / PEFT: Add PEFT API documentation ( #31078 )
...
* add peft references
* add peft references
* Update docs/source/en/peft.md
* Update docs/source/en/peft.md
2024-05-28 15:04:43 +02:00
Raushan Turganbay
779bc360ff
Watermark: fix tests ( #30961 )
...
* fix tests
* style
* Update tests/generation/test_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-28 17:07:42 +05:00
Lysandre Debut
a3c7b59e31
Fix failing tokenizer tests ( #31083 )
...
* Fix failing tokenizer tests
* Use small tokenizer
* Fix remaining reference
2024-05-28 13:34:23 +02:00
NielsRogge
90da0b1c9f
[SuperPoint, PaliGemma] Update docs ( #31025 )
...
* Update docs
* Add PaliGemma resources
* Address comment
* Update docs
2024-05-28 13:22:06 +02:00
Sina Taslimi
66add161dc
Fix typo in trainer.py ( #31048 )
2024-05-28 12:09:32 +01:00
Pavel Iakubovskii
98e2d48e9a
Fix OWLv2 post_process_object_detection for multiple images ( #31082 )
...
* Add test for multiple images
* [run slow] owlv2
* Fix box rescaling
* [run slow] owlv2
2024-05-28 12:06:06 +01:00
Pavel Iakubovskii
c31473ed44
Remove float64 cast for OwlVit and OwlV2 to support MPS device ( #31071 )
...
Remove float64
2024-05-28 11:41:40 +01:00
oOraph
936ab7bae5
fix from_pretrained in offline mode when model is preloaded in cache ( #31010 )
...
* Unit test to verify fix
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>
* fix from_pretrained in offline mode when model is preloaded in cache
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>
* minor: fmt
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>
---------
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>
Co-authored-by: Raphael Glon <oOraph@users.noreply.github.com>
2024-05-28 11:56:05 +02:00
Hengwen Tong
537deb7869
Remove redundant backend checks in training_args.py ( #30999 )
...
* Remove backend checks in training_args.py
* Expilicit initialize the device
---------
Co-authored-by: tonghengwen <tonghengwen@cambricon.com>
2024-05-28 11:52:47 +02:00
AP
dd4654eab7
Update quicktour.md to fix broken link to Glossary ( #31072 )
...
Update quicktour.md to fix broken link
Missing '/' in attention mask link in the transformers quicktour
2024-05-28 11:50:45 +02:00
Clint Adams
e18da4e3f2
fix "piano" typo ( #31027 )
2024-05-28 11:48:23 +02:00
Yih-Dar
8e3b1fef97
Remove ninja
from docker image build ( #31080 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-28 11:36:26 +02:00
Yih-Dar
8f0f7271d0
use @main
( #31065 )
...
use main
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-28 10:53:28 +02:00
Yih-Dar
9d35edbb30
skip test_model_parallelism
for 2 model test classes ( #31067 )
...
skip
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-27 18:36:39 +02:00
Yoach Lacombe
d355741eca
Fix pad_to_max_length Whisper ( #30787 )
...
* fix pad_to_max_length Whisper
* add tests
* make style
2024-05-27 16:09:05 +02:00
Marc Sun
b84cd67526
Fix quanto tests ( #31062 )
...
fix quanto tests
2024-05-27 15:53:45 +02:00
amyeroberts
cd797778e4
Update feature request label in template ( #30940 )
2024-05-27 15:16:47 +02:00
Eitan Turok
0a064dc0fc
Follow up: Fix link in dbrx.md ( #30514 )
...
* Fix link in dbrx.md
* remove "though this may not be up to date"
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-05-27 14:57:43 +02:00
Yih-Dar
d7942d9d27
unpin uv ( #31055 )
...
[push-ci-image]
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-27 13:47:47 +02:00
Aymeric Roucher
84c4b72ee9
Redirect transformers_agents doc to agents ( #31054 )
2024-05-27 10:34:14 +02:00
Pablo Montalvo
bdb9106f24
Paligemma- fix devices and dtype assignments ( #31008 )
...
* fix devices and dtype assignments
* [run-slow]paligemma
2024-05-24 19:02:55 +02:00
Ita Zaporozhets
deba7655e6
Add split special tokens ( #30772 )
...
* seems like `split_special_tokens` is used here
* split special token
* add new line at end of file
* moving split special token test to common tests
* added assertions
* test
* fixup
* add co-author
* passing rest of args to gptsan_japanese, fixing tests
* removing direct comparison of fast and slow models
* adding test support for UDOP and LayoutXLM
* ruff fix
* readd check if slow tokenizer
* modify test to handle bos tokens
* removing commented function
* trigger build
* applying review feedback - updated docstrings, var names, and simplified tests
* ruff fixes
* Update tests/test_tokenization_common.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* applying feedback, comments
* shutil temp directory fix
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain>
Co-authored-by: itazap <itazap@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MacBook-Pro.local>
2024-05-24 08:38:58 -07:00
BHUVAN M
e5103a76cc
added interpolation for vitmae model in pytorch as well as tf. ( #30732 )
...
* added interpolation for vitmae model in pytorch as well as tf.
* Update modeling_vit_mae.py
irreugalr import fixed
* small changes and proper formatting
* changes suggested in review.
* modified decoder interpolate_func
* arguments and docstring fix
* Apply suggestions from code review
doc fixes
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-24 16:20:09 +01:00
Yih-Dar
a3cdff417b
save the list of new model failures ( #31013 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 15:20:25 +02:00
Younes Belkada
658b849aeb
Quantization / TST: Fix remaining quantization tests ( #31000 )
...
* Fix remaining quant tests
* Update test_quanto.py
2024-05-24 14:35:59 +02:00
Lucain
fd3c128040
Fix resume_download future warning ( #31007 )
...
* Fix resume_download future warning
* better like this
* Add regression test
2024-05-24 14:35:40 +02:00
Yih-Dar
acbfaf69cc
allow multi-gpu ( #31011 )
...
* allow multi-gpu
* allow multi-gpu
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 14:20:06 +02:00
Marc Sun
ae87f9797b
FIX / TST: Fix expected results on Mistral AWQ test ( #30971 )
...
fix awq mistral test
2024-05-24 14:06:31 +02:00
Fanli Lin
04c7c176d7
[tests] make test_model_parallelism
device-agnostic ( #30844 )
...
* enable on xpu
* fix style
* add comment and mps
2024-05-24 11:51:51 +01:00
Yixiang Gao
42d8dd8716
Perceiver interpolate position embedding ( #30979 )
...
* add test that currently fails
* test passed
* all perceiver passed
* fixup, style, quality, repo-consistency, all passed
* Apply suggestions from code review: default to False + compute sqrt once only
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix a minor bracket
* replace dim with self._num_channels
* add arguments to the rest preprocessors
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-24 11:13:58 +01:00
Yih-Dar
5855afd1f3
pin uv==0.1.45
( #31006 )
...
* fix
* [push-ci-image]
* run with latest
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 12:00:50 +02:00
Lucain
03935d300d
Do not trigger autoconversion if local_files_only ( #31004 )
2024-05-24 11:00:59 +02:00
Kevin Koehncke
21e259d8c5
Fix training speed regression introduced by "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 ( #26139 )" ( #30988 )
...
* Revert "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (#26139 )"
This reverts commit a7e0ed829c
.
* Instead of reverting commit, wrap indexing in torch.no_grad context
* Apply wrapping in LayoutLMv2
* Add comments explaining reason for no_grad
* Fix code format
---------
Co-authored-by: Kevin Koehncke <kevin.koehncke@uipath.com>
2024-05-24 10:43:44 +02:00
Ita Zaporozhets
7f6e87413f
add prefix space ignored in llama #29625 ( #30964 )
...
* add prefix space ignored in llama #29625
* adding test with add_prefix_space=False
* ruff
---------
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain>
2024-05-24 01:03:00 -07:00
Matthias Gerstgrasser
6657fb5fed
Bugfix: WandbCallback uploads initial model checkpoint ( #30897 )
...
* fix wandb always uploading initial model
* Update comment.
* Optionally log initial model
* Revert "Optionally log initial model"
This reverts commit 9602cc1fad3feaf218f82a7339a194d3d2fbb946.
2024-05-23 20:29:00 +01:00
Yasmin Moslem
6d3d5b1039
Remove deprecated properties in tokenization_nllb.py and tokenization_nllb_fast.py ( #29834 )
...
* Fix typo in tokenization_nllb.py
Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
* Fix typo in tokenization_nllb_fast.py
Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
* Remove deprecated attributes in tokenization_nllb.py
Remove deprecated attributes: `lang_code_to_id`, `fairseq_tokens_to_ids`, `id_to_lang_code`, and `fairseq_ids_to_tokens`
* Remove deprecated attribute in tokenization_nllb_fast.py
Remove deprecated attribute `lang_code_to_id`
* Remove deprecated properties in tokenization_nllb.py
Remove deprecated properties - fix format
* Remove deprecated properties in tokenization_nllb_fast.py
Remove deprecated properties - fix format
* Update test_tokenization_nllb.py
* update test_tokenization_nllb.py
* Update tokenization_nllb.py
* Update test_tokenization_seamless_m4t.py
* Update test_tokenization_seamless_m4t.py
2024-05-23 18:53:26 +02:00
Aritra Roy Gosthipaty
965e98dc54
[Port] TensorFlow implementation of Mistral ( #29708 )
...
* chore: initial commit
* chore: adding imports and inits
* chore: adding the causal and classification code
* chore: adding names to the layers
* chore: using single self attn layer
* chore: built the model and layers
* chore: start with testing
* chore: docstring change, transpose fix
* fix: rotary embedding
* chore: adding cache implementation
* remove unused torch
* chore: fixing the indexing issue
* make fix-copies
* Use modeling_tf_utils.keras
* make fixup
* chore: fixing tests
* chore: adding past key value logic
* chore: adding multi label classfication test
* fix: switching on the built parameters in the layers
* fixing repo consistency
* ruff formats
* style changes
* fix: tf and pt equivalence
* removing returns from docstrings
* fix docstrings
* fix docstrings
* removing todos
* fix copies
* fix docstring
* fix docstring
* chore: using easier rotate_half
* adding integration tests
* chore: addressing review related to rotary embedding layer
* review changes
* [run-slow] mistral
* skip: test save load after resize token embedding
* style
---------
Co-authored-by: Matt <rocketknight1@gmail.com>
2024-05-23 17:48:49 +01:00
Yih-Dar
2a89673fe5
Update 4 MptIntegrationTests
expected outputs ( #30989 )
...
* fix
* fix
* fix
* fix
* fix
* [run-slow] mpt
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-23 18:27:54 +02:00
Yasmin Moslem
892b13d3cf
Add a check that warmup_setps is either 0 or >= 1 ( #30764 )
...
* Add a check that warmup_setps is either 0 or >= 1
Update training_args.py to add a check that warmup_setps is either 0 or >= 1. Otherwise, raise an error.
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-23 17:23:59 +01:00
Fanli Lin
21339a5213
[tests] add torch.use_deterministic_algorithms
for XPU ( #30774 )
...
* add xpu check
* add marker
* add documentation
* update doc
* fix ci
* remove from global init
* fix
2024-05-23 16:53:07 +01:00
Marc Sun
8366b57241
Fix accelerate failing tests ( #30836 )
...
* Fix accelerate tests
* fix clip
* skip dbrx tests
* fix GPTSan
* fix M2M100Model
* same fix as jamba
* fix mt5
* Fix T5Model
* Fix umt5 model
* fix switch_transformers
* fix whisper
* fix gptsan again
* fix siglip recent test
* skip siglip tests
* wrong place fixed
2024-05-23 17:18:58 +02:00
Younes Belkada
5a74ae6dbe
FIX / Docs: Minor changes in quantization docs ( #30985 )
...
* Change in quantization docs
* Update overview.md
* Update docs/source/en/quantization/overview.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-05-23 16:36:49 +02:00
Benjamin Warner
046c2ad792
Finish adding support for torch.compile dynamic shapes ( #30919 )
...
add torch.compile dynamic support
2024-05-23 16:01:29 +02:00
Poedator
6739e1d261
test_custom_4d_attention_mask skip with sliding window attn ( #30833 )
2024-05-23 15:22:10 +02:00