Commit Graph

16108 Commits

Author SHA1 Message Date
Jonny Li
476890e9ae
Fix DeepSpeed compatibility with weight_norm (#30881) (#31018)
2024-05-28 17:25:15 +01:00
Albert Villanova del Moral
aada568f73
Fix PretrainedConfig docstring with deprecated resume_download (#31014)
2024-05-28 17:47:35 +02:00
Yih-Dar
3af7bf30ad
skip test_multi_gpu_data_parallel_forward for vit and deit (#31086)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-28 17:44:52 +02:00
Younes Belkada
ab19f907fd
FIX / OPT: Fix OPT multi-GPU training for OPTForQuestionAnswering (#31092)
Update modeling_opt.py
2024-05-28 17:06:00 +02:00
Younes Belkada
94d416f018
FIX: Add accelerate as a hard requirement (#31090)
add accelerate
2024-05-28 17:05:44 +02:00
Sigbjørn Skjæret
22dab246c5
Render chat template tojson filter as unicode (#31041)
* Render chat template tojson filter as unicode

* ruff--
2024-05-28 15:02:51 +01:00
Younes Belkada
4f98b14465
Docs / PEFT: Add PEFT API documentation (#31078)
* add peft references

* add peft references

* Update docs/source/en/peft.md

* Update docs/source/en/peft.md
2024-05-28 15:04:43 +02:00
Raushan Turganbay
779bc360ff
Watermark: fix tests (#30961)
* fix tests

* style

* Update tests/generation/test_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-28 17:07:42 +05:00
Lysandre Debut
a3c7b59e31
Fix failing tokenizer tests (#31083)
* Fix failing tokenizer tests

* Use small tokenizer

* Fix remaining reference
2024-05-28 13:34:23 +02:00
NielsRogge
90da0b1c9f
[SuperPoint, PaliGemma] Update docs (#31025)
* Update docs

* Add PaliGemma resources

* Address comment

* Update docs
2024-05-28 13:22:06 +02:00
Sina Taslimi
66add161dc
Fix typo in trainer.py (#31048)
2024-05-28 12:09:32 +01:00
Pavel Iakubovskii
98e2d48e9a
Fix OWLv2 post_process_object_detection for multiple images (#31082)
* Add test for multiple images

* [run slow] owlv2

* Fix box rescaling

* [run slow] owlv2
2024-05-28 12:06:06 +01:00
Pavel Iakubovskii
c31473ed44
Remove float64 cast for OwlVit and OwlV2 to support MPS device (#31071)
Remove float64
2024-05-28 11:41:40 +01:00
oOraph
936ab7bae5
fix from_pretrained in offline mode when model is preloaded in cache (#31010)
* Unit test to verify fix

Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>

* fix from_pretrained in offline mode when model is preloaded in cache

Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>

* minor: fmt

Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>

---------

Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>
Co-authored-by: Raphael Glon <oOraph@users.noreply.github.com>
2024-05-28 11:56:05 +02:00
Hengwen Tong
537deb7869
Remove redundant backend checks in training_args.py (#30999)
* Remove backend checks in training_args.py

* Explicitly initialize the device

---------

Co-authored-by: tonghengwen <tonghengwen@cambricon.com>
2024-05-28 11:52:47 +02:00
AP
dd4654eab7
Update quicktour.md to fix broken link to Glossary (#31072)
Update quicktour.md to fix broken link

Missing '/' in attention mask link in the transformers quicktour
2024-05-28 11:50:45 +02:00
Clint Adams
e18da4e3f2
fix "piano" typo (#31027)
2024-05-28 11:48:23 +02:00
Yih-Dar
8e3b1fef97
Remove ninja from docker image build (#31080)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-28 11:36:26 +02:00
Yih-Dar
8f0f7271d0
use @main (#31065)
use main

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-28 10:53:28 +02:00
Yih-Dar
9d35edbb30
skip test_model_parallelism for 2 model test classes (#31067)
skip

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-27 18:36:39 +02:00
Yoach Lacombe
d355741eca
Fix pad_to_max_length Whisper (#30787)
* fix pad_to_max_length Whisper

* add tests

* make style
2024-05-27 16:09:05 +02:00
Marc Sun
b84cd67526
Fix quanto tests (#31062)
fix quanto tests
2024-05-27 15:53:45 +02:00
amyeroberts
cd797778e4
Update feature request label in template (#30940)
2024-05-27 15:16:47 +02:00
Eitan Turok
0a064dc0fc
Follow up: Fix link in dbrx.md (#30514)
* Fix link in dbrx.md

* remove "though this may not be up to date"

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-05-27 14:57:43 +02:00
Yih-Dar
d7942d9d27
unpin uv (#31055)
[push-ci-image]

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-27 13:47:47 +02:00
Aymeric Roucher
84c4b72ee9
Redirect transformers_agents doc to agents (#31054)
2024-05-27 10:34:14 +02:00
Pablo Montalvo
bdb9106f24
PaliGemma: fix devices and dtype assignments (#31008)
* fix devices and dtype assignments

* [run-slow]paligemma
2024-05-24 19:02:55 +02:00
Ita Zaporozhets
deba7655e6
Add split special tokens (#30772)
* seems like `split_special_tokens` is used here

* split special token

* add new line at end of file

* moving split special token test to common tests

* added assertions

* test

* fixup

* add co-author

* passing rest of args to gptsan_japanese, fixing tests

* removing direct comparison of fast and slow models

* adding test support for UDOP and LayoutXLM

* ruff fix

* readd check if slow tokenizer

* modify test to handle bos tokens

* removing commented function

* trigger build

* applying review feedback - updated docstrings, var names, and simplified tests

* ruff fixes

* Update tests/test_tokenization_common.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* applying feedback, comments

* shutil temp directory fix

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain>
Co-authored-by: itazap <itazap@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MacBook-Pro.local>
2024-05-24 08:38:58 -07:00
BHUVAN M
e5103a76cc
added interpolation for vitmae model in pytorch as well as tf. (#30732)
* added interpolation for vitmae model in pytorch as well as tf.

* Update modeling_vit_mae.py

irregular import fixed

* small changes and proper formatting

* changes suggested in review.

* modified decoder interpolate_func

* arguments and docstring fix

* Apply suggestions from code review

doc fixes

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-24 16:20:09 +01:00
Yih-Dar
a3cdff417b
save the list of new model failures (#31013)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 15:20:25 +02:00
Younes Belkada
658b849aeb
Quantization / TST: Fix remaining quantization tests (#31000)
* Fix remaining quant tests

* Update test_quanto.py
2024-05-24 14:35:59 +02:00
Lucain
fd3c128040
Fix resume_download future warning (#31007)
* Fix resume_download future warning

* better like this

* Add regression test
2024-05-24 14:35:40 +02:00
Yih-Dar
acbfaf69cc
allow multi-gpu (#31011)
* allow multi-gpu

* allow multi-gpu

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 14:20:06 +02:00
Marc Sun
ae87f9797b
FIX / TST: Fix expected results on Mistral AWQ test (#30971)
fix awq mistral test
2024-05-24 14:06:31 +02:00
Fanli Lin
04c7c176d7
[tests] make test_model_parallelism device-agnostic (#30844)
* enable on xpu

* fix style

* add comment and mps
2024-05-24 11:51:51 +01:00
Yixiang Gao
42d8dd8716
Perceiver interpolate position embedding (#30979)
* add test that currently fails

* test passed

* all perceiver passed

* fixup, style, quality, repo-consistency, all passed

* Apply suggestions from code review: default to False + compute sqrt once only

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix a minor bracket

* replace dim with self._num_channels

* add arguments to the rest preprocessors

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-24 11:13:58 +01:00
Yih-Dar
5855afd1f3
pin uv==0.1.45 (#31006)
* fix

* [push-ci-image]

* run with latest

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-24 12:00:50 +02:00
Lucain
03935d300d
Do not trigger autoconversion if local_files_only (#31004)
2024-05-24 11:00:59 +02:00
Kevin Koehncke
21e259d8c5
Fix training speed regression introduced by "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (#26139)" (#30988)
* Revert "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (#26139)"

This reverts commit a7e0ed829c.

* Instead of reverting commit, wrap indexing in torch.no_grad context

* Apply wrapping in LayoutLMv2

* Add comments explaining reason for no_grad

* Fix code format

---------

Co-authored-by: Kevin Koehncke <kevin.koehncke@uipath.com>
2024-05-24 10:43:44 +02:00
Ita Zaporozhets
7f6e87413f
add prefix space ignored in llama #29625 (#30964)
* add prefix space ignored in llama #29625

* adding test with add_prefix_space=False

* ruff

---------

Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain>
2024-05-24 01:03:00 -07:00
Matthias Gerstgrasser
6657fb5fed
Bugfix: WandbCallback uploads initial model checkpoint (#30897)
* fix wandb always uploading initial model

* Update comment.

* Optionally log initial model

* Revert "Optionally log initial model"

This reverts commit 9602cc1fad3feaf218f82a7339a194d3d2fbb946.
2024-05-23 20:29:00 +01:00
Yasmin Moslem
6d3d5b1039
Remove deprecated properties in tokenization_nllb.py and tokenization_nllb_fast.py (#29834)
* Fix typo in tokenization_nllb.py

Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.

* Fix typo in tokenization_nllb_fast.py

Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.

* Remove deprecated attributes in tokenization_nllb.py

Remove deprecated attributes: `lang_code_to_id`, `fairseq_tokens_to_ids`, `id_to_lang_code`, and `fairseq_ids_to_tokens`

* Remove deprecated attribute in tokenization_nllb_fast.py

Remove deprecated attribute `lang_code_to_id`

* Remove deprecated properties in tokenization_nllb.py

Remove deprecated properties - fix format

* Remove deprecated properties in tokenization_nllb_fast.py

Remove deprecated properties - fix format

* Update test_tokenization_nllb.py

* update test_tokenization_nllb.py

* Update tokenization_nllb.py

* Update test_tokenization_seamless_m4t.py

* Update test_tokenization_seamless_m4t.py
2024-05-23 18:53:26 +02:00
Aritra Roy Gosthipaty
965e98dc54
[Port] TensorFlow implementation of Mistral (#29708)
* chore: initial commit

* chore: adding imports and inits

* chore: adding the causal and classification code

* chore: adding names to the layers

* chore: using single self attn layer

* chore: built the model and layers

* chore: start with testing

* chore: docstring change, transpose fix

* fix: rotary embedding

* chore: adding cache implementation

* remove unused torch

* chore: fixing the indexing issue

* make fix-copies

* Use modeling_tf_utils.keras

* make fixup

* chore: fixing tests

* chore: adding past key value logic

* chore: adding multi label classification test

* fix: switching on the built parameters in the layers

* fixing repo consistency

* ruff formats

* style changes

* fix: tf and pt equivalence

* removing returns from docstrings

* fix docstrings

* fix docstrings

* removing todos

* fix copies

* fix docstring

* fix docstring

* chore: using easier rotate_half

* adding integration tests

* chore: addressing review related to rotary embedding layer

* review changes

* [run-slow] mistral

* skip: test save load after resize token embedding

* style

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2024-05-23 17:48:49 +01:00
Yih-Dar
2a89673fe5
Update 4 MptIntegrationTests expected outputs (#30989)
* fix

* fix

* fix

* fix

* fix

* [run-slow] mpt

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-23 18:27:54 +02:00
Yasmin Moslem
892b13d3cf
Add a check that warmup_steps is either 0 or >= 1 (#30764)
* Add a check that warmup_steps is either 0 or >= 1

Update training_args.py to add a check that warmup_steps is either 0 or >= 1. Otherwise, raise an error.

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-23 17:23:59 +01:00
Fanli Lin
21339a5213
[tests] add torch.use_deterministic_algorithms for XPU (#30774)
* add xpu check

* add marker

* add documentation

* update doc

* fix ci

* remove from global init

* fix
2024-05-23 16:53:07 +01:00
Marc Sun
8366b57241
Fix accelerate failing tests (#30836)
* Fix accelerate tests

* fix clip

* skip dbrx tests

* fix GPTSan

* fix M2M100Model

* same fix as jamba

* fix mt5

* Fix T5Model

* Fix umt5 model

* fix switch_transformers

* fix whisper

* fix gptsan again

* fix siglip recent test

* skip siglip tests

* wrong place fixed
2024-05-23 17:18:58 +02:00
Younes Belkada
5a74ae6dbe
FIX / Docs: Minor changes in quantization docs (#30985)
* Change in quantization docs

* Update overview.md

* Update docs/source/en/quantization/overview.md

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-05-23 16:36:49 +02:00
Benjamin Warner
046c2ad792
Finish adding support for torch.compile dynamic shapes (#30919)
add torch.compile dynamic support
2024-05-23 16:01:29 +02:00
Poedator
6739e1d261
test_custom_4d_attention_mask skip with sliding window attn (#30833)
2024-05-23 15:22:10 +02:00