Commit Graph

15129 Commits

Author SHA1 Message Date
Lysandre Debut
f497f564bb
Update all references to canonical models (#29001)
* Script & Manual edition

* Update
2024-02-16 08:16:58 +01:00
Titus
1e402b957d
add test marker to run all tests with @require_bitsandbytes (#28278) 2024-02-16 01:53:09 +01:00
Sadra Barikbin
f3aa7db439
Fix a tiny typo in generation/utils.py::GenerateEncoderDecoderOutput's docstring (#29044)
Update utils.py
2024-02-15 18:12:31 +00:00
Andrei Panferov
b0a7f44f85
Removed obsolete attribute setting for AQLM quantization. (#29034)
removed redundant field
2024-02-15 18:11:13 +00:00
amyeroberts
4156f517ce
Patch to skip failing test_save_load_low_cpu_mem_usage tests (#29043)
* Patch to skip currently failing tests

* Whoops - wrong place
2024-02-15 17:26:33 +00:00
Younes Belkada
6d1f545665
FIX: Fix error with logger.warning + inline with recent refactor (#29039)
Update modeling_utils.py
2024-02-15 15:33:26 +01:00
amyeroberts
8a0ed0a9a2
Fix copies between DETR and DETA (#29037) 2024-02-15 14:02:58 +00:00
Donggeun Yu
5b6fa2306a
DeformableDetrModel support fp16 (#29013)
* Update ms_deform_attn_cuda.cu

* Update ms_deform_attn_cuda.cuh

* Update modeling_deformable_detr.py

* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update modeling_deformable_detr.py

* python utils/check_copies.py --fix_and_overwrite

* Fix dtype missmatch error

* Update test_modeling_deformable_detr.py

* Update test_modeling_deformable_detr.py

* Update modeling_deformable_detr.py

* Update modeling_deformable_detr.py

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-15 12:31:09 +00:00
Sangbum Daniel Choi
83e96dc0ab
Add cuda_custom_kernel in DETA (#28989)
* enable graident checkpointing in DetaObjectDetection

* fix missing part in original DETA

* make style

* make fix-copies

* Revert "make fix-copies"

This reverts commit 4041c86c29248f1673e8173b677c20b5a4511358.

* remove fix-copies of DetaDecoder

* enable swin gradient checkpointing

* fix gradient checkpointing in donut_swin

* add tests for deta/swin/donut

* Revert "fix gradient checkpointing in donut_swin"

This reverts commit 1cf345e34d3cc0e09eb800d9895805b1dd9b474d.

* change supports_gradient_checkpointing pipeline to PreTrainedModel

* Revert "add tests for deta/swin/donut"

This reverts commit 6056ffbb1eddc3cb3a99e4ebb231ae3edf295f5b.

* Revert "Revert "fix gradient checkpointing in donut_swin""

This reverts commit 24e25d0a14891241de58a0d86f817d0b5d2a341f.

* Simple revert

* enable deformable detr gradient checkpointing

* add gradient in encoder

* add cuda_custom_kernel function in MSDA

* make style and fix input of DetaMSDA

* make fix-copies

* remove n_levels in input of DetaMSDA

* minor changes

* refactor custom_cuda_kernel like yoso format
0507e69d34/src/transformers/models/yoso/modeling_yoso.py (L53)
2024-02-15 12:09:39 +00:00
Arthur
f3788b09e1
Fix static generation when compiling! (#28937)
* wow I was scared!

* fix everything

* nits

* make it BC?

* add todo

* nits

* is_tracing should still be used to pass tracing tests

* nits

* some nits to make sure genration works with static cache uncompiled

* fix sdpa

* fix FA2 for both static and dynamic in a better way?

* style

* fix-copies

* fix fix copies

* fix sequential beam searcg

* style

* use `keys_to_ignore`

* nit

* correct dtype inference when init

* :( the fix for FA2 is still not optimal to investigate!

* styling

* nits

* nit

* this might work better

* add comment

* Update src/transformers/models/llama/modeling_llama.py

* "position_ids" -> "cache_position"

* style

* nit

* Remove changes that should no be propagatted just yet

* Apply suggestions from code review

* Styling

* make sure we raise an errir for static cache with FA2 enabled

* move  to the bottom of the signature

* style

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py

* nit in the name

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-15 06:27:40 +01:00
Arthur
609a1767e8
[CLeanup] Revert SDPA attention changes that got in the static kv cache PR (#29027)
* revert unrelated changes that got in

* style
2024-02-15 00:55:48 +01:00
Younes Belkada
7a0fccc6eb
FIX [Trainer / tags]: Fix trainer + tags when users do not pass "tags" to trainer.push_to_hub() (#29009)
* fix trainer tags

* add test
2024-02-14 23:56:35 +01:00
Jiewen Tan
5f06053dd8
[TPU] Support PyTorch/XLA FSDP via SPMD (#28949)
* Initial commit

* Add guards for the global mesh

* Address more comments

* Move the dataloader into integrations/tpu.py

* Fix linters

* Make karg more explicitly

* Remove the move device logic

* Fix the CI

* Fix linters

* Re-enable checkpointing
2024-02-14 21:44:49 +00:00
amyeroberts
0199a484eb
Backbone kwargs in config (#28784)
* Enable instantiating model with pretrained backbone weights

* Clarify pretrained import

* Use load_backbone instead

* Add backbone_kwargs to config

* Pass kwargs to constructors

* Fix up

* Input verification

* Add tests

* Tidy up

* Update tests/utils/test_backbone_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-14 20:46:44 +00:00
JB (Don)
725f4ad1cc
Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)
* Add tie_weights() to LM heads and set bias in set_output_embeddings()

The bias were not tied correctly in some LM heads, and this change should fix that.

* Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin

* Adding _tie_weights() to MPNet and Vilt

* Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device

* Rename to test name to save_load to match the convention
2024-02-14 20:39:01 +00:00
Merve Noyan
3f4e79d29c
Mask Generation Task Guide (#28897)
* Create mask_generation.md

* add h1

* add to toctree

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update mask_generation.md

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update mask_generation.md

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/tasks/mask_generation.md

* Update mask_generation.md

* Update mask_generation.md

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Maria Khalusova <kafooster@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
2024-02-14 18:29:49 +00:00
Raushan Turganbay
354775bc57
Fix flaky test vision encoder-decoder generate (#28923) 2024-02-14 15:40:57 +00:00
Zach Mueller
0507e69d34
Introduce AcceleratorConfig dataclass (#28664)
* Introduce acceleratorconfig dataclass

* Extra second warn

* Move import

* Try moving import under is_accelerate_available

* Quality

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Clean

* Remove to_kwargs

* Change version

* Improve tests by including dispatch and split batches

* Improve reliability

* Update tests/trainer/test_trainer.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fixup tests and review nits

* Make tests pass

* protect import

* Protect import

* Empty-Commit

* Make training_args.to_dict handle the AcceleratorConfig

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-14 10:18:09 -05:00
Huazhong Ji
69ca640dd6
Set the dataset format used by test_trainer to float32 (#28920)
Co-authored-by: unit_test <test@unit.com>
2024-02-14 13:55:12 +00:00
amyeroberts
7252e8d937
[Doc] Fix docbuilder - make BackboneMixin and BackboneConfigMixin importable from utils. (#29002)
* Trigger doc build

* Test removing references

* Importable from utils

* Trigger another run on a new commit for testing
2024-02-14 10:29:22 +00:00
Andrei Panferov
1ecf5f7c98
AQLM quantizer support (#28928)
* aqlm init

* calibration and dtypes

* docs

* Readme update

* is_aqlm_available

* Simpler link in docs

* Test TODO real reference

* init _import_structure fix

* AqlmConfig autodoc

* integration aqlm

* integrations in tests

* docstring fix

* legacy typing

* Less typings

* More kernels information

* Performance -> Accuracy

* correct tests

* remoced multi-gpu test

* Update docs/source/en/quantization.md

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Brought back multi-gpu tests

* Update src/transformers/integrations/aqlm.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update tests/quantization/aqlm_integration/test_aqlm.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Andrei Panferov <blacksamorez@yandex-team.ru>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-02-14 09:25:41 +01:00
NielsRogge
63ffd56d02
Add SiglipForImageClassification and CLIPForImageClassification (#28952)
* First draft

* Add CLIPForImageClassification

* Remove scripts

* Fix doctests
2024-02-14 08:41:31 +01:00
Jonathan Tow
de6029a059
Add StableLM (#28810)
* Add `StableLM`

* fix(model): re-create from `huggingface-cli add-new-model-like persimmon`

* fix: re-add changes to address comments

* fix(readme): add links to paper

* fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref

* fix(tests): re-add `@slow` decorator to integration tests

* fix(tests): import slow...

* fix(readme_hd): remove whitespace edit

* fix(tokenizer): auto tokenizer tuple

* skip doctests for `modeling_stablelm`
2024-02-14 07:15:18 +01:00
Younes Belkada
164bdef8cc
ENH [AutoQuantizer]: enhance trainer + not supported quant methods (#28991)
* enhance trainer + not support quant methods

* remove all old logic

* add version
2024-02-14 01:30:23 +01:00
Younes Belkada
1d12b8bc25
ENH: Do not pass warning message in case quantization_config is in config but not passed as an arg (#28988)
* Update auto.py

* Update auto.py

* Update src/transformers/quantizers/auto.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/quantizers/auto.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-14 01:19:42 +01:00
amyeroberts
bd4b83e1ba
[DETR] Update the processing to adapt masks & bboxes to reflect padding (#28363)
* Update the processing so bbox coords are adjusted for padding

* Just pad masks

* Tidy up, add tests

* Better tests

* Fix yolos and mark as slow for pycocotols

* Fix yolos - return_tensors

* Clarify padding and normalization behaviour
2024-02-13 18:27:06 +00:00
Aditya Kane
3de6a6b493
Update configuration_llama.py: fixed broken link (#28946)
* Update configuration_llama.py: fix broken link

* [Nit] Explicit redirection not required

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-13 13:02:07 +00:00
Joao Gante
3e70a207df
Static Cache: load models with MQA or GQA (#28975) 2024-02-13 09:58:19 +00:00
Hiroshi Matsuda
da20209dbc
Add sudachi_projection option to BertJapaneseTokenizer (#28503)
* add sudachi_projection option

* Upgrade sudachipy>=0.6.8

* add a test case for sudachi_projection

* Compatible with older versions of SudachiPy

* make fixup

* make style

* error message for unidic download

* revert jumanpp test cases

* format options for sudachi_projection

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* format options for sudachi_split_mode and sudachi_dict_type

* comment

* add tests for full_tokenizer kwargs

* pass projection arg directly

* require_sudachi_projection

* make style

* revert upgrade sudachipy

* check is_sudachi_projection_available()

* revert dependency_version_table and bugfix

* style format

* simply raise ImportError

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* simply raise ImportError

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-13 04:47:20 +01:00
Arthur
b44567538b
[NllbTokenizer] refactor with added tokens decoder (#27717)
* refactor with addedtokens decoder

* style

* get rid of lang code to id

* style

* keep some things for BC

* update tests

* add the mask token at the end of the vocab

* nits

* nits

* fix final tests

* style

* nits

* Update src/transformers/models/nllb/tokenization_nllb_fast.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* nits

* style?

* Update src/transformers/convert_slow_tokenizer.py

* make it a tad bit more custom

* ruff please stop
Co-Authored by avidale

<dale.david@mail.ru>

* Update
Co-authored-by: avidale
<dale.david@mail.ru>

* Update
Co-authored-by: avidale <dale.david@mail.ru>

* oupts

* ouft

* nites

* test

* fix the remaining failing tests

* style

* fix failing test

* ficx other test

* temp dir + test the raw init

* update test

* style

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-13 03:49:20 +01:00
Klaus Hipp
d90acc1643
[i18n-de] Translate CONTRIBUTING.md to German (#28954)
* Translate contributing.md to German

* Fix formatting issues in contributing.md

* Address review comments

* Fix capitalization
2024-02-12 13:39:20 -08:00
NielsRogge
78ba9f4617
[Docs] Add video section (#28958)
Add video section
2024-02-12 19:50:31 +01:00
Klaus Hipp
fe3df9d5b3
[Docs] Add language identifiers to fenced code blocks (#28955)
Add language identifiers to code blocks
2024-02-12 10:48:31 -08:00
Yunxuan Xiao
c617f988f8
Clean up staging tmp checkpoint directory (#28848)
clean up remaining tmp checkpoint dir

Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
2024-02-12 15:47:21 +00:00
JB (Don)
136cd893dc
Always initialize tied output_embeddings if it has a bias term (#28947)
Continue to initialize tied output_embeddings if it has a bias term

The bias term is not tied, and so will need to be initialized accordingly.
2024-02-12 15:47:08 +00:00
Alexey Fadeev
792819f6cf
Updated requirements for image-classification samples: datasets>=2.14.0 (#28974)
Updated datasets requirements. Need a package version >= 2.14.0
2024-02-12 14:57:25 +00:00
Joao Gante
e30bbb2685
Tests: tag test_save_load_fast_init_from_base as flaky (#28930) 2024-02-12 14:43:34 +00:00
cmahmut
1709886eba
[pipelines] updated docstring with vqa alias (#28951)
updated docstring with vqa alias
2024-02-12 14:34:08 +00:00
Kossai Sbai
cf4c20b9fb
Convert torch_dtype as str to actual torch data type (i.e. "float16" …to torch.float16) (#28208)
* Convert torch_dtype as str to actual torch data type (i.e. "float16" to torch.float16)

* Check if passed torch_dtype is an attribute in torch

* Update src/transformers/pipelines/__init__.py

Check type via isinstance

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-12 14:04:53 +00:00
NielsRogge
ef5ab72f4b
[Docs] Update README and default pipelines (#28864)
* Update README and docs

* Update README

* Update README
2024-02-12 10:21:36 +01:00
NielsRogge
f278ef20ed
[Nougat] Fix pipeline (#28242)
* Fix pipeline

* Remove print statements

* Address comments

* Address issue

* Remove unused imports
2024-02-12 10:21:15 +01:00
Klaus Hipp
58e3d23e97
[i18n-de] Translate README.md to German (#28933)
* Translate README.md to German

* Add links to README_de.md

* Remove invisible characters in README

* Change to a formal tone and fix punctuation marks
2024-02-09 12:56:22 -08:00
Philip Blair
d123e661e4
Fix type annotations on neftune_noise_alpha and fsdp_config TrainingArguments parameters (#28942) 2024-02-09 15:42:01 +00:00
Yuki Watanabe
ebf3ea2788
Fix a wrong link to CONTRIBUTING.md section in PR template (#28941) 2024-02-09 15:10:47 +00:00
Karl Hajjar
de11e654c9
Fix max_position_embeddings default value for llama2 to 4096 #28241 (#28754)
* Changed max_position_embeddings default value from 2048 to 4096

* force push

* Fixed formatting issues. Fixed missing argument in write_model.

* Reverted to the default value 2048 in the Llama config. Added comments for the llama_version argument.

* Fixed issue with default value value of max_position_embeddings in docstring

* Updated help message for llama versions

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-09 10:24:01 +00:00
Klaus Hipp
2749e479f3
[Docs] Fix broken links and syntax issues (#28918)
* Fix model documentation links in attention.md

* Fix external link syntax

* Fix target anchor names of section links

* Fix copyright statement comments

* Fix documentation headings
2024-02-08 14:13:35 -08:00
Raushan Turganbay
d628664688
Support batched input for decoder start ids (#28887)
* support batched input for decoder start ids

* Fix typos

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* minor changes

* fix: decoder_start_id as list

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-02-08 16:00:53 +00:00
Raushan Turganbay
cc309fd406
pass kwargs in stopping criteria list (#28927) 2024-02-08 15:38:29 +00:00
vodkaslime
0b693e90e0
fix: torch.int32 instead of torch.torch.int32 (#28883) 2024-02-08 16:28:17 +01:00
Matt
693667b8ac
Remove dead TF loading code (#28926)
Remove dead code
2024-02-08 14:17:33 +00:00