Aviv Shamsian
7f79a97399
fix prompt strip to support tensors and np arrays ( #27818 )
...
* fix prompt strip to support tensors and np arrays
* framework agnostic
* change logic check before converting prompt into list
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* adding _convert_to_list to tokenization_whisper_fast
* adding tests for prompt decoding
* adding comment
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* adding comment
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* revert minor
* make style formatting
* style formatting after update
* Update src/transformers/models/whisper/tokenization_whisper_fast.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* fixing _strip_prompt to handle _decode_with_timestamps
* fix copies
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2024-07-12 20:07:10 +01:00
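The fix above hinges on a framework-agnostic `_convert_to_list` helper (the name comes from the commit messages) that normalizes prompt ids before `_strip_prompt` compares them. A minimal sketch of the idea, not the verbatim patch:

```python
import numpy as np

def _convert_to_list(token_ids):
    # Accept torch tensors, tf tensors, numpy arrays, or plain lists without
    # importing either deep-learning framework unconditionally.
    if hasattr(token_ids, "numpy"):  # torch.Tensor / tf.Tensor
        if "torch" in str(type(token_ids)):
            token_ids = token_ids.cpu().numpy()
        else:
            token_ids = token_ids.numpy()
    if isinstance(token_ids, np.ndarray):
        token_ids = token_ids.tolist()
    return token_ids
```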
Joao Gante
d1a1bcf56a
Docker: TF pin on the consistency job ( #31928 )
...
* pin
* dev-ci
* dev-ci
* dev-ci
* test pushed image
2024-07-12 14:28:46 +02:00
jiqing-feng
aec1ca3a58
[Bug Fix] fix qa pipeline tensor to numpy ( #31585 )
...
* fix qa pipeline
* fix tensor to numpy
2024-07-11 22:22:26 +01:00
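The shape of a fix like this is usually a guarded conversion in front of numpy-only post-processing; a hedged sketch (the helper name is illustrative, not the actual patch):

```python
import numpy as np
import torch

def ensure_numpy(logits):
    # QA post-processing expects numpy arrays; detach and convert only when
    # the model actually handed back a torch tensor.
    if isinstance(logits, torch.Tensor):
        return logits.detach().cpu().numpy()
    return np.asarray(logits)
```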
Naman Garg
c1e139c2b0
Adding hiera ( #30356 )
...
* initialized Structure
* Updated variable names
* Added Config class, basic HF setup, convert_to_hf
* Fixed Convert function, added hiera to HF files, Initialized test files
* better naming for x in forward pass
* Moved utils to hiera
* Change hiera -> hiera_model
* Fixed integration into transformers
* Fix: Convert Checkpoint
* added documentation for hiera
* added documentation for hiera
* added Docstrings to models, Transformers based changes
* make style and quality
* make style and quality
* Integration & Block tests running
* Fixed bugs
* Removed timm dependency
* added HieraBlock
* fixed: Model name
* added tests for HieraModel, HieraBlock
* fixed imports
* fixed quality & copies
* Fixes
* Update docs/source/en/model_doc/hiera.md
Fix name
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/hiera.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/hiera.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update src/transformers/models/hiera/configuration_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update src/transformers/models/hiera/configuration_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fixed formatting
* Code quality & Import differences
* quality and repo-consistency fix
* fixed no torch error
* Docstring fix
* Docstring fix
* doc string fix
* fixed example usage
* Resolved issues in modeling_hiera
* Removed Hiera MAE
* Added test and resolved bug
* fixed doc string
* First commit
* Finished conversion script and model forward working
* Resolved all issues
* nits
* Improving tests
* Nits
* More nits
* Improving HieraForMaskedImageModeling
* More improvements and nits
* Fixed docstrings of outputs
* More fixes
* More improvements
* Updated conversion script
* Fixed docstrings
* Improved tests
* Fixed attention outputs test
* All tests green
* Removed unnecessary file
* contribution attribution
* Resolved a few issues
* Resolved Comments
* Updated model repo id and fixed bugs
* Removed loss print
* Make tests green
* Updated docstrings
* Fix style
* Fixed num_heads in config
* Removed unnecessary video checkpoint related code in the conversion script
* Fix style
* Changed atol in conversion script
* HieraConfig
* Fix copies
* Fixed typo
* Resolved few issues
* make
* converted conv_nd -> nn.Module
* Removed video complexities
* Removed video complexities
* fix style
* Addressing comments
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fix style
* Fixed tests
* Fixed typo
* Fixed interpolate test
* Made torch fx compatible
* Made sure image processor is correct
* Addressed comments
* Noise directly as torch
* Remove unnecessary attr
* Added return_dict
* Update src/transformers/models/hiera/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Updated checkpoints
* [run_slow] hiera
* Fixed device mismatch
* [run_slow] hiera
* Fixed GPU tests
* [run_slow] hiera
---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-29-50.us-east-2.compute.internal>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Eduardo Pacheco <eduardo.pach@hotmail.com>
Co-authored-by: Eduardo Pacheco <69953243+EduardoPach@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-11 22:13:56 +01:00
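For orientation, the newly added model is used through the standard API. A sketch; the checkpoint id is an assumption based on the PR's conversion script, so verify it on the Hub:

```python
import torch
from transformers import AutoImageProcessor, HieraModel

ckpt = "facebook/hiera-tiny-224-hf"  # assumed repo id
processor = AutoImageProcessor.from_pretrained(ckpt)
model = HieraModel.from_pretrained(ckpt)

pixel_values = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
with torch.no_grad():
    outputs = model(pixel_values=pixel_values)
print(outputs.last_hidden_state.shape)
```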
Apoorv Khandelwal
574e68d554
Allow Trainer.get_optimizer_cls_and_kwargs to be overridden ( #31875 )
...
* Change `Trainer.get_optimizer_cls_and_kwargs` to `self.`
* Make `get_optimizer_cls_and_kwargs` an instance method
* Fixing typo
* Revert `get_optimizer_cls_and_kwargs` to staticmethod
* restore newline to trainer.py eof
2024-07-11 22:13:06 +01:00
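What the change enables, sketched under the final design in the commits above (the method stays a staticmethod, but the Trainer now resolves it through `self`, so a subclass override is honored):

```python
import torch
from transformers import Trainer, TrainingArguments

class CustomOptimTrainer(Trainer):
    @staticmethod
    def get_optimizer_cls_and_kwargs(args: TrainingArguments):
        # Illustrative override: AdamW with non-default betas.
        return torch.optim.AdamW, {"lr": args.learning_rate, "betas": (0.9, 0.95)}
```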
t11s
52585019a1
🚨 fix(SigLip): remove spurious exclusion of first vision output token ( #30952 )
...
fix(SigLip): remove spurious exclusion of first vision output token in classifier
2024-07-11 19:40:57 +01:00
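The bug class here is pooling as if a [CLS] token existed when the architecture has none; schematically (a sketch of the before/after, not the literal diff):

```python
import torch

hidden_states = torch.randn(2, 196, 768)  # (batch, patch tokens, dim); no [CLS]

pooled_before = hidden_states[:, 1:, :].mean(dim=1)  # silently drops a real patch token
pooled_after = hidden_states.mean(dim=1)             # averages over every vision token
```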
Joao Gante
6a05f68f51
Generate: fix SlidingWindowCache.reset() ( #31917 )
...
fix sliding cache
2024-07-11 19:35:46 +01:00
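The usual shape of such a cache fix is to zero the pre-allocated buffers in place instead of reallocating them or touching attributes the class does not have. A toy stand-in to illustrate the reset contract (not the actual class):

```python
import torch

class SlidingWindowCacheSketch:
    def __init__(self, num_layers, batch, heads, window, head_dim):
        shape = (batch, heads, window, head_dim)
        self.key_cache = [torch.zeros(shape) for _ in range(num_layers)]
        self.value_cache = [torch.zeros(shape) for _ in range(num_layers)]

    def reset(self):
        # Zero the static buffers in place so the next generation starts clean.
        for layer_idx in range(len(self.key_cache)):
            self.key_cache[layer_idx].zero_()
            self.value_cache[layer_idx].zero_()
```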
Arthur
e314395277
Refactor flash attention implementation in transformers ( #31446 )
...
* dumb commit
* nit
* update
* something like this
* unpack in modeling utils
* safe import
* oups
* update
* nits
* diff convert gemma
* update
* start propagating
* update other modeling code as well
* update for sliding window models
* nits
* more init cleanups
* styling
* fixup
* noice
* pass fixup
* typo typing_extension -> typing_extensions
* torch.nn.functionnal -> torch.nn.functional
* add to import structure
* unpack
* simplify a bit more for this first version
* nut
* update
* update
* nit
* ease the import of `Unpack`
* remove useless `use_sliding_window`
* no qua please
* protect import?
* style
* [run-slow]
* [run slow] llama,gemma,mistral,mixtral
* remove extra kwargs
* fix llama
* address review comments
* apply diff_model_converter to modeling_gemma.py
* remove cache_position 1
* remove cache_position 2
* some cleaning
* refactor gemma2 as well
* apply review comments
* rename file to modeling_flash_attention_utils.py
* siglip refactor
* remove dead code
* is the hub down?
* still down?
* fix siglip
* fix gemma2
* fatal: Could not read from remote repository.
* fix typo in softcap implem
* flaky
* Failed: Timeout >120.0s
---------
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
2024-07-11 20:37:31 +08:00
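After this refactor the per-model attention classes funnel into one shared helper in the new `modeling_flash_attention_utils.py`. Roughly how a model calls it (argument list abridged; requires a CUDA GPU with flash-attn installed, so treat this as a sketch):

```python
import torch
from transformers.modeling_flash_attention_utils import _flash_attention_forward

batch, q_len, num_heads, head_dim = 1, 16, 8, 64
query = torch.randn(batch, q_len, num_heads, head_dim, dtype=torch.float16, device="cuda")
key, value = torch.randn_like(query), torch.randn_like(query)

attn_output = _flash_attention_forward(
    query, key, value,
    attention_mask=None,
    query_length=q_len,
    is_causal=True,
    dropout=0.0,
    sliding_window=None,  # set for sliding-window models such as Mistral
)
```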
fxmarty
ad4ef3a290
Fix fx tests with inputs_embeds ( #31862 )
...
* fix tests
* [test_all] check
* address review comments
2024-07-11 20:14:03 +08:00
Omar Salman
1499a55008
Add warning message for beta and gamma parameters ( #31654 )
...
* Add warning message for beta and gamma parameters
* Fix when the warning is raised
* Formatting changes
* Improve testing and remove duplicated warning from _fix_key
2024-07-11 13:01:47 +01:00
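Background for the warning: when loading old TF-style checkpoints, transformers has long renamed `gamma`/`beta` parameters silently; the new message surfaces that rename. A sketch of the rename being warned about:

```python
def _fix_key(key: str) -> str:
    # Long-standing rename of TF-style LayerNorm parameter names.
    if "beta" in key:
        return key.replace("beta", "bias")
    if "gamma" in key:
        return key.replace("gamma", "weight")
    return key

assert _fix_key("encoder.LayerNorm.gamma") == "encoder.LayerNorm.weight"
assert _fix_key("encoder.LayerNorm.beta") == "encoder.LayerNorm.bias"
```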
Sangbum Daniel Choi
23d6d0cc06
add gather_use_object arguments II ( #31799 )
...
* add gather_use_object arguments
* fix name and pass the CI test for Seq2SeqTrainer
* make style
* make it to functools
* fix typo
* add accelerate version:
* adding warning
* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* make style
* Update src/transformers/training_args.py
* check function move to initial part
* add test for eval_use_gather_object
* fix minor
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-07-11 12:23:02 +01:00
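Usage of the new flag; note the commits above add an explicit accelerate version check, so a sufficiently recent accelerate is assumed:

```python
from transformers import TrainingArguments

# Gather arbitrary (possibly non-tensor) objects across processes during
# distributed evaluation instead of tensor-only gathering.
args = TrainingArguments(output_dir="out", eval_use_gather_object=True)
```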
Sai-Suraj-27
2e48b3e872
fix: Fixed the 1st argument name in classmethods ( #31907 )
...
Fixed the first argument name in a few classmethods.
2024-07-11 12:11:50 +01:00
Isotr0py
48c20700e1
Fix missing methods for Fuyu ( #31880 )
...
* add missing methods for FuyuForCausalLM
* fix a typo
* format code
* add missing tie_weights
* format code
2024-07-11 11:01:46 +01:00
Arthur
f4ec7a286a
[Gemma2] Support FA2 softcapping ( #31887 )
...
* Support softcapping
* strictly greater than
* update
2024-07-11 11:57:35 +02:00
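For context, logit soft-capping squashes attention scores into (-softcap, +softcap) before the softmax, and this PR makes the FA2 path apply the same transform as the eager path. The transform itself:

```python
import torch

def softcap_logits(attn_scores: torch.Tensor, softcap: float) -> torch.Tensor:
    # Gemma2-style soft-capping: tanh keeps every score inside (-softcap, softcap).
    return torch.tanh(attn_scores / softcap) * softcap

scores = torch.randn(2, 8, 16, 16) * 100.0
assert softcap_logits(scores, softcap=50.0).abs().max() < 50.0
```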
Arthur
f67e0f7fb7
[ConvertSlow] make sure the order is preserved for addedtokens ( #31902 )
...
* preserve the order
* oups
* oups
* nit
* trick
* fix issues
2024-07-11 11:56:41 +02:00
Raushan Turganbay
14d3b3f0f0
Processor accepts any kwargs ( #31889 )
...
* accept kwargs in processors
* return unused kwargs
* fix tests
* typo
* update the other way
2024-07-11 13:20:30 +05:00
turboderp
a695c18649
Fixes to alternating SWA layers in Gemma2 ( #31775 )
...
* HybridCache: Flip order of alternating global-attn/sliding-attn layers
* HybridCache: Read sliding_window argument from cache_kwargs
* Gemma2Model: Flip order of alternating global-attn/sliding-attn layers
* Code formatting
2024-07-11 10:03:46 +02:00
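For context, Gemma2 alternates global attention and sliding-window attention by layer parity, and the model and HybridCache must agree on which parity is which; that agreement is what the flip restores. Schematically (the concrete parity shown is illustrative):

```python
def layer_sliding_window(layer_idx: int, sliding_window: int):
    # Even layers use the sliding window here; the bug was the model and the
    # cache disagreeing on this parity, not the window size itself.
    return sliding_window if layer_idx % 2 == 0 else None
```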
Raushan Turganbay
d625294d79
InstructBlipVideo: Update docstring ( #31886 )
...
* update docs
* one more change
2024-07-11 10:13:29 +05:00
haikuoxin
c54af4c77e
Add a condition for nested_detach ( #31855 )
...
fix bug: https://github.com/huggingface/transformers/issues/31852
2024-07-10 21:37:22 +01:00
Yih-Dar
080e14b24c
Modify warnings in a with block to avoid flaky tests ( #31893 )
...
* fix
* [test_all] check before merge
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-10 17:56:12 +02:00
NielsRogge
ec03d97b27
[RT-DETR] Add resources ( #31815 )
...
* Add resources
* Address comments
2024-07-10 16:34:53 +01:00
Marc Sun
8df28bb308
Push sharded checkpoint to hub when push_to_hub=True in TrainingArguments ( #31808 )
...
Save sharded checkpoint in Trainer
2024-07-10 15:14:20 +02:00
Sai-Suraj-27
da79b18087
fix: Removed duplicate field definitions in some classes ( #31888 )
...
Removed duplicate field definitions in classes.
2024-07-10 13:46:31 +01:00
Yih-Dar
9d98706b3f
Fix failed tests in #31851 ( #31879 )
...
* Revert "Revert "Fix `_init_weights` for `ResNetPreTrainedModel`" (#31868 )"
This reverts commit b45dd5de9c
.
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-10 14:25:24 +02:00
Noah Young
a0a3e2f469
Fix file type checks in data splits for contrastive training example script ( #31720 )
...
fix data split file type checks
2024-07-10 10:17:03 +01:00
yukionfire
e9eeedaf3b
remove duplicate words in msg ( #31876 )
2024-07-10 09:54:45 +01:00
Raushan Turganbay
97aa3e2905
Add conversion for interleave llava ( #31858 )
...
* add conversion for interleave llava
* remove debug lines
* remove unused imports
* Update src/transformers/models/llava/convert_llava_weights_to_hf.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* small changes + docs
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-10 12:12:21 +05:00
Yun Dai
ad35309a62
add warning when using gradient_checkpointing with FSDP full shard ( #31578 )
...
* add warning when using gradient_checkpointing with FSDP full shard
* fix style
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add hybrid shard warn
* fix style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-09 23:55:57 +01:00
dependabot[bot]
6176d8f5ee
Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/visual_bert ( #31872 )
...
Bump certifi in /examples/research_projects/visual_bert
Bumps [certifi](https://github.com/certifi/python-certifi ) from 2023.7.22 to 2024.7.4.
- [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04 )
---
updated-dependencies:
- dependency-name: certifi
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-09 22:20:39 +01:00
Yih-Dar
b45dd5de9c
Revert "Fix _init_weights
for ResNetPreTrainedModel
" ( #31868 )
...
Revert "Fix `_init_weights` for `ResNetPreTrainedModel` (#31851 )"
This reverts commit 4c8149d643
.
2024-07-09 23:00:56 +02:00
Mauricio Villegas
c5bc2d5fd5
Add return type annotation to PreTrainedModel.from_pretrained ( #31869 )
...
Update modeling_utils.py
Add return type annotation to PreTrainedModel.from_pretrained
2024-07-09 21:49:29 +01:00
dependabot[bot]
6e59b30841
Bump zipp from 3.7.0 to 3.19.1 in /examples/research_projects/decision_transformer ( #31871 )
...
Bump zipp in /examples/research_projects/decision_transformer
Bumps [zipp](https://github.com/jaraco/zipp ) from 3.7.0 to 3.19.1.
- [Release notes](https://github.com/jaraco/zipp/releases )
- [Changelog](https://github.com/jaraco/zipp/blob/main/NEWS.rst )
- [Commits](https://github.com/jaraco/zipp/compare/v3.7.0...v3.19.1 )
---
updated-dependencies:
- dependency-name: zipp
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-09 21:44:48 +01:00
Merve Noyan
e3a7d9bd47
Update depth estimation task guide ( #31860 )
...
---------
Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-07-09 22:13:30 +03:00
Yih-Dar
4c8149d643
Fix _init_weights for ResNetPreTrainedModel ( #31851 )
...
* init
* test
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-09 20:09:08 +02:00
Yung-Sung Chuang
d094d8d9ec
Generate: Add new decoding strategy "DoLa" in .generate() ( #29619 )
...
Co-authored-by: Joao Gante <joao@huggingface.co>
2024-07-09 17:37:38 +01:00
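The new strategy is exposed directly through `generate()`; a usage sketch (DoLa contrasts logits of earlier "premature" layers with the final layer):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")
# dola_layers picks the premature layers to contrast against the final layer
out = model.generate(**inputs, dola_layers="high", repetition_penalty=1.2, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```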
chenk
99c0e55335
docs: typo in tf qa example ( #31864 )
...
Signed-off-by: chenk <hen.keinan@gmail.com>
2024-07-09 16:30:06 +01:00
Joao Gante
4c2538b863
Test loading generation config with safetensor weights ( #31550 )
...
fix test
2024-07-09 16:22:43 +02:00
kallewoof
cffa2b9c1d
save_pretrained: use tqdm when saving checkpoint shards from offloaded params ( #31856 )
2024-07-09 12:55:57 +01:00
hatti
350aed7076
chore: remove duplicate words ( #31853 )
...
remove duplicate words
2024-07-09 10:38:29 +01:00
NielsRogge
bd760cd13d
[Grounding DINO] Add processor to auto mapping ( #31845 )
...
Add model
2024-07-09 11:28:53 +02:00
fxmarty
0abf5e8eae
FX symbolic_trace: do not test decoder_inputs_embeds ( #31840 )
...
only test input_embeds, not decoder_input_embeds
2024-07-09 08:07:46 +02:00
Raushan Turganbay
952dfd4867
Deprecate vocab_size in other two VLMs ( #31681 )
...
* deprecate `vocab_size` in other two VLMs
* Update src/transformers/models/fuyu/configuration_fuyu.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* deprecate until 4.44
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-09 10:40:06 +05:00
Joao Gante
594c1610fa
Mamba & RecurrentGemma: enable strict signature ( #31549 )
...
* enable strict signature
* this should not have been deleted
* recurrent_gemma too
2024-07-08 15:48:32 +01:00
André Storhaug
ae9dd02ee1
Fix incorrect accelerator device handling for MPS in TrainingArguments ( #31812 )
...
* Fix wrong accelerator device setup when using MPS
* More robust TrainingArguments MPS handling
* Update training_args.py
* Cleanup
2024-07-08 12:49:30 +01:00
Yih-Dar
4879ac2b33
Avoid failure TFBlipModelTest::test_pipeline_image_to_text ( #31827 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-08 13:49:21 +02:00
fxmarty
ba743700f4
transformers.fx.symbolic_trace supports inputs_embeds ( #31574 )
...
* symbolic trace supports inputs_embeds
* fix test?
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-08 19:17:28 +08:00
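What the PR enables, as a sketch: tracing with `inputs_embeds` as the placeholder input instead of `input_ids`:

```python
import torch
from transformers import AutoModelForCausalLM
from transformers.utils.fx import symbolic_trace

model = AutoModelForCausalLM.from_pretrained("gpt2")
traced = symbolic_trace(model, input_names=["inputs_embeds"])

embeds = model.get_input_embeddings()(torch.tensor([[1, 2, 3]]))
out = traced(inputs_embeds=embeds)
```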
omahs
e5ca9b057c
Fix typos ( #31819 )
...
* fix typo
* fix typo
* fix typos
* fix typo
* fix typos
2024-07-08 11:52:47 +01:00
dependabot[bot]
f4711844a3
Bump certifi from 2023.7.22 to 2024.7.4 in /examples/research_projects/lxmert ( #31838 )
...
Bump certifi in /examples/research_projects/lxmert
Bumps [certifi](https://github.com/certifi/python-certifi ) from 2023.7.22 to 2024.7.4.
- [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04 )
---
updated-dependencies:
- dependency-name: certifi
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-08 11:17:49 +01:00
dependabot[bot]
9f3f58c905
Bump transformers from 4.26.1 to 4.38.0 in /examples/tensorflow/language-modeling-tpu ( #31837 )
...
Bump transformers in /examples/tensorflow/language-modeling-tpu
Bumps [transformers](https://github.com/huggingface/transformers ) from 4.26.1 to 4.38.0.
- [Release notes](https://github.com/huggingface/transformers/releases )
- [Commits](https://github.com/huggingface/transformers/compare/v4.26.1...v4.38.0 )
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-08 11:12:33 +01:00
Pavel Iakubovskii
a177821b24
Add FA2 and sdpa support for SigLIP ( #31499 )
...
* Rebase to main
* Fix attention implementation autoset for text and vision configs
* Fixup
* Minor fixes
* Fix copies
* Fix attention_mask for FA2
* Add equivalence tests for siglip
* Remove right padding test
* Uncomment flaky
* Fix import
* Add to docs
* Fix test message
* Add sdpa
* Add sdpa equivalence test
* Add siglip sdpa to docs
* Fix typing for attention output
* Add sdpa tests
* Fix signature of FA2
* Autoset attn_implementation in config
* Rename bsz -> batch_size
* Move back autoset attn method
* Mark as flaky
* Correct attention mask padding
* [run-slow] siglip
* Add FA2 and sdpa docs
* Style fix
* Remove flaky for FA2 test
* Change attention implementation set
* Change attn_implementation propagation
* Fix typos
* Add modality to assert message
* Add more sdpa backends in test
* [run slow] siglip
* Add math sdpa backend for all options
* [run slow] siglip
2024-07-08 11:10:02 +01:00
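Opting in after this PR follows the usual attention-implementation switch; a usage sketch:

```python
from transformers import AutoModel

# "sdpa" needs only a recent PyTorch; "flash_attention_2" additionally needs a
# CUDA GPU, fp16/bf16 weights, and the flash-attn package installed.
model = AutoModel.from_pretrained(
    "google/siglip-base-patch16-224",
    attn_implementation="sdpa",
)
```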