Fanli Lin
c85510f958
[docs] change temperature to a positive value ( #32077 )
...
fix
2024-07-23 17:47:51 +01:00
RhuiDih
9cf4f2aa9a
Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs ( #31629 )
...
* add DataCollatorBatchFlattening
* Update data_collator.py
* change name
* new FA2 flow if position_ids is provided
* add comments
* minor fix
* minor fix data collator
* add test cases for models
* add test case for data collator
* remove extra code
* formating for ruff check and check_repo.py
* ruff format
ruff format tests src utils
* custom_init_isort.py
2024-07-23 15:56:41 +02:00
Raushan Turganbay
3aefb4ec7f
LLaVaNeXT: pad on right if training ( #32134 )
...
* pad on right if training
* docs
* add tests
2024-07-23 10:23:55 +05:00
James Thewlis
251a2409c6
Add llama3-llava-next-8b to llava_next conversion script ( #31395 )
...
* Add llama3-llava-next-8b to llava_next conversion script
Adds support for the lmms-lab/llama3-llava-next-8b model to the
convert_llava_next_weights_to_hf.py script, along with an example
prompt generated from the llava_llama_3 conv_template in the LLaVA-NeXT
repo.
* Exclude <|begin_of_text|> from prompt example
This token gets added automatically, so it should not be included in the
prompt example.
* Add llava-next-72b and llava-next-110b
Adds the Qwen-based LLaVA-Next models to the conversion script, along
with changes to load the models on multiple GPUs for inference.
* Add llama3 and qwen prompt formats to docs
* Chat prompt and padding side left for llama3 batched
* update
* Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove code
* better naming
---------
Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-23 10:12:16 +05:00
Marc Sun
96a074fa7e
Add new quant method ( #32047 )
...
* Add new quant method
* update
* fix multi-device
* add test
* add offload
* style
* style
* add simple example
* initial doc
* docstring
* style again
* works ?
* better docs
* switch to non persistant
* remove print
* fix init
* code review
2024-07-22 20:21:59 +02:00
Bertrand Thia
7987710696
[RoBERTa] Minor clarifications to model doc ( #31949 )
...
* minor edits and clarifications
* address comment
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-07-22 10:08:27 -07:00
Woojun Jung
d1ec36b94f
Update ko/_toctree.yml
and remove custom_tools.md
to reflect latest changes ( #31969 )
...
update `ko/_toctree.yml` and remove `custom_tools.md`
2024-07-22 08:27:13 -07:00
Lucain
f2a1e3ca68
Mention model_info.id instead of model_info.modelId ( #32106 )
2024-07-22 14:14:47 +01:00
Raushan Turganbay
fe008d6ebe
Chameleon: not supported with fast load ( #32091 )
...
fixes
2024-07-19 19:21:45 +05:00
Merve Noyan
46835ec6ae
Add image-text-to-text task guide ( #31777 )
...
* Add image-text-to-text task page
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Address comments
* Fix heading
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/tasks/image_text_to_text.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Address comments
* Update image_text_to_text.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-19 13:40:40 +01:00
Merve Noyan
4bd8f12972
Fixes to chameleon docs ( #32078 )
...
* Fixes
* Let's not use auto
2024-07-19 12:50:34 +01:00
Raushan Turganbay
e316c5214f
VideoLLaVa: fix chat format in docs ( #32083 )
...
fix chat format
2024-07-19 15:38:01 +05:00
NielsRogge
56a7745704
[Chameleon, Hiera] Improve docs ( #32038 )
...
* Improve docs
* Fix docs
* Fix code snippet
2024-07-19 11:20:03 +03:00
Raushan Turganbay
b873234cb6
Llava: add default chat templates ( #31691 )
...
* add default chat templates
* Update src/transformers/models/llava/processing_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/llava_next/processing_llava_next.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* more clear docstring and docs
* Update docs/source/en/model_doc/llava.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/vipllava.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* add tests
* remove default templates (see #31733 )
* load chat template from another file
* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* revert some changes in docs
* forgot vipllava
* chat template file is not temporary hack
* warn if loading from processor
* not that file
* similarly modify `save_pretrained`
* Update tests/models/llava_next/test_processor_llava_next.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/vipllava/test_processor_vipllava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/vipllava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/vipllava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2024-07-19 10:08:56 +05:00
Raushan Turganbay
673d30b826
Chameleon: minor fixes after shipping ( #32037 )
...
* fix merging
* make chameleon conditional
2024-07-18 16:54:07 +05:00
Pavel Iakubovskii
1c37e8c1a6
Add sdpa
and FA2 for CLIP ( #31940 )
...
* Squashed commit of the following:
commit 102842cd477219b9f9bcb23a0bca3a8b92bd732f
Author: Pavel Iakubovskii <qubvel@gmail.com>
Date: Fri Jul 12 18:23:52 2024 +0000
Add model-specific sdpa tests
commit 60e4c88581abf89ec098da84ed8e92aa904c997d
Author: Pavel Iakubovskii <qubvel@gmail.com>
Date: Fri Jul 12 18:20:53 2024 +0000
Add fallback to eager (expensive operation)
commit c29033d30e7ffde4327e8a15cbbc6bee37546f80
Author: Pavel Iakubovskii <qubvel@gmail.com>
Date: Thu Jul 11 17:09:55 2024 +0000
Fix attn_implementation propagation
commit 783aed05f0f38cb2f99e758f81db6838ac55b9f8
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 09:05:27 2024 +0530
style
commit e77e703ca75d00447cda277eca6b886cd32bddc0
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 09:04:57 2024 +0530
add comment to explain why I had to touch forbidden codebase.
commit ab9d8849758e7773a31778ccba71588d18552623
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 09:03:02 2024 +0530
fix: flax attribute access.
commit c570fc0abf9d1bd58c291aae3c7e384f995996d2
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 08:23:54 2024 +0530
fix tensorflow attribute name.
commit 32c812871cfdb268d8a6e3e2c61c5c925c8ed47e
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 07:57:10 2024 +0530
fix attribute access.
commit 4f41a0138b6c417aed9c9332278f8bcd979cb7c2
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 07:44:02 2024 +0530
_from_config.
commit 35aed64ff602422adcf41d7f677a0a24bd9eccae
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 24 18:46:52 2024 +0530
propagation of attn_implementation.
commit 4c25c19845438b1dc1d35a5adf9436151c8c5940
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 24 09:24:36 2024 +0530
style again
commit 5f7dc5c5015c0f8116408f737e8c318d1802c80c
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 24 09:19:05 2024 +0530
use from_config.
commit b70c409956d0359fa6ae5372275d2a20ba7e3389
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 24 09:13:43 2024 +0530
quality
commit a7b63beff53d0fc754c6564e2a7b51731ddee49d
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 14:35:10 2024 +0200
add benchmark numbers
commit 455b0eaea50862b8458c8f422b60fe60ae40fdcb
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 13:50:16 2024 +0200
Revert "reflect feedback more"
This reverts commit dc123e71ef
.
commit ca674829d28787349c2a9593a14e0f1d41f04ea4
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 13:50:05 2024 +0200
Revert "fix"
This reverts commit 37a1cb35b8
.
commit fab2dd8576c099eb1a3464958cb206a664d28247
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 13:47:46 2024 +0200
fix
commit fbc6ae50fd6f2d36294d31e191761631b701d696
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 13:38:30 2024 +0200
reflect feedback more
commit 87245bb020b2d60a89afe318a951df0159404fc9
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 08:54:34 2024 +0530
fixes
commit 1057cc26390ee839251e7f8b3326c4207595fb23
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 07:49:03 2024 +0530
don't explicit set attn_implementation in tests
commit e33f75916fc8a99f516b1cf449dbbe9d3aabda81
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 07:43:54 2024 +0530
explicitly override attn_implementation in the towers.
commit 4cf41cb1bc885c39df7cb8f2a0694ebf23299235
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 07:38:42 2024 +0530
import in one-line.
commit f2cc447ae9e74ccfacb448140cdf88259d4afc8c
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 07:34:58 2024 +0530
move sdpa mention to usage tips.
commit 92884766c64dbb456926a3a84dd427be1349fa95
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Mon Apr 29 10:58:26 2024 +0530
fix: memory allocation problem.
commit d7ffbbfe12f7750b7d0a361420f35c13e0ea787d
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Mon Apr 29 09:56:59 2024 +0530
fix-copies
commit 8dfc3731cedd02e36acd3fe56bb2e6d61efd25d8
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri Apr 26 20:16:12 2024 +0530
address arthur's comments.
commit d2ed7b4ce4ff15ae9aa4d3d0500f1544e3dcd9e9
Author: Sayak Paul <spsayakpaul@gmail.com>
Date: Fri Apr 26 20:08:15 2024 +0530
Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
commit 46e04361f37ded5c522ff05e9f725b9f82dce40e
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Wed Apr 24 09:55:27 2024 +0530
add to docs.
commit 831629158ad40d34d8983f209afb2740ba041af2
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Wed Apr 24 09:33:10 2024 +0530
styling.g
commit d263a119c77314250f4b4c8469caf42559197f22
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Wed Apr 24 09:15:20 2024 +0530
up
commit d44f9d3d7633d4c241a737a1bc317f791f6aedb3
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Tue Apr 23 18:40:42 2024 +0530
handle causal and attention mask
commit 122f1d60153df6666b634a94e38d073f3f260926
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Tue Apr 23 15:18:21 2024 +0530
test fixes.
commit 4382d8cff6fa1dee5dbcf0d06b3e2841231e36f5
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Tue Apr 23 09:39:25 2024 +0530
fix: scaling inside sdpa.
commit 0f629989efc48b7315cf19405a81e02955efe7e5
Author: Sayak Paul <spsayakpaul@gmail.com>
Date: Tue Apr 23 08:14:58 2024 +0530
Update src/transformers/models/clip/modeling_clip.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
commit 14367316877dc27ea40f767ad1aee38bbc97e4ce
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Mon Apr 22 16:21:36 2024 +0530
add: sdpa support to clip.
* Remove fallback for empty attention mask (expensive operation)
* Fix typing in copies
* Add flash attention
* Add flash attention tests
* List CLIP in FA docs
* Fix embeddings attributes and tf
* [run-slow] clip
* Update clip documentation
* Remove commented code, skip compile dynamic for CLIPModel
* Fix doc
* Fix doc 2
* Remove double transpose
* Add torch version check for contiguous()
* Add comment to test mixin
* Fix copies
* Add comment for mask
* Update docs
* [run-slow] clip
2024-07-18 10:30:37 +05:30
Dmitry Rogozhkin
bc36c26fa6
doc: fix broken BEiT and DiNAT model links on Backbone page ( #32029 )
...
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
2024-07-17 20:24:10 +01:00
Raushan Turganbay
24cfcc2114
Chameleon: add model ( #31534 )
...
* Chameleon model integration
Co-authored-by: Jacob Kahn <jacobkahn1@gmail.com>
Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com>
* fix 7B, again. mask away image tokens
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove pretrained_config_map
* make fixup passing up to utils/check_config_docstrings.py; vqgan moved to the modeling file
* remove tokenizer (use llama's); remove codechameleon tests
* a few copied from statements and minor changes
* copied from in ChameleonModel
* some copies in ChameleonForCausalLM
* a few more copies
* VQModel moved to ChameleonModel (as opposed to being in the processor)
* ChameleonProcessor ready
* Fix chameleon weights convert
* update conversion script
* clean-up processing
* update modeling a bit
* update
* update (throws error...)
* correct conversion ready
* fix tests
* fix docs
* docs
* ve swin norm
* fix device for vocab map
* add normalization
* update
* update script with rope rotations
* final fix on model conversion
* add slow tests
* more info in docs
* fix repo consistency tests
* fix repo tests
* fix-copies
* hope this will make CI happy
* fix for 30b model
* Update docs/source/en/index.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address comments
* remove assertion in conversion script
* add image processor test
* not copied
* port changes for qk layernorm
* fix-copies
* read token decorator for tests
* [run-slow] chameleon
* one more read-token
* address some comments
* qk norm changes
* tests and repo check
* moved rope permutations to conversion, YAY!
* fix past kv check
* docs
* layernorm done!
* let's be consistent in naming
* fix slow tests
* weird thing with slow CI, but let's see
* once more try
* remove past-kv as tuple following llama
* ignore
* style
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: jacobkahn <jacobkahn1@gmail.com>
Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com>
Co-authored-by: Leonid Shamis <lshamis@meta.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-17 10:41:43 +05:00
Zach Mueller
e0dfd7bcaf
Speedup model init on CPU (by 10x+ for llama-3-8B as one example) ( #31771 )
...
* 1,100%!
* Clean
* Don't touch DS
* Experiment with dtype allocation
* skip test_load_save_without_tied_weights test
* A little faster
* Include proper upscaling?
* Fixup tests
* Potentially skip?
* Let's see if this fixes git history
* Maintain new dtype
* Fin
* Rm hook idea for now
* New approach, see what breaks
* stage
* Clean
* Stash
* Should be fin now, just need to mark failing models
* Clean up
* Simplify
* Deal with weird models
* Enc/Dec
* Skip w/ reason
* Adjust test
* Fix test
* one more test
* Keep experimenting
* Fix ref
* TO REMOVE: testing feedback CI
* Right push
* Update tests/utils/test_modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* disable
* Add new func
* Test nits from Amy
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Adjust comment
* Adjust comment on skip
* make private
* Fin
* Should be a not flag
* Clarify and rename test
---------
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-16 09:32:01 -04:00
Naman Garg
c1e139c2b0
Adding hiera ( #30356 )
...
* initialized Structure
* Updated variable names
* Added Config class, basic HF setup, convert_to_hf
* Fixed Convert function, added hiera to HF files, Initilized test files
* better naming for x in forward pass
* Moved utils to hiera
* Change hiera -> hiera_model
* Fixed integration into tranformers
* Fix: Convert Checkpoint
* added documentation for hiera
* added documentation for hiera
* added Docstings to models, Transformers based changes
* make style and quality
* make style and quality
* Integration & Block tests running
* Fixed bugs
* initialized Structure
* Updated variable names
* Added Config class, basic HF setup, convert_to_hf
* Fixed Convert function, added hiera to HF files, Initilized test files
* better naming for x in forward pass
* Moved utils to hiera
* Change hiera -> hiera_model
* Fixed integration into tranformers
* Fix: Convert Checkpoint
* added documentation for hiera
* added documentation for hiera
* added Docstings to models, Transformers based changes
* make style and quality
* make style and quality
* Integration & Block tests running
* Fixed bugs
* Removed tim dependency
* added HieraBlock
* fixed: Model name
* added tests for HieraModel, HieraBlock
* fixed imports
* fixed quality & copies
* Fixes
* Update docs/source/en/model_doc/hiera.md
Fix name
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/hiera.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/hiera.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update src/transformers/models/hiera/configuration_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update src/transformers/models/hiera/configuration_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fixed formatting
* Code quality & Import differences
* quality and repo-consistency fix
* fixed no torch error
* Docstring fix
* Docstring fix
* doc string fix
* fixed example usage
* Resolved issues in modeling_hiera
* Removed Hiera MAE
* Added test and resolved bug
* fixed doc string
* First commit
* Finished conversion script and model forward working
* Resolved all issues
* nits
* Improving tests
* Nits
* More nits
* Improving HieraForMaskedImageModeling
* More improvements and nits
* Fixed docstrings of outputs
* More fixes
* More imrpovments
* Updated conversion script
* Fixed docstrings
* Improved tests
* Fixed attentou outputs test
* All tests green
* Removed unnecessary file
* contribution attribution
* Resolved a few issues
* Resolved Comments
* Updated model repo id and fixed bugs
* Removed loss print
* Make tests green
* Updated docstrings
* Fix style
* Fixed num_heads in config
* Removed unnecessary video checkpoint related code in the conversion script
* Fix style
* Changed atol in conversion script
* HieraConfig
* Fix copies
* Fixed typo
* Resolved few issues
* make
* converted conv_nd -> nn.Module
* Removed video complexities
* Removed video complexities
* fix style
* Addressing comments
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/hiera/modeling_hiera.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fix style
* Fixed tests
* Fixed typo
* Fixed interpolate test
* Made torch fx compatible
* Made sure imageprocesor is correct
* Addressed comments
* Noise directly as torch
* Remove unnecesary attr
* Added return_dit
* Update src/transformers/models/hiera/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Updated checkpoints
* [run_slow] hiera
* Fixed device mismatch
* [run_slow] hiera
* Fixed GPU tests
* [run_slow] hiera
---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-29-50.us-east-2.compute.internal>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Eduardo Pacheco <eduardo.pach@hotmail.com>
Co-authored-by: Eduardo Pacheco <69953243+EduardoPach@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-11 22:13:56 +01:00
NielsRogge
ec03d97b27
[RT-DETR] Add resources ( #31815 )
...
* Add resources
* Address comments
2024-07-10 16:34:53 +01:00
Raushan Turganbay
97aa3e2905
Add conversion for interleave llava ( #31858 )
...
* add conversion for interleave llava
* remove debug lines
* remove unused imports
* Update src/transformers/models/llava/convert_llava_weights_to_hf.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* small changes + docs
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-10 12:12:21 +05:00
Merve Noyan
e3a7d9bd47
Update depth estimation task guide ( #31860 )
...
---------
Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-07-09 22:13:30 +03:00
Yung-Sung Chuang
d094d8d9ec
Generate: Add new decoding strategy "DoLa" in .generate()
( #29619 )
...
Co-authored-by: Joao Gante <joao@huggingface.co>
2024-07-09 17:37:38 +01:00
hatti
350aed7076
chore: remove duplicate words ( #31853 )
...
remove duplicate words
2024-07-09 10:38:29 +01:00
omahs
e5ca9b057c
Fix typos ( #31819 )
...
* fix typo
* fix typo
* fix typos
* fix typo
* fix typos
2024-07-08 11:52:47 +01:00
Pavel Iakubovskii
a177821b24
Add FA2 and sdpa
support for SigLIP ( #31499 )
...
* Rebase to main
* Fix attention implementation autoset for tex and vision configs
* Fixup
* Minor fixes
* Fix copies
* Fix attention_mask for FA2
* Add eqvivalence tests for siglip
* Remove right padding test
* Uncomment flaky
* Fix import
* Add to docs
* Fix test message
* Add sdpa
* Add sdpa equivalence test
* Add siglip sdpa to docs
* Fix typing for attention output
* Add sdpa tests
* Fix signature of FA2
* Autoset attn_implementation in config
* Rename bsz -> batch_size
* Move back autoset attn method
* Mark as flaky
* Correct attention mask padding
* [run-slow] siglip
* Add FA2 and sdpa docs
* Style fix
* Remove flaky for FA2 test
* Change attention implementation set
* Change attn_implementaiton propogation
* Fix typos
* Add modality to assert message
* Add more sdpa backends in test
* [run slow] siglip
* Add math sdpa backend for all options
* [run slow] siglip
2024-07-08 11:10:02 +01:00
NielsRogge
06fd7972ac
Add ZoeDepth ( #30136 )
...
* First draft
* Add docs
* Clean up code
* Convert model
* Add image processor
* Convert Zoe_K
* More improvements
* Improve variable names and docstrings
* Improve variable names
* Improve variable names
* Replace nn.sequential
* More improvements
* Convert ZoeD_NK
* Fix most tests
* Verify pixel values
* Verify pixel values
* Add squeeze
* Update beit to support arbitrary window sizes
* Improve image processor
* Improve docstring
* Improve beit
* Improve model outputs
* Add figure
* Fix beit
* Update checkpoint
* Fix repo id
* Add _keys_to_ignore_on_load_unexpected
* More improvements
* Address comments
* Address comments
* Address comments
* Address comments
* Rename variable name
* Add backbone_hidden_size
* Vectorize
* Vectorize more
* Address comments
* Clarify docstring
* Remove backbone_hidden_size
* Fix image processor
* Remove print statements
* Remove print statement
* Add integration test
* Address comments
* Address comments
* Address comments
* Address comments
* Add requires_backends
* Clean up
* Simplify conversion script
* Simplify more
* Simplify more
* Simplify more
* Clean up
* Make sure beit is loaded correctly
* Address comment
* Address bin_configurations
* Use bin_configurations
* Convert models, add integration tests
* Fix doc test
* Address comments
* Unify regressor classes
* Clarify arguments
* Improve resize_image
* Add num_relative_features
* Address comment
* [run-slow]beit,data2vec,zoedepth
* [run-slow]beit,data2vec,zoedepth
* Address comments
* Address comment
* Address comment
* Replace nn.TransformerEncoderLayer and nn.TransformerEncoder
* Replace nn.MultiheadAttention
* Add attributes for patch transformer to config
* Add tests for ensure_multiple_of
* Update organization
* Add tests
* [run-slow] beit data2vec
* Update ruff
* [run-slow] beit data2vec
* Add comment
* Improve docstrings, add test
* Fix interpolate_pos_encoding
* Fix slow tests
* Add docstring
* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Improve tests and docstrings
* Use run_common_tests
* Improve docstrings
* Improve docstrings
* Improve tests
* Improve tests
* Remove print statements
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-08 11:43:33 +02:00
Pedro Cuenca
1082361a19
Depth Anything: update conversion script for V2 ( #31522 )
...
* Depth Anything: update conversion script for V2
* Update docs
* Style
* Revert "Update docs"
This reverts commit be0ca47ea1
.
* Add docs for depth anything v2
* Add depth_anything_v2 to MODEL_NAMES_MAPPING
Done similarly to Flan-T5: https://github.com/huggingface/transformers/pull/19892/files
* Add tip in original docs
2024-07-05 19:28:41 +01:00
Billy Cao
ac26260436
Allow FP16 or other precision inference for Pipelines ( #31342 )
...
* cast image features to model.dtype where needed to support FP16 or other precision in pipelines
* Update src/transformers/pipelines/image_feature_extraction.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Use .to instead
* Add FP16 pipeline support for zeroshot audio classification
* Remove unused torch imports
* Add docs on FP16 pipeline
* Remove unused import
* Add FP16 tests to pipeline mixin
* Add fp16 placeholder for mask_generation pipeline test
* Add FP16 tests for all pipelines
* Fix formatting
* Remove torch_dtype arg from is_pipeline_test_to_skip*
* Fix format
* trigger ci
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-05 17:21:50 +01:00
Matt
e786844425
Repeating an important warning in the chat template docs ( #31796 )
...
* Repeating an important warning in the chat template docs
* Update docs/source/en/chat_templating.md
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Reword for clarity
* Reword for clarity
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-07-05 15:30:24 +01:00
Billy Cao
1d3eaa6f7e
Add training support for SigLIP ( #31495 )
...
* Add siglip loss function
* Update docs
* Enable training tests
[experimental] enable GC training tests as it has worked for my own data
* Remove test_training* overrides to enable training tests
[run_slow] siglip
* Skip training tests for Siglip text model and ImageClassificationModel
[run_slow] siglip
* Skip GC training tests for SiglipForImageClassification
* Explicitly skip training tests for SiglipVisionModel
Add skip reason for training tests for SiglipTextModel
* Remove copied from to fix CI
2024-07-05 14:50:39 +01:00
Boris Feld
9e599d1d94
Update CometCallback to allow reusing of the running experiment ( #31366 )
...
* Update CometCallback to allow reusing of the running experiment
* Fixups
* Remove useless TODO
* Add checks for minimum version of the Comet SDK
* Fix documentation and links.
Also simplify how the Comet Experiment name is passed
2024-07-05 08:13:46 +02:00
Billy Cao
43ffb785c0
Add torch_empty_cache_steps to TrainingArguments ( #31546 )
...
* Add torch_empty_cache_steps to TrainingArguments
* Fix formatting
* Add torch_empty_cache_steps to docs on single gpu training
* Remove check for torch_empty_cache_steps <= max_steps
* Captalize Tip
* Be device agnostic
* Fix linting
2024-07-04 13:20:49 -04:00
Aymeric Roucher
0fd885b91c
Adds final answer tool for all agents ( #31703 )
...
* Adds final answer tool for all agents
* Typo
* Add clarification in doc
* Put final_answer tool adition in agent for clarity
2024-07-03 11:36:09 +02:00
Jörg Bornschein
f91c16d270
Fix documentation for Gemma2. ( #31682 )
...
* Fix documentation for Gemma2.
Model sizes and Blog post URL are wrong in the documentation.
* Update docs/source/en/model_doc/gemma2.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-02 23:04:53 +01:00
Sanchit Gandhi
a9701953ff
[whisper] static kv cache ( #31166 )
...
* make work with cache abstraction
* correct for static cache
* hacks for compile
* make fast
* fix
* fix pos ids
* generate
* fix sdpa
* fix sdpa cache pos
* fix fa2
* clean fa2
* integrate cache into generate
* make style
* copies
* more copies
* update eager
* update sdpa
* update fa2
* simplify
* use cache pos
* always compute cross-cache for debug
* avoid recompiles
Co-authored-by: Arthur Zucker <arthur@huggingface.co>
* fix fix
* fix fix fix
* more fix
* try encoder-decoder cache (too messy)
* revert encoder-decoder cache
* check cross-attn cache
* use enc-dec dataclass
* use richer enc-dec dataclass
* clean-up
* revert static cache changes
* small fixes
* revert to cpu flag
* fix copies
* add static slow test
* past k/v docstring
* more docstrings
* cache_position docstrings
* add to docs
* add enc-dec cache to docs
* make style
* fix after rebase
* fix beam
* style
* fix generation strategies
* fix most decoder-only tests
* style
* skip test
* more clean up
* small docstrings
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add todo
* only crop self-attn
* check cache in mixin
* style
* fix re-compile after rebase
* move `is_updated` logic to enc-dec wrapper
* revert back
* revert cache back
* finalise design
* fix
* fix fix
* style
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* deprecate
* updates
* final updates
* style
* style
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-07-02 13:24:15 +01:00
Jade Choghari
e655029515
Add French version of run scripts tutorial ( #31483 )
...
* Add French translation of run scripts tutorial
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/fr/run_scripts_fr.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Jade Choghari <chogharijade@icloud.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-06-28 18:02:30 +02:00
Steven Liu
464aa74659
[docs] Llama3 ( #31662 )
...
quick usage to top
2024-06-27 10:32:51 -07:00
Arthur
75a6319864
Fix post gemma merge ( #31660 )
...
* nit
* toctree issue
* protect gemma2 tests as well
* sdpa supported
2024-06-27 17:51:42 +02:00
Arthur
0cf60f13ab
Add gemma 2 ( #31659 )
...
* inital commit
* Add doc
* protect?
* fixup stuffs
* update tests
* fix build documentation
* mmmmmmm config attributes
* style
* nit
* uodate
* nit
* Fix docs
* protect some stuff
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-06-27 17:36:19 +02:00
amyeroberts
1de7dc7403
Skip tests properly ( #31308 )
...
* Skip tests properly
* [test_all]
* Add 'reason' as kwarg for skipTest
* [test_all] Fix up
* [test_all]
2024-06-26 21:59:08 +01:00
Raushan Turganbay
e71f2863d7
Add LLaVa NeXT Video ( #31252 )
...
* squash into single commit
* run diff once more
* docstring
* tests
* minor chnages and ready to go
* Update src/transformers/models/llava_next_video/processing_llava_next_video.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/vipllava/test_modeling_vipllava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* [run-slow] llava-next-video
* [run-slow] llava-next-video
* [run-slow] llava_next_video
* fix two tests
* fix slow tests
* remove logit checks due to numeric errors
* run test once more
* [run-slow] llava_next_video
* final try to pass the test
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* style
* fix
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-06-26 21:52:28 +05:00
Pavel Iakubovskii
ac52084bf2
Update RT-DETR code snippet ( #31631 )
...
Update code snippet
2024-06-26 14:42:20 +01:00
Anton Vlasjuk
b07770c5eb
[GPT-NeoX
] Add SDPA support ( #31031 )
...
* starting support for sdpa in `gptneox` models
* small comment on tests
* fix dropout
* documentation and style
* clarify concrete paths for reference
* generalise attn projections and rope application
added head mask check to sdpa mask creation
handle sdpa memory backend bug via own version flag
* update docs and style
* move dtype casting outside of general attn_projection_and_rope function
fix flash_attn_2 stuff
* more generic attn warning if output_attns or head_mask
* simplify head mask check by moving head mask creation to a later point
* remove copied llama artifact
* remove padding_mask from attention function signature
* removing unnecessary comments, only "save" attn implementation once
* [run_slow] gpt_neox
2024-06-26 13:56:36 +01:00
Raushan Turganbay
fc689d75a0
Add video modality for InstrucBLIP ( #30182 )
...
* squash in single commit
* add docs
* dummy obj
* more changes in diff converter
* tiny fix
* make docs happy
* skip test
* repo consistency tests
* update docstring
* style
* fix tests
* change diff imports
* [run-slow] instructblipvideo
* [run-slow] instructblipvideo
* fix tests and remove logit check
* [run-slow] instructblipvideo
2024-06-25 15:45:39 +05:00
Nicholi Caron
4b822560a1
Update mask_generation.md ( #31543 )
...
Minor bug fixes -- rearrange import & add missing parentheses
2024-06-23 20:27:21 +01:00
Sangbum Daniel Choi
74a207404e
New model support RTDETR ( #29077 )
...
* fill out docs string in configuration
75dcd3a0e8 (r1506391856)
* reduce the input image size for the tests
* remove the unappropriate tests
* only 5 failes exists
* make style
* fill up missed architecture for object detection in docs
* fix auto modeling
* simple fix in missing import
* major change including backbone refactor and objectdetectionoutput refactor
* minor fix only 4 fails left
* intermediate fix
* revert __init__.py
* revert __init__.py
* make style
* fixes in pr_docs
* intermediate fix
* make style
* two fixes
* pass doctest
* only one fix left
* intermediate commit
* all fixed
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/convert_rt_detr_original_pytorch_checkpoint_to_pytorch.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/rt_detr/test_modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* function class above the model definition in dice_loss
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* simple fix
* layernorm add config.layer_norm_eps
* fix inputs_docstring
* make style
* simple fix
* add custom coco loading test in image_processor
* fix error in BaseModelOutput
https://github.com/huggingface/transformers/pull/29077#discussion_r1516657790
* simple typo
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* intermediate fix
* fix with load_backbone format
* remove unused configuration
* 3 fix test left
* make style
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: Sounak Dey <dey.sounak@gmail.com>
* change last_hidden_state to first index
* all pass fix
TO DO: minor update in comments
* make fix-copies
* remove deepcopy
* pr_document fix
* revert deepcopy due to the issue of unexpceted behavior in decoderlayer
* add atol in final
* add no_split_module
* _no_split_modules = None
* device transfer for model parallelism
* minor fix
* make fix-copies
* fix typo
* add test_image_processor with post_processing
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add config in RTDETRPredictionHead
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* set lru_cache with max_size 32
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add lru_cache import and configuration change
* change the order of definition
* make fix-copies
* add docs and change config error
* revert strange make-fix
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* test pass
* fix get_clones related and remove deepcopy
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* nit for paper section
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* rename denoising related parameters
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* check the image transformation logic
* make style
* make style
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* pe_encoding -> positional_encoding_temperature
* remove TODO
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* remove eval_idx since transformer DETR is giving all decoder output
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* change variable name
* make style and docs import update
* Revert "Update src/transformers/models/rt_detr/image_processing_rt_detr.py"
This reverts commit 74aa3e1de0
.
* fix typo
* add postprocessing in docs
* move import scipy to top
* change varaible name
* make fix-copies
* remove eval_idx in test
* move to after first sentence
* update image_processor since box loss requires normalized one
* change appropriate name to auxiliary_outputs
* Update src/transformers/models/rt_detr/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* make style
* remove panoptic related comments
* make style
* revert valid_processor_keys
* fix aux related test
* make style
* change origination from config to backbone API
* enable the dn_loss
* fix test and conversion
* renewal weight initialization
* change initializer_range
* make fix-up
* fix the loss issue in the auxiliary output and denoising part
* change weight loss to original RTDETR
* fix in initialization
* sync shape format of dn and aux
* make style
* stable fine-tuning and compatible conversion for resnet101
* make style
* skip input_embed
* change encoder related variable
* enable converting rtdetr_r101
* add r101 related conversion code
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change name _shape to _reshape
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* maket style
* make fix-copies
* remove deprecated import
* more fix
* remove last_hidden_state for task-specific model
* Revert "remove last_hidden_state for task-specific model"
This reverts commit ccb7a34051
.
* minore change in convert
* remove print
* make style and fix-copies
* add custom rtdetr backbone for r18, r34
* remove print
* change copied
* add pad_size
* make style
* change layertype to optional to pass the CI
* make style
* add test in modeling_resnet_rt_detr
* make fix-copies
* skip tmp file test
* fix comment
* add docs
* change to modeling_resnet file format
* enabling resnet50 above
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: Jason Wu <jasonkit@users.noreply.github.com>
* enable all the rtdetr model :)
* finish except CI
* add RTDetrResNetBackbone
* make fix-copies
* fix
TO DO: CI enable
* make style
* rename test
* add docs
* add special fix
* revert resnet
* Update src/transformers/models/rt_detr/modeling_rt_detr_resnet.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* add more comment
* remove swin comment
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* rename convert and add verify backbone
* Update docs/source/en/_toctree.yml
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* make style
* requests for docs
* more general test docs
* general script docs
* make fix-copies
* final commit
* Revert "Update src/transformers/models/rt_detr/configuration_rt_detr.py"
This reverts commit d136225cd3
.
* skip test_model_get_set_embeddings
* remove target
* add changes
* make fix-copies
* remove decoder_attention_mask
* add load_backbone function for auto_backbone
* remove comment
* fix repo name
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* final commit
* remove unused downsample_in_bottleneck
* new test for autobackbone
* change to appropriate indices
* test fix
* fix dict in test_image_processor
* fix test
* [run-slow] rt_detr, rt_detr_resnet
* change the slow test
* [run-slow] rt_detr
* [run-slow] rt_detr, rt_detr_resnet
* make in to same cuda in CSPRepLayer
* [run-slow] rt_detr, rt_detr_resnet
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sounak Dey <dey.sounak@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Jason Wu <jasonkit@users.noreply.github.com>
Co-authored-by: ChoiSangBum <choisangbum@ChoiSangBumui-MacBookPro.local>
2024-06-21 17:50:08 +01:00
Billy Cao
cd5f7c1790
Add docs on zeroshot image classification prompt templates ( #31343 )
...
* Add docs on pipeline templates
* Fix example and comments
Update usage tips
* Update docs/source/en/tasks/zero_shot_image_classification.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/siglip.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Trigger CI
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-06-19 11:11:44 +01:00
linlin
1c1aec2ef1
Update object_detection.md ( #31488 )
...
Define MAX_SIZE before it is used.
2024-06-19 10:36:44 +01:00