Yih-Dar
4c8149d643
Fix _init_weights for ResNetPreTrainedModel ( #31851 )
...
* init
* test
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-09 20:09:08 +02:00
fxmarty
0abf5e8eae
FX symbolic_trace: do not test decoder_inputs_embeds ( #31840 )
...
only test inputs_embeds, not decoder_inputs_embeds
2024-07-09 08:07:46 +02:00
fxmarty
ba743700f4
transformers.fx.symbolic_trace supports inputs_embeds ( #31574 )
...
* symbolic trace supports inputs_embeds
* fix test?
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-08 19:17:28 +08:00
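For context, transformers.utils.fx.symbolic_trace takes an input_names argument, so after this change a model can be traced from inputs_embeds instead of input_ids. A minimal sketch, assuming a standard public checkpoint:

```python
from transformers import AutoModelForSequenceClassification
from transformers.utils.fx import symbolic_trace

# trace the graph with inputs_embeds (plus attention_mask) as the
# placeholder inputs instead of input_ids
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
traced = symbolic_trace(model, input_names=["inputs_embeds", "attention_mask"])
```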
Yih-Dar
93cd94b79d
Move some test files (tests/test_xxx_utils.py) to tests/utils ( #31730 )
...
* move
* move
* move
* move
* Update tests/utils/test_image_processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-02 13:46:03 +02:00
Arthur
0cf60f13ab
Add gemma 2 ( #31659 )
...
* initial commit
* Add doc
* protect?
* fixup stuffs
* update tests
* fix build documentation
* mmmmmmm config attributes
* style
* nit
* update
* nit
* Fix docs
* protect some stuff
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-06-27 17:36:19 +02:00
amyeroberts
1de7dc7403
Skip tests properly ( #31308 )
...
* Skip tests properly
* [test_all]
* Add 'reason' as kwarg for skipTest
* [test_all] Fix up
* [test_all]
2024-06-26 21:59:08 +01:00
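The pattern this PR standardizes — skipping inside the test body with an explicit reason rather than returning early — looks roughly like this (a sketch; the condition and message are illustrative):

```python
import unittest

import torch

class ExampleModelTest(unittest.TestCase):
    def test_cpu_offload(self):
        if not torch.cuda.is_available():  # illustrative condition
            # skipTest marks the test as skipped (with a visible reason)
            # instead of silently passing via an early return
            self.skipTest(reason="test requires a CUDA device")
```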
Pavel Iakubovskii
3c2d4d60d7
Correct @is_flaky test decoration ( #31480 )
...
* Correct @is_flaky decorator
2024-06-24 08:09:21 +01:00
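is_flaky is a decorator factory, so it must be applied with parentheses — the bug fixed here was decorations written as bare @is_flaky, which never actually wrapped the test. A sketch of correct usage (the test name and description are illustrative):

```python
import unittest

from transformers.testing_utils import is_flaky

class ExampleModelTest(unittest.TestCase):
    @is_flaky(max_attempts=5, description="occasionally exceeds tolerance")
    def test_batching_equivalence(self):
        ...
```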
amyeroberts
25245ec26d
Rename test_model_common_attributes -> test_model_get_set_embeddings ( #31321 )
...
* Rename test_model_common_attributes to test_model_get_set_embeddings
The method name was misleading - it tests being able to get and set embeddings, not attributes common to all models
* Explicitly skip
2024-06-07 19:40:26 +01:00
Fanli Lin
04c7c176d7
[tests] make test_model_parallelism device-agnostic ( #30844 )
...
* enable on xpu
* fix style
* add comment and mps
2024-05-24 11:51:51 +01:00
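Device-agnostic tests lean on transformers.testing_utils.torch_device, which resolves to whatever accelerator the runner exposes — a minimal sketch:

```python
from transformers.testing_utils import torch_device

# torch_device is "cuda", "xpu", "mps", or "cpu" depending on availability,
# so the test never hard-codes a backend
def move_to_test_device(model):
    return model.to(torch_device)
```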
Poedator
6739e1d261
test_custom_4d_attention_mask skip with sliding window attn ( #30833 )
2024-05-23 15:22:10 +02:00
Marc Sun
5c186003b8
Fix low cpu mem usage tests ( #30808 )
...
* Fix tests
* fix udop failing test
* remove skip
* style
2024-05-22 14:09:01 +02:00
Benjamin Warner
cd6bd0af34
Add support for torch.compile dynamic shapes ( #30560 )
...
* add torch.compile dynamic support
* Add SDPA dynamic shapes compile test & improve SDPA comment
* comment consistency
2024-05-20 10:36:57 +02:00
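Dynamic-shape support here boils down to torch.compile(model, dynamic=True), which avoids recompiling every time the sequence length changes — a sketch, with the checkpoint chosen only for illustration:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
# dynamic=True marks varying dimensions (e.g. sequence length) as symbolic,
# so a single compiled graph serves many input shapes
compiled_model = torch.compile(model, dynamic=True)
```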
Yih-Dar
1b3dba9417
Make Gemma work with torch.compile ( #30775 )
...
* fix
* [run-slow] gemma
* add test
* add `test_compile_static_cache`
* fix
* style
* remove subprocess
* use attribute
* fix
* style
* update
* [run-slow] dbrx,gemma,jetmoe,phi3,recurrent_gemma
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-16 13:41:33 +02:00
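The test_compile_static_cache pattern pairs a static KV cache with a compiled forward pass, along these lines (a sketch; the checkpoint is gated and shown only for illustration):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"  # gated checkpoint, for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# a static cache has fixed shapes, which is what lets torch.compile
# capture decoding without graph breaks
model.generation_config.cache_implementation = "static"
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

inputs = tokenizer("Hello my name is", return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```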
Joao Gante
9d889f870e
Cache: add new flag to distinguish models that support Cache but not static cache ( #30800 )
...
* jamba cache
* new flag
* generate exception
2024-05-16 12:08:35 +01:00
hyenal
1c21f48a50
add sdpa to ViT [follow up of #29325 ] ( #30555 )
...
remove blank line (+1 squashed commit)
Squashed commits:
[24ccd2061] [run-slow]vit_msn,vision_encoder_decoder (+24 squashed commits)
Squashed commits:
[08bd27e7a] [run-slow]vit_msn,vision_encoder_decoder
[ec96a8db3] [run-slow]vit_msn
[ead817eca] fix vit msn multi gpu
[d12cdc8fd] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[3fdbfa88f] doc
[a3ff33e4a] finish implementation
[e20b7b7fb] Update test_modeling_common.py
[e290c5810] Update test_modeling_flax_common.py
[d3af86f46] comment
[ff7dd32d8] more comments
[59b137889] suggestion
[7e2ba6d67] attn_implementation as attribute of the class
[fe66ab71f] minor
[38642b568] Apply suggestions from code review
Accept comments
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[22cde7d52] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[48e137cc6] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[99f4c679f] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[96cf20a6d] Update src/transformers/models/vit_msn/modeling_vit_msn.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[c59377d23] Update src/transformers/models/vit_mae/modeling_vit_mae.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[b70a47259] Update tests/models/vision_text_dual_encoder/test_modeling_vision_text_dual_encoder.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[00c84d216] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[61f00ebb0] all tests are passing locally
[e9e0b82b7] vision encoder/decoder
[4d5076b56] test-vision (+20 squashed commits)
Squashed commits:
[d1add8db9] yolo
[9fde65716] fix flax
[986566c28] minor
[ca2f21d1f] vit
[3333efd7a] easy models change
[ebfc21402] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[b8b8603ed] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[48ecc7e26] all tests are passing locally
[bff7fc366] minor
[62f88306f] fix yolo and text_encoder tests
[121507555] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1064cae0a] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[b7f52ff3a] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[cffaa10dd] fix-copies
[ef6c511c4] test vit hybrid
[7d4ba8644] vit hybrid
[66f919033] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1fcc0a031] fixes
[cfde6eb21] fixup
[e77df1ed3] all except yolo and encoder decoder (+17 squashed commits)
Squashed commits:
[602913e22] vit + vit_mae are working
[547f6c4cc] RUN_SLOW=1 pytest tests/models/audio_spectrogram_transformer/ tests/models/deit/ tests/models/videomae/ passes
[61a97dfa9] it's the complete opposite...
[aefab37d4] fix more tests
[71802a1b9] fix all torch tests
[40b12eb58] encoder - decoder tests
[941552b69] slow decorator where appropriate
[14d055d80] has_attentions to yolo and msn
[3381fa19f] add correct name
[e261316a7] repo consistency
[31c6d0c08] fixup
[9d214276c] minor fix
[11ed2e1b7] chore
[eca6644c4] add sdpa to vit-based models
[cffbf390b] make fix-copies result
[6468319b0] fix style
[d324cd02a] add sdpa for vit
Co-authored-by: Liubov Yaronskaya <luba.yaronskaya@gmail.com>
2024-05-16 10:56:11 +01:00
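Once a model gains SDPA support, the kernel can be requested explicitly via attn_implementation (or is picked by default on recent torch) — a sketch using a public ViT checkpoint:

```python
import torch
from transformers import ViTForImageClassification

# request PyTorch's scaled_dot_product_attention kernel explicitly;
# loading errors out if the model or torch version lacks support
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",
    attn_implementation="sdpa",
    torch_dtype=torch.float16,
)
```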
Edoardo Cetin
4b3eb19fa7
Fix llama model sdpa attention forward function masking bug when output_attentions=True ( #30652 )
...
* Fix llama model forward function with output_attentions=True and same-length encoded sequences.
* Fix style
* propagate fix to modeling_cohere, gemma, dbrx, and olmo (which copy the same sdpa masking logic from llama)
* Fix style
* ignore unnecessary sdpa mask converter when output_attentions=True
* add tests checking sdpa and eager outputs match when output_attentions=True
* Split if statements in two lines
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fix formatting
* Add fix to new jetmoe model
* Add missing output_attentions argument to jetmoe mask creation
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-05-15 19:48:19 +02:00
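Since SDPA cannot return attention weights, requesting output_attentions=True makes the SDPA path fall back to the eager implementation internally; the added tests check the two paths agree. A self-contained sketch of that equivalence check (BERT used here only because it is small and public):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("hello world", return_tensors="pt")

model_eager = AutoModel.from_pretrained("bert-base-uncased", attn_implementation="eager")
model_sdpa = AutoModel.from_pretrained("bert-base-uncased", attn_implementation="sdpa")

with torch.no_grad():
    out_eager = model_eager(**inputs, output_attentions=True)
    out_sdpa = model_sdpa(**inputs, output_attentions=True)  # falls back to eager internally

torch.testing.assert_close(
    out_sdpa.last_hidden_state, out_eager.last_hidden_state, rtol=1e-4, atol=1e-4
)
```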
Raushan Turganbay
bd9f4d7951
Add Video Llava ( #29733 )
...
* add model draft
* update docstring
* add tests
* support image and video as input
* update for better handling of mixed input and clean-up a bit
* bug when mixed inputs & add tests
* Update README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Merge remote-tracking branch 'upstream/main' into video_llava
* link to abstract of paper in README
* fix test
* fix-copies
* make tests happy
* skip docstest for now
* do not run doctest for now
* Update src/transformers/models/video_llava/processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address review comments
* failing tests
* Fix vocab_size in common tests for VLMs
* codestyle
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* PR suggestions
* fix-copies
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add full example in docs
* clean-up with new model-id
* [run-slow] video_llava
* update docstring
* [run-slow] video_llava
* remove all achive maps
* fix some tests
* test was supposed to be skipped for llava :)
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-15 16:42:29 +05:00
Pablo Montalvo
1360801a69
Add PaliGemma ( #30814 )
...
* add new model like
* add state dict slicing + new model config
* update palma config and weights, passes vision activations
* fix
* update
* reorder loading/unpacking
* clean up
* add debug statements
* change device
* fix
* debugging
* fix noncausal mask
* fixup sdpa + causal mask
* fix activation function
* remove debug before changing modeling file
* add variants
* debug attention mask in generate
* revert to non-debug sdpa
* revert gemma modifications
* add custom language modeling
* use Processor
* add language modeling file to init
* try thin wrapper around generate
* Update
* update mask
* breakpoints galore
* remove conflict
* switch to left-padding
* add incomplete model doc
* add paligemma global files
* batch rename paligemma
* make generation match outputs and captioning
* style
* style
* remove copied from + doc
* remove more copied from
* remove copy from projector
* minor fix
* update config and style
* add readme - dummy
* CORRECT image captioning
* moving to args
* add siglip proper + fix merging image + text features
* take update_causal_mask from upstream
* remove breakpoint
* leverage AutoModel
* fix input_ids slicing
* make siglip head conditional
* remove encoder_decoder value
* remove unneeded modeling file
* add commented 4d attention mask
* FIXED generation with 4D mask
* Update src/transformers/models/siglip/modeling_siglip.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix left padding detection
* shuffle order of verifications
* fix missing labels for training
* fix
* vectorize merging of features, improve slicing
* improve testing before conversion
* handle merging in processor
* image token index depends on checkpoint
* add variants, save processor too
* save processors, base tokenizer off spm file
* expand model embeddings due to additional image token
* pass image processing args
* add convert rgb to siglip processor
* add \n token separately
* fix tokenizer and prompts
* fix docstrings
* change to camel
* fix casing
* debug pos_ids and sdpa
* pass and use cache_position
* add flag for newline tokenization
* Update src/transformers/models/paligemma/processing_paligemma.py
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
* simplify conversion script
* add copied from
* add precision to conversion script
* Update src/transformers/models/paligemma/modeling_paligemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* clean up
* Shift attention mask from `1:`
After discussion with @molbap
* add docs, fix quality
* quality, tied weights inheritance, and logits/label alignment
* fix more tests
* pass attn_implementation to language model correctly
* add SiglipVisionTransformer to no split modules
* skip paligemma test for sdpa dispatch to flash
* skip incompatible tests
* quality
* [broken archive maps]
* Apply suggestions
- remove archive lists
- style
- take shape of inputs_embeds for batch
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/utils/dummy_pt_objects.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* simplify conversion script
* add suggestions
* add suggestions
* add copied from
* fix
* move labels out
* revert
* fix
* remove placeholder labels if None
* use cache_position
* fix quality + docstrings
* fix quality
* fix paligemma 4d gemma mask incompatibility
* fix config docstring
* fix query and attn_mask dtype
---------
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-05-14 22:07:15 +02:00
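A hedged usage sketch for the new model (the checkpoint is gated; prompt and image URL are illustrative):

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"  # gated checkpoint, for illustration
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.bfloat16)

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
image = Image.open(requests.get(url, stream=True).raw)

# PaliGemma prompts are short task prefixes such as "caption en"
inputs = processor(text="caption en", images=image, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(output[0], skip_special_tokens=True))
```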
Marc Sun
539ed75d50
skip low_cpu_mem_usage tests ( #30782 )
2024-05-13 18:00:43 +02:00
Poedator
a0779b9e19
Llama: fix custom 4D masks, v2 ( #30348 )
...
* 4d mask fixes
* Update custom 4D mask logic
* test moved to mixin
* extra tests 4d mask
* upd 4d mask and StaticCache handling
* added Mask4DTestHard to mistral tests
* post-rebase fixes
* test fixes for StaticCache
* make fix-copies
* upd 1 after #30476
* fix common tests
* rm elif attention_mask.dim() == 4:
* tests combined, fixed, mixtral supported
* bigbird style chg reverted
* rm if attention_mask.dim() == 2
* modeling_llama formatting chg
---------
Co-authored-by: Joao Gante <joao@huggingface.co>
2024-05-13 13:46:06 +02:00
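A custom 4D mask here means an attention_mask of shape (batch, 1-or-heads, query_len, kv_len) passed in already-inverted additive form, which the model consumes as-is. A sketch of the convention (the final model call is assumed, not runnable as-is):

```python
import torch

q_len = kv_len = 4
min_val = torch.finfo(torch.float32).min

# additive 4D mask: 0.0 where attention is allowed, dtype-min where masked
mask_4d = torch.full((1, 1, q_len, kv_len), min_val)
allowed = torch.tril(torch.ones(q_len, kv_len, dtype=torch.bool))  # causal pattern
mask_4d[..., allowed] = 0.0

# outputs = model(input_ids, attention_mask=mask_4d)  # model/input_ids assumed defined
```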
JB (Don)
54a2361a29
Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True ( #29024 )
...
* Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True
* Testing for the non-safe-tensors case, since the default is safe-tensors already
* Running fixup/fix-copies
* Adding accelerate annotations to tests
2024-05-07 11:12:21 +02:00
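The gist of the fix: heads that alias a bias into an inner nn.Linear must re-point it in _tie_weights(), because with low_cpu_mem_usage=True parameters are first created on the meta device and re-materialized later, which breaks the original aliasing. A simplified sketch (not the exact transformers class):

```python
import torch
import torch.nn as nn

class LMPredictionHead(nn.Module):  # simplified sketch
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.decoder = nn.Linear(hidden_size, vocab_size)
        self.bias = nn.Parameter(torch.zeros(vocab_size))
        self.decoder.bias = self.bias  # alias, broken by meta-device init

    def _tie_weights(self):
        # called after weights are materialized so the alias is restored
        self.decoder.bias = self.bias

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.decoder(hidden_states)
```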
Michael Benayoun
fbabd6746f
Fix for Neuron ( #30259 )
2024-05-02 10:24:47 +02:00
Raushan Turganbay
38a4bf79ad
Encoder-decoder models: move embedding scale to nn.Module ( #30410 )
...
* move scaling to nn.Module
* let the test be here for now (need to fix)
* failing tests
* last failing models
* Revert commit 4c14817f38
* clean-up
* oops forgot
* codestyle
* raise NotImplementedError when possible
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* skip tests in respective modeling files
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-01 12:33:00 +05:00
Raushan Turganbay
9d31b32e9d
Use text config's vocab size in testing models ( #30568 )
...
use text config's vocab size
2024-05-01 12:32:45 +05:00
JB (Don)
dfa7b580e9
[BERT] Add support for sdpa ( #28802 )
...
* Adding SDPA support for BERT
* Using the proper input name for testing model input in inference()
* Adding documentation for SDPA in BERT model page
* Use the stable link for the documentation
* Adding a gate to only call .contiguous() for torch < 2.2.0
* Additions and fixes to the documentation
* Minor updates to documentation
* Adding extra requirements needed for the contiguous() bug
* Adding "Adapted from" in plcae of the "Copied from"
* Add benchmark speedup tables to the documentation
* Minor fixes to the documentation
* Use ClapText as a replacement for Bert in the Copied-From
* Some more fixes for the fix-copies references
* Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage
[test all]
* Undo changes to separate test
* Refactored SDPA self attention code for KV projections
* Change use_sdpa to attn_implementation
* Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
2024-04-26 16:23:44 +01:00
Pavel Iakubovskii
13b3b90ab1
Fix DETA save_pretrained ( #30326 )
...
* Add class_embed to tied weights for DETA
* Fix test_tied_weights_keys for DETA model
* Replace error raise with assert statement
2024-04-22 17:11:13 +01:00
Jacky Lee
30b453206d
Enable multi-device for some models ( #30207 )
...
* feat: multidevice for resnet
* feat: yes! resnet
* fix: compare all elements in tuple
* feat: support for regnet
* feat: support for convnextv2
* feat: support for bit
* feat: support for cvt
* feat: add support for focalnet
* feat: support for yolos
* feat: support for glpn
* feat: support for imagegpt
* feat: support for levit
* feat: support for mgp_str
* feat: support for mobilenet_v1
* feat: support for mobilenet_v2
* feat: support for mobilevit
* feat: support for mobilevitv2
* feat: support for poolformer
* fix: copies
* fix: code quality check
* update: upstream changes from main
* fix: consistency check
* feat: support for sam
* feat: support for switchformer
* feat: support for swin
* feat: support for swinv2
* feat: support for timesformer
* feat: support for trocr
* feat: support for upernet
* fix: check copies
* update: rerun CI
* update: rerun again, maybe
* update: one more rerun
---------
Co-authored-by: Jacky Lee <jackylee328@gmail.com>
2024-04-19 09:24:44 +01:00
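Multi-device support here comes down to declaring _no_split_modules on each model class so that accelerate's device_map="auto" can shard the model across visible GPUs — usage is simply (sketch; requires accelerate installed):

```python
from transformers import AutoModelForImageClassification

# with _no_split_modules defined on the model class, accelerate can place
# whole blocks on different devices without cutting through a residual
model = AutoModelForImageClassification.from_pretrained(
    "microsoft/resnet-50", device_map="auto"
)
```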
Younes Belkada
5728b5ad00
FIX: Fixes unexpected behaviour for Llava / LLama & AWQ Fused modules + revert #30070 at the same time ( #30317 )
...
* Update awq.py
* style
* revert felix PR
* fix
* add felix comments
2024-04-18 15:51:17 +02:00
Arthur
acab997bef
Revert "Re-enable SDPA's FA2 path ( #30070 )" ( #30314 )
...
* Revert "Re-enable SDPA's FA2 path (#30070 )"
This reverts commit 05bdef16b6.
* Revert "Fix quality Olmo + SDPA (#30302 )"
This reverts commit ec92f983af.
2024-04-18 14:09:52 +02:00
fxmarty
9459efb807
Add atol for sliding window test ( #30303 )
...
atol for sliding window test
2024-04-18 17:08:34 +08:00
fxmarty
05bdef16b6
Re-enable SDPA's FA2 path ( #30070 )
...
* tentatively re-enable FA2 + SDPA
* better comment
* _ignore_causal_mask_sdpa as staticmethod
* type hints
* use past_seen_tokens instead
* enable copied from for sdpa
* ruff
* llama simplifications on review
* remove unnecessary self.is_causal check
* fix copies
* cleaning
* precise message
* better doc
* add test
* simplify
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* style
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-04-18 04:21:00 +08:00
fxmarty
40eb6d6c5f
Fix SDPA sliding window compatibility ( #30127 )
...
* fix sdpa + sliding window
* give credit
Co-authored-by: ehuaa <ehuamail@163.com>
* remove unnecessary warning
* fix typo
* add test
---------
Co-authored-by: ehuaa <ehuamail@163.com>
2024-04-17 17:21:26 +08:00
amyeroberts
6b78360e6d
Add Idefics2 ( #30253 )
...
* Initial add model additions
* Test
* All weights loading
* Can perform full forward pass
* Local and remote the same
* Matching local and remote
* Fixup
* Idefics2Model importable; fixup docstrings
* Don't skip by default
* Remove deprecated use_resampler arg
* Remove self.config
* DecoupledLinear takes config
* Tidy up
* Enable eager attention and tidy up
* Most tests passing
* Update for batch of processed images
* Add image processor
* Update doc pages
* Update conversion script
* Remove erroneous breakpoint
* Remove accidental spelling change
* Update to reflect changes on hub - make generate work
* Fix up
* Image processor tests
* Update tests
* Add a processor
* Add a processor
* Update convert script
* Update modeling file - remove fixmes
* Bug fix
* Add processing test
* Use processor
* Fix up
* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Fix test
* Update config - PR comments and defaults align with checkpoint
* Reviewer comments
* Add copied froms for flash attention
* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove qk_layer_norm and freeze_layers functionality
* Fix
* Remove freeze_layer options from config
* Sync with upstream main
* Fix attention shapes siglip
* Remove Llava-next refs - TO REBASE
* Use AutoModel for text model
* Add comment to explain vision embeddings
* Fix issue with tie_word_embeddings
* Address review comments
* Fix and fix up
* Chat templates for idefics
* Fix copies
* Fix
* Add layer norms to FA2
* Fix tests
* Apply suggestions from code review
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Fix
* Review comments
* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update inputs merger
* Merge weights in correct order
* Update convert script
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update template
* Model code examples (fix idefics too)
* More review comments
* Tidy up
* Update processing
* Fix attention mask preparation
* Update inputs_merger inputs
* Vectorize inputs_merger
* Update src/transformers/models/idefics2/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/idefics2/modeling_idefics2.py
* Review comments
* saying bye to the `qk_layer_norms`
* Simplify
* Update latents
* Remove erroneous readme changes
* Return images when applying chat template
* Fix bug - prompt images are for a single sample
* Update src/transformers/models/idefics2/modeling_idefics2.py
* image splitting
* fix test
* some more comment
* some comment
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/idefics2/image_processing_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update processor
* Update model tests
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Don't add BOS in template
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Remove index in examples
* Update tests to reflect #13
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* PR comment - consistent typing
* Update readme and model doc
* Update docs
* Update checkpoint references
* Update examples
* Fix and update tests
* Small addition
* Update tests - remove copied from as no ignore placement copy could be found
* Update example
* small fixes
* Update docs/source/en/model_doc/idefics2.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update docs/source/en/model_doc/idefics2.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update README.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Connector model as bridge
* Fix up
* Fix up
* Don't pass model inputs for generation kwargs update
* IDEFICS-2 -> Idefics2
* Remove config archive name
* IDEFICS-2 -> Idefics2
* Add back llava-next
* Update readmes
* Add requirements for processor tester
* Use custom convert_to_rgb to avoid possible BC
* Fix doc example
* Fix doc example
* Skip model doc tests - as model too large
* More doc example - account for image splitting
* Update src/transformers/image_transforms.py
* Fix config doctest
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-04-15 17:03:03 +01:00
Yih-Dar
08a194fcd6
Fix slow tests for important models to be compatible with A10 runners ( #29905 )
...
* fix mistral and mixtral
* add pdb
* fix mixtral test
* fix
* fix mistral ?
* add fix gemma
* fix mistral
* fix
* test
* another test
* fix
* fix
* fix mistral tests
* fix them again
* final fixes for mistral
* fix padding right
* fix whisper fa2
* fix
* fix
* fix gemma
* test
* fix llama
* fix
* fix
* fix llama gemma
* add class attribute
* fix CI
* clarify whisper
* compute_capability
* rename names in some comments
* Add # fmt: skip
* make style
* Update tests/models/mistral/test_modeling_mistral.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update
* update
---------
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-04-09 13:28:54 +02:00
Yoach Lacombe
569f6c7d43
Fix FA2 tests ( #29909 )
...
* fix FA2 tests
* refactor inference test name
2024-04-01 07:51:00 +00:00
Joao Gante
248d5d23a2
Tests: replace torch.testing.assert_allclose by torch.testing.assert_close ( #29915 )
...
* replace torch.testing.assert_allclose by torch.testing.assert_close
* missing atol rtol
2024-03-28 09:53:31 +00:00
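torch.testing.assert_allclose has been deprecated in favor of assert_close; note that assert_close requires rtol and atol to be supplied together (or both omitted for dtype-based defaults), which is what the "missing atol rtol" follow-up addressed:

```python
import torch

actual = torch.tensor([1.0, 2.0])
expected = torch.tensor([1.0, 2.0 + 1e-6])

# both tolerances or neither; passing only one raises an error
torch.testing.assert_close(actual, expected, rtol=1e-4, atol=1e-4)
```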
NielsRogge
776c9d3af8
[Tests] Remove unused code ( #29737 )
...
Remove unused code
2024-03-20 13:26:00 +01:00
Raushan Turganbay
5ac264d8a8
Fix batching tests for new models (Mamba and SegGPT) ( #29633 )
...
* fix batching tests for new models
* Update tests/models/seggpt/test_modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-03-13 17:52:49 +00:00
Raushan Turganbay
8e64ba2890
Add tests for batching support ( #29297 )
...
* add tests for batching support
* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/test_modeling_common.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/test_modeling_common.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update tests/test_modeling_common.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* fixes and comments
* use cosine distance for conv models
* skip mra model testing
* Update tests/models/vilt/test_modeling_vilt.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* finalize and make style
* check model type by input names
* Update tests/models/vilt/test_modeling_vilt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fixed batch size for all testers
* Revert "fixed batch size for all testers"
This reverts commit 525f3a0a05.
* add batch_size for all testers
* dict from model output
* do not skip layoutlm
* bring back some code from git revert
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* clean-up
* where did minus go in tolerance
* make whisper happy
* deal with consequences of losing minus
* deal with consequences of losing minus
* maskformer needs its own test for happiness
* fix more models
* tag flaky CV models from Amy's approval
* make codestyle
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-03-12 17:46:19 +00:00
Eduardo Pacheco
3fcfbe7549
Adding SegGPT ( #27735 )
...
* First commit
* Improvements
* More improvements
* Converted original checkpoint to HF checkpoint
* Fix style
* Fixed forward
* More improvements
* More improvements
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Remove asserts
* Remove unnecessary attributes
* Changed model name to camel case
* Improve forward doc
* Improve tests
* More improvements
* Fix copies
* Fix doc
* Make SegGptImageProcessor more flexible
* Added few-shot test
* Fix style
* Update READMEs and docs
* Update READMEs
* Make inputs required
* Add SegGptForImageSegmentation
* Make tests pass
* Rename to out_indicies
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Fixed naming convention
* Copying SegGptMlp from modeling_sam.py
* Some minor improvements
* Remove mlp_ratio
* Fix docstrings
* Fixed docstring match
* Objects defined before use
* Storing only patch_size and beta for SegGptLoss
* removed _prepare_inputs method
* Removed modified from headers
* Renamed to output_indicies
* Removed unnecessary einsums
* Update tests/models/seggpt/test_modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/seggpt/test_modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/seggpt/test_modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fixing issues
* Raise error as soon as possible
* More fixes
* Fix merge
* Added palette to SegGptImageProcessor
* Fixed typo
* Fixed shape typo
* Added permute before doing palette to class mapping
* Fixed style
* Fixed and added tests
* Fixed docstrings
* Matching SegFormer API for post_process_semantic_segmentation
* Fixed copies
* Fixed SegGptImageProcessor to handle both binary and RGB masks
* Updated docstrings of SegGptImageProcessor
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/seggpt.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/configuration_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/convert_seggpt_to_hf.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/seggpt/test_image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/seggpt/test_modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Object definitions above & fix style
* Renamed output_indices to intermediate_feature_indices
* Removed unnecessary check on bool_masked_pos
* Loss first in the outputs
* Added validation for do_normalize
* Improved SegGptImageProcessor and added new tests
* Added comment
* Added docstrings to SegGptLoss
* Reimplemented ensemble condition logic in SegGptEncoder
* Update src/transformers/models/seggpt/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/seggpt/convert_seggpt_to_hf.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/seggpt/configuration_seggpt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Updated docstrings to use post_process_semantic_segmentation
* Fixed typo on docstrings
* moved pixel values test to test_image_processing_seggpt
* Addressed comments
* Update src/transformers/models/seggpt/configuration_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/image_processing_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/configuration_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Updated docstrings for SegGptLoss
* Address comments
* Added SegGpt example to model docs
* Update src/transformers/models/seggpt/modeling_seggpt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* moved patchify and unpatchify
* Rename checkpoint
* Renamed intermediate_features to intermediate_hidden_states for consistency
* Update src/transformers/models/seggpt/configuration_seggpt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Replaced post_process_masks for post_process_semantic_segmentation in the docs
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-26 18:17:19 +00:00
Merve Noyan
7c4995f93d
Add feature extraction mapping for automatic metadata update ( #28944 )
...
* add feature extraction mapping
* added prefix
* ruff check
* minor fix
* Update modeling_auto.py
* fix typo
* remove prefix to make variable public/importable
* Update src/transformers/models/auto/modeling_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fixes
* addressed comments
* nit
* fix-copies
* remove from tests
* this should fix
* Update tests/models/convnextv2/test_modeling_convnextv2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* nits
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-02-26 10:35:37 +00:00
amyeroberts
0996a10077
Revert low cpu mem tie weights ( #29135 )
...
* Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948 )"
This reverts commit 725f4ad1cc.
* Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043 )"
This reverts commit 4156f517ce.
2024-02-20 12:06:46 +00:00
JB (Don)
725f4ad1cc
Add tie_weights() to LM heads and set bias in set_output_embeddings() ( #28948 )
...
* Add tie_weights() to LM heads and set bias in set_output_embeddings()
The biases were not tied correctly in some LM heads; this change fixes that.
* Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin
* Adding _tie_weights() to MPNet and Vilt
* Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device
* Rename test name to save_load to match the convention
2024-02-14 20:39:01 +00:00
Joao Gante
e30bbb2685
Tests: tag test_save_load_fast_init_from_base as flaky ( #28930 )
2024-02-12 14:43:34 +00:00
fxmarty
709dc43239
Fix symbolic_trace with kv cache ( #28724 )
...
* fix symbolic_trace with kv cache
* comment & better test
2024-02-01 09:45:02 +01:00
tom-p-reichel
ae0c27adfa
don't initialize the output embeddings if we're going to tie them to input embeddings ( #28192 )
...
* test that tied output embeddings aren't initialized on load
* don't initialize the output embeddings if we're going to tie them to the input embeddings
2024-01-31 02:19:18 +01:00
fxmarty
2c1eebc121
Fix SDPA tests ( #28552 )
...
* skip bf16 test if not supported by device
* fix
* fix bis
* use is_torch_bf16_available_on_device
* use is_torch_fp16_available_on_device
* fix & use public llama
* use 1b model
* fix flaky test
---------
Co-authored-by: Your Name <you@example.com>
2024-01-17 17:29:18 +01:00
fxmarty
a6adc05e6b
symbolic_trace: add past_key_values, llama, sdpa support ( #28447 )
...
* torch.fx: add pkv, llama, sdpa support
* Update src/transformers/models/opt/modeling_opt.py
* remove spaces
* trigger ci
* use explicit variable names
2024-01-17 11:50:53 +01:00
Weiming Zhao
701298d2d3
Use mmap option to load_state_dict ( #28331 )
...
Use mmap option to load_state_dict (#28331 )
2024-01-10 09:57:30 +01:00
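torch.load grew an mmap=True option (PyTorch ≥ 2.1) that maps checkpoint tensors from disk instead of copying the whole file into RAM first — a sketch of the idea (file name illustrative; assign=True keeps the mapped storages when loading):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)
torch.save(model.state_dict(), "checkpoint.pt")

# mmap=True avoids reading the whole file into memory up front;
# assign=True re-uses the mapped tensors instead of copying them
# into the module's existing parameters
state_dict = torch.load("checkpoint.pt", mmap=True, weights_only=True)
model.load_state_dict(state_dict, assign=True)
```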
Yih-Dar
7938c8c836
Fix weights not properly initialized due to shape mismatch ( #28122 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-20 14:20:02 +01:00