Raushan Turganbay
43df47d8e7
Llava Onevision: add model ( #32673 )
* working version
* fix copies
* update
* tests
* update docs
* codestyle
* add more tests
* add returns for docs
* clean up
* Update src/transformers/models/llava_onevision/processing_llava_onevision.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* updates
* codestyle
* style
* shouldn't be reversed
* [run-slow] llava_onevision
* [run-slow] llava_onevision
* add pooling in videos
* [run-slow] llava_onevision
* num-logits-to-keep
* [run-slow] llava_onevision
* [run-slow] llava_onevision
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* video matched orig impl
* fix tests
* chat template was modified
* Update docs/source/en/model_doc/llava_onevision.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add more info in the doc page
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-05 14:43:20 +05:00
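As a usage aside, a minimal single-image inference sketch for the model this PR adds; the checkpoint id, prompt handling, and dtype choices are assumptions following the usual llava-hf conventions, not taken from the commits above.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"  # assumed checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# The chat template (which the commits above note was modified) inserts the
# image placeholder tokens into the prompt.
conversation = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "What is shown in this image?"},
]}]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

inputs = processor(images=Image.open("example.jpg"), text=prompt,
                   return_tensors="pt").to(model.device, torch.float16)
out = model.generate(**inputs, max_new_tokens=50)
print(processor.decode(out[0], skip_special_tokens=True))
```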
Niklas Muennighoff
ecd61c6286
Add OLMoE ( #32406 )
* Add OLMoE
* Add OLMoE
* Updates
* Make norm optional; add keys
* Add output
* Add
* Fix dtype
* Fix eos config
* Update
* Add OLMoE
* Fix OLMoE path
* Format
* Format
* Rmv copy statement
* Rmv copy statement
* Format
* Add copies
* Cp rotary
* Fix naming
* Fix naming
* Update RoPE integration; num_logits_to_keep; Add copy statements
* Add eps to config
* Format
* Add aux loss
* Adapt router_aux_loss_coef
* Update md
* Adapt
* adapt tests
2024-09-03 18:43:12 +02:00
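A sketch of the router auxiliary loss mentioned in the commits above ("Add aux loss", "Adapt router_aux_loss_coef"); the checkpoint id is an assumption, and the behavior shown follows the Mixtral-style MoE convention.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMoE-1B-7B-0924"  # assumed checkpoint id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tok("Mixture-of-experts models route tokens to", return_tensors="pt")
# With output_router_logits=True, the load-balancing auxiliary loss is
# computed and, scaled by config.router_aux_loss_coef, added to the LM loss.
out = model(**inputs, labels=inputs["input_ids"], output_router_logits=True)
print(out.loss, out.aux_loss)
```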
Omar Salman
03c12d0d63
Add sdpa support for Albert ( #32092 )
* Add sdpa support for Albert
* [run_slow] albert
* Add benchmarks and PR suggestion
* Fix quality
* Fix
* [run_slow] albert
2024-09-03 14:01:00 +01:00
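The new attention path is opt-in; a minimal sketch (the checkpoint id is the standard base ALBERT).

```python
from transformers import AlbertModel

# Request torch's scaled_dot_product_attention kernels; pass
# attn_implementation="eager" instead to compare against the original path.
model = AlbertModel.from_pretrained("albert/albert-base-v2", attn_implementation="sdpa")
```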
JB (Don)
f1a385b1de
[RoBERTa-based] Add support for sdpa ( #30510 )
* Adding SDPA support for RoBERTa-based models
* add not is_cross_attention
* fix copies
* fix test
* add minimal test for camembert and xlm_roberta as their test class does not inherit from ModelTesterMixin
* address some review comments
* use copied from
* style
* consistency
* fix lists
---------
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-28 10:26:00 +02:00
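A sketch of the kind of eager-vs-SDPA equivalence check the added tests perform; the tolerance is an illustrative assumption.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
inputs = tok("SDPA outputs should match eager attention.", return_tensors="pt")

eager = AutoModel.from_pretrained("roberta-base", attn_implementation="eager").eval()
sdpa = AutoModel.from_pretrained("roberta-base", attn_implementation="sdpa").eval()

with torch.no_grad():
    out_eager = eager(**inputs).last_hidden_state
    out_sdpa = sdpa(**inputs).last_hidden_state

assert torch.allclose(out_eager, out_sdpa, atol=1e-5)  # illustrative tolerance
```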
Mayank Mishra
c35d2ccf5a
Granite language models ( #31502 )
* first commit
* drop tokenizer
* drop tokenizer
* drop tokenizer
* drop convert
* granite
* drop tokenization test
* mup
* fix
* reformat
* reformat
* reformat
* fix docs
* stop checking for checkpoint
* update support
* attention multiplier
* update model
* tiny drop
* saibo drop
* skip test
* fix test
* fix test
* drop
* drop useless imports
* update docs
* drop flash function
* copied from
* drop pretraining tp
* drop pretraining tp
* drop pretraining tp
* drop unused import
* drop code path
* change name
* softmax scale
* head dim
* drop legacy cache
* rename params
* cleanup
* fix copies
* comments
* add back legacy cache
* multipliers
* multipliers
* multipliers
* text fix
* fix copies
* merge
* multipliers
* attention multiplier
* drop unused imports
* fix
* fix
* fix
* move rope?
* Update src/transformers/models/granite/configuration_granite.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* Update src/transformers/models/granite/modeling_granite.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* fix
* fix
* fix
* fix-copies
* torch rmsnorm
* add authors
* change model path
* fix
* test
* drop static cache test
* update readme
* drop non-causal
* readme
* drop useless imports
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-27 21:27:21 +02:00
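The "multipliers" the commits keep referring to are mup-style scaling factors that live on the config; a quick inspection sketch (the attribute names are believed to match the released GraniteConfig, but treat them as an assumption).

```python
from transformers import GraniteConfig

config = GraniteConfig()  # library defaults
# Assumed attribute names for the mup-style scaling factors:
print(config.embedding_multiplier, config.attention_multiplier,
      config.residual_multiplier, config.logits_scaling)
```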
Shijie
19e6e80e10
support qwen2-vl ( #32318 )
* support-qwen2-vl
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* tidy
* hyphen->underscore
* make style
* add-flash2-tipd
* delete-tokenize=False
* remove-image_processor-in-init-file
* add-qwen2_vl-in-MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES
* format-doc
* support-Qwen2VLVisionConfig
* remove-standardize_cache_format
* fix-letter-variables
* remove-torch-in-image-processor
* remove-useless-docstring
* fix-one-letter-variable-name
* change-block-name
* default-quick-gelu-in-vision
* remove-useless-doc
* use-preimplemented-flash-forward
* fix-doc
* fix-image-processing-doc
* fix-apply-rotary-embed
* fix-flash-attn-sliding-window
* refactor
* remove-default_template
* remove-reorder_cache
* simple-get-rope_deltas
* update-prepare_inputs_for_generation
* update-attention-mask
* update-rotary_seq_len
* remove-state
* kv_seq_length
* remove-warning
* _supports_static_cache
* remove-legacy-cache
* refactor
* fix-replace
* mrope-section-doc
* code-quality
* code-quality
* polish-doc
* fix-image-processing-test
* update readme
* Update qwen2_vl.md
* fix-test
* Update qwen2_vl.md
* nit
* processor-kwargs
* hard-code-norm_layer
* code-quality
* discard-pixel-values-in-gen
* fix-inconsistent-error-msg
* unify-image-video
* hidden_act
* add-docstring
* vision-encode-as-PreTrainedModel
* pixel-to-target-dtype
* update doc and low-memory vit
* format
* format
* channel-format
* fix vit_flashatt
* format
* inherit-Qwen2VLPreTrainedModel
* simplify
* format-test
* remove-one-line-func-in-image-processing
* avoid-one-line-reshape
* simplify-rotary_seq_len
* avoid-single-letter-variable
* no-for-loop-sdpa
* avoid-single-letter-variable
* remove-one-line-reshape
* remove-one-line-reshape
* remove-no-rope-in-vit-logic
* default-mrope
* add-copied-from
* more-docs-for-mrope
* polish-doc
* comment-and-link
* polish-doc
* single-letter-variables
* simplify-image-processing
* video->images
* kv_seq_len-update
* vision-rope-on-the-fly
* vision-eager-attention
* change-processor-order
---------
Co-authored-by: baishuai <baishuai.bs@alibaba-inc.com>
Co-authored-by: ShuaiBai623 <43326198+ShuaiBai623@users.noreply.github.com>
2024-08-26 15:16:44 +02:00
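A minimal single-image inference sketch for the added model; the checkpoint id and chat-template usage are assumptions following the Qwen2-VL conventions.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"  # assumed checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image."},
]}]
text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[text], images=[Image.open("example.jpg")],
                   return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```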
Yunfei Chu
16ed0640be
Add Qwen2-Audio ( #32137 )
* add qwen2audio
* Update check_repo.py
* fix style
* fix test
* fix style
* add model size
* Qwen2AudioEncoderModel->Qwen2AudioEncoder; add copy info
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* switch the attention_mask and the feature_attention_mask
* add to PRIVATE_MODELS in check_repo.py; add to MODEL_NAMES_TO_IGNORE in check_table.py
* fix initialization
* update chat_template
* fix consistency issue after copy
* add docstrings to _merge_input_ids_with_audio_features
* add copied from to prepare_inputs_for_generation
* add more details to docs
* rm comment
* add init_std
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* update
* Update docs/source/en/model_doc/qwen2_audio.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* update tests
* rm ignore_index
* update processor
* rm ffmpeg_read
* Update tests/models/qwen2_audio/test_modeling_qwen2_audio.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2_audio.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2_audio.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2_audio.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* update
* typo
* [run_slow] qwen2_audio
* [run_slow] qwen2_audio
* [run_slow] qwen2_audio
* fix quality
* [run_slow] qwen2_audio
* [run_slow] qwen2_audio
* [run_slow] qwen2_audio
* add official model
---------
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-08 15:47:24 +02:00
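A minimal audio+text inference sketch; the checkpoint id and the librosa-based loading (the commits note ffmpeg_read was removed) are assumptions from the model card conventions.

```python
import librosa
from transformers import AutoProcessor, Qwen2AudioForConditionalGeneration

model_id = "Qwen/Qwen2-Audio-7B-Instruct"  # assumed checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2AudioForConditionalGeneration.from_pretrained(model_id, device_map="auto")

conversation = [{"role": "user", "content": [
    {"type": "audio", "audio_url": "speech.wav"},
    {"type": "text", "text": "What is being said?"},
]}]
text = processor.apply_chat_template(conversation, add_generation_prompt=True,
                                     tokenize=False)

# Resample to the feature extractor's expected rate (16 kHz, Whisper-style).
audio, _ = librosa.load("speech.wav", sr=processor.feature_extractor.sampling_rate)
inputs = processor(text=text, audios=[audio], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```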
Wonseok Lee (Jack)
e28784f821
Change Phi3 _supports_sdpa to True ( #32457 )
* Change `_supports_sdpa` to True
* add phi3 to sdpa support list
2024-08-08 13:28:20 +02:00
Ao Tang
6a03942db7
Add Nemotron HF Support ( #31699 )
* Add nemotron support
* fix inference
* add unit test
* add layernorm1p as a class to avoid meta device mismatch
* test fixed
* Add copied_from statements
* remove pretraining_tp args
* remove nemotronlayernorm
* force LN computation done in FP32
* remove nemotrontokenizer and use llamatokenizer
* license update
* add option for kv_channels for minitron8b
* remove assert
* o_proj fixed
* o_proj reshape
* add gated_proj option
* typo
* remove todos
* fix broken test after merging latest main
* remove nezha/nat after merging main
* change default config to 15b model
* add nemo conversion script
* rename conversion script
* remove gate_proj option
* pr comment resolved
* fix unit test
* rename kv_channels to head_dim
* resolve PR issue
* add nemotron md
* fix broken tests
* refactor rope for nemotron
* test fix
* remove linearscaling
* whitespace and import
* fix some copied-from
* code style fix
* reformatted
* add position_embedding to nemotronattention
* rope refactor to only use config, copied-from fix
* format
* Run make fix-copies
* nemotron md with autodoc
* doc fix
* fix order
* pass check_config_docstrings.py
* fix config_attributes
* remove all llama BC related code
* Use PreTrainedTokenizerFast
* ruff check examples
* conversion script update
* add nemotron to toctree
2024-08-06 15:42:05 +02:00
Pavel Iakubovskii
1c37e8c1a6
Add sdpa and FA2 for CLIP ( #31940 )
* Squashed commit of the following:
commit 102842cd477219b9f9bcb23a0bca3a8b92bd732f
Author: Pavel Iakubovskii <qubvel@gmail.com>
Date: Fri Jul 12 18:23:52 2024 +0000
Add model-specific sdpa tests
commit 60e4c88581abf89ec098da84ed8e92aa904c997d
Author: Pavel Iakubovskii <qubvel@gmail.com>
Date: Fri Jul 12 18:20:53 2024 +0000
Add fallback to eager (expensive operation)
commit c29033d30e7ffde4327e8a15cbbc6bee37546f80
Author: Pavel Iakubovskii <qubvel@gmail.com>
Date: Thu Jul 11 17:09:55 2024 +0000
Fix attn_implementation propagation
commit 783aed05f0f38cb2f99e758f81db6838ac55b9f8
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 09:05:27 2024 +0530
style
commit e77e703ca75d00447cda277eca6b886cd32bddc0
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 09:04:57 2024 +0530
add comment to explain why I had to touch forbidden codebase.
commit ab9d8849758e7773a31778ccba71588d18552623
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 09:03:02 2024 +0530
fix: flax attribute access.
commit c570fc0abf9d1bd58c291aae3c7e384f995996d2
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 08:23:54 2024 +0530
fix tensorflow attribute name.
commit 32c812871cfdb268d8a6e3e2c61c5c925c8ed47e
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 07:57:10 2024 +0530
fix attribute access.
commit 4f41a0138b6c417aed9c9332278f8bcd979cb7c2
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Sat May 25 07:44:02 2024 +0530
_from_config.
commit 35aed64ff602422adcf41d7f677a0a24bd9eccae
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 24 18:46:52 2024 +0530
propagation of attn_implementation.
commit 4c25c19845438b1dc1d35a5adf9436151c8c5940
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 24 09:24:36 2024 +0530
style again
commit 5f7dc5c5015c0f8116408f737e8c318d1802c80c
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 24 09:19:05 2024 +0530
use from_config.
commit b70c409956d0359fa6ae5372275d2a20ba7e3389
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 24 09:13:43 2024 +0530
quality
commit a7b63beff53d0fc754c6564e2a7b51731ddee49d
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 14:35:10 2024 +0200
add benchmark numbers
commit 455b0eaea50862b8458c8f422b60fe60ae40fdcb
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 13:50:16 2024 +0200
Revert "reflect feedback more"
This reverts commit dc123e71ef.
commit ca674829d28787349c2a9593a14e0f1d41f04ea4
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 13:50:05 2024 +0200
Revert "fix"
This reverts commit 37a1cb35b8.
commit fab2dd8576c099eb1a3464958cb206a664d28247
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 13:47:46 2024 +0200
fix
commit fbc6ae50fd6f2d36294d31e191761631b701d696
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 10 13:38:30 2024 +0200
reflect feedback more
commit 87245bb020b2d60a89afe318a951df0159404fc9
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 08:54:34 2024 +0530
fixes
commit 1057cc26390ee839251e7f8b3326c4207595fb23
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 07:49:03 2024 +0530
don't explicit set attn_implementation in tests
commit e33f75916fc8a99f516b1cf449dbbe9d3aabda81
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 07:43:54 2024 +0530
explicitly override attn_implementation in the towers.
commit 4cf41cb1bc885c39df7cb8f2a0694ebf23299235
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 07:38:42 2024 +0530
import in one-line.
commit f2cc447ae9e74ccfacb448140cdf88259d4afc8c
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri May 3 07:34:58 2024 +0530
move sdpa mention to usage tips.
commit 92884766c64dbb456926a3a84dd427be1349fa95
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Mon Apr 29 10:58:26 2024 +0530
fix: memory allocation problem.
commit d7ffbbfe12f7750b7d0a361420f35c13e0ea787d
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Mon Apr 29 09:56:59 2024 +0530
fix-copies
commit 8dfc3731cedd02e36acd3fe56bb2e6d61efd25d8
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Fri Apr 26 20:16:12 2024 +0530
address arthur's comments.
commit d2ed7b4ce4ff15ae9aa4d3d0500f1544e3dcd9e9
Author: Sayak Paul <spsayakpaul@gmail.com>
Date: Fri Apr 26 20:08:15 2024 +0530
Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
commit 46e04361f37ded5c522ff05e9f725b9f82dce40e
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Wed Apr 24 09:55:27 2024 +0530
add to docs.
commit 831629158ad40d34d8983f209afb2740ba041af2
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Wed Apr 24 09:33:10 2024 +0530
styling
commit d263a119c77314250f4b4c8469caf42559197f22
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Wed Apr 24 09:15:20 2024 +0530
up
commit d44f9d3d7633d4c241a737a1bc317f791f6aedb3
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Tue Apr 23 18:40:42 2024 +0530
handle causal and attention mask
commit 122f1d60153df6666b634a94e38d073f3f260926
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Tue Apr 23 15:18:21 2024 +0530
test fixes.
commit 4382d8cff6fa1dee5dbcf0d06b3e2841231e36f5
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Tue Apr 23 09:39:25 2024 +0530
fix: scaling inside sdpa.
commit 0f629989efc48b7315cf19405a81e02955efe7e5
Author: Sayak Paul <spsayakpaul@gmail.com>
Date: Tue Apr 23 08:14:58 2024 +0530
Update src/transformers/models/clip/modeling_clip.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
commit 14367316877dc27ea40f767ad1aee38bbc97e4ce
Author: sayakpaul <spsayakpaul@gmail.com>
Date: Mon Apr 22 16:21:36 2024 +0530
add: sdpa support to clip.
* Remove fallback for empty attention mask (expensive operation)
* Fix typing in copies
* Add flash attention
* Add flash attention tests
* List CLIP in FA docs
* Fix embeddings attributes and tf
* [run-slow] clip
* Update clip documentation
* Remove commented code, skip compile dynamic for CLIPModel
* Fix doc
* Fix doc 2
* Remove double transpose
* Add torch version check for contiguous()
* Add comment to test mixin
* Fix copies
* Add comment for mask
* Update docs
* [run-slow] clip
2024-07-18 10:30:37 +05:30
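Both new backends are opt-in at load time; a short sketch (fp16 and an installed flash-attn package are assumed for the FA2 path).

```python
import torch
from transformers import CLIPModel

model = CLIPModel.from_pretrained(
    "openai/clip-vit-base-patch32",
    attn_implementation="flash_attention_2",  # or "sdpa"
    torch_dtype=torch.float16,                # FA2 requires half precision
    device_map="cuda",
)
```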
Raushan Turganbay
24cfcc2114
Chameleon: add model ( #31534 )
* Chameleon model integration
Co-authored-by: Jacob Kahn <jacobkahn1@gmail.com>
Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com>
* fix 7B, again. mask away image tokens
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove pretrained_config_map
* make fixup passing up to utils/check_config_docstrings.py; vqgan moved to the modeling file
* remove tokenizer (use llama's); remove codechameleon tests
* a few copied from statements and minor changes
* copied from in ChameleonModel
* some copies in ChameleonForCausalLM
* a few more copies
* VQModel moved to ChameleonModel (as opposed to being in the processor)
* ChameleonProcessor ready
* Fix chameleon weights convert
* update conversion script
* clean-up processing
* update modeling a bit
* update
* update (throws error...)
* correct conversion ready
* fix tests
* fix docs
* docs
* ve swin norm
* fix device for vocab map
* add normalization
* update
* update script with rope rotations
* final fix on model conversion
* add slow tests
* more info in docs
* fix repo consistency tests
* fix repo tests
* fix-copies
* hope this will make CI happy
* fix for 30b model
* Update docs/source/en/index.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/processing_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address comments
* remove assertion in conversion script
* add image processor test
* not copied
* port changes for qk layernorm
* fix-copies
* read token decorator for tests
* [run-slow] chameleon
* one more read-token
* address some comments
* qk norm changes
* tests and repo check
* moved rope permutations to conversion, YAY!
* fix past kv check
* docs
* layernorm done!
* let's be consistent in naming
* fix slow tests
* weird thing with slow CI, but let's see
* once more try
* remove past-kv as tuple following llama
* ignore
* style
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: jacobkahn <jacobkahn1@gmail.com>
Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com>
Co-authored-by: Leonid Shamis <lshamis@meta.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-17 10:41:43 +05:00
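A minimal interleaved image-text generation sketch for the added model; the checkpoint id and the trailing <image> placeholder are assumptions following the usual Chameleon docs.

```python
import torch
from PIL import Image
from transformers import ChameleonForConditionalGeneration, ChameleonProcessor

model_id = "facebook/chameleon-7b"  # assumed checkpoint id
processor = ChameleonProcessor.from_pretrained(model_id)
model = ChameleonForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "What do you see in this picture?<image>"
inputs = processor(text=prompt, images=Image.open("example.jpg"),
                   return_tensors="pt").to(model.device, dtype=torch.bfloat16)
out = model.generate(**inputs, max_new_tokens=50)
print(processor.decode(out[0], skip_special_tokens=True))
```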
Pavel Iakubovskii
a177821b24
Add FA2 and sdpa support for SigLIP ( #31499 )
* Rebase to main
* Fix attention implementation autoset for tex and vision configs
* Fixup
* Minor fixes
* Fix copies
* Fix attention_mask for FA2
* Add eqvivalence tests for siglip
* Remove right padding test
* Uncomment flaky
* Fix import
* Add to docs
* Fix test message
* Add sdpa
* Add sdpa equivalence test
* Add siglip sdpa to docs
* Fix typing for attention output
* Add sdpa tests
* Fix signature of FA2
* Autoset attn_implementation in config
* Rename bsz -> batch_size
* Move back autoset attn method
* Mark as flaky
* Correct attention mask padding
* [run-slow] siglip
* Add FA2 and sdpa docs
* Style fix
* Remove flaky for FA2 test
* Change attention implementation set
* Change attn_implementation propagation
* Fix typos
* Add modality to assert message
* Add more sdpa backends in test
* [run slow] siglip
* Add math sdpa backend for all options
* [run slow] siglip
2024-07-08 11:10:02 +01:00
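The commits above mention exercising several SDPA backends in the tests; a sketch of pinning one backend at inference time (the torch.nn.attention API requires a recent torch and is an assumption here).

```python
import torch
from PIL import Image
from torch.nn.attention import SDPBackend, sdpa_kernel
from transformers import AutoProcessor, SiglipModel

model_id = "google/siglip-base-patch16-224"
model = SiglipModel.from_pretrained(model_id, attn_implementation="sdpa")
processor = AutoProcessor.from_pretrained(model_id)

inputs = processor(text=["a photo of a cat"], images=Image.open("example.jpg"),
                   padding="max_length", return_tensors="pt")
with torch.no_grad(), sdpa_kernel(SDPBackend.MATH):  # pin the math backend, as the tests do
    out = model(**inputs)
print(out.logits_per_image)
```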
Arthur
75a6319864
Fix post gemma merge ( #31660 )
* nit
* toctree issue
* protect gemma2 tests as well
* sdpa supported
2024-06-27 17:51:42 +02:00
Raushan Turganbay
e71f2863d7
Add LLaVa NeXT Video ( #31252 )
* squash into single commit
* run diff once more
* docstring
* tests
* minor changes and ready to go
* Update src/transformers/models/llava_next_video/processing_llava_next_video.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/vipllava/test_modeling_vipllava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* [run-slow] llava-next-video
* [run-slow] llava-next-video
* [run-slow] llava_next_video
* fix two tests
* fix slow tests
* remove logit checks due to numeric errors
* run test once more
* [run-slow] llava_next_video
* final try to pass the test
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* style
* fix
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-06-26 21:52:28 +05:00
Anton Vlasjuk
b07770c5eb
[GPT-NeoX] Add SDPA support ( #31031 )
* starting support for sdpa in `gptneox` models
* small comment on tests
* fix dropout
* documentation and style
* clarify concrete paths for reference
* generalise attn projections and rope application
added head mask check to sdpa mask creation
handle sdpa memory backend bug via own version flag
* update docs and style
* move dtype casting outside of general attn_projection_and_rope function
fix flash_attn_2 stuff
* more generic attn warning if output_attns or head_mask
* simplify head mask check by moving head mask creation to a later point
* remove copied llama artifact
* remove padding_mask from attention function signature
* removing unnecessary comments, only "save" attn implementation once
* [run_slow] gpt_neox
2024-06-26 13:56:36 +01:00
Anton Vlasjuk
b275a41005
[GPT2] Add SDPA support ( #31172 )
* `gpt2` sdpa support
* fix (at least) one test, style, repo consistency
* fix sdpa mask in forward --> fixes generation
* test
* test2
* test3
* test4
* simplify shapes for attn mask creation and small comments
* hub fail test
* benchmarks
* flash attn 2 mask should not be inverted on enc-dec setup
* fix comment
* apply some suggestion from code review
- only save _attn_implentation once
- remove unnecessary comment
* change elif logic
* [run-slow] gpt2
* modify `test_gpt2_sample_max_time` to follow previous assertion patterns
2024-06-19 09:40:57 +02:00
Younes Belkada
f5590deaa8
Docs / Quantization: Replace all occurrences of load_in_8bit with bnb config ( #31136 )
Replace all occurrences of `load_in_8bit` with bnb config
2024-05-30 16:47:35 +02:00
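The pattern the docs now standardize on, for reference: pass a BitsAndBytesConfig instead of the bare load_in_8bit flag (the model id here is illustrative).

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",                      # illustrative model id
    quantization_config=quantization_config,  # replaces load_in_8bit=True
    device_map="auto",
)
```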
hyenal
1c21f48a50
add sdpa to ViT [follow up of #29325 ] ( #30555 )
remove blank line (+1 squashed commit)
Squashed commits:
[24ccd2061] [run-slow]vit_msn,vision_encoder_decoder (+24 squashed commits)
Squashed commits:
[08bd27e7a] [run-slow]vit_msn,vision_encoder_decoder
[ec96a8db3] [run-slow]vit_msn
[ead817eca] fix vit msn multi gpu
[d12cdc8fd] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[3fdbfa88f] doc
[a3ff33e4a] finish implementation
[e20b7b7fb] Update test_modeling_common.py
[e290c5810] Update test_modeling_flax_common.py
[d3af86f46] comment
[ff7dd32d8] more comments
[59b137889] suggestion
[7e2ba6d67] attn_implementation as attribute of the class
[fe66ab71f] minor
[38642b568] Apply suggestions from code review
Accept comments
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[22cde7d52] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[48e137cc6] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[99f4c679f] Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[96cf20a6d] Update src/transformers/models/vit_msn/modeling_vit_msn.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[c59377d23] Update src/transformers/models/vit_mae/modeling_vit_mae.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[b70a47259] Update tests/models/vision_text_dual_encoder/test_modeling_vision_text_dual_encoder.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
[00c84d216] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[61f00ebb0] all tests are passing locally
[e9e0b82b7] vision encoder/decoder
[4d5076b56] test-vision (+20 squashed commits)
Squashed commits:
[d1add8db9] yolo
[9fde65716] fix flax
[986566c28] minor
[ca2f21d1f] vit
[3333efd7a] easy models change
[ebfc21402] [run-slow]audio_spectrogram_transformer,deit,vision_encoder_decoder,vision_text_dual_encoder,vit,vit_hybrid,vit_mae,vit_msn,videomae,yolos
[b8b8603ed] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[48ecc7e26] all tests are passing locally
[bff7fc366] minor
[62f88306f] fix yolo and text_encoder tests
[121507555] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1064cae0a] [run-slow]vision_encoder_decoder,vision_text_dual_encoder,yolos
[b7f52ff3a] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[cffaa10dd] fix-copies
[ef6c511c4] test vit hybrid
[7d4ba8644] vit hybrid
[66f919033] [run-slow]audio_spectrogram_transformer,deit,vit,vit_hybrid,vit_mae,vit_msn,videomae
[1fcc0a031] fixes
[cfde6eb21] fixup
[e77df1ed3] all except yolo end encoder decoder (+17 squashed commits)
Squashed commits:
[602913e22] vit + vit_mae are working
[547f6c4cc] RUN_SLOW=1 pytest tests/models/audio_spectrogram_transformer/ tests/models/deit/ tests/models/videomae/ passes
[61a97dfa9] it's the complete opposite...
[aefab37d4] fix more tests
[71802a1b9] fix all torch tests
[40b12eb58] encoder - decoder tests
[941552b69] slow decorator where appropriate
[14d055d80] has_attentions to yolo and msn
[3381fa19f] add correct name
[e261316a7] repo consistency
[31c6d0c08] fixup
[9d214276c] minor fix
[11ed2e1b7] chore
[eca6644c4] add sdpa to vit-based models
[cffbf390b] make fix-copies result
[6468319b0] fix style
[d324cd02a] add sdpa for vit
Co-authored-by: Liubov Yaronskaya <luba.yaronskaya@gmail.com>
2024-05-16 10:56:11 +01:00
Raushan Turganbay
bd9f4d7951
Add Video Llava ( #29733 )
* add model draft
* update docstring
* add tests
* support image and video as input
* update for better handling of mixed input and clean-up a bit
* bug when mixed inputs & add tests
* Update README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Merge remote-tracking branch 'upstream/main' into video_llava
* link to abstract of paper in README
* fix test
* fix-copies
* make tests happy
* skip doctest for now
* do not run doctest for now
* Update src/transformers/models/video_llava/processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address review comments
* failing tests
* Fix vocab_size in common tests for VLMs
* codestyle
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/image_processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/video_llava.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/processing_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/video_llava/test_modeling_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* PR suggestions
* fix-copies
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/video_llava/configuration_video_llava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add full example in docs
* clean-up with new model-id
* [run-slow] video_llava
* update docstring
* [run-slow] video_llava
* remove all achive maps
* fix some tests
* test was supposed to be skipped for llava :)
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-15 16:42:29 +05:00
Pablo Montalvo
1360801a69
Add PaliGemma ( #30814 )
* add new model like
* add state dict slicing + new model config
* update palma config and weights, passes vision activations
* fix
* update
* reorder loading/unpacking
* clean up
* add debug statements
* change device
* fix
* debugging
* fix noncausal mask
* fixup sdpa + causal mask
* fix activation function
* remove debug before changing modeling file
* add variants
* debug attention mask in generate
* revert to non-debug sdpa
* revert gemma modifications
* add custom language modeling
* use Processor
* add language modeling file to init
* try thin wrapper around generate
* Update
* update mask
* breakpoints galore
* remove conflict
* switch to left-padding
* add incomplete model doc
* add paligemma global files
* batch rename paligemma
* make generation match outputs and captioning
* style
* style
* remove copied from + doc
* remove more copied from
* remove copy from projector
* minor fix
* update config and style
* add readme - dummy
* CORRECT image captioning
* moving to args
* add siglip proper + fix merging image + text features
* take update_causal_mask from upstream
* remove breakpoint
* leverage AutoModel
* fix input_ids slicing
* make siglip head conditional
* remove encoder_decoder value
* remove unneeded modeling file
* add commented 4d attention mask
* FIXED generation with 4D mask
* Update src/transformers/models/siglip/modeling_siglip.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix left padding detection
* shuffle order of verifications
* fix missing labels for training
* fix
* vectorize merging of features, improve slicing
* improve testing before conversion
* handle merging in processor
* image token index depends on checkpoint
* add variants, save processor too
* save processors, base tokenizer off spm file
* expand model embeddings due to additional image token
* pass image processing args
* add convert rgb to siglip processor
* add \n token separately
* fix tokenizer and prompts
* fix docstrings
* change to camel
* fix casing
* debug pos_ids and sdpa
* pass and use cache_position
* add flag for newline tokenization
* Update src/transformers/models/paligemma/processing_paligemma.py
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
* simplify conversion script
* add copied from
* add precision to conversion script
* Update src/transformers/models/paligemma/modeling_paligemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* clean up
* Shift attention mask from `1:`
After discussion with @molbap
* add docs, fix quality
* quality, tied weights inheritance, and logits/label alignment
* fix more tests
* pass attn_implementation to language model correctly
* add SiglipVisionTransformer to no split modules
* skip paligemma test for sdpa dispatch to flash
* skip incompatible tests
* quality
* [broken archive maps]
* Apply suggestions
- remove archive lists
- style
- take shape of inputs_embeds for batch
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/utils/dummy_pt_objects.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* simplify conversion script
* add suggestions
* add suggestions
* add copied from
* fix
* move labels out
* revert
* fix
* remove placeholder labels if None
* use cache_position
* fix quality + docstrings
* fix quality
* fix paligemma 4d gemma mask incompatibility
* fix config docstring
* fix query and attn_mask dtype
---------
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-05-14 22:07:15 +02:00
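A minimal captioning sketch; the checkpoint id and the "caption en" task prefix are assumptions from the PaliGemma conventions (checkpoints are gated on the Hub).

```python
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"  # assumed checkpoint id
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id,
                                                          torch_dtype=torch.bfloat16)

inputs = processor(text="caption en", images=Image.open("example.jpg"),
                   return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
# Decode only the generated continuation, not the prompt tokens.
print(processor.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```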
Yikang Shen
ccdabc5642
Add JetMoE model ( #30005 )
* init jetmoe code
* update archive maps
* remove flax import
* fix import error
* update README
* ruff fix
* update readme
* fix
* update config
* fix issue
* merge files
* fix model bug
* fix test
* auto fix
* model size
* add comments
* fix form
* add flash attention support
* fix attention head number
* fix init
* fix support list
* sort auto mapping
* fix test
* fix docs
* update test
* fix test
* fix test
* change variable name
* fix config
* fix init
* update format
* clean code
* fix config
* fix config
* change default config
* update config
* fix issues
* update format
* update config argument
* update format
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* change to mixtral aux loss
* change to cache_position
* debug
* fix bugs
* debug
* fix format
* fix format
* fix copy
* fix format
* fix format
* fix sort
* fix sort
* fix sort
* add copy comment
* add copy from
* remove debug code
* revert readme update
* add copy
* debug
* remove debug code
* fix flash attention
* add comments
* clean code
* clean format
* fix format
* fix format
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* change variable name
* add copied from
* fix variable name
* remove deprecated functions
* sync to llama implementation
* fix format
* fix copy
* fix format
* update format
* remove repr
* add comment for moe weight
* fix copy
* Update src/transformers/models/jetmoe/configuration_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add comments and reformat config
* fix format
* fix format
* fix format
* update test
* update doc string in config
* Update src/transformers/models/jetmoe/modeling_jetmoe.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update config doc
* update attention cache
* fix format
* fix copy
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-05-14 16:32:01 +02:00
fxmarty
37bba2a32d
CI: update to ROCm 6.0.2 and test MI300 ( #30266 )
* update to ROCm 6.0.2 and test MI300
* add callers for mi300
* update dockerfile
* fix trainer tests
* remove apex
* style
* Update tests/trainer/test_trainer_seq2seq.py
* Update tests/trainer/test_trainer_seq2seq.py
* Update tests/trainer/test_trainer_seq2seq.py
* Update tests/trainer/test_trainer_seq2seq.py
* update to torch 2.3
* add workflow dispatch target
* we may need branches: mi300-ci after all
* nit
* fix docker build
* nit
* add check runner
* remove docker-gpu
* fix issues
* fix
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-13 18:14:36 +02:00
amyeroberts
e7d52a10d7
Fix GroundingDINO, DPR after BERT SDPA update ( #30506 )
Fix GroundingDINO, DPR after BERT SDPA update
2024-04-26 18:04:41 +01:00
JB (Don)
dfa7b580e9
[BERT] Add support for sdpa ( #28802 )
* Adding SDPA support for BERT
* Using the proper input name for testing model input in inference()
* Adding documentation for SDPA in BERT model page
* Use the stable link for the documentation
* Adding a gate to only call .contiguous() for torch < 2.2.0
* Additions and fixes to the documentation
* Minor updates to documentation
* Adding extra requirements needed for the contiguous() bug
* Adding "Adapted from" in plcae of the "Copied from"
* Add benchmark speedup tables to the documentation
* Minor fixes to the documentation
* Use ClapText as a replacement for Bert in the Copied-From
* Some more fixes for the fix-copies references
* Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage
[test all]
* Undo changes to separate test
* Refactored SDPA self attention code for KV projections
* Change use_sdpa to attn_implementation
* Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
2024-04-26 16:23:44 +01:00
Gustavo de Rosa
c9693db2fc
Phi-3 ( #30423 )
* chore(root): Initial commit of Phi-3 files.
* fix(root): Fixes Phi-3 missing on readme.
* fix(root): Ensures files are consistent.
* fix(phi3): Fixes unit tests.
* fix(tests): Fixes style of phi-3 test file.
* chore(tests): Adds integration tests for Phi-3.
* fix(phi3): Removes additional flash-attention usage, e.g., swiglu and rmsnorm.
* fix(phi3): Fixes incorrect docstrings.
* fix(phi3): Fixes docstring typos.
* fix(phi3): Adds support for Su and Yarn embeddings.
* fix(phi3): Improves according to the first batch of reviews.
* fix(phi3): Uses up_states instead of y in Phi3MLP.
* fix(phi3): Uses gemma rotary embedding to support torch.compile.
* fix(phi3): Improves how rotary embedding classes are defined.
* fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.
* fix(phi3): Adds last suggestions to modeling file.
* fix(phi3): Splits inv_freq calculation in two lines.
2024-04-24 17:32:09 +02:00
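A quick generation sketch for the added model family; the instruct checkpoint id is an assumption.

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="microsoft/Phi-3-mini-4k-instruct",
                torch_dtype="auto")  # assumed checkpoint id
print(pipe("Explain rotary position embeddings in one sentence:",
           max_new_tokens=40)[0]["generated_text"])
```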
Kamil Akesbi
569743f510
Add sdpa and fa2 to the Wav2vec2 family ( #30121 )
* add sdpa to wav2vec.
Co-authored-by: kamilakesbi <kamil@huggingface.co>
Co-authored-by: jp1924 <jp42maru@gmail.com>
* add fa2 to wav2vec2
* add tests
* fix attention_mask compatibility with fa2
* minor dtype fix
* replace fa2 slow test
* fix fa2 slow test
* apply code review + add fa2 batch test
* add sdpa and fa2 to hubert
* sdpa and fa2 to data2vec_audio
* sdpa and fa2 to Sew
* sdpa to unispeech + unispeech sat
* small fix
* attention mask in tests
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* add_speedup_benchmark_to_doc
---------
Co-authored-by: kamil@huggingface.co <kamil.akesbi@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2024-04-22 18:30:38 +01:00
Abhi Venigalla
005b957fb8
Add DBRX Model ( #29921 )
* wip
* fix __init__.py
* add docs
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address comments 1
* work on make fixup
* pass configs down
* add sdpa attention
* remove DbrxBlock
* add to configuration_auto
* docstring now passes formatting test
* fix style
* update READMEs
* add dbrx to modeling_auto
* make fix-copies generated this
* add DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP
* config docstring passes formatting test
* rename moe_loss_weight to router_aux_loss_coef
* add to flash-attn documentation
* fix model-path in tests
* Explicitly make `"silu"` the default `ffn_act_fn`
Co-authored-by: Wing Lian <wing.lian@gmail.com>
* default to using router_aux_loss_coef over ffn_config[moe_loss_weight]
* fix _flash_attn_uses_top_left_mask and is_causal
* fix tests path
* don't use token type IDs
* follow Llama and remove token_type_ids from test
* init ConfigTester differently so tests pass
* remove multiple choice test
* remove question + answer test
* remove sequence classification test
* remove token classification test
* copy Llama tests and remove token_type_ids from test inputs
* do not test pruning or headmasking; style code
* add _tied_weights_keys parameter to pass test
* add type hints
* fix type check
* update config tester
* remove masked_lm test
* remove encoder tests
* initialize DbrxModelTester with correct params
* style
* torch_dtype does not rely on torch
* run make fixup, fix-copies
* use https://huggingface.co/v2ray/dbrx-base-fixed/blob/main/modeling_dbrx.py
* add copyright info
* fix imports and DbrxRotaryEmbedding
* update DbrxModel docstring
* use copies
* change model path in docstring
* use config in DbrxFFN
* fix flashattention2, sdpaattention
* input config to DbrXAttention, DbrxNormAttentionNorm
* more fixes
* fix
* fix again!
* add informative comment
* fix ruff?
* remove print statement + style
* change doc-test
* fix doc-test
* fix docstring
* delete commented out text
* make defaults match dbrx-instruct
* replace `router_aux_loss_coef` with `moe_loss_weight`
* is_decoder=True
* remove is_decoder from configtester
* implement sdpa properly
* make is_decoder pass tests
* start on the GenerationTesterMixin tests
* add dbrx to sdpa documentation
* skip weight typing test
* style
* initialize smaller model
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Add DBRX to toctree
* skip test_new_cache_format
* make config defaults smaller again
* add pad_token_id
* remove pad_token_id from config
* Remove all references to DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP
* Update src/transformers/models/dbrx/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/dbrx.md
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* Update src/transformers/models/dbrx/configuration_dbrx.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/dbrx.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix typo
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update docs, fix configuration_auto.py
* address pr comments
* remove is_decoder flag
* slice
* fix requires grad
* remove grad
* disconnect differently
* remove grad
* enable grads
* patch
* detach expert
* nissan al ghaib
* Update modeling_dbrx.py
* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* replace "Gemma" with "Dbrx"
* remove # type: ignore
* don't hardcode vocab_size
* remove ToDo
* Re-add removed idefics2 line
* Update test to use tiny-random!
* Remove TODO
* Remove one more case of loading the entire dbrx-instruct in the tests
* Update src/transformers/models/dbrx/modeling_dbrx.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address some comments
* small model
* add dbrx to tokenization_auto
* More docstrings with add_start_docstrings
* Dbrx for now
* add PipelineTesterMixin
* Update src/transformers/models/dbrx/configuration_dbrx.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove flash-attn2 import error
* fix docstring
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add useage example
* put on one line
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix ffn_act_fn
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change "dbrx" to "DBRX" for display purposes.
* fix __init__.py?
* fix __init__.py
* fix README
* return the aux_loss
* remove extra spaces
* fix configuration_auto.py
* fix format in tokenization_auto
* remove new line
* add more useage examples
---------
Co-authored-by: Abhi Venigalla <abhi.venigalla@databricks.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Eitan Turok <eitan.turok@databricks.com>
Co-authored-by: Eitan Turok <150733043+eitanturok@users.noreply.github.com>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Eitan Turok <eitanturok@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-18 15:18:52 +02:00
tomeras91
3f20877da9
Add jamba ( #29943 )
* Add jamba arch
* apply "make fix-copies" changes
* fix link to model in JambaConfig docstring
* Add n_ctx in modeling file because repo-consistency wants that
* Add jamba to flash attention and sdpa documentation
* mamba dt_proj quant fix now works for LoRA as well
* override test_left_padding_compatibility and use a more permissive tolerance. left padding numerical difference are accentuated by mamba layers
* add jamba to tokenization auto
* fix comments of shape (PR #24 in the model page: https://huggingface.co/ai21labs/Jamba-v0.1/discussions/24 )
* simple PR fixes
* remove unnecessary kwargs from JambaAttentionDecoderLayer and JambaMambaDecoderLayer
* remove the LoRA hack for the mamba dt_proj bias. It was solved in huggingface/peft#1530 (https://github.com/huggingface/peft/pull/1530 )
* Add copied comment on JambaMLP (it's the same as MixtralMLP)
* remove padding_mask warnings. It's not supported anymore
* fix docstring. Float instead of int
* A few more minor PR fixes
* (1) lowercase names for mamba layernorms (2) remove _apply_inner_layernorms and do it directly in the forward pass
* Return None attention weights from mamba layers. Append to all attentions only if not None.
* remove some leftover jamba archive lists
* Better separation between expert vs non-expert layers. non-expert layers return None as router_logits, and it is not concatenated to all_router_logits returned from JambaModel
* no need to take router_logits at config.expert_layer_offset anymore. result.router_logits now holds results only for expert layers
* Add Jamba paper on READMEs
* (1) rename n_ctx -> max_position_embeddings (2) don't use it in the modeling file since it's not needed (set it as an exception to check_config_attributes)
* Add copied from comment
* remove the code path for apply_inner_layernorms=False. Jamba always has the inner mamba layernorms
* clearer docstring for _convert_to_standard_cache
* style fixes
* Change calc_logits_for_entire_prompt (bool) to num_logits_to_keep (int). Adapt assisted decoding code tp use it. Also small change in low memory beam search decoding path to support this new int value in model_inputs
* rename test so it still overrides what its meant to override
* draft
* oups
* nit
* remove more complexe logic
* fix names used in config
* fix fix fix
* style
* fix some more failing tests
* generate did not init the cache 🙃
* more small nits
* typo
* config.mamba_expand * config.hidden_size for the intermediate size of the mamba shapes
* fix init of pkv with torch.tensor()
* empty tensor
* fix some init issues
* stupid changes required by generate because it does not even support its own DynamicCache class
* more fixes
* fix general assisted gen cache_position bug
* tests passing
* Add offsets and periods as SPECIAL_CASES_TO_ALLOW in check_config_attributes.py
* fix reorder_cache to reorder mamba states and override some more functions in HybridMambaAttentionDynamicCache
* no need to override test_past_key_values_format() and _check_past_key_values_for_generate() in tests anymore
* fix docstrings and typehints for past_key_values
* style fixes
* fix docs
* change typehint due to copy from Mixtral
* forgot import
* import order
* Add configuration_jamba and modeling_jamba to not_doctested because the model is too big to download (in docstring of JambaForCausalLM.forward)
* Add integration test with tiny random Jamba model on hub
* fix flash attention cache shapes
* bring back forgotten hidden states
* rename HybridMambaAttentionDynamicCache.seqlen_offset to has_previous_state (and make bool) and bugfix - it should be set to True after a finished forward pass of the entire model
* align integration test after modeling fixes
* bugfix - mamba can use precomputed states only if forward pass is on a single token
* bugfix - mamba can use precomputed states only if they match the batch size
* typo
* remove making _prepare_4d_causal_attention_mask a leaf function
* stop using past_seq_len.get_seq_length(). Use cache positions instead. Adjust test (test_decoder_model_past_with_large_inputs) accordingly
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2024-04-18 11:04:02 +02:00
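A sketch of the num_logits_to_keep change described above: on long prompts, computing logits only for the last position avoids materializing a full (batch, seq_len, vocab) tensor. The checkpoint id is an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed checkpoint id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16,
                                             device_map="auto")

inputs = tok("Hybrid attention-Mamba models interleave", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model(**inputs, num_logits_to_keep=1)
print(out.logits.shape)  # (batch, 1, vocab_size) instead of (batch, seq_len, vocab_size)
```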
Alexander Visheratin
b65df514d1
Add Flash Attention 2 to M2M100 model ( #30256 )
* Added flash attention 2.
* Fixes.
* Fix inheritance.
* Fixed init.
* Remove stuff.
* Added documentation.
* Add FA2 to M2M100 documentation.
* Add test.
* Fixed documentation.
* Update src/transformers/models/m2m_100/modeling_m2m_100.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update docs/source/en/model_doc/nllb.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fixed variable name.
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-18 10:27:58 +02:00
Shane A
e4ea19b958
Add OLMo model family ( #29890 )
* Add OLMo using add-new-model-like with Llama
* Fix incorrect tokenizer for OLMo
* Copy-paste relevant OLMo methods and their imports
* Add OLMo config
* Modify OLMo config to follow HF conventions
* Remove unneeded Llama code from OLMo model
* Add ability for OLMo model to output attentions
* Add OLMoPreTrainedModel and OLMoModel
* Add OLMoForCausalLM
* Minor fixes to OLMo model for style and missing functions
* Implement OLMo tokenizer
* Implement OLMo to HF conversion script
* Add tests for OLMo model
* Add tests for OLMo fast tokenizer
* Add auto-generated dummy objects
* Remove unimplemented OLMo classes from auto and init classes and re-format
* Add README and associated auto-generated files
* Use OLMo names for common properties
* Run make fixup
* Remove `|` from OLMo typing
* Remove unneeded tokenization_olmo.py
* Revert model, config and converter to add-new-model-like Llama
* Move logic for adding bos/eos token into GPTNeoXTokenizerFast
* Change OLMoConfig defaults to match OLMo-7B
* Use GPTNeoXTokenizerFast in OLMo tokenizer tests
* Modify auto-generated OLMoModelTests to work for OLMo
* Add non-parametric layer norm OLMoLayerNorm
* Update weight conversion script for OLMo
* Fix __init__ and auto structure for OLMo
* Fix errors from make fixup
* Remove OLMoTokenizerFast from documentation
* Add missing 'Copied from' for OLMoModel._update_causal_mask
* Run make fix-copies
* Rearrange string replacements in OLMoForCausalLM Copied from
* Move OLMo and Llama CausalLM.forward example into global constants
* Fix OLMO_GENERATION_EXAMPLE doc string typo
* Add option for qkv clipping to OLMo
* Rearrange OLMoConfig kwargs in convert_olmo_weights_to_hf
* Add clip_qkv to OLMoConfig in convert_olmo_weights_to_hf
* Fix OLMo tokenization bug using conversion script
* Keep model in full precision after conversion
* Do not add eos token automatically
* Update references to OLMo model in HF Hub
* Do not add eos token during encoding by default
* Fix Llama generation example
* Run make fixup
* OLMo 7B integration test fix
* Remove unneeded special case for OLMoConfig
* OLMo 7B Twin 2T integration test fix
* Fix test_model_7b_greedy_generation
* Remove test_compile_static_cache
* Fix OLMo and Llama generation example
* Run make fixup
* Revert "OLMo 7B integration test fix"
This reverts commit 4df56a4b15.
* Revert "OLMo 7B Twin 2T integration test fix"
This reverts commit 9ff65a4a29.
* Ungate 7B integration tests and fix greedy generation test
* Add retries for flaky test_eager_matches_sdpa_generate
* Fix output of doc example for OLMoForCausalLM.forward
* Downsize OLMo doc test for OLMoForCausalLM.forward to 1B model
* Try fix incorrect characters in OLMoForCausalLM.forward doc test
* Try fix incorrect characters in OLMoForCausalLM.forward doc test using end quotes
* Remove pretraining_tp from OLMo config and model
* Add missing 'Copied from' instances
* Remove unneeded causal_mask from OLMoModel
* Revert Llama changes
* Ignore copy for OLMoForCausalLM.forward
* Change 'OLMo' to 'Olmo' in classes
* Move minimal OLMo tokenization tests to model tests
* Add missed 'Copied from' for repeat_kv
2024-04-17 17:59:07 +02:00
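[Editor's note] Two OLMo-specific items above are easy to illustrate: `OLMoLayerNorm` is a non-parametric layer norm (no learnable affine), and `clip_qkv` optionally clamps the attention projections. A sketch under those assumptions, not the shipped code:

```python
from typing import Optional

import torch
import torch.nn.functional as F

class NonParametricLayerNorm(torch.nn.Module):
    """LayerNorm with no learnable weight/bias, in the spirit of OLMoLayerNorm."""
    def __init__(self, hidden_size: int, eps: float = 1e-5):
        super().__init__()
        self.normalized_shape = (hidden_size,)
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # weight=None, bias=None -> pure normalization, no affine parameters
        return F.layer_norm(x, self.normalized_shape, None, None, self.eps)

def clip_qkv(qkv: torch.Tensor, clip: Optional[float]) -> torch.Tensor:
    # optional elementwise clamp of the Q/K/V projection outputs
    return qkv if clip is None else qkv.clamp(-clip, clip)

x = torch.randn(2, 8, 64)
print(NonParametricLayerNorm(64)(x).shape)                       # torch.Size([2, 8, 64])
print(clip_qkv(torch.randn(2, 8, 192), 8.0).abs().max() <= 8.0)  # tensor(True)
```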
amyeroberts
6b78360e6d
Add Idefics2 ( #30253 )
...
* Initial add model additions
* Test
* All weights loading
* Can perform full forward pass
* Local and remote the same
* Matching local and remote
* Fixup
* Idefics2Model importable; fixup docstrings
* Don't skip by default
* Remove deprecated use_resampler arg
* Remove self.config
* DecoupledLinear takes config
* Tidy up
* Enable eager attention and tidy up
* Most tests passing
* Update for batch of processed images
* Add image processor
* Update doc pages
* Update conversion script
* Remove erroneous breakpoint
* Remove accidental spelling change
* Update to reflect changes on hub - make generate work
* Fix up
* Image processor tests
* Update tests
* Add a processor
* Add a processor
* Update convert script
* Update modeling file - remove fixmes
* Bug fix
* Add processing test
* Use processor
* Fix up
* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Fix test
* Update config - PR comments and defaults align with checkpoint
* Reviewer comments
* Add copied froms for flash attention
* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove qk_layer_norm and freeze_layers functionality
* Fix
* Remove freeze_layer options from config
* Sync with upstream main
* Fix attention shapes siglip
* Remove Llava-next refs - TO REBASE
* Use AutoModel for text model
* Add comment to explain vision embeddings
* Fix issue with tie_word_embeddings
* Address review comments
* Fix and fix up
* Chat templates for idefics
* Fix copies
* Fix
* Add layer norms to FA2
* Fix tests
* Apply suggestions from code review
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Fix
* Review comments
* Update src/transformers/models/idefics2/modeling_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update inputs merger
* Merge weights in correct order
* Update convert script
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update template
* Model code examples (fix idefics too)
* More review comments
* Tidy up
* Update processing
* Fix attention mask preparation
* Update inputs_merger inputs
* Vectorize inputs_merger
* Update src/transformers/models/idefics2/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/idefics2/modeling_idefics2.py
* Review comments
* saying bye to the `qk_layer_norms`
* Simplify
* Update latents
* Remove erroneous readme changes
* Return images when applying chat template
* Fix bug - prompt images are for a single sample
* Update src/transformers/models/idefics2/modeling_idefics2.py
* image splitting
* fix test
* some more comment
* some comment
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/idefics2/image_processing_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update processor
* Update model tests
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Don't add BOS in template
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Remove index in examples
* Update tests to reflect #13
* Update src/transformers/models/idefics2/processing_idefics2.py
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* PR comment - consistent typing
* Update readme and model doc
* Update docs
* Update checkpoint references
* Update examples
* Fix and update tests
* Small addition
* Update tests - remove copied from as no ignore placement copy could be found
* Update example
* small fixes
* Update docs/source/en/model_doc/idefics2.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update docs/source/en/model_doc/idefics2.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Update README.md
Co-authored-by: Victor SANH <victorsanh@gmail.com>
* Connector model as bridge
* Fix up
* Fix up
* Don't pass model inputs for generation kwargs update
* IDEFICS-2 -> Idefics2
* Remove config archive name
* IDEFICS-2 -> Idefics2
* Add back llava-next
* Update readmes
* Add requirements for processor tester
* Use custom convert_to_rgb to avoid possible BC
* Fix doc example
* Fix doc example
* Skip model doc tests - as model too large
* More doc example - account for image splitting
* Update src/transformers/image_transforms.py
* Fix config doctest
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-04-15 17:03:03 +01:00
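[Editor's note] The entry above wires chat templates into the processor and returns images alongside the prompt. A typical inference path, sketched from the public docs; the image URL and prompt are placeholders:

```python
import requests
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b")
model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b", torch_dtype=torch.float16, device_map="auto"
)

image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
messages = [{
    "role": "user",
    "content": [{"type": "image"}, {"type": "text", "text": "Describe this image."}],
}]
# The chat template inserts the image placeholder tokens for us.
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```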
Yoach Lacombe
0d04b1e25a
Add Flash Attention 2 support to Musicgen and Musicgen Melody ( #29939 )
...
* add FA2 to o.g Musicgen
* make style
* add FA2 support to Musicgen Melody
* add generation FA2 tests to o.g Musicgen
* make style and fix copies
* add Musicgen to FA2 docs + deprecate list
* add sdpa support to Musicgen's
* make style and fix copies
* refactor attention implementation arguments
* add Copied from to sdpa tests
* add copied from in sdpa tests melody
* add copied for FA2 generation tests
* add FA2 inference copied from
* make style
2024-04-02 11:23:49 +01:00
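[Editor's note] The SDPA support added above ultimately routes attention through `torch.nn.functional.scaled_dot_product_attention`. A standalone sketch of the eager computation it replaces (arbitrary shapes, no model code):

```python
import math

import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 16, 64)  # (batch, heads, seq, head_dim)
k = torch.randn(1, 8, 16, 64)
v = torch.randn(1, 8, 16, 64)

# Eager reference: softmax(Q K^T / sqrt(d)) V
eager = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(q.size(-1)), dim=-1) @ v

# SDPA dispatches to a fused kernel (flash / mem-efficient / math) automatically.
sdpa = F.scaled_dot_product_attention(q, k, v)

torch.testing.assert_close(eager, sdpa, atol=1e-5, rtol=1e-5)
```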
Eduardo Pacheco
22d159ddf9
Adding Flash Attention 2 Support for GPT2 ( #29226 )
...
* First commit to add flash attention 2 for GPT-2
* more improvements
* Make GPT2 pass tests and fix Decision Transformer copies
* Fixed missing arg
* fix copies
* Added expected speedup
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Added test
* Fixed attn attribute
* Update docs/source/en/model_doc/gpt2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/gpt2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update Decision transformer attentions
* More updates
* Passing tests
* Fix copies
* Fix copies part 2
* Decision transformer updates
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fix copies
* Decision transformer not supporting flash attn
* Addressed comments
* Addressed comments
* Addressed comments
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-03-28 09:31:24 +00:00
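[Editor's note] The "expected speedup" figures mentioned above come from benchmarks along the lines of this rough sketch (needs a CUDA GPU with `flash-attn` installed; absolute numbers vary by hardware, so treat this as a template, not a claim):

```python
import time

import torch
from transformers import AutoTokenizer, GPT2LMHeadModel

def time_generate(attn_impl: str) -> float:
    model = GPT2LMHeadModel.from_pretrained(
        "gpt2", torch_dtype=torch.float16, attn_implementation=attn_impl
    ).to("cuda")
    tok = AutoTokenizer.from_pretrained("gpt2")
    inputs = tok("The meaning of life is", return_tensors="pt").to("cuda")
    torch.cuda.synchronize()
    start = time.perf_counter()
    model.generate(**inputs, max_new_tokens=128, do_sample=False)
    torch.cuda.synchronize()
    return time.perf_counter() - start

for impl in ("eager", "flash_attention_2"):
    print(impl, f"{time_generate(impl):.2f}s")
```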
Bo Zheng
1c39974a4c
Add Qwen2MoE ( #29377 )
...
* add support for qwen2 MoE models
* update docs
* add support for qwen2 MoE models
* update docs
* update model name & test
* update readme
* update class names & readme & model_doc of Qwen2MoE.
* update architecture name
* fix qwen2_moe tests
* use Qwen2Tokenizer instead of Qwen2MoeTokenizer
* update modeling_qwen2_moe.py
* fix model architecture
* fix qwen2_moe tests
* use Qwen2Tokenizer instead of Qwen2MoeTokenizer
* update modeling_qwen2_moe.py
* fix model architecture
* fix style
* fix test when there are sparse and non-sparse layers
* fixup
* Update README.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
* fixup
* add archive back
* add support for qwen2 MoE models
* update docs
* update model name & test
* update readme
* update class names & readme & model_doc of Qwen2MoE.
* update architecture name
* fix qwen2_moe tests
* use Qwen2Tokenizer instead of Qwen2MoeTokenizer
* update modeling_qwen2_moe.py
* fix model architecture
* fixup
* fix qwen2_moe tests
* use Qwen2Tokenizer instead of Qwen2MoeTokenizer
* fix style
* fix test when there are sparse and non-sparse layers
* fixup
* add archive back
* fix integration test
* fixup
---------
Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-03-27 02:11:55 +01:00
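[Editor's note] The "sparse and non-sparse layers" fix above concerns decoders that interleave MoE blocks with ordinary dense MLPs. A toy sketch of that layout; `sparse_step` is a hypothetical knob, not necessarily the real config field name:

```python
def build_mlp_stack(num_layers: int, sparse_step: int) -> list:
    """Every `sparse_step`-th layer gets a routed MoE MLP, the rest stay dense."""
    layers = []
    for idx in range(num_layers):
        if sparse_step and (idx + 1) % sparse_step == 0:
            layers.append("moe")    # routed mixture-of-experts MLP
        else:
            layers.append("dense")  # ordinary feed-forward MLP
    return layers

print(build_mlp_stack(8, 2))
# ['dense', 'moe', 'dense', 'moe', 'dense', 'moe', 'dense', 'moe']
```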
NielsRogge
d91fd7f92c
Add LLaVa-1.6, bis ( #29586 )
...
* First draft
* Fix tests, add docs
* Improve docstrings
* Fix test
* Address comments
* Address comments
* Remove vocab_size attribute
* Remove batch_size
* Address comment
* Add image processor tests
* Support fx
* Update docstring
* Add support for 34b
* Convert 34b model
* Add integration tests
* Update checkpoints
* Convert vicuna-13b, remove doc tests
* Remove script
* Remove file
* Address comments
* Improve docstrings
* Deprecate vocab_size
* Remove aspect_ratio_setting
* Address comments
* Update READMEs
* Add tips about chat templates
* Fix tests
* Deprecate vocab_size safely
* Update tests
---------
Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
2024-03-20 15:51:12 +00:00
Saurabh Dash
0e4a1c3401
Cohere Model Release ( #29622 )
...
* Cohere Model Release (#1 )
Cohere Model Release
* Remove unnecessary files and code (#2 )
Some cleanup
* Delete cohere-model directory (#3 )
* Make Fix (#5 )
* Pr fixes (#6 )
* fixes for pr
* pr fixes for the format
* pr fixes for the format
* src/transformers/models/auto/tokenization_auto.py
* Tokenizer test (#8 )
* tokenizer test
* format fix
* Adding Docs and other minor changes (#7 )
* Add modeling tests (#9 )
* Smol Fix (#11 )
* tokenization tests are fixed
* format fixes
* fix pr doc tests
* fix pr doc tests
* fix pr doc tests
* fix pr style check
* small changes in cohere.md
* FIX: Address final comments for transformers integration (#13 )
* fix modeling final nits and add proper test file
* for now leave empty tests
* add integration test
* push new test
* fix modeling cohere (#14 )
* Update chat templates to use the new API (#15 )
---------
Co-authored-by: ahmetustun <ahmetustun89@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-03-15 14:29:11 +01:00
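[Editor's note] "Update chat templates to use the new API" refers to the tokenizer-side `apply_chat_template` workflow. A sketch against the public Command-R checkpoint (illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("CohereForAI/c4ai-command-r-v01")
messages = [{"role": "user", "content": "Hello, how are you?"}]

# The chat template turns structured messages into model-ready token ids.
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)
print(tokenizer.decode(input_ids[0]))
```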
bytebarde
be3fd8a262
[Flash Attention 2] Add flash attention 2 for GPT-J ( #28295 )
...
* initial implementation of flash attention for gptj
* modify flash attention and overwrite test_flash_attn_2_generate_padding_right
* update flash attention support list
* remove the copy line in the `CodeGenBlock`
* address copy mechanism
* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add GPTJ attention classes
* add expected outputs in the gptj test
* Ensure repo consistency with 'make fix-copies'
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-03-13 08:43:00 +01:00
RaymondLi0
63caa370e6
Starcoder2 model - bis ( #29215 )
...
* Copy model
* changes
* misc
* fixes
* add embed and residual dropout (#30 )
* misc
* remove rms norm and gated MLP
* remove copied mentions where it's not a copy anymore
* remove unused _shape
* copied from mistral instead
* fix copies
* fix copies
* add not doctested
* fix
* fix copyright
* Update docs/source/en/model_doc/starcoder2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/starcoder2/configuration_starcoder2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/starcoder2/configuration_starcoder2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix doc
* revert some changes
* add fa2 tests
* fix styling nit
* fix
* push dummy docs
---------
Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-28 01:24:34 +01:00
fxmarty
2cc8cf6ce7
Fix torch.compile with fullgraph=True when attention_mask input is used ( #29211 )
...
* fix torch.export.export for llama
* do not change doc title
* make fix copies
2024-02-22 16:40:06 +01:00
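[Editor's note] `fullgraph=True` tells the compiler to fail on any graph break; the fix above keeps the `attention_mask` path traceable so the call below no longer errors. Sketch only; the checkpoint is illustrative (and gated on the Hub):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

compiled = torch.compile(model, fullgraph=True)  # refuse any graph break

inputs = tok("Hello", return_tensors="pt")  # includes attention_mask
with torch.no_grad():
    out = compiled(**inputs)
print(out.logits.shape)
```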
Arthur
594c1277b2
[gemma] Adds support for Gemma 💎 ( #29167 )
...
* initial commit
* update
* update conversion checkpoint
* update conversion script
* nits
* some fixes
* nits
* merge
* fix permute
* nits
* fix
* nits
* nits
* nits
* fix rope
* fix both rope
* nits
* style
* make sure flax works
* fix flax init code
* fix forward
* nits
* print flax generation out
* current code
* nits
* SIIIIIIIIIIIIIIIIIII
* update
* add new tokenizer
* correct fast tokenizer
* fix conversion
* more comments
* fix modeling and conversion
* nits and nits
* nits testing
* add some tokenization tests
* add some edge cases
* add slow tests and fix them
* fixup
* fix copies for modeling
* fix copies
* add 7B slow tests
* fix
* fix
* fix tests
* make tokenizer cis go green
* styling
* last tokenizer nits
* update jax tests
* fix flax for 7b
* add jit testing 🤗
* cleanups
* isolated nit, inv_freq for rotary_emb.inv_freq
* propagate to jax
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* adjust test
* fix conversion script
* change name
* correct file names
* update conversion script
* Fix bos and eos token ids in the model configuration (#3 )
* update modelling
* update conversion script
* add static cache for gemma
* fix sdpa generate
* fix batched
* multiple fixes
* fix FA2
* final fix
* Rename a few missing strings and filenames (#4 )
* merge with upstream main
* fix copies
* fix copies
* fix fixup
* fix fixup
* fix
* fix
* final tests
* fix fx gemma tests
* fix fx bf16/fp16 tests
* update slow fx tests
* fx slow tests: one logits, one generation
* move jit test standalone
* Apply suggestions from code review
* nits
* tokenizer updates
* more tokenization updates: custom GemmaSentencepieceExtrator
* style
* Update src/transformers/cache_utils.py
* Update src/transformers/models/gemma/__init__.py
* Update tests/models/gemma/test_modeling_flax_gemma.py
* small nits
* style
* update tokenization test
* fix the rotary embedding
* with style
* fix slow tests
* WARNING this commit might be very important for precisions
* Update tests/models/gemma/test_modeling_flax_gemma.py
* Update src/transformers/models/gemma/configuration_gemma.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Update src/transformers/models/gemma/modeling_flax_gemma.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* small nits here and there!
* forgotten nit
* remove on the fly computation of inv_freq
* revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_flax_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* nit conversion script link
* fix some tests
* add not doctest and pr doctest
* repo consistency
* fix last CIs 🚀
* update all readmes
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-02-21 14:21:28 +01:00
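[Editor's note] Several Gemma commits above circle one numerical detail: rotary-embedding frequencies should be computed in float32, because half-precision cos/sin tables shift logits enough to matter. A standalone sketch of that invariant:

```python
import torch

def rope_tables(head_dim: int, seq_len: int, theta: float = 10000.0):
    # Keep inv_freq and the cos/sin tables in float32; casting to half
    # precision here is the drift the commits above guard against.
    inv_freq = 1.0 / theta ** (
        torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim
    )
    t = torch.arange(seq_len, dtype=torch.float32)
    freqs = torch.outer(t, inv_freq)
    return freqs.cos(), freqs.sin()

cos, sin = rope_tables(head_dim=64, seq_len=16)
print(cos.shape, cos.dtype)  # torch.Size([16, 32]) torch.float32
```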
Ekaterina Aidova
1d0ea7abe0
support SDPA Attention in stablelm ( #29106 )
...
* support SDPA Attention in stablelm
* add integration test
* add fallback for output_attentions
* Update src/transformers/models/stablelm/modeling_stablelm.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/models/stablelm/test_modeling_stablelm.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/stablelm/modeling_stablelm.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* handle non-contiguous states
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-21 13:12:49 +01:00
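[Editor's note] Two of the StableLM items above generalize to every SDPA port: fused kernels never materialize attention weights, so `output_attentions=True` must fall back to the eager path, and inputs may need to be made contiguous. A sketch of that dispatch:

```python
import torch
import torch.nn.functional as F

def attention_forward(q, k, v, output_attentions: bool = False):
    if output_attentions:
        # Eager fallback: only this path can return the attention matrix.
        weights = torch.softmax(q @ k.transpose(-2, -1) / q.size(-1) ** 0.5, dim=-1)
        return weights @ v, weights
    # Fused SDPA kernels may require contiguous tensors.
    out = F.scaled_dot_product_attention(
        q.contiguous(), k.contiguous(), v.contiguous()
    )
    return out, None

q = k = v = torch.randn(1, 8, 16, 64)
print(attention_forward(q, k, v)[0].shape)  # torch.Size([1, 8, 16, 64])
```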
JB (Don)
b8b16475d4
[Phi] Add support for sdpa ( #29108 )
2024-02-20 14:33:12 +01:00
Lysandre Debut
f497f564bb
Update all references to canonical models ( #29001 )
...
* Script & Manual edition
* Update
2024-02-16 08:16:58 +01:00
Jonathan Tow
de6029a059
Add StableLM ( #28810 )
...
* Add `StableLM`
* fix(model): re-create from `huggingface-cli add-new-model-like persimmon`
* fix: re-add changes to address comments
* fix(readme): add links to paper
* fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref
* fix(tests): re-add `@slow` decorator to integration tests
* fix(tests): import slow...
* fix(readme_hd): remove whitespace edit
* fix(tokenizer): auto tokenizer tuple
* skip doctests for `modeling_stablelm`
2024-02-14 07:15:18 +01:00
Junyang Lin
d6ffe74dfa
Add qwen2 ( #28436 )
...
* add config, modeling, and tokenization
* add auto and init
* update readme
* update readme
* update team name
* fixup
* fixup
* update config
* update code style
* update for fixup
* update for fixup
* update for fixup
* update for testing
* update for testing
* fix bug for config and tokenization
* fix bug for bos token
* not doctest
* debug tokenizer
* not doctest
* debug tokenization
* debug init for tokenizer
* fix style
* update init
* delete if in token auto
* add tokenizer doc
* add tokenizer in init
* Update dummy_tokenizers_objects.py
* update
* update
* debug
* Update tokenization_qwen2.py
* debug
* Update convert_slow_tokenizer.py
* add copies
* add copied from and make style
* update files map
* update test
* fix style
* fix merge reading and update tests
* fix tests
* fix tests
* fix style
* debug a variable in readme
* Update src/transformers/models/qwen2/configuration_qwen2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update test and copied from
* fix style
* update qwen2 tokenization and tests
* Update tokenization_qwen2.py
* delete the copied from after property
* fix style
* update tests
* update tests
* add copied from
* fix bugs
* update doc
* add warning for sliding window attention
* update qwen2 tokenization
* fix style
* Update src/transformers/models/qwen2/modeling_qwen2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix tokenizer fast
---------
Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-17 16:02:22 +01:00
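[Editor's note] The sliding-window warning added above exists because window attention changes which past tokens a query can see. A sketch of the mask it implies (True = attend):

```python
import torch

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    # Each query attends to itself and at most `window - 1` previous
    # positions; anything older falls out of the window.
    i = torch.arange(seq_len).unsqueeze(1)  # query positions
    j = torch.arange(seq_len).unsqueeze(0)  # key positions
    return (j <= i) & (i - j < window)

print(sliding_window_causal_mask(6, 3).int())
```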
Yih-Dar
71f460578d
Update docs/source/en/perf_infer_gpu_one.md ( #28198 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-22 10:40:22 +01:00
Steven Liu
a52e180a0f
[docs] General doc fixes ( #28087 )
...
* doc fix friday
* deprecated objects
* update not_doctested
* update toctree
2023-12-18 10:44:09 -08:00
Younes Belkada
c7f076a00e
Adds VIP-llava to transformers ( #27932 )
...
* v1
* add-new-model-like
* revert
* fix forward and conversion script
* revert
* fix copies
* fixup
* fix
* Update docs/source/en/index.md
* Apply suggestions from code review
* push
* fix
* fixes here and there
* up
* fixup and fix tests
* Apply suggestions from code review
* add docs
* fixup
* fixes
* docstring
* add docstring
* fixup
* docstring
* fixup
* nit
* docs
* more copies
* fix copies
* nit
* update test
2023-12-13 10:42:24 +01:00
Stas Bekman
9936143014
[doc] fix typo ( #27981 )
2023-12-12 20:32:42 +00:00
Arthur
accccdd008
[Add Mixtral] Adds support for the Mixtral MoE ( #27942 )
...
* up
* up
* test
* logits ok
* up
* up
* few fixes
* conversion script
* up
* nits
* nits
* update
* nuke
* more updates
* nits
* fix many issues
* nit
* scatter
* nit
* nuke megablocks
* nits
* fix conversion script
* nit
* remove
* nits
* nit
* update
* oupsssss
* change
* nits device
* nits
* fixup
* update
* merge
* add copied from
* fix the copy mentions
* update tests
* more fixes
* nits
* conversion script
* add parts of the readme
* Update tests/models/mixtral/test_modeling_mixtral.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* new test + conversion script
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
* fix
* fix copies
* fix copies
* ooops
* fix config
* Apply suggestions from code review
* fix nits
* nit
* add copies
* add batched tests
* docs
* fix flash attention
* let's add more verbose
* add correct outputs
* support router outputs
* ignore copies where needed
* fix
* cat list if list is given for now
* nits
* Update docs/source/en/model_doc/mixtral.md
* finish router refactoring
* fix forward
* fix expected values
* nits
* fixup
* fix
* fix bug
* fix
* fix dtype mismatch
* fix
* grrr grrr I support item assignment
* fix CI
* docs
* fixup
* remove some copied form
* fix weird diff
* skip doctest fast on the config and modeling
* mark that is supports flash attention in the doc
* update
* Update src/transformers/models/mixtral/modeling_mixtral.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Update docs/source/en/model_doc/mixtral.md
Co-authored-by: Lysandre Debut <hi@lysand.re>
* revert router logits config issue
* update doc accordingly
* Update src/transformers/models/mixtral/convert_mixtral_weights_to_hf.py
* nits
* use torch testing assert close
* fixup
* doc nits
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-12-11 12:50:27 +01:00
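[Editor's note] The router work above ("support router outputs", "finish router refactoring") feeds a Switch-style load-balancing auxiliary loss. A simplified sketch of that loss, not the exact function shipped in `modeling_mixtral.py`:

```python
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits: torch.Tensor, num_experts: int, top_k: int):
    # Penalize correlation between the fraction of tokens routed to each
    # expert and the mean router probability, pushing usage toward uniform.
    probs = F.softmax(router_logits, dim=-1)                 # (tokens, experts)
    _, selected = torch.topk(probs, top_k, dim=-1)           # (tokens, top_k)
    expert_mask = F.one_hot(selected, num_experts).float()   # (tokens, top_k, experts)
    tokens_per_expert = expert_mask.mean(dim=(0, 1))
    router_prob_per_expert = probs.mean(dim=0)
    return num_experts * (tokens_per_expert * router_prob_per_expert).sum()

logits = torch.randn(128, 8)  # 128 tokens, 8 experts
print(load_balancing_loss(logits, num_experts=8, top_k=2))
```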