Pavel Iakubovskii
099d93d2e9
Grounding DINO Processor standardization ( #34853 )
...
* Add input ids to model output
* Add text preprocessing for processor
* Fix snippet
* Add test for equivalence
* Add type checking guard
* Fixing typehint
* Fix test for added `input_ids` in output
* Add deprecations and "text_labels" to output
* Adjust tests
* Fix test
* Update code examples
* Minor docs and code improvement
* Remove one-liner functions and rename class to CamelCase
* Update docstring
* Fixup
2025-01-17 14:18:16 +00:00
Pavel Iakubovskii
42b2857b01
OmDet Turbo processor standardization ( #34937 )
...
* Fix docstring
* Fix docstring
* Add `classes_structure` to model output
* Update omdet postprocessing
* Adjust tests
* Update code example in docs
* Add deprecation to "classes" key in output
* Types, docs
* Fixing test
* Fix missed clip_boxes
* [run-slow] omdet_turbo
* Apply suggestions from code review
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* Make CamelCase class
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2025-01-17 14:10:19 +00:00
Pavel Iakubovskii
94ae9a8da1
OwlViT/Owlv2 post processing standardization ( #34929 )
...
* Refactor owlvit post_process_object_detection + add text_labels
* Fix copies in grounding dino
* Sync with Owlv2 postprocessing
* Add post_process_grounded_object_detection method to processor, deprecate post_process_object_detection
* Add test cases
* Move text_labels to processors only
* [run-slow] owlvit owlv2
* [run-slow] owlvit, owlv2
* Update snippets
* Update docs structure
* Update deprecated objects for check_repo
* Update docstring for post processing of image guided object detection
2025-01-17 13:58:28 +00:00
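The "add text_labels" bullets above can be sketched in isolation: post-processing maps each predicted integer label back to the text query it came from. This is an illustrative toy, not the transformers implementation; the function and dict keys are hypothetical stand-ins.

```python
# Illustrative sketch of the "text_labels" addition: attach the matching text
# query string to each integer class id in a detection result.
def attach_text_labels(results, text_queries):
    """Add a "text_labels" entry to each per-image result dict."""
    for result in results:
        result["text_labels"] = [text_queries[i] for i in result["labels"]]
    return results

results = [{"scores": [0.92, 0.71], "labels": [0, 2]}]
queries = ["a cat", "a dog", "a remote control"]
attach_text_labels(results, queries)
# results[0]["text_labels"] is now ["a cat", "a remote control"]
```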
Cyril Vallez
ab1afd56f5
Fix some tests ( #35682 )
...
* cohere tests
* glm tests
* cohere2 model name
* create decorator
* update
* fix cohere2 completions
* style
* style
* style
* add cuda in comments
2025-01-17 12:10:43 +00:00
Ross Wightman
8c1b5d3782
🚨 🚨 🚨 An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, optimize string search. ( #35615 )
...
* An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, reduce number of characters searched on every load considerably.
* Fix fix on load issue
* Fix gamma/beta warning test
* A style complaint
* Improve efficiency of weight norm key rename. Add better comments about weight norm and layer norm renaming.
* Habitual elif redundant with the return
2025-01-16 17:25:44 -08:00
Joao Gante
94af1c0aa2
[generate] return Cache object even if passed in a legacy format ( #35673 )
...
* generate returns a Cache object by default
* fix tests
* fix test for encoder-decoder models
2025-01-16 17:06:24 +00:00
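The behavior change above can be illustrated with a toy: even when `past_key_values` is supplied in the legacy tuple-of-tuples format, generate now converts it and returns a Cache object. `DynamicToyCache` below is a hypothetical stand-in for transformers' `DynamicCache`, not the real class.

```python
# Toy stand-in showing legacy tuple cache -> Cache-object conversion.
class DynamicToyCache:
    def __init__(self):
        self.layers = []  # one (key, value) pair per decoder layer

    @classmethod
    def from_legacy_cache(cls, legacy):
        cache = cls()
        cache.layers = [tuple(layer) for layer in legacy]
        return cache

legacy = (("k0", "v0"), ("k1", "v1"))  # legacy tuple-of-tuples format
cache = DynamicToyCache.from_legacy_cache(legacy)
```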
Joao Gante
2818307e93
[generate] can instantiate GenerationConfig(cache_implementation="static") ( #35679 )
...
fix failing instantiation
2025-01-16 17:04:54 +00:00
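A minimal sketch of the fixed instantiation path: `cache_implementation` is accepted at construction time and validated against a known set. `ToyGenerationConfig` and the allowed-value set are illustrative stand-ins, not the actual transformers `GenerationConfig`.

```python
# Toy config accepting and validating cache_implementation at instantiation.
ALLOWED_CACHE_IMPLEMENTATIONS = {"static", "offloaded_static", "sliding_window"}

class ToyGenerationConfig:
    def __init__(self, cache_implementation=None):
        if (cache_implementation is not None
                and cache_implementation not in ALLOWED_CACHE_IMPLEMENTATIONS):
            raise ValueError(f"Unknown cache_implementation: {cache_implementation!r}")
        self.cache_implementation = cache_implementation

config = ToyGenerationConfig(cache_implementation="static")
```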
Joao Gante
aeeceb9916
[cache] add a test to confirm we can use cache at train time ( #35709 )
...
* add test
* augment test as suggested
* Update tests/utils/test_modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* rerun tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-16 17:02:34 +00:00
kang sheng
2cbcc5877d
Fix condition when GA loss bug fix is not performed ( #35651 )
...
* fix condition when GA loss bug fix is not performed
* max loss diff is 2.29
* fix typo
* add an extra validation that loss should not vary too much
2025-01-16 13:59:53 +01:00
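A hedged sketch of the gradient-accumulation loss handling this fix concerns: when the per-token normalization fix cannot be applied (the model only returns a pre-averaged loss), the mean loss must instead be divided by the number of accumulation steps so gradients match full-batch training. Names are illustrative, not Trainer's exact code.

```python
# Illustrative loss scaling for gradient accumulation.
def scale_accumulated_loss(mean_loss, grad_accum_steps,
                           summed_loss=None, num_items_in_batch=None):
    if summed_loss is not None and num_items_in_batch is not None:
        # fixed path: normalize a summed loss by the global token count
        return summed_loss / num_items_in_batch
    # fallback path: average the pre-averaged loss over accumulation steps
    return mean_loss / grad_accum_steps
```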
jiqing-feng
387663e571
Enable gptqmodel ( #35012 )
...
* gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update readme
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* gptqmodel needs to use checkpoint_format (#1 )
* gptqmodel needs to use checkpoint_format
* fix quantize
* Update quantization_config.py
* Update quantization_config.py
* Update quantization_config.py
---------
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* Revert quantizer_gptq.py (#2 )
* revert quantizer_gptq.py change
* pass **kwargs
* limit gptqmodel and optimum version
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix warning
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix version check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* revert unrelated changes
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* enable gptqmodel tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix requires gptq
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Fix Transformer compat (#3 )
* revert quantizer_gptq.py change
* pass **kwargs
* add meta info
* cleanup
* cleanup
* Update quantization_config.py
* hf_select_quant_linear pass checkpoint_format and meta
* fix GPTQTestCUDA
* Update test_gptq.py
* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2
* cleanup
* add backend
* cleanup
* cleanup
* no need check exllama version
* Update quantization_config.py
* lower checkpoint_format and backend
* check none
* cleanup
* Update quantization_config.py
* fix self.use_exllama == False
* spell
* fix unittest
* fix unittest
---------
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format again
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update gptqmodel version (#6 )
* update gptqmodel version
* update gptqmodel version
* fix unit test (#5 )
* update gptqmodel version
* update gptqmodel version
* "not self.use_exllama" is not equivalent to "self.use_exllama==False"
* fix unittest
* update gptqmodel version
* backend is loading_attributes (#7 )
* fix format and tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix memory check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix device mismatch
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix result check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* update tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* review: update docs (#10 )
* review: update docs (#12 )
* review: update docs
* fix typo
* update tests for gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update document (#9 )
* update overview.md
* cleanup
* Update overview.md
* Update overview.md
* Update overview.md
* update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
---------
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* typo
* doc note for asymmetric quant
* typo with apple silicon(e)
* typo for marlin
* column name revert: review
* doc rocm support
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-15 14:22:49 +01:00
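The backend choice this PR enables can be sketched as: prefer the newer gptqmodel library when installed, fall back to auto-gptq otherwise. The availability flags below are illustrative stubs, not transformers' actual import checks.

```python
# Illustrative GPTQ backend selection.
def select_gptq_backend(gptqmodel_available, autogptq_available):
    if gptqmodel_available:
        return "gptqmodel"
    if autogptq_available:
        return "auto_gptq"
    raise ImportError("loading a GPTQ model requires gptqmodel or auto-gptq")
```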
Raushan Turganbay
09d5f76274
Clean-up composite configs ( #34603 )
...
* remove manual assignment tie-word-embeddings
* remove another unused attribute
* fix tests
* fix tests
* remove unnecessary overwrites
* fix
* decoder=True
* clean pix2struct
* run-all
* forgot `_tied_weights_keys` when adding Emu3
* also Aria + fix-copies
* and clean aria
2025-01-15 10:04:07 +01:00
Mahdi Baghbanzadeh
c61fcde910
Enhance DataCollatorForLanguageModeling with Configurable Token Replacement Probabilities ( #35251 )
...
* DataCollatorForLanguageModeling class was updated with new parameters that provide more control over token masking and replacing
* DataCollatorForLanguageModeling class was updated with new parameters that provide more control over token masking and replacing
* Addressed review comments, modified the docstring and made a test for the DataCollatorForLanguageModeling
2025-01-14 17:01:10 +00:00
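The configurable corruption scheme can be sketched as follows: among tokens selected for masked language modeling, a fraction become the mask token, a fraction become a random token, and the rest are kept (the classic 80/10/10 split, now tunable). Parameter names mirror the PR's intent but are illustrative, and this toy works on token strings rather than tensors.

```python
import random

# Toy configurable mask/random/keep split for MLM token corruption.
def corrupt_tokens(tokens, mask_token, vocab, mask_replace_prob=0.8,
                   random_replace_prob=0.1, rng=None):
    rng = rng or random.Random(0)
    out = []
    for token in tokens:
        roll = rng.random()
        if roll < mask_replace_prob:
            out.append(mask_token)
        elif roll < mask_replace_prob + random_replace_prob:
            out.append(rng.choice(vocab))
        else:
            out.append(token)  # keep the original token
    return out
```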
Mohamed Mekkouri
a11041ffad
Fix : add require_read_token for gemma2 gated model ( #35687 )
...
fix gemma2 gated model test
2025-01-14 11:47:05 +01:00
Mohamed Mekkouri
df2a812e95
Fix expected output for ggml test ( #35686 )
...
fix expected output
2025-01-14 11:46:55 +01:00
Mohamed Mekkouri
050636518a
Fix : HQQ config when hqq not available ( #35655 )
...
* fix
* make style
* adding require_hqq
* make style
2025-01-14 11:37:37 +01:00
Arthur
c23a1c1932
Add-helium ( #35669 )
...
* Add the helium model.
* Add a missing helium.
* And add another missing helium.
* Use float for the rmsnorm mul.
* Add the Helium tokenizer converter.
* Add the pad token as suggested by Arthur.
* Update the RMSNorm + some other tweaks.
* Fix more rebase issues.
* fix copies and style
* fixes and add helium.md
* add missing tests
* update the backlink
* oups
* style
* update init, and expected results
* small fixes
* match test outputs
* style fixup, fix doc builder
* add dummies and we should be good to go!
* update sdpa and fa2 documentation
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
2025-01-13 18:41:15 +01:00
Fanli Lin
2fa876d2d8
[tests] make cuda-only tests device-agnostic ( #35607 )
...
* initial commit
* remove unrelated files
* further remove
* Update test_trainer.py
* fix style
2025-01-13 14:48:39 +01:00
Arthur
e6f9b03464
[Compile] Only test compiling model forward pass ( #35658 )
...
* rename test to only compile forward!
* style emu
2025-01-13 13:43:29 +01:00
Raushan Turganbay
84a6789145
Enable different torch dtype in sub models ( #34873 )
...
* fix
* fix test
* add tests
* add more tests
* fix tests
* supposed to be a torch.dtype test
* handle BC and make fp32 default
2025-01-13 13:42:08 +01:00
Yih-Dar
1e3c6c1f7d
Skip MobileNetV1ModelTest::test_batching_equivalence for now ( #35614 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 18:32:36 +01:00
Yih-Dar
04eae987f3
Fix flaky test_beam_search_low_memory ( #35611 )
...
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 17:31:03 +01:00
Zach Mueller
b02828e4af
Let EarlyStoppingCallback not require load_best_model_at_end ( #35101 )
...
* Bookmark
* Add warning
2025-01-10 10:25:32 -05:00
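A toy sketch of the patience logic involved, illustrating why the callback no longer strictly needs `load_best_model_at_end`: early stopping only tracks whether the monitored metric keeps improving. This is not the actual transformers callback.

```python
# Toy early-stopping patience counter.
class ToyEarlyStopping:
    def __init__(self, patience=3, greater_is_better=True):
        self.patience = patience
        self.greater_is_better = greater_is_better
        self.best = None
        self.counter = 0

    def step(self, metric):
        """Return True when training should stop."""
        improved = (
            self.best is None
            or (metric > self.best if self.greater_is_better else metric < self.best)
        )
        if improved:
            self.best, self.counter = metric, 0
        else:
            self.counter += 1
        return self.counter >= self.patience

stopper = ToyEarlyStopping(patience=2)
```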
Zach Mueller
1211e616a4
Use inherit tempdir makers for tests + fix failing DS tests ( #35600 )
...
* Use existing APIs to make tempdir folders
* Fixup deepspeed too
* output_dir -> tmp_dir
2025-01-10 10:01:58 -05:00
Yih-Dar
bbc00046b9
Fix flaky test_custom_4d_attention_mask ( #35606 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 15:40:04 +01:00
Raushan Turganbay
52e1f87c7d
[WIP] Emu3: add model ( #33770 )
...
* model can convert to HF and be loaded back
* nit
* works in single batch generation but hallucinates
* use the image tokens
* add image generation
* now it works
* add tests
* update
* add modular but it doesn't work for porting docstring :(
* skip some tests
* add slow tests
* modular removed the import?
* guess this works
* update
* update
* fix copies
* fix test
* fix copies
* update
* docs
* fix tests
* last fix tests?
* pls
* repo consistency
* more style
* style
* remove file
* address comments
* tiny bits
* update after the new modular
* fix tests
* add one more cond in check attributes
* decompose down/up/mid blocks
* allow static cache generation in VLMs
* nit
* fix copies
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix VAE upsampling
* Update src/transformers/models/emu3/modular_emu3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address comments
* state overwritten stuff explicitly
* fix copies
* add the flag for flex attn
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-10 12:23:00 +01:00
Cyril Vallez
ccc0381d36
Fix flex_attention in training mode ( #35605 )
...
* fix flex
* add test
* style
2025-01-10 11:49:12 +01:00
Raushan Turganbay
e0646f3dce
Chat template: return vectorized output in processors ( #34275 )
...
* update chat template
* style
* fix tests
* Update src/transformers/image_utils.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* typehints + docs
* fix tests
* remove unnecessary warnings
* forgot code style :(
* allow users to pass backend and num frames
* Update docs/source/en/chat_templating.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/image_utils.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/image_utils.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/image_utils.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/image_utils.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/image_utils.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/image_utils.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/processing_utils.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* typo fix
* style
* address comments
* align with "pipeline" template
* update docs
* update docs
* unpack for all kwargs?
* wrong conflict resolution while rebasing
* tmp
* update docs
* Update docs/source/en/chat_templating.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_templating.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_templating.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_templating.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-10 11:05:29 +01:00
eustlb
5f087d1335
Add Moonshine ( #34784 )
...
* config draft
* full encoder forward
* full decoder forward
* fix sdpa and FA2
* fix sdpa and FA2
* moonshine model
* moonshine model forward
* fix attention with past_key_values
* add MoonshineForConditionalGeneration
* fix cache handling and causality for cross attention
* no causal attention mask for the encoder
* model addition (imports etc)
* small nit
* nits
* Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py
Co-authored-by: Joshua Lochner <admin@xenova.com>
* add rope_theta
* nits
* model doc
* Update src/transformers/models/auto/configuration_auto.py
Co-authored-by: Joshua Lochner <admin@xenova.com>
* imports
* add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES
* updates modular
* make
* make fix-copies
* ruff check examples fix
* fix check_modular_conversion
* nit
* nits
* nits
* copied from -> imports
* imports fix
* integrate attention refacto
* modular edge case
* remove encoder
* convolutions params in config
* run modular_model_converter
* make
* Update docs/source/en/model_doc/moonshine.md
Co-authored-by: Joshua Lochner <admin@xenova.com>
* MoonshineModelTest
* correct typo
* make style
* integration tests
* make
* modular convert
* name conversion update (up_proj -> fc1 etc)
* update config
* update MLP
* update attention
* update encoder layer
* update decoder layer
* update convolutions parameters
* update encoder
* remove INPUTS_DOCSTRING
* update decoder
* update conditional generation
* update pretrained model
* imports
* modular converted
* update doc
* fix
* typo
* update doc
* update license
* update init
* split config in file
* two classes for MLP
* attention from GLM
* from GlmRotaryEmbedding
* split MLP
* apply arthur's review suggestions
* apply arthur's review suggestions
* apply arthur's review suggestions
* auto feature extractor
* convert modular
* fix + make
* convert modular
* make
* unsplit config
* use correct checkpoint
* wrap generate
* update tests
* typos
* make
* typo
* update doc
---------
Co-authored-by: Joshua Lochner <admin@xenova.com>
2025-01-10 11:00:54 +01:00
Yih-Dar
6f127d3f81
Skip torchscript tests if a cache object is in model's outputs ( #35596 )
...
* fix 1
* fix 1
* comment
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-10 10:46:03 +01:00
Tom Aarsen
6b73ee8905
ModernBert: reuse GemmaRotaryEmbedding via modular + Integration tests ( #35459 )
...
* Introduce 5 integration tests for the 4 model classes + torch export
* ModernBert: reuse GemmaRotaryEmbedding via modular
* Revert #35589 , keep rope_kwargs; rely on them in modular_modernbert
* Revert "Revert #35589 , keep rope_kwargs; rely on them in modular_modernbert"
This reverts commit 11b44b9ee8.
* Don't set rope_kwargs; override 'self.rope_init_fn' call instead
2025-01-10 10:25:10 +01:00
Cyril Vallez
3a4ae6eace
Refactor/fix Cohere2 ( #35594 )
...
* refactor/fix cohere2
* add kwargs
* tests
* remove func and import it
2025-01-09 17:54:57 +01:00
Tom Aarsen
32e0db8a69
[tokenizers] Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer ( #35593 )
...
* Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer
in PreTrainedTokenizerFast, rather than relying on subclasses to take care of this.
* Simplify setting self.add_prefix_space, ensure pre_tok exists
* Wrap in try-except to catch 'Custom PreTokenizer cannot be serialized'
862d1a346a/bindings/python/src/pre_tokenizers.rs (L672)
produces the Exception. They're triggered by the roformer tests, as the RoFormerTokenizerFast uses a custom PreTokenizer.
* Propagate add_prefix_space in T5TokenizerFast to superclass
2025-01-09 17:46:50 +01:00
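The propagation bug can be illustrated with a toy: the `add_prefix_space` flag set on a fast tokenizer must also reach its backend pre-tokenizer, otherwise the two disagree. The classes below are hypothetical stand-ins, not the tokenizers API.

```python
# Toy fast tokenizer that propagates add_prefix_space to its pre-tokenizer.
class ToyPreTokenizer:
    def __init__(self, add_prefix_space=False):
        self.add_prefix_space = add_prefix_space

class ToyFastTokenizer:
    def __init__(self, add_prefix_space=False):
        self.add_prefix_space = add_prefix_space
        self.pre_tokenizer = ToyPreTokenizer()
        # the fix: push the flag down instead of relying on subclasses to do it
        self.pre_tokenizer.add_prefix_space = add_prefix_space

tok = ToyFastTokenizer(add_prefix_space=True)
```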
Cyril Vallez
46276f9a7f
Fix modular edge case + modular sorting order ( #35562 )
...
* look-ahead negation
* re add examples by default
* Fix the bug in topological sort
* Update create_dependency_mapping.py
* start adding test
* finalize test
* more tests
* style
* style
2025-01-09 17:17:52 +01:00
Yih-Dar
82dd6c14bb
Fix flaky SwitchTransformersModelTest::test_training_gradient ( #35587 )
...
* fix
* Update tests/models/switch_transformers/test_modeling_switch_transformers.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-09 15:36:22 +01:00
Arthur
eb4579cf43
tokenizer train from iterator without pre_tokenizers (#35396 )
...
* fix if else issues
* add a test
* fix the test
* style
2025-01-09 15:34:43 +01:00
Jack Morris
832c6191ed
Add inputs_embeds param to ModernBertModel ( #35373 )
...
* update modular_modernbert -- add inputs_embeds param to ModernBertModel
* Fix implementation issues; extend to other classes; docstring
First of all, the inputs_embeds shouldn't fully replace `self.embeddings(input_ids)`, because this call also does layer normalization and dropout. So, now both input_ids and inputs_embeds is passed to the ModernBertEmbeddings, much like how BertEmbeddings is implemented.
I also added `inputs_embeds` to the docstring, and propagated the changes to the other model classes.
I also introduced an error if input_ids and input_embeds are both or neither provided.
Lastly, I fixed an issue with device being based solely on input_ids with attention_mask.
* Propagate inputs_embeds to ModernBertForMaskedLM correctly
Also reintroduce inputs_embeds test
---------
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
2025-01-09 14:17:26 +01:00
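The "error if input_ids and input_embeds are both or neither provided" bullet can be sketched as a small validation helper, matching how other transformers models behave. This is an illustrative helper, not the ModernBert source.

```python
# Exactly one of input_ids / inputs_embeds must be provided.
def resolve_inputs(input_ids=None, inputs_embeds=None):
    if (input_ids is None) == (inputs_embeds is None):
        raise ValueError("You must specify exactly one of input_ids or inputs_embeds")
    return input_ids if input_ids is not None else inputs_embeds
```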
Yih-Dar
1b2f942af7
Fix flaky test_batching_equivalence ( #35564 )
...
* yes!
* oh no!!!
* oh no!!!
* style
* oh no!!!
* oh no!!!
* oh no!!!
* oh no!!!
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-01-09 14:00:08 +01:00
Cyril Vallez
965a2fb320
More model refactoring! ( #35359 )
...
* cohere
* style
* phi3
* style
* small fix
* small fix
* phi3 longrope
* oups
* Update rope (only for phi3 still)
* Update test_modeling_rope_utils.py
* Update modeling_phi3.py
* fix
* fix copies
* style
* Fix copied from bad renaming
2025-01-09 11:09:09 +01:00
nhamanasu
b32938aeee
Fix all output_dir in test_trainer.py to use tmp_dir ( #35266 )
...
* update codecarbon
* replace directly-specified-test-dirs with tmp_dir
* pass tmp_dir to all get_regression_trainer
* test_trainer.py: Use tmp_dir consistently for all output_dir arguments
* fix some with...as tmp_dir blocks
* reflect the comments to improve test_trainer.py
* refresh .gitignore
2025-01-08 19:44:39 +01:00
Joao Gante
76da6ca034
Pipeline: simple API for assisted generation ( #34504 )
...
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 17:08:02 +00:00
Arthur
3f483beab9
[PixtralLarge] Update Pixtral conversion script to support large format! ( #34801 )
...
* update conversion script
* update for bias again
* remove pdv
* use my dir
* Update how we initialize the tokenizer
* Convert in bfloat16
* Undo that one again
* fix config dump
* .to() was broken for BatchMixFeature
* quick debug breakpoint
* put the breakpoint in the right place
* Add a config flag for the multimodal projector bias
* Add a config flag for the multimodal projector bias
* Conversion script can load chat templates
* Indent config for comparison
* Stop clobbering the config
* Re-enable the config clobber
* Get rid of the config manual save - it has no effect!
* Handle adapter bias correctly
* Default vision transformer activation to silu
* Remove legacy processing path
* One commit with all the debug breakpoints before I delete them all, in case I need to revert
* Update conversion
* Remove vLLM debugging instrumentation
* Drop xformers
* Remove debug enumerates
* make fixup
* make fixup
* Break copied from in pixtral
* Propagate multimodal_projector_bias change
* Propagate multimodal_projector_bias change
* Remove debug device .to()
* Restore attention weights output
* Fix Pixtral test
* Drop image_seq_length
* Drop image_seq_length
* Put the legacy processing code back
* Add the bias option to the llava_next_video config
* Add the bias option to the llava_next_video config
* Make certain args required in converter
* Make certain args required in converter
* typo
* make fixup
* Reverting some dtype changes since it seems to work without them
---------
Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-166-244.ec2.internal>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2025-01-08 17:39:47 +01:00
NielsRogge
8490d3159c
Add ViTPose ( #30530 )
...
* First draft
* Make fixup
* Make forward pass work
* Improve code
* More improvements
* More improvements
* Make predictions match
* More improvements
* Improve image processor
* Fix model tests
* Add classic decoder
* Convert classic decoder
* Verify image processor
* Fix classic decoder logits
* Clean up
* Add post_process_pose_estimation
* Improve post_process_pose_estimation
* Use AutoBackbone
* Add support for MoE models
* Fix tests, improve num_experts%
* Improve variable names
* Make fixup
* More improvements
* Improve post_process_pose_estimation
* Compute centers and scales
* Improve postprocessing
* More improvements
* Fix ViTPoseBackbone tests
* Add docstrings, fix image processor tests
* Update index
* Use is_cv2_available
* Add model to toctree
* Add cv2 to doc tests
* Remove script
* Improve conversion script
* Add coco_to_pascal_voc
* Add box_to_center_and_scale to image_transforms
* Update tests
* Add integration test
* Fix merge
* Address comments
* Replace numpy by pytorch, improve docstrings
* Remove get_input_embeddings
* Address comments
* Move coco_to_pascal_voc
* Address comment
* Fix style
* Address comments
* Fix test
* Address comment
* Remove udp
* Remove comment
* [WIP] need to check if the numpy function is same as cv
* add scipy affine_transform
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* refactor convert
* add output_shape
* add atol 5e-2
* Use hf_hub_download in conversion script
* make box_to_center more applicable
* skip test_get_set_embedding
* fix to accept array and fix CI
* add co-contributor
* make it to tensor type output
* add torch
* change to torch tensor
* add more test
* minor change
* CI test change
* import torch should be above ImageProcessor
* make style
* try not use torch in def
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/vitpose_backbone/configuration_vitpose_backbone.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/vitpose/modeling_vitpose.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix
* fix
* add caution
* make more detail about dataset_index
* Update src/transformers/models/vitpose/modeling_vitpose.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* add docs
* Update docs/source/en/model_doc/vitpose.md
* Update src/transformers/models/vitpose/configuration_vitpose.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Revert "Update src/transformers/__init__.py"
This reverts commit 7ffa504450.
* change name
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/vitpose/test_modeling_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/vitpose/modeling_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* move vitpose only function to image_processor
* raise valueerror when using timm backbone
* use out_indices
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove camel-case of def flip_back
* rename vitposeEstimatorOutput
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix confused camelcase of MLP
* remove in-place logic
* clear scale description
* make consistent batch format
* docs update
* formatting docstring
* add batch tests
* test docs change
* Update src/transformers/models/vitpose/image_processing_vitpose.py
* Update src/transformers/models/vitpose/configuration_vitpose.py
* change ViT to Vit
* change to enable MoE
* make fix-copies
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* extract udp
* add more described docs
* simple fix
* change to accept target_size
* make style
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/vitpose/configuration_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change to `verify_backbone_config_arguments`
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove unnecessary copy
* make config immutable
* enable gradient checkpointing
* update inappropriate docstring
* linting docs
* split function for visibility
* make style
* check isinstances
* change to acceptable use_pretrained_backbone
* make style
* remove copy in docs
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/vitpose/modeling_vitpose.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* simple fix + make style
* change input config of activation function to string
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* tmp docs
* delete index.md
* make fix-copies
* simple fix
* change conversion to sam2/mllama style
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* refactor convert
* add supervision
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* remove redundant def
* separate code block for visualization
* add validation for num_moe
* final commit
* add labels
* [run-slow] vitpose, vitpose_backbone
* Update src/transformers/models/vitpose/convert_vitpose_to_hf.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* enable all conversion
* final commit
* [run-slow] vitpose, vitpose_backbone
* ruff check --fix
* [run-slow] vitpose, vitpose_backbone
* rename split module
* [run-slow] vitpose, vitpose_backbone
* fix pos_embed
* Simplify init
* Revert "fix pos_embed"
This reverts commit 2c56a4806e.
* refactor single loop
* allow flag to enable custom model
* efficiency of MoE to not use unused experts
* make style
* Fix range -> arange to avoid warning
* Revert MOE router, a new one does not work
* Fix postprocessing a bit (labels)
* Fix type hint
* Fix docs snippets
* Fix links to checkpoints
* Fix checkpoints in tests
* Fix test
* Add image to docs
---------
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 16:02:14 +00:00
Minho Shim
4349a0e401
fix: Qwen2-VL generate with inputs_embeds ( #35466 )
...
* fix: Qwen2-VL generate with inputs_embeds
* change: optional input_ids in get_rope_index
2025-01-08 16:36:03 +01:00
Sean (Seok-Won) Yi
88e18b3c63
Update doc for metric_for_best_model when save_strategy="best" ( #35389 )
...
* Updated docstring for _determine_best_metric.
* Updated docstring for metric_for_best_model.
* Added test case for save strategy.
* Updated incorrect test case.
* Changed eval_strategy to match save_strategy.
* Separated test cases for metric.
* Allow load_best_model when save_strategy == "best".
* Updated docstring for metric_for_best_model.
2025-01-08 16:32:35 +01:00
Pavel Iakubovskii
657bb14f98
Enable auto task for timm models in pipeline ( #35531 )
...
* Enable auto task for timm models
* Add pipeline test
2025-01-08 15:14:17 +00:00
Pavel Iakubovskii
59e5b3f01b
Timm wrapper label names ( #35553 )
...
* Add timm wrapper label names mapping
* Add index to classification pipeline
* Revert adding index for pipelines
* Add custom model check for loading timm labels
* Add tests for labels
* [run-slow] timm_wrapper
* Add note regarding label2id mapping
2025-01-08 14:09:46 +00:00
Jacky Lee
3c1895aa65
Fix Qwen2VL processor to handle odd number of frames ( #35431 )
...
* fix: processing odd number of frames
* feat: add test case
* update: test one frame
* feat: support custom patch size
* fix: test with videos
* revert: change on patch repeat
* fix: much wow
* update: fixups
* fixup pls
* ruff fixup
* fix typo at least
2025-01-08 13:49:00 +01:00
Quentin Lhoest
3fde88b19d
support chat generator as input of TextGenerationPipeline ( #35551 )
...
* support chat generator as input of TextGenerationPipeline
* missing import
* fix tests
* again
* simpler
* add test
2025-01-08 13:27:07 +01:00
Raushan Turganbay
d1681ec2b6
VLMs: major clean up 🧼 ( #34502 )
...
only llava models are modified
2025-01-08 10:35:23 +01:00
Jade Choghari
7176e06b52
Add TextNet ( #34979 )
...
* WIP
* Add config and modeling for Fast model
* Refactor modeling and add tests
* More changes
* WIP
* Add tests
* Add conversion script
* Add conversion scripts, integration tests, image processor
* Fix style and copies
* Add fast model to init
* Add fast model in docs and other places
* Fix import of cv2
* Rename image processing method
* Fix build
* Fix Build
* fix style and fix copies
* Fix build
* Fix build
* Fix Build
* Clean up docstrings
* Fix Build
* Fix Build
* Fix Build
* Fix build
* Add test for image_processing_fast and add documentation tests
* some refactorings
* Fix failing tests
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Introduce TextNet
* Fix failures
* Refactor textnet model
* Fix failures
* Add cv2 to setup
* Fix failures
* Fix failures
* Add CV2 dependency
* Fix bugs
* Fix build issue
* Fix failures
* Remove textnet from modeling fast
* Fix build and other things
* Fix build
* some cleanups
* some cleanups
* Some more cleanups
* Fix build
* Incorporate PR feedbacks
* More cleanup
* More cleanup
* More cleanup
* Fix build
* Remove all the references of fast model
* More cleanup
* Fix build
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Fix Build
* Fix build
* Fix build
* Fix build
* Fix build
* Fix build
* Incorporate PR feedbacks
* Fix style
* Fix build
* Incorporate PR feedbacks
* Fix image processing mean and std
* Incorporate PR feedbacks
* fix build failure
* Add assertion to image processor
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* fix style failures
* fix build
* Fix ImageClassification's linear layer, also introduce TextNetImageProcessor
* Fix build
* Fix build
* Fix build
* Fix build
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Fix build
* Incorporate PR feedbacks
* Remove some script
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Fix image processing in textnet
* Incorporate PR Feedbacks
* Fix CI failures
* Fix failing test
* Fix failing test
* Fix failing test
* Fix failing test
* Fix failing test
* Fix failing test
* Add textnet to readme
* Improve readability
* Incorporate PR feedbacks
* fix code style
* fix key error and convert working
* tvlt shouldn't be here
* fix test modeling test
* Fix tests, make fixup
* Make fixup
* Make fixup
* Remove TEXTNET_PRETRAINED_MODEL_ARCHIVE_LIST
* improve type annotation
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update tests/models/textnet/test_image_processing_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* improve type annotation
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* space typo
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* improve type annotation
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/textnet/configuration_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* make conv layer kernel sizes and strides default to None
* Update src/transformers/models/textnet/modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/textnet/modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix keyword bug
* add batch init and make fixup
* Make fixup
* Update integration test
* Add figure
* Update textnet.md
* add testing and fix errors (classification, imgprocess)
* fix error check
* make fixup
* make fixup
* revert to original docstring
* add make style
* remove conflict for now
* Update modeling_auto.py
resolved a confusion in `timm_wrapper` - it was causing some conflicts
* Update tests/models/textnet/test_modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/textnet/modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update tests/models/textnet/test_modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/textnet/modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* add changes
* Update textnet.md
* add doc
* add authors hf ckpt + rename
* add feedback: classifier/docs
---------
Co-authored-by: raghavanone <opensourcemaniacfreak@gmail.com>
Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co>
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 09:52:51 +01:00