* First commit
* Finish model implementation
* First commit
* Finish model implementation
* Register zamba2
* generated modeling and configuration
* generated modeling and configuration
* added hybrid cache
* fix attention_mask in mamba
* dropped unused loras
* fix flash2
* config docstrings
* fix config and fwd pass
* make fixup fixes
* text_modeling_zamba2
* small fixes
* make fixup fixes
* Fix modular model converter
* added inheritances in modular, renamed zamba cache
* modular rebase
* new modular conversion
* fix generated modeling file
* fixed import for Zamba2RMSNormGated
* modular file cleanup
* make fixup and model tests
* dropped inheritance for Zamba2PreTrainedModel
* make fixup and unit tests
* Add inheritance of rope from GemmaRotaryEmbedding
* moved rope to model init
* drop del self.self_attn and del self.feed_forward
* fix tests
* renamed lora -> adapter
* rewrote adapter implementation
* fixed tests
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Dropped adapter in-place sum
* removed rope from attention init
* updated rope
* created get_layers method
* make fixup fix
* make fixup fixes
* make fixup fixes
* update to new attention standard
* update to new attention standard
* make fixup fixes
* minor fixes
* cache_position
* removed cache_position postion_ids use_cache
* remove config from modular
* removed config from modular (2)
* import apply_rotary_pos_emb from llama
* fixed rope_kwargs
* Instantiate cache in Zamba2Model
* fix cache
* fix @slow decorator
* small fix in modular file
* Update docs/source/en/model_doc/zamba2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* several minor fixes
* inherit mamba2decoder fwd and drop position_ids in mamba
* removed docstrings from modular
* reinstate zamba2 attention decoder fwd
* use regex for tied keys
* Revert "use regex for tied keys"
This reverts commit 9007a522b1.
* use regex for tied keys
* add cpu to slow forward tests
* dropped config.use_shared_mlp_adapter
* Update docs/source/en/model_doc/zamba2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* re-convert from modular
---------
Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* use torch.testing.assertclose instead to get more details about error in cis
* fix
* style
* test_all
* revert for I bert
* fixes and updates
* more image processing fixes
* more image processors
* fix mamba and co
* style
* less strick
* ok I won't be strict
* skip and be done
* up
`return unittest.skip()` used in the `test_model_parallel_beam_search` in
skip condition for xpu did not actually mark test to be skipped running
under pytest:
* 148 passed, 1 skipped
Other tests use `self.skipTest()`. Reusing this approach and moving the
condition outside the loop (since it does not depend on it) allows to skip
for xpu correctly:
* 148 skipped
Secondly, `device_map="auto"` is now implemented for XPU for IPEX>=2.5 and
torch>=2.6, so we can now enable these tests for XPU for new IPEX/torch
versions.
Fixes: 1ea3ad1ae ("[tests] use `torch_device` instead of `auto` for model testing (#29531)")
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* model can convert to HF and be loaded back
* nit
* works in single batch generation but hallucinates
* use the image tokens
* add image generation
* now it works
* add tests
* update
* add modulare but it doesn't work for porting docstring :(
* skip some tests
* add slow tests
* modular removed the import?
* guess this works
* update
* update
* fix copies
* fix test
* fix copies
* update
* docs
* fix tests
* last fix tests?
* pls
* repo consistency
* more style
* style
* remove file
* address comments
* tiny bits
* update after the new modular
* fix tests
* add one more cond in check attributes
* decompose down/up/mid blocks
* allow static cache generation in VLMs
* nit
* fix copies
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/emu3.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix VAE upsampling
* Update src/transformers/models/emu3/modular_emu3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* address comments
* state overwritten stuff explicitly
* fix copies
* add the flag for flex attn
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* initial commit for PR
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
* rename dynamic cache
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* add more unit tests
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* add integration test
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* add integration test
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* Add modular bamba file
* Remove trainer changes from unrelated PR
* Modify modular and cofig to get model running
* Fix some CI errors and beam search
* Fix a plethora of bugs from CI/docs/etc
* Add bamba to models with special caches
* Updat to newer mamba PR for mamba sublayer
* fix test_left_padding_compatibility
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* fix style
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* fix remaining tests
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* missed this test
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* ran make style
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* move slow tag to integration obj
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* make style
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* address comments
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* fix modular
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* left out one part of modular
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* change model
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* Make Rotary modular as well
* Update bamba.md
Added overview, update Model inference card and added config
* Update bamba.md
* Update bamba.md
* Update bamba.md
Minor fixes
* Add docs for config and model back
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Add warning when using fast kernels
* replaced generate example
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* Address comments from PR
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Propagate attention fixes
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Fix attention interfaces to the new API
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Fix API for decoder layer
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Remove extra weights
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
---------
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Co-authored-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: divya-kumari32 <72085811+divya-kumari32@users.noreply.github.com>
Co-authored-by: Antoni Viros <ani300@gmail.com>
* softcapping
* soft cap before the mask
* style
* ...
* super nit
* update
* fixes
* update
* small issue with modular
* fix modular imports
* update
* fixup
* simplify a hell lot
* simplify cleaning imports
* finish fixing
* update our design
* nits
* use a deprecation cycle
* updates
* Fix modular (recursive deps need to always be computed after merges!)
* push
* fix
* update
* fix modular order
* make fix-copies
* updates
* update
* ?
* don't compile for now
* ?
* fix some stuff
* donc!
* fix copies
* update
* fixup
* ?
* fix two tests
* fix?
* for now, don't use head info
* eager when output attentoin and sdpa or flash as it's the simplest behaviour (for our tests as well :))
* fix-copies
* revert sdpa check
* Apply suggestions from code review
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
* rebase, fix-copies and push
* add a slow integration test
* update the test
* fix left padding issue
* fix test
* remove duplicate scaling
* quality
* add a small test and make sure it works
* 2b
---------
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
* blip2 tests
* instructblips
* copies
* fix slow tests
* fix
* uncomment this
* clean up after rebase
* should be model main input
* fix overwritten tests
* oops len should be multiple of frame number
* style
* fix some tests
* tmp commit
* tmp commit
* cull overwrites of deleted tests
* typo
* more specific docstring
* make fixup
* parameterize at the top?
* correction
* more deletions :D
* tmp commit
* for VLMs too
* fix _check_outputs
* test nit
* make fixup
* fix another flaky
* test_generate_from_inputs_embeds -- handle missing attention mask
* Add SynthIDTextWatermarkLogitsProcessor
* esolving comments.
* Resolving comments.
* esolving commits,
* Improving SynthIDWatermark tests.
* switch to PT version
* detector as pretrained model + style
* update training + style
* rebase
* Update logits_process.py
* Improving SynthIDWatermark tests.
* Shift detector training to wikitext negatives and stabilize with lower learning rate.
* Clean up.
* in for 7B
* cleanup
* upport python 3.8.
* README and final cleanup.
* HF Hub upload and initiaze.
* Update requirements for synthid_text.
* Adding SynthIDTextWatermarkDetector.
* Detector testing.
* Documentation changes.
* Copyrights fix.
* Fix detector api.
* ironing out errors
* ironing out errors
* training checks
* make fixup and make fix-copies
* docstrings and add to docs
* copyright
* BC
* test docstrings
* move import
* protect type hints
* top level imports
* watermarking example
* direct imports
* tpr fpr meaning
* process_kwargs
* SynthIDTextWatermarkingConfig docstring
* assert -> exception
* example updates
* no immutable dict (cant be serialized)
* pack fn
* einsum equivalent
* import order
* fix test on gpu
* add detector example
---------
Co-authored-by: Sumedh Ghaisas <sumedhg@google.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: sumedhghaisas2 <138781311+sumedhghaisas2@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
* auto-gptq requirement is removed & model is changed & tokenizer pad token is assigned
* values func is changed with extensions & sequence key value bug is fixed
* map key value check is added in ExtensionsTree
* empty trimmed_ids bug is fixed
* tail_id IndexError is fixed
* empty trimmed_ids bug fix is updated for failed test
* too much specific case for specific tokenizer is removed
* input_ids check is updated
* require auto-gptq import is removed
* key error check is changed with empty list check
* empty input_ids check is added
* empty trimmed_ids fix is checked with numel function
* usage change comments are added
* test changes are commented
* comment style and quality bugs are fixed
* test comment style and quality bug is fixed
* add idefics
* conflicts after merging main
* enable tests but need to fix some
* fix tests
* no print
* fix/skip some slow tests
* continue not skip
* rebasing broken smth, this is the fix