* changes for video
* update modular
* change get_video_features
* update video token replacement
* update modular
* add test and fix typo
* lint
* fix order
* lint
* fix
* remove dependency
* lint
* lint
* remove todo
* resize video for test
* lint
* fix test
* add a new processor for video_test
* fix test
Also add notes asking users to set `TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1`
or call `torch._dynamo.config.capture_scalar_outputs = True`, as reading a
scalar output otherwise currently causes a graph break.
Signed-off-by: Hollow Man <hollowman@opensuse.org>
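As an illustration of the note above, a minimal sketch assuming PyTorch 2.x (the `ScalarModel` class is hypothetical, not part of the patch):

```python
import torch
import torch.nn as nn

# Enable scalar-output capture before compiling; without this, calls like
# tensor.item() inside the model cause a graph break.
torch._dynamo.config.capture_scalar_outputs = True
# Equivalently: set TORCHDYNAMO_CAPTURE_SCALAR_OUTPUTS=1 in the environment.

class ScalarModel(nn.Module):
    def forward(self, x):
        n = int(x.sum().item())  # data-dependent scalar read
        return x * n

compiled = torch.compile(ScalarModel())
print(compiled(torch.ones(4)))  # tensor([4., 4., 4., 4.])
```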
* ensure the query is updated during training
avoid unused parameters that DDP does not like
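A minimal sketch of the pattern (the `QueryModel` module is hypothetical; the point is that a learned query must contribute to the loss so DDP sees a gradient for it):

```python
import torch
import torch.nn as nn

class QueryModel(nn.Module):
    def __init__(self, dim=8):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, dim))  # learned query
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        # The query participates in every forward pass; a parameter that
        # never receives gradients makes DDP raise a reduction error.
        q = self.query.expand(x.size(0), -1)
        return self.proj(x + q)

out = QueryModel()(torch.randn(2, 8))
out.sum().backward()  # query.grad is populated, so DDP can reduce it

# Alternative when some parameters are legitimately unused:
# DistributedDataParallel(model, find_unused_parameters=True)
```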
* avoid a crash when `kwargs` contain `padding=True`
trainers often pass this argument automatically
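A hypothetical sketch of the defensive handling (`ToyProcessor` is illustrative, not the actual processor class):

```python
class ToyProcessor:
    def __call__(self, text=None, **kwargs):
        # Trainers often inject padding=True automatically; consume it here
        # instead of forwarding it to a component that would reject it.
        padding = kwargs.pop("padding", False)
        tokens = [t.split() for t in text]
        if padding:
            width = max(len(t) for t in tokens)
            tokens = [t + ["<pad>"] * (width - len(t)) for t in tokens]
        return tokens

print(ToyProcessor()(text=["a b c", "d"], padding=True))
# [['a', 'b', 'c'], ['d', '<pad>', '<pad>']]
```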
* minor
* Remove mel_spec lazy init, and rename to mel_filters.
this ensures save_pretrained will not crash when saving the processor during training
d5d007a1a0/src/transformers/feature_extraction_utils.py (L595)
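A rough sketch of the idea (the `ToyFeatureExtractor` below is hypothetical; the real change lives in the feature extractor referenced above):

```python
import numpy as np

class ToyFeatureExtractor:
    def __init__(self, feature_size=80, sampling_rate=16000, n_fft=400):
        self.feature_size = feature_size
        self.sampling_rate = sampling_rate  # exposed like most extractors
        # Built eagerly (and named mel_filters) instead of lazily attaching
        # a module later, so every attribute save_pretrained serializes is a
        # plain value. Zeros stand in for a real mel filter bank computation.
        self.mel_filters = np.zeros((n_fft // 2 + 1, feature_size))
```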
* minor - most feature extractors have a `sampling_rate` property
* speedup relative position embeddings
* fix several issues in model saving/loading:
- avoid modifying `self._hf_peft_config_loaded` when saving
- adapter_config should point to the model save dir: it currently points automatically to the original base model, but a fine-tuned version should reference its own save dir (see the sketch below).
- fix model weight names that get changed by adding an adapter.
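A hedged sketch of the save-time adjustment (function and key names are illustrative; it assumes a `peft_config` entry as in the `peft` library):

```python
import copy

def save_adapter(model, save_dir):
    # Work on a copy instead of toggling model._hf_peft_config_loaded
    # (or other live state) during save.
    peft_config = copy.deepcopy(model.peft_config["default"])
    # A fine-tuned checkpoint should reference its own save dir,
    # not the original base model it started from.
    peft_config.base_model_name_or_path = save_dir
    peft_config.save_pretrained(save_dir)
```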
* minor
* minor
* minor
* fixing a crash without peft active
* add todo to replace einsum
* granite speech speedups:
1. register attention_dist (as a buffer) to avoid a CPU-to-GPU transfer on every layer (see the sketch below).
2. pad_sequence is much faster than per-sample padding + concat.
3. avoid moving audio back to the CPU when using a compute device.
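A minimal sketch of items 1 and 2 (the `Layer` class is hypothetical; `attention_dist` follows the commit's naming):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# 2. One vectorized pad instead of per-sample F.pad + torch.cat:
samples = [torch.randn(n) for n in (300, 512, 128)]
batch = pad_sequence(samples, batch_first=True)  # shape (3, 512)

# 1. A constant registered as a buffer moves to the compute device with the
#    module once, avoiding a CPU-to-GPU transfer on every layer call:
class Layer(torch.nn.Module):
    def __init__(self, dist):
        super().__init__()
        self.register_buffer("attention_dist", dist, persistent=False)
```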
* support audio.shape=(1,L)
* docs: update LLaVA-NeXT model card
* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* [docs] Updated llava_next model card
* Update docs/source/en/model_doc/llava_next.md: remove image sources
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* [fix] Change Flash Attention to SDPA badge
* [doc] fixed quantization example
* docs: updated contribution details and badges
* Update llava_next.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update marian.md
This update improves the Marian model card to follow the Hugging Face standardized model card format. The changes include:
- Added a clear description of MarianMT, its architecture, and how it differs from other models.
- Provided usage examples for Pipeline and AutoModel.
- Added a quantization example for optimizing model inference.
- Included instructions and examples for multilingual translation with language codes (see the example after this list).
- Added an Attention Mask Visualizer example.
- Added a Resources section with relevant links to papers, the Marian framework, language codes, tokenizer guides, and quantization documentation.
- Fixed formatting issues in the code blocks for correct rendering.
This update improves the readability, usability, and consistency of the Marian model documentation for users.
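A short example of the multilingual usage the card describes (assuming the Helsinki-NLP/opus-mt-en-roa checkpoint, which selects the target language with a `>>code<<` prefix):

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-roa"  # English -> Romance languages
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# The >>fra<< token routes the translation to French.
batch = tokenizer([">>fra<< How are you today?"], return_tensors="pt", padding=True)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```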
* Update docs/source/en/model_doc/marian.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update marian.md
* Update docs/source/en/model_doc/marian.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update marian.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add initial structure
* doc fixes, add model base logic
* update init files
* some fixes to config and modular
* some improvements for attention
* format
* remove unused attn
* some fixes for moe layer and for decoder
* adapt _compute_yarn_parameters for deepseek
* format
* small fix
* fix for decoder forward
* add tests, small refactoring
* fix dummies
* fix init
* fix doc
* fix config docs
* add sequence doc, fix init for gate
* fix issues in tests
* fix config doc
* remove unused args
* some fixes and refactoring after review
* fix doc for config
* small fixes for config args
* revert config refactoring
* small refactoring
* minor fixes after rebase
* small fix after merge
* fix modular
* remove rotaryembd from public init
* small test fix
* some rotary pos calculation improvement
* fix format
* some improvements and fixes
* fix config
* some refactoring
* adjust some unit tests
* skip test
* small fixes and tests adjustment
* reapply modular
* fix all tests except Integration
* fix integration tests
* cleanup BC stuff
* rope
* fix integration tests based on A10
* style
---------
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
* Add Doge Model
* Fix code quality
* Rollback an error commit
* Fix config for open-source weights
* Revert "Fix config for open-source weights"
This reverts commit 229cdcac10.
* Add modular_doge
* Update Doge inherits from Llama
* Fix import bug
* [docs] Add usage of doge model
* Fix Doge to import PretrainedConfig from configuration_utils instead of modeling_utils
* [docs] remove trust remote code from doge
* Fix dynamo bug in doge model
* Update docstrings
* Import apply_rotary_pos_emb and repeat_kv from Llama
* Fix all nits
* Fix code quality
* Fix some bugs
* Fix code quality
* Remove inherited `_update_causal_mask` from Llama
Inheriting it led to incorrect weight initialization.
* Fix the wrong tensor orderings in DogeCDMoE
* Fix attention mask bug
We have to provide attention_mask for dynamic mask computation
* Modify most implementations to inherit from Llama
But there are two problems:
1. `flex_attention_forward` is not updated properly
2. `Example` error in the forward method of DogeForCausalLM
* Modify CDMoE for a batch-efficient implementation
* Unify MoE configuration names, just like QwenMoE
* Fix code quality
* Fix code quality
* Fix code quality
* Add tp plan of CDMoE Module
* Hybrid DMA with sliding window
* Update valid tokens greater than window size
* Fix code quality
* Add `convert_doge_weights_to_hf`
* Fix STATE_DICT_MAPPING in convert_doge_weights_to_hf.py
* Fix nits in modular_doge
* Fix code quality
* Fix all nits
* Fix all nits
* Make sure the attention function is updated inside the class
* Fix code quality issues in the Doge model and add a test for it
* Fix `test_generate`
* Fix code quality
* Fix nits following suggestions
* Fix code quality
* Fix code quality issues
* Fix nits
* Fix code quality nits
* Fix the missing parameters in the configuration.
* Fix the missing parameters in the configuration.
* Fix nits
* Add initialization of attention
* Fix last nits
* Simplify dynamic mask generation logic
* Rename router_logits to gate_logits for matching latest changes of MixtralModel
* Rename typings for matching latest changes of MixtralModel
* Fixes typo in comment
* Update src/transformers/models/doge/modular_doge.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fix code quality issues to match other modular
* Fix code quality issues to match other modular
* Fix the static compilation errors
* Update model weights link
* Fix code quality issues to match other modular
* reapply modular and support for new outputs
* style
* simplify a lot
* fix import location
* reapply modular
* fix
* fix integration test
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
* Fix errors when using verl to train the GLM4.1v model
* Support glm4v load from AutoModelForVision2Seq
* Set glm4v model _checkpoint_conversion_mapping attr from None to {}
* Update modeling_auto.py
* fix(decoding): stop beam search per-instance when heuristic satisfied
Previously, when `early_stopping` was set to `False`, the early-stopping heuristic only halted generation once **all** batch instances reached the criterion. Instances that the heuristic already deemed impossible to improve therefore kept generating, leading to inconsistent and overlong outputs across the batch.
Now the heuristic is applied **per-instance**: once all beams of a batch instance become impossible to improve, that instance is marked finished while the others continue. This restores the expected behavior and ensures consistency in batched generation.
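An illustrative sketch of the per-instance check (variable and function names are hypothetical, not the actual `generation/utils.py` internals):

```python
import torch

def update_done(done, best_finished_scores, best_possible_running_scores):
    # An instance is finished as soon as the best score its running beams
    # could still reach is worse than its best finished hypothesis.
    heuristic_unsatisfied = best_possible_running_scores < best_finished_scores
    return done | heuristic_unsatisfied

done = torch.tensor([False, False])
best_finished = torch.tensor([-0.5, -2.0])  # per-instance finished log-probs
best_possible = torch.tensor([-1.0, -0.5])  # ceiling for each instance's running beams
print(update_done(done, best_finished, best_possible))
# tensor([ True, False]) -> instance 0 stops; instance 1 keeps generating.
```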
* Add test case GenerationIntegrationTests.test_beam_search_early_stop_heuristic
* Update naming improvement_possibility -> is_early_stop_heuristic_unsatisfied
* Add comments for early stop heuristic
* Update src/transformers/generation/utils.py
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
- Complete Apache License text in Italian documentation
- Remove duplicate variable assignment in Perceiver converter
- Fix typo in MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES constant
* update Chameleon XPU bnb ground truth on the bnb triton backend, since we are
deprecating the ipex backend
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* enable hqq unit tests on XPU, all passing
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* fix comment
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* update the glm4 model readme
* update test
* update GLM-4.1V model
* update as format
* update
* fix some tests
* fix the rest
* fix on A10, not T4
* nit: dummy import
---------
Co-authored-by: raushan <raushan@huggingface.co>
* Update LED model card
* Remove extra arguments
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>