* Iterative generation using input embeds
* Add Janus model
* discard changes
* Janus imports
* Refactor config and processor
* Added Vision tower of Janus
* Import Janus Image processor
* Vision tower fixes
* Refactor code
* Added VQ Model
* Complete model integration
* temp conversion script
* processor refactor
* Adding files to facilitate pulling
* Fixes after debugging
* Skip test for these models
* Add Janus Model
* discard changes
* Janus imports
* Refactor config and processor
* Added Vision tower of Janus
* Import Janus Image processor
* Vision tower fixes
* Refactor code
* Added VQ Model
* Complete model integration
* temp conversion script
* processor refactor
* Adding files to facilitate pulling
* Fixes after debugging
* Refactor to Text config
* ✨ Added generate function
* Saving intermediate convert file. Still need to read configs from the hub and convert them to our format.
* Adding version that reads from the JSON files. Still have to tweak some parameters manually.
* relative imports
* Initial tests
* Refactor image processor
* Seemingly working version of the conversion script, will need to test further.
* Adding command message
* Fixing conflicting JanusTextConfig class
* Incorporating some of the discussed changes.
* Small fix to create dir.
* Removing system from JINJA template
* Adding draft processor tests
* style fixes
* Minor fixes and enhancement
* added generation config
* Initial tests
* Small modifications, tests are now passing.
* Small changes I noticed while reading code.
* more fixes
* Added JanusModel class
* Small merge adaptations
* Small merge adaptations
* Image processing tests passing
* More tests and fixes
* Convert script updated and refactored
* Tests and cleanup
* make style
* Postprocessing for image generation
* generate refactor
* fixes
* - Passing tests that write a part of the model to cpu (e.g. test_cpu_offload)
- Passing tests of dispatching SDPA
- Only gradient checkpointing tests are left.
* Removing temporary code
* Changes
* Writing change to modular
* Added JanusVisionModel. SDPA dispatch tests pass more robustly. Gradient checkpoint tests are next
* Gradient checkpoint tests passing
* Removing debug code
* Major generate refactor 😮💨
* Temp changes for testing
* Green quality CI
* 2 out of 4 integration tests passing
* breadcrumbs
* Usage Examples
* Regenerate modeling after merge
* dirty code
* JanusIntegrationTest are passing
* breadcrumbs
* happy CI
* fixes
* Changing template
* nits
* Text generation logits matching original codebase at 100% precision
* Remove ./tmp from git tracking
* Remove ./tmp from git tracking
* Checkpointing changes after reviewing
* Fixing code in docstrings
* CHanging comments and small bug in convert file
* Fixing bug in image_token_id for 7B version
* Removing line that was added by both of us
* Pushing changes after discussion. Only one left is to change the key mapping for convert file.
* Updating module file
* New convert file using dict. Tested that it is equivalent to the old one by:
- comparing keys in a script
- comparing checksums of the output files between version generated with the current convert script and those generated with the old script. This is a more reliable test.
* revert changes
* mistake
* consistency change for CI
* make style
* doc fixes
* more fixes
* experimenting with masking out pad token
* checkpoint
* Batched generation with multi-images working for 1B models. Will test 7B next.
* Device fix.
* Writing changes to modular, previous ones were written to modeling just for quick testing.
* Using passed processor attention mask (only in modeling for now)
* Matching performance done in the non-standard way
* Working version of batched generation. Will change how some args are passed to make it more similar to language case
* More compliant version of the code
* Removed duplicated `_prepare_4d_causal_attention_mask_with_cache_position`
* Updating modular file, making masked filling with paddings more efficient
* Slightly more efficient version
* Modifying JanusVisionModel to be a wrapper
* Fixing test to comply with new names
* Modular overhaul
* More refactoring
* - Changing JanusVisionModel back
- Changing forward pass
- Adding boi token to the comparison
* - Removing whole context model_ids
- Using inherited implementation of prepare_inputs_for_generation
* Moving the way boi token is passed to the model
* Fixing sdpa test
* Minor changes
* testing changes
* Minor fix
* - Adding postprocessing test
- checking values of generated image on integration test
* changes
* Removing pooled attention vision module, fixing convert script as a consequence
* More changes
* Fixes
* Draft after merge
* Bug fixes
* More bug fix
* Fixing docs
* Nits
* Refactor return dict
* Moving image post processing test to main processor post process
* Passing guidance_scale as kwarg
* make style
* 🔥 refactor
* make style
* Update and green CI
* Nits and tests update
* up
* Added MID block
* fix
* Dead code
* update testcase
* update
* model_id change
* init_weight changes
---------
Co-authored-by: hsilva664 <metallic-silver@hotmail.com>
* add support for fast tokenizer
* make style
* fix according to reviews
* make style
* relax slow_fast_equivalence mean diff
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
* added efficientnet image preprocessor but tests fail
* ruff checks pass
* ruff formatted
* properly pass rescale_offset through the functions
* - corrected indentation, ordering of methods
- reshape test passes when casted to float64
- equivalence test doesn't pass
* all tests now pass
- changes order of rescale, normalize acc to slow
- rescale_offset defaults to False acc to slow
- resample was causing difference in fast and slow. Changing test to bilinear resolves this difference
* ruff reformat
* F.InterpolationMode.NEAREST_EXACT gives TypeError: Object of type InterpolationMode is not JSON serializable
* fixes offset not being applied when do_rescale and do_normalization are both true
* - using nearest_exact sampling
- added tests for rescale + normalize
* resolving reviews
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* initial documentation
* rename mask to attention_mask
* smaller tests
* fixup
* fix copies
* move to time series section
* sort docs
* isort fix
* batch_size is not a configuration
* rename to TimesFMModelForPrediction
* initial script
* add check_outputs
* remove dropout_rate
* works with torch.Tensor inputs
* rename script
* fix docstrings
* fix freq when window_size is given
* add loss
* fix _quantile_loss
* formatting
* fix isort
* add weight init
* add support for sdpa and flash_attention_2
* fixes for flash_attention
* formatting
* remove flash_attention
* fix tests
* fix file name
* fix quantile loss
* added initial TimesFMModelIntegrationTests
* fix formatting
* fix import order
* fix _quantile_loss
* add doc for SDPA
* use timesfm 2.0
* bug fix in timesfm decode function.
* compare mean forecasts
* refactor type hints, use CamelCase
* consolidate decode func
* more readable code for weight conversion
* fix-copies
* simpler init
* renaem TimesFmMLP
* use T5LayerNorm
* fix tests
* use initializer_range
* TimesFmModel instead of TimesFmDecoder
* TimesFmPositionalEmbedding takes config for its init
* 2.0-500m-pytorch default configs
* use TimesFmModel
* fix formatting
* ignore TimesFmModel for testing
* fix docstring
* override generate as its not needed
* add doc strings
* fix logging
* add docstrings to output data classes
* initial copy from t5
* added config and attention layers
* add TimesFMPositionalEmbedding
* calcuate scale_factor once
* add more configs and TimesFMResidualBlock
* fix input_dims
* standardize code format with black
* remove unneeded modules
* TimesFM Model
* order of imports
* copy from Google official implementation
* remove covariate forecasting
* Adapting TimesFM to HF format
* restructing in progress
* adapted to HF convention
* timesfm test
* the model runs
* fixing unit tests
* fixing unit tests in progress
* add post_init
* do not change TimesFMOutput
* fixing unit tests
* all unit tests passed
* remove timesfm_layers
* add intermediate_size and initialize with config
* initial documentation
* rename mask to attention_mask
* smaller tests
* fixup
* fix copies
* move to time series section
* sort docs
* isort fix
* batch_size is not a configuration
* rename to TimesFMModelForPrediction
* initial script
* add check_outputs
* remove dropout_rate
* works with torch.Tensor inputs
* rename script
* fix docstrings
* fix freq when window_size is given
* add loss
* fix _quantile_loss
* formatting
* fix isort
* add weight init
* add support for sdpa and flash_attention_2
* fixes for flash_attention
* formatting
* remove flash_attention
* fix tests
* fix file name
* fix quantile loss
* added initial TimesFMModelIntegrationTests
* fix formatting
* fix import order
* fix _quantile_loss
* add doc for SDPA
* use timesfm 2.0
* bug fix in timesfm decode function.
* compare mean forecasts
* refactor type hints, use CamelCase
* consolidate decode func
* more readable code for weight conversion
* fix-copies
* simpler init
* renaem TimesFmMLP
* use T5LayerNorm
* fix tests
* use initializer_range
* TimesFmModel instead of TimesFmDecoder
* TimesFmPositionalEmbedding takes config for its init
* 2.0-500m-pytorch default configs
* use TimesFmModel
* fix formatting
* ignore TimesFmModel for testing
* fix docstring
* override generate as its not needed
* add doc strings
* fix logging
* add docstrings to output data classes
* add _CHECKPOINT_FOR_DOC
* fix comments
* Revert "fix comments"
This reverts commit 8deeb3e191.
* add _prepare_4d_attention_mask
* we do not have generative model classes
* use Cache
* return past_key_values
* modules initialized with config only
* update year
* Update docs/source/en/model_doc/timesfm.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add layer_idx to cache
* modular timesfm
* fix test
* unwrap sequential class
* fix toctree
* remove TimesFmOnnxConfig
* fix modular
* remove TimesFmStackedDecoder
* split qkv layer into individual layers
* rename projection layers
* use ALL_ATTENTION_FUNCTIONS
* is_causal is True
* rename config
* does not support flash_attn_2
* formatting
* fix typo in docsstring
* rename inputs
* add time series mapping
* Update src/transformers/models/olmo2/modeling_olmo2.py
* Update src/transformers/models/moonshine/modeling_moonshine.py
* use updated arguments
* fix class name
* add MODEL_FOR_TIME_SERIES_PREDICTION_MAPPING
* isort
* consolidate _preprocess into forward
* fix a typo
* fix a typo
* fix toc
* fix modular
* remove aaserts
* use self.config._attn_implementation
* move to _postprocess_output
* remove timesfm_get_large_negative_number
* use view unstead of multiple unsqueeze
* make helpers static methods of the Model
* use to_tuple
* use to_tuple if not return_dict
* remove unused intitialization block as its incorporated in nn.Linear
* remove unused num_key_value_groups
* use the same convention as the masking method
* update modular
* do not use unsqueeze
* use view instead of unsqueeze
* use buffer for inv_timescales
* formatting
* modular conversion
* remove unneeded intialization
* add missing docstrings
* remove cache
* use simple_eager_attention_forward
* support tp_plan
* support for flex and flash attention masks
* Revert "support for flex and flash attention masks"
This reverts commit def36c4fcf.
* fix device
* fix tests on gpu
* remove unsued large model test
* removed unneeded comments
* add example usage
* fix style
* add import
* Update docs/source/en/model_doc/timesfm.md
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
* inherit from LlamaRMSNorm
* use can_return_tuple decorator
* remvoe return_dict
* fix year
* Update docs/source/en/model_doc/timesfm.md
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
* pretrained does not inherit from GenerationMixin
* use model for integration test
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Rajat Sen <rsen91@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
* refactor docs
* add serialization
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* reorder
* add link
* change automatic to autoquant
Co-authored-by: DerekLiu35 <91234588+DerekLiu35@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* nits
* refactor
* add colab
* update
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: DerekLiu35 <91234588+DerekLiu35@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Refactor ColPali model documentation
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Include quantisation exemple + real images
* simpler image loading
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update VITS model card
* Update docs/source/en/model_doc/vits.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/vits.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/vits.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/vits.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update vits.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* VDR task guide
* Add to toctree
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix and enhance pipeline_webserver.md
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
* Update docs/source/en/pipeline_webserver.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/pipeline_webserver.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* use pipe
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
---------
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add MLCD model
* Update codes for auto-mapping
* Add test scripts for MLCD
* Update doc for MLCD model
* Fix import error
* Fix import error
* Fix CI error for attention_outputs
* Fix code style for CI
* Fix code style for CI
* Fix code style for CI
* Fix code style for CI
* Fix code style for CI
* Fix CI error for initialization
* Fix code style for CI
* Fix code style for CI
* Reformat codes and docs for CI test
* Reformat codes and docs for CI test
* Remove unused attributes for CI test
* Fix style for CI test
* List MLCD in flash_attn doc
* Fix: typos, modulars, refactors from suggestions
* Refactoring convert_mlcd_weights_to_hf.py from suggestions
* Fix: docs conflicts
* Fix error for CI test
* Fix style for CI test
* Add integration test for MLCD
* Refactoring by class inheritance
* Fix: refactor attention interface, adjust codes
* Fix: merging conflicts
* Fix: merging conflicts
* Fix: style for CI test
* Fix: style for CI test
* Fix: set test_resize_embeddings to be False
* Fix: initializer for CI test
* Fix: conflicts, CI test, warning and refactoring
* Fix: merging conflicts
* Refactor
* Update docs
* Fix mistakes
* Remove unused args and fix multi-gpu error
* Revert position_embeddings
* Solve conflicts
* Solve conflicts
* Remove dummy
* Update _init_weights
* Update _init_weights
* Update _init_weights for CI test
* Add ImageProcessorFast to BiT processor
* propose a fast processor and add tests
* all tests pass except one
* run make
* remove useless print
* use same test as clip
* apply make
* Update src/transformers/models/bit/image_processing_bit_fast.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* Update setup.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* Update src/transformers/models/bit/image_processing_bit_fast.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* apply review comment
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* support fast image processor layoutlmv3
* make style
* add warning and update test
* make style
* Update src/transformers/models/layoutlmv3/image_processing_layoutlmv3_fast.py
* Update image_processing_auto.py
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* support flava fast image processor
* run style and quality
* update test
* update according to reviews
* make style
* update comment on BICUBIC
* make style
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* add test and fast image processor
* make style
* Update src/transformers/models/perceiver/image_processing_perceiver_fast.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* make style
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* First pass at speech granite
Add encoder / projector, rename things
* Combine into one model file with causal lm outputs for forward
* Add loss calc
* Fix config loading
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
* Split new / old loading logic
* Use transformers integration for loading peft adapters
* Add generation wrapper for selective lora enablement
* Add note for qformer encoder automodel
* Guard torch/audio imports in feature extractor
* Handle granite speech autoclasses
* Handle optional deps in package structure for granite speech
* Add granite pretrained model def for init
* Add dummy objects for torch/torchaudio
* Add tests for granite speech processor
* Minor formatting fixes and refactoring
* Add options for falling back to config in forward
* Tentative model docstrings for granite speech
* Fix config type
* Remove legacy load
* Allow non-lora variants for granite speech
* Override weight tying for llm
* Use text config instead of llm config
* Add output embeddings getter to fix weight tying
* Fix relative imports
* computing the number of audio features, based on the raw audio sequence.
* collating audio inputs, and keeping the original lengths.
* asserted we have text. otherwise we can't specify the audio special token.
* assering the number of audio-symbols/audios match correctly.
running get validated_audios only when audio is present
* indentation bugfix + supporting different feature lengths when expanding audio.
* redundant, done in _get_validated_text
* adapting the tests:
- we must have text (not either audio or text)
- _get_num_audio_features takes a list of raw lengths, provided it insetad.
* Minor cleanup, remove unused import
* Add more tests for batch feature processing
* Allow setting offset in rel position embeddings
* Add config option for warning if peft is not installed w/ lora
* Port blip2 qformer code into granite speech
* Add sad test for numpy arr processing
* Allow numpy arrays / tuples in granite speech processor
* Fix config type for projector
* - pad instead of creating a zeros tensor, to keep the original dtype/device (support bfloat16)
- cast input_features to the model dtype (support bfloat16)
* merge Blip2QFormerConfig to GraniteSpeechProjectorConfig
* prevent a crash when re-saving/loading the model (line 109)
* consider additional edge cases during preprocessing.
* consider additional edge cases during preprocessing.
* add features mask for batched inference (bugfix)
* Minor refactor, remove multiaudio processor tests
* Add set input/output embeddings for granite speech
* Fix feature dim check in processor test
* Pop input features in embed test for granite speech
* Small fixes for test edge cases
Add granite speech to seq2seq causal lm mapping names
* Add small tests for granite speech model
* Fix data parallelism test
* Standardize model class names
* Fix check for copies
* Fix misaligned init check
* Skip granite speech in checkpoint check
* Use default for tie_word_embeddings in granite speech
* Fix non documentation granite speech repo issues
* Fix comments and docstring checks
* Add placeholder docs for granite speech
* Fix test naming collision
* Code formatting
* Rerun torch dummy obj regen
* Fix save pretrained for granite speech
* Import sorting
* Fix tests typo
* Remove offset hack
* Pass args through encoder config
* Remove unused prune heads from blip2
* removing einsum. replaced with explicit multiplication (relative positional encodings) and sdpa attention.
* remove Sequential from ConformerFeedForward and ConformerConvModule. + fix for sdpa attention
* remove GraniteSpeechConformerScale
* rename to hidden_states
* rename conformer layers to self.layers, remove the first linear from the list to keep the list homogenous.
* move pre-norm to the attention/feedforward blocks (avoid complex module wrapping)
* adding pre_norm into forward
* feature extractor refactoring to resemble how it's done in phi4multimodal.
* rename feature_extractor to audio_processor
* bugfix: input_feature_mask fix to get the exact number tokens.
* Fix pytest decorator in processor test
* Add (disabled) integration tests for granite speech
* Fix handling of optional feature masking
* Loosen validation in processing for vLLM compatability
* Formatting fixes
* Update init structure to mirror llama
* Make granite speech projector generic
* Update test config to reflect generic projector
* Formatting fixes
* Fix typos, add license
* Fix undefined var in input processing
* Cleanup and expose ctc encoder
* Add missing config docstrings
* Better var names, type hints, etc
* Set attn context size in init
* Add max pos emb to encoder config
* Cleanup feature extractor
* Add granite speech architecture details
* Remove granite speech qformer ref
* Add paper link, explicit calc for qkv
* Calculate padding directly in depthwise conv1d init
* Raise value error instead of asserting
* Reorder class defs (classes used at top)
* Precompute relpos distances
* Run formatting
* Pass attention distances through forward
* Apply suggestions from code review
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
* Add todo for using common batch feature extraction
* Rename audios/features
* Ensure chat template may be provided to processor
* Move granite speech docs to audio models
* Add todos for input proc refactoring
* Fix import order
* Guard torch import
* Use relative imports
* Require torch backend for processor in granite speech
* Add backend guards in feature extractor
---------
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Co-authored-by: Avihu Dekel <avihu.dekel@ibm.com>
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>
* add classifier head to donut
* add to transformers __init__
* add to auto model
* fix typo
* add loss for image classification
* add checkpoint
* remove no needed import
* reoder import
* format
* consistency
* add test of classifier
* add doc
* try ignore
* update loss for all swin models
* fix tests and some clean up
* make one general test for each modality
* remove redundant merging of kwargs
* edge cases
* dont enforce slow when reloading
* fix gemma3 tests
* has to adapt llama 4 after rebase
* remove also from overriden tests
* should be green now
* initial draft
* make documentation simpler
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* turn pros and cons into tables
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add links to each quant method page
* separate calibration vs no calibration methods
* add calibration time estimates
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add changed
* Revert "add changed"
This reverts commit 0a0166a1fe.
* update with NEW MODEL class called GLM4
* update
* Update glm4.md
* Name
* style
* fix copies
* fixup test
---------
Co-authored-by: Yuxuan Zhang <2448370773@qq.com>
* Updated documentation for Donut model
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Updated code suggestions
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Updated code suggestion to Align with the AutoModel example
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Updated notes section included code examples
* close hfoption block and indent
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update model card for jamba
* Apply the suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Apply suggestions from code review-2
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update model page.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update as per code review.
* Update docs/source/en/model_doc/jamba.md as per code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/jamba.md as per code review
`
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update as per code review.
* fixes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Improved Model card for Gemma2
* Made changes in gemma2 as suggested
* Made more changes in the doc (adding image, notes, closing hfoptions)
* minor fixes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update Model card for gpt2
* Update link for gpt2 space
* fixes docs based on suggestions
* Add transformers-cli and quantization example for GPT-2
* Remove resources and flash attention docs and fix typos
* feat: edit falcon mamba card
* fix: edit statement on falconmamba arch
* Update docs/source/en/model_doc/falcon_mamba.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon_mamba.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon_mamba.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix: add right indent for tags
* fix: remove notas
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* model card for Mistral
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* apply suggestions
* fix typo
* updated with comments
* updated with comments
* updated with comments
* remove hfoption block
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update OpenAI GPT model card
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update OpenAI GPT model card: add usage examples and notes section
* Add API autodoc tags after Notes section for OpenAI GPT model
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Added missing badges
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Updated T5 model card with standardized format
* Updated T5 model card with standardized format, fixed typo
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Apply reviewer suggestions
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Updated model card for distilbert
* Updated the distilbert model card
* Updated model card for distilbert
* Updated the distilbert model card
* Addressed code review comments
* Addressed review comments
* fix pipeline
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update ELECTRA model card with new format
* Update ELECTRA model card with new format
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* close hfoption block
---------
Co-authored-by: Wun0 <f20191221@hyderabad.bits-pilani.ac.in>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Modify Model Card for ModernBERT.
* Update as per code review.
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update model card.
* Update model card.
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update code_llama.md
aims to handle https://github.com/huggingface/transformers/issues/36979#issuecomment-2758560598
sub part of https://github.com/huggingface/transformers/issues/36979
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* make changes as per code review
* chore: make the function smaller for attention mask visualizer
* chore[docs]: update code_llama.md with some more suggested changes
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* chore[docs] : Update code_llama.md with indentation changes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update Cohere model card to follow standard template
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update cohere.md
Update code snippet for AutoModel, quantization, and transformers-cli
* Update cohere.md
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feat: updated model card for qwen_2.5_vl
* applied suggested change 1
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* applied suggested change 2
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* applied suggested change 3
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix: made requested changes for quantization and notes
* suggeested model card change 4
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* updated model card wiht suggested change 5
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* updated model card wiht suggested change 6
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* updated model card wiht suggested change 7
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feat: applied requested changes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update qwen2.md
* Update qwen2.md
* Update qwen2.md
* Update qwen2.md
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update qwen2.md
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feat: updated model card for falcon
* fix:rewrite model description
* fix: add link to conversion script
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix: Add suggested changes
* fix: typo in link for quantization
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix: fix indent and close ticks
* fix: add indent
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update clip.md
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Incorporated suggested changes
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Initial commit for Qwen3
* fix and add tests for qwen3 & qwen3_moe
* rename models for tests.
* fix
* fix
* fix and add docs.
* fix model name in docs.
* simplify modular and fix configuration issues
* Fix the red CI: ruff was updated
* revert ruff, version was wrong
* fix qwen3moe.
* fix
* make sure MOE can load
* fix copies
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
* init commit
* style
* take comments into account
* add deepseekv3 modeling
* remove redundant code
* apply make style
* apply fix-copies
* make format
* add init files
* rename deepseekv3 into deepseek_v3 based on its model_type
* rename deepseekv3 into deepseek_v3 based on its model_type
* deepseek-v3 not deepseek_v3
* set model_type as deepseek_v3
* use default docs
* apply make
* fill type and docstring
* add rope_config_validation
* use custom DeepseekV3MLP
* hold code only for checkpoints congifuration; remove redundant
* revise rope yarn for DeepSeek variation
* rename DeepSeek-V3
* some refactoring
* revise load_hook to work properly; make moe func trainable; use llama instead of mixtral
* fix attention forward
* use -1 for not-changing dim when to use exapnd
* refactor DeepseekV3TopkRouter
* use reshape_for_rope instead of load_hook; revise attention forward for TP; rename q_head_dim with qk_head_dim
* register pre_hook and hook both
* make style
* use n_shared_experts
* Update src/transformers/models/deepseek_v3/configuration_deepseek_v3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add test file
* update modeling_file according to modular file
* make style
* add mapping for DeepseekV3ForSequenceClassification
* remove aux_loss_alpha
* add deepseek_v3 for perf
* add deepseek_v3
* rename test as deepseekv3
* use tiny-deepseek-v3
* remove DeepseekV3ForSequenceClassification
* cache before padding
* remote output_router_logits
* Revert "remote output_router_logits"
This reverts commit f264f800d0.
* remove output_router_logits
* make e_score_correction_bias as buffer
* skip tests not compatible
* make style
* make e_score_correction_bias as buffer
* use rope_interleave instead of load_hook
* skip tests not compatible with MLA
* add doc for rope_interleave
* fix typo
* remove torch.no_grad for selecting topk
* fix post merge issue
* mrege with main and simplify
* nits
* final
* small fixes
* fix
* support TP better
* stash
* changes currently requires
* remove synch
* more fixes for TP
* temp fix for TP : some attention layers's FP8 scales are too small + shared is local colwise and anything is local if FP8 because weights are used
* updates to have generation work!
* push most of the changes
* reorder functions + call for contributions!
* update readme
* nits
* update
* ruff was updated on main
* merge with main and fix copies
* revert unrelated changes
* route all tokens to all experts when testing to avoid no gradient iddues
* finish fixing all tests
* fixup
* nit
* clean config
* last readme changes
* nit
* do cnit
* typo
* last nit
* one more one more
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-165-131.ec2.internal>
* no image
* test
* revert jax version updates
* make fixup
* update autodoc path for model_addition_debugger
* shieldgemma2
* add missing pages to toctree
* draft of model tracer visualiser
* add context manager in addition to decorator
* add debug utils to init
* move model debugging utils to dedicated file
* add documentation
* protect some imports
* format
* move and protect imports
* format
* doc: improve errors in case of broken dummy imports.
* format
* use automatic torch backend
* update doc
* fix backend
* (TEMP) move to dummies while backend wait
* update documentation
* doc
* add prompt depth anything model by modular transformer
* add prompt depth anything docs and imports
* update code style according transformers doc
* update code style: import order issue is fixed by custom_init_isort
* fix depth shape from B,1,H,W to B,H,W which is as the same as Depth Anything
* move prompt depth anything to vision models in _toctree.yml
* update backbone test; there is no need for resnet18 backbone test
* update init file & pass RUN_SLOW tests
* update len(prompt_depth) to prompt_depth.shape[0]
Co-authored-by: Joshua Lochner <admin@xenova.com>
* fix torch_int/model_doc
* fix typo
* update PromptDepthAnythingImageProcessor
* fix typo
* fix typo for prompt depth anything doc
* update promptda overview image link of huggingface repo
* fix some typos in promptda doc
* Update image processing to include pad_image, prompt depth position, and related explanations for better clarity and functionality.
* add copy disclaimer for prompt depth anything image processing
* fix some format typos in image processing and conversion scripts
* fix nn.ReLU(False) to nn.ReLU()
* rename residual layer as it's a sequential layer
* move size compute to a separate line/variable for easier debug in modular prompt depth anything
* fix modular format for prompt depth anything
* update modular prompt depth anything
* fix scale to meter and some internal funcs warp
* fix code style in image_processing_prompt_depth_anything.py
* fix issues in image_processing_prompt_depth_anything.py
* fix issues in image_processing_prompt_depth_anything.py
* fix issues in prompt depth anything
* update converting script similar to mllamma
* update testing for modeling prompt depth anything
* update testing for image_processing_prompt_depth_anything
* fix assertion in image_processing_prompt_depth_anything
* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update docs/source/en/model_doc/prompt_depth_anything.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update docs/source/en/model_doc/prompt_depth_anything.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* update some testing
* fix testing
* fix
* add return doc for forward of prompt depth anything
* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix prompt depth order
* fix format for testing prompt depth anything
* fix minor issues in prompt depth anything doc
* fix format for modular prompt depth anything
* revert format for modular prompt depth anything
* revert format for modular prompt depth anything
* update format for modular prompt depth anything
* fix parallel testing errors
* fix doc for prompt depth anything
* Add header
* Fix imports
* Licence header
---------
Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Disable inductor config setter by default
This is hard to debug and should be off by default
* remove default settings in autoquant too
* Add info to torchao.md about recommended settings
* satisfying Ruff format
Summary:
Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Just import torch AdamW instead
* Update docs too
* Make AdamW undocumented
* make fixup
* Add a basic wrapper class
* Add it back to the docs
* Just remove AdamW entirely
* Remove some AdamW references
* Drop AdamW from the public init
* make fix-copies
* Cleanup some references
* make fixup
* Delete lots of transformers.AdamW references
* Remove extra references to adamw_hf
* add support for fast image processors in add-new-model-like
* fix header not found add-fast-image-processor-cli
* Encourage adding fast image processor
* nit
* start improve doc
* update docs
* make requested modifs
* Fix converter
* [Broken] Adds Gemma 3 to Hugging Face Transformers
* Consolidating Config and Processor params across impls
* Sorting out configuration parameters. Adds qk_norm before RoPE. Still not sure if RoPE is right.
* Additional plumbing for CausalLM and ConditionalGeneration variants
* incomplete draft of Orbax conversion script
* More complete checkpoint conversion
* Supporting Gemma 3 1B checkpoints
* Updating RoPE for multiple frequencies
* Adjustments to rotary embedder
* Proof of life for text-only operation
* Updating the conversion script to handle multimodal projection weights
* Fixing tet-only conversions
* Cleaner conversion script with multimodal support and a simpler processor
* Additional refatcors to the Gemma3Processor
* Simplified Processor to work over text representations
* Updated conversion script to join text and vision embeddings at converion time
* Logging for debugging
* Update src/transformers/models/gemma2/modeling_gemma2.py
Co-authored-by: Joshua Lochner <admin@xenova.com>
* Removed extraneous Config params
* Switching to fast tokenizer for checkpoint conversions
* isolating siglip for performance tetsing
* Minor changes for debugging tests against baselines
* Adding average pooling for soft tokens
* Updating processor code to enable simpler embedding interleaving for arbitrary number of images in prompts
* Updating conversion script for ShieldGemma 2 conversion compatibility
* Allow disable_compile to be provided as a kwarg
* Refresh from modular
* Updated conversion script and corrected sliding window
* Fix type mismatch in cache_position (#4)
* Fix dtype (#5)
* Fix type mismatch in cache_position
* Actually fix in the modular file
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
---------
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
* fixes for embedding table overflow and missing image_soft_token_mask from Gemma3Processor
* Adding 2D pooling for image embeddings
* Revert "Adding 2D pooling for image embeddings"
This reverts commit 65350cf531.
* Gemma3 average pooling changed from 1D to 2D
* Major refactor to Gemma3MultimodalInputProjection
* Updating Gemm 3 Auto* registrations
* Add option to save Gemma 3 chat template with tokenizer during weights conversion
* Removing unused imports
* Moving out-of-vocab handling from Gemma3Processor to Gemma3ForConditionalGeneration
* Removing duplicate config property
* Removing final logit softcapping and 1-indexing of position ids
* Fixing image processor config and none --> None typo
* Fixing sliding window size for 1B
* Updating image_mean and image_std in Image Processor
* Attention masking changed to lower triangular
* Moving image special tokens to conversion script
* Mirror image processor defaults from conversion script into Gemma3ProcessorKwargs
* Remove special token variables from symbol space
* Moving image soft token mask computation from Gemma3Processor to Gemma3ForConditionalGeneration
* tie lm_head and embedding weights
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
* Correct tied weights in Gemma3CausalLM
* iterative bidirectional attention
* resolving merge conflicts
* Reverting to Gemma 2 HybridCache with sldiing window support and a sliding_window_pattern of 6
* Correcting RoPE scaling
* clean up first pass, dummy model geenration works
* final clean up before fixing tests
* causal lm test works, so fine
* Fix conversion
* Update src/transformers/models/gemma3/processing_gemma3.py
* model tests are happy
* processor tests are happy
* image processing tests added
* fixup
* Fix pre-processing in conversion
* Inputs merging
* Do not normalize vision embeddings
* Apply Ryan's (and team) changes to attention
* token type ids + mask
* template
* move embed scale, add rope scale, fix tests
* Add chat template to tokenizer
* Use prefix for causal model loading
* use existing code for sliding mask from gemma2
* self.embed_tokens already normalizes
* Correcting Gemma3TextConfig parameters in conversion script
* typo, modular overwrites my fixes
* enable device map for text model
* Conversion updates
* ultra nit: no einsums
* update image token
* copy deepcopy config + some docs
* add some test, still WIP
* Refactoring --include_chat_tempalte logic in converter
* Update src/transformers/models/gemma3/modular_gemma3.py
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
* Add eos tokens for instruct models
* dump so i can work on dgx
* Removing add_bos by default
* dump
* add fast im proc
* docs for PaS + fixup
* another fixup
* one more fixup
* fix tests
* Inverting prior BOS change
* ultra nit
* Reverting to Tokenizer saved with add_bos_token=True and chat template starting with BOS
* resize embeds, remove sqrt, add slow test outputs
* FA2 but quality is meh
* nit
* skip FA2, no idea what happened
* last bit for green CI
* please, green CI for docs
* T_T
* Fix for Gemma3 logits
* Support both options for system prompt
* Update src/transformers/models/gemma3/image_processing_gemma3_fast.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/en/model_doc/gemma3.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Docs updates now that assets are live
* Style fixes
---------
Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: Lysandre <hi@lysand.re>
* update
* doc
* update
* Update docs/source/en/gguf.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add swanlab integration
* feat(integrate): add SwanLab as an optional experiment tracking tool in transformers
- Integrated SwanLab into the transformers library as an alternative for experiment tracking.
- Users can now log training metrics, hyperparameters, and other experiment details to SwanLab by setting `report_to="swanlab"` in the `TrainingArguments`.
- Added necessary dependencies and documentation for SwanLab integration.
* Fix the spelling error of SwanLabCallback in callback.md
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Fix typo in comment
* Fix typo in comment
* Fix typos and update comments
* fix annotation
* chore: opt some comments
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: AAssets <20010618@qq.com>
Co-authored-by: ZeYi Lin <944270057@qq.com>
Co-authored-by: KAAANG <79990647+SAKURA-CAT@users.noreply.github.com>
* initial commit
* small fix
* move stuff to image processing file
* remove stuff in validate turn and fix return tensor
* remove liquid stuff
* in the process of addressing comments
* changes to get the right tokenization
* new __init__ works
* fixing defulat std and mean
* works
* small testing scipt -- to be deleted before merge
* remove redundant code
* addressing comments
* fix inits, add docs templates
* refactor processor, switch to gotocr image processor
* remove image proc from init
* refactor to working llava-style architecture
* Change AyaVisionModel to AyaVisionForConditionalGeneration
* add tests
* fixups
* update doc
* Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model
* better variable names + remove code paths
* Updates to aya_vision.md
* address comments
* adding copied from
* make style and remove unused projector_hidden_act from config
* sort init
* include usage of fast image proc and proc on cuda in doc
* update checkpoint iin test processor
* update checkpoint in test processor 2
* remove test_model and update docstring
* skip failing tests
---------
Co-authored-by: Saurabh Dash <saurabh@cohere.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
* refactor image processor slow got ocr
* add working image processor fast
* fix fast image processor, update doc
* use one big loop for processing patches
* decompose chat template docs
* add docs
* update model docs
* qwen2-5
* pixtral
* remove old chat template
* also video as list frames supported
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* remove audio for now
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add implementation for DataCollatorForMultipleChoice based on docs.
* Add DataCollatorForMultipleChoice to import structure.
* Remove custom DataCollatorForMultipleChoice implementations from example scripts.
* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.
* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.
* Apply suggested changes and run make fixup.
* fix copies, style and fixup
* add missing documentation
* nits
* fix docstring
* style
* nits
* isort
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
* add RAdamScheduleFree optimizer
* revert schedulefree version to the minimum requirement
* refine is_schedulefree_available so that it can take min_version
* refine documents
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update awq doc
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add note for inference
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* implement config and model building blocks
* refactor model architechture
* update model outputs
* update init param to include use_fov_model
* update param name in config
* fix hidden_states and attentions outputs for fov
* sort config
* complete minor todos
* update patching
* update config for encoder
* fix config
* use correct defaults in config
* update merge for compatibility with different image size
* restructure encoder for custom configuration
* make fov model compatible with custom config
* replace word "decoder" with "fusion"
* weight conversion script
* fix fov squeeze
* update conversion script (without test)
* upload ruff image processing
* create fast image processing
* use torch interpolation for image processing
* complete post_process_depth_estimation
* config: fix imports and sort args
* apply inference in weight conversion
* use mllama script instead for weight conversion
* clean weight conversion script
* add depth-pro status in other files
* fill docstring in config
* formatting
* more formatting
* formatting with ruff
* formatting with style
* fix copied classes
* add examples; update weight convert script
* fix using check_table.py and isort
* fix config docstring
* add depth pro to sdpa docs
* undo unintentional changes in configuration_gemma.py
* minor fixes
* test image processing
* fixes and tests
* more fixes
* use output states from image_encoder instead
* Revert "use output states from image_encoder instead"
This reverts commit 2408ec54e4.
* make embeddings dynamic
* reshape output hidden states and attentions as part of computation graph
* fix ruff formating
* fix docstring failure
* use num_fov_head_layers in tests
* update doc
* check consistency with config
* ruff formatting
* update test case
* fix ruff formatting
* add tests for fov
* use interpolation in postprocess
* run and fix slow tests locally
* use scaled_images_features for image and fov encoder
* return fused_hidden_states in fusion stage
* fix example
* fix ruff
* fix copyright license for all files
* add __all__ for each file
* minor fixes
- fix download spell
- add push_to_hub option
- fix Optional type hinting
- apply single loop for DepthProImageProcessor.preprocess
* return list in post_process_depth_estimation
* minor fixes
- capitalize start of docstring
- use ignore copy
- fix examples
- move docstring templates and custom output classes to top
- remove "-> None" typehinting from __init__
- type hinting for forward passes
- fix docstrings for custom output classes
* fix "ruff check"
* update upsample and projection
* major changes: (image size and merge optimization)
- add support for images of any size
- optimize merge operation
- remove image_size from config
- use full names instead of B, C, H, W
- remove interpolation from fusion stage
- add interpolation after merge
- move validations to config
- update integration test
- add type hints for functions
* fix push_to_hub option in weights conversion
* remove image_size in weights conversion
* major changes in the architecture
- remove all DepthProViT modules and support different backbones using the AutoModel API
- set default use_fov_model to False
- validate parameters in configuration
- update interpolate function: use "nearest" for faster computation
- update reshape_feature function: remove all special tokens, possible from different backbones
- update merge function: use padding from config instead of merge_out_size
- remove patch_to_batch and batch_to_patch conversions for now
- calculate out_size dynamically in the encoder
- leave head_mask calculation to the backbone
- fix bugs with merge
- add more comments
- update tests
* placeholder for unused config attributes
* improve docs amid review
* minor change in docs
* further optimize merge
* fix formatting
* remove unused patch/batch convertion functions
* use original F.interpolate
* improve function naming
* minor chages
- use torch_int instead of int
- use proper for newly initialized tensors
- use user provided return_dict for patch_encoder
- use if-else block instead in self.use_fov_model
* rearchitect upsample block for improved modularity
* update upsample keys in weight conversion
* improve padding in merge_patches
* use double-loop for merge
* update comments
* create feature_extractor, reduce some forward code
* introduce config.use_mask_token in dinov2
* minor fixes
* minor fixes for onnx
* update __init__ to latest format
* remove DepthProConfig.to_dict()
* major changes in backbone
* update config in weight conversion
* formatting
* converted model is fp32
* improve naming and docs for feature_extractor->reconstruct_feature_maps
* minor fixes; amid review
* create intermediate vars in func call
* use torch.testing.assert_close
* use ModuleList instead of Sequential and ModuleDict
* update docs
* include fov in integraiton tests
* update docs
* improve initialization of convolution layers
* fix unused fov keys
* update tests
* ruff format
* fix test, amid kaimming initialization
* add depthpro to toctree
* add residual layer to _no_split_modules
* architecture rework
* Update src/transformers/models/depth_pro/image_processing_depth_pro.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/depth_pro/image_processing_depth_pro_fast.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* update docs
* improve merge_patches
* use flatten with fov_output
* ruff formatting
* update resources section in docs
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix typo "final_kernal_size"
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix output typehint for DepthProDepthEstimator
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* residual operation in 2 steps
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* use image_size instead of global patch_size in interpolation
* replace all Sequential with ModuleList
* update fov
* update heads
* fix and update conversion script for heads
* ruff formatting
* remove float32 conversion
* use "Fov" instead of "FOV" in class names
* use "Fov" instead of "FOV" in config docs
* remove prune_heads
* update fusion stage
* use device in examples
* update processor
* ruff fixes
* add do_rescale in image_processor_dict
* skip test: test_fast_is_faster_than_slow
* ruff formatting
* DepthProImageProcessorFast in other files
* revert antialias removal
* add antialias in BaseImageProcessorFast
* Revert "revert antialias removal"
This reverts commit 5caa0bd8f9.
* Revert "add antialias in BaseImageProcessorFast"
This reverts commit 3ae1134780.
* update processor for grouping and antialias
* try test_fast_is_faster_than_slow without "skip" or "flanky"
* update checkpoint
* update checkpoint
* use @is_flanky for processor test
* update checkpoint to "apple/DepthPro-hf"
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* change cuda to DEVICE
* Update docs/source/en/llm_tutorial.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add init and base image processing functions
* add add_fast_image_processor to transformers-cli
* add working fast image processor clip
* add fast image processor to doc, working tests
* remove "to be implemented" SigLip
* fix unprotected import
* fix unprotected vision import
* update ViTImageProcessorFast
* increase threshold slow fast ewuivalence
* add fast img blip
* add fast class in tests with cli
* improve cli
* add fast image processor convnext
* add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision
* add device kwarg to ImagesKwargs for fast processing on cuda
* cleanup
* fix unprotected import
* group images by sizes and add batch processing
* Add batch equivalence tests, skip when center_crop is used
* cleanup
* update init and cli
* fix-copies
* refactor convnext, cleanup base
* fix
* remove patching mixins, add piped torchvision transforms for ViT
* fix unbatched processing
* fix f strings
* protect imports
* change llava onevision to class transforms (test)
* fix convnext
* improve formatting (following Pavel review)
* fix handling device arg
* improve cli
* fix
* fix inits
* Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs
* uniformize qwen2_vl fast
* fix docstrings
* add add fast image processor llava
* remove min_pixels max_pixels from accepted size
* nit
* nit
* refactor fast image processors docstrings
* cleanup and remove fast class transforms
* update add fast image processor transformers cli
* cleanup docstring
* uniformize pixtral fast and make _process_image explicit
* fix prepare image structure llava next/onevision
* Use typed kwargs instead of explicit args
* nit fix import Unpack
* clearly separate pops and gets in base preprocess. Use explicit typed kwargs
* make qwen2_vl preprocess arguments hashable
* initial commit
* encoder+decoder layer changes WIP
* architecture checks
* working version of detection + segmentation
* fix modeling outputs
* fix return dict + output att/hs
* found the position embedding masking bug
* pre-training version
* added iamge processors
* typo in init.py
* iterupdate set to false
* fixed num_labels in class_output linear layer bias init
* multihead attention shape fixes
* test improvements
* test update
* dab-detr model_doc update
* dab-detr model_doc update2
* test fix:test_retain_grad_hidden_states_attentions
* config file clean and renaming variables
* config file clean and renaming variables fix
* updated convert_to_hf file
* small fixes
* style and qulity checks
* return_dict fix
* Merge branch main into add_dab_detr
* small comment fix
* skip test_inputs_embeds test
* image processor updates + image processor test updates
* check copies test fix update
* updates for check_copies.py test
* updates for check_copies.py test2
* tied weights fix
* fixed image processing tests and fixed shared weights issues
* added numpy nd array option to get_Expected_values method in test_image_processing_dab_detr.py
* delete prints from test file
* SafeTensor modification to solve HF Trainer issue
* removing the safetensor modifications
* make fix copies and hf uplaod has been added.
* fixed index.md
* fixed repo consistency
* styel fix and dabdetrimageprocessor docstring update
* requested modifications after the first review
* Update src/transformers/models/dab_detr/image_processing_dab_detr.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* repo consistency has been fixed
* update copied NestedTensor function after main merge
* Update src/transformers/models/dab_detr/modeling_dab_detr.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* temp commit
* temp commit2
* temp commit 3
* unit tests are fixed
* fixed repo consistency
* updated expected_boxes varible values based on related notebook results in DABDETRIntegrationTests file.
* temporarialy config modifications and repo consistency fixes
* Put dilation parameter back to config
* pattern embeddings have been added to the rename_keys method
* add dilation comment to config + add as an exception in check_config_attributes SPECIAL CASES
* delete FeatureExtractor part from docs.md
* requested modifications in modeling_dab_detr.py
* [run_slow] dab_detr
* deleted last segmentation code part, updated conversion script and changed the hf path in test files
* temp commit of requested modifications
* temp commit of requested modifications 2
* updated config file, resolved codepaths and refactored conversion script
* updated decodelayer block types and refactored conversion script
* style and quality update
* small modifications based on the request
* attentions are refactored
* removed loss functions from modeling file, added loss function to lossutils, tried to move the MLP layer generation to config but it failed
* deleted imageprocessor
* fixed conversion script + quality and style
* fixed config_att
* [run_slow] dab_detr
* changing model path in conversion file and in test file
* fix Decoder variable naming
* testing the old loss function
* switched back to the new loss function and testing with the odl attention functions
* switched back to the new last good result modeling file
* moved back to the version when I asked the review
* missing new line at the end of the file
* old version test
* turn back to newest mdoel versino but change image processor
* style fix
* style fix after merge main
* [run_slow] dab_detr
* [run_slow] dab_detr
* added device and type for head bias data part
* [run_slow] dab_detr
* fixed model head bias data fill
* changed test_inference_object_detection_head assertTrues to torch test assert_close
* fixes part 1
* quality update
* self.bbox_embed in decoder has been restored
* changed Assert true torch closeall methods to torch testing assertclose
* modelcard markdown file has been updated
* deleted intemediate list from decoder module
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* layernorm_decay_fix
* W293 fix
* ruff format fix
* black format
* ruff format
* erase last layer
* add test_get_parameter_names_rmsnorm
* rmsnorm fix