* update env command to log deepspeed version
* suppress deepspeed import logging
* Add reminder to include configs in the repro description in bug report.
* make fixup
* [WIP] update import utils for deepspeed
* Change to using is_deepspeed_available() from integrations.
* make fixup
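A minimal sketch of the check described above, assuming the helper guards both the import and the version report (its exact use in the env command is an assumption, not the actual patch):

```python
# Hedged sketch: gate on the integrations helper instead of importing
# deepspeed unconditionally, then log the version only when it is present.
from transformers.integrations import is_deepspeed_available

if is_deepspeed_available():
    import deepspeed

    print("DeepSpeed version:", deepspeed.__version__)
```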
* change order of unmasking of tokens
* library import
* class setup
* test function
* refactor
* add commit message
* test modified
* explicit initialisation of weights + made model smaller
* removed separate testing file
* fixup
* fixup core
* test attention mask with token types
* tests fixup
* removed PaliGemmaAttentionMaskTest class
---------
Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
* Adding option to save/reload scaler
* Removing duplicate variable
* Adding save/reload test
* Small fixes on deterministic algorithm call
* Moving LLM test to another file to isolate its environment
* Moving back to old file and using subprocess to run test isolated
* Reverting accidental change
* Reverting accidental change
* multi-gpu: fix inputs_embeds + position_embeds
Fixing the following error in a few models:
```
> hidden_states = inputs_embeds + pos_embeds
E RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:2 and xpu:3!
```
Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
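An illustrative sketch of the kind of device alignment such a fix applies (tensor names are taken from the error above; shapes and devices are placeholders, not the actual patch):

```python
import torch

# Move the position embeddings onto the input embeddings' device before the
# addition, so multi-device placements (e.g. xpu:2 vs xpu:3) no longer raise.
inputs_embeds = torch.randn(1, 8, 64)
pos_embeds = torch.randn(1, 8, 64)  # may live on a different device in multi-gpu runs
pos_embeds = pos_embeds.to(inputs_embeds.device)
hidden_states = inputs_embeds + pos_embeds
```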
* multi-gpu: fix tensor device placements for various models
Fixes: #35762
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* Apply make fix-copies
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
---------
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* feat: added warning to Trainer when label_names is not specified for PeftModel
* Update trainer.py
* feat: peft detect with `_is_peft_model`
* Update src/transformers/trainer.py
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
* Applied formatting in trainer.py
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
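A hedged usage note for the warning above: when training a PEFT-wrapped model, passing `label_names` explicitly avoids the Trainer having to infer them from the wrapped model's signature (the value shown is an assumption for a typical causal-LM setup):

```python
from transformers import TrainingArguments

# label_names set explicitly so the new PeftModel warning does not trigger
args = TrainingArguments(output_dir="out", label_names=["labels"])
```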
* add RAdamScheduleFree optimizer
* revert schedulefree version to the minimum requirement
* refine is_schedulefree_available so that it can take min_version
* refine documents
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
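A minimal sketch of using the new optimizer directly from the `schedulefree` package (the class name is taken from the commit title, and a recent schedulefree release is assumed, matching the `min_version` check above):

```python
import torch
from schedulefree import RAdamScheduleFree

model = torch.nn.Linear(16, 4)
optimizer = RAdamScheduleFree(model.parameters(), lr=1e-3)

optimizer.train()  # schedule-free optimizers must be switched to train mode
loss = model(torch.randn(8, 16)).sum()
loss.backward()
optimizer.step()
optimizer.eval()   # and back to eval mode before evaluation or saving
```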
* Add `base_model_pp_plan` to `PretrainedConfig`
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add `_pp_plan` to `PreTrainedModel`
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add both to Llama for testing
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Fix type error
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Update to suggested schema
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* `_pp_plan` keys are not patterns
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Simplify schema
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Fix typing error
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Update input name for Llama
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Aria
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Bamba
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Cohere 1 & 2
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to diffllama and emu3
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Gemma 1 & 2
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to GLM and GPT NeoX
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Granite and Helium
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Mistral and Mixtral
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to OLMo 1 & 2
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan to Phi and Phi 3
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan for Qwen 2, 2 MoE, 2 VL and 2.5 VL
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add pp plan for Starcoder 2
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Add enum for accessing inputs and outputs
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Update type hints to use tuples
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
* Change outer list to tuple
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
---------
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
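An illustrative sketch of the plan schema these commits describe, assuming a Llama-style decoder (module and tensor names are assumptions, not the authoritative definition): each key names a submodule of the base model and maps to a tuple of (input tensor names, output tensor names) used to split the model into pipeline-parallel stages.

```python
# Hypothetical base_model_pp_plan entry on a config, for illustration only
base_model_pp_plan = {
    "embed_tokens": (["input_ids"], ["inputs_embeds"]),
    "layers": (["hidden_states", "attention_mask"], ["hidden_states"]),
    "norm": (["hidden_states"], ["hidden_states"]),
}
```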
* update awq doc
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/awq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add note for inference
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
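A hedged example for the inference note above (the checkpoint name is illustrative and not part of this PR): AWQ checkpoints are already quantized, so they are loaded directly for inference rather than quantized at load time.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Mistral-7B-Instruct-v0.2-AWQ"  # example AWQ checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```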
* make output_dir optional
* initiated a basic testing module to validate and verify the changes
* Test output_dir defaults to 'tmp_trainer' when unspecified.
* test existing functionality of output_dir.
* test that output dir only created when needed
* final check
* added doc string and changed the tmp_trainer to trainer_output
* make style fixes to test file.
* another round of fixup
---------
Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>
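A minimal sketch of the behaviour these commits describe, assuming the default directory name follows the rename above (tmp_trainer -> trainer_output) and is only created when something is actually written:

```python
from transformers import TrainingArguments

# output_dir no longer required; a default directory is used when needed
args = TrainingArguments()
```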
* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning
* Remove unused `max_size` variable in processor which was always `None` and triggered unnecessary deprecated warning
* Remove deprecated warnings and eliminate `max_size` usage
* Test use `int` as argument for `size`
Add a test to ensure it passes successfully and preserves backward compatibility
* The test pipelines still use `max_size`
Remove `max_size` from test pipelines and replace it with a `size` `Dict` using `'shortest_edge'` and `'longest_edge'` as keys
* Reformatting
* Reformatting
* Revert "Reformatting"
This reverts commit c3040acee7.
* Revert "Reformatting"
This reverts commit ac4522e5c9.
* Revert "The test pipelines still use `max_size`"
This reverts commit eaed96f041.
* Revert "Test use `int` as argument for `size`"
This reverts commit 1925ee38c7.
* Revert "Remove deprecated warnings and eliminate `max_size` usage"
This reverts commit d8e7e6ff90.
* Change version `4.26` to "a future version"
* Reformatting
* Revert "Change version `4.26` to "a future version""
This reverts commit 2b53f9e4
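A hedged example of the `size` dict form that replaces the deprecated `max_size` argument (checkpoint and values are illustrative):

```python
import numpy as np
from PIL import Image
from transformers import AutoImageProcessor

image = Image.fromarray(np.zeros((480, 640, 3), dtype=np.uint8))
processor = AutoImageProcessor.from_pretrained("facebook/detr-resnet-50")
inputs = processor(
    images=image,
    size={"shortest_edge": 800, "longest_edge": 1333},  # instead of max_size
    return_tensors="pt",
)
```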
* Add is_torch_greater_or_equal test decorator
* Add common test for torch.export
* Fix bit
* Fix focalnet
* Fix imagegpt
* Fix seggpt
* Fix swin2sr
* Enable torch.export test for vision models
* Enable test for video models
* Remove json
* Enable for hiera
* Enable for ijepa
* Fix detr
* Fix conditional_detr
* Fix maskformer
* Enable test maskformer
* Fix test for deformable detr
* Fix custom kernels for export in rt-detr and deformable-detr
* Enable test for all DPT
* Remove custom test for deformable detr
* Simplify test to use only kwargs for export
* Add comment
* Move compile_compatible_method_lru_cache to utils
* Fix beit export
* Fix deformable detr
* Fix copies data2vec<->beit
* Fix typos, update test to work with dict
* Add seed to the test
* Enable test for vit_mae
* Fix beit tests
* [run-slow] beit, bit, conditional_detr, data2vec, deformable_detr, detr, focalnet, imagegpt, maskformer, rt_detr, seggpt, swin2sr
* Add vitpose test
* Add textnet test
* Add dinov2 with registers
* Update tests/test_modeling_common.py
* Switch to torch.testing.assert_close
* Fix maskformer
* Remove save-load from test
* Add dab_detr
* Add depth_pro
* Fix and test RT-DETRv2
* Fix dab_detr
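A hedged sketch of the kwargs-only torch.export path these tests exercise (the checkpoint is illustrative, and a recent PyTorch with `torch.export` is assumed):

```python
import torch
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained("microsoft/focalnet-tiny").eval()
pixel_values = torch.randn(1, 3, 224, 224)

# export with kwargs only, then check the exported graph matches eager outputs
exported = torch.export.export(model, args=(), kwargs={"pixel_values": pixel_values})
torch.testing.assert_close(
    exported.module()(pixel_values=pixel_values).logits,
    model(pixel_values=pixel_values).logits,
)
```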
* Revert "Fix OS err (#36094)"
This reverts commit ba29a439ad.
* Revert "Save checkpoint to temporary directory to handle partial saves during failures (#35580)"
This reverts commit 20d17358c4.
* Add support for constant learning rate with cooldown
* Add support for constant learning rate with cooldown
* Add support for constant learning rate with cooldown
* Add support for constant learning rate with cooldown
* Add support for constant learning rate with cooldown
* Add support for constant learning rate with cooldown
* Add support for constant learning rate with cooldown
* Add more warmup and cooldown methods to 'get_wsc_schedule'
* Add more warmup and cooldown methods to 'get_wsc_schedule'
* Add more warmup and cooldown methods to 'get_wsc_schedule'
* Add more warmup and cooldown methods to 'get_wsc_schedule'
* Add more warmup and decay methods to 'get_wsd_schedule'
* support num_training_steps and num_stable_steps for get_wsd_schedule
* support num_training_steps and num_stable_steps for get_wsd_schedule
* get wsd scheduler before the `num_training_steps` decision
* fix code_quality
* Update stable branch logic
* fix code_quality
* Move stable stage decide to `get_wsd_schedule`
* Update docstring of `get_wsd_schedule`
* Update `num_train_steps` to optional
* Update `num_train_steps` to optional
* Update docstring of `get_wsd_schedule`
* Update src/transformers/optimization.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
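A minimal sketch of calling the warmup-stable-decay schedule discussed above; the keyword names are taken from the commit titles, and the exact signature and defaults in `transformers.optimization` may differ:

```python
import torch
from transformers.optimization import get_wsd_schedule

model = torch.nn.Linear(16, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
scheduler = get_wsd_schedule(
    optimizer,
    num_warmup_steps=100,
    num_stable_steps=800,  # now optional; can be derived from num_training_steps
    num_decay_steps=100,
)
```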
* implement config and model building blocks
* refactor model architecture
* update model outputs
* update init param to include use_fov_model
* update param name in config
* fix hidden_states and attentions outputs for fov
* sort config
* complete minor todos
* update patching
* update config for encoder
* fix config
* use correct defaults in config
* update merge for compatibility with different image size
* restructure encoder for custom configuration
* make fov model compatible with custom config
* replace word "decoder" with "fusion"
* weight conversion script
* fix fov squeeze
* update conversion script (without test)
* upload ruff image processing
* create fast image processing
* use torch interpolation for image processing
* complete post_process_depth_estimation
* config: fix imports and sort args
* apply inference in weight conversion
* use mllama script instead for weight conversion
* clean weight conversion script
* add depth-pro status in other files
* fill docstring in config
* formatting
* more formatting
* formatting with ruff
* formatting with style
* fix copied classes
* add examples; update weight convert script
* fix using check_table.py and isort
* fix config docstring
* add depth pro to sdpa docs
* undo unintentional changes in configuration_gemma.py
* minor fixes
* test image processing
* fixes and tests
* more fixes
* use output states from image_encoder instead
* Revert "use output states from image_encoder instead"
This reverts commit 2408ec54e4.
* make embeddings dynamic
* reshape output hidden states and attentions as part of computation graph
* fix ruff formatting
* fix docstring failure
* use num_fov_head_layers in tests
* update doc
* check consistency with config
* ruff formatting
* update test case
* fix ruff formatting
* add tests for fov
* use interpolation in postprocess
* run and fix slow tests locally
* use scaled_images_features for image and fov encoder
* return fused_hidden_states in fusion stage
* fix example
* fix ruff
* fix copyright license for all files
* add __all__ for each file
* minor fixes
- fix download spelling
- add push_to_hub option
- fix Optional type hinting
- apply single loop for DepthProImageProcessor.preprocess
* return list in post_process_depth_estimation
* minor fixes
- capitalize start of docstring
- use ignore copy
- fix examples
- move docstring templates and custom output classes to top
- remove "-> None" typehinting from __init__
- type hinting for forward passes
- fix docstrings for custom output classes
* fix "ruff check"
* update upsample and projection
* major changes: (image size and merge optimization)
- add support for images of any size
- optimize merge operation
- remove image_size from config
- use full names instead of B, C, H, W
- remove interpolation from fusion stage
- add interpolation after merge
- move validations to config
- update integration test
- add type hints for functions
* fix push_to_hub option in weights conversion
* remove image_size in weights conversion
* major changes in the architecture
- remove all DepthProViT modules and support different backbones using the AutoModel API
- set default use_fov_model to False
- validate parameters in configuration
- update interpolate function: use "nearest" for faster computation
- update reshape_feature function: remove all special tokens, which may come from different backbones
- update merge function: use padding from config instead of merge_out_size
- remove patch_to_batch and batch_to_patch conversions for now
- calculate out_size dynamically in the encoder
- leave head_mask calculation to the backbone
- fix bugs with merge
- add more comments
- update tests
* placeholder for unused config attributes
* improve docs amid review
* minor change in docs
* further optimize merge
* fix formatting
* remove unused patch/batch conversion functions
* use original F.interpolate
* improve function naming
* minor changes
- use torch_int instead of int
- use proper for newly initialized tensors
- use user provided return_dict for patch_encoder
- use if-else block instead in self.use_fov_model
* rearchitect upsample block for improved modularity
* update upsample keys in weight conversion
* improve padding in merge_patches
* use double-loop for merge
* update comments
* create feature_extractor, reduce some forward code
* introduce config.use_mask_token in dinov2
* minor fixes
* minor fixes for onnx
* update __init__ to latest format
* remove DepthProConfig.to_dict()
* major changes in backbone
* update config in weight conversion
* formatting
* converted model is fp32
* improve naming and docs for feature_extractor->reconstruct_feature_maps
* minor fixes; amid review
* create intermediate vars in func call
* use torch.testing.assert_close
* use ModuleList instead of Sequential and ModuleDict
* update docs
* include fov in integration tests
* update docs
* improve initialization of convolution layers
* fix unused fov keys
* update tests
* ruff format
* fix test, amid Kaiming initialization
* add depthpro to toctree
* add residual layer to _no_split_modules
* architecture rework
* Update src/transformers/models/depth_pro/image_processing_depth_pro.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/depth_pro/image_processing_depth_pro_fast.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* update docs
* improve merge_patches
* use flatten with fov_output
* ruff formatting
* update resources section in docs
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix typo "final_kernal_size"
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix output typehint for DepthProDepthEstimator
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* residual operation in 2 steps
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* use image_size instead of global patch_size in interpolation
* replace all Sequential with ModuleList
* update fov
* update heads
* fix and update conversion script for heads
* ruff formatting
* remove float32 conversion
* use "Fov" instead of "FOV" in class names
* use "Fov" instead of "FOV" in config docs
* remove prune_heads
* update fusion stage
* use device in examples
* update processor
* ruff fixes
* add do_rescale in image_processor_dict
* skip test: test_fast_is_faster_than_slow
* ruff formatting
* DepthProImageProcessorFast in other files
* revert antialias removal
* add antialias in BaseImageProcessorFast
* Revert "revert antialias removal"
This reverts commit 5caa0bd8f9.
* Revert "add antialias in BaseImageProcessorFast"
This reverts commit 3ae1134780.
* update processor for grouping and antialias
* try test_fast_is_faster_than_slow without "skip" or "flaky"
* update checkpoint
* update checkpoint
* use @is_flaky for processor test
* update checkpoint to "apple/DepthPro-hf"
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
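A hedged end-to-end sketch for the DepthPro model added here, using the `apple/DepthPro-hf` checkpoint referenced in the final commit; Auto classes are used so that only names appearing in these commits are assumed, and the result key name is borrowed from other depth-estimation models:

```python
import numpy as np
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForDepthEstimation

processor = AutoImageProcessor.from_pretrained("apple/DepthPro-hf")
model = AutoModelForDepthEstimation.from_pretrained("apple/DepthPro-hf")

image = Image.fromarray(np.zeros((480, 640, 3), dtype=np.uint8))
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# post_process_depth_estimation returns a list, per the commits above
result = processor.post_process_depth_estimation(outputs, target_sizes=[image.size[::-1]])[0]
depth = result["predicted_depth"]  # key name assumed from other depth models
```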