Yih-Dar
080e14b24c
Modify warnings
in a with
block to avoid flaky tests ( #31893 )
...
* fix
* [test_all] check before merge
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-10 17:56:12 +02:00
Sai-Suraj-27
da79b18087
fix: Removed duplicate
field definitions in some classes ( #31888 )
...
Removed duplicate field definitions in classes.
2024-07-10 13:46:31 +01:00
Yih-Dar
9d98706b3f
Fix failed tests in #31851 ( #31879 )
...
* Revert "Revert "Fix `_init_weights` for `ResNetPreTrainedModel`" (#31868 )"
This reverts commit b45dd5de9c
.
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
* fix
* [test_all] check
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-10 14:25:24 +02:00
Yih-Dar
b45dd5de9c
Revert "Fix _init_weights
for ResNetPreTrainedModel
" ( #31868 )
...
Revert "Fix `_init_weights` for `ResNetPreTrainedModel` (#31851 )"
This reverts commit 4c8149d643
.
2024-07-09 23:00:56 +02:00
Yih-Dar
4c8149d643
Fix _init_weights
for ResNetPreTrainedModel
( #31851 )
...
* init
* test
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-09 20:09:08 +02:00
Yung-Sung Chuang
d094d8d9ec
Generate: Add new decoding strategy "DoLa" in .generate()
( #29619 )
...
Co-authored-by: Joao Gante <joao@huggingface.co>
2024-07-09 17:37:38 +01:00
Joao Gante
4c2538b863
Test loading generation config with safetensor weights ( #31550 )
...
fix test
2024-07-09 16:22:43 +02:00
fxmarty
0abf5e8eae
FX symbolic_trace: do not test decoder_inputs_embeds ( #31840 )
...
only test input_embeds, not decoder_input_embeds
2024-07-09 08:07:46 +02:00
Yih-Dar
4879ac2b33
Avoid failure TFBlipModelTest::test_pipeline_image_to_text
( #31827 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-08 13:49:21 +02:00
fxmarty
ba743700f4
transformers.fx.symbolic_trace supports inputs_embeds ( #31574 )
...
* symbolic trace supports inputs_embeds
* fix test?
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-08 19:17:28 +08:00
Pavel Iakubovskii
a177821b24
Add FA2 and sdpa
support for SigLIP ( #31499 )
...
* Rebase to main
* Fix attention implementation autoset for tex and vision configs
* Fixup
* Minor fixes
* Fix copies
* Fix attention_mask for FA2
* Add eqvivalence tests for siglip
* Remove right padding test
* Uncomment flaky
* Fix import
* Add to docs
* Fix test message
* Add sdpa
* Add sdpa equivalence test
* Add siglip sdpa to docs
* Fix typing for attention output
* Add sdpa tests
* Fix signature of FA2
* Autoset attn_implementation in config
* Rename bsz -> batch_size
* Move back autoset attn method
* Mark as flaky
* Correct attention mask padding
* [run-slow] siglip
* Add FA2 and sdpa docs
* Style fix
* Remove flaky for FA2 test
* Change attention implementation set
* Change attn_implementaiton propogation
* Fix typos
* Add modality to assert message
* Add more sdpa backends in test
* [run slow] siglip
* Add math sdpa backend for all options
* [run slow] siglip
2024-07-08 11:10:02 +01:00
NielsRogge
06fd7972ac
Add ZoeDepth ( #30136 )
...
* First draft
* Add docs
* Clean up code
* Convert model
* Add image processor
* Convert Zoe_K
* More improvements
* Improve variable names and docstrings
* Improve variable names
* Improve variable names
* Replace nn.sequential
* More improvements
* Convert ZoeD_NK
* Fix most tests
* Verify pixel values
* Verify pixel values
* Add squeeze
* Update beit to support arbitrary window sizes
* Improve image processor
* Improve docstring
* Improve beit
* Improve model outputs
* Add figure
* Fix beit
* Update checkpoint
* Fix repo id
* Add _keys_to_ignore_on_load_unexpected
* More improvements
* Address comments
* Address comments
* Address comments
* Address comments
* Rename variable name
* Add backbone_hidden_size
* Vectorize
* Vectorize more
* Address comments
* Clarify docstring
* Remove backbone_hidden_size
* Fix image processor
* Remove print statements
* Remove print statement
* Add integration test
* Address comments
* Address comments
* Address comments
* Address comments
* Add requires_backends
* Clean up
* Simplify conversion script
* Simplify more
* Simplify more
* Simplify more
* Clean up
* Make sure beit is loaded correctly
* Address comment
* Address bin_configurations
* Use bin_configurations
* Convert models, add integration tests
* Fix doc test
* Address comments
* Unify regressor classes
* Clarify arguments
* Improve resize_image
* Add num_relative_features
* Address comment
* [run-slow]beit,data2vec,zoedepth
* [run-slow]beit,data2vec,zoedepth
* Address comments
* Address comment
* Address comment
* Replace nn.TransformerEncoderLayer and nn.TransformerEncoder
* Replace nn.MultiheadAttention
* Add attributes for patch transformer to config
* Add tests for ensure_multiple_of
* Update organization
* Add tests
* [run-slow] beit data2vec
* Update ruff
* [run-slow] beit data2vec
* Add comment
* Improve docstrings, add test
* Fix interpolate_pos_encoding
* Fix slow tests
* Add docstring
* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/zoedepth/image_processing_zoedepth.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Improve tests and docstrings
* Use run_common_tests
* Improve docstrings
* Improve docstrings
* Improve tests
* Improve tests
* Remove print statements
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-08 11:43:33 +02:00
Anton Vlasjuk
a01b033cb4
Fix galore lr display with schedulers ( #31710 )
...
* fix galore lr display with lr schedulers
* style
* add some tests to check for displayed lrs
* copy-paste err for warmup steps
* standardize the default lr to be only in the optimizer
* trying out my luck with the reads
2024-07-05 18:59:09 +01:00
Billy Cao
ac26260436
Allow FP16 or other precision inference for Pipelines ( #31342 )
...
* cast image features to model.dtype where needed to support FP16 or other precision in pipelines
* Update src/transformers/pipelines/image_feature_extraction.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Use .to instead
* Add FP16 pipeline support for zeroshot audio classification
* Remove unused torch imports
* Add docs on FP16 pipeline
* Remove unused import
* Add FP16 tests to pipeline mixin
* Add fp16 placeholder for mask_generation pipeline test
* Add FP16 tests for all pipelines
* Fix formatting
* Remove torch_dtype arg from is_pipeline_test_to_skip*
* Fix format
* trigger ci
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-05 17:21:50 +01:00
Billy Cao
1d3eaa6f7e
Add training support for SigLIP ( #31495 )
...
* Add siglip loss function
* Update docs
* Enable training tests
[experimental] enable GC training tests as it has worked for my own data
* Remove test_training* overrides to enable training tests
[run_slow] siglip
* Skip training tests for Siglip text model and ImageClassificationModel
[run_slow] siglip
* Skip GC training tests for SiglipForImageClassification
* Explicitly skip training tests for SiglipVisionModel
Add skip reason for training tests for SiglipTextModel
* Remove copied from to fix CI
2024-07-05 14:50:39 +01:00
Aymeric Roucher
1556025271
Code agent: allow function persistence between steps ( #31769 )
...
* Code agent: allow function persistence between steps
2024-07-05 11:09:11 +02:00
Yih-Dar
eef0507f3d
Fix gemma tests ( #31794 )
...
* skip 3 7b tests
* fix
* fix
* fix
* [run-slow] gemma
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-07-05 10:17:59 +02:00
Marc Sun
8c5c180de0
Fix serialization for offloaded model ( #31727 )
...
* Fix serialization
* style
* add test
2024-07-05 08:07:07 +02:00
Pavel Iakubovskii
048f599f35
Fix RT-DETR weights initialization ( #31724 )
...
* Fix init for rt-detr heads
* Fixup
* Add separate prior_prob value to config for initialization
* Add bbox init
* Change to 1 / num_labels init
* Adjust weights init test
* Fix style for test
2024-07-03 14:29:02 +01:00
Pavel Iakubovskii
b97521614a
Fix RT-DETR cache for generate_anchors ( #31671 )
...
* Fix cache and type conversion
* Add test
* Fixup
* nit
* [run slow] rt_detr
* Fix test
* Fixup
* [run slow] rt_detr
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
2024-07-03 14:19:57 +01:00
Joao Gante
ddfaf11926
Gemma 2: Update slow tests ( #31759 )
...
gemma 2 slow tests
2024-07-03 11:43:44 +02:00
jiqing-feng
7f91f168a1
fix assisted decoding ( #31401 )
...
* fix assisted decoding
* check None
* fix typo
* fix _prepare_special_tokens
* fix style
* fix lint
* add tests for assisted decoding
* fix style
* fix tests check
2024-07-03 09:22:56 +01:00
Matt
cd0935dd55
Make tool JSON schemas consistent ( #31756 )
...
Make the order of array items consistent using sorted()
2024-07-02 20:00:42 +01:00
Joao Gante
82486e5995
🚨 🚨 TextGenerationPipeline: rely on the tokenizer default kwargs ( #31747 )
...
* rely on the tokenizer default kwargs
* fix a few tests
2024-07-02 16:17:42 +02:00
Sanchit Gandhi
a9701953ff
[whisper] static kv cache ( #31166 )
...
* make work with cache abstraction
* correct for static cache
* hacks for compile
* make fast
* fix
* fix pos ids
* generate
* fix sdpa
* fix sdpa cache pos
* fix fa2
* clean fa2
* integrate cache into generate
* make style
* copies
* more copies
* update eager
* update sdpa
* update fa2
* simplify
* use cache pos
* always compute cross-cache for debug
* avoid recompiles
Co-authored-by: Arthur Zucker <arthur@huggingface.co>
* fix fix
* fix fix fix
* more fix
* try encoder-decoder cache (too messy)
* revert encoder-decoder cache
* check cross-attn cache
* use enc-dec dataclass
* use richer enc-dec dataclass
* clean-up
* revert static cache changes
* small fixes
* revert to cpu flag
* fix copies
* add static slow test
* past k/v docstring
* more docstrings
* cache_position docstrings
* add to docs
* add enc-dec cache to docs
* make style
* fix after rebase
* fix beam
* style
* fix generation strategies
* fix most decoder-only tests
* style
* skip test
* more clean up
* small docstrings
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add todo
* only crop self-attn
* check cache in mixin
* style
* fix re-compile after rebase
* move `is_updated` logic to enc-dec wrapper
* revert back
* revert cache back
* finalise design
* fix
* fix fix
* style
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* deprecate
* updates
* final updates
* style
* style
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-07-02 13:24:15 +01:00
Yih-Dar
93cd94b79d
Move some test files (tets/test_xxx_utils.py
) to tests/utils
( #31730 )
...
* move
* move
* move
* move
* Update tests/utils/test_image_processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-07-02 13:46:03 +02:00
Sangbum Daniel Choi
cb298978ad
add gather_use_object arguments ( #31514 )
...
* add gather_use_object arguments
* fix name and pass the CI test for Seq2SeqTrainer
* make style
* make it to functools
* fix typo
* add accelerate version:
* adding warning
* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* make style
* Update src/transformers/training_args.py
* check function move to initial part
* add test for eval_use_gather_object
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-06-28 13:50:27 +01:00
Jacky Lee
82a1fc7256
Fix return_dict in encodec ( #31646 )
...
* fix: use return_dict parameter
* fix: type checks
* fix: unused imports
* update: one-line if else
* remove: recursive check
2024-06-28 12:18:01 +01:00
Arthur
75a6319864
Fix post gemma merge ( #31660 )
...
* nit
* toctree issue
* protect gemma2 tests as well
* sdpa supported
2024-06-27 17:51:42 +02:00
Arthur
0cf60f13ab
Add gemma 2 ( #31659 )
...
* inital commit
* Add doc
* protect?
* fixup stuffs
* update tests
* fix build documentation
* mmmmmmm config attributes
* style
* nit
* uodate
* nit
* Fix docs
* protect some stuff
---------
Co-authored-by: Lysandre <lysandre@huggingface.co>
2024-06-27 17:36:19 +02:00
Sangbum Daniel Choi
be50a0338b
change anchor_image_size None for compatibility ( #31640 )
...
* change anchor_image_size None for compatibility
* make fix-copies
2024-06-27 12:36:55 +01:00
Billy Cao
3a028101e9
[QoL] Allow dtype str for torch_dtype arg of from_pretrained ( #31590 )
...
* Allow dtype str for torch_dtype in from_pretrained
* Update docstring
* Add tests for str torch_dtype
2024-06-27 12:41:49 +02:00
amyeroberts
1de7dc7403
Skip tests properly ( #31308 )
...
* Skip tests properly
* [test_all]
* Add 'reason' as kwarg for skipTest
* [test_all] Fix up
* [test_all]
2024-06-26 21:59:08 +01:00
Billy Cao
1f9f57ab4c
Fix dtype casting in swinv2 and swinv2sr to allow non-FP32 inference ( #31589 )
...
* Fix dtype casting in modeling_swin2sr to allow non-FP32 inference
* Fix formattting
* Fix for swinv2 too
* Update src/transformers/models/swin2sr/modeling_swin2sr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/swinv2/modeling_swinv2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Add FP16 tests for swin2sr and swinv2
* [run_slow] swin2sr, swinv2
* [run_slow] swin2sr, swinv2
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-06-26 18:46:48 +01:00
Pablo Montalvo
492ee17ec3
Fix paligemma detection inference ( #31587 )
...
* fix extended attention mask
* add slow test for detection instance
* [run-slow]paligemma
2024-06-26 19:17:09 +02:00
Raushan Turganbay
e71f2863d7
Add LLaVa NeXT Video ( #31252 )
...
* squash into single commit
* run diff once more
* docstring
* tests
* minor chnages and ready to go
* Update src/transformers/models/llava_next_video/processing_llava_next_video.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/vipllava/test_modeling_vipllava.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* [run-slow] llava-next-video
* [run-slow] llava-next-video
* [run-slow] llava_next_video
* fix two tests
* fix slow tests
* remove logit checks due to numeric errors
* run test once more
* [run-slow] llava_next_video
* final try to pass the test
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* [run-slow] llava_next_video
* style
* fix
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-06-26 21:52:28 +05:00
Pavel Iakubovskii
b1ec745475
Fix RT-DETR inference with float16 and bfloat16 ( #31639 )
...
* [run_slow] rt_detr
* Fix positional embeddings and anchors dtypes
* [run slow] rt_detr
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fixup
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-06-26 17:50:10 +01:00
Younes Belkada
3f93fd0694
Llama et al. / FSDP : Fix breaking change in 4.40 for FSDP ( #31161 )
...
* fix llama fsdp
* fixup
* adding FSDP tests for CPU offloading
* fixes
* fix tests
* fix tests
* add it for mixtral
* propagate the changes on other models
* Update src/transformers/models/phi/modeling_phi.py
* Delete utils/testing_scripts/fsdp_cpu_offloading.py
Remove script - FSDP + CPU offloading it tested in the test suite
* Delete utils/testing_scripts/dummy_fsdp_config.yml
* Update + add cache_positions docstring
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-06-26 14:50:08 +01:00
Anton Vlasjuk
b07770c5eb
[GPT-NeoX
] Add SDPA support ( #31031 )
...
* starting support for sdpa in `gptneox` models
* small comment on tests
* fix dropout
* documentation and style
* clarify concrete paths for reference
* generalise attn projections and rope application
added head mask check to sdpa mask creation
handle sdpa memory backend bug via own version flag
* update docs and style
* move dtype casting outside of general attn_projection_and_rope function
fix flash_attn_2 stuff
* more generic attn warning if output_attns or head_mask
* simplify head mask check by moving head mask creation to a later point
* remove copied llama artifact
* remove padding_mask from attention function signature
* removing unnecessary comments, only "save" attn implementation once
* [run_slow] gpt_neox
2024-06-26 13:56:36 +01:00
amyeroberts
0f67ba1d74
Add ViTImageProcessorFast to tests ( #31424 )
...
* Add ViTImageProcessor to tests
* Correct data format
* Review comments
2024-06-25 13:36:58 +01:00
Raushan Turganbay
fc689d75a0
Add video modality for InstrucBLIP ( #30182 )
...
* squash in single commit
* add docs
* dummy obj
* more changes in diff converter
* tiny fix
* make docs happy
* skip test
* repo consistency tests
* update docstring
* style
* fix tests
* change diff imports
* [run-slow] instructblipvideo
* [run-slow] instructblipvideo
* fix tests and remove logit check
* [run-slow] instructblipvideo
2024-06-25 15:45:39 +05:00
jiqing-feng
a958c4a801
fix output data type of image classification ( #31444 )
...
* fix output data type of image classification
* add tests for low-precision pipeline
* add bf16 pipeline tests
* fix bf16 tests
* Update tests/pipelines/test_pipelines_image_classification.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix import
* fix import torch
* fix style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-06-25 11:14:39 +01:00
Raushan Turganbay
7e86cb6c6f
Siglip: add _no_split_module
( #31566 )
...
* device-map siglip
* move split modules to PretrainedSigLip
2024-06-25 09:49:55 +05:00
Hiroshi Matsuda
0e23e60a5a
Fix bug about add_special_tokens and so on ( #31496 )
...
* fix bug about add_special_tokens and so on
* improve add_special_tokens and padding behavior
* add a test case for add_special_tokens and padding
2024-06-24 14:05:16 +01:00
Zhiyong Wang
dce253f645
Add implementation of spectrogram_batch
( #27159 )
...
* Add initial implementation of `spectrogram_batch`
* Format the initial implementation
* Add test suite for the `spectrogram_batch`
* Update `spectrogram_batch` to ensure compatibility with test suite
* Update `spectrogram_batch` to include pre and post-processing
* Add `amplitude_to_db_batch` function and associated tests
* Add `power_to_db_batch` function and associated tests
* Reimplement the test suite for `spectrogram_batch`
* Fix errors in `spectrogram_batch`
* Add the function annotation for `spectrogram_batch`
* Address code quality
* Re-add `test_chroma_equivalence` function
* Update src/transformers/audio_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/audio_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-06-24 09:19:12 +02:00
Pavel Iakubovskii
3c2d4d60d7
Correct @is_flaky test decoration ( #31480 )
...
* Correct @is_flaky decorator
2024-06-24 08:09:21 +01:00
Sangbum Daniel Choi
74a207404e
New model support RTDETR ( #29077 )
...
* fill out docs string in configuration
75dcd3a0e8 (r1506391856)
* reduce the input image size for the tests
* remove the unappropriate tests
* only 5 failes exists
* make style
* fill up missed architecture for object detection in docs
* fix auto modeling
* simple fix in missing import
* major change including backbone refactor and objectdetectionoutput refactor
* minor fix only 4 fails left
* intermediate fix
* revert __init__.py
* revert __init__.py
* make style
* fixes in pr_docs
* intermediate fix
* make style
* two fixes
* pass doctest
* only one fix left
* intermediate commit
* all fixed
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/convert_rt_detr_original_pytorch_checkpoint_to_pytorch.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/rt_detr/test_modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* function class above the model definition in dice_loss
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* simple fix
* layernorm add config.layer_norm_eps
* fix inputs_docstring
* make style
* simple fix
* add custom coco loading test in image_processor
* fix error in BaseModelOutput
https://github.com/huggingface/transformers/pull/29077#discussion_r1516657790
* simple typo
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* intermediate fix
* fix with load_backbone format
* remove unused configuration
* 3 fix test left
* make style
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: Sounak Dey <dey.sounak@gmail.com>
* change last_hidden_state to first index
* all pass fix
TO DO: minor update in comments
* make fix-copies
* remove deepcopy
* pr_document fix
* revert deepcopy due to the issue of unexpceted behavior in decoderlayer
* add atol in final
* add no_split_module
* _no_split_modules = None
* device transfer for model parallelism
* minor fix
* make fix-copies
* fix typo
* add test_image_processor with post_processing
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add config in RTDETRPredictionHead
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* set lru_cache with max_size 32
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add lru_cache import and configuration change
* change the order of definition
* make fix-copies
* add docs and change config error
* revert strange make-fix
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* test pass
* fix get_clones related and remove deepcopy
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* nit for paper section
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* rename denoising related parameters
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* check the image transformation logic
* make style
* make style
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* pe_encoding -> positional_encoding_temperature
* remove TODO
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* remove eval_idx since transformer DETR is giving all decoder output
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* change variable name
* make style and docs import update
* Revert "Update src/transformers/models/rt_detr/image_processing_rt_detr.py"
This reverts commit 74aa3e1de0
.
* fix typo
* add postprocessing in docs
* move import scipy to top
* change varaible name
* make fix-copies
* remove eval_idx in test
* move to after first sentence
* update image_processor since box loss requires normalized one
* change appropriate name to auxiliary_outputs
* Update src/transformers/models/rt_detr/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/rt_detr/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* make style
* remove panoptic related comments
* make style
* revert valid_processor_keys
* fix aux related test
* make style
* change origination from config to backbone API
* enable the dn_loss
* fix test and conversion
* renewal weight initialization
* change initializer_range
* make fix-up
* fix the loss issue in the auxiliary output and denoising part
* change weight loss to original RTDETR
* fix in initialization
* sync shape format of dn and aux
* make style
* stable fine-tuning and compatible conversion for resnet101
* make style
* skip input_embed
* change encoder related variable
* enable converting rtdetr_r101
* add r101 related conversion code
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/image_processing_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change name _shape to _reshape
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* maket style
* make fix-copies
* remove deprecated import
* more fix
* remove last_hidden_state for task-specific model
* Revert "remove last_hidden_state for task-specific model"
This reverts commit ccb7a34051
.
* minore change in convert
* remove print
* make style and fix-copies
* add custom rtdetr backbone for r18, r34
* remove print
* change copied
* add pad_size
* make style
* change layertype to optional to pass the CI
* make style
* add test in modeling_resnet_rt_detr
* make fix-copies
* skip tmp file test
* fix comment
* add docs
* change to modeling_resnet file format
* enabling resnet50 above
* Update src/transformers/models/rt_detr/modeling_rt_detr.py
Co-authored-by: Jason Wu <jasonkit@users.noreply.github.com>
* enable all the rtdetr model :)
* finish except CI
* add RTDetrResNetBackbone
* make fix-copies
* fix
TO DO: CI enable
* make style
* rename test
* add docs
* add special fix
* revert resnet
* Update src/transformers/models/rt_detr/modeling_rt_detr_resnet.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* add more comment
* remove swin comment
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* rename convert and add verify backbone
* Update docs/source/en/_toctree.yml
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/rt_detr.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* make style
* requests for docs
* more general test docs
* general script docs
* make fix-copies
* final commit
* Revert "Update src/transformers/models/rt_detr/configuration_rt_detr.py"
This reverts commit d136225cd3
.
* skip test_model_get_set_embeddings
* remove target
* add changes
* make fix-copies
* remove decoder_attention_mask
* add load_backbone function for auto_backbone
* remove comment
* fix repo name
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* final commit
* remove unused downsample_in_bottleneck
* new test for autobackbone
* change to appropriate indices
* test fix
* fix dict in test_image_processor
* fix test
* [run-slow] rt_detr, rt_detr_resnet
* change the slow test
* [run-slow] rt_detr
* [run-slow] rt_detr, rt_detr_resnet
* make in to same cuda in CSPRepLayer
* [run-slow] rt_detr, rt_detr_resnet
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sounak Dey <dey.sounak@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Jason Wu <jasonkit@users.noreply.github.com>
Co-authored-by: ChoiSangBum <choisangbum@ChoiSangBumui-MacBookPro.local>
2024-06-21 17:50:08 +01:00
Ita Zaporozhets
1e79eade41
SPLIT PR: add user defined symbols and control symbols ( #31305 )
...
* PR SPLIT: moving origina changes for adding user defined symbols
* adding gemma test and generalizing gemma converter
* ruff
* update common test
* update serialization test
* deberta v2 tests updates as rust version adds '.' as a user added token, so a space is not added
* removing commented lines
* applying feedback - user only added_tokens to add and check piece.type instead of trainer_spec for user_defined_symbols
* add comment referencing sentencepiece
2024-06-21 01:48:10 -07:00
Yih-Dar
ec905f3a76
unskip 2 tests in cohere ( #31517 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-06-20 17:21:08 +02:00
Joao Gante
1fd60fec75
RWKV: enable generation tests ( #31490 )
...
* add rwkv tests
* has_attentions set in individual tests
2024-06-20 14:15:01 +01:00