Joao Gante
93e0e1a852
CI: add torchvision to the consistency image ( #32941 )
2024-08-26 15:17:45 +01:00
Shijie
19e6e80e10
support qwen2-vl ( #32318 )
...
* support-qwen2-vl
* tidy
* hyphen->underscore
* make style
* add-flash2-tipd
* delete-tokenize=False
* remove-image_processor-in-init-file
* add-qwen2_vl-in-MODEL_FOR_VISION_2_SEQ_MAPPING_NAMES
* format-doc
* support-Qwen2VLVisionConfig
* remove-standardize_cache_format
* fix-letter-variables
* remove-torch-in-image-processor
* remove-useless-docstring
* fix-one-letter-variable-name
* change-block-name
* default-quick-gelu-in-vision
* remove-useless-doc
* use-preimplemented-flash-forward
* fix-doc
* fix-image-processing-doc
* fix-apply-rotary-embed
* fix-flash-attn-sliding-window
* refactor
* remove-default_template
* remove-reorder_cache
* simple-get-rope_deltas
* update-prepare_inputs_for_generation
* update-attention-mask
* update-rotary_seq_len
* remove-state
* kv_seq_length
* remove-warning
* _supports_static_cache
* remove-legacy-cache
* refactor
* fix-replace
* mrope-section-doc
* code-quality
* code-quality
* polish-doc
* fix-image-processing-test
* update readme
* Update qwen2_vl.md
* fix-test
* Update qwen2_vl.md
* nit
* processor-kwargs
* hard-code-norm_layer
* code-quality
* discard-pixel-values-in-gen
* fix-inconsistent-error-msg
* unify-image-video
* hidden_act
* add-docstring
* vision-encode-as-PreTrainedModel
* pixel-to-target-dtype
* update doc and low memory vit
* format
* format
* channel-format
* fix vit_flashatt
* format
* inherit-Qwen2VLPreTrainedModel
* simplify
* format-test
* remove-one-line-func-in-image-processing
* avoid-one-line-reshape
* simplify-rotary_seq_len
* avoid-single-letter-variable
* no-for-loop-sdpa
* avoid-single-letter-variable
* remove-one-line-reshape
* remove-one-line-reshape
* remove-no-rope-in-vit-logic
* default-mrope
* add-copied-from
* more-docs-for-mrope
* polish-doc
* comment-and-link
* polish-doc
* single-letter-variables
* simplify-image-processing
* video->images
* kv_seq_len-update
* vision-rope-on-the-fly
* vision-eager-attention
* change-processor-order
---------
Co-authored-by: baishuai <baishuai.bs@alibaba-inc.com>
Co-authored-by: ShuaiBai623 <43326198+ShuaiBai623@users.noreply.github.com>
2024-08-26 15:16:44 +02:00
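Several of the bullets above ("mrope-section-doc", "default-mrope", "more-docs-for-mrope") refer to the multimodal rotary position embedding, where the rotary dimensions are partitioned into sections and each section takes its position index from a different axis (temporal, height, width). A deliberately simplified, pure-Python sketch of that partitioning idea — function name and the `{"t", "h", "w"}` keys are illustrative, not the actual Qwen2-VL implementation:

```python
# Hypothetical sketch of the "mrope section" idea from the commit log: rotary
# dim-pair groups are assigned, section by section, to the temporal, height,
# and width position ids of a visual token.

def interleave_mrope_positions(positions, mrope_section):
    """positions: per-axis position ids for one token, e.g. {"t": 3, "h": 5, "w": 7}.
    mrope_section: how many rotary dim-pairs each axis owns, e.g. [2, 1, 1].
    Returns one position id per rotary dim-pair, laid out axis by axis."""
    axes = ["t", "h", "w"]
    out = []
    for axis, n_dims in zip(axes, mrope_section):
        out.extend([positions[axis]] * n_dims)
    return out

ids = interleave_mrope_positions({"t": 3, "h": 5, "w": 7}, [2, 1, 1])
print(ids)  # [3, 3, 5, 7]
```

Text-only tokens would simply use the same index on every axis, which degenerates to ordinary 1D RoPE.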
S M Jishanul Islam
8defc95df3
Updated custom_models.md: changed cross_entropy code ( #33118 )
2024-08-26 13:15:43 +02:00
Matt
0a7af19f4d
Update Jinja docs with new functions and general cleanup ( #33097 )
2024-08-23 17:40:06 +01:00
Arun Prakash A
e3a5f35cd5
added docstring to SchedulerType class ( #32898 )
...
* added docstring to SchedulerType class
* Remove trailing whitespace src/transformers/trainer_utils.py
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fixup
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-08-23 09:15:25 -07:00
Donggeun Yu
1dbd9d3693
DeviceGuard added to use Deformable Attention more safely on multi-GPU ( #32910 )
...
* Update modeling_deformable_detr.py
* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update ms_deform_attn_cuda.cu
* Update modeling_deformable_detr.py
* Update modeling_deformable_detr.py
* [empty] this is an empty commit
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-23 17:12:10 +01:00
Matt
371b9c1486
Enable some Jinja extensions and add datetime capabilities ( #32684 )
...
* Add new Jinja features:
- Do extension
- Break/continue in loops
- Call strftime to get current datetime in any format
* Fix strftime template
* Add template strip() just to be safe
* Remove the do extension to make porting easier, and also because it's the least useful
* Rename test
* strftime -> strftime_now
* Split test
* Update test to use strftime_now
* Refactor everything out into chat_template_utils
2024-08-23 14:26:12 +01:00
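The `strftime -> strftime_now` rename above refers to a helper exposed to chat templates that formats the current datetime. A minimal sketch of such a helper, assuming the real one is simply wired into the Jinja environment's globals:

```python
# Sketch of a `strftime_now`-style helper as described in the commit messages.
# The registration line in the comment below is illustrative, not the exact
# transformers plumbing.
from datetime import datetime

def strftime_now(fmt):
    """Return the current datetime rendered with the given strftime format."""
    return datetime.now().strftime(fmt)

# In a Jinja2 environment this might be exposed roughly as:
#   env.globals["strftime_now"] = strftime_now
# letting a chat template write: {{ strftime_now("%d %B %Y") }}
print(strftime_now("%Y-%m-%d"))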
Jason (Siyu) Zhu
adb91179b9
Integrate Liger (Linkedin GPU Efficient Runtime) Kernel to Trainer ( #32860 )
...
* add liger integration
* fix syntax
* fix import issue
* add trainer.md
* Use _apply_liger_kernel()
* Fixed log message
* Update docs/source/en/trainer.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update docs/source/en/trainer.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
* Update src/transformers/trainer.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
* Update docs/source/en/trainer.md
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
* Fixed checkstyle and updated readme
* Added test
* Fixed checkstyle
* fix docstring
* rename use_liger to use_liger_kernel
* Trigger Build
* Added test
* add fix-copies
* Fixed copy inconsistencies
---------
Co-authored-by: shimizust <sshimizu@linkedin.com>
Co-authored-by: Steven Shimizu <shimizust@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
2024-08-23 13:20:49 +02:00
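The `_apply_liger_kernel()` bullet above suggests a dispatch pattern: a registry maps a model type to a patch function that swaps standard modules for fused Liger kernels. A toy, stdlib-only sketch of that pattern — the registry contents and names here are hypothetical; the real kernels live in the Liger-Kernel package and are enabled via `use_liger_kernel=True` in `TrainingArguments`:

```python
# Hypothetical sketch of per-model-type kernel patching, dict-based for brevity.
LIGER_PATCHES = {}

def register_liger_patch(model_type):
    def deco(fn):
        LIGER_PATCHES[model_type] = fn
        return fn
    return deco

@register_liger_patch("llama")
def patch_llama(model):
    model["patched"] = True  # stand-in for swapping RMSNorm/SwiGLU/CE kernels
    return model

def apply_liger_kernel(model, model_type):
    patch = LIGER_PATCHES.get(model_type)
    if patch is None:
        raise ValueError(f"No Liger kernel patch registered for {model_type!r}")
    return patch(model)

model = apply_liger_kernel({"name": "toy-llama"}, "llama")
print(model["patched"])  # True
```

The registry keeps the Trainer ignorant of kernel details: supporting a new architecture is one more entry, not a Trainer change.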
Joao Gante
970a16ec7f
Forbid PretrainedConfig from saving generate parameters; Update deprecations in generate-related code 🧹 ( #32659 )
...
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-23 11:12:53 +01:00
Cyril Vallez
22e6f14525
Reducing memory usage: removing useless logits computation in generate() ( #31292 )
...
* Add .float() in all generation methods logit outputs
* Switch float-casting of logits to training only for main models
* Add `num_logits_to_keep` in Llama and add it by default in generate
* Apply style
* Add num_logits_to_keep as arg in prepare_input_for_generation
* Add support for Mistral
* Revert models except llama and mistral
* Fix default None value in _supports_num_logits_to_keep()
* Fix dimension of dummy input
* Add exception for prophetnet in _supports_num_logits_to_keep()
* Update _supports_num_logits_to_keep() to use inspect.signature()
* Add deprecation cycle + remove modification with pretraining_tp
* Apply style
* Add most used models
* Apply style
* Make `num_logits_to_keep` an int in all cases to remove if-else clause
* Add compile check for the warning
* Fix torch versions
* style
* Add gemma2
* Update warning version
* Add comment about .float operations in generation utils
* Add tests in GenerationTesterMixin and ModelTesterMixin
* Fix batch size for assisted decoding in tests
* fix small issues in test
* refacor test
* fix slicing removing dim issue
* Add nemotron support (should fix check-copy issue in CIs)
* Trigger new CIs
* Trigger new CIs
* Bump version
* Bump version in TODO
* Trigger CIs
* remove blank space
* Trigger CIs
2024-08-23 11:08:34 +01:00
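The memory saving behind `num_logits_to_keep` is a slice before the LM head: during decoding only the last position's logits are needed, so projecting every position's hidden state through a vocab-sized matrix is wasted work and memory. A list-based sketch of the slicing (nested lists stand in for a `[batch, seq, hidden]` tensor; names are illustrative):

```python
# Why num_logits_to_keep saves memory: keep only the trailing positions that
# actually feed sampling before the (large) LM head projection.

def slice_for_lm_head(hidden_states, num_logits_to_keep):
    if num_logits_to_keep == 0:  # 0 == keep everything (e.g. training / loss)
        return hidden_states
    return [seq[-num_logits_to_keep:] for seq in hidden_states]

batch = [[[0.1] * 4 for _ in range(10)]]  # 1 sequence, 10 positions, hidden=4
kept = slice_for_lm_head(batch, 1)
print(len(kept[0]))  # 1 -> only the last position reaches the LM head
```

For a long prompt with a 100k+ vocabulary, skipping the full-sequence projection avoids a logits tensor that can dwarf the rest of the forward pass.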
Stefano Fiorucci
d806fa3e92
docs: fix outdated link to TF32 explanation ( #32947 )
...
fix outdated link
2024-08-22 13:28:00 -07:00
Joao Gante
a26de15139
Generate: Deprecate returning legacy cache by default; Handle use_cache=False ( #32863 )
2024-08-22 20:01:52 +01:00
Jinuk
09e6579d2d
🌐 [i18n-KO] Translated `knowledge_distillation_for_image_classification.md` to Korean ( #32334 )
...
* docs: ko: tasks/knowledge_distillation_for_image_classification.md
* feat: nmt draft
* fix: manual edits
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
* Apply suggestions from code review
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
* Apply suggestions from code review
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
---------
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
Co-authored-by: Ahnjj_DEV <ahnjj.dev@gmail.com>
2024-08-22 10:42:39 -07:00
Franz Louis Cesista
273c0afc8f
Fix regression on Processor.save_pretrained caused by #31691 ( #32921 )
...
fix save_pretrained
2024-08-22 18:42:44 +02:00
Andrés Marafioti
18199b34e5
[run_slow] idefics2 ( #32840 )
2024-08-22 18:08:03 +02:00
Joao Gante
975b988bfe
Gemma2: eager attention by default ( #32865 )
2024-08-22 15:59:30 +01:00
Shaopeng Fu
f1d822ba33
fix: (issue #32689 ) AttributeError raised when using Trainer with eval_on_start=True in Jupyter Notebook. ( #32849 )
...
fix: `AttributeError` raised when using `Trainer` with `eval_on_start=True` in Jupyter Notebook.
2024-08-22 16:42:00 +02:00
Isotr0py
ee8c01f839
Add chat_template for tokenizer extracted from GGUF model ( #32908 )
...
* add chat_template to gguf tokenizer
* add template through tokenizer config
2024-08-22 16:41:25 +02:00
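The change above forwards a chat template stored in the GGUF metadata into the tokenizer config. A simplified sketch of that hand-off; the `tokenizer.chat_template` key follows the GGUF metadata convention, while the surrounding plumbing here is hypothetical:

```python
# Sketch: copy a chat template from GGUF metadata into the tokenizer config,
# leaving the config untouched when the GGUF file carries no template.

def extract_chat_template(gguf_metadata, tokenizer_config):
    template = gguf_metadata.get("tokenizer.chat_template")
    if template is not None:
        tokenizer_config["chat_template"] = template
    return tokenizer_config

meta = {"tokenizer.chat_template": "{% for m in messages %}{{ m.content }}{% endfor %}"}
cfg = extract_chat_template(meta, {})
print("chat_template" in cfg)  # True
```

This lets `apply_chat_template` work out of the box on tokenizers converted from GGUF checkpoints.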
regisss
99d67f1a09
Improve greedy search memory usage ( #32895 )
...
Do not call torch.repeat_interleave if expand_size is 1
2024-08-22 15:37:44 +01:00
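The improvement above is a short-circuit: expanding inputs for generation is a no-op when `expand_size` is 1 (plain greedy search), so the copy made by `torch.repeat_interleave` can be skipped entirely. A list-based sketch of the guard (function name illustrative):

```python
# Skip the expansion copy entirely when expand_size == 1, mirroring the
# "do not call torch.repeat_interleave" guard described in the commit.

def expand_inputs_for_generation(input_ids, expand_size):
    if expand_size == 1:
        return input_ids  # same object back: no allocation, no copy
    return [row for row in input_ids for _ in range(expand_size)]

ids = [[1, 2, 3]]
assert expand_inputs_for_generation(ids, 1) is ids
print(expand_inputs_for_generation(ids, 2))  # [[1, 2, 3], [1, 2, 3]]
```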
Yih-Dar
bf97d4aa6d
Fix benchmark script ( #32635 )
...
* fix
* >= 0.3.0
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-08-22 16:07:47 +02:00
Shubham Ugare
9282413611
Add SynCode to llm_tutorial ( #32884 )
2024-08-22 15:30:22 +02:00
Younes Belkada
eeea71209a
FIX / Hub: Also catch for exceptions.ConnectionError ( #31469 )
...
* Update hub.py
* Update errors
* Apply suggestions from code review
Co-authored-by: Lucain <lucainp@gmail.com>
---------
Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
2024-08-22 15:29:21 +02:00
Joao Gante
8b94d28f97
CI: separate step to download nltk files ( #32935 )
...
* separate step to download nltk files
* duplicated
* rm comma
2024-08-22 14:17:24 +01:00
Marc Sun
c42d264549
FEAT / Trainer: Add adamw 4bit optimizer ( #31865 )
...
* add 4bit optimizer
* style
* fix msg
* style
* add qgalore
* Revert "add qgalore"
This reverts commit 25278e805f.
* style
* version check
2024-08-22 15:07:09 +02:00
Gal Cohen (galco)
6baa6f276a
fix: no need to dtype A in jamba ( #32924 )
...
Co-authored-by: Gal Cohen <galc@ai21.com>
2024-08-22 15:03:22 +02:00
Sai-Suraj-27
af638c4afe
fix: Added missing huggingface_hub installation to workflows ( #32891 )
...
Added missing huggingface_hub installation to workflows.
2024-08-22 12:51:12 +01:00
Joao Gante
f6e2586a36
Jamba: update integration tests ( #32250 )
...
* try test updates
* a few more changes
* a few more changes
* a few more changes
* [run slow] jamba
* skip logits checks on older gpus
* [run slow] jamba
* oops
* [run slow] jamba
* Update tests/models/jamba/test_modeling_jamba.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/jamba/test_modeling_jamba.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-08-22 11:46:10 +01:00
Arthur
3bb7b05229
Update docker image building ( #32918 )
...
commit
2024-08-21 21:23:10 +02:00
Ruilin Huang
c6d484e38c
fix: [whisper] don't overwrite GenerationConfig's return_timestamps when return_timestamps is not passed to generate function ( #31296 )
...
[whisper] don't overwrite return_timestamps when not passed to generate
2024-08-21 20:21:27 +01:00
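The core of this fix, sketched: an explicitly passed `return_timestamps` should win, but when the caller passes nothing, the value already set on the `GenerationConfig` must be left alone rather than overwritten. Function name here is illustrative:

```python
# Resolution rule for return_timestamps: explicit argument beats config,
# absent argument (None) preserves the config value.

def resolve_return_timestamps(passed, config_value):
    return config_value if passed is None else passed

print(resolve_return_timestamps(None, True))   # True  (config preserved)
print(resolve_return_timestamps(False, True))  # False (explicit arg wins)
```

The bug was effectively treating "not passed" the same as "passed False", clobbering a user's configured default.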
Ahmed Almaghz
87134662f7
[i18n-ar] add README_ar.md to README.md ( #32583 )
...
* Update README.md
* Update README.md
* Add README_ar.md to i18n/README_de.md
* Add README_ar.md to i18n/README_es.md
* Add README_ar.md to i18n/README_fr.md
* Add README_ar.md to i18n/README_hd.md
* Add README_ar.md to i18n/README_ja.md
* Add README_ar.md to i18n/README_ko.md
* Add README_ar.md to i18n/README_pt-br.md
* Add README_ar.md to i18n/README_ru.md
* Add README_ar.md to i18n/README_te.md
* Add README_ar.md to i18n/README_vi.md
* Add README_ar.md to i18n/README_vi.md
* Add README_ar.md to i18n/README_zh-hans.md
* Add README_ar.md to i18n/README_zh-hant.md
* Create README_ar.md
2024-08-20 16:11:54 -07:00
Nicholas Broad
1dde50c7d2
link for optimizer names ( #32400 )
...
* link for optimizer names
Add a note and link to where the user can find more optimizer names easily because there are many more optimizers than are mentioned in the docstring.
* make fixup
2024-08-20 15:28:24 -07:00
Pavel Iakubovskii
078d5a88cd
Replace tensor.norm() with decomposed version for CLIP executorch export ( #32887 )
...
* Replace .norm() with decomposed version for executorch export
* [run_slow] clip
2024-08-20 21:27:21 +01:00
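The decomposition referred to above replaces the fused norm op with its primitive pieces (square, sum, sqrt), which export backends such as ExecuTorch can lower even when the fused op is unsupported. A scalar, stdlib-only illustration of the same L2 identity:

```python
# tensor.norm() == sqrt(sum(x * x)) for the default L2 norm; decomposing it
# into primitive ops keeps the graph exportable.
import math

def decomposed_l2_norm(values):
    return math.sqrt(sum(v * v for v in values))

print(decomposed_l2_norm([3.0, 4.0]))  # 5.0
```

Numerically the two forms agree for ordinary magnitudes; the trade-off is purely about which ops the export target supports.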
dependabot[bot]
9800e6d170
Bump nltk from 3.7 to 3.9 in /examples/research_projects/decision_transformer ( #32903 )
...
Bump nltk in /examples/research_projects/decision_transformer
Bumps [nltk](https://github.com/nltk/nltk ) from 3.7 to 3.9.
- [Changelog](https://github.com/nltk/nltk/blob/develop/ChangeLog )
- [Commits](https://github.com/nltk/nltk/compare/3.7...3.9 )
---
updated-dependencies:
- dependency-name: nltk
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-20 21:02:17 +01:00
Anton Vlasjuk
c63a3d0f17
Fix: Mamba2 norm_before_gate usage ( #32686 )
...
* mamba2 uses norm_before_gate=False
* small nit
* remove norm_before_gate flag and follow False path only
2024-08-20 19:47:34 +02:00
Gal Cohen (galco)
01c4fc455b
fix: jamba cache fails to use torch.nn.module ( #32894 )
...
Co-authored-by: Gal Cohen <galc@ai21.com>
2024-08-20 14:50:13 +02:00
Arthur
65f4bc99f9
Fix repr for conv ( #32897 )
...
add nx
2024-08-20 14:34:24 +02:00
Marc Sun
fd06ad5438
🚨 🚨 🚨 Update min version of accelerate to 0.26.0 ( #32627 )
...
* Update min version of accelerate to 0.26.0
* dev-ci
* update min version in import
* remove useless check
* dev-ci
* style
* dev-ci
* dev-ci
2024-08-20 11:42:36 +02:00
Arthur
13e645bb40
Allow-head-dim ( #32857 )
...
* support head dim
* fix the doc
* fixup
* add oproj
Co-authored-by: Suhara <suhara@users.noreply.github.com>
* update
Co-authored-by: bzantium <bzantium@users.noreply.github.com>
* Co-authored-by: suhara <suhara@users.noreply.github.com>
* Update
Co-authored-by: Yoshi Suhara <suhara@users.noreply.github.com>
---------
Co-authored-by: bzantium <bzantium@users.noreply.github.com>
Co-authored-by: Yoshi Suhara <suhara@users.noreply.github.com>
2024-08-20 10:24:48 +02:00
Matt
85345bb439
Add tip to clarify tool calling ( #32883 )
2024-08-19 18:37:35 +01:00
Sai-Suraj-27
37204848f1
Docs: Fixed whisper-large-v2 model link in docs ( #32871 )
...
Fixed whisper-large-v2 model link in docs.
2024-08-19 09:50:35 -07:00
Anton Vlasjuk
61d89c19d8
Fix: Mamba2 generation mismatch between input_ids and inputs_embeds ( #32694 )
...
* fix cache when using input embeddings
* simplify check, we can always add input ids seq len since its 0 in first pass
2024-08-19 16:06:07 +02:00
Younes Belkada
93e538ae2e
Mamba / FalconMamba: Fix mamba left padding ( #32677 )
...
* fix mamba left padding
* Apply suggestions from code review
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* fix copies
* test with `inputs_embeds`
* Update src/transformers/models/falcon_mamba/modeling_falcon_mamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* copies
* clarify
* fix last comments
* remove
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-19 16:01:35 +02:00
Isotr0py
59e8f1919c
Fix incorrect vocab size retrieval in GGUF config ( #32551 )
...
* fix gguf config vocab size
* minor fix
* link issue
2024-08-19 15:53:54 +02:00
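A sketch of the corrected retrieval described above, under the assumption the fix amounts to preferring the actual token list over a possibly missing or stale metadata field. The `tokenizer.ggml.tokens` key follows the GGUF metadata convention; the fallback logic here is an illustration, not the exact patch:

```python
# Derive vocab size from the token list itself when no explicit field exists.

def get_vocab_size(metadata):
    vocab_size = metadata.get("vocab_size")
    if vocab_size is None:
        vocab_size = len(metadata["tokenizer.ggml.tokens"])
    return vocab_size

meta = {"tokenizer.ggml.tokens": ["<s>", "</s>", "a", "b"]}
print(get_vocab_size(meta))  # 4
```

Getting this wrong silently misshapes the embedding matrix when converting a GGUF checkpoint, so deriving it from the tokens is the safer source of truth.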
Alan-Blanchet
5f6c080b62
RT-DETR parameterized batchnorm freezing ( #32631 )
...
* fix: Parameterized norm freezing
For the R18 model, the authors don't freeze norms in the backbone.
* Update src/transformers/models/rt_detr/configuration_rt_detr.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2024-08-19 14:50:57 +01:00
Yitong Huang
8a4857c0db
Support save/load ckpt for XLA FSDP ( #32311 )
...
* Support save/load ckpt for XLA FSDP
* Fix bug for save
* Fix style
* reserve sharded ckpt and better file naming
* minor fix
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* add is_fsdp_xla_v1_enabled
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
2024-08-19 15:44:21 +02:00
Aaron Chung
f1b720ed62
Add __repr__ for Conv1D ( #32425 )
...
* Add representation for Conv1D, for better output info.
* code format for Conv1D
* We add a __repr__ func for Conv1D; this lets the printed model info carry a better description of each Conv1D layer.
2024-08-19 15:26:19 +02:00
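What the added `__repr__` buys, sketched with a toy class: printing a model now shows the Conv1D dimensions instead of a bare class name. The real `Conv1D` lives in `transformers.pytorch_utils` and subclasses `nn.Module`; this stripped-down version only illustrates the repr:

```python
# Toy stand-in for transformers' Conv1D, showing the dimension-bearing repr.
class Conv1D:
    def __init__(self, nf, nx):
        self.nf = nf  # number of output features
        self.nx = nx  # number of input features

    def __repr__(self):
        return f"Conv1D(nf={self.nf}, nx={self.nx})"

print(Conv1D(768, 2304))  # Conv1D(nf=768, nx=2304)
```

The later "Fix repr for conv ( #32897 )" entry in this log ("add nx") touches the same repr.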
Fanli Lin
e55b33ceb4
[tests] make test_sdpa_can_compile_dynamic device-agnostic ( #32519 )
...
* enable
* fix
2024-08-19 12:46:59 +01:00
Ita Zaporozhets
54b7703682
support torch-speech ( #32537 )
2024-08-19 11:26:35 +02:00
Kamil Akesbi
8260cb311e
Add Descript-Audio-Codec model ( #31494 )
...
* dac model
* original dac works
* add dac model
* dac can be instantiated
* add forward pass
* load weights
* all weights are used
* convert checkpoint script ready
* test
* add feature extractor
* up
* make style
* apply cookiecutter
* fix tests
* iterate on FeatureExtractor
* nit
* update dac doc
* replace nn.Sequential with nn.ModuleList
* nit
* apply review suggestions 1/2
* Update src/transformers/models/dac/modeling_dac.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* up
* apply review suggestions 2/2
* update padding in FeatureExtractor
* apply review suggestions
* iterate on design and tests
* add integration tests
* feature extractor tests
* make style
* all tests pass
* make style
* fixup
* apply review suggestions
* fix-copies
* apply review suggestions
* apply review suggestions
* Update docs/source/en/model_doc/dac.md
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update docs/source/en/model_doc/dac.md
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* anticipate transfer weights to descript
* up
* make style
* apply review suggestions
* update slow test values
* update slow tests
* update test values
* update with CI values
* update with vorace values
* update test with slice
* make style
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
2024-08-19 10:21:51 +01:00
MAHIR DAIYAN
843e5e20ca
Add Flax Dinov2 ( #31960 )
...
* tfmsenv restored in main
* installed flax
* forward pass done and all tests passed
* make fix-copies and cleaning the scripts
* fixup attempt 1
* fixup attempt 2
* fixup third attempt
* fixup attempt 4
* fixup attempt 5
* dinov2 doc fixed
* FlaxDinov2Model + ForImageClassification added to OBJECTS_TO_IGNORE
* external pos_encoding layer removed
* fixup attempt 6
* fixed integration test values
* fixup attempt 7
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* comments removed
* comment removed from the test
* fixup
* Update src/transformers/models/dinov2/modeling_flax_dinov2.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* new fixes 1
* interpolate_pos_encoding function removed
* droppath rng fixed, pretrained beit copied-from still not working
* modeling_flax_dinov2.py reformatted
* Update tests/models/dinov2/test_modeling_flax_dinov2.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* added Copied from, to the tests
* copied from statements removed from tests
* fixed copied from statements in the tests
* [run_slow] dinov2
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2024-08-19 09:28:13 +01:00