mirror of
https://github.com/huggingface/transformers.git
synced 2025-07-13 17:48:22 +06:00
* Fixed typo when converting weigths to GroundingDINO vision backbone * Final modifications on modeling * Removed unnecessary class * Fixed convert structure * Added image processing * make fixup partially completed * Now text_backbone_config has its own class * Modified convert script * Removed unnecessary config attribute * Added new function to generate sub sentence mask * Renamed parameters with gamma in the name as it's currently not allowed * Removed tokenization and image_processing scripts since we'll map from existing models * Fixed some issues with configuration * Just some modifications on conversion script * Other modifications * Copied deformable detr * First commit * Added bert to model * Bert validated * Created Text and Fusion layers for Encoder * Adapted Encoder layer * Fixed typos * Adjusted Encoder * Converted encoder to hf * Modified Decoder Layer * Modified main decoder class * Removed copy comments * Fixed forward from GroundingDINOModel and GroundingDINODecoder * Added all necessary layers, configurations and forward logic up to GroundingDINOModel * Added all layers to convertion * Fixed outputs for GroundingDINOModel and GroundingDINOForObjectDetection * Fixed mask input to encoders and fixed nn.MultiheadAttention batch first and attn output * Fixed forward from GroundingDINOTextEnhancerLayer * Fixed output bug with GroundingDINODeformableLayer * Fixed bugs that prevent GroundingDINOForObjectDetection to run forward method * Fixed attentions to be passed correctly * Passing temperature arg when creating Sine position embedding * Removed copy comments * Added temperature argument for position embedding * Fixed typo when converting weigths to GroundingDINO vision backbone * Final modifications on modeling * Removed unnecessary class * Fixed convert structure * Added image processing * make fixup partially completed * Now text_backbone_config has its own class * Modified convert script * Removed unnecessary config attribute * Added new
function to generate sub sentence mask * Renamed parameters with gamma in the name as it's currently not allowed * Removed tokenization and image_processing scripts since we'll map from existing models * Fixed some issues with configuration * Just some modifications on conversion script * Other modifications * Fix style * Improve fixup * Improve conversion script * Improve conversion script * Add GroundingDINOProcessor * More improvements * Return token type ids * something * Fix more tests * More improvements * More cleanup * More improvements * Fixed tests, improved modeling and config * More improvements and fixing tests * Improved tests and modeling * Improved tests and added image processor * Improved tests inference * More improvements * More test improvements * Fixed last test * Improved docstrings and comments * Fix style * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Better naming * Better naming * Added Copied statement * Added Copied statement * Moved param init from GroundingDINOBiMultiHeadAttention * Better naming * Fixing clamp style * Better naming * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update 
src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/grounding_dino/configuration_grounding_dino.py Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Update src/transformers/models/grounding_dino/convert_grounding_dino_to_hf.py Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Improving conversion script * Improved config * Improved naming * Improved naming again * Improved grouding-dino.md * Moved grounding dino to multimodal * Update src/transformers/models/grounding_dino/convert_grounding_dino_to_hf.py Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> * Fixed docstrings and style * Fix docstrings * Remove timm attributes * Reorder imports * More improvements * Add Grounding DINO to pipeline * Remove model from check_repo * Added grounded post_process to GroundingDINOProcessor * Fixed style * Fixed GroundingDINOTextPrenetConfig docstrings * Aligned inputs.keys() when both image and text are passed with model_input_names * Added tests for GroundingDINOImageProcessor and GroundingDINOProcessor * Testing post_process_grounded_object_detection from GroundingDINOProcessor at test_inference_object_detection_head * Fixed order * Marked test with require_torch * Temporarily changed repo_id * More improvements * Fix style * Final improvements * Improve annotators * Fix style * Add is_torch_available * Remove type hints * vocab_tokens as one liner * Removed print statements * Renamed GroundingDINOTextPrenetConfig to GroundingDINOTextConfig * remove unnecessary comments * Removed unnecessary tests on conversion script * Renamed GroundingDINO to camel case GroundingDino * Fixed GroundingDinoProcessor 
docstrings * loading MSDA kernels in the modeling file * Fix copies * Replace nn.multiheadattention * Replace nn.multiheadattention * Fixed inputs for GroundingDinoMultiheadAttention & order of modules * Fixed processing to avoid messing with inputs * Added more tips for GroundingDino * Make style * Chaning name to align with SAM * Replace final nn.multiheadattention * Fix model tests * Update year, remove GenerationTesterMixin * Address comments * Address more comments * Rename TextPrenet to TextModel * Rename hidden_states * Address more comments * Address more comments * Address comment * Address more comments * Address merge * Address comment * Address comment * Address comment * Make style * Added layer norm eps to layer norms * Address more comments * More fixes * Fixed equivalence * Make fixup * Remove print statements * Address comments * Address comments * Address comments * Address comments * Address comments * Address comments * Add comment * Address comment * Remove overwriting of test * Fix bbox_embed * Improve decoder_bbox_embed_share * Simplify outputs * Updated post_process_grounded_object_detection * Renamed sources to feature_maps * Improved tests for Grounding Dino ImageProcessor and Processor * Fixed test requirements and imports * Fixed image_processing * Fixed processor tests * Fixed imports for image processing tests * Fix copies * Updated modeling * Fix style * Moved functions to correct position * Fixed copy issues * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * Keeping consistency custom cuda kernels for MSDA * 
Make GroundingDinoProcessor logic clearer * Updated Grounding DINO checkpoints * Changed tests to correct structure * Updated gpu-cpu equivalence test * fix copies * Update src/transformers/models/grounding_dino/processing_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/processing_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/modeling_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/grounding_dino/configuration_grounding_dino.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixed erros and style * Fix copies * Removed inheritance from PreTrainedModel from GroundingDinoTextModel * Fixed GroundingDinoTextModel * Fixed type of default backbone config * Fixed missing methods for GroundingDinoTextModel and Added timm support for GroundingDinoConvEncoder * Addressed comments * Addressed batched image processing tests * Addressed zero shot test comment * Addressed tip comment * Removed GroundingDinoTextModel from check_repo * Removed inplace masking * Addressed comments * Addressed comments * Addressed comments * Fix copies * Fixing timm test * Fixed batching equivalence test * Update docs/source/en/model_doc/grounding-dino.md Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com> * Update docs/source/en/model_doc/grounding-dino.md Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com> * Update docs/source/en/model_doc/grounding-dino.md Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com> * Addressed more comments * Added a new comment * Reduced image size * Addressed more comments * Nits * Nits * Changed the way text_config is initialized * Update src/transformers/models/grounding_dino/processing_grounding_dino.py 
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Niels <niels.rogge1@gmail.com> Co-authored-by: Rafael Padilla <31217453+rafaelpadilla@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com> Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Tianqi Xu <40522713+dandansamax@users.noreply.github.com> |
||
---|---|---|
.. | ||
albert.md | ||
align.md | ||
altclip.md | ||
audio-spectrogram-transformer.md | ||
auto.md | ||
autoformer.md | ||
bark.md | ||
bart.md | ||
barthez.md | ||
bartpho.md | ||
beit.md | ||
bert-generation.md | ||
bert-japanese.md | ||
bert.md | ||
bertweet.md | ||
big_bird.md | ||
bigbird_pegasus.md | ||
biogpt.md | ||
bit.md | ||
blenderbot-small.md | ||
blenderbot.md | ||
blip-2.md | ||
blip.md | ||
bloom.md | ||
bort.md | ||
bridgetower.md | ||
bros.md | ||
byt5.md | ||
camembert.md | ||
canine.md | ||
chinese_clip.md | ||
clap.md | ||
clip.md | ||
clipseg.md | ||
clvp.md | ||
code_llama.md | ||
codegen.md | ||
cohere.md | ||
conditional_detr.md | ||
convbert.md | ||
convnext.md | ||
convnextv2.md | ||
cpm.md | ||
cpmant.md | ||
ctrl.md | ||
cvt.md | ||
data2vec.md | ||
deberta-v2.md | ||
deberta.md | ||
decision_transformer.md | ||
deformable_detr.md | ||
deit.md | ||
deplot.md | ||
depth_anything.md | ||
deta.md | ||
detr.md | ||
dialogpt.md | ||
dinat.md | ||
dinov2.md | ||
distilbert.md | ||
dit.md | ||
donut.md | ||
dpr.md | ||
dpt.md | ||
efficientformer.md | ||
efficientnet.md | ||
electra.md | ||
encodec.md | ||
encoder-decoder.md | ||
ernie_m.md | ||
ernie.md | ||
esm.md | ||
falcon.md | ||
fastspeech2_conformer.md | ||
flan-t5.md | ||
flan-ul2.md | ||
flaubert.md | ||
flava.md | ||
fnet.md | ||
focalnet.md | ||
fsmt.md | ||
funnel.md | ||
fuyu.md | ||
gemma.md | ||
git.md | ||
glpn.md | ||
gpt_bigcode.md | ||
gpt_neo.md | ||
gpt_neox_japanese.md | ||
gpt_neox.md | ||
gpt-sw3.md | ||
gpt2.md | ||
gptj.md | ||
gptsan-japanese.md | ||
graphormer.md | ||
grounding-dino.md | ||
groupvit.md | ||
herbert.md | ||
hubert.md | ||
ibert.md | ||
idefics.md | ||
imagegpt.md | ||
informer.md | ||
instructblip.md | ||
jukebox.md | ||
kosmos-2.md | ||
layoutlm.md | ||
layoutlmv2.md | ||
layoutlmv3.md | ||
layoutxlm.md | ||
led.md | ||
levit.md | ||
lilt.md | ||
llama.md | ||
llama2.md | ||
llava_next.md | ||
llava.md | ||
longformer.md | ||
longt5.md | ||
luke.md | ||
lxmert.md | ||
m2m_100.md | ||
madlad-400.md | ||
mamba.md | ||
marian.md | ||
markuplm.md | ||
mask2former.md | ||
maskformer.md | ||
matcha.md | ||
mbart.md | ||
mctct.md | ||
mega.md | ||
megatron_gpt2.md | ||
megatron-bert.md | ||
mgp-str.md | ||
mistral.md | ||
mixtral.md | ||
mluke.md | ||
mms.md | ||
mobilebert.md | ||
mobilenet_v1.md | ||
mobilenet_v2.md | ||
mobilevit.md | ||
mobilevitv2.md | ||
mpnet.md | ||
mpt.md | ||
mra.md | ||
mt5.md | ||
musicgen_melody.md | ||
musicgen.md | ||
mvp.md | ||
nat.md | ||
nezha.md | ||
nllb-moe.md | ||
nllb.md | ||
nougat.md | ||
nystromformer.md | ||
oneformer.md | ||
open-llama.md | ||
openai-gpt.md | ||
opt.md | ||
owlv2.md | ||
owlvit.md | ||
patchtsmixer.md | ||
patchtst.md | ||
pegasus_x.md | ||
pegasus.md | ||
perceiver.md | ||
persimmon.md | ||
phi.md | ||
phobert.md | ||
pix2struct.md | ||
plbart.md | ||
poolformer.md | ||
pop2piano.md | ||
prophetnet.md | ||
pvt_v2.md | ||
pvt.md | ||
qdqbert.md | ||
qwen2_moe.md | ||
qwen2.md | ||
rag.md | ||
realm.md | ||
recurrent_gemma.md | ||
reformer.md | ||
regnet.md | ||
rembert.md | ||
resnet.md | ||
retribert.md | ||
roberta-prelayernorm.md | ||
roberta.md | ||
roc_bert.md | ||
roformer.md | ||
rwkv.md | ||
sam.md | ||
seamless_m4t_v2.md | ||
seamless_m4t.md | ||
segformer.md | ||
seggpt.md | ||
sew-d.md | ||
sew.md | ||
siglip.md | ||
speech_to_text_2.md | ||
speech_to_text.md | ||
speech-encoder-decoder.md | ||
speecht5.md | ||
splinter.md | ||
squeezebert.md | ||
stablelm.md | ||
starcoder2.md | ||
superpoint.md | ||
swiftformer.md | ||
swin.md | ||
swin2sr.md | ||
swinv2.md | ||
switch_transformers.md | ||
t5.md | ||
t5v1.1.md | ||
table-transformer.md | ||
tapas.md | ||
tapex.md | ||
time_series_transformer.md | ||
timesformer.md | ||
trajectory_transformer.md | ||
transfo-xl.md | ||
trocr.md | ||
tvlt.md | ||
tvp.md | ||
udop.md | ||
ul2.md | ||
umt5.md | ||
unispeech-sat.md | ||
unispeech.md | ||
univnet.md | ||
upernet.md | ||
van.md | ||
videomae.md | ||
vilt.md | ||
vipllava.md | ||
vision-encoder-decoder.md | ||
vision-text-dual-encoder.md | ||
visual_bert.md | ||
vit_hybrid.md | ||
vit_mae.md | ||
vit_msn.md | ||
vit.md | ||
vitdet.md | ||
vitmatte.md | ||
vits.md | ||
vivit.md | ||
wav2vec2_phoneme.md | ||
wav2vec2-bert.md | ||
wav2vec2-conformer.md | ||
wav2vec2.md | ||
wavlm.md | ||
whisper.md | ||
xclip.md | ||
xglm.md | ||
xlm-prophetnet.md | ||
xlm-roberta-xl.md | ||
xlm-roberta.md | ||
xlm-v.md | ||
xlm.md | ||
xlnet.md | ||
xls_r.md | ||
xlsr_wav2vec2.md | ||
xmod.md | ||
yolos.md | ||
yoso.md |