Alara Dirik
|
12d66b4701
|
Add OWL-ViT model for zero-shot object detection (#17938)
* add owlvit model skeleton
* add class and box predictor heads
* convert modified flax clip to pytorch
* fix box and class predictors
* add OwlViTImageTextEmbedder
* convert class and box head checkpoints
* convert image text embedder checkpoints
* add object detection head
* fix bugs
* update conversion script
* update conversion script
* fix q,v,k,out weight conversion conversion
* add owlvit object detection output
* fix bug in image embedder
* fix bugs in text embedder
* fix positional embeddings
* fix bug in inference mode vision pooling
* update docs, init tokenizer and processor files
* support batch processing
* add OwlViTProcessor
* remove merge conflicts
* readd owlvit imports
* fix bug in OwlViTProcessor imports
* fix bugs in processor
* update docs
* fix bugs in processor
* update owlvit docs
* add OwlViTFeatureExtractor
* style changes, add postprocess method to feature extractor
* add feature extractor and processor tests
* add object detection tests
* update conversion script
* update config paths
* update config paths
* fix configuration paths and bugs
* fix bugs in OwlViT tests
* add import checks to processor
* fix docs and minor issues
* fix docs and minor issues
* fix bugs and issues
* fix bugs and issues
* fix bugs and issues
* fix bugs and issues
* update docs and examples
* fix bugs and issues
* update conversion script, fix positional embeddings
* process 2D input ids, update tests
* fix style and quality issues
* update docs
* update docs and imports
* update OWL-ViT index.md
* fix bug in OwlViT feature ext tests
* fix code examples, return_dict by default
* return_dict by default
* minor fixes, add tests to processor
* small fixes
* add output_attentions arg to main model
* fix bugs
* remove output_hidden_states arg from main model
* update self.config variables
* add option to return last_hidden_states
* fix bug in config variables
* fix copied from statements
* fix small issues and bugs
* fix bugs
* fix bugs, support greyscale images
* run fixup
* update repo name
* merge OwlViTImageTextEmbedder with obj detection head
* fix merge conflict
* fix merge conflict
* make fixup
* fix bugs
* fix bugs
* add additional processor test
|
2022-07-22 13:35:32 +03:00 |
|
Jerry Jiarui XU
|
6c8f4c9a93
|
Adding GroupViT Models (#17313)
* add group vit and fixed test (except slow)
* passing slow test
* addressed some comments
* fixed test
* fixed style
* fixed copy
* fixed segmentation output
* fixed test
* fixed relative path
* fixed copy
* add ignore non auto configured
* fixed docstring, add doc
* fixed copies
* Apply suggestions from code review
merge suggestions
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* resolve comment, renaming model
* delete unused attr
* use fix copies
* resolve comments
* fixed attn
* remove unused vars
* refactor tests
* resolve final comments
* add demo notebook
* fixed inconsitent default
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* rename stage->stages
* Create single GroupViTEncoderLayer class
* Update conversion script
* Simplify conversion script
* Remove cross-attention class in favor of GroupViTAttention
* Convert other model as well, add processor to conversion script
* addressing final comment
* fixed args
* Update src/transformers/models/groupvit/modeling_groupvit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
|
2022-06-28 20:51:47 +02:00 |
|