NielsRogge
bb6f6d5338
Add X-CLIP ( #18852 )
...
* First draft
* Improve conversion script
* Make vision encoder work
* More improvements
* Improve conversion script
* Fix quality
* Add MultiframeIntegrationTransformer
* More improvements
* Make MiT output work
* Fix quality
* Add prompts generator
* Add tests
* Fix some tests
* Fix some more tests
* Fix more tests
* Improve conversion script
* Fix model outputs
* Fix more tests
* Add XClipProcessor
* Use processor in conversion script
* Fix integration test
* Update README, fix docs
* Fix all tests
* Add MIT output to XClipOutput
* Create better variable names
* Rename XClip to XCLIP
* Extend conversion script
* Add support for large models
* Add support for 16 frame models
* Add another model
* Fix module issue
* Apply suggestions from code review
* Add figure to docs
* Fix CLIPProcessor issue
* Apply suggestions from code review
* Delete file
* Convert more checkpoints
* Convert last checkpoint
* Update nielsr to microsoft
2022-09-08 14:50:30 +02:00
Alara Dirik
12d66b4701
Add OWL-ViT model for zero-shot object detection ( #17938 )
...
* add owlvit model skeleton
* add class and box predictor heads
* convert modified flax clip to pytorch
* fix box and class predictors
* add OwlViTImageTextEmbedder
* convert class and box head checkpoints
* convert image text embedder checkpoints
* add object detection head
* fix bugs
* update conversion script
* update conversion script
* fix q,v,k,out weight conversion
* add owlvit object detection output
* fix bug in image embedder
* fix bugs in text embedder
* fix positional embeddings
* fix bug in inference mode vision pooling
* update docs, init tokenizer and processor files
* support batch processing
* add OwlViTProcessor
* remove merge conflicts
* readd owlvit imports
* fix bug in OwlViTProcessor imports
* fix bugs in processor
* update docs
* fix bugs in processor
* update owlvit docs
* add OwlViTFeatureExtractor
* style changes, add postprocess method to feature extractor
* add feature extractor and processor tests
* add object detection tests
* update conversion script
* update config paths
* update config paths
* fix configuration paths and bugs
* fix bugs in OwlViT tests
* add import checks to processor
* fix docs and minor issues
* fix docs and minor issues
* fix bugs and issues
* fix bugs and issues
* fix bugs and issues
* fix bugs and issues
* update docs and examples
* fix bugs and issues
* update conversion script, fix positional embeddings
* process 2D input ids, update tests
* fix style and quality issues
* update docs
* update docs and imports
* update OWL-ViT index.md
* fix bug in OwlViT feature ext tests
* fix code examples, return_dict by default
* return_dict by default
* minor fixes, add tests to processor
* small fixes
* add output_attentions arg to main model
* fix bugs
* remove output_hidden_states arg from main model
* update self.config variables
* add option to return last_hidden_states
* fix bug in config variables
* fix copied from statements
* fix small issues and bugs
* fix bugs
* fix bugs, support greyscale images
* run fixup
* update repo name
* merge OwlViTImageTextEmbedder with obj detection head
* fix merge conflict
* fix merge conflict
* make fixup
* fix bugs
* fix bugs
* add additional processor test
2022-07-22 13:35:32 +03:00
Jerry Jiarui XU
6c8f4c9a93
Adding GroupViT Models ( #17313 )
...
* add group vit and fixed test (except slow)
* passing slow test
* addressed some comments
* fixed test
* fixed style
* fixed copy
* fixed segmentation output
* fixed test
* fixed relative path
* fixed copy
* add ignore non auto configured
* fixed docstring, add doc
* fixed copies
* Apply suggestions from code review
merge suggestions
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* resolve comment, renaming model
* delete unused attr
* use fix copies
* resolve comments
* fixed attn
* remove unused vars
* refactor tests
* resolve final comments
* add demo notebook
* fixed inconsistent default
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* rename stage->stages
* Create single GroupViTEncoderLayer class
* Update conversion script
* Simplify conversion script
* Remove cross-attention class in favor of GroupViTAttention
* Convert other model as well, add processor to conversion script
* addressing final comment
* fixed args
* Update src/transformers/models/groupvit/modeling_groupvit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-06-28 20:51:47 +02:00