* Add initial files for depth estimation pipelines
* Add test file for depth estimation pipeline
* Update model mapping names
* Add updates for depth estimation output
* Add generic test
* Hopefully fixing the tests.
* Check if test passes
* Add make fixup and make fix-copies changes after rebase with main
* Rebase with main
* Fixing up depth pipeline.
* This is not used anymore.
* Fixing the test. `Image` is a module; `Image.Image` is the type.
* Update docs/source/en/main_classes/pipelines.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
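
The `Image` versus `Image.Image` fix in the depth estimation commits above can be illustrated with a minimal sketch. It assumes Pillow is the imaging library in question; `describe_input` is a hypothetical helper, not part of the pipeline. The import sits under `TYPE_CHECKING` so the snippet runs even where Pillow is not installed.

```python
from typing import TYPE_CHECKING, Union

if TYPE_CHECKING:
    # Only evaluated by type checkers, so this sketch runs without Pillow.
    from PIL import Image


def describe_input(image: Union[str, "Image.Image"]) -> str:
    """Hypothetical helper: tell a path/URL string apart from an image object."""
    # `Image` is the PIL module; `Image.Image` is the class used for annotations
    # and isinstance checks. Annotating with the module itself would be wrong.
    if isinstance(image, str):
        return "path-or-url"
    return "image-object"
```

Writing the annotation as a string keeps the function definition valid even when the module is only imported for type checking.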
* Add ZeroShotObjectDetectionPipeline (#18445)
* Add AutoModelForZeroShotObjectDetection task
This commit also adds the following
- Add explicit _processor method for ZeroShotObjectDetectionPipeline.
This is necessary as pipelines don't auto infer processors yet and
`OwlVitProcessor` wraps tokenizer and feature_extractor together, to
process multiple images at once
- Add auto tests and other tests for ZeroShotObjectDetectionPipeline
* Add batching for ZeroShotObjectDetectionPipeline
* Fix doc-string ZeroShotObjectDetectionPipeline
* Fix output format: ZeroShotObjectDetectionPipeline
* [WIP] Skeleton of VisualQuestionAnsweringPipeline extended to support LayoutLM-like models
* Fixup
* Use the full encoding
* Basic refactoring to DocumentQuestionAnsweringPipeline
* Cleanup
* Improve args, docs, and implement preprocessing
* Integrate OCR
* Refactor question_answering pipeline
* Use refactored QA code in the document qa pipeline
* Fix tests
* Some small cleanups
* Use a string type annotation for Image.Image
* Update encoding with image features
* Wire through the basic docs
* Handle invalid response
* Handle empty word_boxes properly
* Docstring fix
* Integrate Donut model
* Fixup
* Incorporate comments
* Address comments
* Initial incorporation of tests
* Address Comments
* Change assert to ValueError
* Comments
* Wrap `score` in float to make it JSON serializable
* Incorporate AutoModelForDocumentQuestionAnswering changes
* Fixup
* Rename postprocess function
* Fix auto import
* Applying comments
* Improve docs
* Remove extra assets and add copyright
* Address comments
Co-authored-by: Ankur Goyal <ankur@impira.com>
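
Two of the document QA commits above ("Change assert to ValueError" and "Wrap `score` in float to make it JSON serializable") can be sketched together. This is an illustration under stated assumptions, not the pipeline's actual code: `format_answer` is a hypothetical helper, and `fractions.Fraction` stands in for a framework scalar (such as a NumPy or tensor value) that `json.dumps` rejects.

```python
import json
from fractions import Fraction


def format_answer(score, answer: str) -> dict:
    """Hypothetical postprocessing helper, not the pipeline's actual code."""
    if not answer:
        # Raise ValueError instead of asserting: asserts are stripped
        # when Python runs with the -O flag.
        raise ValueError("answer must be a non-empty string")
    # float() converts a non-native numeric scalar into a plain Python float,
    # which the json module can serialize.
    return {"score": float(score), "answer": answer}


# Fraction is a stand-in (assumption) for a scalar type json cannot encode.
raw_score = Fraction(3, 4)
payload = json.dumps(format_answer(raw_score, "42"))
```

Passing `raw_score` to `json.dumps` directly raises `TypeError`, which is the failure mode the float wrapping avoids.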
* First draft
* Add VideoMAEForVideoClassification
* Improve conversion script
* Add VideoMAEForPreTraining
* Add VideoMAEFeatureExtractor
* Improve VideoMAEFeatureExtractor
* Improve docs
* Add first draft of model tests
* Improve VideoMAEForPreTraining
* Fix base_model_prefix
* Make model take pixel_values of shape (B, T, C, H, W)
* Add loss computation of VideoMAEForPreTraining
* Improve tests
* Improve model tests
* Make all tests pass
* Add VideoMAE to main README
* Add tests for VideoMAEFeatureExtractor
* Add integration test
* Improve conversion script
* Rename patch embedding class
* Remove VideoMAELayer from init
* Update design of patch embeddings
* Improve comments
* Improve conversion script
* Improve conversion script
* Add conversion of pretrained model
* Add loss verification of pretrained model
* Add loss verification of unnormalized targets
* Add integration test for pretraining model
* Apply suggestions from code review
* Fix bug to make feature extractor resize only shorter edge
* Address more comments
* Improve normalization of videos
* Add doc examples
* Move constants to dedicated script
* Remove scripts
* Transfer checkpoints, fix docs
* Update script
* Update image mean and std
* Fix doc tests
* Set return_tensors to NumPy by default
* Revert the previous change
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
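
The "resize only shorter edge" fix in the VideoMAE commits above refers to scaling a frame so that its shorter side matches a target while the aspect ratio is preserved. A minimal sketch of that size computation follows; `shorter_edge_size` is a hypothetical function and the rounding choice is an assumption, so it may differ from what `VideoMAEFeatureExtractor` actually does.

```python
from typing import Tuple


def shorter_edge_size(height: int, width: int, target: int) -> Tuple[int, int]:
    """Return (height, width) after scaling so the shorter edge equals `target`."""
    if min(height, width) <= 0 or target <= 0:
        raise ValueError("dimensions and target must be positive")
    # One scale factor for both axes keeps the aspect ratio unchanged.
    scale = target / min(height, width)
    if height <= width:
        return target, round(width * scale)
    return round(height * scale), target
```

Only the shorter side is pinned to the target; the longer side follows from the scale factor, so a later crop can bring the frame to a square if needed.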