* [Proposal] Breaking change `zero-shot-object-detection` for improved
consistency.
This is a proposal to modify the output of `zero-shot-object-detection`
to provide better alignment with other pipelines.
The output is now strictly the same as `object-detection` whereas before
it would output lists of lists.
The name `candidate_labels` is used throughout for consistency with
other `zero-shot` pipelines.
The pipeline is changed to `ChunkPipeline` to support batching cleanly.
This removes all the list and list-of-lists shenanigans; handling them is
now the job of the base pipeline, not of this specific one.
**Breaking change**: Complex calls such as `pipe(images=[image1, image2],
text_queries=[candidates1, candidates2])` are no longer supported. Only
`pipe([{"image": image1, "candidate_labels": candidates1}, {"image": image2, "candidate_labels": candidates2}])`
is supported when dealing with lists and/or datasets.
We could keep the old calls, but that would add a lot of complexity to the
code base. Since the pipeline is rather young, I'd rather break to keep the
code simpler, but we can revert this.
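For callers migrating from the old call style, the conversion can be sketched as below. This is only an illustration: `to_pipeline_inputs` is a hypothetical helper, not part of the pipeline, and it assumes one list of candidate labels per image.

```python
from typing import Any, Dict, List, Sequence


def to_pipeline_inputs(
    images: Sequence[Any], text_queries: Sequence[Sequence[str]]
) -> List[Dict[str, Any]]:
    """Pair each image with its candidate labels (hypothetical helper)."""
    if len(images) != len(text_queries):
        raise ValueError("Expected one list of candidate labels per image.")
    return [
        {"image": image, "candidate_labels": list(labels)}
        for image, labels in zip(images, text_queries)
    ]


# Placeholder strings stand in for real images (paths, URLs or PIL images).
inputs = to_pipeline_inputs(
    ["image1.png", "image2.png"],
    [["cat", "dog"], ["car"]],
)
```

The resulting list of dicts has the shape expected by `pipe(inputs)`.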
**Breaking change**: The name of the argument is now `image` instead of
`images`, since by default it expects only one image. Like the previous
change, this is revertible.
**Breaking change**: The output type is now simplified and flattened:
`pipe(inputs) == [{**object1}, {**object2}]`
instead of the previous
`pipe(inputs) == [[{**object1}, {**object1}], [{**object2}]]`
where the different instances were grouped by candidate label
within lists.
IMHO this is not really desirable, since it would output empty lists and
only adds superfluous indirection compared to
`zero-shot-object-detection`.
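The relationship between the old nested output and the new flat one can be sketched as follows. `flatten_detections` is a hypothetical helper, and the detections are made-up values that only mirror the field names of `object-detection` output.

```python
from typing import Any, Dict, List

Detection = Dict[str, Any]


def flatten_detections(nested: List[List[Detection]]) -> List[Detection]:
    """Collapse per-label groups into a single flat list of detections."""
    return [detection for group in nested for detection in group]


# Old style: one inner list per candidate label, possibly empty.
old_output = [
    [{"label": "cat", "score": 0.9, "box": {"xmin": 1, "ymin": 2, "xmax": 30, "ymax": 40}}],
    [],  # a candidate label with no detections
    [{"label": "dog", "score": 0.7, "box": {"xmin": 5, "ymin": 6, "xmax": 50, "ymax": 60}}],
]

new_output = flatten_detections(old_output)
```

Each detection already carries its own `label`, so the grouping by candidate label adds no information and empty groups simply disappear.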
It is relatively change-free in terms of results. It does change the
computation, however, since batching is now handled by the pipeline
itself. It **did** change the results for the small models, so there
seems to be a real difference in how the models handle this.
* Fixing the doctests.
* Behind is_torch_available.
* Add ResNetBackbone
* Define channels and strides as property
* Remove file
* Add test for backbone
* Update BackboneOutput class
* Remove strides property
* Fix docstring
* Add backbones to SHOULD_HAVE_THEIR_OWN_PAGE
* Fix auto mapping name
* Add sanity check for out_features
* Set stage names based on depths
* Update to tuple
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
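The stage naming and the `out_features` sanity check listed above can be sketched roughly as follows. This assumes a `stem` stage followed by one stage per entry in `depths`; the function names are hypothetical and the code is not taken from the backbone itself.

```python
from typing import List, Optional, Sequence


def build_stage_names(depths: Sequence[int]) -> List[str]:
    """Derive stage names from the number of entries in `depths`."""
    return ["stem"] + [f"stage{idx}" for idx in range(1, len(depths) + 1)]


def check_out_features(
    out_features: Optional[Sequence[str]], stage_names: Sequence[str]
) -> None:
    """Raise if a requested feature does not correspond to a known stage."""
    if out_features is None:
        return
    for feature in out_features:
        if feature not in stage_names:
            raise ValueError(
                f"Feature {feature!r} is not a valid stage; "
                f"choose from {', '.join(stage_names)}."
            )


stage_names = build_stage_names([3, 4, 6, 3])
check_out_features(["stage2", "stage4"], stage_names)  # passes silently
```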
Adds an image-guided object detection method to the OwlViTForObjectDetection class, as described in the original paper. One-shot / image-guided object detection lets users search for objects in the input image that are similar to those in a query image.
Co-Authored-By: Dhruv Karan <k4r4n.dhruv@gmail.com>
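Conceptually, image-guided detection replaces the text query with an embedding derived from a query image and keeps the regions of the target image that resemble it. The sketch below only illustrates that matching idea with plain cosine similarity on toy vectors; it is not the OwlViT implementation, and `match_regions`, the threshold and the embeddings are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_regions(
    query_embedding: Sequence[float],
    region_embeddings: Sequence[Sequence[float]],
    threshold: float = 0.8,
) -> List[Tuple[int, float]]:
    """Return (region index, similarity) pairs above `threshold`, best first."""
    scored = [
        (index, cosine_similarity(query_embedding, embedding))
        for index, embedding in enumerate(region_embeddings)
    ]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy 2-d embeddings: regions 0 and 2 point roughly the same way as the query.
regions = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
matches = match_regions([1.0, 0.0], regions)
```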
* Slightly alter Keras dummy loss
* Add sample weight to test_keras_fit
* Fix test_keras_fit for datasets
* Skip the sample_weight stuff for models where the model tester has no batch_size
* allow loading projection in text and vision model
* begin tests
* finish test for CLIPTextModelTest
* style
* add slow tests
* add new classes for projection heads
* remove with_projection
* add in init
* add in doc
* fix tests
* fix some more tests
* fix copies
* fix docs
* remove leftover from fix-copies
* add the head models in IGNORE_NON_AUTO_CONFIGURED
* fix docstr
* fix tests
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add docstr for models
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Try PT1.13 by removing torch scatter
* Skip failing tests
* Style
* Remove testing extras for repo utils
* Try with all decorators
* Try to wipe the cache
* Fix all tests?
* Try this way
* Fix comma
* Update to main
* Try with less deps
* Quality
* add `accelerate` support for `ViT` family
- add `_no_split_modules`
- manually cast to the right `dtype`: to change
* enable `float16` for `deit`
* fix `make fixup`
* add `slow` test for `fp16` inference
* another safety check
* Update src/transformers/models/deit/modeling_deit.py
* update relative positional embedding
* make fix copies
* add `use_cache` to list of arguments
* fixup
* 1-line function
* add `test_decoder_model_past_with_large_inputs_relative_pos_emb`
* add relative pos embedding test for more models
* style
* Fix ImageSegmentationPipelineTests
* Use 0.9
* no zip
* links to show images
* rebase
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* add model files etc for MobileNetV2
* rename files for MobileNetV1
* initial implementation of MobileNetV1
* fix conversion script
* cleanup
* write docs
* tweaks
* fix conversion script
* extract hidden states
* fix test cases
* make fixup
* fixup it all
* rename V1 to V2
* fix checkpoints
* fixup
* implement first block + weight conversion
* add remaining layers
* add output stride and dilation
* fixup
* add tests
* add deeplabv3+ head
* a bit of fixup
* finish deeplab conversion
* add link to doc
* fix issue with JIT trace
in_height and in_width would be Tensor objects during JIT trace, which caused Core ML conversion to fail on the remainder op. By making them ints, the result of the padding calculation becomes a constant value.
* cleanup
* fix order of models
* fix rebase error
* remove main from doc link
* add image processor
* remove old feature extractor
* fix converter + other issues
* fixup
* fix unit test
* add to onnx tests (but these appear broken now)
* add post_process_semantic_segmentation
* use google org
* remove unused imports
* move args
* replace weird assert
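The JIT trace fix noted above comes down to doing the padding arithmetic on plain Python ints rather than on traced tensors. A minimal sketch of a TensorFlow-style "SAME" padding calculation along one dimension is shown below; `same_padding` is a hypothetical name, dilation is ignored, and the code is an illustration rather than the model's own function.

```python
from typing import Tuple


def same_padding(in_size: int, kernel_size: int, stride: int) -> Tuple[int, int]:
    """Return (pad_before, pad_after) for TF-style SAME padding on one axis."""
    # Casting to int keeps the arithmetic out of the trace, so the result
    # is a constant instead of an op on a Tensor.
    in_size = int(in_size)
    if in_size % stride == 0:
        pad_total = max(kernel_size - stride, 0)
    else:
        pad_total = max(kernel_size - (in_size % stride), 0)
    pad_before = pad_total // 2
    pad_after = pad_total - pad_before
    return pad_before, pad_after


even_case = same_padding(224, kernel_size=3, stride=2)
odd_case = same_padding(225, kernel_size=3, stride=2)
```

As in TensorFlow, any odd amount of padding goes on the trailing side.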
* Apply fix
* Fix test
* Remove another argument which is not used
* Fix pipeline test
* Add argument back, add deprecation warning
* Add warning at other location
* Use warnings instead
* Add num_channels to config
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
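Keeping an unused argument while warning about it, as in the commits above, typically looks like the sketch below. The function and argument names (`preprocess`, `legacy_option`) are hypothetical and chosen only for illustration; the log does not say which argument was affected.

```python
import warnings
from typing import Any, Optional


def preprocess(image: Any, legacy_option: Optional[bool] = None) -> Any:
    """Accept a deprecated argument but ignore it after warning the caller."""
    if legacy_option is not None:
        warnings.warn(
            "`legacy_option` is deprecated, has no effect, and will be "
            "removed in a future version.",
            FutureWarning,
        )
    return image


# Record warnings so the demo does not print to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = preprocess("image.png", legacy_option=True)
```

Using the `warnings` module rather than a logger lets callers filter the message or turn it into an error in their tests.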
* Adding support for LayoutLMvX variants for `object-detection`.
* Revert bogus `layoutlm` feature extractor which does not exist (it was a
V2 model).
* Updated condition.
* Handling the comments.
* move generation_*.py src files into generation/*.py
* populate generation.__init__ with lazy loading
* move imports and references from generation.xxx.object to generation.object
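The idea behind a lazily loaded `__init__` is that a submodule is imported only when one of its objects is first accessed. The sketch below is a simplified stand-in, not the library's implementation: `LazyModule` and the import structure are assumptions, and standard-library modules are used so the example runs on its own.

```python
import importlib
import types
from typing import Any, Dict, List


class LazyModule(types.ModuleType):
    """Module whose attributes are imported on first access."""

    def __init__(self, name: str, import_structure: Dict[str, List[str]]):
        super().__init__(name)
        # Map each exported object name to the module that defines it.
        self._attr_to_module = {
            attr: module_name
            for module_name, attrs in import_structure.items()
            for attr in attrs
        }

    def __getattr__(self, name: str) -> Any:
        # Only called when normal attribute lookup fails.
        attr_to_module = self.__dict__.get("_attr_to_module", {})
        if name not in attr_to_module:
            raise AttributeError(
                f"module {self.__name__!r} has no attribute {name!r}"
            )
        module = importlib.import_module(attr_to_module[name])
        value = getattr(module, name)
        setattr(self, name, value)  # cache so later lookups are direct
        return value


lazy = LazyModule("lazy_demo", {"json": ["dumps"], "math": ["sqrt"]})
root = lazy.sqrt(9.0)  # `math` is resolved here, on first access
```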
* Attempting to test automatically the `_keys_to_ignore`.
* Style.
* First fix pass.
* Moving test on its own.
* Another batch.
* Second round removing BatchNorm
* Fixing layoutlmv{2,3} + support older Python.
* Disable miss missing warning.
* Removing dodgy additions.
* Big pass.
* mbart.
* More corrections.
* Fixup.
* Updating test_correct_missing_keys
* Add escape hatch for when the head has no extra params so doesn't need
the missing keys check.
* Fixing test.
* Greener.
* Green ! (except for weird splinter bug).
* Adding a test about `named_parameters` usage.
* Shorten message.
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* After rebase modifications.
* More explicit condition checking.
* Fixing slow tests issues.
* Remove extra pdb.
* Remove print.
* Attempt to make failure consistent + fixing roc_bert.
* Removing the seed (all tests passing with it).
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add first draft
* Update conversion script
* Improve conversion script
* Improve conversion script some more
* Add conditional embeddings
* Add initial decoder
* Fix activation function of decoder
* Make decoder outputs match original implementation
* Add more copied from statements
* Improve model outputs
* Fix auto tokenizer file
* Fix more tests
* Add test
* Improve README and docs, improve conditional embeddings
* Fix more tests
* Remove print statements
* Remove initial embeddings
* Improve conversion script
* Add interpolation of position embeddings
* Finish addition of interpolation of position embeddings
* Add support for refined checkpoint
* Fix refined checkpoint
* Remove unused parameter
* Improve conversion script
* Add support for training
* Fix conversion script
* Add CLIPSegFeatureExtractor
* Fix processor
* Fix CLIPSegProcessor
* Fix conversion script
* Fix most tests
* Fix equivalence test
* Fix README
* Add model to doc tests
* Use better variable name
* Convert other checkpoint as well
* Update config, add link to paper
* Add docs
* Update organization
* Replace base_model_prefix with clip
* Fix base_model_prefix
* Fix checkpoint of config
* Fix config checkpoint
* Remove file
* Use logits for output
* Fix tests
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Add test for SentencePiece not adding special tokens to strings
* Add SentencePieceStringConversionMixin to fix issue 15003
* Fix conversion from tokens to string for most SentencePiece tokenizers
Tokenizers fixed:
- AlbertTokenizer
- BarthezTokenizer
- CamembertTokenizer
- FNetTokenizer
- M2M100Tokenizer
- MBart50Tokenizer
- PegasusTokenizer
- Speech2TextTokenizer
* Fix MarianTokenizer, adjust SentencePiece test to accommodate vocab
* Fix DebertaV2Tokenizer
* Ignore LayoutXLMTokenizer in SentencePiece string conversion test
* Run 'make style' and 'make quality'
* Clean convert_tokens_to_string test
Instead of explicitly ignoring LayoutXLMTokenizer in the test,
override the test in LayoutLMTokenizationTest and do nothing in it.
* Remove commented out code
* Improve robustness of convert_tokens_to_string test
Instead of comparing lengths of re-tokenized text and input_ids,
check that converting all special tokens to string yields a string
with all special tokens.
* Inline and remove SentencePieceStringConversionMixin
The convert_tokens_to_string method is now implemented
in each relevant SentencePiece tokenizer.
* Run 'make style' and 'make quality'
* Revert removal of space in convert_tokens_to_string
* Remove redundant import
* Revert test text to original
* Uncomment the lowercasing of the reverse_text variable
* Mimic Rust tokenizer behavior for tokenizers
- Albert
- Barthez
- Camembert
- MBart50
- T5
* Fix accidentally skipping test in wrong tokenizer
* Add test for equivalent Rust and slow tokenizer behavior
* Override _decode in BigBirdTokenizer to mimic Rust behavior
* Override _decode in FNetTokenizer to mimic Rust behavior
* Override _decode in XLNetTokenizer to mimic Rust behavior
* Remove unused 're' import
* Update DebertaV2Tokenizer to mimic Rust tokenizer
* Deberta tokenizer now behaves like Albert and its `convert_tokens_to_string` is not tested.
* Ignore problematic tests in Deberta V2
* Add comment on why the Deberta V2 tests are skipped
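The `convert_tokens_to_string` fix described above amounts to decoding runs of ordinary pieces while emitting special tokens verbatim, so that they are not dropped by the SentencePiece decoder. The sketch below illustrates that approach under stated assumptions: `decode_pieces` is a toy stand-in for the real SentencePiece model, and the function is a free function rather than a tokenizer method.

```python
from typing import Callable, Collection, List, Sequence

SPIECE_UNDERLINE = "\u2581"  # the word-boundary marker used by SentencePiece


def decode_pieces(pieces: Sequence[str]) -> str:
    """Toy stand-in for SentencePiece decoding (an assumption of this sketch)."""
    return "".join(pieces).replace(SPIECE_UNDERLINE, " ").strip()


def convert_tokens_to_string(
    tokens: Sequence[str],
    special_tokens: Collection[str],
    decode: Callable[[Sequence[str]], str] = decode_pieces,
) -> str:
    """Decode ordinary pieces in runs and keep special tokens as they are."""
    current_sub_tokens: List[str] = []
    out_string = ""
    for token in tokens:
        if token in special_tokens:
            # Flush the pending pieces, then append the special token untouched.
            out_string += decode(current_sub_tokens) + token
            current_sub_tokens = []
        else:
            current_sub_tokens.append(token)
    out_string += decode(current_sub_tokens)
    return out_string.strip()


tokens = [SPIECE_UNDERLINE + "Hello", "<mask>", SPIECE_UNDERLINE + "world", "</s>"]
text = convert_tokens_to_string(tokens, special_tokens={"<mask>", "</s>"})
```

With this shape, converting a sequence that contains every special token yields a string that still contains every special token, which is the property the reworked test checks.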