Commit Graph

89 Commits

Author SHA1 Message Date
Yih-Dar
f9909fbf85
Make ImageSegmentationPipelineTests less flaky (#20147)
* Fix ImageSegmentationPipelineTests

* Use 0.9

* no zip

* links to show images

* links to show images

* rebase

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-15 09:14:55 +01:00
Nicolas Patry
25c451e5a0
Adding chunking for whisper (all seq2seq actually). Very crude matching algorithm. (#20104)
* Very crude matching algorithm.

* Fixing tests.

* Removing comments

* Adding warning + fix short matches.

* Cleanup tests.

* Quality.

* Less noisy.

* Fixup.
2022-11-14 22:32:50 +01:00
Bartosz Szmelczynski
78a471ff71
Fix tapas scatter (#20149)
* First draft

* Remove scatter dependency

* Add require_torch

* update vectorized sum test, add clone call

* remove artifacts

* fix style

* fix style v2

* remove "scatter" mentions from the code base

* fix isort error

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-14 01:04:26 -05:00
Sylvain Gugger
9740a03f61
Skip broken test 2022-11-10 14:59:32 -05:00
Nicolas Patry
d066c3731b
Adding support for LayoutLMvX variants for object-detection. (#20143)
* Adding support for LayoutLMvX variants for `object-detection`.

* Revert bogs `layoutlm` feature extractor which does not exist (it was a
V2 model) .

* Updated condition.

* Handling the comments.
2022-11-10 11:33:38 +01:00
Nicolas Patry
ec6878f6ca
Now supporting pathlike in pipelines too. (#20030) 2022-11-03 09:14:45 +01:00
Nicolas Patry
5fd5990dce
Factored out some code in the image-segmentation pipeline. (#19727)
* Factored out some code in the image-segmentation pipeline

Re-enable `small_model_pt`.

Re-enable `small_model_pt`.

Enabling the current test with the current values.

Debugging the values on the CI.

More logs ? Printing doesn't work ?

Using the CI values instead. Seems to be a Pillow sensitivity.

Added a test showcasing that models not supporting some tasks get a
clear error.

Factored out code.

Further factor out.

Fixup.

Bad rebase.

Put `panoptic` before `instance` as it should be a superset.

* Fixing tests.

* Adding subtasks tests

+ Fixes `instance` segmentation which was broken due to default and
non kwargs arguments.

* Fix bad replace.
2022-10-26 10:44:36 +02:00
Rak Alexey
d3f4cef74d
fix image2test args forwarding (#19648)
* fix image2test args forwarding

* fix issues

* Proposing the update to the PR.

* Fixup.

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-10-24 09:49:24 -04:00
Alara Dirik
cca51aa151
Fix image segmentation pipeline errors, resolve backward compatibility issues (#19768)
* Fix panoptic segmentation and pipeline
* Update ImageSegmentationPipeline tests and reenable test_small_model_pt
* Resolve backward compatibility issues
2022-10-21 18:09:58 +03:00
Yih-Dar
3aaabaa214
Update ImageToTextPipelineTests.test_small_model_tf (#19785)
* update expected values for the correct TF checkpoint

* Run test

* Clean up

* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-21 14:35:20 +02:00
Nicolas Patry
a40386669f
image-segmentation pipeline: re-enable small_model_pt test. (#19716)
* Re-enable `small_model_pt`.

Re-enable `small_model_pt`.

Enabling the current test with the current values.

Debugging the values on the CI.

More logs ? Printing doesn't work ?

Using the CI values instead. Seems to be a Pillow sensitivity.

* Update src/transformers/pipelines/image_segmentation.py

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
2022-10-20 11:57:11 +02:00
Yih-Dar
bed2edb99f
Specify TF framework explicitly in more pipeline tests (#19748)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-19 16:24:03 +02:00
David Yang
a23819ed6a
Clean up deprecation warnings (#19654)
* Clean up deprecation warnings

Notes:
Changed some strings in tests to raw strings, which will change the literal content of the strings as they are fed into whatever machine handles them.
Test cases for past in the past/past_key_values switch changed/removed due to warning of impending removal

* Add PILImageResampling abstraction for PIL.Image.Resampling
2022-10-18 13:34:47 -04:00
Yih-Dar
06a82a49ae
Specify TF framework in TF-related pipeline tests (#19719)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-18 17:40:28 +02:00
Nicolas Patry
63d13d768b
Improving image-segmentation pipeline tests. (#19710)
This PR (https://github.com/huggingface/transformers/pull/19367) introduced a few breaking changes:

- Removed an argument `mask_threshold`.
- Broke the default behavior (instance vs panoptic in the function call)
  https://github.com/huggingface/transformers/pull/19367/files#diff-60f846b86fb6a21d4caf60f5b3d593a04accb8f248de3029cccae2ff898c5bc3R119-R120
- Broke the actual masks: https://github.com/huggingface/transformers/pull/1961

This PR is the start of a handful that will aim at bringing back the old
behavior(s).

- tests should not have to specify `task` by default, unless we want to
  modify the behavior and have a lower form of segmentation running)
- `test_small_model_pt` should be working.

This specific PR starts with adding more information to the masks hash
because missing the actual mask was actual easy to miss (the hashes do
change, but it was easy to miss that one code path wasn't properly
updated).

So we go from a simple `hash` to
```
{"hash": #smaller hash, "shape": (h, w), "white_pixels": n}
```

The `shape` should help make sure the interpolation of the mask works
correctly, the `white_pixels` hopefully helps detect big regressions in
their amount when the hash gets modified.
2022-10-18 16:33:53 +02:00
Nicolas Patry
ee2a80ecc0
add return_tensors parameter for feature_extraction 2 (#19707)
* add return_tensors parameter for feature_extraction  w/ test

add return_tensor parameter for feature extraction

Revert "Merge branch 'feature-extraction-return-tensor' of https://github.com/ajsanjoaquin/transformers into feature-extraction-return-tensor"

This reverts commit d559da743b87914e111a84a98ba6dbb70d08ad88, reversing
changes made to bbef89278650c04c090beb65637a8e9572dba222.

call parameter directly

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

Fixup.

Update src/transformers/pipelines/feature_extraction.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix the imports.

* Fixing the test by not overflowing the model capacity.

Co-authored-by: AJ San Joaquin <ajsanjoaquin@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-18 16:29:00 +02:00
Nicolas Patry
713eab45d3
🚨 🚨 🚨 [Breaking change] Deformable DETR intermediate representations (#19678)
* [Breaking change] Deformable DETR intermediate representations

- Fixes naturally the `object-detection` pipeline.
- Moves from `[n_decoders, batch_size, ...]` to `[batch_size,
  n_decoders, ...]` instead.

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-18 09:00:39 -04:00
Arthur
d356b89f3c
fix test whisper with new max length (#19668) 2022-10-18 08:56:37 +02:00
Sylvain Gugger
f2ecb9eec4
Revert "add return_tensor parameter for feature extraction (#19257)" (#19680)
This reverts commit 35bd089a24.
2022-10-17 11:56:29 -04:00
Ayrton San Joaquin
35bd089a24
add return_tensor parameter for feature extraction (#19257)
* add return_tensors parameter for feature_extraction  w/ test

add return_tensor parameter for feature extraction

Revert "Merge branch 'feature-extraction-return-tensor' of https://github.com/ajsanjoaquin/transformers into feature-extraction-return-tensor"

This reverts commit d559da743b87914e111a84a98ba6dbb70d08ad88, reversing
changes made to bbef89278650c04c090beb65637a8e9572dba222.

* call parameter directly

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

* Fixup.

* Update src/transformers/pipelines/feature_extraction.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-17 11:17:26 -04:00
Ankur Goyal
cbc1abc4af
A few CI fixes for DocumentQuestionAnsweringPipeline (#19584)
* Fixes

* update expected values

* style

* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-17 15:35:27 +02:00
Matt
3b3024da70
TF port of ESM (#19587)
* Partial TF port for ESM model

* Add ESM-TF tests

* Add the various imports for TF-ESM

* TF weight conversion almost ready

* Stop ignoring the decoder weights in PT

* Add tests and lots of fixes

* fix-copies

* Fix imports, add model docs

* Add get_vocab() to tokenizer

* Fix vocab links for pretrained files

* Allow multiple inputs with a sep

* Use EOS as SEP token because ESM vocab lacks SEP

* Correctly return special tokens mask from ESM tokenizer

* make fixup

* Stop testing unsupported embedding resizing

* Handle TF bias correctly

* Skip all models with slow tokenizers in the token classification test

* Fixing the batch/unbatcher of pipelines to accomodate the `None` being

passed around.

* Fixing pipeline bug caused by slow tokenizer  being different.

* Update src/transformers/models/esm/modeling_tf_esm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/esm/modeling_tf_esm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/esm/modeling_tf_esm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update set_input_embeddings and the copyright notices

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-10-17 14:16:16 +01:00
Sivaudha
8aad4363d8
Fix pipeline predict transform methods (#19657)
* Remove key word argument X from pipeline predict and transform methods

As __call__ of pipeline clasees require one positional argument, passing
the input as a keyword argument inside predict, transform methods, causing
__call__ to fail. Hence in this commit the keyword argument is modified
into positional argument.

* Implement basic tests for scikitcompat pipeline interface

* Seperate tests instead of running with parameterized based on framework as both frameworks will not be active at the same time
2022-10-17 09:06:20 -04:00
Nicolas Patry
463226e2ee
Improve error messaging for ASR pipeline. (#19570)
* Improve error messaging for ASR pipeline.

- Raise error early (in `_sanitize`) so users don't waste time trying to
  run queries with invalid params.

- Fix the error was after using `config.inputs_to_logits_ratio` so our
  check was masked by the failing property does not exist.

- Added some manual check on s2t for the error message.
  No non ctc model seems to be used by the default runner (they are all
  skipped).

* Removing pdb.

* Stop the early error it doesn't really work :(.
2022-10-14 17:12:21 +02:00
Yih-Dar
62f28bc152
Fix ImageToTextPipelineTests.test_small_model_tf (#19565)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-10-14 16:29:54 +02:00
amyeroberts
83a2e694f1
Cast masks to np.unit8 before converting to PIL.Image.Image (#19616)
* Cast masks to np.unit8 before converting to PIL.Image.Image

* Update tests

* Fixup
2022-10-14 09:30:45 -04:00
Ritik Nandwal
e94384e4d8
Add depth estimation pipeline (#18618)
* Add initial files for depth estimation pipelines

* Add test file for depth estimation pipeline

* Update model mapping names

* Add updates for depth estimation output

* Add generic test

* Hopefully fixing the tests.

* Check if test passes

* Add make fixup and make fix-copies changes after rebase with main

* Rebase with main

* Fixing up depth pipeline.

* This is not used anymore.

* Fixing the test. `Image` is a module `Image.Image` is the type.

* Update docs/source/en/main_classes/pipelines.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-10-12 08:54:20 -04:00
Quancore
70a058bc65
Added tokenize keyword arguments to feature extraction pipeline (#19382)
* Added tokenize keyword arguments to feature extraction pipeline

* Reverted truncation parameter

* Import numpy moved to top
2022-10-11 12:54:41 -04:00
Ankur Goyal
a3008c5a6d
Implement multiple span support for DocumentQuestionAnswering (#19204)
* Implement multiple span support

* Address comments

* Add tests + fix bugs
2022-10-11 10:47:55 -04:00
Arthur
b722a6be72
Fix whisper for pipeline (#19482)
* update feature extractor params

* update attention mask handling

* fix doc and pipeline test

* add warning when skipping test

* add whisper translation and transcription test

* fix build doc test
2022-10-11 07:17:53 -04:00
Sylvain Gugger
d92e22d1f2
Remove ref to is_pipeline_test 2022-10-07 21:38:07 -04:00
Sylvain Gugger
9ac586b3c8
Rework pipeline tests (#19366)
* Rework pipeline tests

* Try to fix Flax tests

* Try to put it before

* Use a new decorator instead

* Remove ignore marker since it doesn't work

* Filter pipeline tests

* Woopsie

* Use the fitlered list

* Clean up and fake modif

* Remove init

* Revert fake modif
2022-10-07 18:01:58 -04:00
Alara Dirik
983451a13e
Improve and fix ImageSegmentationPipeline (#19367)
- Fixes the image segmentation pipeline test failures caused by changes to the postprocessing methods of supported models
- Updates the ImageSegmentationPipeline tests
- Improves docs, adds 'task' argument to optionally perform semantic, instance or panoptic segmentation
2022-10-07 23:34:41 +03:00
Amrit Sahu
e9a49babee
[WIP] Add ZeroShotObjectDetectionPipeline (#18445) (#18930)
* Add ZeroShotObjectDetectionPipeline (#18445)

* Add AutoModelForZeroShotObjectDetection task

This commit also adds the following

- Add explicit _processor method for ZeroShotObjectDetectionPipeline.
  This is necessary as pipelines don't auto infer processors yet and
  `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
  process multiple images at once

- Add auto tests and other tests for ZeroShotObjectDetectionPipeline

* Add AutoModelForZeroShotObjectDetection task

This commit also adds the following

- Add explicit _processor method for ZeroShotObjectDetectionPipeline.
  This is necessary as pipelines don't auto infer processors yet and
  `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
  process multiple images at once

- Add auto tests and other tests for ZeroShotObjectDetectionPipeline

* Add batching for ZeroShotObjectDetectionPipeline

* Fix doc-string ZeroShotObjectDetectionPipeline

* Fix output format: ZeroShotObjectDetectionPipeline
2022-10-07 10:00:19 -04:00
Sylvain Gugger
7e7f62bfa7
Fix pipeline tests for Roberta-like tokenizers (#19365)
* Fix pipeline tests for Roberta-like tokenizers

* Fix fix
2022-10-05 17:48:14 -04:00
Sylvain Gugger
c875a96eb1
Test failing test while we resolve the issue. (#19355) 2022-10-05 12:23:48 -04:00
Karim Foda
e396358104
Add stop sequence to text generation pipeline (#18444) 2022-09-30 14:26:51 +01:00
Sylvain Gugger
f16bbf1475
Skip pipeline tests (#19248) 2022-09-29 12:25:15 -04:00
Ankur Goyal
67403413bd
Change document question answering pipeline to always return an array (#19071)
Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-09-20 15:17:57 +02:00
amyeroberts
30a28f5227
Update image segmentation pipeline test (#18731)
* Updated test values

The image segmentation pipeline tests - tests/pipelines/test_pipelines_image_segmentation.py - were failing after the merging of #1849  (49e44b216b). This was due to the difference in rescaling. Previously the images were rescaled by `image = image / 255`. In the new commit, a `rescale` method was added, and images rescaled using `image = image * scale`. This was known to cause small differences in the processed images (see
[PR comment](https://github.com/huggingface/transformers/pull/18499#discussion_r940347575)).

Testing locally, changing the `rescale` method to divide by a scale factor (255) resulted in the tests passing. It was therefore decided the test values could be updated, as there was no logic difference between the commits.

* Use double quotes, like previous example

* Fix up
2022-09-15 07:32:31 -04:00
Yih-Dar
6a9726ec0e
Fix DocumentQuestionAnsweringPipelineTests (#19023)
* Fix DocumentQuestionAnsweringPipelineTests

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-14 16:13:20 +02:00
NielsRogge
59407bbeb3
Add Deformable DETR (#17281)
* First draft

* More improvements

* Improve model, add custom CUDA code

* Import torch before

* Add script that imports custom layer

* Add everything in new ops directory

* Import custom layer in modeling file

* Fix ARCHIVE_MAP typo

* Creating the custom kernel on the fly.

* Import custom layer in modeling file

* More improvements

* Fix CUDA loading

* More improvements

* Improve conversion script

* Improve conversion script

* Make it work until encoder_outputs

* Make forward pass work

* More improvements

* Make logits match original implementation

* Make implementation also support single_scale model

* Add support for single_scale and dilation checkpoint

* Add support for with_box_refine model

* Support also two stage model

* Improve tests

* Fix more tests

* Make more tests pass

* Upload all models to the hub

* Clean up some code

* Improve decoder outputs

* Rename intermediate hidden states and reference points

* Improve model outputs

* Move tests to dedicated folder

* Improve model outputs

* Fix retain_grad test

* Improve docs

* Clean up and make test_initialization pass

* Improve variable names

* Add copied from statements

* Improve docs

* Fix style

* Improve docs

* Improve docs, move tests to model folder

* Fix rebase

* Remove DetrForSegmentation from auto mapping

* Apply suggestions from code review

* Improve variable names and docstrings

* Apply some more suggestions from code review

* Apply suggestion from code review

* better docs and variables names

* hint to num_queries and two_stage confusion

* remove asserts and code refactor

* add exception if two_stage is True and with_box_refine is False

* use f-strings

* Improve docs and variable names

* Fix code quality

* Fix rebase

* Add require_torch_gpu decorator

* Add pip install ninja to CI jobs

* Apply suggestion of @sgugger

* Remove DeformableDetrForObjectDetection from auto mapping

* Remove DeformableDetrModel from auto mapping

* Add model to toctree

* Add model back to mappings, skip model in pipeline tests

* Apply @sgugger's suggestion

* Fix imports in the init

* Fix copies

* Add CPU implementation

* Comment out GPU function

* Undo previous change

* Apply more suggestions

* Remove require_torch_gpu annotator

* Fix quality

* Add logger.info

* Fix logger

* Fix variable names

* Fix initializaztion

* Add missing initialization

* Update checkpoint name

* Add model to doc tests

* Add CPU/GPU equivalence test

* Add Deformable DETR to pipeline tests

* Skip model for object detection pipeline

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-09-14 11:45:21 +02:00
Ankur Goyal
2ef7742117
Add DocumentQuestionAnswering pipeline (#18414)
* [WIP] Skeleton of VisualQuestionAnweringPipeline extended to support LayoutLM-like models

* Fixup

* Use the full encoding

* Basic refactoring to DocumentQuestionAnsweringPipeline

* Cleanup

* Improve args, docs, and implement preprocessing

* Integrate OCR

* Refactor question_answering pipeline

* Use refactored QA code in the document qa pipeline

* Fix tests

* Some small cleanups

* Use a string type annotation for Image.Image

* Update encoding with image features

* Wire through the basic docs

* Handle invalid response

* Handle empty word_boxes properly

* Docstring fix

* Integrate Donut model

* Fixup

* Incorporate comments

* Address comments

* Initial incorporation of tests

* Address Comments

* Change assert to ValueError

* Comments

* Wrap `score` in float to make it JSON serializable

* Incorporate AutoModeLForDocumentQuestionAnswering changes

* Fixup

* Rename postprocess function

* Fix auto import

* Applying comments

* Improve docs

* Remove extra assets and add copyright

* Address comments

Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-09-07 13:38:49 -04:00
Sylvain Gugger
71ff88fa4f
Further reduce the number of alls to head for cached objects (#18871)
* Further reduce the number of alls to head for cached models/tokenizers/pipelines

* Fix tests

* Address review comments
2022-09-06 12:34:37 -04:00
OlivierDehaene
129d73294e
Fix naming issue with ImageToText pipeline (#18864)
Co-authored-by: Olivier Dehaene <olivier@huggingface.co>
2022-09-02 07:55:30 -04:00
OlivierDehaene
ddb69e5af8
Add Image To Text Generation pipeline (#18821)
* Add Image2TextGenerationPipeline to supported pipelines

* Add Flax and Tensorflow support

* Add Flax and Tensorflow small tests

* Add default model for Tensorflow

* Add docstring

* Fix doc style

* Add tiny models for pytorch and flax

* Remove flax from pipeline.
Fix tests

* Use ydshieh/vit-gpt2-coco-en as a default for both PyTorch and Tensorflow

* Fix Tensorflow support

Co-authored-by: Olivier Dehaene <olivier@huggingface.co>
2022-09-01 12:07:14 -04:00
Sylvain Gugger
0d0aada564
Use commit hash to look in cache instead of calling head (#18534)
* Use commit hash to look in cache instead of calling head

* Add tests

* Add attr for local configs too

* Stupid typos

* Fix tests

* Update src/transformers/utils/hub.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address Julien's comments

Co-authored-by: Julien Chaumond <julien@huggingface.co>
2022-08-10 11:55:18 -04:00
Nicolas Patry
9f5fe63548
Adding a new align_to_words param to qa pipeline. (#18010)
* Adding a new `align_to_words` param to qa pipeline.

* Update src/transformers/pipelines/question_answering.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Import protection.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-08-09 18:50:02 +02:00
Nicolas Patry
a4562552eb
[DX fix] Fixing QA pipeline streaming a dataset. (#18516)
* [DX fix] Fixing QA pipeline streaming a dataset.

QuestionAnsweringArgumentHandler would iterate over the whole dataset
effectively killing all properties of the pipeline.
This restores nice properties when using `Dataset` or `Generator` since
those are meant to be consumed lazily.

* Handling TF better.
2022-08-08 14:25:56 +02:00
Yih-Dar
280db2e39c
Fix test_dbmdz_english by updating expected values (#18482)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-05 16:49:54 +02:00