* Add the support for the fast (rust) implementation of BlenbderbotTokenizer
* Fix a converter and a typo in a doc
* Apply the patil-suraj's suggestion
* (Nitpick) Fast tokenization -> Fast Tokenization in doc
* Apply the SaulLu's suggestion
* Apply Narsil's suggestion to fix test pipelines
* Add encoder_no_repeat_ngram_size according to the Narsil's suggestion
* Revert the last (unnecessary) commit
* Override pipeline config for Blenderbot to allow for larger pos. emb.
* make fix-copies
* Enabling dataset iteration on pipelines.
Enabling dataset iteration on pipelines.
Unifying parameters under `set_parameters` function.
Small fix.
Last fixes after rebase
Remove print.
Fixing text2text `generate_kwargs`
No more `self.max_length`.
Fixing tf only conversational.
Consistency in start/stop index over TF/PT.
Speeding up drastically on TF (nasty bug where max_length would increase
a ton.)
Adding test for support for non fast tokenizers.
Fixign GPU usage on zero-shot.
Fix working on Tf.
Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Small cleanup.
Remove all asserts + simple format.
* Fixing audio-classification for large PR.
* Overly explicity null checking.
* Encapsulating GPU/CPU pytorch manipulation directly within `base.py`.
* Removed internal state for parameters of the pipeline.
Instead of overriding implicitly internal state, we moved
to real named arguments on every `preprocess`, `_forward`,
`postprocess` function.
Instead `_sanitize_parameters` will be used to split all kwargs
of both __init__ and __call__ into the 3 kinds of named parameters.
* Move import warnings.
* Small fixes.
* Quality.
* Another small fix, using the CI to debug faster.
* Last fixes.
* Last fix.
* Small cleanup of tensor moving.
* is not None.
* Adding a bunch of docs + a iteration test.
* Fixing doc style.
* KeyDataset = None guard.
* RRemoving the Cuda test for pipelines (was testing).
* Even more simple iteration test.
* Correct import .
* Long day.
* Fixes in docs.
* [WIP] migrating object detection.
* Fixed the target_size bug.
* Fixup.
* Bad variable name.
* Fixing `ensure_on_device` respects original ModelOutput.
* Moving `zero-shot-classification` pipeline to new testing.
* Cleaning up old mixins.
* Fixing tests
`sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english` is
corrupted in PT.
* Adding warning.
- Enforce `test_small_models_{tf,pt}` methods to exist (enforce checking
actual values in small tests)
- Add support for non RGB image for the pipeline.
* New test format for conversational.
* Putting back old mixin.
* Re-enabling auto tests with LazyLoading.
* Feature extraction tests.
* Remove feature-extraction.
* Feature extraction with feature_extractor (No pun intended).
* Update check_model_type for fill-mask.
* Fill mask pipelines test updates.
* Model eval !!
* Adding slow test with actual values.
* Making all tests pass (skipping quite a bit.)
* Doc styling.
* Better doc cleanup.
* Making an explicit test with no pad token tokenizer.
* Typo.
* Update feature extraction pipelilne.
* Leaving 1 small model for actual values check.
* Fixes tests
- Better support for tokenizer with no pad token
- Increasing PegasusModelTesterConfig for pipelines
- Test of feature extraction are more permissive + don't test Multimodel
models + encoder-decoder.
* Fixing model loading with incorrect shape (+ model with HEAD).
* Update tests/test_pipelines_common.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Revert modeling_utils modification.
* Some corrections.
* Update tests/test_pipelines_common.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_pipelines_feature_extraction.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Syntax.
* Fixing text-classification tests.
* Don't modify this file.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Proposal
* Testing pipelines slightly better.
- Overall same design
- Metaclass to get proper different tests instead of subTest (not well
supported by Pytest)
- Added ANY meta object to make output checking more readable.
- Skipping architectures either without tiny_config or without
architecture.
* Small fix.
* Fixing the tests in case of None value.
* Oups.
* Rebased with more architectures.
* Fixing reformer tests (no override anymore).
* Adding more options for model tester config.
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Fixing roberta for slow-fast tests
* WIP getting equivalence on pipelines
* slow-to-fast equivalence - working on question-answering pipeline
* optional FAISS tests
* Pipeline Q&A
* Move pipeline tests to their own test job again
* update tokenizer to add sequence id methods
* update to tokenizers 0.9.4
* set sentencepiecce as optional
* clean up squad
* clean up pipelines to use sequence_ids
* style/quality
* wording
* Switch to use_fast = True by default
* update tests for use_fast at True by default
* fix rag tokenizer test
* removing protobuf from required dependencies
* fix NER test for use_fast = True by default
* fixing example tests (Q&A examples use slow tokenizers for now)
* protobuf in main deps extras["sentencepiece"] and example deps
* fix protobug install test
* try to fix seq2seq by switching to slow tokenizers for now
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* WIP refactoring pipeline tests - switching to fast tokenizers
* fix dialog pipeline and fill-mask
* refactoring pipeline tests backbone
* make large tests slow
* fix tests (tf Bart inactive for now)
* fix doc...
* clean up for merge
* fixing tests - remove bart from summarization until there is TF
* fix quality and RAG
* Add new translation pipeline tests - fix JAX tests
* only slow for dialog
* Fixing the missing TF-BART imports in modeling_tf_auto
* spin out pipeline tests in separate CI job
* adding pipeline test to CI YAML
* add slow pipeline tests
* speed up tf and pt join test to avoid redoing all the standalone pt and tf tests
* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
* Update src/transformers/pipelines.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/pipelines.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add require_torch and require_tf in is_pt_tf_cross_test
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>