transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-17 19:48:23 +06:00

Author	SHA1	Message	Date
Younes Belkada	914289ac4b	[`pipeline`] Fix str device issue (#24396 ) * fix str device issue * fixup * adapt from suggestions * forward contrib credits from suggestions * better fix * added backward compatibility for older PT versions * final fixes * oops * Attempting something with less branching. --------- Co-authored-by: amyeroberts <amyeroberts@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-26 13:58:36 +02:00
Sanchit Gandhi	8767958fc1	Allow dict input for audio classification pipeline (#23445 ) * Allow dict input for audio classification pipeline * make style * Empty commit to trigger CI * Empty commit to trigger CI * check for torchaudio * add pip instructions Co-authored-by: Sylvain <sylvain.gugger@gmail.com> * Update src/transformers/pipelines/audio_classification.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * asr -> audio class * asr -> audio class --------- Co-authored-by: Sylvain <sylvain.gugger@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-06-23 13:50:37 +01:00
Yih-Dar	652ece0710	Skip `test_conditional_generation_pt_pix2struct` in Past CI (torch < 1.11) (#24417 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-22 15:34:13 +02:00
Matthijs Hollemans	cd927a4736	add word-level timestamps to Whisper (#23205 ) * let's go! * initial implementation of token-level timestamps * only return a single timestamp per token * remove token probabilities * fix return type * fix doc comment * strip special tokens * rename * revert to not stripping special tokens * only support models that have alignment_heads * add integration test * consistently name it token-level timestamps * small DTW tweak * initial support for ASR pipeline * fix pipeline doc comments * resolve token timestamps in pipeline with chunking * change warning when no final timestamp is found * return word-level timestamps * fixup * fix bug that skipped final word in each chunk * fix failing unit tests * merge punctuations into the words * also return word tokens * also return token indices * add (failing) unit test for combine_tokens_into_words * make combine_tokens_into_words private * restore OpenAI's punctuation rules * add pipeline tests * make requested changes * PR review changes * fix failing pipeline test * small stuff from PR * only return words and their timestamps, not segments * move alignment_heads into generation config * forgot to set alignment_heads in pipeline tests * tiny comment fix * grr	2023-06-21 17:48:21 +02:00
Yih-Dar	c23d131eab	Update tiny models for pipeline testing. (#24364 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-20 14:43:10 +02:00
Yih-Dar	eac8dede83	Skip some `TQAPipelineTests` tests in past CI (#24267 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-14 14:25:24 +02:00
Yih-Dar	d0d1632958	Fix Pipeline CI OOM issue (#24124 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-06-09 16:49:02 +02:00
NielsRogge	2f424d7979	[image-to-text pipeline] Add conditional text support + GIT (#23362 ) * First draft * Remove print statements * Add conditional generation * Add more tests * Remove scripts * Remove BLIP specific linkes * Add support for pix2struct * Add fast test * Address comment * Fix style	2023-05-22 21:45:50 +02:00
Yih-Dar	5777c3cb3f	Fix (skip) a pipeline test for `RwkvModel` (#23444 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-05-18 14:54:23 +02:00
Joao Gante	b369e507aa	Generate: text generation pipeline no longer emits `max_length` warning when it is not set (#23139 )	2023-05-04 18:36:23 +01:00
Yih-Dar	975159bb61	Update tiny models and a few fixes (#22928 ) * run_check_tiny_models * update summary * update mixin * update pipeline_model_mapping * update pipeline_model_mapping * Update for gpt_bigcode --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-24 14:45:22 +02:00
Yih-Dar	1e1cb6f8e5	Fix `FillMaskPipelineTests` (#22894 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-21 15:16:45 +02:00
Arthur	f143037789	Add `automatic-mask-generation` pipeline for Segment Anything Model (SAM) (#22840 ) * cleanup * updates * more refactoring * make style * update inits * support other inputs in base * update based on review Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> * Update tests/pipelines/test_pipelines_automatic_mask_generation.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * update * fixup * TODO x and y to refactor, _h _w refactored here * update docstring * more nits * style on these * more doc fix * rename variables * update * updates * style * update * fix `_mask_to_rle_pytorch` * styling * fix ask to rle, wrong outputs * add device arg * update * more updates, fix tets * udpate * update docstrings * styling * fixup * add notebook on the docs * update orginal sizes * fix docstring * updat condition on point_per-batch * updates tests * fix CI test * extend is required, append does not work! * fixup * fix CI tests * whit pixels left * address doc comments * fix doc * slow pipeline tests * update auto init * add revision * make fixup * update p!ipoeline tag when calling tests * alphabeitcal order in inits * fix copies * last style nits * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * reformat docstring * more reformat * address most of the comments * Update src/transformers/pipelines/mask_generation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * final refactor * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixup and fix slow tests * revert --------- Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-04-20 19:27:24 +02:00
Yih-Dar	5269718cb7	Don't use `LayoutLMv2` and `LayoutLMv3` in some pipeline tests (#22774 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-17 17:45:20 +02:00
Nicolas Patry	a515d0a77c	Soft error whisper. (#22475 ) * Soft error whisper. * Fix format. --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-94.taildb5d.ts.net>	2023-04-04 16:21:57 +02:00
Sylvain Gugger	80e3b36361	Really fix quality due to ruff release	2023-03-22 20:56:22 -04:00
Sylvain	ef28df0572	Fix quality due to ruff release	2023-03-22 20:45:08 -04:00
Luc CAILLIAU	d62e7d8842	Chunkable token classification pipeline (#21771 ) * Chunkable classification pipeline The TokenClassificationPipeline is now able to process sequences longer than 512. No matter the framework, the model, the tokenizer. We just have to pass process_all=True and a stride number (optional). The behavior remains the same if you don't pass these optional parameters. For overlapping parts when using stride above 0, we consider only the max scores for each overlapped token in all chunks where the token is. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * update with latest black format * update black format * Update token_classification.py * Update token_classification.py * format correction * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update comments * Update src/transformers/pipelines/token_classification.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * Update token_classification.py Correct spaces, remove process_all and keep only stride. If stride is provided, the pipeline is applied to the whole text. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update chunk aggregation Update the chunk aggregation strategy based on entities aggregation. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py Remove unnecessary pop from outputs dict * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add chunking tests * correct formating * correct formatting * correct model id for test chunking * update scores with nested simplify * Update test_pipelines_token_classification.py * Update test_pipelines_token_classification.py * update model to a tiny one * Update test_pipelines_token_classification.py * Adding smaller test for chunking. * Fixup * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-22 14:13:20 -04:00
Yih-Dar	5110e5748e	🔥py38 + torch 2 🔥🔥🔥🚀 (#22204 ) * py38 + torch 2 * increment cache versions --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-16 22:59:23 +01:00
Sylvain Gugger	42ad693b7b	Regression pipeline device (#22190 ) * Fix regression in pipeline when device=-1 is passed * Add regression test	2023-03-15 14:13:38 -04:00
Lucain	923110b74f	Remove set_access_token usage + fail tests if FutureWarning (#22051 ) * Remove set_access_token usage + fail tests if FutureWarning * do not fail on FutureWarning in CI --------- Co-authored-by: testbot <lucainp@hf.co>	2023-03-09 09:23:48 -05:00
Yih-Dar	dfe9a31973	Update `AudioClassificationPipelineTests::test_small_model_pt` for PT 2.0.0 (#22023 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-08 13:56:47 +01:00
Nicolas Patry	1325459105	Refactor whisper asr pipeline to include language too. (#21427 ) * [WIP] whisper refacto to support language output. * Handling merges. * A bit more cleanup and comments. * Many improvements. Lots of details everywhere. * Cleanup old code and tests. * Handle lone timestamp tokens (just recover when something bad happens). * Adding return_language example. * No ffmpeg. * Hmm. * Some corrections. * Both fast and slow. * New black. * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove print. * Undoing tests modifications. * Smaller test modifications. * Rename. * Remove maxDiff. --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-03-02 18:12:19 +01:00
Sylvain Gugger	50a8ed3ee0	Mark pipeline tests to skip them easily (#21887 ) * Mark pipeline tests to skip them easily * Mark the mixin as pipeline test * Update src/transformers/testing_utils.py Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2023-03-02 10:55:36 -05:00
Yih-Dar	871c31a6f1	🔥Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516 ) * Add PipelineTesterMixin * remove class PipelineTestCaseMeta * move validate_test_components * Add for ViT * Add to SPECIAL_MODULE_TO_TEST_MAP * style and quality * Add feature-extraction * update * raise instead of skip * add tiny_model_summary.json * more explicit * skip tasks not in mapping * add availability check * Add Copyright * A way to diable irrelevant tests * update with main * remove disable_irrelevant_tests * skip tests * better skip message * better skip message * Add all pipeline task tests * revert * Import PipelineTesterMixin * subclass test classes with PipelineTesterMixin * Add pipieline_model_mapping * Fix import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix one more import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix test issues * Fix import requirements * Fix mapping for MobileViTModelTest * Update * Better skip message * pipieline_model_mapping could not be None * Remove some PipelineTesterMixin * Fix typo * revert tests_fetcher.py * update * rename * revert * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests * style and quality * test fetcher for all pipeline/model tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-28 19:40:57 +01:00
Arthur	cc44e72d14	[Pipeline] Add zero shot audio classificatoin pipeline (#21600 ) * add pipeline * update init * add zero shot to init * update inits and correct checkpoints * update base to support input features * add tests * Update src/transformers/pipelines/zero_shot_audio_classification.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/pipelines/zero_shot_audio_classification.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * update pieline code * use tiny checkpoint * nits and expected value with tiny model * style * last nit on tests values * fix styling * fix collate fn that was casting t float * update --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-02-27 11:43:44 +01:00
Connor Henderson	279008adc3	fix: Change is_last chunk calc and add conditional break in chunk_iter (#21612 ) * fix: Change is_last chunk calc and add conditional break * format fix * account for 0 and full stride_rights, add comment * add new test * make style * update slow whisper asr test timestamps * use nested_simplify on output and round timestamp to hundreths place	2023-02-24 08:30:32 +01:00
Aaron Gokaslan	5e8c8eb5ba	Apply ruff flake8-comprehensions (#21694 )	2023-02-22 09:14:54 +01:00
Jonatan Kłosko	deafc24388	Add WhisperTokenizerFast (#21222 ) * Add WhisperTokenizerFast * Fixup * Up * Up * Improve tests * Update src/transformers/models/whisper/tokenization_whisper_fast.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Keep stride in whisper pipelien test * Remove unknown token special case * Reduce vocabulary size in tests * Fix vocab size assertion * Sync copied changes from WhisperTokenizer * Skip pipeline tests * Update assertion * Remove Whisper tokenizer dependency on sentencepiece * Format --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-02-21 06:58:54 +01:00
Connor Henderson	0f96c26de6	refactor: Make direct_transformers_import util (#21652 ) * refactor: Make direct_import util * edit direct import fn * add docstring * make import function specific to transformers only * edit doc string	2023-02-16 11:32:32 -05:00
Sylvain Gugger	9d1116e995	Update deprecated load_module (#21651 )	2023-02-15 15:57:24 -05:00
Younes Belkada	f83942684d	[`pipeline`] A simple fix for half-precision & 8bit models (#21479 ) * v1 fix * adapt from suggestions * make style * fix tests * add gpu tests * update docs * fix other tests * Apply suggestions from code review Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * better fix * make fixup * better example * revert changes * proposal * more elegant solution * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-10 10:26:17 +01:00
Sylvain Gugger	6f79d26442	Update quality tooling for formatting (#21480 ) * Result of black 23.1 * Update target to Python 3.7 * Switch flake8 to ruff * Configure isort * Configure isort * Apply isort with line limit * Put the right black version * adapt black in check copies * Fix copies	2023-02-06 18:10:56 -05:00
Yih-Dar	a6d8a149a8	Fix some pipeline tests (#21401 ) * fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-02 19:03:31 +01:00
Yih-Dar	c749bd405e	Pipeline testing - using tiny models on Hub (#20426 ) * rework pipeline tests * run pipeline tests * fix * fix * fix * revert the changes in get_test_pipeline() parameter list * fix expected error message * skip a test * clean up --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-01-30 10:39:43 +01:00
Nicolas Patry	8788fd0ceb	Moving to cleaner tokenizer version or `oneformer`. (#21292 ) Moving to cleaner tokenizer version.	2023-01-25 15:46:10 +01:00
Arthur	255257f3ea	[Whisper] Refactor whisper (#21252 ) * update whisper logit processor * add generate for whisper * remove part of the whisper specific code from pipeline * update logit processes * major update * enforce first timestamp * update generate * add more tests * update new decoding strategy * Apply suggestions from code review * update docstring * fixup * default config will not have multilingual ar * update expected tokenizer size, see pull on the hub for whisper-tiny	2023-01-25 13:09:43 +01:00
Nicolas Patry	99e7905422	Supporting `ImageProcessor` in place of `FeatureExtractor` for pipelines (#20851 ) * Fixing the pipeline with image processor. * Update the slow test. * Using only the first image processor. * Include exclusion mecanism for Image processor. * Do not handle Gitconfig, deemed as a bug. * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove `conversational` changes. They are not supposed to be here. * Address first row of comments. * Remove OneFormer modifications. Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-01-25 10:16:31 +01:00
Arthur	b80b2218b5	[ci-daily] Fix pipeline tests (#21257 ) * use streaming dataset * fix whisper's test * add rescale argument to chunk_iter	2023-01-23 19:32:49 +01:00
Arthur	5d3cb760a0	[Whispe] Fix pipeline after timestamp merges (#21198 ) * pass return_timestamps to pre-process * add a test to test it * test does not need device 0 * remove failing bit * update test	2023-01-20 10:31:40 +01:00
Arthur	e9b4800dda	[Whisper] Fix timestamp processor (#21187 ) * add draft logit processor * add template functions * update timesapmt processor parameters * draft script * simplify code * cleanup * fixup and clean * update pipeline * style * clean up previous idea * add tokenization utils * update tokenizer and asr output * fit whisper type * style and update test * clean test * style test * update tests * update error test * udpate code (not based on review yet) * update tokenization * update asr pipeline * update code * cleanup and update test * fmt * remove text verificatino * cleanup * cleanup * add model test * update tests * update code add docstring * update code and add docstring * fix pipeline tests * add draft logit processor add template functions update timesapmt processor parameters draft script simplify code cleanup fixup and clean update pipeline style clean up previous idea add tokenization utils update tokenizer and asr output fit whisper type style and update test clean test style test update tests update error test udpate code (not based on review yet) update tokenization update asr pipeline update code cleanup and update test fmt remove text verificatino cleanup cleanup add model test update tests update code add docstring update code and add docstring fix pipeline tests * Small update. * Fixup. * Tmp. * More support. * Making `forced_decoder_ids` non mandatory for users to set. * update and fix first bug * properly process sequence right after merge if last * tofo * allow list inputs + compute begin index better * start adding tests * add the 3 edge cases * style * format sequences * fixup * update * update * style * test passes, edge cases should be good * update last value * remove Trie * update tests and expec ted values * handle bigger chunk_length * clean tests a bit * refactor chunk iter and clean pipeline * update tests * style * refactor chunk iter and clean pipeline * upade * resolve comments * Apply suggestions from code review Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * take stride right into account * update test expected values * Update code based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com> * major refactor * add correct strides for tests * Update src/transformers/pipelines/automatic_speech_recognition.py * fix whisper timestamp test Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2023-01-19 16:25:56 +01:00
Sylvain Gugger	05e72aa0c4	Adapt repository creation to latest hf_hub (#21158 ) * Adapt repository creation to latest hf_hub * Update all examples * Fix other tests, add Flax examples * Address review comments	2023-01-18 11:14:00 -05:00
Arthur	bb300ac686	Whisper Timestamp processor and prediction (#20620 ) * add draft logit processor * add template functions * update timesapmt processor parameters * draft script * simplify code * cleanup * fixup and clean * update pipeline * style * clean up previous idea * add tokenization utils * update tokenizer and asr output * fit whisper type * style and update test * clean test * style test * update tests * update error test * udpate code (not based on review yet) * update tokenization * update asr pipeline * update code * cleanup and update test * fmt * remove text verificatino * cleanup * cleanup * add model test * update tests * update code add docstring * update code and add docstring * fix pipeline tests * add draft logit processor add template functions update timesapmt processor parameters draft script simplify code cleanup fixup and clean update pipeline style clean up previous idea add tokenization utils update tokenizer and asr output fit whisper type style and update test clean test style test update tests update error test udpate code (not based on review yet) update tokenization update asr pipeline update code cleanup and update test fmt remove text verificatino cleanup cleanup add model test update tests update code add docstring update code and add docstring fix pipeline tests * Small update. * Fixup. * Tmp. * More support. * Making `forced_decoder_ids` non mandatory for users to set. * update and fix first bug * properly process sequence right after merge if last * tofo * allow list inputs + compute begin index better * start adding tests * add the 3 edge cases * style * format sequences * fixup * update * update * style * test passes, edge cases should be good * update last value * remove Trie * update tests and expec ted values * handle bigger chunk_length * clean tests a bit * refactor chunk iter and clean pipeline * update tests * style * refactor chunk iter and clean pipeline * upade * resolve comments * Apply suggestions from code review Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * take stride right into account * update test expected values * Update code based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2023-01-17 15:50:09 +01:00
Nicolas Patry	488a179ce1	Fixing batching pipelines on single items for ChunkPipeline (#21132 ) * Fixing #20783 * Update src/transformers/pipelines/base.py * Fixing some tests. * Fixup. * Remove ffmpeg dep + a bit more relaxed for bigbird QA precision. * Better dataset. * Prevent failing on TF. * Better condition. We can't use `can_use_iterator` since we cannot use it directly.	2023-01-16 15:04:27 +01:00
Yih-Dar	b3a0aad37d	Fix past CI (#20967 ) * Fix for Past CI * make style * clean up * unindent 2 blocks Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-01-12 18:04:21 +01:00
Arthur	e3ecbaa4ab	Patch-past-refactor (#21050 ) * small patches, forgot a line * refactor PT * the actual fix	2023-01-09 18:12:13 +01:00
Sylvain Gugger	9a046cc14e	Skip failing test until Athur looks at it.	2023-01-08 04:53:20 -05:00
Alara Dirik	cd2457809f	Improve OWL-ViT postprocessing (#20980 ) * add post_process_object_detection method * style changes	2023-01-03 19:25:09 +03:00
NielsRogge	9c6f7485a6	Add GIT (GenerativeImage2Text) (#20295 ) * First draft * Make model instantiation work * Fix copied from statement * More fixes * Add correct output head * Improve configuration * Add conversion script * Improve conversion script * Remove token_type_ids * Fix conversion of projection layers * Convert all weights * Use cats image * Make logits match * Generate caption on cats image * Add GITProcessor * Update conversion script * Add support for more checkpoints * Fix conversion script * Add initial tests * Remove cross-attention * More improvements * Remove is_decoder * Improve model tests * Improve tests * Improve model outputs * Fix model outputs equivalence * Fix more tests * Remove unused code * Use generate to generate text, no use of cache for now * Use generate more appropriately * Fix config tests * Fix style * Add support for use_cache Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix style * Fix GIT vision encoder * Update README * Fix integration test * Set bos and eos token ids * Improve docs * Improve code * Add support for provided attention_mask * Add copied from statement * Fix gradient checkpointing test * Set model_input_names * Investigate model_input_names * Remove script * Fix model inputs * Fix docstring * Rename GIT to Git * Support more models * Add support for textvqa model * Add video support * Extend conversion script for video * Add support for large variant * Add support for more models * Fix config archive map * Update integration test * Fix README * Fix CLIP mean and std * Update processor * Fix use_cache for video, thanks @gante * Remove print statements * Remove assertion * Add processor tests * Fix model_input_names * Use Auto API for processor * Fix processor tests * Fix integration test * Fix pipeline test * Make tests faster * Update conversion script * Update conversion script * Convert more checkpoints * Update conversion script * Fix typo * Update docstrings * Improve code snippets * Fix doc tests * Add more code examplesé * Fix doc tests * Add integration tests * Fix unused variable * revert * Add GIT to Japanese README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-01-03 14:17:18 +01:00
bofeng huang	47c9b22d08	Add generate kwargs to `AutomaticSpeechRecognitionPipeline` (#20952 ) * Add generate kwargs to AutomaticSpeechRecognitionPipeline * Add test for generation kwargs	2022-12-31 01:13:28 -05:00
bofeng huang	fe65657de1	Fix FP16 inference in TextGenerationPipeline (#20913 ) * add torch_dtype attribute to Pipeline * Use torch_dtype to cast input tensor type in AutomaticSpeechRecognitionPipeline * Fix code quality * Add TextGenerationPipeline fp16 test * Fix code quality * Remove useless require in tests Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2022-12-29 02:19:25 -05:00
Nicolas Patry	f7f0ec2f54	Adding support for `fp16` for asr pipeline. (#20864 ) * Supporting `fp16` for asr pipeline * Adding test. * Style. * Oops. * Flake8 update ? * Fixing flake8 ? * Revert "Flake8 update ?" This reverts commit `0b917fcb52`. * Style (acctidentally deleted flake8 F401.) * Move to a bigger test (no small whisper model, and s2t doesn't seem to accept torch_dtype=fp16). Also we need to use a GPU to actually compute on fp16. * Using BatchFeature capability.	2022-12-23 10:18:45 +01:00
Andreas Madsen	b4b613b102	Implement Roberta PreLayerNorm (#20305 ) * Copy RoBERTa * formatting * implement RoBERTa with prelayer normalization * update test expectations * add documentation * add convertion script for DinkyTrain weights * update checkpoint repo Unfortunately the original checkpoints assumes a hacked roberta model * add to RoBERTa-PreLayerNorm docs to toc * run utils/check_copies.py * lint files * remove unused import * fix check_repo reporting wrongly a test is missing * fix import error, caused by rebase * run make fix-copies * add RobertaPreLayerNormConfig to ROBERTA_EMBEDDING_ADJUSMENT_CONFIGS * Fix documentation <Facebook> -> Facebook Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup: Fix documentation <Facebook> -> Facebook Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add missing Flax header Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * expected_slice -> EXPECTED_SLICE Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update copies after rebase * add missing copied from statements * make fix-copies * make prelayernorm explicit in code * fix checkpoint path for the original implementation * add flax integration tests * improve docs * update utils/documentation_tests.txt * lint files * Remove Copyright notice Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fix-copies * Remove EXPECTED_SLICE calculation comments Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-19 09:30:17 +01:00
Nicolas Patry	3ee958207a	Fix object detection2 (#20798 ) * Revert "Fixing object detection with `layoutlm` (#20776)" This reverts commit `fca66abe2a`. * Better fix for layoutlm object detection. * Style.	2022-12-16 13:25:36 +01:00
Younes Belkada	4341f4e224	[Pipeline] skip feature extraction test if in `IMAGE_PROCESSOR_MAPPING` (#20790 ) skip feature extraction test if in `IMAGE_PROCESSOR_MAPPING`	2022-12-16 12:46:58 +01:00
Nicolas Patry	fca66abe2a	Fixing object detection with `layoutlm` (#20776 ) * Fixing object detection with layoutlm. * Fixup.	2022-12-15 18:46:43 +01:00
Younes Belkada	8891193e83	[Pipeline] fix failing bloom `pipeline` test (#20778 ) fix failing `pipeline` test	2022-12-15 18:46:00 +01:00
Nicolas Patry	a9912d2fca	Even more validation. (#20762 ) * Even more validation. * Fixing order.	2022-12-15 10:05:54 +01:00
Yih-Dar	a12c5cbcd8	Change a logic in pipeline test regarding TF (#20710 ) * Fix the pipeline test regarding TF * Fix the pipeline test regarding TF * update comment Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-13 13:42:36 +01:00
Nicolas Patry	53357e8196	Adding ValueError when imcompatible parameters are used. (#20729 )	2022-12-12 15:39:13 +01:00
Nathan Raw	9e56aff58a	Add video classification pipeline (#20151 ) * 🚧 wip video classification pipeline * 🚧 wip - add is_decord_available check * 🐛 add missing import * ✅ add tests * 🔧 add decord to setup extras * 🚧 add is_decord_available * ✨ add video-classification pipeline * 📝 add video classification pipe to docs * 🐛 add missing VideoClassificationPipeline import * 📌 add decord install in test runner * ✅ fix url inputs to video-classification pipeline * ✨ updates from review * 📝 add video cls pipeline to docs * 📝 add docstring * 🔥 remove unused import * 🔥 remove some code * 📝 docfix	2022-12-08 16:22:43 -05:00
Yih-Dar	cec5f7abd1	Update summarization `run_pipeline_test` (#20623 ) * update summarization run_pipeline_test * update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-07 15:46:12 +01:00
Yih-Dar	9b14c1b6bf	Fix `AutomaticSpeechRecognitionPipelineTests.run_pipeline_test` (#20597 ) * Remove assert exception not triggered * Fix wrong expected exception string * fix * use assertRaisesRegex Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-06 15:48:49 +01:00
Arthur	538e5248b0	Ci-whisper-asr (#20588 ) * Expected output for the test changed * fix failing asr test	2022-12-05 16:50:38 +01:00
Yih-Dar	cc8aec6740	Add `require_torch` to 2 pipeline tests (#20585 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-05 16:06:39 +01:00
NielsRogge	4973d2a04c	Add Audio Spectogram Transformer (#19981 ) * First draft * Make conversion script work * Add id2label mapping, run code quality * Fix copies * Add first draft of feature extractor * Update conversion script to use feature extractor * Make more tests pass * Add docs * update input_features to input_values + pad by default to max length * Fix doc tests * Add feature extractor tests * Add proper padding/truncation to feature extractor * Add support for conversion of all audioset checkpoints * Improve docs and extend conversion script * Fix README * Rename spectogram to spectrogram * Fix copies * Add integration test * Remove dummy conv * Update to ast * Update organization * Fix init * Rename model to AST * Add require_torchaudio annotator * Move import of ASTFeatureExtractor under a is_speech_available * Fix rebase * Add pipeline config * Update name of classifier head * Rename time_dimension and frequency_dimension for clarity * Remove print statement * Fix pipeline test * Fix pipeline test * Fix index table * Fix init * Fix conversion script * Rename to ForAudioClassification * Fix index table Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-11-21 18:58:54 +01:00
Nicolas Patry	8e777b3ba4	[Proposal] Breaking change `zero-shot-object-detection` for improved consistency. (#20280 ) * [Proposal] Breaking change `zero-shot-object-detection` for improved consistency. This is a proposal to modify the output of `zero-shot-object-detection` to provide better alignment with other pipelines. The output is now strictly the same as `object-detection` whereas before it would output lists of lists. The name `candidate_labels` is used throughout for consistency with other `zero-shot` pipelines. The pipeline is changed to `ChunkPipeline` to support batching cleanly. This removes all the lists and list of lists shenanigans, it's now a matter of the base pipeline handling all this not this specific one. Breaking change: It did remove complex calls potentials `pipe(images = [image1, image2], text_queries=[candidates1, candidates2])` to support only `pipe([{"image": image1, "candidate_labels": candidates1}, {"image": image2, "candidate_labels": candidates2}])` when dealing with lists and/or datasets. We could keep them, but it will add a lot of complexity to the code base, since the pipeline is rather young, I'd rather break to keep the code simpler, but we can revert this. Breaking change: The name of the argument is now `image` instead of `images` since it expects by default only 1 image. This is revertable like the previous one. Breaking change: The types is now simplified and flattened: `pipe(inputs) == [{object1}, {object2}]` instead of the previous `pipe(inputs) == [[{object1}, {object1}], [{object2}]]` Where the different instances would be grouped by candidate labels within lists. IMHO this is not really desirable, since it would output empty lists and is only adding superflous indirection compared to `zero-shot-object-detection`. It is relatively change free in terms of how the results, it does change computation however since now the batching is handled by the pipeline itself. 
**It did** change the results for the small models so there seems to be a real difference in how the models handle this. * Fixing the doctests. * Behind is_torch_available.	2022-11-18 15:57:28 +01:00
Younes Belkada	163ac3d3ee	Add Switch transformers (#19323 ) * first commit * add more comments * add router v1 * clean up - remove `tf` modeling files * clean up - remove `tf` modeling files * clean up * v0 routers * added more router - Implemented `ExpertsChooseMaskedRouter` - added tests - 2 more routers to implement * last router * improved docstring - completed the docstring in `router.py` - added more args in the config * v0 sparse mlp * replace wrong naming * forward pass run * update MOE layer * small router update * fixup * consistency * remove scatter router * remove abstract layer * update test and model for integration testing * v1 conversion * update * hardcode hack * all keys match * add gin conversion, without additional libraries * update conversion sctipy * delete router file * update tests wrt router deletion * fix router issues * update expert code * update, logits match, code needsREFACTORING * Refactor code Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com> * add generate tests Co-authored-by: younesbelkada <younesbelkada@gmail.com> * add support for router loss Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com> * fix forward error * refactor a bit * remove `FlaxSwitchTransformers` modules * more tests pass * Update code Co-authored-by: Younes Belkada <younesbelkada@users.noreply.github.com> * fixup * fix tests * fix doc * fix doc + tokenization * fix tokenizer test * fix test * fix loss output * update code for backward pass * add loss support * update documentation * fix documentation, clean tokenizer * more doc fix, cleanup example_switch * fix failing test * fix test * fix test * fix loss issue * move layer * update doc and fix router capacity usage * fixup * add sparse mlp index for documentation on hub * fixup * test sparse mix architecture * Apply suggestions from code review * Update docs/source/en/model_doc/switch_transformers.mdx * fixup on update * fix tests * fix another test * attempt fix * 
Update src/transformers/models/switch_transformers/configuration_switch_transformers.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/switch_transformers/convert_switch_transformers_original_flax_checkpoint_to_pytorch.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * try * all tests pass * fix jitter noise * Apply suggestions from code review * doc tests pass * Update src/transformers/models/switch_transformers/modeling_switch_transformers.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/switch_transformers/modeling_switch_transformers.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove assert * change config order * fix readme japanese * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove parallelizable tests + add one liners * remove ONNX config * fix nits - add `T5Tokenizer` in auto mapping - remove `Switch Transformers` from ONNX supported models * remove `_get_router` * remove asserts * add check in test for `router_dtype` * add `SwitchTransformersConfig` in `run_pipeline_test` * Update tests/pipelines/test_pipelines_summarization.py * add huge model conversion script * fix slow tests - add better casting for `Linear8bitLt` - remove `torchscript` tests * add make dir * style on new script * fix nits - doctest - remove `_keys_to_ignore_on_load_unexpected` * Update src/transformers/models/switch_transformers/configuration_switch_transformers.py * add google as authors * fix year * remove last `assert` statements * standardize vertical spaces * fix failing import * fix another failing test * Remove strange àuthorized_keys` * removing todo and padding that is never used Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: ybelkada <younes@huggingface.co> Co-authored-by: Younes Belkada 
<younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Arthur Zucker <arthur@huggingface.co>	2022-11-15 13:06:45 +01:00
Yih-Dar	f9909fbf85	Make `ImageSegmentationPipelineTests` less flaky (#20147 ) * Fix ImageSegmentationPipelineTests * Use 0.9 * no zip * links to show images * links to show images * rebase Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-11-15 09:14:55 +01:00
Nicolas Patry	25c451e5a0	Adding chunking for whisper (all seq2seq actually). Very crude matching algorithm. (#20104 ) * Very crude matching algorithm. * Fixing tests. * Removing comments * Adding warning + fix short matches. * Cleanup tests. * Quality. * Less noisy. * Fixup.	2022-11-14 22:32:50 +01:00
Bartosz Szmelczynski	78a471ff71	Fix tapas scatter (#20149 ) * First draft * Remove scatter dependency * Add require_torch * update vectorized sum test, add clone call * remove artifacts * fix style * fix style v2 * remove "scatter" mentions from the code base * fix isort error Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-11-14 01:04:26 -05:00
Sylvain Gugger	9740a03f61	Skip broken test	2022-11-10 14:59:32 -05:00
Nicolas Patry	d066c3731b	Adding support for LayoutLMvX variants for `object-detection`. (#20143 ) * Adding support for LayoutLMvX variants for `object-detection`. * Revert bogs `layoutlm` feature extractor which does not exist (it was a V2 model) . * Updated condition. * Handling the comments.	2022-11-10 11:33:38 +01:00
Nicolas Patry	ec6878f6ca	Now supporting pathlike in pipelines too. (#20030 )	2022-11-03 09:14:45 +01:00
Nicolas Patry	5fd5990dce	Factored out some code in the `image-segmentation` pipeline. (#19727 ) * Factored out some code in the image-segmentation pipeline Re-enable `small_model_pt`. Re-enable `small_model_pt`. Enabling the current test with the current values. Debugging the values on the CI. More logs ? Printing doesn't work ? Using the CI values instead. Seems to be a Pillow sensitivity. Added a test showcasing that models not supporting some tasks get a clear error. Factored out code. Further factor out. Fixup. Bad rebase. Put `panoptic` before `instance` as it should be a superset. * Fixing tests. * Adding subtasks tests + Fixes `instance` segmentation which was broken due to default and non kwargs arguments. * Fix bad replace.	2022-10-26 10:44:36 +02:00
Rak Alexey	d3f4cef74d	fix image2test args forwarding (#19648 ) * fix image2test args forwarding * fix issues * Proposing the update to the PR. * Fixup. Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2022-10-24 09:49:24 -04:00
Alara Dirik	cca51aa151	Fix image segmentation pipeline errors, resolve backward compatibility issues (#19768 ) * Fix panoptic segmentation and pipeline * Update ImageSegmentationPipeline tests and reenable test_small_model_pt * Resolve backward compatibility issues	2022-10-21 18:09:58 +03:00
Yih-Dar	3aaabaa214	Update `ImageToTextPipelineTests.test_small_model_tf` (#19785 ) * update expected values for the correct TF checkpoint * Run test * Clean up * fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-10-21 14:35:20 +02:00
Nicolas Patry	a40386669f	`image-segmentation` pipeline: re-enable `small_model_pt` test. (#19716 ) * Re-enable `small_model_pt`. Re-enable `small_model_pt`. Enabling the current test with the current values. Debugging the values on the CI. More logs ? Printing doesn't work ? Using the CI values instead. Seems to be a Pillow sensitivity. * Update src/transformers/pipelines/image_segmentation.py Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>	2022-10-20 11:57:11 +02:00
Yih-Dar	bed2edb99f	Specify TF framework explicitly in more pipeline tests (#19748 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-10-19 16:24:03 +02:00
David Yang	a23819ed6a	Clean up deprecation warnings (#19654 ) * Clean up deprecation warnings Notes: Changed some strings in tests to raw strings, which will change the literal content of the strings as they are fed into whatever machine handles them. Test cases for past in the past/past_key_values switch changed/removed due to warning of impending removal * Add PILImageResampling abstraction for PIL.Image.Resampling	2022-10-18 13:34:47 -04:00
Yih-Dar	06a82a49ae	Specify TF framework in TF-related pipeline tests (#19719 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-10-18 17:40:28 +02:00
Nicolas Patry	63d13d768b	Improving `image-segmentation` pipeline tests. (#19710 ) This PR (https://github.com/huggingface/transformers/pull/19367) introduced a few breaking changes: - Removed an argument `mask_threshold`. - Broke the default behavior (instance vs panoptic in the function call) https://github.com/huggingface/transformers/pull/19367/files#diff-60f846b86fb6a21d4caf60f5b3d593a04accb8f248de3029cccae2ff898c5bc3R119-R120 - Broke the actual masks: https://github.com/huggingface/transformers/pull/1961 This PR is the start of a handful that will aim at bringing back the old behavior(s). - tests should not have to specify `task` by default, unless we want to modify the behavior and have a lower form of segmentation running) - `test_small_model_pt` should be working. This specific PR starts with adding more information to the masks hash because missing the actual mask was actual easy to miss (the hashes do change, but it was easy to miss that one code path wasn't properly updated). So we go from a simple `hash` to ``` {"hash": #smaller hash, "shape": (h, w), "white_pixels": n} ``` The `shape` should help make sure the interpolation of the mask works correctly, the `white_pixels` hopefully helps detect big regressions in their amount when the hash gets modified.	2022-10-18 16:33:53 +02:00
Nicolas Patry	ee2a80ecc0	add return_tensors parameter for feature_extraction 2 (#19707 ) * add return_tensors parameter for feature_extraction w/ test add return_tensor parameter for feature extraction Revert "Merge branch 'feature-extraction-return-tensor' of https://github.com/ajsanjoaquin/transformers into feature-extraction-return-tensor" This reverts commit d559da743b87914e111a84a98ba6dbb70d08ad88, reversing changes made to bbef89278650c04c090beb65637a8e9572dba222. call parameter directly Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Fixup. Update src/transformers/pipelines/feature_extraction.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix the imports. * Fixing the test by not overflowing the model capacity. Co-authored-by: AJ San Joaquin <ajsanjoaquin@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-18 16:29:00 +02:00
Nicolas Patry	713eab45d3	🚨 🚨 🚨 [Breaking change] Deformable DETR intermediate representations (#19678 ) * [Breaking change] Deformable DETR intermediate representations - Fixes naturally the `object-detection` pipeline. - Moves from `[n_decoders, batch_size, ...]` to `[batch_size, n_decoders, ...]` instead. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-18 09:00:39 -04:00
Arthur	d356b89f3c	fix test whisper with new max length (#19668 )	2022-10-18 08:56:37 +02:00
Sylvain Gugger	f2ecb9eec4	Revert "add return_tensor parameter for feature extraction (#19257 )" (#19680 ) This reverts commit `35bd089a24`.	2022-10-17 11:56:29 -04:00
Ayrton San Joaquin	35bd089a24	add return_tensor parameter for feature extraction (#19257 ) * add return_tensors parameter for feature_extraction w/ test add return_tensor parameter for feature extraction Revert "Merge branch 'feature-extraction-return-tensor' of https://github.com/ajsanjoaquin/transformers into feature-extraction-return-tensor" This reverts commit d559da743b87914e111a84a98ba6dbb70d08ad88, reversing changes made to bbef89278650c04c090beb65637a8e9572dba222. * call parameter directly Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * Fixup. * Update src/transformers/pipelines/feature_extraction.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-17 11:17:26 -04:00
Ankur Goyal	cbc1abc4af	A few CI fixes for `DocumentQuestionAnsweringPipeline` (#19584 ) * Fixes * update expected values * style * fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-10-17 15:35:27 +02:00
Matt	3b3024da70	TF port of ESM (#19587 ) * Partial TF port for ESM model * Add ESM-TF tests * Add the various imports for TF-ESM * TF weight conversion almost ready * Stop ignoring the decoder weights in PT * Add tests and lots of fixes * fix-copies * Fix imports, add model docs * Add get_vocab() to tokenizer * Fix vocab links for pretrained files * Allow multiple inputs with a sep * Use EOS as SEP token because ESM vocab lacks SEP * Correctly return special tokens mask from ESM tokenizer * make fixup * Stop testing unsupported embedding resizing * Handle TF bias correctly * Skip all models with slow tokenizers in the token classification test * Fixing the batch/unbatcher of pipelines to accomodate the `None` being passed around. * Fixing pipeline bug caused by slow tokenizer being different. * Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update set_input_embeddings and the copyright notices Co-authored-by: Your Name <you@example.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2022-10-17 14:16:16 +01:00
Sivaudha	8aad4363d8	Fix pipeline predict transform methods (#19657 ) * Remove key word argument X from pipeline predict and transform methods As __call__ of pipeline clasees require one positional argument, passing the input as a keyword argument inside predict, transform methods, causing __call__ to fail. Hence in this commit the keyword argument is modified into positional argument. * Implement basic tests for scikitcompat pipeline interface * Seperate tests instead of running with parameterized based on framework as both frameworks will not be active at the same time	2022-10-17 09:06:20 -04:00
Nicolas Patry	463226e2ee	Improve error messaging for ASR pipeline. (#19570 ) * Improve error messaging for ASR pipeline. - Raise error early (in `_sanitize`) so users don't waste time trying to run queries with invalid params. - Fix the error was after using `config.inputs_to_logits_ratio` so our check was masked by the failing property does not exist. - Added some manual check on s2t for the error message. No non ctc model seems to be used by the default runner (they are all skipped). * Removing pdb. * Stop the early error it doesn't really work :(.	2022-10-14 17:12:21 +02:00
Yih-Dar	62f28bc152	Fix `ImageToTextPipelineTests.test_small_model_tf` (#19565 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-10-14 16:29:54 +02:00
amyeroberts	83a2e694f1	Cast masks to np.unit8 before converting to PIL.Image.Image (#19616 ) * Cast masks to np.unit8 before converting to PIL.Image.Image * Update tests * Fixup	2022-10-14 09:30:45 -04:00
Ritik Nandwal	e94384e4d8	Add depth estimation pipeline (#18618 ) * Add initial files for depth estimation pipelines * Add test file for depth estimation pipeline * Update model mapping names * Add updates for depth estimation output * Add generic test * Hopefully fixing the tests. * Check if test passes * Add make fixup and make fix-copies changes after rebase with main * Rebase with main * Fixing up depth pipeline. * This is not used anymore. * Fixing the test. `Image` is a module `Image.Image` is the type. * Update docs/source/en/main_classes/pipelines.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-10-12 08:54:20 -04:00
Quancore	70a058bc65	Added tokenize keyword arguments to feature extraction pipeline (#19382 ) * Added tokenize keyword arguments to feature extraction pipeline * Reverted truncation parameter * Import numpy moved to top	2022-10-11 12:54:41 -04:00
Ankur Goyal	a3008c5a6d	Implement multiple span support for DocumentQuestionAnswering (#19204 ) * Implement multiple span support * Address comments * Add tests + fix bugs	2022-10-11 10:47:55 -04:00
Arthur	b722a6be72	Fix whisper for `pipeline` (#19482 ) * update feature extractor params * update attention mask handling * fix doc and pipeline test * add warning when skipping test * add whisper translation and transcription test * fix build doc test	2022-10-11 07:17:53 -04:00
Sylvain Gugger	d92e22d1f2	Remove ref to is_pipeline_test	2022-10-07 21:38:07 -04:00
Sylvain Gugger	9ac586b3c8	Rework pipeline tests (#19366 ) * Rework pipeline tests * Try to fix Flax tests * Try to put it before * Use a new decorator instead * Remove ignore marker since it doesn't work * Filter pipeline tests * Woopsie * Use the fitlered list * Clean up and fake modif * Remove init * Revert fake modif	2022-10-07 18:01:58 -04:00


207 Commits