Yoni Gozlan
9b479a245b
Uniformize LlavaNextVideoProcessor kwargs ( #35613 )
* Uniformize processor kwargs and add tests
* add videos_kwargs tests
* fix copies
* fix llava_next_video chat template tests
* remove unnecessary default kwargs
2025-02-18 14:13:51 -05:00
Raushan Turganbay
15ec971b8e
Prepare processors for VideoLLMs ( #36149 )
* allow processor to preprocess conversation + video metadata
* allow callable
* add test
* fix test
* nit: fix
* add metadata frames_indices
* Update src/transformers/processing_utils.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Update src/transformers/processing_utils.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* port updates from Orr and add one more test
* Update src/transformers/processing_utils.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* typo
* as dataclass
* style
* docstring + make sure tests green
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
2025-02-14 11:34:08 +01:00
Raushan Turganbay
eebd2c972c
Chat template: update for processor ( #35953 )
* update
* we need batched nested input to always process correctly
* update a bit
* fix copies
2025-02-10 09:52:19 +01:00
Raushan Turganbay
d1681ec2b6
VLMs: major clean up 🧼 ( #34502 )
only llava models are modified
2025-01-08 10:35:23 +01:00
Matt
d5cf91b346
Separate chat templates into a single file ( #33957 )
* Initial draft
* Add .jinja file loading for processors
* Add processor saving of naked chat template files
* make fixup
* Add save-load test for tokenizers
* Add save-load test for tokenizers
* stash commit
* Try popping the file
* make fixup
* Pop the arg correctly
* Pop the arg correctly
* Add processor test
* Fix processor code
* stash commit
* Processor clobbers child tokenizer's chat template
* Processor clobbers child tokenizer's chat template
* make fixup
* Split processor/tokenizer files to avoid interactions
* fix test
* Expand processor tests
* Rename arg to "save_raw_chat_template" across all classes
* Update processor warning
* Move templates to single file
* Move templates to single file
* Improve testing for processor/tokenizer clashes
* Improve testing for processor/tokenizer clashes
* Extend saving test
* Test file priority correctly
* make fixup
* Don't pop the chat template file before the slow tokenizer gets a look
* Remove breakpoint
* make fixup
* Fix error
2024-11-26 14:18:04 +00:00
Pablo Montalvo
50290cf7a0
Uniformize model processors ( #31368 )
* add initial design for uniform processors + align model
* add uniform processors for altclip + chinese_clip
* add uniform processors for blip + blip2
* fix mutable default 👀
* add configuration test
* handle structured kwargs w defaults + add test
* protect torch-specific test
* fix style
* fix
* rebase
* update processor to generic kwargs + test
* fix style
* add sensible kwargs merge
* update test
* fix assertEqual
* move kwargs merging to processing common
* rework kwargs for type hinting
* just get Unpack from extensions
* run-slow[align]
* handle kwargs passed as nested dict
* add from_pretrained test for nested kwargs handling
* [run-slow]align
* update documentation + imports
* update audio inputs
* protect audio types, silly
* try removing imports
* make things simpler
* simplerer
* move out kwargs test to common mixin
* [run-slow]align
* skip tests for old processors
* [run-slow]align, clip
* !$#@!! protect imports, darn it
* [run-slow]align, clip
* [run-slow]align, clip
* update common processor testing
* add altclip
* add chinese_clip
* add pad_size
* [run-slow]align, clip, chinese_clip, altclip
* remove duplicated tests
* fix
* add blip, blip2, bridgetower
Added tests for bridgetower which override common. Also modified common
tests to force center cropping where it exists
* fix
* update doc
* improve documentation for default values
* add model_max_length testing
This parameter depends on tokenizers received.
* Raise if kwargs are specified in two places
* fix
* removed copied from
* match defaults
* force padding
* fix tokenizer test
* clean defaults
* move tests to common
* add missing import
* fix
* adapt bridgetower tests to shortest edge
* uniformize donut processor + tests
* add wav2vec2
* extend common testing to audio processors
* add testing + bert version
* propagate common kwargs to different modalities
* BC order of arguments
* check py version
* revert kwargs merging
* add draft overlap test
* update
* fix blip2 and wav2vec due to updates
* fix copies
* ensure overlapping kwargs do not disappear
* replace .pop by .get to handle duplicated kwargs
* fix copies
* fix missing import
* add clearly wav2vec2_bert to uniformized models
* fix copies
* increase number of features
* fix style
* [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert
* [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert
* fix concatenation
* [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert
* Update tests/test_processing_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* 🧹
* address comments
* clean up + tests
* [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-02 10:41:08 +02:00
Yoni Gozlan
61ac161a9d
Add support for custom inputs and batched inputs in ProcessorTesterMixin ( #33711 )
* add support for custom inputs and batched inputs in ProcessorTesterMixin
* Fix batch_size behavior ProcessorTesterMixin
* Change format prepare inputs batched
* Remove override test pixtral processor
* Remove unnecessary tests and cleanup after new prepare_inputs functions
* Fix instructBlipVideo image processor
2024-10-01 23:52:03 +02:00
Yoni Gozlan
5f0c181f4e
Uniformize kwargs for image-text-to-text processors ( #32544 )
* uniformize FUYU processor kwargs
* Uniformize instructblip processor kwargs
* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2
* Uniformize llava_next processor
* Fix save_load test for processor with chat_template only as extra init args
* Fix import Unpack
* Fix Fuyu Processor import
* Fix FuyuProcessor import
* Fix FuyuProcessor
* Add defaults for specific kwargs kosmos2
* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs
* Add tests processor Udop
* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature
* Fix overwrite tests kwargs processors
* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop
* Fix processing test fuyu
* remove unnecessary pad_token check in instructblip ProcessorTest
* Fix BC tests and cleanup
* Fix imports fuyu
* Uniformize Pix2Struct
* Fix wrong name for FuyuProcessorKwargs
* Fix slow tests reversed inputs align fuyu llava-next, change udop warning
* Fix wrong logging import udop
* Add check images text input order
* Fix copies
* change text pair handling when positional arg
* rebase on main, fix imports in test_processing_common
* remove optional args and udop uniformization from this PR
* fix failing tests
* remove unnecessary test, fix processing utils and test processing common
* cleanup Unpack
* cleanup
* fix conflict grounding dino
2024-09-24 21:28:19 -04:00
Pablo Montalvo
9eb93854b9
Clean up Unpack imports ( #33631 )
clean up Unpack imports
2024-09-23 10:21:17 +02:00
Yoni Gozlan
c0c6815dc9
Add support for args to ProcessorMixin for backward compatibility ( #33479 )
* add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin
* change size and crop_size in processor kwargs tests to do_rescale and rescale_factor
* remove unnecessary llava processor kwargs test overwrite
* nit
* change data_arg_name to input_name
* Remove unnecessary test override
* Remove unnecessary tests Paligemma
* Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring
2024-09-20 11:40:59 -04:00
Pablo Montalvo
413008c580
add uniform processors for altclip + chinese_clip ( #31198 )
* add initial design for uniform processors + align model
* add uniform processors for altclip + chinese_clip
* fix mutable default 👀
* add configuration test
* handle structured kwargs w defaults + add test
* protect torch-specific test
* fix style
* fix
* rebase
* update processor to generic kwargs + test
* fix style
* add sensible kwargs merge
* update test
* fix assertEqual
* move kwargs merging to processing common
* rework kwargs for type hinting
* just get Unpack from extensions
* run-slow[align]
* handle kwargs passed as nested dict
* add from_pretrained test for nested kwargs handling
* [run-slow]align
* update documentation + imports
* update audio inputs
* protect audio types, silly
* try removing imports
* make things simpler
* simplerer
* move out kwargs test to common mixin
* [run-slow]align
* skip tests for old processors
* [run-slow]align, clip
* !$#@!! protect imports, darn it
* [run-slow]align, clip
* [run-slow]align, clip
* update common processor testing
* add altclip
* add chinese_clip
* add pad_size
* [run-slow]align, clip, chinese_clip, altclip
* remove duplicated tests
* fix
* update doc
* improve documentation for default values
* add model_max_length testing
This parameter depends on tokenizers received.
* Raise if kwargs are specified in two places
* fix
* match defaults
* force padding
* fix tokenizer test
* clean defaults
* move tests to common
* remove try/catch block
* deprecate kwarg
* format
* add copyright + remove unused method
* [run-slow]altclip, chinese_clip
* clean imports
* fix version
* clean up deprecation
* fix style
* add corner case test on kwarg overlap
* resume processing - add Unpack as importable
* add tmpdirname
* fix altclip
* fix up
* add back crop_size to specific tests
* generalize tests to possible video_processor
* add back crop_size arg
* fixup overlapping kwargs test for qformer_tokenizer
* remove copied from
* fixup chinese_clip tests values
* fixup tests - qformer tokenizers
* [run-slow] altclip, chinese_clip
* remove prepare_image_inputs
2024-09-19 17:21:54 +02:00
Raushan Turganbay
e40bb4845e
Load and save video-processor from separate folder ( #33562 )
* load and save from video-processor folder
* Update src/transformers/models/llava_onevision/processing_llava_onevision.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-19 09:56:52 +02:00
Yoni Gozlan
d8500cd229
Uniformize kwargs for Pixtral processor ( #33521 )
* add uniformized pixtral and kwargs
* update doc
* fix _validate_images_text_input_order
* nit
2024-09-17 14:44:27 -04:00
Raushan Turganbay
2f611d30d9
Qwen2-VL: clean-up and add more tests ( #33354 )
* clean-up on qwen2-vl and add generation tests
* add video tests
* Update tests/models/qwen2_vl/test_processing_qwen2_vl.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix and add better tests
* Update src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* update docs and address comments
* Update docs/source/en/model_doc/qwen2_vl.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2_vl.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* update
* remove size at all
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-09-12 18:24:04 +02:00
amyeroberts
f745e7d3f9
Remove repeated prepare_images in processor tests ( #33163 )
* Remove repeated prepare_images
* Address comments - update docstring; explanatory comment
2024-09-09 13:20:27 +01:00
Yoni Gozlan
5bcbdff159
Modify ProcessorTesterMixin for better generalization ( #32637 )
* Add padding="max_length" to tokenizer kwargs and change crop_size to size for image_processor kwargs
* remove crop_size argument in align processor tests to be coherent with base tests
* Add pad_token when loading tokenizer if needed, change test override tokenizer kwargs, remove unnecessary test overwrites in grounding dino
2024-08-13 11:48:53 -04:00
Pablo Montalvo
c624d5ba0b
add initial design for uniform processors + align model ( #31197 )
* add initial design for uniform processors + align model
* fix mutable default 👀
* add configuration test
* handle structured kwargs w defaults + add test
* protect torch-specific test
* fix style
* fix
* fix assertEqual
* move kwargs merging to processing common
* rework kwargs for type hinting
* just get Unpack from extensions
* run-slow[align]
* handle kwargs passed as nested dict
* add from_pretrained test for nested kwargs handling
* [run-slow]align
* update documentation + imports
* update audio inputs
* protect audio types, silly
* try removing imports
* make things simpler
* simplerer
* move out kwargs test to common mixin
* [run-slow]align
* skip tests for old processors
* [run-slow]align, clip
* !$#@!! protect imports, darn it
* [run-slow]align, clip
* [run-slow]align, clip
* update doc
* improve documentation for default values
* add model_max_length testing
This parameter depends on tokenizers received.
* Raise if kwargs are specified in two places
* fix
* expand VideoInput
* fix
* fix style
* remove defaults values
* add comment to indicate documentation on adding kwargs
* protect imports
* [run-slow]align
* fix
* remove set() that breaks ordering
* test more
* removed unused func
* [run-slow]align
2024-06-13 16:27:16 +02:00
Yih-Dar
db9a7e9d3d
Don't save processor_config.json if a processor has no extra attribute ( #28584 )
* not save if empty
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-01-19 09:59:14 +00:00
Yih-Dar
3005f96552
Save Processor ( #27761 )
* save processor
* Update tests/models/auto/test_processor_auto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/test_processing_common.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-01-18 10:21:45 +00:00