* update conversion script
* update for bias again
* remove pdv
* use my dir
* Update how we initialize the tokenizer
* Convert in bfloat16
* Undo that one again
* fix config dump
* .to() was broken for BatchMixFeature
* quick debug breakpoint
* put the breakpoint in the right place
* Add a config flag for the multimodal projector bias
* Add a config flag for the multimodal projector bias
* Conversion script can load chat templates
* Indent config for comparison
* Stop clobbering the config
* Re-enable the config clobber
* Get rid of the config manual save - it has no effect!
* Handle adapter bias correctly
* Default vision transformer activation to silu
* Remove legacy processing path
* One commit with all the debug breakpoints before I delete them all, in case I need to revert
* Update conversion
* Remove vLLM debugging instrumentation
* Drop xformers
* Remove debug enumerates
* make fixup
* make fixup
* Break copied from in pixtral
* Propagate multimodal_projector_bias change
* Propagate multimodal_projector_bias change
* Remove debug device .to()
* Restore attention weights output
* Fix Pixtral test
* Drop image_seq_length
* Drop image_seq_length
* Put the legacy processing code back
* Add the bias option to the llava_next_video config
* Add the bias option to the llava_next_video config
* Make certain args required in converter
* Make certain args required in converter
* typo
* make fixup
* Reverting some dtype changes since it seems to work without them
---------
Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-166-244.ec2.internal>
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* add support for custom inputs and batched inputs in ProcessorTesterMixin
* Fix batch_size behavior ProcessorTesterMixin
* Change format prepare inputs batched
* Remove override test pixtral processor
* Remove unnecessary tests and cleanup after new prepare_inputs functions
* Fix instructBlipVideo image processor
* add check and prepare args for BC to ProcessorMixin, improve ProcessorTesterMixin
* change size and crop_size in processor kwargs tests to do_rescale and rescale_factor
* remove unnecessary llava processor kwargs test overwrite
* nit
* change data_arg_name to input_name
* Remove unnecessary test override
* Remove unnecessary tests Paligemma
* Move test_prepare_and_validate_optional_call_args to TesterMixin, add docstring
* initial commit
* gloups
* updates
* work
* weights match
* nits
* nits
* updates to support the tokenizer :)
* updates
* Pixtral processor (#33454)
* rough outline
* Add in image break and end tokens
* Fix
* Udo some formatting changes
* Set patch_size default
* Fix
* Fix token expansion
* nit in conversion script
* Fix image token list creation
* done
* add expected results
* Process list of list of images (#33465)
* updates
* working image and processor
* this is the expected format
* some fixes
* push current updated
* working multiple images!
* add a small integration test
* Update configuration docstring
* Formatting
* Config docstring fix
* simplify model test
* fixup modeling and tests
* Return BatchMixFeature in image processor
* fix some copies
* update
* nits
* Update model docstring
* Apply suggestions from code review
* Fix up
* updates
* revert modeling changes
* update
* update
* fix load safe
* add license
* update
* use pixel_values as required by the model
* skip some tests and refactor
* Add pixtral image processing tests (#33476)
* Image processing tests
* Add processing tests
* woops
* defaults reflect pixtral image processor
* fixup post merge
* images -> pixel values
* oups sorry Mr docbuilder
* isort
* fix
* fix processor tests
* small fixes
* nit
* update
* last nits
* oups this was really breaking!
* nits
* is_composition needs to be true
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>