* [InstructBLIP] qformer_tokenizer is required input
* Bit safer
* Add to instructblipvideo processor
* Fix up
* Use video inputs
* Update tests/models/instructblipvideo/test_processor_instructblipvideo.py
* Fixing a bug in the way "attention_factor" is validated in ROPE utilities.
* use gguf internal dequantize
* add Q5_0 test
* add iq1 test
* add remaining tests
* remove duplicated test
* update docs
* add gguf version limit
* make style
* update gguf import catch
* revert vocab_size patch
* make style
* use GGUF_MIN_VERSION everywhere
* remove .to() restriction for 4-bit model
* Update src/transformers/modeling_utils.py
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
* bitsandbytes: prevent dtype casting while allowing device movement with .to or .cuda
* quality fix
* Improve warning message for .to() and .cuda() on bnb quantized models
---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
* don't run custom when not needed?
* update test fetcher filtering
* fixup and updates
* update
* update
* reduce burden
* nit
* nit
* missing comma
* this?
* this?
* more parallelism
* more
* nit for real parallelism on tf and torch examples
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update to make it more custom
* update to make it more custom
* update to make it more custom
* update to make it more custom
* update
* update
* update
* update
* update
* update
* use correct path
* fix path to test files and examples
* filter-tests
* filter?
* filter?
* filter?
* nits
* fix naming of the artifacts to be pushed
* list vs files
* list vs files
* fixup
* fix list of all tests
* fix the install steps
* fix the install steps
* fix the config
* fix the config
* only split if needed
* only split if needed
* extend should fix it
* extend should fix it
* arg
* arg
* update
* update
* run tests
* run tests
* run tests
* more nits
* update
* update
* update
* update
* update
* update
* update
* simpler way to show the test, reduces the complexity of the generated config
* simpler way to show the test, reduces the complexity of the generated config
* style
* oups
* oups
* fix import errors
* skip some tests for now
* update doctestjob
* more parallelism
* fixup
* test only the test in examples
* test only the test in examples
* nits
* from Arthur
* fix generated config
* update
* update
* show tests
* oups
* oups
* fix torch job for now
* use single upload step
* oups
* fu**k
* fix
* nit
* update
* nit
* fix
* fixes
* [test-all]
* add generate marker and generate job
* oups
* torch job does not run generate tests
* let repo utils test all utils
* update
* styling
* fix repo utils test
* more parallel please
* don't test
* update
* bit more verbose sir
* more
* hub tests were skipped
* split by classname
* revert
* maybe?
* Amazing catch
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* fix
* update
* update
* maybe non capturing
* manual convert?
* pass artifacts as parameters as otherwise the config is too long
* artifact.json
* store output
* might not be safe?
* my token
* mmm?
* use CI job ID
* can't get a proper id?
* ups
* build num
* update
* echo url
* this?
* this!
* fix
* wget
* ish
* dang
* update
* there we go
* update
* update
* pass all
* not .txt
* update
* fetch
* fix naming
* fix
* up
* update
* update
* ??
* update
* more updates
* update
* more
* skip
* oups
* pr documentation tests are currently created differently
* update
* hmmmm
* oups
* curl -L
* update
* ????
* nit
* mmmm
* ish
* ouf
* update
* ish
* update
* update
* update
* nit
* nit
* up
* oups
* documentation_test fix
* test hub tests everything, just marker
* update
* fix
* test_hub is the only annoying one now
* tf threads?
* oups
* not sure what is happening?
* fix?
* just use folder for stating hub
* I am getting fucking annoyed
* fix the test?
* update
* update
* ?
* fixes
* add comment!
* nit
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* first attempt at allowing both conversions from codestral and from the original mamba ssm
* allow fp16, seems default for mamba2
* dtype fix
* simplify codestral check, don't overwrite pad/eos/bos when codestral
* change file -> directory
* use path join to be safe
* style
* apply code review
- add util mamba2 tokenizer (gptneox with left padding)
- add models dict
* fix copies
* add tokenizer to docs
* empty commit to check for weird err
* make conversion user dependent on model type, defaults for original paper models
* small comment nit
* remove norm_before_gate in conversion
* simplify model dict by using shared keys directly + remove unnecessary attributes
* fix tokenization: remove separate mamba2 tokenizer, add padding option as kwarg to gptneox one and reuse it for the conversion script
* simplify even further as we pass padding side via **kwargs already
* pass module to Params4bit.from_prequantized to ensure quant_state
* make sure to check bnb version
* revert min bnb version and use inspect on method instead
* use version instead of inspect to prevent performance hit
* make the property name readable
* Customising the separator used for splicing in DataCollatorWithFlattening
* update DataCollatorWithFlattening docs
---------
Co-authored-by: weifangyuan <i.weifangyuan@yuewen.com>