* update
* doc
* update
* Update docs/source/en/gguf.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix: handle input_channel_dim == channels_last
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
* fix: default PIL images to channels_last
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
* Apply suggestions from code review
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fixup from review batch
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
* test: add 1x1 PIL image to ambiguous channel test
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
* fix(mllama): avoid 0 dimension for image with impractical aspect ratio
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
---------
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
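The channel-layout and zero-dimension fixes above can be illustrated with a small sketch. This is not the library's implementation: `infer_channel_layout` and `scaled_size` are hypothetical helpers, and the `default` fallback and the `max(1, ...)` clamp are assumptions about how such cases could be handled.

```python
def infer_channel_layout(shape, num_channels=(1, 3), default="channels_last"):
    """Guess the channel layout of a 3D image shape (hypothetical helper)."""
    if len(shape) != 3:
        raise ValueError(f"Expected a 3D shape, got {shape}")
    first_ok = shape[0] in num_channels
    last_ok = shape[-1] in num_channels
    if first_ok and last_ok:
        # Ambiguous: a 1x1 RGB image has shape (1, 1, 3), so both ends look like
        # a channel axis. Fall back to the default (assumed: channels_last,
        # which is the layout a PIL image has once converted to an array).
        return default
    if first_ok:
        return "channels_first"
    if last_ok:
        return "channels_last"
    raise ValueError(f"Unable to infer the channel layout of shape {shape}")


def scaled_size(height, width, target_width):
    """Scale to target_width, keeping the aspect ratio (hypothetical helper)."""
    # For an extreme aspect ratio the rounded height can reach 0; clamp it to 1.
    new_height = max(1, round(height * target_width / width))
    return new_height, target_width


print(infer_channel_layout((1, 1, 3)))
print(infer_channel_layout((3, 224, 224)))
print(scaled_size(1, 10000, 100))
```

The 1x1 case is the one that motivates an explicit default, since no shape-based heuristic can resolve it.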
* chore: fix typos in language models
* chore: fix typos in mistral model
* chore: fix model copy from issue
* chore: fix model copy from issue
* chore: fix model copy from issue
* chore: fix model copy from issue
* chore: fix model copy from issue
Fixed 2 issues regarding `tests/trainer/test_data_collator.py::TFDataCollatorIntegrationTest::test_all_mask_replacement`:
1. I got the error `RuntimeError: "bernoulli_tensor_cpu_p_" not implemented for 'Long'`. This happens because `mask_replacement_prob=1` is an integer, and `torch.bernoulli` doesn't accept the resulting type (a `torch.long` dtype instead of a floating-point one). I fixed this by manually casting the probability arguments to `float` in the `__post_init__` function of `DataCollatorForLanguageModeling`.
2. I also got the error `tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute Equal as input #1(zero-based) was expected to be a int64 tensor but is a int32 tensor [Op:Equal]` due to the line `tf.reduce_all((batch["input_ids"] == inputs) | (batch["input_ids"] == tokenizer.mask_token_id))` in `test_data_collator.py`. This occurs because the type of the `inputs` variable is `tf.int32`. Solved this by manually casting it to `tf.int64` in the test, as the expected return type of `batch["input_ids"]` is `tf.int64`.
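The first fix can be sketched without any tensor library. The class below is a hypothetical stand-in, not `DataCollatorForLanguageModeling` itself; its field names and the validation step are assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass


@dataclass
class MaskingProbabilities:
    """Hypothetical stand-in for the probability fields of a masking collator."""

    mlm_probability: float = 0.15
    mask_replacement_prob: float = 0.8
    random_replacement_prob: float = 0.1

    def __post_init__(self):
        # Dataclass annotations are not enforced at runtime, so a caller can
        # pass the int 1. Cast every probability to float so that later tensor
        # code never builds an integer-typed probability tensor.
        self.mlm_probability = float(self.mlm_probability)
        self.mask_replacement_prob = float(self.mask_replacement_prob)
        self.random_replacement_prob = float(self.random_replacement_prob)
        if self.mask_replacement_prob + self.random_replacement_prob > 1.0:
            raise ValueError("Replacement probabilities must sum to at most 1.")


probs = MaskingProbabilities(mask_replacement_prob=1, random_replacement_prob=0)
print(type(probs.mask_replacement_prob).__name__, probs.mask_replacement_prob)
```

Casting once at construction keeps the conversion in a single place instead of repeating it at every call site that builds a probability tensor.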
* First draft of github action on PR opening for auto-assigning reviewers
* fix missing import
* Don't reassign reviewers if we already have them
* Temporarily comment out the opened line so we can test the script
* Correct path for codeowners file
* Update workflow permissions
* Update workflow permissions
* Update debug logs
* Strip inline comments
* Remove prefix
* Request reviews instead of assigning
* Request reviews instead of assigning
* Add TODO
* Use pull-request-target instead
* Update the script
* Set back to pull_request for testing
* Set to pull_request_target, testing works!
* Add licence
* Tighten up one of the globs
* Refactor things to be a bit less convoluted
* Only assign reviewers when marked ready for review
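The reviewer-assignment steps above (strip inline comments, remove a prefix, match globs, skip PRs that already have reviewers) could look roughly like the sketch below. It is not the actual workflow script: the function names, the sample handles, the assumption that the removed prefix is the leading `@`, and the use of `fnmatch` as a simplified stand-in for CODEOWNERS glob semantics are all illustrative.

```python
from fnmatch import fnmatch


def parse_codeowners(text):
    """Return (pattern, owners) pairs from CODEOWNERS-style text."""
    rules = []
    for raw_line in text.splitlines():
        # Drop full-line and inline comments, then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        pattern, *owners = line.split()
        # Assumption: the prefix to remove is the "@" in front of each handle.
        rules.append((pattern.lstrip("/"), [o.lstrip("@") for o in owners]))
    return rules


def reviewers_for(changed_files, rules, author, existing_reviewers=()):
    """Pick reviewers to request for a PR (hypothetical helper)."""
    if existing_reviewers:
        return []  # don't reassign when reviewers are already present
    picked = []
    for path in changed_files:
        owners = []
        for pattern, rule_owners in rules:
            if fnmatch(path, pattern):
                owners = rule_owners  # the last matching rule wins
        for owner in owners:
            if owner != author and owner not in picked:
                picked.append(owner)
    return picked


sample = """
# docs owners
docs/* @alice  # documentation
src/models/* @bob @carol
"""
rules = parse_codeowners(sample)
print(reviewers_for(["docs/index.md", "src/models/llama.py"], rules, author="bob"))
```

The PR author is filtered out because a review cannot be requested from the person who opened the pull request.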
* Export base streamer.
Previously, the base streamer class was not exported so the set of available streamers was fixed to 3 streamer classes.
This change makes it so that customers may extend the default base streamer class.
* make fixup
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
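As an illustration of what the export enables, here is a minimal sketch of a custom streamer. To keep it runnable on its own, it defines a stand-in `BaseStreamer` with the same two-method interface (`put` and `end`); in real use the class would be imported from the library instead, assuming the export makes it available there. `CollectingStreamer` is a hypothetical name.

```python
class BaseStreamer:
    """Stand-in mirroring the streamer interface: put() per chunk, end() once."""

    def put(self, value):
        raise NotImplementedError()

    def end(self):
        raise NotImplementedError()


class CollectingStreamer(BaseStreamer):
    """Hypothetical custom streamer that buffers every chunk it receives."""

    def __init__(self):
        self.chunks = []
        self.finished = False

    def put(self, value):
        # Called each time new tokens are available.
        self.chunks.append(value)

    def end(self):
        # Called once when generation is complete.
        self.finished = True


streamer = CollectingStreamer()
for chunk in ([1, 2], [3], [4]):
    streamer.put(chunk)
streamer.end()
print(streamer.chunks, streamer.finished)
```

A subclass like this would typically be passed to `generate` through its `streamer` argument.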
* avoid errors when the size of `input_ids` passed to PrefixConstrainedLogitsProcessor is zero
* use more reasonable process
* avoid early return
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add swanlab integration
* feat(integrate): add SwanLab as an optional experiment tracking tool in transformers
- Integrated SwanLab into the transformers library as an alternative for experiment tracking.
- Users can now log training metrics, hyperparameters, and other experiment details to SwanLab by setting `report_to="swanlab"` in the `TrainingArguments`.
- Added necessary dependencies and documentation for SwanLab integration.
* Fix the spelling error of SwanLabCallback in callback.md
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Fix typo in comment
* Fix typo in comment
* Fix typos and update comments
* fix annotation
* chore: optimize some comments
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: AAssets <20010618@qq.com>
Co-authored-by: ZeYi Lin <944270057@qq.com>
Co-authored-by: KAAANG <79990647+SAKURA-CAT@users.noreply.github.com>
* initial commit
* small fix
* move stuff to image processing file
* remove stuff in validate turn and fix return tensor
* remove liquid stuff
* in the process of addressing comments
* changes to get the right tokenization
* new __init__ works
* fixing default std and mean
* works
* small testing script -- to be deleted before merge
* remove redundant code
* addressing comments
* fix inits, add docs templates
* refactor processor, switch to gotocr image processor
* remove image proc from init
* refactor to working llava-style architecture
* Change AyaVisionModel to AyaVisionForConditionalGeneration
* add tests
* fixups
* update doc
* Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model
* better variable names + remove code paths
* Updates to aya_vision.md
* address comments
* adding copied from
* make style and remove unused projector_hidden_act from config
* sort init
* include usage of fast image proc and proc on cuda in doc
* update checkpoint in test processor
* update checkpoint in test processor 2
* remove test_model and update docstring
* skip failing tests
---------
Co-authored-by: Saurabh Dash <saurabh@cohere.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
* Fix edge case for continue_final_message
* lstrip() correctly
* Add regression test
* Add a clearer error message when the final message is not present
* Add a clearer error message when the final message is not present
* Fix massive bug!
* Fix pipeline-peft interaction
* once again you have committed a debug breakpoint
* Remove extra testing line
* Add a test to check adapter loading
* Correct adapter path
* make fixup
* Remove unnecessary check
* Make check a little more stringent
transformers/image_processing_utils.py:41: UserWarning: The following named arguments are not valid for `SamImageProcessor.preprocess` and were ignored: 'point_pad_value'