* add tests for linear shape behavior
* fix linear shape behavior
ended up adding the reshape at the end, after f8f8bf16_rowwise, because adding
it directly after quantize_fp8_per_row caused f8f8bf16_rowwise to drop the
seq_len dimension. (i.e., (17, 23, 1014) -> (17, 1024))
* save shape up front + comment
* Make StaticCache configurable at model construct time
* integrations import structure
* add new doc file to toc
---------
Co-authored-by: Guang Yang <guangyang@fb.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
* Bug Fix: Update hub.py
Bug:
TypeError: argument of type 'NoneType' is not iterable
Analysis:
The error `TypeError: argument of type 'NoneType' is not iterable` suggests that `model_card.data.tags` is `None`, and the code is trying to iterate through it using `not in`.
Fix:
1. **Check if `model_card.data.tags` is `None` before the loop**:
Since you're checking the variable `tags` before the loop, you should also ensure that `model_card.data.tags` is not `None`. You can do this by initializing `model_card.data.tags` to an empty list if it's `None`.
2. **Updated code**:
Add a check and initialize the `tags` if it is `None` before proceeding with the iteration.
This way, if `model_card.data.tags` is `None`, it gets converted to an empty list before checking the contents. This prevents the `TypeError`.
* Update hub.py
* Update docs for GGUF supported models
* Add tensor mappings and define class GGUFPhi3Converter
* Fix tokenizer
* Working version
* Attempt to fix some CI failures
* Run ruff format
* Add vocab, merges, decoder methods like LlamaConverter
* Resolve conflicts since Qwen2Moe was added to gguf
- I missed one place when resolving conflict
- I also made a mistake with tests_ggml.py and now has been fixed to reflect
its master version.
* Import structure & first three model refactors
* Register -> Export. Export all in __all__. Sensible defaults according to filename.
* Apply most comments from Amy and some comments from Lucain
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Lucain Pouget <lucainp@gmail.com>
* Style
* Add comment
* Clearer .py management
* Raise if not in backend mapping
* More specific type
* More efficient listdir
* Misc fixes
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Lucain Pouget <lucainp@gmail.com>
* Fixed typo: insted to instead
* Fixed typo: relase to release
* Fixed typo: nighlty to nightly
* Fixed typos: versatible, benchamarks, becnhmark to versatile, benchmark, benchmarks
* Fixed typo in comment: quantizd to quantized
* Fixed typo: architecutre to architecture
* Fixed typo: contibution to contribution
* Fixed typo: Presequities to Prerequisites
* Fixed typo: faste to faster
* Fixed typo: extendeding to extending
* Fixed typo: segmetantion_maps to segmentation_maps
* Fixed typo: Alternativelly to Alternatively
* Fixed incorrectly defined variable: output to output_disabled
* Fixed typo in library name: tranformers.onnx to transformers.onnx
* Fixed missing import: import tensorflow as tf
* Fixed incorrectly defined variable: token_tensor to tokens_tensor
* Fixed missing import: import torch
* Fixed incorrectly defined variable and typo: uromaize to uromanize
* Fixed incorrectly defined variable and typo: uromaize to uromanize
* Fixed typo in function args: numpy.ndarry to numpy.ndarray
* Fixed Inconsistent Library Name: Torchscript to TorchScript
* Fixed Inconsistent Class Name: OneformerProcessor to OneFormerProcessor
* Fixed Inconsistent Class Named Typo: TFLNetForMultipleChoice to TFXLNetForMultipleChoice
* Fixed Inconsistent Library Name Typo: Pytorch to PyTorch
* Fixed Inconsistent Function Name Typo: captureWarning to captureWarnings
* Fixed Inconsistent Library Name Typo: Pytorch to PyTorch
* Fixed Inconsistent Class Name Typo: TrainingArgument to TrainingArguments
* Fixed Inconsistent Model Name Typo: Swin2R to Swin2SR
* Fixed Inconsistent Model Name Typo: EART to BERT
* Fixed Inconsistent Library Name Typo: TensorFLow to TensorFlow
* Fixed Broken Link for Speech Emotion Classification with Wav2Vec2
* Fixed minor missing word Typo
* Fixed minor missing word Typo
* Fixed minor missing word Typo
* Fixed minor missing word Typo
* Fixed minor missing word Typo
* Fixed minor missing word Typo
* Fixed minor missing word Typo
* Fixed minor missing word Typo
* Fixed Punctuation: Two commas
* Fixed Punctuation: No Space between XLM-R and is
* Fixed Punctuation: No Space between [~accelerate.Accelerator.backward] and method
* Added backticks to display model.fit() in codeblock
* Added backticks to display openai-community/gpt2 in codeblock
* Fixed Minor Typo: will to with
* Fixed Minor Typo: is to are
* Fixed Minor Typo: in to on
* Fixed Minor Typo: inhibits to exhibits
* Fixed Minor Typo: they need to it needs
* Fixed Minor Typo: cast the load the checkpoints To load the checkpoints
* Fixed Inconsistent Class Name Typo: TFCamembertForCasualLM to TFCamembertForCausalLM
* Fixed typo in attribute name: outputs.last_hidden_states to outputs.last_hidden_state
* Added missing verbosity level: fatal
* Fixed Minor Typo: take To takes
* Fixed Minor Typo: heuristic To heuristics
* Fixed Minor Typo: setting To settings
* Fixed Minor Typo: Content To Contents
* Fixed Minor Typo: millions To million
* Fixed Minor Typo: difference To differences
* Fixed Minor Typo: while extract To which extracts
* Fixed Minor Typo: Hereby To Here
* Fixed Minor Typo: addition To additional
* Fixed Minor Typo: supports To supported
* Fixed Minor Typo: so that benchmark results TO as a consequence, benchmark
* Fixed Minor Typo: a To an
* Fixed Minor Typo: a To an
* Fixed Minor Typo: Chain-of-though To Chain-of-thought
* add self.head_dim for VisionAttention in Qwen2-VL
* add self.head_dim for VisionAttention in Qwen2-VL
* fix ci
* black the test_modeling_qwen2_vl.py
* use ruff to format test_modeling_qwen2_vl.py
* [run-slow] qwen2_vl
* use tying for python3.8
* fix the import format
* use ruff to fix the ci error I001
* [run-slow] qwen2_vl
* remove unused import
* commit for rebase
* use ruff fix ci
* [run-slow] qwen2_vl
---------
Co-authored-by: root <liji>
* Add validation for maximum sequence length in modeling_whisper.py
Added a validation check to ensure that the sequence length of labels does not exceed the maximum allowed length of 448 tokens. If the sequence length exceeds this limit, a ValueError is raised with a descriptive error message.
This change prevents the model from encountering errors or unexpected behavior due to excessively long sequences during training or fine-tuning, ensuring consistent input dimensions and improving overall robustness.
* Change exception message in src/transformers/models/whisper/modeling_whisper.py
The exception message is for whisper's label's sequence max length.
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Change 448 to config.max_target_positions in src/transformers/models/whisper/modeling_whisper.py
It's for whisper's config.max_target_positions.
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Change method's documentation in src/transformers/models/whisper/modeling_whisper.py
* Add test for maximum label's sequence length in test_modeling_whisper.py
* Add self to modeling_whisper.py
* Update test_modeling_whisper.py with respect to automatic validations
* Update modeling_whisper.py with respect to ci/circleci: check_code_quality
* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality
* Update test_modeling_whisper.py with respect to ci/circleci: tests_generate
* Update test_modeling_whisper.py with respect to ci/circleci: tests_generate
* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality
* Separate test_labels_sequence_max_length tests in test_modeling_whisper.py
* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality
* Remove assert from test_modeling_whisper.py
* Add max_target_positions to WhisperModelTester in test_modeling_whisper.py
* Update test_modeling_whisper.py with respect to ci/circleci: check_code_quality
* Update test_modeling_whisper.py with respect to ci/circleci: tests_generate
* Update test_modeling_whisper.py
* Change test_labels_sequence_max_length_error_after_changing_config in test_modeling_whisper.py
* Change self.config.max_target_positions to self.max_target_positions modeling_whisper.py
* Add new tests in test_modeling_whisper.py
* Update test_modeling_whisper.py
---------
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Load remote code only once
* Use hash as load indicator
* Add a new option `force_reload` for old behavior (i.e. always reload)
* Add test for dynamic module is cached
* Add more type annotations to improve code readability
* Address comments from code review
* Add validate images and test processing utils
* Remove encoded text from possible inputs in tests
* Removed encoded inputs as valid in processing_utils
* change text input check to be recursive
* change text check to all element of lists and not just the first one in recursive checks