* Add dithering to the `Speech2TextFeatureExtractor` API.
- in kaldi : 4a8b7f6732/src/feat/feature-window.cc (L145)
- with dithering without a seed, the features become non-deterministic due
to small Gaussian noise added to the audio (i.e. 2 runs lead to little
different outputs)
* update the PR
- add dithering also for WhisperFeatureExtractor
- not adding to Wav2Vec2FeatureExtractor (no FBANK computation)
* add unit-tests for dithering, fix docstrings
* ruff
* utils/check_copies.py --fix_and_overwrite
* update code, add seed to unit-test
* adding explanation of dithering
* Fix XGLM loss computation (PyTorch and TensorFlow)
* Update expected output string in XGLM sample test
This updates the expected output string of test_xglm_sample for torch
2.0 to the correct one and removes the one for torch 1.13.1 + cu116
(transformers moved to torch 2.0 with PR #35358).
* Update expected output IDs in XGLM generation test
**Summary:** TorchAoConfig optionally contains a
`torchao.dtypes.Layout` object which is a dataclass and not
JSON serializable, and so the following fails:
```
import json
from torchao.dtypes import TensorCoreTiledLayout
from transformers import TorchAoConfig
config = TorchAoConfig("int4_weight_only", layout=TensorCoreTiledLayout())
config.to_json_string()
json.dumps(config.to_dict())
```
This also causes `quantized_model.save_pretrained(...)` to
fail because the first step of this call is to JSON serialize
the config. Fixes https://github.com/pytorch/ao/issues/1704.
**Test Plan:**
python tests/quantization/torchao_integration/test_torchao.py -k test_json_serializable
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* archive_file may not be specified
When loading a pre-trained model from a gguf file, resolved_archive_file may not be set. Guard against that case in the safetensors availability check.
* Remap partial disk offload to cpu for GGUF files
GGUF files don't support disk offload so attempt to remap them to the CPU when device_map is auto. If device_map is anything else but None, raise a NotImplementedError.
* Don't remap auto device_map and raise RuntimeError
If device_map=auto and modules are selected for disk offload, don't attempt to map them to any other device. Raise a runtime error when a GGUF model is configured to map any modules to disk.
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* new flute
* new higgs working
* small adjustments
* progress and quallity
* small updates
* style
---------
Co-authored-by: Andrey Panferov <panferov.andrey3@wb.ru>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
* allow processor to preprocess conversation + video metadata
* allow callable
* add test
* fix test
* nit: fix
* add metadata frames_indices
* Update src/transformers/processing_utils.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Update src/transformers/processing_utils.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* port updates from Orr and add one more test
* Update src/transformers/processing_utils.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* typo
* as dataclass
* style
* docstring + maek sure tests green
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* decompose chat template docs
* add docs
* update model docs
* qwen2-5
* pixtral
* remove old chat template
* also video as list frames supported
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/chat_template_multimodal.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* remove audio for now
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>