transformers/tests
eustlb 5f087d1335
Add Moonshine (#34784)
* config draft

* full encoder forward

* full decoder forward

* fix sdpa and FA2

* fix sdpa and FA2

* moonshine model

* moonshine model forward

* fix attention with past_key_values

* add MoonshineForConditionalGeneration

* fix cache handling and causality for cross attention

* no causal attention mask for the encoder

* model addition (imports etc)

* small nit

* nits

* Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* add rope_theta

* nits

* model doc

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Joshua Lochner <admin@xenova.com>

* imports

* add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES

* updates modular

* make

* make fix-copies

* ruff check examples fix

* fix check_modular_conversion

* nit

* nits

* nits

* copied from -> imports

* imports fix

* integrate attention refacto

* modular edge case

* remove encoder

* convolutions params in config

* run modular_model_converter

* make

* Update docs/source/en/model_doc/moonshine.md

Co-authored-by: Joshua Lochner <admin@xenova.com>

* MoonshineModelTest

* correct typo

* make style

* integration tests

* make

* modular convert

* name conversion update (up_proj -> fc1 etc)

* update config

* update MLP

* update attention

* update encoder layer

* update decoder layer

* update convolutions parameters

* update encoder

* remove INPUTS_DOCSTRING

* update decoder

* update conditional generation

* update pretrained model

* imports

* modular converted

* update doc

* fix

* typo

* update doc

* update license

* update init

* split config in file

* two classes for MLP

* attention from GLM

* from GlmRotaryEmbedding

* split MLP

* apply arthur's review suggestions

* apply arthur's review suggestions

* apply arthur's review suggestions

* auto feature extractor

* convert modular

* fix + make

* convert modular

* make

* unsplit config

* use correct checkpoint

* wrap generate

* update tests

* typos

* make

* typo

* update doc

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
2025-01-10 11:00:54 +01:00
..
agents Change is_soundfile_availble to is_soundfile_available (#35030) 2025-01-03 14:37:42 +01:00
benchmark
bettertransformer
deepspeed Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
extended [tests] skip tests for xpu (#33553) 2024-09-19 19:28:04 +01:00
fixtures
fsdp FSDP grad accum fix (#34645) 2024-11-15 22:28:06 +01:00
generation fix: Qwen2-VL generate with inputs_embeds (#35466) 2025-01-08 16:36:03 +01:00
models Add Moonshine (#34784) 2025-01-10 11:00:54 +01:00
optimization
peft_integration added logic for deleting adapters once loaded (#34650) 2025-01-06 18:36:40 +00:00
pipelines Pipeline: simple API for assisted generation (#34504) 2025-01-08 17:08:02 +00:00
quantization Add Gemma2 GGUF support (#34002) 2025-01-03 14:50:07 +01:00
repo_utils Fix modular edge case + modular sorting order (#35562) 2025-01-09 17:17:52 +01:00
sagemaker Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
tokenization tokenizer train from iterator without pre_tokenizers (#35396) 2025-01-09 15:34:43 +01:00
tp Simplify Tensor Parallel implementation with PyTorch TP (#34184) 2024-11-18 19:51:49 +01:00
trainer Fix all output_dir in test_trainer.py to use tmp_dir (#35266) 2025-01-08 19:44:39 +01:00
utils More model refactoring! (#35359) 2025-01-09 11:09:09 +01:00
__init__.py
test_backbone_common.py
test_configuration_common.py Load sub-configs from composite configs (#34410) 2024-11-05 11:34:01 +01:00
test_feature_extraction_common.py
test_image_processing_common.py Fix Qwen2VL processor to handle odd number of frames (#35431) 2025-01-08 13:49:00 +01:00
test_image_transforms.py
test_modeling_common.py Skip torchscript tests if a cache object is in model's outputs (#35596) 2025-01-10 10:46:03 +01:00
test_modeling_flax_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_modeling_tf_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_pipeline_mixin.py Add image text to text pipeline (#34170) 2024-10-31 15:48:11 -04:00
test_processing_common.py VLMs: major clean up 🧼 (#34502) 2025-01-08 10:35:23 +01:00
test_sequence_feature_extraction_common.py
test_tokenization_common.py [tokenizers] Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer (#35593) 2025-01-09 17:46:50 +01:00