transformers/tests
Jinan Zhou a91020aed0
Add TimesFM Time Series Forecasting Model (#34082)
* initial documentation

* rename mask to attention_mask

* smaller tests

* fixup

* fix copies

* move to time series section

* sort docs

* isort fix

* batch_size is not a configuration

* rename to TimesFMModelForPrediction

* initial script

* add check_outputs

* remove dropout_rate

* works with torch.Tensor inputs

* rename script

* fix docstrings

* fix freq when window_size is given

* add loss

* fix _quantile_loss

* formatting

* fix isort

* add weight init

* add support for sdpa and flash_attention_2

* fixes for flash_attention

* formatting

* remove flash_attention

* fix tests

* fix file name

* fix quantile loss

* added initial TimesFMModelIntegrationTests

* fix formatting

* fix import order

* fix _quantile_loss

* add doc for SDPA

* use timesfm 2.0

* bug fix in timesfm decode function.

* compare mean forecasts

* refactor type hints, use CamelCase

* consolidate decode func

* more readable code for weight conversion

* fix-copies

* simpler init

* renaem TimesFmMLP

* use T5LayerNorm

* fix tests

* use initializer_range

* TimesFmModel instead of TimesFmDecoder

* TimesFmPositionalEmbedding takes config for its init

* 2.0-500m-pytorch default configs

* use TimesFmModel

* fix formatting

* ignore TimesFmModel for testing

* fix docstring

* override generate as its not needed

* add doc strings

* fix logging

* add docstrings to output data classes

* initial copy from t5

* added config and attention layers

* add TimesFMPositionalEmbedding

* calcuate scale_factor once

* add more configs and TimesFMResidualBlock

* fix input_dims

* standardize code format with black

* remove unneeded modules

* TimesFM Model

* order of imports

* copy from Google official implementation

* remove covariate forecasting

* Adapting TimesFM to HF format

* restructing in progress

* adapted to HF convention

* timesfm test

* the model runs

* fixing unit tests

* fixing unit tests in progress

* add post_init

* do not change TimesFMOutput

* fixing unit tests

* all unit tests passed

* remove timesfm_layers

* add intermediate_size and initialize with config

* initial documentation

* rename mask to attention_mask

* smaller tests

* fixup

* fix copies

* move to time series section

* sort docs

* isort fix

* batch_size is not a configuration

* rename to TimesFMModelForPrediction

* initial script

* add check_outputs

* remove dropout_rate

* works with torch.Tensor inputs

* rename script

* fix docstrings

* fix freq when window_size is given

* add loss

* fix _quantile_loss

* formatting

* fix isort

* add weight init

* add support for sdpa and flash_attention_2

* fixes for flash_attention

* formatting

* remove flash_attention

* fix tests

* fix file name

* fix quantile loss

* added initial TimesFMModelIntegrationTests

* fix formatting

* fix import order

* fix _quantile_loss

* add doc for SDPA

* use timesfm 2.0

* bug fix in timesfm decode function.

* compare mean forecasts

* refactor type hints, use CamelCase

* consolidate decode func

* more readable code for weight conversion

* fix-copies

* simpler init

* renaem TimesFmMLP

* use T5LayerNorm

* fix tests

* use initializer_range

* TimesFmModel instead of TimesFmDecoder

* TimesFmPositionalEmbedding takes config for its init

* 2.0-500m-pytorch default configs

* use TimesFmModel

* fix formatting

* ignore TimesFmModel for testing

* fix docstring

* override generate as its not needed

* add doc strings

* fix logging

* add docstrings to output data classes

* add _CHECKPOINT_FOR_DOC

* fix comments

* Revert "fix comments"

This reverts commit 8deeb3e191.

* add _prepare_4d_attention_mask

* we do not have generative model classes

* use Cache

* return past_key_values

* modules initialized with config only

* update year

* Update docs/source/en/model_doc/timesfm.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add layer_idx to cache

* modular timesfm

* fix test

* unwrap sequential class

* fix toctree

* remove TimesFmOnnxConfig

* fix modular

* remove TimesFmStackedDecoder

* split qkv layer into individual layers

* rename projection layers

* use ALL_ATTENTION_FUNCTIONS

* is_causal is True

* rename config

* does not support flash_attn_2

* formatting

* fix typo in docsstring

* rename inputs

* add time series mapping

* Update src/transformers/models/olmo2/modeling_olmo2.py

* Update src/transformers/models/moonshine/modeling_moonshine.py

* use updated arguments

* fix class name

* add MODEL_FOR_TIME_SERIES_PREDICTION_MAPPING

* isort

* consolidate _preprocess into forward

* fix a typo

* fix a typo

* fix toc

* fix modular

* remove aaserts

* use self.config._attn_implementation

* move to _postprocess_output

* remove timesfm_get_large_negative_number

* use view unstead of multiple unsqueeze

* make helpers static methods of the Model

* use to_tuple

* use to_tuple if not return_dict

* remove unused intitialization block as its incorporated in nn.Linear

* remove unused num_key_value_groups

* use the same convention as the masking method

* update modular

* do not use unsqueeze

* use view instead of unsqueeze

* use buffer for inv_timescales

* formatting

* modular conversion

* remove unneeded intialization

* add missing docstrings

* remove cache

* use simple_eager_attention_forward

* support tp_plan

* support for flex and flash attention masks

* Revert "support for flex and flash attention masks"

This reverts commit def36c4fcf.

* fix device

* fix tests on gpu

* remove unsued large model test

* removed unneeded comments

* add example usage

* fix style

* add import

* Update docs/source/en/model_doc/timesfm.md

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* inherit from LlamaRMSNorm

* use can_return_tuple decorator

* remvoe return_dict

* fix year

* Update docs/source/en/model_doc/timesfm.md

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>

* pretrained does not inherit from GenerationMixin

* use model for integration test

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Rajat Sen <rsen91@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2025-04-16 15:00:53 +02:00
..
bettertransformer Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
deepspeed Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
extended Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
fixtures Implementation of SuperPoint and AutoModelForKeypointDetection (#28966) 2024-03-19 14:43:02 +00:00
fsdp Remove old code for PyTorch, Accelerator and tokenizers (#37234) 2025-04-10 20:54:21 +02:00
generation enable test_offloaded_cache_implementation on XPU (#37514) 2025-04-16 11:04:57 +02:00
models Add TimesFM Time Series Forecasting Model (#34082) 2025-04-16 15:00:53 +02:00
optimization Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
peft_integration enable several cases on XPU (#37516) 2025-04-16 11:01:04 +02:00
pipelines Simplify soft dependencies and update the dummy-creation process (#36827) 2025-04-11 11:08:36 +02:00
quantization Fixes hqq by following a new path for bias parameter in pre_quantized models (#37530) 2025-04-16 13:58:14 +02:00
repo_utils Simplify soft dependencies and update the dummy-creation process (#36827) 2025-04-11 11:08:36 +02:00
sagemaker Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
tensor_parallel enable tp on CPU (#36299) 2025-03-31 10:55:47 +02:00
tokenization Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
trainer Remove old code for PyTorch, Accelerator and tokenizers (#37234) 2025-04-10 20:54:21 +02:00
utils Simplify soft dependencies and update the dummy-creation process (#36827) 2025-04-11 11:08:36 +02:00
__init__.py GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
test_backbone_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_configuration_common.py Update composition flag usage (#36263) 2025-04-09 11:48:49 +02:00
test_feature_extraction_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_image_processing_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_image_transforms.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_common.py enable several cases on XPU (#37516) 2025-04-16 11:01:04 +02:00
test_modeling_flax_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_modeling_tf_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_pipeline_mixin.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_processing_common.py Add Qwen2.5-Omni (#36752) 2025-04-14 12:36:41 +02:00
test_sequence_feature_extraction_common.py Use Python 3.9 syntax in tests (#37343) 2025-04-08 14:12:08 +02:00
test_tokenization_common.py 🚨 🚨 Allow saving and loading multiple "raw" chat template files (#36588) 2025-04-11 16:37:23 +01:00
test_training_args.py Fix TrainingArguments.torch_empty_cache_steps post_init check (#36734) 2025-03-17 16:09:46 +01:00