transformers/docs/source/en/model_doc/patchtst.md
Gift Sinthong 2ac5b9325e
[time series] Add PatchTST (#25927)
* Initial commit of PatchTST model classes

Co-authored-by: Phanwadee Sinthong <phsinthong@gmail.com>
Co-authored-by: Nam Nguyen <namctin@gmail.com>
Co-authored-by: Vijay Ekambaram <vijaykr.e@gmail.com>
Co-authored-by: Ngoc Diep Do <55230119+diepi@users.noreply.github.com>
Co-authored-by: Wesley Gifford <79663411+wgifford@users.noreply.github.com>

* Add PatchTSTForPretraining

* update to include classification

Co-authored-by: Phanwadee Sinthong <phsinthong@gmail.com>
Co-authored-by: Nam Nguyen <namctin@gmail.com>
Co-authored-by: Vijay Ekambaram <vijaykr.e@gmail.com>
Co-authored-by: Ngoc Diep Do <55230119+diepi@users.noreply.github.com>
Co-authored-by: Wesley Gifford <79663411+wgifford@users.noreply.github.com>

* clean up auto files

* Add PatchTSTForPrediction

* Fix relative import

* Replace original PatchTSTEncoder with ChannelAttentionPatchTSTEncoder

* temporary adding absolute path + add PatchTSTForForecasting class

* Update base PatchTSTModel + Unittest

* Update ForecastHead to use the config class

* edit cv_random_masking, add mask to model output

* Update configuration_patchtst.py

* add masked_loss to the pretraining

* add PatchEmbeddings

* Update configuration_patchtst.py

* edit loss which considers mask in the pretraining

* remove patch_last option

* Add commits from internal repo

* Update ForecastHead

* Add model weight initilization + unittest

* Update PatchTST unittest to use local import

* PatchTST integration tests for pretraining and prediction

* Added PatchTSTForRegression + update unittest to include label generation

* Revert unrelated model test file

* Combine similar output classes

* update PredictionHead

* Update configuration_patchtst.py

* Add Revin

* small edit to PatchTSTModelOutputWithNoAttention

* Update modeling_patchtst.py

* Updating integration test for forecasting

* Fix unittest after class structure changed

* docstring updates

* change input_size to num_input_channels

* more formatting

* Remove some unused params

* Add a comment for pretrained models

* add channel_attention option

add channel_attention option and remove unused positional encoders.

* Update PatchTST models to use HF's MultiHeadAttention module

* Update paper + github urls

* Fix hidden_state return value

* Update integration test to use PatchTSTForForecasting

* Adding dataclass decorator for model output classes

* Run fixup script

* Rename model repos for integration test

* edit argument explanation

* change individual option to shared_projection

* style

* Rename integration test + import cleanup

* Fix outpu_hidden_states return value

* removed unused mode

* added std, mean and nops scaler

* add initial distributional loss for predition

* fix typo in docs

* add generate function

* formatting

* add num_parallel_samples

* Fix a typo

* copy weighted_average function, edit PredictionHead

* edit PredictionHead

* add distribution head to forecasting

* formatting

* Add generate function for forecasting

* Add generate function to prediction task

* formatting

* use argsort

* add past_observed_mask ordering

* fix arguments

* docs

* add back test_model_outputs_equivalence test

* formatting

* cleanup

* formatting

* use ACT2CLS

* formatting

* fix add_start_docstrings decorator

* add distribution head and generate function to regression task

add distribution head and generate function to regression task. Also made add PatchTSTForForecastingOutput,  PatchTSTForRegressionOutput.

* add distribution head and generate function to regression task

add distribution head and generate function to regression task. Also made add PatchTSTForForecastingOutput,  PatchTSTForRegressionOutput.

* fix typos

* add forecast_masking

* fixed tests

* use set_seed

* fix doc test

* formatting

* Update docs/source/en/model_doc/patchtst.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* better var names

* rename PatchTSTTranspose

* fix argument names and docs string

* remove compute_num_patches and unused class

* remove assert

* renamed to PatchTSTMasking

* use num_labels for classification

* use num_labels

* use default num_labels from super class

* move model_type after docstring

* renamed PatchTSTForMaskPretraining

* bs -> batch_size

* more review fixes

* use hidden_state

* rename encoder layer and block class

* remove commented seed_number

* edit docstring

* Add docstring

* formatting

* use past_observed_mask

* doc suggestion

* make fix-copies

* use Args:

* add docstring

* add docstring

* change some variable names and add PatchTST before some class names

* formatting

* fix argument types

* fix tests

* change x variable to patch_input

* format

* formatting

* fix-copies

* Update tests/models/patchtst/test_modeling_patchtst.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* move loss to forward

* Update src/transformers/models/patchtst/modeling_patchtst.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/patchtst/modeling_patchtst.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/patchtst/modeling_patchtst.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/patchtst/modeling_patchtst.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/patchtst/modeling_patchtst.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* formatting

* fix a bug when pre_norm is set to True

* output_hidden_states is set to False as default

* set pre_norm=True as default

* format docstring

* format

* output_hidden_states is None by default

* add missing docs

* better var names

* docstring: remove default to False in output_hidden_states

* change labels name to target_values in regression task

* format

* fix tests

* change to forecast_mask_ratios and random_mask_ratio

* change mask names

* change future_values to target_values param in the prediction class

* remove nn.Sequential and make PatchTSTBatchNorm class

* black

* fix argument name for prediction

* add output_attentions option

* add output_attentions to PatchTSTEncoder

* formatting

* Add attention output option to all classes

* Remove PatchTSTEncoderBlock

* create PatchTSTEmbedding class

* use config in PatchTSTPatchify

* Use config in PatchTSTMasking class

* add channel_attn_weights

* Add PatchTSTScaler class

* add output_attentions arg to test function

* format

* Update doc with image patchtst.md

* fix-copies

* rename Forecast <-> Prediction

* change name of a few parameters to match with PatchTSMixer.

* Remove *ForForecasting class to match with other time series models.

* make style

* Remove PatchTSTForForecasting in the test

* remove PatchTSTForForecastingOutput class

* change test_forecast_head to test_prediction_head

* style

* fix docs

* fix tests

* change num_labels to num_targets

* Remove PatchTSTTranspose

* remove arguments in PatchTSTMeanScaler

* remove arguments in PatchTSTStdScaler

* add config as an argument to all the scaler classes

* reformat

* Add norm_eps for batchnorm and layernorm

* reformat.

* reformat

* edit docstring

* update docstring

* change variable name pooling to pooling_type

* fix output_hidden_states as tuple

* fix bug when calling PatchTSTBatchNorm

* change stride to patch_stride

* create PatchTSTPositionalEncoding class and restructure the PatchTSTEncoder

* formatting

* initialize scalers with configs

* edit output_hidden_states

* style

* fix forecast_mask_patches doc string

---------

Co-authored-by: Gift Sinthong <gift.sinthong@ibm.com>
Co-authored-by: Nam Nguyen <namctin@gmail.com>
Co-authored-by: Vijay Ekambaram <vijaykr.e@gmail.com>
Co-authored-by: Ngoc Diep Do <55230119+diepi@users.noreply.github.com>
Co-authored-by: Wesley Gifford <79663411+wgifford@users.noreply.github.com>
Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com>
Co-authored-by: nnguyen <nnguyen@us.ibm.com>
Co-authored-by: Ngoc Diep Do <diiepy@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-13 19:06:32 +01:00

3.4 KiB

PatchTST

Overview

The PatchTST model was proposed in A Time Series is Worth 64 Words: Long-term Forecasting with Transformers by Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, Jayant Kalagnanam.

The abstract from the paper is the following:

We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches which are served as input tokens to Transformer; (ii) channel-independence where each channel contains a single univariate time series that shares the same embedding and Transformer weights across all the series. Patching design naturally has three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced given the same look-back window; and the model can attend longer history. Our channel-independent patch time series Transformer (PatchTST) can improve the long-term forecasting accuracy significantly when compared with that of SOTA Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, which outperforms supervised training on large datasets. Transferring of masked pre-trained representation on one dataset to others also produces SOTA forecasting accuracy.

Tips:

The model can also be used for time series classification and time series regression. See the respective [PatchTSTForClassification] and [PatchTSTForRegression] classes.

At a high level the model vectorizes time series into patches of a given size and encodes them via a Transformer which then outputs the prediction length forecasts:

model

This model was contributed by namctin, gsinthong, diepi, vijaye12, wmgifford, and kashif.

The original code can be found here.

PatchTSTConfig

autodoc PatchTSTConfig

PatchTSTModel

autodoc PatchTSTModel - forward

PatchTSTForPrediction

autodoc PatchTSTForPrediction - forward

PatchTSTForClassification

autodoc PatchTSTForClassification - forward

PatchTSTForPretraining

autodoc PatchTSTForPretraining - forward

PatchTSTForRegression

autodoc PatchTSTForRegression - forward