iTransformer

Implementation of iTransformer - SOTA Time Series Forecasting using Attention networks, out of Tsinghua / Ant group

All that remains is tabular data (where xgboost is still the champion) before one can truly declare "Attention is all you need".

In before Apple gets the authors to change the name.

Appreciation

  • StabilityAI and 🤗 Huggingface for the generous sponsorship, as well as my other sponsors, for affording me the independence to open source current artificial intelligence techniques.

Install

$ pip install iTransformer

Usage

import torch
from iTransformer import iTransformer

# using solar energy settings

model = iTransformer(
    num_variates = 137,
    lookback_len = 96,               # or the lookback length in the paper
    dim = 256,                       # model dimensions
    depth = 6,                       # depth
    heads = 8,                       # attention heads
    dim_head = 64,                   # head dimension
    pred_length = (12, 24, 36, 48),  # can be one prediction, or many
    num_tokens_per_variate = 1       # experimental setting that projects each variate to more than one token. the idea is that the network can learn to divide up into time tokens for more granular attention across time. thanks to flash attention, you should be able to accommodate long sequence lengths just fine
)

time_series = torch.randn(2, 96, 137)  # (batch, lookback len, variates)

preds = model(time_series)

# preds -> Dict[int, Tensor[batch, pred_length, variate]]
#       -> (12: (2, 12, 137), 24: (2, 24, 137), 36: (2, 36, 137), 48: (2, 48, 137))
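
The returned dictionary can be indexed by each prediction length for training or evaluation. Below is a minimal training sketch, assuming mean squared error as the objective; the random target tensors and the Adam optimizer settings are illustrative stand-ins, not something prescribed by the library.

import torch
import torch.nn.functional as F
from iTransformer import iTransformer

model = iTransformer(
    num_variates = 137,
    lookback_len = 96,
    dim = 256,
    depth = 6,
    heads = 8,
    dim_head = 64,
    pred_length = (12, 24, 36, 48)
)

optimizer = torch.optim.Adam(model.parameters(), lr = 3e-4)

time_series = torch.randn(2, 96, 137)                            # (batch, lookback len, variates)
targets = {h: torch.randn(2, h, 137) for h in (12, 24, 36, 48)}  # stand-in ground truth futures

preds = model(time_series)                                       # dict keyed by prediction length

loss = sum(F.mse_loss(preds[h], targets[h]) for h in preds)      # sum the loss over all horizons
loss.backward()
optimizer.step()
optimizer.zero_grad()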

Todo

  • beef up the transformer with latest findings
  • improvise a 2d version - either global pool across time at end, or use a CLS token for attention pooling
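
For the CLS-token attention pooling mentioned in the last item, one possible shape is sketched below in plain PyTorch. This is not part of the repository; the CLSAttentionPool module and its tensor shapes are hypothetical, and the eventual design may differ.

import torch
from torch import nn

class CLSAttentionPool(nn.Module):
    # hypothetical module: a learned CLS token attends over per-variate tokens
    # to produce a single summary vector for the whole series
    def __init__(self, dim, heads = 8):
        super().__init__()
        self.cls_token = nn.Parameter(torch.randn(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first = True)

    def forward(self, tokens):                      # tokens: (batch, num tokens, dim)
        cls = self.cls_token.expand(tokens.shape[0], -1, -1)
        pooled, _ = self.attn(cls, tokens, tokens)  # CLS token as query, variate tokens as keys / values
        return pooled.squeeze(1)                    # (batch, dim)

pool = CLSAttentionPool(dim = 256)
tokens = torch.randn(2, 137, 256)                   # e.g. per-variate tokens out of the encoder
summary = pool(tokens)                              # (2, 256)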

Citation

@misc{liu2023itransformer,
    title   = {iTransformer: Inverted Transformers Are Effective for Time Series Forecasting},
    author  = {Yong Liu and Tengge Hu and Haoran Zhang and Haixu Wu and Shiyu Wang and Lintao Ma and Mingsheng Long},
    year    = {2023},
    eprint  = {2310.06625},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}
@misc{burtsev2020memory,
    title   = {Memory Transformer},
    author  = {Mikhail S. Burtsev and Grigory V. Sapunov},
    year    = {2020},
    eprint  = {2006.11527},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL}
}
@inproceedings{Darcet2023VisionTN,
    title   = {Vision Transformers Need Registers},
    author  = {Timoth{\'e}e Darcet and Maxime Oquab and Julien Mairal and Piotr Bojanowski},
    year    = {2023},
    url     = {https://api.semanticscholar.org/CorpusID:263134283}
}
@inproceedings{dao2022flashattention,
    title   = {Flash{A}ttention: Fast and Memory-Efficient Exact Attention with {IO}-Awareness},
    author  = {Dao, Tri and Fu, Daniel Y. and Ermon, Stefano and Rudra, Atri and R{\'e}, Christopher},
    booktitle = {Advances in Neural Information Processing Systems},
    year    = {2022}
}
