The Informer model, built on its ProbSparse self-attention mechanism, achieves strong performance in long-sequence time-series forecasting. However, when confronted with time-series data that exhibit multi-scale characteristics and substantial noise, its attention mechanism shows inherent limitations: the model is easily distracted by local noise or irrelevant patterns, which weakens its focus on globally critical information and degrades forecasting accuracy. To address this challenge, this study proposes an enhanced architecture that integrates a Gated Attention mechanism into the original Informer framework. The mechanism employs learnable gating functions to adaptively and selectively weight crucial temporal segments and discriminative feature dimensions of the input sequence, suppressing noise interference while amplifying core dynamic patterns. This strengthens the model's ability to represent complex temporal dynamics and ultimately improves its predictive performance.
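Since the abstract does not specify the exact gating formulation, the following is only a minimal sketch of how learnable temporal and feature-wise gates might modulate an attention block. The class name `GatedAttentionBlock`, the use of standard multi-head attention in place of Informer's ProbSparse attention, and the sigmoid-gated residual design are all assumptions made for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class GatedAttentionBlock(nn.Module):
    """Illustrative sketch: an attention layer whose output is modulated by
    learnable sigmoid gates along the time and feature axes (assumed design)."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        # Standard multi-head self-attention stands in for Informer's
        # ProbSparse attention, which is not reproduced here.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Learnable gating functions: one scalar gate per time step and one
        # gate per feature dimension, both conditioned on the input.
        self.temporal_gate = nn.Linear(d_model, 1)
        self.feature_gate = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        attn_out, _ = self.attn(x, x, x)
        # Sigmoid gates in (0, 1): down-weight noisy time steps / feature
        # dimensions, emphasize informative ones.
        g_t = torch.sigmoid(self.temporal_gate(x))   # (batch, seq_len, 1)
        g_f = torch.sigmoid(self.feature_gate(x))    # (batch, seq_len, d_model)
        gated = attn_out * g_t * g_f
        # Residual connection keeps the original signal path intact.
        return self.norm(x + gated)


if __name__ == "__main__":
    block = GatedAttentionBlock(d_model=64)
    dummy = torch.randn(2, 96, 64)        # batch of 2, 96 time steps
    print(block(dummy).shape)             # torch.Size([2, 96, 64])
```

In this sketch the gates act multiplicatively on the attention output before the residual connection, so a gate value near zero effectively masks a noisy segment or feature while values near one pass it through unchanged.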