Breaking the Context Bottleneck on Long Time Series Forecasting
- URL: http://arxiv.org/abs/2412.16572v1
- Date: Sat, 21 Dec 2024 10:29:34 GMT
- Title: Breaking the Context Bottleneck on Long Time Series Forecasting
- Authors: Chao Ma, Yikai Hou, Xiang Li, Yinggang Sun, Haining Yu, Zhou Fang, Jiaxing Qu
- Abstract summary: Long-term time-series forecasting is essential for planning and decision-making in economics, energy, and transportation.
Recent advancements have enhanced the efficiency of these models, but the challenge of effectively leveraging longer sequences persists.
We propose the Logsparse Decomposable Multiscaling (LDM) framework for the efficient and effective processing of long sequences.
- Score: 6.36010639533526
- License:
- Abstract: Long-term time-series forecasting is essential for planning and decision-making in economics, energy, and transportation, where long foresight is required. To obtain such long foresight, models must be both efficient and effective in processing long sequences. Recent advancements have enhanced the efficiency of these models; however, the challenge of effectively leveraging longer sequences persists. This is primarily due to the tendency of these models to overfit when presented with extended inputs, necessitating the use of shorter input lengths to maintain tolerable error margins. In this work, we investigate multiscale modeling and propose the Logsparse Decomposable Multiscaling (LDM) framework for the efficient and effective processing of long sequences. We demonstrate that by decoupling patterns at different scales in time series, we can enhance predictability by reducing non-stationarity, improve efficiency through a compact long input representation, and simplify the architecture by providing clear task assignments. Experimental results demonstrate that LDM not only outperforms all baselines in long-term forecasting benchmarks, but also reduces both training time and memory costs.
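To make the idea of decoupling scales concrete, here is a minimal, illustrative sketch of a multiscale decomposition of a long time series. It is not the paper's LDM implementation: the moving-average windows, the `decompose_multiscale` helper, and all parameter choices are assumptions made purely for illustration.

```python
import numpy as np

def moving_average(x: np.ndarray, window: int) -> np.ndarray:
    """Smooth a 1-D series with a centered, edge-padded moving average."""
    pad = window // 2
    padded = np.pad(x, (pad, window - 1 - pad), mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

def decompose_multiscale(x: np.ndarray, windows=(4, 16, 64)):
    """Split a series into detail components of increasing smoothness plus a trend.

    The components sum back to the input, so each downstream predictor can be
    assigned exactly one scale (an illustrative stand-in for the "clear task
    assignments" described in the abstract).
    """
    components = []
    residual = x.astype(float)
    for w in windows:
        trend = moving_average(residual, w)
        components.append(residual - trend)   # detail at this scale
        residual = trend                      # pass the smoother part on
    components.append(residual)               # coarsest trend
    return components

# Toy long input: slow trend + daily seasonality + noise
t = np.arange(2048)
series = 0.01 * t + np.sin(2 * np.pi * t / 24) + 0.3 * np.random.randn(t.size)
parts = decompose_multiscale(series)
assert np.allclose(sum(parts), series)        # the decomposition is lossless
```

The intuition, following the abstract, is that each separated component is closer to stationary than the raw series, so each scale can be handed to a simpler, dedicated predictor; the toy moving-average scheme above only stands in for whatever decomposition LDM actually uses.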
Related papers
- MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Anticipation [17.4088244981231]
This paper addresses the problem of long-term dense anticipation.
The goal of this task is to predict actions and their durations several minutes into the future based on provided video observations.
To address the inherent uncertainty of such long horizons, models are designed to predict several potential future action sequences.
arXiv Detail & Related papers (2025-01-15T14:46:44Z) - UmambaTSF: A U-shaped Multi-Scale Long-Term Time Series Forecasting Method Using Mamba [7.594115034632109]
We propose UmambaTSF, a novel long-term time series forecasting framework.
It integrates the multi-scale feature extraction capabilities of U-shaped encoder-decoder multilayer perceptrons (MLPs) with Mamba's long-sequence representation.
UmambaTSF achieves state-of-the-art performance and excellent generality on widely used benchmark datasets.
arXiv Detail & Related papers (2024-10-15T04:56:43Z) - TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting [49.6208017412376]
TimeBridge is a novel framework designed to bridge the gap between non-stationarity and dependency modeling.
TimeBridge consistently achieves state-of-the-art performance in both short-term and long-term forecasting.
arXiv Detail & Related papers (2024-10-06T10:41:03Z) - Test Time Learning for Time Series Forecasting [1.4605709124065924]
Test-Time Training (TTT) modules consistently outperform state-of-the-art models, including the Mamba-based TimeMachine.
Our results show significant improvements in Mean Squared Error (MSE) and Mean Absolute Error (MAE).
This work sets a new benchmark for time-series forecasting and lays the groundwork for future research in scalable, high-performance forecasting models.
arXiv Detail & Related papers (2024-09-21T04:40:08Z) - Multiscale Representation Enhanced Temporal Flow Fusion Model for Long-Term Workload Forecasting [19.426131129034115]
This paper proposes a novel framework leveraging self-supervised multiscale representation learning to capture both long-term and near-term workload patterns.
The long-term history is encoded through multiscale representations while the near-term observations are modeled via temporal flow fusion.
arXiv Detail & Related papers (2024-07-29T04:42:18Z) - Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers [58.5711048151424]
We introduce SPARSEK Attention, a novel sparse attention mechanism designed to overcome computational and memory obstacles.
Our approach integrates a scoring network and a differentiable top-k mask operator, SPARSEK, to select a constant number of KV pairs for each query (a simplified top-k attention sketch follows this list).
Experimental results reveal that SPARSEK Attention outperforms previous sparse attention methods.
arXiv Detail & Related papers (2024-06-24T15:55:59Z) - CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling [52.404072802235234]
We introduce Chunked Instruction-aware State Eviction (CItruS), a novel modeling technique that integrates the attention preferences useful for a downstream task into the eviction process of hidden states.
Our training-free method exhibits superior performance on long sequence comprehension and retrieval tasks over several strong baselines under the same memory budget.
arXiv Detail & Related papers (2024-06-17T18:34:58Z) - LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory [63.41820940103348]
The self-attention mechanism's computational cost limits its practicality for long sequences.
We propose a new method called LongVQ to compress the global abstraction as a length-fixed codebook.
LongVQ effectively maintains both dynamic global and local patterns, helping to compensate for the lack of long-range dependencies.
arXiv Detail & Related papers (2024-04-17T08:26:34Z) - Bidirectional Long-Range Parser for Sequential Data Understanding [3.76054468268713]
We introduce BLRP (Bidirectional Long-Range Parser), a novel and versatile attention mechanism designed to increase performance and efficiency on long-sequence tasks.
We show the benefits and versatility of our approach on vision and language domains by demonstrating competitive results against state-of-the-art methods.
arXiv Detail & Related papers (2024-04-08T05:45:03Z) - Grouped self-attention mechanism for a memory-efficient Transformer [64.0125322353281]
Real-world tasks such as forecasting weather, electricity consumption, and stock prices involve predicting data that vary over time.
Time-series data are generally recorded over a long period of observation with long sequences owing to their periodic characteristics and long-range dependencies over time.
We propose two novel modules, Grouped Self-Attention (GSA) and Compressed Cross-Attention (CCA).
Our proposed model exhibits reduced computational complexity and performance comparable to or better than that of existing methods.
arXiv Detail & Related papers (2022-10-02T06:58:49Z) - Long Short-Term Transformer for Online Action Detection [96.23884916995978]
Long Short-term TRansformer (LSTR) is a new temporal modeling algorithm for online action detection.
Compared to prior work, LSTR provides an effective and efficient method for modeling long videos with a simpler algorithmic design.
arXiv Detail & Related papers (2021-07-07T17:49:51Z)
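As referenced in the SPARSEK Attention entry above, that mechanism selects a constant number of key-value pairs per query using a scoring network and a differentiable top-k mask. The sketch below is only a rough NumPy approximation of the idea, not the paper's method: raw dot-product scores stand in for the learned scoring network, a hard top-k mask replaces the differentiable operator, and the function name, shapes, and `top_k` value are all assumptions.

```python
import numpy as np

def topk_sparse_attention(q, k, v, top_k=8):
    """Toy single-head attention that keeps only the top_k keys per query.

    q: (Lq, d); k, v: (Lk, d). For clarity this version masks instead of
    gathering, so it shows the selection logic but not the memory savings
    a real sparse-attention kernel would provide.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])                       # (Lq, Lk)
    keep = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]   # top_k key indices per query
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, keep, 0.0, axis=-1)                   # 0 where kept, -inf elsewhere
    masked = scores + mask
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)                # softmax over the kept keys only
    return weights @ v                                            # (Lq, d)

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(128, 32)) for _ in range(3))
out = topk_sparse_attention(q, k, v, top_k=8)
print(out.shape)                                                  # (128, 32)
```

Because each query's softmax mass is restricted to a fixed number of keys, the effective attention context per query stays constant as the sequence grows, which is the rough intuition behind the constant-size KV selection described in that abstract.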
This list is automatically generated from the titles and abstracts of the papers on this site.