Filter then Attend: Improving attention-based Time Series Forecasting with Spectral Filtering
- URL: http://arxiv.org/abs/2508.20206v1
- Date: Wed, 27 Aug 2025 18:33:57 GMT
- Title: Filter then Attend: Improving attention-based Time Series Forecasting with Spectral Filtering
- Authors: Elisha Dayag, Nhat Thanh Van Tran, Jack Xin
- Abstract summary: Learnable frequency filters can be an integral part of a deep forecasting model by enhancing the model's spectral utilization. In this paper, we establish that adding a filter to the beginning of transformer-based models enhances their performance in long time-series forecasting.
- Score: 2.0901018134712297
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformer-based models are at the forefront in long time-series forecasting (LTSF). While in many cases these models are able to achieve state-of-the-art results, they suffer from a bias toward low frequencies in the data and from high computational and memory requirements. Recent work has established that learnable frequency filters can be an integral part of a deep forecasting model by enhancing the model's spectral utilization. These works choose to use a multilayer perceptron to process their filtered signals and thus do not solve the issues found with transformer-based models. In this paper, we establish that adding a filter to the beginning of transformer-based models enhances their performance in long time-series forecasting. We add learnable filters, which contribute only $\approx 1000$ additional parameters, to several transformer-based models and observe in multiple instances a 5-10% relative improvement in forecasting performance. Additionally, we find that with filters added, we are able to decrease the embedding dimension of our models, resulting in transformer-based architectures that are both smaller and more effective than their non-filtering base models. We also conduct synthetic experiments to analyze how the filters enable transformer-based models to better utilize the full spectrum for forecasting.
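The abstract does not spell out the paper's exact filter parameterization, but the general idea it describes (a learnable per-frequency reweighting applied to the input series before the transformer backbone) can be sketched as follows. This is a minimal illustration, assuming a complex-valued weight per real-FFT frequency bin; NumPy stands in for a deep-learning framework, where `weights` would be a trainable parameter optimized jointly with the forecaster, and the function name `spectral_filter` is chosen here for illustration.

```python
import numpy as np

def spectral_filter(x, weights):
    """Apply a learnable frequency filter to a batch of time series.

    x:       (batch, length) real-valued series.
    weights: (length // 2 + 1,) complex coefficients, one per rFFT bin;
             in a full model these would be learned with the backbone.
    """
    spec = np.fft.rfft(x, axis=-1)        # transform to the frequency domain
    filtered = spec * weights             # per-frequency reweighting
    # back to the time domain; the result feeds the transformer backbone
    return np.fft.irfft(filtered, n=x.shape[-1], axis=-1)

# Sanity check: an all-ones filter is the identity map.
x = np.sin(np.linspace(0, 8 * np.pi, 96))[None, :]
w = np.ones(96 // 2 + 1, dtype=complex)
assert np.allclose(spectral_filter(x, w), x)
```

Since the filter is a single complex vector of length `length // 2 + 1`, its parameter count stays in the hundreds-to-thousands range for typical LTSF lookback windows, which is consistent with the abstract's $\approx 1000$ extra parameters.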
Related papers
- FilterTS: Comprehensive Frequency Filtering for Multivariate Time Series Forecasting [13.7064358833964]
FilterTS is a novel forecasting model that utilizes specialized filtering techniques based on the frequency domain. FilterTS significantly outperforms existing methods in terms of prediction accuracy and computational efficiency.
arXiv Detail & Related papers (2025-05-07T06:19:00Z) - Filtering with Time-frequency Analysis: An Adaptive and Lightweight Model for Sequential Recommender Systems Based on Discrete Wavelet Transform [0.8246494848934447]
We design an adaptive time-frequency filter with the DWT technique, which decomposes user interests into multiple signals with different frequencies and times, and can automatically learn the weights of these signals. We also develop DWTRec, a model for sequential recommendation based entirely on the adaptive time-frequency filter. Experiments show that our model outperforms state-of-the-art baseline models on datasets with different domains, sparsity levels, and average sequence lengths.
arXiv Detail & Related papers (2025-03-30T13:28:42Z) - FilterNet: Harnessing Frequency Filters for Time Series Forecasting [34.83702192033196]
FilterNet is built upon our proposed learnable frequency filters to extract key informative temporal patterns by selectively passing or attenuating certain components of time series signals.
Equipped with the two filters, FilterNet can approximately surrogate the linear and attention mappings widely adopted in the time series literature.
arXiv Detail & Related papers (2024-11-03T16:20:41Z) - Closed-form Filtering for Non-linear Systems [83.91296397912218]
We propose a new class of filters based on Gaussian PSD Models, which offer several advantages in terms of density approximation and computational efficiency.
We show that filtering can be efficiently performed in closed form when transitions and observations are Gaussian PSD Models.
Our proposed estimator enjoys strong theoretical guarantees, with estimation error that depends on the quality of the approximation and is adaptive to the regularity of the transition probabilities.
arXiv Detail & Related papers (2024-02-15T08:51:49Z) - Parsimony or Capability? Decomposition Delivers Both in Long-term Time Series Forecasting [46.63798583414426]
Long-term time series forecasting (LTSF) represents a critical frontier in time series analysis.
Our study demonstrates, through both analytical and empirical evidence, that decomposition is key to containing excessive model inflation.
Remarkably, by tailoring decomposition to the intrinsic dynamics of time series data, our proposed model outperforms existing benchmarks.
arXiv Detail & Related papers (2024-01-22T13:15:40Z) - VST++: Efficient and Stronger Visual Saliency Transformer [74.26078624363274]
We develop an efficient and stronger VST++ model to explore global long-range dependencies.
We evaluate our model across various transformer-based backbones on RGB, RGB-D, and RGB-T SOD benchmark datasets.
arXiv Detail & Related papers (2023-10-18T05:44:49Z) - Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler [103.97487121678276]
Filter pruning simultaneously accelerates the computation and reduces the memory overhead of CNNs.
We propose a novel Knowledge-driven Differential Filter Sampler (KDFS) with a Masked Filter Modeling (MFM) framework for filter pruning.
arXiv Detail & Related papers (2023-07-01T02:28:41Z) - Combining Slow and Fast: Complementary Filtering for Dynamics Learning [9.11991227308599]
We propose a learning-based approach to dynamics model learning.
We also propose a hybrid model that requires an additional physics-based simulator.
arXiv Detail & Related papers (2023-02-27T13:32:47Z) - Computational Doob's h-transforms for Online Filtering of Discretely
Observed Diffusions [65.74069050283998]
We propose a computational framework to approximate Doob's $h$-transforms.
The proposed approach can be orders of magnitude more efficient than state-of-the-art particle filters.
arXiv Detail & Related papers (2022-06-07T15:03:05Z) - FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain.
Experiments on three benchmarks demonstrate significant performance gains, outperforming the state-of-the-art methods by margins of 3%, 4%, and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z) - Filter-enhanced MLP is All You Need for Sequential Recommendation [89.0974365344997]
On online platforms, logged user behavior data inevitably contains noise.
We borrow the idea of filtering algorithms from signal processing that attenuates the noise in the frequency domain.
We propose FMLP-Rec, an all-MLP model with learnable filters for the sequential recommendation task.
arXiv Detail & Related papers (2022-02-28T05:49:35Z) - Towards data-driven filters in Paraview [0.0]
We develop filters that expose the abilities of pre-trained machine learning models to the visualization system.
The filters transform the input data by feeding it into the model and then provide the model's output as input to the remaining visualization pipeline.
A series of simplistic use cases for segmentation and classification on image and fluid data is presented.
arXiv Detail & Related papers (2021-08-11T13:02:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.