Tuning Frequency Bias of State Space Models
- URL: http://arxiv.org/abs/2410.02035v1
- Date: Wed, 2 Oct 2024 21:04:22 GMT
- Title: Tuning Frequency Bias of State Space Models
- Authors: Annan Yu, Dongwei Lyu, Soon Hoe Lim, Michael W. Mahoney, N. Benjamin Erichson
- Abstract summary: State space models (SSMs) leverage linear, time-invariant (LTI) systems to learn sequences with long-range dependencies.
We find that SSMs exhibit an implicit bias toward capturing low-frequency components more effectively than high-frequency ones.
- Score: 48.60241978021799
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State space models (SSMs) leverage linear, time-invariant (LTI) systems to effectively learn sequences with long-range dependencies. By analyzing the transfer functions of LTI systems, we find that SSMs exhibit an implicit bias toward capturing low-frequency components more effectively than high-frequency ones. This behavior aligns with the broader notion of frequency bias in deep learning model training. We show that the initialization of an SSM assigns it an innate frequency bias and that training the model in a conventional way does not alter this bias. Based on our theory, we propose two mechanisms to tune frequency bias: either by scaling the initialization to tune the inborn frequency bias; or by applying a Sobolev-norm-based filter to adjust the sensitivity of the gradients to high-frequency inputs, which allows us to change the frequency bias via training. Using an image-denoising task, we empirically show that we can strengthen, weaken, or even reverse the frequency bias using both mechanisms. By tuning the frequency bias, we can also improve SSMs' performance on learning long-range sequences, averaging an 88.26% accuracy on the Long-Range Arena (LRA) benchmark tasks.
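The two mechanisms can be illustrated concretely. Below is a minimal sketch in NumPy, assuming an S4D-style diagonal initialization with eigenvalues -1/2 + iπn and a Sobolev-style frequency weight (1 + ω²)^(β/2); the function names and exact parameterization are illustrative, not the paper's code.

```python
import numpy as np

def init_eigenvalues(n_state, alpha=1.0):
    """Hypothetical S4D-style diagonal initialization; rescaling the
    imaginary parts by `alpha` shifts which frequencies the SSM favors
    at initialization (mechanism 1: tuning the inborn bias)."""
    n = np.arange(n_state)
    return -0.5 + 1j * alpha * np.pi * n

def sobolev_loss(y_pred, y_true, beta=0.0):
    """L2 loss computed in the frequency domain and reweighted by a
    Sobolev-norm-style filter (mechanism 2): beta > 0 amplifies the
    gradient signal from high-frequency errors, beta < 0 damps it, and
    beta = 0 recovers the plain L2 loss up to a constant (Parseval)."""
    err = np.fft.rfft(y_pred - y_true)
    omega = 2 * np.pi * np.fft.rfftfreq(len(y_pred))
    weight = (1.0 + omega**2) ** (beta / 2)
    return np.mean(np.abs(weight * err) ** 2)
```

Scaling `alpha` changes the bias before training ever starts, while `beta` changes how training itself redistributes attention across frequencies; per the abstract, either knob can strengthen, weaken, or reverse the bias.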
Related papers
- Towards Combating Frequency Simplicity-biased Learning for Domain Generalization [36.777767173275336]
Domain generalization methods aim to learn transferable knowledge from source domains that can generalize well to unseen target domains.
Recent studies show that neural networks frequently suffer from simplicity-biased learning behavior, leading to over-reliance on specific frequency sets.
We propose two effective data augmentation modules designed to collaboratively and adaptively adjust the frequency characteristics of the dataset.
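The abstract does not specify the two modules, but as a flavor of frequency-characteristic augmentation, here is a common amplitude-spectrum mixing transform; treat it as a generic illustration rather than the paper's method.

```python
import numpy as np

def mix_amplitude(img_a, img_b, lam=0.5):
    """Blend the amplitude spectra of two images while keeping the
    phase of the first, altering frequency statistics but not layout."""
    fa, fb = np.fft.fft2(img_a), np.fft.fft2(img_b)
    amp = (1 - lam) * np.abs(fa) + lam * np.abs(fb)
    return np.real(np.fft.ifft2(amp * np.exp(1j * np.angle(fa))))
```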
arXiv Detail & Related papers (2024-10-21T16:17:01Z)
- FreSh: Frequency Shifting for Accelerated Neural Representation Learning [11.175745750843484]
Implicit Neural Representations (INRs) have recently gained attention as a powerful approach for continuously representing signals such as images, videos, and 3D shapes using multilayer perceptrons (MLPs).
MLPs are known to exhibit a low-frequency bias, limiting their ability to capture high-frequency details accurately.
We propose frequency shifting (or FreSh) to align the frequency spectrum of the initial output with that of the target signal.
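A hedged sketch of the underlying idea (the paper's exact matching criterion may differ): estimate where the spectral mass of the target sits relative to the untrained model's output, and rescale the input-embedding frequencies by that ratio before training.

```python
import numpy as np

def dominant_freq(signal):
    """Frequency (cycles/sample) with the largest spectral magnitude."""
    spec = np.abs(np.fft.rfft(signal - signal.mean()))
    return np.fft.rfftfreq(len(signal))[np.argmax(spec)]

def embedding_scale(target, initial_output):
    """Hypothetical helper: factor by which to rescale the MLP's
    positional-embedding frequencies so initial and target spectra align."""
    return dominant_freq(target) / max(dominant_freq(initial_output), 1e-8)
```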
arXiv Detail & Related papers (2024-10-07T14:05:57Z)
- Oscillatory State-Space Models [61.923849241099184]
We propose Linear Oscillatory State-Space models (LinOSS) for efficiently learning on long sequences.
A stable discretization, integrated over time using fast associative parallel scans, yields the proposed state-space model.
We show that LinOSS is universal, i.e., it can approximate any continuous and causal operator mapping between time-varying functions.
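The parallel-scan trick rests on the fact that composing steps of a linear recurrence is associative. A minimal sketch for the diagonal case x_t = a_t x_{t-1} + b_t follows; LinOSS's actual oscillatory, second-order discretization is more involved, so this only shows the scan structure.

```python
import numpy as np

def combine(left, right):
    """Compose two recurrence segments: applying (a1, b1) then (a2, b2)
    equals applying (a2*a1, a2*b1 + b2). Associativity of this rule is
    what permits a logarithmic-depth parallel scan."""
    a1, b1 = left
    a2, b2 = right
    return (a2 * a1, a2 * b1 + b2)

def linear_scan(a, b):
    """Sequential reference; a parallel scan applies `combine` in a tree."""
    acc, out = (a[0], b[0]), [b[0]]
    for t in range(1, len(a)):
        acc = combine(acc, (a[t], b[t]))
        out.append(acc[1])
    return np.array(out)  # states x_1..x_T with x_0 = 0
```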
arXiv Detail & Related papers (2024-10-04T22:00:13Z)
- Exploring Cross-Domain Few-Shot Classification via Frequency-Aware Prompting [37.721042095518044]
Cross-Domain Few-Shot Learning has made great strides with the development of meta-learning.
We propose a Frequency-Aware Prompting method with mutual attention for Cross-Domain Few-Shot classification.
arXiv Detail & Related papers (2024-06-24T08:14:09Z)
- Fredformer: Frequency Debiased Transformer for Time Series Forecasting [8.356290446630373]
The Transformer model has shown leading performance in time series forecasting.
It tends to learn low-frequency features in the data and overlook high-frequency features, showing a frequency bias.
We propose Fredformer, a framework designed to mitigate frequency bias by learning features equally across different frequency bands.
arXiv Detail & Related papers (2024-06-13T11:29:21Z)
- Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs [86.35471039808023]
We introduce the Incremental Fourier Neural Operator (iFNO), which progressively increases the number of frequency modes used by the model.
We show that iFNO reduces total training time while maintaining or improving generalization performance across various datasets.
Our method achieves a 10% lower testing error while using 20% fewer frequency modes than the existing Fourier Neural Operator, along with 30% faster training.
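The core idea can be sketched in a few lines (illustrative, not the iFNO code): a spectral layer acts only on the lowest `k` Fourier modes, and `k` grows on a schedule so the model fits coarse structure first and adds detail later.

```python
import numpy as np

def spectral_layer(x, weights, k):
    """Keep and transform only the first k Fourier modes of a 1-D signal."""
    xf = np.fft.rfft(x)
    out = np.zeros_like(xf)
    out[:k] = weights[:k] * xf[:k]
    return np.fft.irfft(out, n=len(x))

def mode_schedule(epoch, k0=4, k_max=64, every=10):
    """Double the number of active modes every `every` epochs."""
    return min(k0 * 2 ** (epoch // every), k_max)
```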
arXiv Detail & Related papers (2022-11-28T09:57:15Z)
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness the structure in frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1).
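The blueprint can be contrasted with the conventional pattern in a short sketch (diagonal frequency-domain layers, nonlinearities omitted; an assumption-laden toy, not the T1 architecture):

```python
import numpy as np

def t1_forward(x, layers):
    """'Transform once': a single FFT in, all layers applied in the
    frequency domain, a single inverse FFT out, instead of an
    FFT/iFFT pair wrapped around every layer."""
    z = np.fft.rfft(x)
    for w in layers:          # each w: per-mode complex weights
        z = w * z
    return np.fft.irfft(z, n=len(x))
```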
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
- SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping [51.698273019061645]
SpecGrad adapts the diffusion noise so that its time-varying spectral envelope becomes close to the conditioning log-mel spectrogram.
The shaping is performed in the time-frequency domain, keeping the computational cost almost the same as that of conventional DDPM-based neural vocoders.
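A hedged sketch of adaptive noise spectral shaping (not SpecGrad's exact filter design): multiply the STFT of white noise by a time-varying magnitude envelope derived from the conditioning spectrogram, then invert.

```python
import numpy as np
from scipy.signal import stft, istft

def shape_noise(envelope, nperseg=256):
    """`envelope`: (nperseg//2 + 1, frames) target magnitudes.
    Returns noise whose local spectra follow the envelope."""
    hop = nperseg // 2
    noise = np.random.randn(envelope.shape[1] * hop)
    _, _, z = stft(noise, nperseg=nperseg)
    z = z[:, : envelope.shape[1]] * envelope  # impose the envelope
    _, shaped = istft(z, nperseg=nperseg)
    return shaped
```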
arXiv Detail & Related papers (2022-03-31T02:08:27Z)
- Adaptive Frequency Learning in Two-branch Face Forgery Detection [66.91715092251258]
We propose to adaptively learn frequency information in a two-branch detection framework, dubbed AFD.
We liberate the network from fixed frequency transforms and achieve better performance with data- and task-dependent transform layers.
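One way to read "data- and task-dependent transform layers" is a learnable linear transform initialized at a fixed basis such as the DCT; this sketch is an illustration of that reading, not the AFD architecture.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis; a learnable layer initialized here can
    drift away from the fixed transform as training demands."""
    k, i = np.arange(n)[:, None], np.arange(n)[None, :]
    m = np.sqrt(2 / n) * np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    m[0] /= np.sqrt(2)
    return m

transform = dct_matrix(64)  # would be a trainable parameter in practice
```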
arXiv Detail & Related papers (2022-03-27T14:25:52Z)
- Robust Learning with Frequency Domain Regularization [1.370633147306388]
We introduce a new regularization method that constrains the frequency spectra of the model's filters.
We demonstrate the effectiveness of our regularization by (1) defending against adversarial perturbations; (2) reducing the generalization gap across different architectures; and (3) improving generalization in transfer learning scenarios without fine-tuning.
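One concrete form such a constraint can take (the paper's exact formulation may differ): zero-pad a convolution kernel, take its 2-D spectrum, and penalize energy outside a low-frequency radius.

```python
import numpy as np

def high_freq_penalty(kernel, pad=32, radius=8):
    """Sum of squared spectral magnitudes outside a centered
    low-frequency disk; add this term to the training loss."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(kernel, s=(pad, pad))))
    yy, xx = np.indices(spec.shape)
    c = pad // 2
    mask = (yy - c) ** 2 + (xx - c) ** 2 > radius ** 2
    return np.sum(spec[mask] ** 2)
```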
arXiv Detail & Related papers (2020-07-07T07:29:20Z)