Linear RNNs for autoregressive generation of long music samples
- URL: http://arxiv.org/abs/2510.02401v1
- Date: Wed, 01 Oct 2025 17:26:54 GMT
- Title: Linear RNNs for autoregressive generation of long music samples
- Authors: Konrad Szewczyk, Daniel Gallo Fernández, James Townsend
- Abstract summary: We present a model, HarmonicRNN, which attains state of the art log-likelihoods and perceptual metrics on small-scale datasets.
- Score: 2.867517731896504
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Directly learning to generate audio waveforms in an autoregressive manner is a challenging task, due to the length of the raw sequences and the existence of important structure on many different timescales. Traditional approaches based on recurrent neural networks, as well as causal convolutions and self-attention, have only had limited success on this task. However, recent work has shown that deep state space models, also referred to as linear RNNs, can be highly efficient in this context. In this work, we push the boundaries of linear RNNs applied to raw audio modeling, investigating the effects of different architectural choices and using context-parallelism to enable training on sequences up to one minute (1M tokens) in length. We present a model, HarmonicRNN, which attains state of the art log-likelihoods and perceptual metrics on small-scale datasets.
Related papers
- Continuous-Time Piecewise-Linear Recurrent Neural Networks [10.4029480932728]
We aim to learn a generative surrogate model which approximates the underlying, data-generating DS. In scientific and medical areas, these models need to be mechanistically tractable.
arXiv Detail & Related papers (2026-02-17T15:16:12Z)
- MesaNet: Sequence Modeling by Locally Optimal Test-Time Training [67.45211108321203]
We introduce a numerically stable, chunkwise parallelizable version of the recently proposed Mesa layer. We show that optimal test-time training enables reaching lower language modeling perplexity and higher downstream benchmark performance than previous RNNs.
arXiv Detail & Related papers (2025-06-05T16:50:23Z)
- Bidirectional Linear Recurrent Models for Sequence-Level Multisource Fusion [10.867398697751742]
We introduce BLUR (Bidirectional Linear Unit for Recurrent network), which uses forward and backward linear recurrent units (LRUs) to capture both past and future dependencies with high computational efficiency. Experiments on sequential image and time series datasets reveal that BLUR not only surpasses transformers and traditional RNNs in accuracy but also significantly reduces computational costs.
arXiv Detail & Related papers (2025-04-11T20:42:58Z)
- RotRNN: Modelling Long Sequences with Rotations [7.037239398244858]
Linear recurrent neural networks, such as State Space Models (SSMs) and Linear Recurrent Units (LRUs), have recently shown state-of-the-art performance on long sequence modelling benchmarks.
We propose RotRNN -- a linear recurrent model which utilises the convenient properties of rotation matrices.
We show that RotRNN provides a simple and efficient model with a robust normalisation procedure, and a practical implementation that remains faithful to its theoretical derivation.
arXiv Detail & Related papers (2024-07-09T21:37:36Z)
- On the Resurgence of Recurrent Models for Long Sequences -- Survey and Research Opportunities in the Transformer Era [59.279784235147254]
This survey is aimed at providing an overview of these trends framed under the unifying umbrella of Recurrence.
It emphasizes novel research opportunities that become prominent when abandoning the idea of processing long sequences.
arXiv Detail & Related papers (2024-02-12T23:55:55Z)
- Hierarchically Gated Recurrent Neural Network for Sequence Modeling [36.14544998133578]
We propose a gated linear RNN model dubbed Hierarchically Gated Recurrent Neural Network (HGRN).
Experiments on language modeling, image classification, and long-range arena benchmarks showcase the efficiency and effectiveness of our proposed model.
arXiv Detail & Related papers (2023-11-08T16:50:05Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Online Evolutionary Neural Architecture Search for Multivariate Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems [24.0378100479104]
Recurrent neural networks (RNNs) are powerful models for processing time-series data.
The framework of reverse engineering a trained RNN by linearizing around its fixed points has provided insight, but the approach has significant challenges.
We present a new model that overcomes these limitations by co-training an RNN with a novel switching linear dynamical system (SLDS) formulation.
arXiv Detail & Related papers (2021-11-01T20:49:30Z)
- Online learning of windmill time series using Long Short-term Cognitive Networks [58.675240242609064]
The amount of data generated on windmill farms makes online learning the most viable strategy to follow.
We use Long Short-term Cognitive Networks (LSTCNs) to forecast windmill time series in online settings.
Our approach reported the lowest forecasting errors with respect to a simple RNN, a Long Short-term Memory, a Gated Recurrent Unit, and a Hidden Markov Model.
arXiv Detail & Related papers (2021-07-01T13:13:24Z)
- UnICORNN: A recurrent model for learning very long time dependencies [0.0]
We propose a novel RNN architecture based on a structure preserving discretization of a Hamiltonian system of second-order ordinary differential equations.
The resulting RNN is fast, invertible (in time), memory efficient and we derive rigorous bounds on the hidden state gradients to prove the mitigation of the exploding and vanishing gradient problem.
arXiv Detail & Related papers (2021-03-09T15:19:59Z)
- Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.