Sequence Modeling with Spectral Mean Flows
- URL: http://arxiv.org/abs/2510.15366v1
- Date: Fri, 17 Oct 2025 06:56:57 GMT
- Title: Sequence Modeling with Spectral Mean Flows
- Authors: Jinwoo Kim, Max Beier, Petar Bevanda, Nayun Kim, Seunghoon Hong,
- Abstract summary: A key question in sequence modeling with neural networks is how to represent and learn highly nonlinear and probabilistic state dynamics. We propose a new approach to sequence modeling based on an operator-theoretic view of a hidden Markov model (HMM). A generative process is then defined as maximum mean discrepancy (MMD) gradient flow in the space of sequences.
- Score: 18.38715347739777
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A key question in sequence modeling with neural networks is how to represent and learn highly nonlinear and probabilistic state dynamics. Operator theory views such dynamics as linear maps on Hilbert spaces containing mean embedding vectors of distributions, offering an appealing but currently overlooked perspective. We propose a new approach to sequence modeling based on an operator-theoretic view of a hidden Markov model (HMM). Instead of materializing stochastic recurrence, we embed the full sequence distribution as a tensor in the product Hilbert space. A generative process is then defined as maximum mean discrepancy (MMD) gradient flow in the space of sequences. To overcome challenges with large tensors and slow sampling convergence, we introduce spectral mean flows, a novel tractable algorithm integrating two core concepts. First, we propose a new neural architecture by leveraging spectral decomposition of linear operators to derive a scalable tensor network decomposition of sequence mean embeddings. Second, we extend MMD gradient flows to time-dependent Hilbert spaces and connect them to flow matching via the continuity equation, enabling simulation-free learning and faster sampling. We demonstrate competitive results on a range of time-series modeling datasets. Code is available at https://github.com/jw9730/spectral-mean-flow.
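The abstract's central sampling idea, before the spectral decomposition and flow-matching machinery, is an MMD gradient flow: particles representing the model's samples are pushed along the gradient of the squared MMD toward a target sample. Below is a minimal, self-contained sketch of that generic idea with an RBF kernel; it is not the paper's spectral mean flow algorithm, and the kernel choice, bandwidth, step size, and toy 2-D data are illustrative assumptions.

```python
# Generic MMD gradient flow on a particle set (illustrative sketch only).
import numpy as np

def rbf_grad_x(X, Y, bw=2.0):
    """Gradient of k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 bw^2)) with respect to x_i."""
    diff = X[:, None, :] - Y[None, :, :]                  # (n, m, d)
    k = np.exp(-(diff ** 2).sum(-1) / (2.0 * bw ** 2))    # (n, m)
    return -diff * (k[..., None] / bw ** 2)               # (n, m, d)

def mmd_flow_step(particles, target, bw=2.0, step=0.5):
    """One explicit Euler step of MMD gradient flow: each particle moves along the
    negative gradient of the witness function, i.e. it is attracted to the target
    sample and repelled from the other particles."""
    attract = rbf_grad_x(particles, target, bw).mean(axis=1)    # pull toward target sample
    repel = rbf_grad_x(particles, particles, bw).mean(axis=1)   # push apart from each other
    return particles + step * (attract - repel)

rng = np.random.default_rng(0)
particles = rng.normal(size=(128, 2))         # initial samples near the origin
target = rng.normal(loc=3.0, size=(256, 2))   # samples from the target distribution
for _ in range(300):
    particles = mmd_flow_step(particles, target)
print(particles.mean(axis=0))                 # particle mean drifts toward ~[3, 3]
```

The simulation-based loop above is exactly the slow-sampling issue the abstract mentions; the paper's contribution replaces it with a time-dependent, flow-matching formulation and a tensor-network parameterization of sequence mean embeddings.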
Related papers
- Synergizing Transport-Based Generative Models and Latent Geometry for Stochastic Closure Modeling [1.665466637453776]
We show that flow matching in a lower-dimensional latent space is suited for fast sampling of closure models. We control the latent space distortion and thus ensure the physical fidelity of the sampled closure term.
arXiv Detail & Related papers (2026-02-19T05:24:00Z) - Disordered Dynamics in High Dimensions: Connections to Random Matrices and Machine Learning [52.26396748560348]
We provide an overview of high dimensional dynamical systems driven by random matrices. We focus on applications to simple models of learning and generalization in machine learning theory.
arXiv Detail & Related papers (2026-01-03T00:12:32Z) - Hierarchical Stochastic Differential Equation Models for Latent Manifold Learning in Neural Time Series [0.0]
We propose a novel hierarchical stochastic differential equation (SDE) model that balances computational efficiency and interpretability. We derive training and inference procedures and show that the computational cost of inference scales linearly with the length of the observation data.
arXiv Detail & Related papers (2025-07-29T06:51:58Z) - Unfolding Generative Flows with Koopman Operators: Fast and Interpretable Sampling [25.420559502119485]
Continuous Normalizing Flows (CNFs) enable elegant generative modeling but remain bottlenecked by slow sampling. Recent approaches such as Rectified Flow and OT-CFM accelerate sampling by straightening trajectories, yet the learned dynamics remain nonlinear black boxes. We propose globally linearizing flow dynamics via Koopman theory.
arXiv Detail & Related papers (2025-06-27T15:16:16Z) - Sequential-Parallel Duality in Prefix Scannable Models [68.39855814099997]
Recent developments have given rise to various models, such as Gated Linear Attention (GLA) and Mamba. This raises a natural question: can we characterize the full class of neural sequence models that support near-constant-time parallel evaluation and linear-time, constant-space sequential inference?
arXiv Detail & Related papers (2025-06-12T17:32:02Z) - Reduced-Order Neural Operators: Learning Lagrangian Dynamics on Highly Sparse Graphs [19.1312659245072]
We present GIOROM, a data-driven, discretization-invariant framework for accelerating Lagrangian simulations through reduced-order modeling (ROM). We leverage a data-driven graph-based neural approximation of the PDE solution operator. GIOROM achieves a 6.6$\times$-32$\times$ reduction in input dimensionality while maintaining high-fidelity reconstructions across diverse Lagrangian regimes.
arXiv Detail & Related papers (2024-07-04T13:37:26Z) - Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimensional modelling.
We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z) - Tangent Bundle Convolutional Learning: from Manifolds to Cellular Sheaves and Back [84.61160272624262]
We define tangent bundle filters and tangent bundle neural networks (TNNs) based on a tangent bundle convolution operation.
Tangent bundle filters admit a spectral representation that generalizes the ones of scalar manifold filters, graph filters and standard convolutional filters in continuous time.
We numerically evaluate the effectiveness of the proposed architecture on various learning tasks.
arXiv Detail & Related papers (2023-03-20T17:57:15Z) - Stochastic Mirror Descent in Average Ensemble Models [38.38572705720122]
Stochastic mirror descent (SMD) is a general class of training algorithms that includes the celebrated stochastic gradient descent (SGD) as a special case.
In this paper we explore the performance of the stochastic mirror descent algorithm on mean-field ensemble models.
arXiv Detail & Related papers (2022-10-27T11:04:00Z) - Learning and Inference in Sparse Coding Models with Langevin Dynamics [3.0600309122672726]
We describe a system capable of inference and learning in a probabilistic latent variable model.
We demonstrate this idea for a sparse coding model by deriving a continuous-time equation for inferring its latent variables via Langevin dynamics.
We show that Langevin dynamics lead to an efficient procedure for sampling from the posterior distribution in the 'L0 sparse' regime, where latent variables are encouraged to be set to zero as opposed to having a small L1 norm.
arXiv Detail & Related papers (2022-04-23T23:16:47Z) - Graph Gamma Process Generalized Linear Dynamical Systems [60.467040479276704]
We introduce graph gamma process (GGP) generalized linear dynamical systems to model real-valued multivariate time series.
For temporal pattern discovery, the latent representation under the model is used to decompose the time series into a parsimonious set of multivariate sub-sequences.
We use the generated random graph, whose number of nonzero-degree nodes is finite, to define both the sparsity pattern and dimension of the latent state transition matrix.
arXiv Detail & Related papers (2020-07-25T04:16:34Z) - Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations (a rough sketch of this construction follows the list below).
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
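As a companion to the last entry, here is a rough sketch of a liquid time-constant style update: each unit is a linear first-order ODE whose effective time constant is modulated by a learned nonlinearity, which keeps the hidden state bounded. The sigmoid gate, the semi-implicit Euler step, and all shapes and constants below are simplifying assumptions for illustration, not the authors' exact cell.

```python
# Sketch of a liquid time-constant (LTC) style recurrent update (assumed, simplified form).
import numpy as np

def ltc_step(x, inp, W_in, W_rec, b, tau, A, dt=0.05):
    """Semi-implicit Euler step of dx/dt = -(1/tau + f) * x + f * A, where the gate
    f = sigmoid(W_in @ inp + W_rec @ x + b) modulates the effective time constant."""
    f = 1.0 / (1.0 + np.exp(-(W_in @ inp + W_rec @ x + b)))   # gate in (0, 1)
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

rng = np.random.default_rng(0)
n_units, n_in = 8, 3
W_in = 0.5 * rng.normal(size=(n_units, n_in))
W_rec = 0.5 * rng.normal(size=(n_units, n_units))
b = np.zeros(n_units)
tau = np.ones(n_units)           # base time constants
A = rng.normal(size=n_units)     # bias values the gate pulls each unit's state toward
x = np.zeros(n_units)
for t in range(100):
    inp = np.array([np.sin(0.1 * t), np.cos(0.1 * t), 1.0])
    x = ltc_step(x, inp, W_in, W_rec, b, tau, A)
print(np.round(x, 3))            # hidden state stays bounded over the rollout
```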
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.