Flow Equivariant Recurrent Neural Networks
- URL: http://arxiv.org/abs/2507.14793v1
- Date: Sun, 20 Jul 2025 02:52:21 GMT
- Title: Flow Equivariant Recurrent Neural Networks
- Authors: T. Anderson Keller
- Abstract summary: In machine learning, neural network architectures that respect symmetries of their data are called equivariant. We extend equivariant network theory to the regime of 'flows', capturing natural transformations over time. We show that these models significantly outperform their non-equivariant counterparts in terms of training speed, length generalization, and velocity generalization.
- Score: 2.900810893770134
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data arrives at our senses as a continuous stream, smoothly transforming from one instant to the next. These smooth transformations can be viewed as continuous symmetries of the environment that we inhabit, defining equivalence relations between stimuli over time. In machine learning, neural network architectures that respect symmetries of their data are called equivariant and have provable benefits in terms of generalization ability and sample efficiency. To date, however, equivariance has been considered only for static transformations and feed-forward networks, limiting its applicability to sequence models, such as recurrent neural networks (RNNs), and corresponding time-parameterized sequence transformations. In this work, we extend equivariant network theory to this regime of 'flows' -- one-parameter Lie subgroups capturing natural transformations over time, such as visual motion. We begin by showing that standard RNNs are generally not flow equivariant: their hidden states fail to transform in a geometrically structured manner for moving stimuli. We then show how flow equivariance can be introduced, and demonstrate that these models significantly outperform their non-equivariant counterparts in terms of training speed, length generalization, and velocity generalization, on both next step prediction and sequence classification. We present this work as a first step towards building sequence models that respect the time-parameterized symmetries which govern the world around us.
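The equivariance property at stake can be checked numerically. The sketch below is an illustrative construction under simplifying assumptions (a 1-D ring, circular convolutions, a single known velocity), not the paper's actual architecture, and all names in it are hypothetical: a vanilla convolutional RNN fed a translating stimulus does not produce a shifted copy of the hidden state it produces for the static stimulus, whereas a variant whose hidden state is carried along the flow (shifted by the velocity after every update) does.

```python
import numpy as np

def shift(x, s):
    """Cyclic translation of a 1-D signal by s pixels (the 'flow' action)."""
    return np.roll(x, s)

def circ_conv(x, w):
    """Circular convolution; commutes with cyclic shifts."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(w)))

def conv_rnn(seq, W_in, W_rec, v_hidden=0):
    """Toy convolutional RNN on a ring. If v_hidden != 0, the hidden state is
    shifted by v_hidden pixels after each update, i.e. carried along the flow."""
    h = np.zeros(seq.shape[1])
    for x in seq:
        h = shift(np.tanh(circ_conv(h, W_rec) + circ_conv(x, W_in)), v_hidden)
    return h

rng = np.random.default_rng(0)
N, T, v = 32, 6, 2                                    # ring size, steps, velocity (px/step)
W_in, W_rec = rng.normal(size=N), rng.normal(size=N)
frame = rng.normal(size=N)
moving = np.stack([shift(frame, v * t) for t in range(T)])   # stimulus flowing at velocity v
static = np.stack([frame] * T)                               # the same stimulus, not moving

# Vanilla RNN: hidden state for the moving stimulus is NOT a shifted copy of the static one.
print(np.allclose(conv_rnn(moving, W_in, W_rec),
                  shift(conv_rnn(static, W_in, W_rec), v * T)))      # generically False

# Flow-carried variant: hidden state transforms exactly with the flow.
print(np.allclose(conv_rnn(moving, W_in, W_rec, v_hidden=v),
                  shift(conv_rnn(static, W_in, W_rec), v * T)))      # True
```

The first check generically prints False (the standard RNN is not flow equivariant); the second prints True, because shifting the hidden state with the flow makes the recurrent update commute with the time-parameterized translation of the input.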
Related papers
- Generalized Linear Mode Connectivity for Transformers [87.32299363530996]
A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-loss paths. Prior work has predominantly focused on neuron re-ordering through permutations, but such approaches are limited in scope. We introduce a unified framework that captures four symmetry classes: permutations, semi-permutations, transformations, and general invertible maps. This generalization enables, for the first time, the discovery of low- and zero-barrier linear paths between independently trained Vision Transformers and GPT-2 models.
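For concreteness, the generic sketch below shows how a linear-interpolation loss barrier between two independently trained models is typically measured; `loss_fn` and the parameter dictionaries are placeholders, and the framework summarized above would first align the two models using one of its symmetry classes before interpolating.

```python
import numpy as np

def interpolate(theta_a, theta_b, alpha):
    """Pointwise linear interpolation between two same-shaped parameter dicts."""
    return {k: (1 - alpha) * theta_a[k] + alpha * theta_b[k] for k in theta_a}

def loss_barrier(loss_fn, theta_a, theta_b, n_points=11):
    """Worst excess loss along the straight line between two models, relative to the
    linear baseline between endpoint losses; near zero means linear mode connectivity."""
    alphas = np.linspace(0.0, 1.0, n_points)
    losses = np.array([loss_fn(interpolate(theta_a, theta_b, a)) for a in alphas])
    baseline = (1 - alphas) * losses[0] + alphas * losses[-1]
    return float(np.max(losses - baseline))
```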
arXiv Detail & Related papers (2025-06-28T01:46:36Z) - Improving Equivariant Networks with Probabilistic Symmetry Breaking [9.164167226137664]
Equivariant networks encode known symmetries into neural networks, often enhancing generalization. However, equivariant maps cannot break the self-symmetries of their inputs: a symmetric input is necessarily mapped to an equally symmetric output. This poses an important problem, both (1) for prediction tasks on domains where self-symmetries are common, and (2) for generative models, which must break symmetries in order to reconstruct from highly symmetric latent spaces. We present novel theoretical results that establish sufficient conditions for representing such distributions.
arXiv Detail & Related papers (2025-03-27T21:04:49Z) - Unsupervised Representation Learning from Sparse Transformation Analysis [79.94858534887801]
We propose to learn representations from sequence data by factorizing the transformations of the latent variables into sparse components.
Input data are first encoded as distributions of latent activations and subsequently transformed using a probability flow model.
arXiv Detail & Related papers (2024-10-07T23:53:25Z) - Relative Representations: Topological and Geometric Perspectives [53.88896255693922]
Relative representations are an established approach to zero-shot model stitching. First, we introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations. Second, we propose to deploy topological densification when fine-tuning relative representations, a topological regularization loss encouraging clustering within classes.
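For context, a minimal sketch of the standard relative-representation construction that this line of work builds on: each embedding is described by its cosine similarities to a fixed set of anchor embeddings. The additional normalization for non-isotropic rescalings and permutations proposed in the paper is not reproduced here, and the names below are illustrative.

```python
import numpy as np

def relative_representation(Z, A, eps=1e-8):
    """Encode embeddings Z (n, d) by cosine similarity to anchor embeddings A (k, d).
    The result is invariant to rotations and isotropic rescalings of the latent
    space, which is what enables zero-shot model stitching."""
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + eps)
    An = A / (np.linalg.norm(A, axis=1, keepdims=True) + eps)
    return Zn @ An.T                     # (n, k) similarity matrix
```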
arXiv Detail & Related papers (2024-09-17T08:09:22Z) - EqNIO: Subequivariant Neural Inertial Odometry [33.96552018734359]
We show that IMU data transforms equivariantly when rotated around the gravity vector and reflected with respect to arbitrary planes parallel to gravity. We then map the IMU data into such an equivariant frame, thereby achieving an invariant canonicalization that can be directly used with off-the-shelf inertial odometry networks.
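A rough sketch of what a yaw-invariant canonicalization can look like is given below; the cited paper obtains the frame with a (sub)equivariant network, whereas the heuristic here simply aligns the mean horizontal acceleration with a fixed axis, and all names are illustrative.

```python
import numpy as np

def yaw_canonicalize(acc, gyro):
    """Rotate accelerometer/gyroscope windows (T, 3) about the gravity (z) axis so
    that the mean horizontal acceleration points along +x. Yaw-rotated copies of the
    same window then map to (approximately) the same canonical input."""
    heading = acc[:, :2].mean(axis=0)
    yaw = np.arctan2(heading[1], heading[0])
    c, s = np.cos(-yaw), np.sin(-yaw)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return acc @ R.T, gyro @ R.T
```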
arXiv Detail & Related papers (2024-08-12T17:42:46Z) - Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs). Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators. Our work opens the way towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z) - Unnatural Algorithms in Machine Learning [0.0]
We show that optimization algorithms with this property can be viewed as discrete approximations of natural gradient descent.
We present a simple method for introducing this naturality more generally and examine a number of popular machine learning training algorithms.
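For reference, natural gradient descent preconditions the ordinary gradient with the inverse Fisher information matrix. The sketch below is a generic, damped version of that update, with the Fisher matrix assumed to be estimated elsewhere; it is not code from the paper.

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr=0.1, damping=1e-4):
    """theta <- theta - lr * (F + damping*I)^{-1} grad, the (damped) natural
    gradient update that ordinary optimizers can be viewed as approximating."""
    F = fisher + damping * np.eye(theta.size)
    return theta - lr * np.linalg.solve(F, grad)
```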
arXiv Detail & Related papers (2023-12-07T22:43:37Z) - Deformation Robust Roto-Scale-Translation Equivariant CNNs [10.44236628142169]
Group-equivariant convolutional neural networks (G-CNNs) achieve significantly improved generalization performance by building intrinsic symmetries into the architecture. General theory and practical implementation of G-CNNs have been studied for planar images under either rotation or scaling transformations.
arXiv Detail & Related papers (2021-11-22T03:58:24Z) - Topographic VAEs learn Equivariant Capsules [84.33745072274942]
We introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables.
We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST.
We demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.
arXiv Detail & Related papers (2021-09-03T09:25:57Z) - Physical invariance in neural networks for subgrid-scale scalar flux modeling [5.333802479607541]
We present a new strategy to model the subgrid-scale scalar flux in a three-dimensional turbulent incompressible flow using physics-informed neural networks (NNs).
We show that the proposed transformation-invariant NN model outperforms both purely data-driven ones and parametric state-of-the-art subgrid-scale models.
arXiv Detail & Related papers (2020-10-09T16:09:54Z) - Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
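A minimal sketch of liquid time-constant dynamics, assuming the fused semi-implicit Euler form of the update for dx/dt = -[1/tau + f(x, u)] * x + f(x, u) * A; the parameter names and the sigmoid gate below are illustrative rather than taken from the paper.

```python
import numpy as np

def ltc_step(x, inp, tau, A, W, U, b, dt=0.05):
    """One fused (semi-implicit) Euler step of a liquid time-constant cell.
    The effective time constant 1 / (1/tau + f) depends on the input, so the
    cell's speed of adaptation is itself 'liquid', and the update stays bounded."""
    f = 1.0 / (1.0 + np.exp(-(W @ x + U @ inp + b)))   # state- and input-dependent gate
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))
```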
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.