Related papers: Geometric Flow Models over Neural Network Weights

Geometric Flow Models over Neural Network Weights

URL: http://arxiv.org/abs/2504.03710v1
Date: Thu, 27 Mar 2025 19:29:44 GMT
Title: Geometric Flow Models over Neural Network Weights
Authors: Ege Erdogan,
Abstract summary: A generative model of neural network weights would be useful for a diverse set of applications, such as deep learning, learned optimization, and transfer learning.<n>Existing work on weight-space generative models often ignores the symmetries of neural network weights, or only takes into account a subset of them.<n>We build on recent work on generative modeling with flow matching, and weight-space graph neural networks to design three different weight-space flows.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep generative models such as flow and diffusion models have proven to be effective in modeling high-dimensional and complex data types such as videos or proteins, and this has motivated their use in different data modalities, such as neural network weights. A generative model of neural network weights would be useful for a diverse set of applications, such as Bayesian deep learning, learned optimization, and transfer learning. However, the existing work on weight-space generative models often ignores the symmetries of neural network weights, or only takes into account a subset of them. Modeling those symmetries, such as permutation symmetries between subsequent layers in an MLP, the filters in a convolutional network, or scaling symmetries arising with the use of non-linear activations, holds the potential to make weight-space generative modeling more efficient by effectively reducing the dimensionality of the problem. In this light, we aim to design generative models in weight-space that more comprehensively respect the symmetries of neural network weights. We build on recent work on generative modeling with flow matching, and weight-space graph neural networks to design three different weight-space flows. Each of our flows takes a different approach to modeling the geometry of neural network weights, and thus allows us to explore the design space of weight-space flows in a principled way. Our results confirm that modeling the geometry of neural networks more faithfully leads to more effective flow models that can generalize to different tasks and architectures, and we show that while our flows obtain competitive performance with orders of magnitude fewer parameters than previous work, they can be further improved by scaling them up. We conclude by listing potential directions for future work on weight-space generative models.

Related papers

DeepWeightFlow: Re-Basined Flow Matching for Generating Neural Network Weights [10.97849774373198]
We present DeepWeightFlow, a Flow Matching model that operates directly in weight space to generate diverse and high-accuracy neural network weights.<n>The neural networks generated by DeepWeightFlow do not require fine-tuning to perform well and can scale to large networks.
arXiv Detail & Related papers (2026-01-08T15:56:28Z)
Rapid training of Hamiltonian graph networks using random features [1.1199585259018459]
Hamiltonian Graph Networks (HGN) can be trained up to 600x faster--but with comparable accuracy--by replacing iterative optimization with random feature-based parameter construction.<n>We show robust performance in diverse simulations, including N-body mass-spring and molecular systems in up to 3 dimensions and 10,000 particles with different geometries.<n>We reveal that even when trained on minimal 8-node systems, the model can generalize in a zero-shot manner to systems as large as 4096 nodes without retraining.
arXiv Detail & Related papers (2025-06-06T22:10:05Z)
Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships.<n>Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z)
Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning. Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs) Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models Lattice Boltzmann collision operators. Our work opens towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning. Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation. Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
A quatum inspired neural network for geometric modeling [14.214656118952178]
We introduce an innovative equivariant Matrix Product State (MPS)-based message-passing strategy. Our method effectively models complex many-body relationships, suppressing mean-field approximations. It seamlessly replaces the standard message-passing and layer-aggregation modules intrinsic to geometric GNNs.
arXiv Detail & Related papers (2024-01-03T15:59:35Z)
Vertical Layering of Quantized Neural Networks for Heterogeneous Inference [57.42762335081385]
We study a new vertical-layered representation of neural network weights for encapsulating all quantized models into a single one. We can theoretically achieve any precision network for on-demand service while only needing to train and maintain one model.
arXiv Detail & Related papers (2022-12-10T15:57:38Z)
Human Trajectory Prediction via Neural Social Physics [63.62824628085961]
Trajectory prediction has been widely pursued in many fields, and many model-based and model-free methods have been explored. We propose a new method combining both methodologies based on a new Neural Differential Equation model. Our new model (Neural Social Physics or NSP) is a deep neural network within which we use an explicit physics model with learnable parameters.
arXiv Detail & Related papers (2022-07-21T12:11:18Z)
Physics-Informed Neural State Space Models via Learning and Evolution [1.1086440815804224]
We study methods for discovering neural state space dynamics models for system identification. We employ an asynchronous genetic search algorithm that alternates between model selection and optimization.
arXiv Detail & Related papers (2020-11-26T23:35:08Z)
Physics-Based Deep Neural Networks for Beam Dynamics in Charged Particle Accelerators [0.0]
Taylor maps arising in the representation of dynamics are mapped onto the weights of a neural network. The resulting network approximates the dynamical system with perfect accuracy prior to training. We demonstrate our approach on the examples of the existing PETRA III and the planned PETRA IV storage rings at DESY.
arXiv Detail & Related papers (2020-07-07T15:33:11Z)
Learning Queuing Networks by Recurrent Neural Networks [0.0]
We propose a machine-learning approach to derive performance models from data. We exploit a deterministic approximation of their average dynamics in terms of a compact system of ordinary differential equations. This allows for an interpretable structure of the neural network, which can be trained from system measurements to yield a white-box parameterized model.
arXiv Detail & Related papers (2020-02-25T10:56:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.