DDEQs: Distributional Deep Equilibrium Models through Wasserstein Gradient Flows
- URL: http://arxiv.org/abs/2503.01140v2
- Date: Sun, 23 Mar 2025 03:49:23 GMT
- Title: DDEQs: Distributional Deep Equilibrium Models through Wasserstein Gradient Flows
- Authors: Jonathan Geuter, Clément Bonet, Anna Korba, David Alvarez-Melis
- Abstract summary: Deep Equilibrium Models (DEQs) are a class of implicit neural networks that solve for a fixed point of a neural network in their forward pass. We present Distributional Deep Equilibrium Models (DDEQs), extending DEQs to discrete measure inputs, such as sets or point clouds. In experiments, we show that they can compete with state-of-the-art models in tasks such as point cloud classification and point cloud completion.
- Score: 13.420336353905675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Equilibrium Models (DEQs) are a class of implicit neural networks that solve for a fixed point of a neural network in their forward pass. Traditionally, DEQs take sequences as inputs, but have since been applied to a variety of data. In this work, we present Distributional Deep Equilibrium Models (DDEQs), extending DEQs to discrete measure inputs, such as sets or point clouds. We provide a theoretically grounded framework for DDEQs. Leveraging Wasserstein gradient flows, we show how the forward pass of the DEQ can be adapted to find fixed points of discrete measures under permutation-invariance, and derive adequate network architectures for DDEQs. In experiments, we show that they can compete with state-of-the-art models in tasks such as point cloud classification and point cloud completion, while being significantly more parameter-efficient.
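To make the DEQ mechanism concrete, here is a minimal sketch of a standard (Euclidean) DEQ forward pass via naive fixed-point iteration. The layer f(z, x) = tanh(Wz + Ux), the dimensions, and the tolerance are illustrative assumptions, not the paper's architecture; DDEQs replace this pointwise iteration with a Wasserstein gradient flow over the particles of a discrete input measure.

```python
import numpy as np

def deq_forward(x, W, U, tol=1e-6, max_iter=500):
    """Solve z* = f(z*, x) for f(z, x) = tanh(W @ z + U @ x) by plain
    fixed-point iteration (practical DEQs use Anderson acceleration
    or Broyden's method instead)."""
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_next = np.tanh(W @ z + U @ x)
        if np.linalg.norm(z_next - z) < tol:
            break
        z = z_next
    return z_next

rng = np.random.default_rng(0)
d = 8
W = 0.5 * rng.standard_normal((d, d)) / np.sqrt(d)  # scaled so f contracts
U = rng.standard_normal((d, d)) / np.sqrt(d)
x = rng.standard_normal(d)
z_star = deq_forward(x, W, U)
print(np.linalg.norm(z_star - np.tanh(W @ z_star + U @ x)))  # ~0: a fixed point
```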
Related papers
- Consistency Deep Equilibrium Models [8.278751626877431]
Deep Equilibrium Models (DEQs) have emerged as a powerful paradigm in deep learning. DEQs incur significant inference latency due to the iterative nature of fixed-point solvers. We introduce the Consistency Deep Equilibrium Model (C-DEQ) to accelerate DEQ inference.
arXiv Detail & Related papers (2026-02-03T02:42:48Z) - Gradient flow for deep equilibrium single-index models [32.2015869030351]
Deep equilibrium models (DEQs) have emerged as a powerful paradigm for training infinitely deep weight-tied neural networks. We rigorously study the gradient descent dynamics for DEQs in the simple setting of linear models and single-index models. We then prove linear convergence of gradient descent to a global minimizer for linear DEQs and deep equilibrium single-index models.
arXiv Detail & Related papers (2025-11-21T06:14:41Z) - AiDE-Q: Synthetic Labeled Datasets Can Enhance Learning Models for Quantum Property Estimation [83.22330172077308]
AiDE-Q iteratively generates high-quality synthetic labeled datasets. We conduct extensive numerical simulations on a diverse set of quantum many-body and molecular systems.
arXiv Detail & Related papers (2025-09-30T11:29:14Z) - Numerical PDE solvers outperform neural PDE solvers [5.303553599778495]
DeepFDM is a finite-difference framework for learning spatially varying coefficients in time-dependent partial differential equations. It enforces stability and first-order convergence via CFL-compliant coefficient parameterizations. It attains normalized mean-squared errors one to two orders of magnitude smaller than Fourier Neural Operators, U-Nets, and ResNets.
arXiv Detail & Related papers (2025-07-28T18:50:37Z) - Scale-Consistent Learning for Partial Differential Equations [79.48661503591943]
We propose a data augmentation scheme based on scale-consistency properties of PDEs. We then design a scale-informed neural operator that can model a wide range of scales. With scale-consistency, a model trained at $Re = 1000$ can generalize to $Re$ ranging from 250 to 10000.
arXiv Detail & Related papers (2025-07-24T21:29:52Z) - Generative Latent Neural PDE Solver using Flow Matching [8.397730500554047]
We propose a latent diffusion model for PDE simulation that embeds the PDE state in a lower-dimensional latent space.
Our framework uses an autoencoder to map different types of meshes onto a unified structured latent grid, capturing complex geometries.
Numerical experiments show that the proposed model outperforms several deterministic baselines in both accuracy and long-term stability.
arXiv Detail & Related papers (2025-03-28T16:44:28Z) - Mitigating Barren Plateaus in Quantum Neural Networks via an AI-Driven Submartingale-Based Framework [3.0617189749929348]
We propose AdaInit to mitigate barren plateaus (BPs) in quantum neural networks (QNNs). AdaInit iteratively synthesizes initial parameters for QNNs that yield non-negligible gradient variance, thereby mitigating BPs. We provide rigorous theoretical analyses of the submartingale-based process and empirically validate that AdaInit consistently outperforms existing methods in maintaining higher gradient variance across various QNN scales.
arXiv Detail & Related papers (2025-02-17T05:57:15Z) - Partial-differential-algebraic equations of nonlinear dynamics by Physics-Informed Neural-Network: (I) Operator splitting and framework assessment [51.3422222472898]
Several forms for constructing novel physics-informed neural networks (PINNs) for the solution of partial-differential-algebraic equations are proposed.
Among these novel methods are the PDE forms, which evolve from a lower-level form with fewer unknown dependent variables to higher-level forms with more dependent variables.
arXiv Detail & Related papers (2024-07-13T22:48:17Z) - Diffusion models for Gaussian distributions: Exact solutions and Wasserstein errors [0.0]
Diffusion and score-based models have recently shown high performance in image generation.
We study theoretically the behavior of diffusion models and their numerical implementation when the data distribution is Gaussian.
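As an illustration of why the Gaussian case is exactly solvable: under a variance-preserving (Ornstein-Uhlenbeck) forward SDE, a Gaussian initial law stays Gaussian, so the marginals and the score are available in closed form. The specific SDE, parameter values, and one-dimensional setting below are assumptions for the sketch, not necessarily the paper's setup.

```python
import numpy as np

def vp_marginal(t, mu0, sigma0, beta=1.0):
    """Exact marginal of dX = -0.5*beta*X dt + sqrt(beta) dW with
    X_0 ~ N(mu0, sigma0^2): X_t ~ N(m*mu0, m^2*sigma0^2 + 1 - m^2),
    where m = exp(-0.5*beta*t)."""
    m = np.exp(-0.5 * beta * t)
    return m * mu0, m**2 * sigma0**2 + 1.0 - m**2

def exact_score(x, t, mu0, sigma0, beta=1.0):
    """Closed-form score grad_x log p_t(x); linear in x for Gaussian data."""
    mean_t, var_t = vp_marginal(t, mu0, sigma0, beta)
    return -(x - mean_t) / var_t

print(exact_score(x=0.3, t=0.5, mu0=1.0, sigma0=0.2))
```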
arXiv Detail & Related papers (2024-05-23T07:28:56Z) - Positive concave deep equilibrium models [7.148312060227714]
Deep equilibrium (DEQ) models are a memory-efficient alternative to standard neural networks.
We introduce a novel class of DEQ models called positive concave deep equilibrium (pcDEQ) models.
Our approach, which is based on nonlinear Perron-Frobenius theory, enforces nonnegative weights and activation functions that are concave on the positive orthant.
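A minimal sketch of the kind of guarantee nonlinear Perron-Frobenius theory can buy: for a map that is monotone and concave on the positive orthant, the positive fixed point is unique and plain iteration converges, with no contraction constant to tune. The toy layer f(z) = W sqrt(z) + b with entrywise nonnegative W and positive b is an assumption for illustration, not the pcDEQ architecture.

```python
import numpy as np

def pcdeq_forward(W, b, tol=1e-10, max_iter=1000):
    """Fixed-point iteration for f(z) = W @ sqrt(z) + b with
    nonnegative W and positive b. f is monotone and concave on the
    positive orthant, so the positive fixed point is unique and
    plain iteration converges from a positive starting point."""
    z = np.ones(len(b))
    for _ in range(max_iter):
        z_next = W @ np.sqrt(z) + b
        if np.max(np.abs(z_next - z)) < tol:
            break
        z = z_next
    return z

rng = np.random.default_rng(1)
W = rng.uniform(0.0, 1.0, size=(5, 5))   # nonnegative weights
b = rng.uniform(0.5, 1.0, size=5)        # positive input injection
z_star = pcdeq_forward(W, b)
print(np.max(np.abs(z_star - (W @ np.sqrt(z_star) + b))))  # ~0
```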
arXiv Detail & Related papers (2024-02-06T14:24:29Z) - Deep Equilibrium Based Neural Operators for Steady-State PDEs [100.88355782126098]
We study the benefits of weight-tied neural network architectures for steady-state PDEs.
We propose FNO-DEQ, a deep equilibrium variant of the FNO architecture that directly solves for the solution of a steady-state PDE.
arXiv Detail & Related papers (2023-11-30T22:34:57Z) - Nonlinear dimensionality reduction then and now: AIMs for dissipative PDEs in the ML era [0.0]
This study presents a collection of purely data-driven methods for constructing reduced-order models (ROMs) for distributed dynamical systems.
The particular motivation is the so-called post-processing Galerkin method of Garcia-Archilla, Novo and Titi.
The proposed methodology can express the ROMs in terms of (a) theoretical (Fourier coefficients), (b) linear data-driven (POD modes) and/or (c) nonlinear data-driven (Diffusion Maps) coordinates.
arXiv Detail & Related papers (2023-10-24T13:10:43Z) - LatentPINNs: Generative physics-informed neural networks via a latent representation learning [0.0]
We introduce latentPINN, a framework that uses latent representations of the PDE parameters as inputs to PINNs, in addition to the coordinates.
We use a two-stage training scheme: in the first stage, we learn latent representations for the distribution of PDE parameters.
In the second stage, we train a physics-informed neural network over inputs given by randomly drawn samples from the coordinate space within the solution domain.
arXiv Detail & Related papers (2023-05-11T16:54:17Z) - Score-based Generative Modeling Through Backward Stochastic Differential Equations: Inversion and Generation [6.2255027793924285]
The proposed BSDE-based diffusion model represents a novel approach to diffusion modeling, which extends the application of stochastic differential equations (SDEs) in machine learning.
We demonstrate the theoretical guarantees of the model, the benefits of using Lipschitz networks for score matching, and its potential applications in various areas such as diffusion inversion, conditional diffusion, and uncertainty quantification.
arXiv Detail & Related papers (2023-04-26T01:15:35Z) - Global Convergence of Over-parameterized Deep Equilibrium Models [52.65330015267245]
A deep equilibrium model (DEQ) is implicitly defined through an equilibrium point of an infinite-depth weight-tied model with input injection.
Instead of performing infinitely many computations, it solves for an equilibrium point directly with root-finding and computes gradients with implicit differentiation.
We propose a novel probabilistic framework to overcome the technical difficulty in the non-asymptotic analysis of infinite-depth weight-tied models.
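A sketch of the implicit-differentiation step the summary refers to: once a fixed point z* = f(z*, x) is found, the implicit function theorem gives dL/dx by solving one linear system, with no backpropagation through the solver iterations. The toy layer f(z, x) = tanh(Wz + Ux) and the loss L(z) = sum(z) are illustrative assumptions.

```python
import numpy as np

def f(z, x, W, U):
    return np.tanh(W @ z + U @ x)

def solve_fixed_point(x, W, U, iters=500):
    z = np.zeros(W.shape[0])
    for _ in range(iters):
        z = f(z, x, W, U)
    return z

def implicit_grad_x(z_star, x, W, U, grad_L_z):
    """Gradient of a loss L(z*) w.r.t. the input x via the implicit
    function theorem: with J = df/dz at the fixed point,
    dL/dx = (df/dx)^T (I - J)^{-T} dL/dz."""
    pre = W @ z_star + U @ x
    D = np.diag(1.0 - np.tanh(pre) ** 2)   # derivative of tanh
    J = D @ W                              # df/dz at z*
    dfdx = D @ U                           # df/dx at z*
    u = np.linalg.solve((np.eye(len(z_star)) - J).T, grad_L_z)
    return dfdx.T @ u

rng = np.random.default_rng(2)
d = 6
W = 0.4 * rng.standard_normal((d, d)) / np.sqrt(d)  # contraction
U = rng.standard_normal((d, d)) / np.sqrt(d)
x = rng.standard_normal(d)
z_star = solve_fixed_point(x, W, U)
g = implicit_grad_x(z_star, x, W, U, grad_L_z=np.ones(d))  # L(z) = sum(z)
print(g)
```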
arXiv Detail & Related papers (2022-05-27T08:00:13Z) - Deep Equilibrium Optical Flow Estimation [80.80992684796566]
Recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms.
These RNNs impose large computation and memory overheads, and are not directly trained to model such stable estimation.
We propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer.
arXiv Detail & Related papers (2022-04-18T17:53:44Z) - Discrete Denoising Flows [87.44537620217673]
We introduce a new discrete flow-based model for categorical random variables: Discrete Denoising Flows (DDFs).
In contrast with other discrete flow-based models, our model can be locally trained without introducing gradient bias.
We show that DDFs outperform Discrete Flows on modeling a toy example, binary MNIST and Cityscapes segmentation maps, measured in log-likelihood.
arXiv Detail & Related papers (2021-07-24T14:47:22Z) - Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate the transition between the kernel and rich regimes empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)