A PAC-Bayesian Generalization Bound for Equivariant Networks
- URL: http://arxiv.org/abs/2210.13150v1
- Date: Mon, 24 Oct 2022 12:07:03 GMT
- Title: A PAC-Bayesian Generalization Bound for Equivariant Networks
- Authors: Arash Behboodi, Gabriele Cesa, Taco Cohen
- Abstract summary: We derive norm-based PAC-Bayesian generalization bounds for equivariant networks.
The bound characterizes the impact of group size, and multiplicity and degree of irreducible representations on the generalization error.
In general, the bound indicates that using a larger group size in the model improves the generalization error, as substantiated by extensive numerical experiments.
- Score: 15.27608414735815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Equivariant networks capture the inductive bias about the symmetry of the
learning task by building those symmetries into the model. In this paper, we
study how equivariance relates to generalization error, utilizing PAC-Bayesian
analysis for equivariant networks, where the transformation laws of feature
spaces are determined by group representations. By using perturbation analysis
of equivariant networks in the Fourier domain for each layer, we derive norm-based
PAC-Bayesian generalization bounds. The bound characterizes the impact of group
size, and multiplicity and degree of irreducible representations on the
generalization error and thereby provides a guideline for selecting them. In
general, the bound indicates that using a larger group size in the model improves
the generalization error, as substantiated by extensive numerical experiments.
Related papers
- Understanding Generalization from Embedding Dimension and Distributional Convergence [13.491874401333021]
We study generalization from a representation-centric perspective and analyze how the geometry of learned embeddings controls predictive performance for a fixed trained model. We show that population risk can be bounded by two factors: (i) the intrinsic dimension of the embedding distribution, which determines the convergence rate of empirical embedding distribution to the population distribution in Wasserstein distance, and (ii) the sensitivity of the downstream mapping from embeddings to predictions, characterized by Lipschitz constants.
arXiv Detail & Related papers (2026-01-30T09:32:04Z)
- Towards A Unified PAC-Bayesian Framework for Norm-based Generalization Bounds [63.47271262149291]
We propose a unified framework for PAC-Bayesian norm-based generalization. The key to our approach is a sensitivity matrix that quantifies the network outputs with respect to structured weight perturbations. We derive a family of generalization bounds that recover several existing PAC-Bayesian results as special cases.
arXiv Detail & Related papers (2026-01-13T00:42:22Z)
- Implicit Bias and Invariance: How Hopfield Networks Efficiently Learn Graph Orbits [9.02293509509624]
We infer the full isomorphism class of a graph from a small random sample. Findings highlight a unifying mechanism for generalization in Hopfield networks.
arXiv Detail & Related papers (2025-12-16T12:06:58Z)
- Drawback of Enforcing Equivariance and its Compensation via the Lens of Expressive Power [75.44625156899468]
We investigate the impact of equivariance constraints on the expressivity of equivariant and layer-wise equivariant networks. We show that despite a larger model size, the resulting architecture could still correspond to a hypothesis space with lower complexity.
arXiv Detail & Related papers (2025-12-10T14:18:59Z)
- Mathematical Foundation of Interpretable Equivariant Surrogate Models [4.433915375867081]
This paper introduces a rigorous mathematical framework for neural network explainability.
The central concept involves quantifying the distance between GEOs by measuring the non-commutativity of specific diagrams.
We show how it can be applied in classical machine learning scenarios, like image classification with convolutional neural networks.
arXiv Detail & Related papers (2025-03-03T15:06:43Z)
- Generalization Bounds for Equivariant Networks on Markov Data [18.548000339222234]
We introduce a new McDiarmid's inequality to derive a generalization bound for neural networks trained on Markov datasets.
This bound provides practical insights into selecting low-dimensional irreducible representations, enhancing generalization performance for fixed-width equivariant neural networks.
arXiv Detail & Related papers (2025-03-01T01:53:48Z)
- Generalization for Least Squares Regression With Simple Spiked Covariances [3.9134031118910264]
The generalization properties of even two-layer neural networks trained by gradient descent remain poorly understood.
Recent work has made progress by describing the spectrum of the feature matrix at the hidden layer.
Yet, the generalization error for linear models with spiked covariances has not been previously determined.
arXiv Detail & Related papers (2024-10-17T19:46:51Z)
- Equivariant score-based generative models provably learn distributions with symmetries efficiently [7.90752151686317]
Empirical studies have demonstrated that incorporating symmetries into generative models can provide better generalization and sampling efficiency.
We provide the first theoretical analysis and guarantees of score-based generative models (SGMs) for learning distributions that are invariant with respect to some group symmetry.
arXiv Detail & Related papers (2024-10-02T05:14:28Z)
- Decomposition of Equivariant Maps via Invariant Maps: Application to Universal Approximation under Symmetry [3.0518581575184225]
We develop a theory about the relationship between invariant and equivariant maps with regard to a group $G$.
We leverage this theory in the context of deep neural networks with group symmetries in order to obtain novel insight into their mechanisms.
arXiv Detail & Related papers (2024-09-25T13:27:41Z)
- Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines [15.67299102925013]
We present a constructive universal approximation theorem for learning machines equipped with joint-group-equivariant feature maps.
Our main theorem also unifies the universal approximation theorems for both shallow and deep networks.
arXiv Detail & Related papers (2024-05-22T14:25:02Z)
- GIT: Detecting Uncertainty, Out-Of-Distribution and Adversarial Samples using Gradients and Invariance Transformations [77.34726150561087]
We propose a holistic approach for the detection of generalization errors in deep neural networks.
GIT combines the usage of gradient information and invariance transformations.
Our experiments demonstrate the superior performance of GIT compared to the state-of-the-art on a variety of network architectures.
arXiv Detail & Related papers (2023-07-05T22:04:38Z)
- Banana: Banach Fixed-Point Network for Pointcloud Segmentation with Inter-Part Equivariance [31.875925637190328]
In this paper, we present Banana, a Banach fixed-point network for equivariant segmentation with inter-part equivariance by construction.
Our key insight is to iteratively solve a fixed-point problem, where point-part assignment labels and per-part SE(3)-equivariance co-evolve simultaneously.
Our formulation naturally provides a strict definition of inter-part equivariance that generalizes to unseen inter-part configurations.
arXiv Detail & Related papers (2023-05-25T17:59:32Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- Towards Principled Disentanglement for Domain Generalization [90.9891372499545]
A fundamental challenge for machine learning models is generalizing to out-of-distribution (OOD) data.
We first formalize the OOD generalization problem as a constrained optimization problem, called Disentanglement-constrained Domain Generalization (DDG).
Based on the transformation, we propose a primal-dual algorithm for joint representation disentanglement and domain generalization.
arXiv Detail & Related papers (2021-11-27T07:36:32Z)
- Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals.
We propose a general graph estimator based on a novel structured fusion regularization.
We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantee.
arXiv Detail & Related papers (2021-03-05T04:42:32Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data [52.78581260260455]
We propose a general method to construct a convolutional layer that is equivariant to transformations from any specified Lie group.
We apply the same model architecture to images, ball-and-stick molecular data, and Hamiltonian dynamical systems.
arXiv Detail & Related papers (2020-02-25T17:40:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.