A PAC-Bayesian Generalization Bound for Equivariant Networks
- URL: http://arxiv.org/abs/2210.13150v1
- Date: Mon, 24 Oct 2022 12:07:03 GMT
- Title: A PAC-Bayesian Generalization Bound for Equivariant Networks
- Authors: Arash Behboodi, Gabriele Cesa, Taco Cohen
- Abstract summary: We derive norm-based PAC-Bayesian generalization bounds for equivariant networks.
The bound characterizes the impact of the group size, and of the multiplicity and degree of the irreducible representations, on the generalization error.
In general, the bound indicates that using a larger group in the model improves the generalization error, a claim substantiated by extensive numerical experiments.
- Score: 15.27608414735815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Equivariant networks capture the inductive bias about the symmetry of the
learning task by building those symmetries into the model. In this paper, we
study how equivariance relates to generalization error, using PAC-Bayesian
analysis for equivariant networks in which the transformation laws of the feature
spaces are determined by group representations. By applying a perturbation analysis
to equivariant networks in the Fourier domain, layer by layer, we derive norm-based
PAC-Bayesian generalization bounds. The bound characterizes the impact of the group
size, and of the multiplicity and degree of the irreducible representations, on the
generalization error, and thereby provides a guideline for selecting them. In
general, the bound indicates that using a larger group in the model improves
the generalization error, as substantiated by extensive numerical experiments.
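The paper's exact norm-based bound is not reproduced here; as a point of reference, the classical PAC-Bayesian bound (Langford-Seeger/Maurer form) that such norm-based analyses refine reads as follows, where P is a data-independent prior, Q a posterior over hypotheses, and m the sample size:

```latex
% Classical PAC-Bayes bound (standard background, not the paper's bound).
% With probability at least 1 - \delta over an i.i.d. sample S of size m,
% simultaneously for all posteriors Q:
\[
  \mathrm{kl}\!\left( \hat{L}_S(Q) \,\middle\|\, L_{\mathcal{D}}(Q) \right)
  \;\le\; \frac{\mathrm{KL}(Q \,\|\, P) + \ln\!\frac{2\sqrt{m}}{\delta}}{m},
\]
% where \hat{L}_S(Q) and L_{\mathcal{D}}(Q) are the expected empirical and
% population risks under Q, and kl is the binary KL divergence. Norm-based
% bounds instantiate KL(Q || P) via layerwise weight perturbations whose
% admissible scale is tied to the layer norms.
```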
Related papers
- Generalization for Least Squares Regression With Simple Spiked Covariances [3.9134031118910264]
The generalization properties of even two-layer neural networks trained by gradient descent remain poorly understood.
Recent work has made progress by describing the spectrum of the feature matrix at the hidden layer.
Yet, the generalization error for linear models with spiked covariances has not been previously determined.
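For orientation (standard background, not taken from this abstract), the spiked covariance model referred to here is typically of the following form, with a few large "spike" eigenvalues on top of an isotropic bulk:

```latex
% Standard spiked covariance model (assumed form, for orientation):
\[
  \Sigma \;=\; \sigma^2 I_p \;+\; \sum_{i=1}^{k} \theta_i\, v_i v_i^{\top},
  \qquad v_i^{\top} v_j = \delta_{ij},\quad \theta_i > 0,\quad k \ll p.
\]
```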
arXiv Detail & Related papers (2024-10-17T19:46:51Z)
- Equivariant score-based generative models provably learn distributions with symmetries efficiently [7.90752151686317]
Empirical studies have demonstrated that incorporating symmetries into generative models can provide better generalization and sampling efficiency.
We provide the first theoretical analysis and guarantees of score-based generative models (SGMs) for learning distributions that are invariant with respect to some group symmetry.
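As a concrete way to picture building symmetry into an SGM (a minimal sketch, not the paper's construction): any score estimator can be symmetrized over a finite group so that the induced density is exactly invariant. Here the group is C4 acting on images by rotation, and `score_fn` is a hypothetical score network.

```python
import numpy as np

def c4_symmetrized_score(score_fn, x):
    """Average a score estimate over the C4 rotation group.

    For an invariant density p(g.x) = p(x), the score is equivariant, so
    s_G(x) = (1/|G|) * sum_g g^{-1}.score_fn(g.x) is an exactly-equivariant
    estimator. `score_fn` maps (..., H, W) -> (..., H, W) and is a
    placeholder (hypothetical) for any score network.
    """
    total = np.zeros_like(x)
    for k in range(4):                                # C4 = rotations by 90 deg
        x_rot = np.rot90(x, k, axes=(-2, -1))         # g . x
        s = score_fn(x_rot)                           # s(g . x)
        total += np.rot90(s, -k, axes=(-2, -1))       # g^{-1} . s(g . x)
    return total / 4.0
```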
arXiv Detail & Related papers (2024-10-02T05:14:28Z)
- Decomposition of Equivariant Maps via Invariant Maps: Application to Universal Approximation under Symmetry [3.0518581575184225]
We develop a theory of the relationship between invariant and equivariant maps with respect to a group $G$.
We leverage this theory in the context of deep neural networks with group symmetries in order to obtain novel insight into their mechanisms.
arXiv Detail & Related papers (2024-09-25T13:27:41Z)
- Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines [15.67299102925013]
We present a constructive universal approximation theorem for learning machines equipped with joint-group-equivariant feature maps.
Our main theorem also unifies the universal approximation theorems for both shallow and deep networks.
arXiv Detail & Related papers (2024-05-22T14:25:02Z)
- GIT: Detecting Uncertainty, Out-Of-Distribution and Adversarial Samples using Gradients and Invariance Transformations [77.34726150561087]
We propose a holistic approach for the detection of generalization errors in deep neural networks.
GIT combines the usage of gradient information and invariance transformations.
Our experiments demonstrate the superior performance of GIT compared to the state-of-the-art on a variety of network architectures.
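The abstract does not spell out the detection score; a toy sketch of the two ingredients would be to flag an input when its prediction is unstable under invariance transformations or unusually sensitive to input perturbations. Below, `predict_fn` is a hypothetical classifier returning class probabilities; this illustrates the idea only and is not the authors' GIT score.

```python
import numpy as np

def detection_score(predict_fn, x, eps=1e-3, rng=None):
    """Toy GIT-flavoured score combining two signals:
    (a) disagreement of predictions under invariance transformations
        (here: C4 rotations of an (H, W) image), and
    (b) a finite-difference estimate of the directional input gradient.
    Higher scores suggest uncertain / OOD / adversarial inputs.
    """
    rng = np.random.default_rng(rng)
    # (a) invariance check: predictions should agree across rotations
    preds = np.stack([predict_fn(np.rot90(x, k)) for k in range(4)])
    disagreement = preds.std(axis=0).sum()
    # (b) sensitivity proxy: directional derivative along a random direction
    d = rng.normal(size=x.shape)
    d /= np.linalg.norm(d)
    sensitivity = np.abs(predict_fn(x + eps * d) - predict_fn(x)).sum() / eps
    return disagreement + sensitivity
```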
arXiv Detail & Related papers (2023-07-05T22:04:38Z)
- Banana: Banach Fixed-Point Network for Pointcloud Segmentation with Inter-Part Equivariance [31.875925637190328]
In this paper, we present Banana, a Banach fixed-point network for equivariant segmentation with inter-part equivariance by construction.
Our key insight is to iteratively solve a fixed-point problem in which point-part assignment labels and per-part SE(3) equivariance co-evolve.
Our formulation naturally provides a strict definition of inter-part equivariance that generalizes to unseen inter-part configurations.
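To make the fixed-point idea concrete, here is a deliberately simplified sketch in which soft point-part assignments and per-part frames (reduced here to translation-only centroids, not the paper's SE(3)-equivariant features) are updated until they stabilize; all names are illustrative.

```python
import numpy as np

def fixed_point_assignments(points, n_parts, n_iters=20, rng=None):
    """Toy fixed-point loop: assignments and per-part frames co-evolve.

    points : (N, 3) array. Frames are reduced to weighted centroids
    (translation only); the actual method uses per-part SE(3) equivariance.
    """
    rng = np.random.default_rng(rng)
    logits = rng.normal(size=(len(points), n_parts))  # soft assignments
    for _ in range(n_iters):
        w = np.exp(logits - logits.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)             # softmax over parts
        # update per-part "frames" given assignments (weighted centroids)
        centroids = (w.T @ points) / np.maximum(w.sum(axis=0)[:, None], 1e-9)
        # update assignments given frames (negative squared distances)
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        logits = -d2
    return logits.argmax(axis=1)                      # labels at the fixed point
```

Because the centroids translate with the point cloud, the resulting labels are translation-invariant, a small echo of the inter-part equivariance property the paper builds in by construction.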
arXiv Detail & Related papers (2023-05-25T17:59:32Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
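A simple way to probe the quantity these bounds depend on (a crude empirical sketch, not the paper's optimal-transport machinery, with `f` assumed to be any prediction function): estimate the local Lipschitz constant of `f` around a data point from random perturbations.

```python
import numpy as np

def local_lipschitz(f, x, radius=0.1, n_samples=64, rng=None):
    """Monte-Carlo estimate of the local Lipschitz constant of f at x:
    max over sampled x' in a ball of ||f(x') - f(x)|| / ||x' - x||.
    """
    rng = np.random.default_rng(rng)
    fx = f(x)
    best = 0.0
    for _ in range(n_samples):
        d = rng.normal(size=x.shape)
        d *= radius * rng.uniform() / np.linalg.norm(d)  # random point in the ball
        best = max(best, np.linalg.norm(f(x + d) - fx) / np.linalg.norm(d))
    return best
```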
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- Towards Principled Disentanglement for Domain Generalization [90.9891372499545]
A fundamental challenge for machine learning models is generalizing to out-of-distribution (OOD) data.
We first formalize the OOD generalization problem as a constrained optimization problem, called Disentanglement-constrained Domain Generalization (DDG).
Based on this formulation, we propose a primal-dual algorithm for joint representation disentanglement and domain generalization.
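The abstract leaves the algorithm abstract; a generic primal-dual template for a constraint of the form c(theta) <= eps looks like the sketch below (all functions are hypothetical stand-ins, not DDG's actual losses).

```python
def primal_dual(grad_f, grad_c, c, theta, eps=0.0,
                lr_theta=1e-2, lr_lam=1e-2, n_steps=1000):
    """Generic primal-dual loop for: min_theta f(theta) s.t. c(theta) <= eps.

    Lagrangian L = f + lam * (c - eps); descend in theta, ascend in lam.
    grad_f / grad_c return gradients w.r.t. theta; c returns the
    constraint value as a scalar.
    """
    lam = 0.0
    for _ in range(n_steps):
        theta = theta - lr_theta * (grad_f(theta) + lam * grad_c(theta))
        lam = max(0.0, lam + lr_lam * (c(theta) - eps))  # project onto lam >= 0
    return theta, lam
```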
arXiv Detail & Related papers (2021-11-27T07:36:32Z)
- Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals.
We propose a general graph estimator based on a novel structured fusion regularization.
We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantees.
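The abstract does not state the estimator; a generic form of a structured-fusion-regularized joint estimator, with per-graph fitting losses and a structured penalty coupling the Laplacians, would be (notation assumed):

```latex
% Generic fusion-regularized joint graph estimator (assumed form):
\[
  \{\hat{L}_k\} \;=\; \arg\min_{L_1,\dots,L_K \in \mathcal{L}}
  \;\sum_{k=1}^{K} \ell\!\left(L_k;\, X_k\right)
  \;+\; \lambda \sum_{k < l} \Omega\!\left(L_k - L_l\right),
\]
% where \mathcal{L} is the set of valid graph Laplacians, \ell is a
% goodness-of-fit loss on the graph signals X_k, and \Omega is a structured
% norm encoding how the topologies are believed to be related.
```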
arXiv Detail & Related papers (2021-03-05T04:42:32Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
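As a self-contained sketch of the underlying classifier (binary labels; all names hypothetical): each ensemble member projects the data to a low dimension, fits a linear discriminant in the projected space, and members vote.

```python
import numpy as np

def rp_lda_ensemble(X_train, y_train, X_test, n_members=50, d_proj=5, rng=None):
    """Ensemble of randomly projected linear discriminants (majority vote).

    y_train must be in {0, 1}. Each member draws a Gaussian projection
    R: R^p -> R^{d_proj}, fits an LDA-style rule there, and votes on X_test.
    A minimal sketch of the model class analyzed in [1].
    """
    rng = np.random.default_rng(rng)
    p = X_train.shape[1]
    votes = np.zeros(len(X_test))
    for _ in range(n_members):
        R = rng.normal(size=(p, d_proj)) / np.sqrt(d_proj)
        Z, Zt = X_train @ R, X_test @ R
        mu0, mu1 = Z[y_train == 0].mean(0), Z[y_train == 1].mean(0)
        S = np.cov(Z, rowvar=False) + 1e-6 * np.eye(d_proj)  # regularized covariance
        w = np.linalg.solve(S, mu1 - mu0)                    # discriminant direction
        b = -0.5 * w @ (mu0 + mu1)                           # threshold at the midpoint
        votes += (Zt @ w + b > 0)
    return (votes / n_members > 0.5).astype(int)
```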
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data [52.78581260260455]
We propose a general method to construct a convolutional layer that is equivariant to transformations from any specified Lie group.
We apply the same model architecture to images, ball-and-stick molecular data, and Hamiltonian dynamical systems.
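For orientation (standard background, not the paper's full construction): equivariance to a group G is achieved by convolving over the group itself, and the layer such methods discretize has the form:

```latex
% Group convolution underlying Lie-group-equivariant layers (standard form):
\[
  (f \ast \psi)(u) \;=\; \int_{G} f(v)\, \psi\!\left(v^{-1} u\right)\, d\mu(v),
  \qquad u \in G,
\]
% where \mu is the left Haar measure on G and \psi is a learned filter.
% Left-translating f by g left-translates the output by g, which is the
% equivariance property. In practice the integral is approximated by
% sampling group elements around each point of the continuous data.
```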
arXiv Detail & Related papers (2020-02-25T17:40:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.