A PAC-Bayesian Generalization Bound for Equivariant Networks
- URL: http://arxiv.org/abs/2210.13150v1
- Date: Mon, 24 Oct 2022 12:07:03 GMT
- Title: A PAC-Bayesian Generalization Bound for Equivariant Networks
- Authors: Arash Behboodi, Gabriele Cesa, Taco Cohen
- Abstract summary: We derive norm-based PAC-Bayesian generalization bounds for equivariant networks.
The bound characterizes the impact of group size, and of the multiplicity and degree of irreducible representations, on the generalization error.
In general, the bound indicates that using a larger group size in the model improves generalization, a finding substantiated by extensive numerical experiments.
- Score: 15.27608414735815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Equivariant networks capture the inductive bias about the symmetry of the
learning task by building those symmetries into the model. In this paper, we
study how equivariance relates to generalization error using PAC-Bayesian
analysis for equivariant networks, where the transformation laws of feature
spaces are determined by group representations. By performing a perturbation
analysis of equivariant networks in the Fourier domain for each layer, we derive
norm-based PAC-Bayesian generalization bounds. The bound characterizes the
impact of group size, and of the multiplicity and degree of irreducible
representations, on the generalization error, and thereby provides a guideline
for selecting them. In general, the bound indicates that using a larger group
size in the model improves generalization, a finding substantiated by extensive
numerical experiments.
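The symmetry inductive bias described in the abstract can be made concrete with a minimal sketch: for the cyclic group C_n acting on 1-D signals by circular shifts, circular cross-correlation is the canonical equivariant linear layer, and equivariance means the layer commutes with the group action. The function names below are illustrative, not from the paper.

```python
import numpy as np

def shift(x, g):
    """Act on a signal by the cyclic-group element g (circular shift)."""
    return np.roll(x, g)

def group_conv(x, psi):
    """Circular cross-correlation with filter psi: a C_n-equivariant linear map."""
    n = len(x)
    return np.array([np.dot(np.roll(psi, g), x) for g in range(n)])

rng = np.random.default_rng(0)
x, psi = rng.normal(size=8), rng.normal(size=8)

# Equivariance: transforming the input and then applying the layer
# equals applying the layer and then transforming the output.
lhs = group_conv(shift(x, 3), psi)
rhs = shift(group_conv(x, psi), 3)
assert np.allclose(lhs, rhs)
```

Building this commutation property into every layer is what restricts the hypothesis class, which is the mechanism the paper's PAC-Bayesian bound quantifies.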
Related papers
- Joint Group Invariant Functions on Data-Parameter Domain Induce Universal Neural Networks [14.45619075342763]
We present a systematic method to induce a generalized neural network and its right-inverse operator, called the ridgelet transform.
Since the ridgelet transform is a right inverse, it can describe the arrangement of parameters with which the network represents a target function.
We present a new, simple proof of universality using Schur's lemma, in a unified manner covering a wide class of networks.
arXiv Detail & Related papers (2023-10-05T13:30:37Z)
- GIT: Detecting Uncertainty, Out-Of-Distribution and Adversarial Samples using Gradients and Invariance Transformations [77.34726150561087]
We propose a holistic approach for the detection of generalization errors in deep neural networks.
GIT combines gradient information with invariance transformations.
Our experiments demonstrate the superior performance of GIT compared to the state-of-the-art on a variety of network architectures.
arXiv Detail & Related papers (2023-07-05T22:04:38Z)
- Banana: Banach Fixed-Point Network for Pointcloud Segmentation with Inter-Part Equivariance [31.875925637190328]
In this paper, we present Banana, a Banach fixed-point network for equivariant segmentation with inter-part equivariance by construction.
Our key insight is to iteratively solve a fixed-point problem, where point-part assignment labels and per-part SE(3)-equivariance co-evolve simultaneously.
Our formulation naturally provides a strict definition of inter-part equivariance that generalizes to unseen inter-part configurations.
arXiv Detail & Related papers (2023-05-25T17:59:32Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel [30.088226334627375]
We show that flatness and generalization bounds can be changed arbitrarily according to the scale of a parameter.
We propose new prior and posterior distributions invariant to scaling transformations by decomposing the scale and connectivity of parameters.
We empirically demonstrate our posterior provides effective flatness and calibration measures with low complexity.
arXiv Detail & Related papers (2022-09-30T03:31:13Z)
- Towards Principled Disentanglement for Domain Generalization [90.9891372499545]
A fundamental challenge for machine learning models is generalizing to out-of-distribution (OOD) data.
We first formalize the OOD generalization problem as a constrained optimization problem, called Disentanglement-constrained Domain Generalization (DDG).
Based on this transformation, we propose a primal-dual algorithm for joint representation disentanglement and domain generalization.
arXiv Detail & Related papers (2021-11-27T07:36:32Z)
- Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals.
We propose a general graph estimator based on a novel structured fusion regularization.
We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantees.
arXiv Detail & Related papers (2021-03-05T04:42:32Z)
- Provably Strict Generalisation Benefit for Equivariant Models [1.332560004325655]
It is widely believed that engineering a model to be invariant/equivariant improves generalisation.
This paper provides the first provably non-zero improvement in generalisation for invariant/equivariant models.
arXiv Detail & Related papers (2021-02-20T12:47:32Z)
- Universal Approximation Theorem for Equivariant Maps by Group CNNs [14.810452619505137]
This paper provides a unified method to obtain universal approximation theorems for equivariant maps by CNNs.
As its significant advantage, we can handle non-linear equivariant maps between infinite-dimensional spaces for non-compact groups.
arXiv Detail & Related papers (2020-12-27T07:09:06Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data [52.78581260260455]
We propose a general method to construct a convolutional layer that is equivariant to transformations from any specified Lie group.
We apply the same model architecture to images, ball-and-stick molecular data, and Hamiltonian dynamical systems.
arXiv Detail & Related papers (2020-02-25T17:40:38Z)
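The group-equivariant convolution idea in the paper above can be illustrated, in miniature, with a lifting correlation over the discrete rotation group C_4 acting on images: correlating a filter against all four rotations yields a group-indexed response, and rotating the input cyclically permutes that response. This is a toy discrete analogue, not the paper's Lie-group construction, and the function names are illustrative.

```python
import numpy as np

def rot(img, r):
    """Act by the rotation group C_4: rotate the image by r * 90 degrees."""
    return np.rot90(img, r)

def lift(x, psi):
    """Lifting correlation: one response per group element, <r.psi, x>."""
    return np.array([np.sum(rot(psi, r) * x) for r in range(4)])

rng = np.random.default_rng(1)
x, psi = rng.normal(size=(5, 5)), rng.normal(size=(5, 5))

# Equivariance: rotating the input by one group element cyclically
# shifts the group-indexed responses by one position.
assert np.allclose(lift(rot(x, 1), psi), np.roll(lift(x, psi), 1))
```

Stacking such layers (with convolutions over the group itself after the lifting layer) gives networks equivariant by construction, which the paper generalizes to arbitrary Lie groups and continuous data.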
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.