Generalization Bounds for Equivariant Networks on Markov Data
- URL: http://arxiv.org/abs/2503.00292v1
- Date: Sat, 01 Mar 2025 01:53:48 GMT
- Title: Generalization Bounds for Equivariant Networks on Markov Data
- Authors: Hui Li, Zhiguo Wang, Bohui Chen, Li Sheng
- Abstract summary: We introduce a new McDiarmid's inequality to derive a generalization bound for neural networks trained on Markov datasets. This bound provides practical insights into selecting low-dimensional irreducible representations, enhancing generalization performance for fixed-width equivariant neural networks.
- Score: 18.548000339222234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Equivariant neural networks play a pivotal role in analyzing datasets with symmetry properties, particularly in complex data structures. However, integrating equivariance with Markov properties presents notable challenges due to the inherent dependencies within such data. Previous research has primarily concentrated on establishing generalization bounds under the assumption of independently and identically distributed data, frequently neglecting the influence of Markov dependencies. In this study, we investigate the impact of Markov properties on generalization performance alongside the role of equivariance within this context. We begin by applying a new McDiarmid's inequality to derive a generalization bound for neural networks trained on Markov datasets, using Rademacher complexity as a central measure of model capacity. Subsequently, we utilize group theory to compute the covering number under equivariant constraints, enabling us to obtain an upper bound on the Rademacher complexity based on this covering number. This bound provides practical insights into selecting low-dimensional irreducible representations, enhancing generalization performance for fixed-width equivariant neural networks.
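To make the chain of arguments concrete, here is a schematic rendering of the two steps in LaTeX. The first display shows only the typical shape of a Markov-data generalization bound (the mixing-time factor \tau_{mix} and all constants are assumptions, not the paper's statement); the second is the standard Dudley entropy integral, the generic device for turning a covering number into a Rademacher-complexity bound.

```latex
% Schematic shape only -- constants and the exact dependence on the
% chain's mixing properties are assumptions, not the paper's statement.
% With probability at least 1 - \delta over a sample X_1, ..., X_n from
% a uniformly ergodic Markov chain with mixing time \tau_{\mathrm{mix}}:
\[
  \sup_{f \in \mathcal{F}}
  \Bigl( \mathbb{E}_{\pi}[f(X)] - \frac{1}{n}\sum_{i=1}^{n} f(X_i) \Bigr)
  \;\lesssim\;
  \mathfrak{R}_n(\mathcal{F})
  + \sqrt{\frac{\tau_{\mathrm{mix}}\,\log(1/\delta)}{n}} .
\]
% The Rademacher complexity is in turn controlled by the covering
% number \mathcal{N}(\mathcal{F}, \varepsilon) via Dudley's entropy
% integral (standard form):
\[
  \mathfrak{R}_n(\mathcal{F})
  \;\le\;
  \inf_{\alpha > 0}
  \Bigl( 4\alpha
  + \frac{12}{\sqrt{n}} \int_{\alpha}^{\infty}
    \sqrt{\log \mathcal{N}(\mathcal{F}, \varepsilon)} \, d\varepsilon \Bigr) .
\]
% Equivariance enters through \mathcal{N}: restricting the network to a
% group's low-dimensional irreducible representations shrinks the
% effective parameter space, hence the covering number, hence the bound.
```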
Related papers
- Learning Linear Causal Representations from Interventions under General Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
arXiv Detail & Related papers (2023-06-04T02:32:12Z)
- Generalized Precision Matrix for Scalable Estimation of Nonparametric Markov Networks [11.77890309304632]
A Markov network characterizes the conditional independence structure, or Markov property, among a set of random variables.
In this work, we characterize the conditional independence structure in general distributions for all data types.
We also allow general functional relations among variables, thus giving rise to a Markov network structure learning algorithm.
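As a concrete instance of the conditional independence structure this entry describes: in a Gaussian Markov network, variables i and j are conditionally independent given all others exactly when entry (i, j) of the precision (inverse covariance) matrix is zero. A minimal numpy sketch, assuming a simple 4-node chain for illustration (not the paper's nonparametric estimator):

```python
import numpy as np

# Precision matrix of a 4-node Gaussian chain X1 - X2 - X3 - X4:
# zeros off the tridiagonal encode conditional independence,
# e.g. X1 _||_ X3 | {X2, X4} because theta[0, 2] == 0.
theta = np.array([
    [ 2.0, -1.0,  0.0,  0.0],
    [-1.0,  2.0, -1.0,  0.0],
    [ 0.0, -1.0,  2.0, -1.0],
    [ 0.0,  0.0, -1.0,  2.0],
])

sigma = np.linalg.inv(theta)      # covariance: generally dense
print(np.round(sigma, 3))         # marginal dependence everywhere
print(np.abs(theta) > 1e-12)      # sparsity pattern = Markov graph
```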
arXiv Detail & Related papers (2023-05-19T01:53:10Z)
- A PAC-Bayesian Generalization Bound for Equivariant Networks [15.27608414735815]
We derive norm-based PAC-Bayesian generalization bounds for equivariant networks.
The bound characterizes the impact of group size and of the multiplicity and degree of irreducible representations on the generalization error.
In general, the bound indicates that using a larger group size in the model reduces the generalization error, a finding substantiated by extensive numerical experiments.
arXiv Detail & Related papers (2022-10-24T12:07:03Z)
- Generalization capabilities of neural networks in lattice applications [0.0]
We investigate the advantages of adopting translationally equivariant neural networks over non-equivariant ones.
We show that our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts.
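A quick way to see the property these architectures exploit, sketched here as a generic sanity check rather than as the paper's models: a convolution with circular padding commutes exactly with translations of a periodic lattice.

```python
import torch

# Convolution with circular padding is exactly equivariant to
# translations on a periodic lattice: conv(shift(x)) == shift(conv(x)).
conv = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1,
                       padding_mode='circular', bias=False)

def shift(t):
    # translate the field by (2, 3) on the periodic lattice
    return torch.roll(t, shifts=(2, 3), dims=(2, 3))

x = torch.randn(1, 1, 8, 8)                  # one 8x8 lattice field
lhs = conv(shift(x))                         # transform, then convolve
rhs = shift(conv(x))                         # convolve, then transform
print(torch.allclose(lhs, rhs, atol=1e-5))   # True: equivariance holds
```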
arXiv Detail & Related papers (2021-12-23T11:48:06Z)
- Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains whose problem data is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the setting of fully decentralized computation.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
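For orientation, here is the centralized extra-gradient step that this method builds on, in its textbook form; the paper's decentralized, local-update variant is not reproduced here.

```latex
% Classical (centralized) extra-gradient step for a VI with operator F
% on a convex set Z -- the textbook building block, not the paper's
% full decentralized algorithm:
\[
  z_{k+1/2} = \Pi_{\mathcal{Z}}\bigl(z_k - \gamma F(z_k)\bigr),
  \qquad
  z_{k+1}   = \Pi_{\mathcal{Z}}\bigl(z_k - \gamma F(z_{k+1/2})\bigr),
\]
% where \Pi_{\mathcal{Z}} is Euclidean projection onto Z and
% \gamma > 0 is a step size. The extrapolation step z_{k+1/2} is what
% yields convergence for monotone operators where plain gradient
% steps can cycle.
```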
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
- LieTransformer: Equivariant self-attention for Lie Groups [49.9625160479096]
Group equivariant neural networks are used as building blocks of group invariant neural networks.
We extend the scope of the literature to self-attention, which is emerging as a prominent building block of deep learning models.
We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups.
arXiv Detail & Related papers (2020-12-20T11:02:49Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Generalization Error for Linear Regression under Distributed Learning [0.0]
We consider the setting where the unknowns are distributed over a network of nodes.
We present an analytical characterization of the dependence of the generalization error on the partitioning of the unknowns over nodes.
arXiv Detail & Related papers (2020-04-30T08:49:46Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
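A minimal sketch of the construction under analysis, with placeholder choices: the Gaussian projections, projection dimension d, ensemble size, and majority voting below are illustrative assumptions, not necessarily the aggregation rule used in the paper.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fit_rp_lda_ensemble(X, y, n_learners=25, d=8, seed=0):
    """Fit an ensemble of LDA classifiers on random low-dim projections."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_learners):
        # Gaussian random projection to d dimensions
        R = rng.standard_normal((X.shape[1], d)) / np.sqrt(d)
        clf = LinearDiscriminantAnalysis().fit(X @ R, y)
        ensemble.append((R, clf))
    return ensemble

def predict(ensemble, X):
    """Majority vote over the projected discriminants
    (assumes non-negative integer class labels)."""
    votes = np.stack([clf.predict(X @ R) for R, clf in ensemble])
    return np.apply_along_axis(
        lambda col: np.bincount(col).argmax(), 0, votes.astype(int))
```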
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data [52.78581260260455]
We propose a general method to construct a convolutional layer that is equivariant to transformations from any specified Lie group.
We apply the same model architecture to images, ball-and-stick molecular data, and Hamiltonian dynamical systems.
arXiv Detail & Related papers (2020-02-25T17:40:38Z)