Related papers: The Selective G-Bispectrum and its Inversion: Applications to G-Invariant Networks

URL: http://arxiv.org/abs/2407.07655v2
Date: Wed, 6 Nov 2024 13:46:35 GMT
Title: The Selective G-Bispectrum and its Inversion: Applications to G-Invariant Networks
Authors: Simon Mataigne, Johan Mathe, Sophia Sanborn, Christopher Hillar, Nina Miolane
Abstract summary: We show that the $G$-Bispectrum can be reduced into a \textit{selective $G$-Bispectrum}. We demonstrate how its integration in neural networks enhances accuracy and robustness compared to traditional approaches.
Score: 3.8311785959108637
License: http://creativecommons.org/licenses/by/4.0/
Abstract: An important problem in signal processing and deep learning is to achieve \textit{invariance} to nuisance factors not relevant for the task. Since many of these factors are describable as the action of a group $G$ (e.g. rotations, translations, scalings), we want methods to be $G$-invariant. The $G$-Bispectrum extracts every characteristic of a given signal up to group action: for example, the shape of an object in an image, but not its orientation. Consequently, the $G$-Bispectrum has been incorporated into deep neural network architectures as a computational primitive for $G$-invariance, akin to a pooling mechanism, but with greater selectivity and robustness. However, the computational cost of the $G$-Bispectrum ($\mathcal{O}(|G|^2)$, with $|G|$ the size of the group) has limited its widespread adoption. Here, we show that the $G$-Bispectrum computation contains redundancies that can be reduced into a \textit{selective $G$-Bispectrum} with $\mathcal{O}(|G|)$ complexity. We prove desirable mathematical properties of the selective $G$-Bispectrum and demonstrate how its integration in neural networks enhances accuracy and robustness compared to traditional approaches, while enjoying considerable speedups compared to the full $G$-Bispectrum.
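Illustration: the complexity gap described in the abstract can be made concrete for the simplest case, the cyclic group $\mathbb{Z}/n\mathbb{Z}$ acting on a 1-D signal by circular shifts. The sketch below uses the classical bispectrum $B(k_1,k_2) = F(k_1)F(k_2)\overline{F(k_1+k_2)}$, where $F$ is the discrete Fourier transform. The function names are hypothetical, and the particular $\mathcal{O}(n)$ subset kept here ($B(0,0)$ plus the row $B(1,\cdot)$) is an assumption made for illustration; it is not taken from this page and may differ from the exact coefficient set chosen in the paper.

```python
import numpy as np

def cyclic_bispectrum(f):
    """Full bispectrum on Z/nZ: B[k1, k2] = F[k1] * F[k2] * conj(F[(k1 + k2) mod n]).

    Produces n*n coefficients, i.e. O(|G|^2) for |G| = n.
    """
    F = np.fft.fft(f)
    k = np.arange(len(f))
    idx = (k[:, None] + k[None, :]) % len(f)
    return F[:, None] * F[None, :] * np.conj(F[idx])

def selective_cyclic_bispectrum(f):
    """Illustrative O(n) subset (an assumption, see lead-in): B[0, 0] and the row B[1, :]."""
    F = np.fft.fft(f)
    n = len(f)
    k = np.arange(n)
    b00 = F[0] * F[0] * np.conj(F[0])
    row1 = F[1] * F * np.conj(F[(1 + k) % n])
    return np.concatenate(([b00], row1))

# A random signal and a circularly shifted copy of it.
rng = np.random.default_rng(0)
f = rng.normal(size=8)
shifted = np.roll(f, 3)

full_a, full_b = cyclic_bispectrum(f), cyclic_bispectrum(shifted)
sel_a, sel_b = selective_cyclic_bispectrum(f), selective_cyclic_bispectrum(shifted)

# Both descriptors are unchanged by the shift, but the subset has n + 1 entries instead of n * n.
print(np.allclose(full_a, full_b), full_a.size)
print(np.allclose(sel_a, sel_b), sel_a.size)
```

A shift by $s$ multiplies $F(k)$ by a phase $e^{-2\pi i k s/n}$, and the three phases in each product cancel, which is why both arrays match for `f` and `shifted`.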

Related papers

Symmetry-Breaking Descent for Invariant Cost Functionals [0.0]
We study the problem of reducing a task cost functional $W(S)$, defined over Sobolev-class signals $S$, when the cost is invariant under a global symmetry group $G \subset \mathrm{Diff}(M)$. We propose a variational method that exploits the symmetry structure to construct explicit, symmetry-breaking deformations of the input signal.
arXiv Detail & Related papers (2025-05-19T15:06:31Z)
A General Framework for Robust G-Invariance in G-Equivariant Networks [5.227502964814928]
We introduce a general method for achieving robust group-invariance in group-equivariant convolutional neural networks ($G$-CNNs). The completeness of the triple correlation endows the $G$-TC layer with strong robustness. We demonstrate the benefits of this method on both commutative and non-commutative groups.
arXiv Detail & Related papers (2023-10-28T02:27:34Z)
Universality of max-margin classifiers [10.797131009370219]
We study the role of featurization maps and the high-dimensional universality of the misclassification error for non-Gaussian features. In particular, the overparametrization threshold and generalization error can be computed within a simpler model.
arXiv Detail & Related papers (2023-09-29T22:45:56Z)
A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously. Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples. We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss. Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
Detection-Recovery Gap for Planted Dense Cycles [72.4451045270967]
We consider a model where a dense cycle with expected bandwidth $n\tau$ and edge density $p$ is planted in an Erdős-Rényi graph $G(n,q)$. We characterize the computational thresholds for the associated detection and recovery problems for the class of low-degree algorithms.
arXiv Detail & Related papers (2023-02-13T22:51:07Z)
On the Sample Complexity of Representation Learning in Multi-task Bandits with Global and Local structure [77.60508571062958]
We investigate the sample complexity of learning the optimal arm for multi-task bandit problems. Arms consist of two components: one that is shared across tasks (that we call representation) and one that is task-specific (that we call predictor). We devise an algorithm OSRL-SC whose sample complexity approaches the lower bound, and scales at most as $H(G\log(\delta_G)+ X\log(\delta_H))$, with $X,G,H$ being, respectively, the number of tasks, representations and predictors.
arXiv Detail & Related papers (2022-11-28T08:40:12Z)
A Classification of $G$-invariant Shallow Neural Networks [1.4213973379473654]
We prove a theorem that gives a classification of all $G$-invariant single-hidden-layer or "shallow" neural network ($G$-SNN) architectures with ReLU activation. We enumerate the $G$-SNN architectures for some example groups $G$ and visualize their structure.
arXiv Detail & Related papers (2022-05-18T21:18:16Z)
Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle. We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z)
Locality defeats the curse of dimensionality in convolutional teacher-student scenarios [69.2027612631023]
We show that locality is key in determining the learning curve exponent $\beta$. We conclude by proving, using a natural assumption, that performing kernel regression with a ridge that decreases with the size of the training set leads to similar learning curve exponents to those we obtain in the ridgeless case.
arXiv Detail & Related papers (2021-06-16T08:27:31Z)
Robust Model Selection and Nearly-Proper Learning for GMMs [26.388358539260473]
In learning theory, a standard assumption is that the data is generated from a finite mixture model. But what happens when the number of components is not known in advance? We are able to approximately determine the minimum number of components needed to fit the distribution within a logarithmic factor.
arXiv Detail & Related papers (2021-06-05T01:58:40Z)
Geometric Deep Learning and Equivariant Neural Networks [0.9381376621526817]
We survey the mathematical foundations of geometric deep learning, focusing on group equivariant and gauge equivariant neural networks. We develop gauge equivariant convolutional neural networks on arbitrary manifolds $\mathcal{M}$ using principal bundles with structure group $K$ and equivariant maps between sections of associated vector bundles. We analyze several applications of this formalism, including semantic segmentation and object detection networks.
arXiv Detail & Related papers (2021-05-28T15:41:52Z)
Homogeneous vector bundles and $G$-equivariant convolutional neural networks [0.0]
$G$-equivariant convolutional neural networks (GCNNs) are a geometric deep learning model for data defined on a homogeneous $G$-space $\mathcal{M}$. In this paper, we analyze GCNNs on homogeneous spaces $\mathcal{M} = G/K$ in the case of unimodular Lie groups $G$ and compact subgroups $K \leq G$.
arXiv Detail & Related papers (2021-05-12T02:06:04Z)
Improving Robustness and Generality of NLP Models Using Disentangled Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$. We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning. We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.