Efficient Generalized Spherical CNNs
- URL: http://arxiv.org/abs/2010.11661v3
- Date: Mon, 8 Mar 2021 11:55:27 GMT
- Title: Efficient Generalized Spherical CNNs
- Authors: Oliver J. Cobb, Christopher G. R. Wallis, Augustine N. Mavor-Parker,
Augustin Marignier, Matthew A. Price, Mayeul d'Avezac, Jason D. McEwen
- Abstract summary: We present a generalized spherical CNN framework that encompasses various existing approaches and allows them to be leveraged alongside each other.
We show that these developments allow the construction of more expressive hybrid models that achieve state-of-the-art accuracy and parameter efficiency on spherical benchmark problems.
- Score: 7.819876182082904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many problems across computer vision and the natural sciences require the
analysis of spherical data, for which representations may be learned
efficiently by encoding equivariance to rotational symmetries. We present a
generalized spherical CNN framework that encompasses various existing
approaches and allows them to be leveraged alongside each other. The only
existing non-linear spherical CNN layer that is strictly equivariant has
complexity $\mathcal{O}(C^2L^5)$, where $C$ is a measure of representational
capacity and $L$ the spherical harmonic bandlimit. Such a high computational
cost often prohibits the use of strictly equivariant spherical CNNs. We develop
two new strictly equivariant layers with reduced complexity $\mathcal{O}(CL^4)$
and $\mathcal{O}(CL^3 \log L)$, making larger, more expressive models
computationally feasible. Moreover, we adopt efficient sampling theory to
achieve further computational savings. We show that these developments allow
the construction of more expressive hybrid models that achieve state-of-the-art
accuracy and parameter efficiency on spherical benchmark problems.
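The practical impact of the complexity reductions can be seen by tallying asymptotic operation counts. A small illustrative sketch (constant factors are assumptions for illustration, not taken from the paper):

```python
# Hypothetical sketch: compare asymptotic operation counts of the strictly
# equivariant layers discussed in the abstract, where C measures
# representational capacity and L is the spherical harmonic bandlimit.
# Constant factors are illustrative assumptions only.

import math

def ops_existing(C, L):
    """O(C^2 L^5): the only previously existing strictly equivariant non-linear layer."""
    return C**2 * L**5

def ops_new_quartic(C, L):
    """O(C L^4): first new layer."""
    return C * L**4

def ops_new_cubic_log(C, L):
    """O(C L^3 log L): second new layer."""
    return C * L**3 * math.log2(L)

C, L = 8, 128
# Speed-up factor of the cheapest new layer over the existing one at this bandlimit:
print(ops_existing(C, L) / ops_new_cubic_log(C, L))
```

Because the saving grows like $CL^2/\log L$, the gap widens rapidly with resolution, which is what makes larger strictly equivariant models feasible.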
Related papers
- Differentiable and accelerated spherical harmonic and Wigner transforms [7.636068929252914]
Modelling and analysis of spherical data requires efficient computation of gradients for machine learning or other differentiable programming tasks.
We develop novel algorithms for accelerated and differentiable computation of generalised Fourier transforms on the sphere.
We observe up to a 400-fold acceleration when benchmarked against alternative C codes.
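Spherical harmonic transforms rest on the orthonormality of the harmonics over the sphere. A minimal numpy sketch of that underlying property (an assumption-level illustration, not the paper's accelerated algorithm):

```python
# Minimal numpy sketch (illustrative, not the paper's algorithm): verify by
# quadrature the orthonormality of two real spherical harmonics, the property
# that forward/inverse spherical harmonic transforms rely on.

import numpy as np

ntheta, nphi = 200, 200
theta = (np.arange(ntheta) + 0.5) * np.pi / ntheta   # colatitude, midpoint rule
phi = np.arange(nphi) * 2 * np.pi / nphi             # longitude
T, P = np.meshgrid(theta, phi, indexing="ij")

Y00 = np.full_like(T, 1.0 / np.sqrt(4 * np.pi))      # Y_0^0
Y10 = np.sqrt(3.0 / (4 * np.pi)) * np.cos(T)         # Y_1^0

# Surface element dOmega = sin(theta) dtheta dphi on the quadrature grid:
area = np.sin(T) * (np.pi / ntheta) * (2 * np.pi / nphi)

print(np.sum(Y00 * Y00 * area))  # ~ 1  (normalisation)
print(np.sum(Y00 * Y10 * area))  # ~ 0  (orthogonality)
```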
arXiv Detail & Related papers (2023-11-24T18:59:04Z) - Globally Convergent Accelerated Algorithms for Multilinear Sparse
Logistic Regression with $\ell_0$-constraints [2.323238724742687]
Multilinear logistic regression serves as a powerful tool for the analysis of multidimensional data.
We propose an accelerated proximal alternating minimization method, APALM$+$, to solve the $\ell_0$-MLSR model.
We demonstrate that APALM$+$ is globally convergent to a first-order critical point, with convergence established via the Kurdyka-Łojasiewicz property.
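The $\ell_0$ constraint typically enters proximal methods of this kind through its well-known closed-form proximal operator, hard thresholding; a small numpy sketch of that standard building block (illustrative only, not the paper's full algorithm):

```python
# Standard proximal operator of the l0 penalty (hard thresholding); an
# illustrative building block, not the paper's APALM+ method itself.

import numpy as np

def prox_l0(v, lam):
    """argmin_x 0.5*||x - v||^2 + lam*||x||_0  =>  zero entries with |v_i| <= sqrt(2*lam)."""
    out = v.copy()
    out[np.abs(v) <= np.sqrt(2 * lam)] = 0.0
    return out

v = np.array([0.1, -2.0, 0.5, 3.0])
print(prox_l0(v, lam=0.5))  # zeroes entries with |v_i| <= 1.0
```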
arXiv Detail & Related papers (2023-09-17T11:05:08Z) - Distribution learning via neural differential equations: a nonparametric
statistical perspective [1.4436965372953483]
This work establishes the first general statistical convergence analysis for distribution learning via ODE models trained through likelihood transformations.
We show that the latter can be quantified via the $C^1$-metric entropy of the class $\mathcal{F}$.
We then apply this general framework to the setting of $C^k$-smooth target densities, and establish nearly minimax-optimal convergence rates for two relevant velocity field classes $\mathcal{F}$: $C^k$ functions and neural networks.
arXiv Detail & Related papers (2023-09-03T00:21:37Z) - Scalable and Equivariant Spherical CNNs by Discrete-Continuous (DISCO)
Convolutions [5.8808473430456525]
No existing spherical convolutional neural network (CNN) framework is both computationally scalable and rotationally equivariant.
We develop a hybrid discrete-continuous (DISCO) group convolution that is simultaneously equivariant and computationally scalable to high-resolution data.
For 4k spherical images we realize a saving of $10^9$ in computational cost and $10^4$ in memory usage when compared to the most efficient alternative equivariant spherical convolution.
arXiv Detail & Related papers (2022-09-27T18:00:01Z) - Minimax Optimal Quantization of Linear Models: Information-Theoretic
Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z) - Spatially relaxed inference on high-dimensional linear models [48.989769153211995]
We study the properties of ensembled clustered inference algorithms which combine spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions.
We show that ensembled clustered inference algorithms control the $\delta$-FWER under standard assumptions for $\delta$ equal to the largest cluster diameter.
arXiv Detail & Related papers (2021-06-04T16:37:19Z) - PDO-e$\text{S}^\text{2}$CNNs: Partial Differential Operator Based
Equivariant Spherical CNNs [77.53203546732664]
We use partial differential operators to design a spherical equivariant CNN, PDO-e$\text{S}^\text{2}$CNN, which is exactly rotation equivariant in the continuous domain.
In experiments, PDO-e$\text{S}^\text{2}$CNNs show greater parameter efficiency and significantly outperform other spherical CNNs on several tasks.
arXiv Detail & Related papers (2021-04-08T07:54:50Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the learning problem.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - Implicit Convex Regularizers of CNN Architectures: Convex Optimization
of Two- and Three-Layer Networks in Polynomial Time [70.15611146583068]
We study training of Convolutional Neural Networks (CNNs) with ReLU activations.
We introduce exact convex optimization formulations with polynomial complexity with respect to the number of data samples, the number of neurons, and the data dimension.
arXiv Detail & Related papers (2020-06-26T04:47:20Z) - Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \to \infty$.
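The general idea of PCA-based clustering can be sketched as splitting two groups by the sign of the leading principal component score; an assumption-level numpy illustration, not the paper's exact algorithm:

```python
# Illustrative numpy sketch: cluster two well-separated groups by the sign of
# their leading principal component score. Not the paper's exact algorithm.

import numpy as np

rng = np.random.default_rng(1)
theta = np.zeros(20)
theta[:3] = 4.0                               # sparse mean-separation vector
X = np.vstack([rng.normal(+theta, 1.0, size=(50, 20)),
               rng.normal(-theta, 1.0, size=(50, 20))])

X_c = X - X.mean(axis=0)                      # center the data
_, _, Vt = np.linalg.svd(X_c, full_matrices=False)
labels = (X_c @ Vt[0] > 0).astype(int)        # cluster by sign of first PC score

# Fraction-of-ones gap between the two true groups; close to 1.0 when the
# clusters are well separated along the first principal component.
gap = abs(labels[:50].mean() - labels[50:].mean())
print(gap)
```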
arXiv Detail & Related papers (2020-05-21T17:51:30Z) - Efficient algorithms for multivariate shape-constrained convex
regression problems [9.281671380673306]
We prove that the least squares estimator is computable via solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints.
For solving the generally very large-scale convex QP, we design two efficient algorithms: the symmetric Gauss-Seidel based alternating direction method of multipliers (sGS-ADMM), and the proximal augmented Lagrangian method (pALM) with the subproblems solved by the semismooth Newton method.
arXiv Detail & Related papers (2020-02-26T11:18:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.