Learning Polynomial Problems with $SL(2,\mathbb{R})$ Equivariance
- URL: http://arxiv.org/abs/2312.02146v1
- Date: Mon, 4 Dec 2023 18:59:19 GMT
- Title: Learning Polynomial Problems with $SL(2,\mathbb{R})$ Equivariance
- Authors: Hannah Lawrence, Mitchell Tong Harris
- Abstract summary: We show that neural networks can effectively solve such polynomial optimization and positivity-certification problems in a data-driven fashion, achieving tenfold speedups while retaining high accuracy.
We observe that these learning problems are equivariant to the non-compact group $SL(2,\mathbb{R})$, which consists of area-preserving linear transformations.
- Score: 6.5783892500847205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimizing and certifying the positivity of polynomials are fundamental
primitives across mathematics and engineering applications, from dynamical
systems to operations research. However, solving these problems in practice
requires large semidefinite programs, with poor scaling in dimension and
degree. In this work, we demonstrate for the first time that neural networks
can effectively solve such problems in a data-driven fashion, achieving tenfold
speedups while retaining high accuracy. Moreover, we observe that these
polynomial learning problems are equivariant to the non-compact group
$SL(2,\mathbb{R})$, which consists of area-preserving linear transformations.
We therefore adapt our learning pipelines to accommodate this structure,
including data augmentation, a new $SL(2,\mathbb{R})$-equivariant architecture,
and an architecture equivariant with respect to its maximal compact subgroup,
$SO(2, \mathbb{R})$. Surprisingly, the most successful approaches in practice
do not enforce equivariance to the entire group, which we prove arises from an
unusual lack of architecture universality for $SL(2,\mathbb{R})$ in particular.
A consequence of this result, which is of independent interest, is that there
exists an equivariant function for which there is no sequence of equivariant
polynomials multiplied by arbitrary invariants that approximates the original
function. This is a rare example of a symmetric problem where data augmentation
outperforms a fully equivariant architecture, and provides interesting lessons
in both theory and practice for other problems with non-compact symmetries.
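To make the equivariance structure described above concrete, below is a minimal sketch of what $SL(2,\mathbb{R})$ data augmentation could look like for homogeneous bivariate polynomials represented by coefficient vectors. The representation, the sampling scheme for group elements, and the function names (`random_sl2`, `act_sl2`) are illustrative assumptions, not code or notation from the paper.

```python
import numpy as np

def random_sl2(rng, scale=0.3):
    """Sample an SL(2,R) element as shear * shear * squeeze (det = 1).
    The sampling scheme is an illustrative assumption."""
    b, c = rng.normal(scale=scale, size=2)
    s = float(np.exp(rng.normal(scale=scale)))
    upper = np.array([[1.0, b], [0.0, 1.0]])
    lower = np.array([[1.0, 0.0], [c, 1.0]])
    squeeze = np.array([[s, 0.0], [0.0, 1.0 / s]])
    return upper @ lower @ squeeze

def act_sl2(coeffs, g):
    """Pull a homogeneous bivariate polynomial back along g in SL(2,R).

    coeffs[k] is the coefficient of x^(d-k) * y^k, with d = len(coeffs) - 1.
    Returns the coefficients of p(g[0,0]*x + g[0,1]*y, g[1,0]*x + g[1,1]*y).
    """
    d = len(coeffs) - 1
    out = np.zeros(d + 1)
    for k, c in enumerate(coeffs):
        term = np.array([1.0])
        for _ in range(d - k):
            term = np.convolve(term, g[0])   # multiply by (g00*x + g01*y)
        for _ in range(k):
            term = np.convolve(term, g[1])   # multiply by (g10*x + g11*y)
        out += c * term
    return out

# example: transform p(x, y) = x^2 + y^2 (coefficients of x^2, x*y, y^2)
rng = np.random.default_rng(0)
print(act_sl2(np.array([1.0, 0.0, 1.0]), random_sl2(rng)))
```

If the learning target is invariant to linear changes of variables (nonnegativity of a polynomial is, for example), an augmented training pair is simply `(act_sl2(coeffs, random_sl2(rng)), label)`.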
Related papers
- The Sample Complexity of Online Reinforcement Learning: A Multi-model Perspective [55.15192437680943]
We study the sample complexity of online reinforcement learning for nonlinear dynamical systems with continuous state and action spaces.
Our algorithms are likely to be useful in practice, due to their simplicity, the ability to incorporate prior knowledge, and their benign transient behavior.
arXiv Detail & Related papers (2025-01-27T10:01:28Z) - Variance Reduced Halpern Iteration for Finite-Sum Monotone Inclusions [18.086061048484616]
We study finite-sum monotone inclusion problems, which model broad classes of equilibrium problems.
Our main contributions are variants of the classical Halpern iteration that employ variance reduction to obtain improved complexity guarantees.
We argue that, up to poly-logarithmic factors, this complexity is unimprovable in the monotone Lipschitz setting. (A toy sketch of the classical Halpern step that these variants build on appears after this list.)
arXiv Detail & Related papers (2023-10-04T17:24:45Z) - Equivalence Between SE(3) Equivariant Networks via Steerable Kernels and
Group Convolution [90.67482899242093]
A wide range of techniques have been proposed in recent years for designing neural networks for 3D data that are equivariant under rotation and translation of the input.
We provide an in-depth analysis of both methods and their equivalence and relate the two constructions to multiview convolutional networks.
We also derive new TFN non-linearities from our equivalence principle and test them on practical benchmark datasets.
arXiv Detail & Related papers (2022-11-29T03:42:11Z) - Sparse Polynomial Optimization: Theory and Practice [5.27013884159732]
The book presents several efforts to tackle the scalability challenge of polynomial optimization, with important scientific implications.
It provides alternative optimization schemes that scale well in terms of computational complexity.
We present sparsity-exploiting hierarchies of relaxations, for either unconstrained or constrained problems.
arXiv Detail & Related papers (2022-08-23T18:56:05Z) - Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs. (A minimal toy sketch of the frame-averaging idea appears after this list.)
arXiv Detail & Related papers (2021-10-07T11:05:23Z) - Bilinear Classes: A Structural Framework for Provable Generalization in
RL [119.42509700822484]
Bilinear Classes is a new structural framework which permits generalization in reinforcement learning.
The framework incorporates nearly all existing models in which a polynomial sample complexity is achievable.
Our main result provides an RL algorithm with polynomial sample complexity for Bilinear Classes.
arXiv Detail & Related papers (2021-03-19T16:34:20Z) - Multiscale regression on unknown manifolds [13.752772802705978]
We construct low-dimensional coordinates on $\mathcal{M}$ at multiple scales and perform multiscale regression by local fitting.
We analyze the generalization error of our method by proving finite sample bounds in high probability on rich classes of priors.
Our algorithm has quasilinear complexity in the sample size, with constants linear in $D$ and exponential in $d$.
arXiv Detail & Related papers (2021-01-13T15:14:31Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the learning problem.
Our regret bounds are independent of the size of the state space.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - A conditional one-output likelihood formulation for multitask Gaussian
processes [0.0]
Multitask Gaussian processes (MTGP) are the Gaussian process framework's solution for multioutput regression problems.
Here we introduce a novel approach that simplifies multitask learning.
We show that it is computationally competitive with state-of-the-art options.
arXiv Detail & Related papers (2020-06-05T14:59:06Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning, including deep, convolutional and recurrent neural networks, reinforcement learning, normalizing flows, and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
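For the variance-reduced Halpern iteration entry above, the blurb only names the classical Halpern iteration it builds on. As a point of reference, here is a minimal sketch of that plain (non-variance-reduced) iteration applied to a firmly nonexpansive map; the anchoring schedule and the soft-thresholding toy operator are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def halpern(T, x0, num_iters=2000):
    """Plain Halpern iteration x_{k+1} = l_k * x0 + (1 - l_k) * T(x_k)
    with anchoring weights l_k = 1/(k+2); converges to a fixed point of
    a nonexpansive map T whenever one exists."""
    x = x0.copy()
    for k in range(num_iters):
        lam = 1.0 / (k + 2)
        x = lam * x0 + (1.0 - lam) * T(x)
    return x

# toy operator: the proximal map of 0.1 * ||x||_1 (soft-thresholding),
# which is firmly nonexpansive and has the unique fixed point 0
soft = lambda x: np.sign(x) * np.maximum(np.abs(x) - 0.1, 0.0)
print(halpern(soft, np.array([3.0, -2.0])))   # tends to [0, 0]
```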
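For the Frame Averaging entry above, here is a minimal toy sketch of the averaging idea, not the paper's code: the wrapper makes an arbitrary backbone invariant to translations of a point cloud (via a size-one frame given by the centroid shift) and to a global sign flip (by averaging over that two-element group). The class name and the toy backbone are illustrative assumptions.

```python
import torch

class FrameAveragedInvariance(torch.nn.Module):
    """Toy frame-averaging wrapper (illustrative, not the paper's code).

    Makes `backbone` invariant to (i) translations of a point cloud X of
    shape (n, d), using the size-one frame given by the centroid shift,
    and (ii) a global sign flip X -> -X, by averaging over that group.
    """

    def __init__(self, backbone: torch.nn.Module):
        super().__init__()
        self.backbone = backbone

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        Xc = X - X.mean(dim=0, keepdim=True)          # canonicalize translation
        outs = [self.backbone(s * Xc) for s in (1.0, -1.0)]
        return torch.stack(outs).mean(dim=0)          # average over the frame

# toy usage: a generic backbone on flattened (5, 3) point clouds
net = FrameAveragedInvariance(
    torch.nn.Sequential(torch.nn.Flatten(start_dim=0), torch.nn.Linear(15, 1))
)
X = torch.randn(5, 3)
# invariant to a sign flip composed with a translation
assert torch.allclose(net(X), net(-(X + 1.0)))
```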