Stochastic Flows and Geometric Optimization on the Orthogonal Group
- URL: http://arxiv.org/abs/2003.13563v1
- Date: Mon, 30 Mar 2020 15:37:50 GMT
- Title: Stochastic Flows and Geometric Optimization on the Orthogonal Group
- Authors: Krzysztof Choromanski, David Cheikhi, Jared Davis, Valerii
Likhosherstov, Achille Nazaret, Achraf Bahamou, Xingyou Song, Mrugank Akarte,
Jack Parker-Holder, Jacob Bergquist, Yuan Gao, Aldo Pacchiano, Tamas Sarlos,
Adrian Weller, Vikas Sindhwani
- Abstract summary: We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, normalizing flows and metric learning.
- Score: 52.50121190744979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new class of stochastic, geometrically-driven optimization
algorithms on the orthogonal group $O(d)$ and naturally reductive homogeneous
manifolds obtained from the action of the rotation group $SO(d)$. We
theoretically and experimentally demonstrate that our methods can be applied in
various fields of machine learning including deep, convolutional and recurrent
neural networks, reinforcement learning, normalizing flows and metric learning.
We show an intriguing connection between efficient stochastic optimization on
the orthogonal group and graph theory (e.g. matching problem, partition
functions over graphs, graph-coloring). We leverage the theory of Lie groups
and provide theoretical results for the designed class of algorithms. We
demonstrate broad applicability of our methods by showing strong performance on
the seemingly unrelated tasks of learning world models to obtain stable
policies for the most difficult $\mathrm{Humanoid}$ agent from
$\mathrm{OpenAI}$ $\mathrm{Gym}$ and improving convolutional neural networks.
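To make the geometric-optimization setting concrete, below is a minimal, self-contained sketch of stochastic gradient descent on $O(d)$ using a Cayley retraction, a standard feasible-update scheme for orthogonality constraints. It illustrates the general setup, not the specific algorithms introduced in this paper; the toy loss, step size, and all function names are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not the paper's algorithms): stochastic Riemannian
# gradient descent on the orthogonal group O(d) with a Cayley retraction.
# The loss, step size, and all names here are illustrative placeholders.

def cayley_retraction(X, A, step):
    """Retract X along the descent curve defined by skew-symmetric A.

    For skew-symmetric A, (I + (step/2) A)^{-1} (I - (step/2) A) is an
    orthogonal matrix, so the iterate stays exactly on O(d).
    """
    I = np.eye(X.shape[0])
    Q = np.linalg.solve(I + 0.5 * step * A, I - 0.5 * step * A)
    return Q @ X

def riemannian_sgd_step(X, euclid_grad, step):
    """One step: map the (possibly stochastic) Euclidean gradient G to a
    skew-symmetric direction A = G X^T - X G^T, then retract."""
    G = euclid_grad(X)
    A = G @ X.T - X @ G.T   # skew-symmetric by construction
    return cayley_retraction(X, A, step)

# Toy usage: recover a random target T in O(d) by minimizing ||X - T||_F^2.
rng = np.random.default_rng(0)
d = 5
T, _ = np.linalg.qr(rng.standard_normal((d, d)))  # target on O(d)
X, _ = np.linalg.qr(rng.standard_normal((d, d)))  # initial iterate on O(d)
grad = lambda Y: 2.0 * (Y - T)                    # Euclidean gradient

for _ in range(200):
    X = riemannian_sgd_step(X, grad, step=0.1)

print("orthogonality error:", np.linalg.norm(X.T @ X - np.eye(d)))
print("distance to target :", np.linalg.norm(X - T))
```

The Cayley transform keeps every iterate exactly orthogonal without any re-projection step, which is one reason retraction-based updates are attractive for the manifold-constrained problems this line of work studies.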
Related papers
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - An iterative clustering algorithm for the Contextual Stochastic Block
Model with optimality guarantees [4.007017852999008]
We propose a new iterative algorithm to cluster networks with side information for nodes.
We show that our algorithm is optimal under the Contextual Symmetric Block Model.
arXiv Detail & Related papers (2021-12-20T12:04:07Z) - Harnessing Heterogeneity: Learning from Decomposed Feedback in Bayesian
Modeling [68.69431580852535]
We introduce a novel GP regression model that incorporates subgroup feedback.
Our modified regression has provably lower variance -- and thus a more accurate posterior -- compared to previous approaches.
We evaluate our algorithm on two disparate social problems.
arXiv Detail & Related papers (2021-07-07T03:57:22Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player to play against smooth algorithms deployed by an adversary.
Our algorithm is guaranteed to make monotonic progress (thus having no limit cycles) and to find an appropriate solution after a bounded number of gradient ascent steps.
arXiv Detail & Related papers (2021-06-02T22:03:36Z) - Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient
adaptive algorithms for neural networks [0.0]
We present a new class of Langevin-based algorithms, which overcomes many of the known shortcomings of popular adaptive optimizers.
In particular, we provide a nonasymptotic analysis and full theoretical guarantees for the convergence properties of an algorithm of this novel class, which we named TH$\varepsilon$O POULA (or, simply, TheoPouLa).
arXiv Detail & Related papers (2021-05-28T15:58:48Z) - A Novel Surrogate-assisted Evolutionary Algorithm Applied to
Partition-based Ensemble Learning [0.0]
We propose a novel surrogate-assisted evolutionary algorithm for solving expensive optimization problems.
We integrate a surrogate model, which is used for fitness value estimation, into a state-of-the-art P3-like variant of the Gene-pool Optimal Mixing Evolutionary Algorithm (GOMEA).
We test the proposed algorithm on an ensemble learning problem.
arXiv Detail & Related papers (2021-04-16T11:51:18Z) - Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals.
We propose a general graph estimator based on a novel structured fusion regularization.
We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantees.
arXiv Detail & Related papers (2021-03-05T04:42:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.