Convex optimization over a probability simplex
- URL: http://arxiv.org/abs/2305.09046v1
- Date: Mon, 15 May 2023 22:14:22 GMT
- Title: Convex optimization over a probability simplex
- Authors: James Chok and Geoffrey M. Vasil
- Abstract summary: We propose a new iteration scheme, the Cauchy-Simplex, to optimize convex problems over the probability simplex.
Each iteration of the Cauchy-Simplex consists of simple operations, making it well-suited for high-dimensional problems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new iteration scheme, the Cauchy-Simplex, to optimize convex
problems over the probability simplex $\{w\in\mathbb{R}^n\ |\ \sum_i w_i=1\
\textrm{and}\ w_i\geq0\}$. Other works have taken steps to enforce positivity
or unit normalization automatically but never simultaneously within a unified
setting. This paper presents a natural framework for manifestly requiring the
probability condition. Specifically, we map the simplex to the positive
quadrant of a unit sphere, envisage gradient descent in latent variables, and
map the result back in a way that only depends on the simplex variable.
Moreover, proving rigorous convergence results in this formulation leads
inherently to tools from information theory (e.g. cross entropy and KL
divergence). Each iteration of the Cauchy-Simplex consists of simple
operations, making it well-suited for high-dimensional problems. We prove that
it has a convergence rate of ${O}(1/T)$ for convex functions, and numerical
experiments of projection onto convex hulls show faster convergence than
similar algorithms. Finally, we apply our algorithm to online learning problems
and prove the convergence of the average regret for (1) Prediction with expert
advice and (2) Universal Portfolios.
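As a concrete illustration of the idea described above, the following Python sketch maps a simplex point to the positive quadrant of the unit sphere via $u_i = \sqrt{w_i}$, takes a gradient step in the latent variable tangent to the sphere, and squares the result to return to the simplex. It is a minimal sketch of the geometric construction in the abstract, not the paper's exact Cauchy-Simplex update or step-size schedule; the function name, the step size, and the small least-squares example are assumptions.

```python
import numpy as np

def simplex_sphere_step(w, grad_w, eta=0.1):
    """One sketch iteration: simplex -> positive quadrant of the unit sphere
    -> gradient step in the latent variable -> back to the simplex.

    w      : current point on the probability simplex (w_i >= 0, sum w_i = 1)
    grad_w : gradient of the objective f with respect to w, evaluated at w
    eta    : step size (an illustrative choice, not the paper's schedule)
    """
    # Latent variable on the unit sphere: u_i = sqrt(w_i), so ||u||_2 = 1.
    u = np.sqrt(w)

    # Chain rule: f(w) = f(u**2)  =>  df/du_i = 2 * u_i * df/dw_i.
    grad_u = 2.0 * u * grad_w

    # Project out the radial component so the step stays tangent to the sphere.
    grad_tan = grad_u - np.dot(grad_u, u) * u

    # Descend in the latent variable, stay in the positive quadrant, renormalize.
    u_new = np.maximum(u - eta * grad_tan, 0.0)
    u_new /= np.linalg.norm(u_new)

    # Map back to the simplex; the update can be expressed in w alone.
    return u_new ** 2


# Toy example: minimize ||X w - y||^2 over the probability simplex.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(5, 3)), rng.normal(size=5)
w = np.full(3, 1.0 / 3.0)
for _ in range(200):
    grad = 2.0 * X.T @ (X @ w - y)
    w = simplex_sphere_step(w, grad, eta=0.05)
print(w, w.sum())
```

Because the latent variable is renormalized onto the sphere before squaring, the iterate satisfies $\sum_i w_i = 1$ and $w_i \geq 0$ by construction.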
Related papers
- Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance reduced ZO estimators for complex gradient problems.
We improve the state-of-the-art function complexities from $\mathcal{O}\left(\min\left(\frac{dn^{1/2}}{\epsilon^2}, \frac{d}{\epsilon^3}\right)\right)$ to $\tilde{\mathcal{O}}\left(\frac{d}{\epsilon^2}\right)$.
arXiv Detail & Related papers (2024-10-03T15:04:01Z) - Convergence of Some Convex Message Passing Algorithms to a Fixed Point [0.0]
A popular approach to the MAP inference problem in graphical models is to minimize an upper bound obtained from a dual linear programming or Lagrangian relaxation by (block-coordinate) descent.
This is also known as convex/convergent message passing; examples are max-sum diffusion and sequential tree-reweighted message passing (TRW-S).
We show that the algorithm terminates within $\mathcal{O}(1/\varepsilon)$ iterations.
arXiv Detail & Related papers (2024-03-07T13:14:21Z) - Strictly Low Rank Constraint Optimization -- An Asymptotically
$\mathcal{O}(\frac{1}{t^2})$ Method [5.770309971945476]
We consider a class of non-convex and non-smooth problems with rank regularization to promote sparsity in the optimal solution.
We show that our algorithms are able to achieve an asymptotic convergence rate of $O(\frac{1}{t^2})$, which is exactly the same as Nesterov's optimal convergence rate for first-order methods on smooth convex problems.
arXiv Detail & Related papers (2023-07-04T16:55:41Z) - Min-Max Optimization Made Simple: Approximating the Proximal Point
Method via Contraction Maps [77.8999425439444]
We present a first-order method that admits near-optimal convergence rates for convex/concave min-max problems.
Our work is based on the fact that the update rule of the Proximal Point method can be approximated up to a prescribed accuracy.
arXiv Detail & Related papers (2023-01-10T12:18:47Z) - Multi-block-Single-probe Variance Reduced Estimator for Coupled
Compositional Optimization [49.58290066287418]
We propose a novel method named Multi-block-Single-probe Variance Reduced (MSVR) estimator to alleviate the complexity of coupled compositional problems.
Our results improve upon prior ones in several aspects, including the order of the sample complexities and the dependence on strong convexity.
arXiv Detail & Related papers (2022-07-18T12:03:26Z) - Randomized Coordinate Subgradient Method for Nonsmooth Composite
Optimization [11.017632675093628]
Coordinate-type subgradient methods for addressing nonsmooth problems are relatively underexplored, due in part to the properties required by the Lipschitz-type assumption.
arXiv Detail & Related papers (2022-06-30T02:17:11Z) - Stochastic Zeroth order Descent with Structured Directions [10.604744518360464]
We introduce and analyze Structured Stochastic Zeroth order Descent (SSZD), a finite-difference approach that approximates a gradient along a set of $l \leq d$ directions, where $d$ is the dimension of the ambient space.
For convex functions, we prove almost sure convergence and a rate on the function values of $O((d/l)\, k^{-c})$ for every $c < 1/2$, which is arbitrarily close to that of Stochastic Gradient Descent (SGD) in terms of the number of iterations.
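The structured finite-difference idea can be sketched generically: estimate directional derivatives along $l \leq d$ orthonormal directions and assemble them into a surrogate gradient. The sketch below is a hedged illustration of that technique class, not the exact SSZD algorithm or its step-size analysis; the helper names (structured_fd_gradient, zeroth_order_descent), the QR-based choice of directions, and the quadratic test problem are assumptions.

```python
import numpy as np

def structured_fd_gradient(f, x, l, h=1e-5, rng=None):
    """Finite-difference gradient surrogate built from l <= d directions.

    Draws l orthonormal directions (via QR of a Gaussian matrix), takes a
    forward difference along each, and combines them into a search direction.
    A generic sketch, not the exact SSZD construction from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    # l orthonormal directions P = [p_1, ..., p_l], each in R^d.
    P, _ = np.linalg.qr(rng.normal(size=(d, l)))
    fx = f(x)
    # (f(x + h p_i) - f(x)) / h approximates <grad f(x), p_i>.
    coeffs = np.array([(f(x + h * P[:, i]) - fx) / h for i in range(l)])
    # Sum_i <grad f, p_i> p_i: the gradient projected onto span(P).
    return P @ coeffs


def zeroth_order_descent(f, x0, l, steps=500, lr=0.1):
    """Plain descent using the structured finite-difference surrogate."""
    x = x0.copy()
    for _ in range(steps):
        x -= lr * structured_fd_gradient(f, x, l)
    return x


# Example: minimize a quadratic in R^10 using only l = 3 directions per step.
f = lambda x: np.sum((x - 1.0) ** 2)
x_star = zeroth_order_descent(f, np.zeros(10), l=3)
print(np.round(x_star, 2))  # approaches the all-ones minimizer
```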
arXiv Detail & Related papers (2022-06-10T14:00:06Z) - A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
We consider the problem of finding a saddle point for the convex-concave objective $\min_x \max_y f(x) + \langle Ax, y\rangle - g^*(y)$, where $f$ is a convex function with locally Lipschitz gradient and $g$ is convex and possibly non-smooth.
We propose an adaptive version of the Condat-Vu algorithm, which alternates between primal gradient steps and dual steps.
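For context, a minimal non-adaptive sketch of this kind of primal-dual scheme (a Condat-Vu-style iteration with fixed step sizes) is shown below; it does not reproduce the paper's adaptive step-size rule, and the concrete instance (quadratic f and g equal to the l1 norm, whose conjugate's prox is a clip) is an assumption for illustration.

```python
import numpy as np

def primal_dual(grad_f, prox_g_conj, A, x0, y0, tau, sigma, iters=1000):
    """Non-adaptive primal-dual (Condat-Vu-style) iteration for
        min_x max_y f(x) + <Ax, y> - g*(y),
    with f smooth (gradient grad_f) and g* handled through its prox.

    A sketch under a standard step-size condition (roughly
    1/tau - sigma*||A||^2 >= L_f/2); the paper's adaptive rule for
    choosing tau and sigma locally is not reproduced here.
    """
    x, y = x0.copy(), y0.copy()
    for _ in range(iters):
        x_new = x - tau * (grad_f(x) + A.T @ y)                     # primal gradient step
        y = prox_g_conj(y + sigma * (A @ (2 * x_new - x)), sigma)   # dual proximal step
        x = x_new
    return x, y


# Assumed instance: f(x) = 0.5*||x - b||^2 and g = ||.||_1, so g* is the
# indicator of the infinity-norm ball and its prox is a clip (sigma unused).
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 6))
b = rng.normal(size=6)
grad_f = lambda x: x - b
prox_linf_ball = lambda v, sigma: np.clip(v, -1.0, 1.0)

L_f, norm_A = 1.0, np.linalg.norm(A, 2)
sigma = 0.9 / norm_A
tau = 0.9 / (L_f / 2 + sigma * norm_A**2)

x_opt, y_opt = primal_dual(grad_f, prox_linf_ball, A, np.zeros(6), np.zeros(4), tau, sigma)
print(np.round(x_opt, 3))  # approximate minimizer of 0.5*||x - b||^2 + ||Ax||_1
```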
arXiv Detail & Related papers (2021-10-28T14:19:30Z) - Learning Sparse Graph with Minimax Concave Penalty under Gaussian Markov
Random Fields [51.07460861448716]
This paper presents a convex-analytic framework to learn sparse graphs from data.
We show that a triangular convexity decomposition is guaranteed by a transform of the matrix corresponding to its upper part.
arXiv Detail & Related papers (2021-09-17T17:46:12Z) - Newton-LESS: Sparsification without Trade-offs for the Sketched Newton
Update [88.73437209862891]
In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.
We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching.
We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings.
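The sketched-Newton idea behind this line of work can be illustrated generically: replace the Hessian H = C^T C by (SC)^T (SC) for a small random sketch S and solve against the sketched matrix. The simple sparse Rademacher-style sketch below is only a stand-in, not the leverage-score-sparsified (LESS) embedding analyzed in the paper, and all names and sizes are assumptions.

```python
import numpy as np

def sparse_sketch(m, n, nnz_per_row, rng):
    """A simple sparse random sketching matrix (m x n): each row has
    nnz_per_row nonzero +/- entries, scaled so that E[S.T @ S] = I.
    Illustrative only -- not the LESS embedding from the paper."""
    S = np.zeros((m, n))
    scale = np.sqrt(n / (m * nnz_per_row))
    for i in range(m):
        cols = rng.choice(n, size=nnz_per_row, replace=False)
        S[i, cols] = rng.choice([-1.0, 1.0], size=nnz_per_row) * scale
    return S


def sketched_newton_step(C, grad, m, nnz_per_row=3, rng=None):
    """Newton-type step using a sketched Hessian.

    C    : n x d matrix with H = C.T @ C (a Hessian square root)
    grad : gradient at the current point
    m    : sketch size (m << n)
    """
    rng = np.random.default_rng() if rng is None else rng
    SC = sparse_sketch(m, C.shape[0], nnz_per_row, rng) @ C
    H_tilde = SC.T @ SC                      # sketched Hessian approximation
    return np.linalg.solve(H_tilde, grad)    # approximate Newton direction


# Example: sketched Newton iterations for least squares 0.5*||C x - b||^2.
rng = np.random.default_rng(0)
n, d = 2000, 10
C, b = rng.normal(size=(n, d)), rng.normal(size=n)
x_exact = np.linalg.lstsq(C, b, rcond=None)[0]
x = np.zeros(d)
for _ in range(10):
    grad = C.T @ (C @ x - b)
    x -= sketched_newton_step(C, grad, m=100, rng=rng)
print(np.linalg.norm(x - x_exact))  # error shrinks with more iterations / larger m
```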
arXiv Detail & Related papers (2021-07-15T17:33:05Z) - A Unified Single-loop Alternating Gradient Projection Algorithm for
Nonconvex-Concave and Convex-Nonconcave Minimax Problems [8.797831153231664]
We develop an efficient algorithm for solving minimax problems with theoretical guarantees for general nonconvex-concave objectives.
We show that the proposed algorithm can be used to solve both nonconvex-(strongly) concave and (strongly) convex-nonconcave minimax problems.
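A generic single-loop alternating gradient projection scheme for constrained minimax problems can be sketched as follows; this illustrates the basic algorithm class named in the title, without the paper's specific step sizes or nonconvex-concave analysis, and the bilinear-plus-quadratic test instance is an assumption.

```python
import numpy as np

def alternating_gradient_projection(grad_x, grad_y, proj_x, proj_y,
                                    x0, y0, lr_x=0.05, lr_y=0.05, iters=2000):
    """Single-loop alternating projected gradient descent-ascent for
    min_x max_y f(x, y) with x and y constrained to convex sets.

    A generic sketch of the algorithm class; the paper's step-size choices,
    regularization, and convergence analysis are not reproduced."""
    x, y = x0.copy(), y0.copy()
    for _ in range(iters):
        x = proj_x(x - lr_x * grad_x(x, y))   # descent step on x, then project
        y = proj_y(y + lr_y * grad_y(x, y))   # ascent step on y, then project
    return x, y


# Assumed instance: min_{||x||<=1} max_{||y||<=1} 0.5*||x||^2 + x.T Q y - 0.5*||y||^2
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 3)) * 0.1
grad_x = lambda x, y: x + Q @ y
grad_y = lambda x, y: Q.T @ x - y
proj_ball = lambda v: v / max(1.0, np.linalg.norm(v))

x_eq, y_eq = alternating_gradient_projection(grad_x, grad_y, proj_ball, proj_ball,
                                             np.ones(3), np.ones(3))
print(np.round(x_eq, 3), np.round(y_eq, 3))  # both approach the saddle point at the origin
```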
arXiv Detail & Related papers (2020-06-03T04:00:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.