Beyond Log-Concavity: Theory and Algorithm for Sum-Log-Concave
Optimization
- URL: http://arxiv.org/abs/2309.15298v1
- Date: Tue, 26 Sep 2023 22:22:45 GMT
- Title: Beyond Log-Concavity: Theory and Algorithm for Sum-Log-Concave
Optimization
- Authors: Mastane Achab
- Abstract summary: We show that such functions are in general not convex but still satisfy generalized convexity inequalities.
We propose the Cross Gradient Descent (XGD) algorithm moving in the opposite direction of the cross-gradient.
We introduce the so-called checkered regression method relying on a sum-log-concave function.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper extends the classic theory of convex optimization to the
minimization of functions that are equal to the negated logarithm of what we
term as a sum-log-concave function, i.e., a sum of log-concave functions. In
particular, we show that such functions are in general not convex but still
satisfy generalized convexity inequalities. These inequalities unveil the key
importance of a certain vector that we call the cross-gradient and that is, in
general, distinct from the usual gradient. Thus, we propose the Cross Gradient
Descent (XGD) algorithm moving in the opposite direction of the cross-gradient
and derive a convergence analysis. As an application of our sum-log-concave
framework, we introduce the so-called checkered regression method relying on a
sum-log-concave function. This classifier extends (multiclass) logistic
regression to non-linearly separable problems since it is capable of
tessellating the feature space by using any given number of hyperplanes,
creating a checkerboard-like pattern of decision regions.
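To make the setup concrete, here is a minimal numerical sketch (not taken from the paper) of a one-dimensional sum-log-concave objective: the negated logarithm of a sum of two Gaussian-shaped log-concave terms, which is visibly non-convex. The paper's cross-gradient and the XGD update are defined in the article itself and are not reproduced here; the code only runs ordinary gradient descent on the negated-log objective as a baseline, and all constants and function names are illustrative assumptions.

```python
import numpy as np

# Two log-concave components g_k(theta) = exp(-(theta - c_k)^2); their sum is
# "sum-log-concave" in the paper's terminology.  The objective is
# F(theta) = -log( g_1(theta) + g_2(theta) ).
centers = np.array([-2.0, 2.0])  # illustrative constants, not from the paper

def F(theta):
    return -np.log(np.sum(np.exp(-(theta - centers) ** 2)))

def grad_F(theta):
    w = np.exp(-(theta - centers) ** 2)       # unnormalized component weights
    return np.sum(w * 2.0 * (theta - centers)) / np.sum(w)

# Non-convexity check: F violates the midpoint convexity inequality between the modes.
a, b = -2.0, 2.0
print(F((a + b) / 2) > (F(a) + F(b)) / 2)     # True -> F is not convex

# Baseline: plain gradient descent on F (the paper's XGD instead follows the
# cross-gradient, which is in general different from grad_F).
theta, lr = 0.5, 0.1
for _ in range(200):
    theta -= lr * grad_F(theta)
print(theta)  # converges toward one of the modes (here ~2.0)
```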
Related papers
- Optimal lower bounds for logistic log-likelihoods [1.3124513975412255]
The logit transform is arguably the most widely-employed link function beyond linear settings.
It is still unclear whether tangent lower bounds sharper than quadratic ones can be derived without undermining the tractability of the resulting minorizer.
This article addresses such a challenging question through the design and study of a novel piece-wise quadratic lower bound.
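For context, the sketch below checks the classical Jaakkola-Jordan quadratic tangent lower bound on the logistic log-likelihood, i.e., the kind of quadratic minorizer such articles aim to sharpen; the piece-wise quadratic bound proposed in the cited paper is not reproduced, and the anchor point and grid are illustrative assumptions.

```python
import numpy as np

def log_sigmoid(x):
    # numerically stable log of the logistic sigmoid
    return -np.logaddexp(0.0, -x)

def jj_quadratic_bound(x, xi):
    """Classical Jaakkola-Jordan quadratic tangent lower bound on log sigma(x),
    tight at x = +/- xi.  Only the standard baseline bound, not the piece-wise
    quadratic minorizer studied in the cited article."""
    lam = np.tanh(xi / 2.0) / (4.0 * xi)
    return log_sigmoid(xi) + (x - xi) / 2.0 - lam * (x ** 2 - xi ** 2)

x = np.linspace(-6, 6, 201)
xi = 1.5  # illustrative anchor point
gap = log_sigmoid(x) - jj_quadratic_bound(x, xi)
print(gap.min() >= -1e-12)  # True: the quadratic really is a lower bound
```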
arXiv Detail & Related papers (2024-10-14T09:09:33Z)
- Non-asymptotic convergence bounds for Sinkhorn iterates and their gradients: a coupling approach [10.568851068989972]
We focus on a relaxation of the original OT problem, the entropic OT problem, which makes it possible to implement efficient and practical algorithmic solutions.
This formulation, also known as the Schrödinger Bridge problem, notably connects with Stochastic Optimal Control (SOC) and can be solved with the popular Sinkhorn algorithm.
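As a rough illustration of the algorithmic side, the snippet below is a generic textbook Sinkhorn iteration for entropic OT between two small histograms; it is not the specific estimator or coupling analysis of the cited paper, and the cost matrix, regularization value, and iteration count are arbitrary choices.

```python
import numpy as np

def sinkhorn(a, b, C, eps, n_iter=500):
    """Minimal Sinkhorn iteration for entropic OT between histograms a and b
    with cost matrix C and entropic regularization eps."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                # alternating marginal scaling
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]      # transport plan
    return P

# Tiny example: two uniform histograms over 3 points on a line.
x = np.array([0.0, 1.0, 2.0])
C = (x[:, None] - x[None, :]) ** 2
a = b = np.ones(3) / 3
P = sinkhorn(a, b, C, eps=0.1)
print(P.sum(axis=1), P.sum(axis=0))      # both close to a and b
```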
arXiv Detail & Related papers (2023-04-13T13:58:25Z)
- Universal coding, intrinsic volumes, and metric complexity [3.4392739159262145]
We study sequential probability assignment in the Gaussian setting, where the goal is to predict, or equivalently compress, a sequence of real-valued observations.
As part of our analysis, we also describe the minimax regret for a general set. We finally relate our findings to classical results in information theory.
arXiv Detail & Related papers (2023-03-13T16:54:04Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable parts separately, linearizing only the smooth parts.
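A classical instance of this linearization idea is the conditional-gradient (Frank-Wolfe) method, sketched below for a smooth objective over the probability simplex; this is only a standard illustration under an assumed step-size schedule, not the fully composite algorithm of the cited paper.

```python
import numpy as np

def frank_wolfe_simplex(grad_f, x0, n_iter=200):
    """Generic conditional-gradient (Frank-Wolfe) sketch over the probability
    simplex: at each step the smooth objective is linearized at x and the
    linear model is minimized over the compact convex set."""
    x = x0.copy()
    for t in range(n_iter):
        g = grad_f(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0            # vertex minimizing the linearized model
        gamma = 2.0 / (t + 2.0)          # standard step-size schedule
        x = (1 - gamma) * x + gamma * s
    return x

# Example: minimize ||x - c||^2 over the simplex.
c = np.array([0.1, 0.7, 0.2])
x = frank_wolfe_simplex(lambda x: 2 * (x - c), np.ones(3) / 3)
print(x)  # close to c, which already lies in the simplex
```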
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- Linear Convergence of ISTA and FISTA [8.261388753972234]
We revisit the class of iterative shrinkage-thresholding algorithms (ISTA) for solving the linear inverse problem with sparse representation.
We find that the previous assumption that the smooth part is convex weakens the least-squares model.
We generalize the linear convergence to composite optimization in both the objective value and the squared proximal subgradient norm.
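For reference, here is a minimal textbook ISTA sketch for the sparse least-squares problem, with a gradient step on the smooth part followed by soft-thresholding; the step size, penalty level, and test data are illustrative assumptions, and the refined linear-convergence analysis of the cited paper is not reproduced.

```python
import numpy as np

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista(A, y, lam, n_iter=500):
    """Minimal ISTA sketch for min_x 0.5*||Ax - y||^2 + lam*||x||_1:
    a gradient step on the smooth least-squares part followed by
    soft-thresholding (the proximal step for the l1 penalty)."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
x_true = np.zeros(10)
x_true[:2] = [1.0, -0.5]
x_hat = ista(A, A @ x_true, lam=0.1)
print(np.round(x_hat, 2))                # approximately as sparse as x_true
```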
arXiv Detail & Related papers (2022-12-13T02:02:50Z)
- Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization [50.83356836818667]
Stochastic Gradient Langevin Dynamics is one of the most fundamental algorithms to solve non-convex optimization problems.
In this paper, we study two variants of this kind, namely the Variance Reduced Langevin Dynamics and the Recursive Gradient Langevin Dynamics.
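As a baseline illustration, the snippet below implements plain gradient Langevin dynamics for sampling from a simple target; the variance-reduced and recursive-gradient variants studied in the cited paper would replace the exact gradient with lower-variance stochastic estimates, and the step size and target here are arbitrary assumptions.

```python
import numpy as np

def langevin_dynamics(grad_log_target, theta0, step, n_iter=1000, rng=None):
    """Plain gradient Langevin dynamics sketch:
    theta <- theta + step * grad_log_target(theta) + sqrt(2*step) * noise."""
    rng = rng or np.random.default_rng(0)
    theta = np.asarray(theta0, dtype=float)
    samples = []
    for _ in range(n_iter):
        noise = rng.standard_normal(theta.shape)
        theta = theta + step * grad_log_target(theta) + np.sqrt(2 * step) * noise
        samples.append(theta.copy())
    return np.array(samples)

# Example: sample from a standard Gaussian, whose log-density gradient is -theta.
samples = langevin_dynamics(lambda th: -th, theta0=np.zeros(1), step=0.05)
print(samples.mean(), samples.std())     # roughly 0 and 1
```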
arXiv Detail & Related papers (2022-03-30T11:39:00Z)
- On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that this phenomenon can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
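A finite-dimensional toy version of this effect (an illustrative simplification, not the paper's Hilbert-space analysis) can be seen on a diagonal quadratic, where the residual along each eigen-direction after t gradient steps is (1 - lr * lambda_i)^t, so the learning rate reshapes which spectral components are fit before early stopping:

```python
import numpy as np

# Gradient descent on 0.5 * x^T diag(lams) x - b^T x with early stopping.
lams = np.array([1.0, 0.1, 0.01])        # eigenvalues of the quadratic
b = np.ones(3)
x_star = b / lams                         # exact minimizer, component-wise

def gd_solution(lr, t):
    x = np.zeros(3)
    for _ in range(t):
        x -= lr * (lams * x - b)
    return x

for lr in (0.1, 1.0):
    print(lr, np.round(gd_solution(lr, t=20) / x_star, 3))
# With the larger rate, the small-eigenvalue components are much closer to
# their optimal values after the same number of steps.
```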
arXiv Detail & Related papers (2022-02-28T13:01:04Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - Learning Sparse Graph with Minimax Concave Penalty under Gaussian Markov
Random Fields [51.07460861448716]
This paper presents a convex-analytic framework to learn sparse graphs from data.
We show that a triangular convexity decomposition is guaranteed by a transform corresponding to its upper part.
arXiv Detail & Related papers (2021-09-17T17:46:12Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work is on zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity and function query cost.
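To illustrate the basic ingredient, the sketch below is a generic two-point zeroth-order gradient estimator with uniformly sampled directions; the coordinate importance sampling and hybrid design of the cited paper are not reproduced, and the smoothing radius and direction count are illustrative assumptions.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-2, n_dirs=20, rng=None):
    """Generic two-point zeroth-order gradient estimator: probes f along
    random Gaussian directions and averages finite differences."""
    rng = rng or np.random.default_rng(0)
    g = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_dirs

# Example: minimize f(x) = ||x||^2 using only function evaluations.
x = np.array([1.0, -2.0, 0.5])
for _ in range(100):
    x -= 0.1 * zo_gradient(lambda z: np.dot(z, z), x)
print(np.round(x, 3))  # approaches the origin
```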
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression [0.0]
This paper proposes a working set algorithm for non-convex regularizers with convergence guarantees.
Our results demonstrate high gains compared to the full-problem solver, whether a block-coordinate or a gradient solver is used.
arXiv Detail & Related papers (2020-06-24T07:40:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.