Analysis of Generalized Bregman Surrogate Algorithms for Nonsmooth
Nonconvex Statistical Learning
- URL: http://arxiv.org/abs/2112.09191v1
- Date: Thu, 16 Dec 2021 20:37:40 GMT
- Title: Analysis of Generalized Bregman Surrogate Algorithms for Nonsmooth
Nonconvex Statistical Learning
- Authors: Yiyuan She, Zhifeng Wang, Jiuwu Jin
- Abstract summary: This paper focuses on a broad Bregman-surrogate algorithm framework including local linear approximation, mirror descent, iterative thresholding, DC programming and many others as particular instances.
- Score: 2.049702429898688
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern statistical applications often involve minimizing an objective
function that may be nonsmooth and/or nonconvex. This paper focuses on a broad
Bregman-surrogate algorithm framework including the local linear approximation,
mirror descent, iterative thresholding, DC programming and many others as
particular instances. The recharacterization via generalized Bregman functions
enables us to construct suitable error measures and establish global
convergence rates for nonconvex and nonsmooth objectives in possibly high
dimensions. For sparse learning problems with a composite objective, under some
regularity conditions, the obtained estimators as the surrogate's fixed points,
though not necessarily local minimizers, enjoy provable statistical guarantees,
and the sequence of iterates can be shown to approach the statistical truth
within the desired accuracy geometrically fast. The paper also studies how to
design adaptive momentum-based accelerations without assuming convexity or
smoothness by carefully controlling stepsize and relaxation parameters.
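As a concrete illustration of this framework, the sketch below shows how iterative thresholding emerges when the smooth part of a composite objective 0.5*||y - X b||^2 + lam*||b||_1 is majorized by a quadratic surrogate at each iterate; the gap between the objective and the surrogate is a generalized Bregman divergence, and minimizing the surrogate reduces to a soft-thresholding step. This is a minimal sketch with assumed function names and a fixed stepsize, not the paper's exact algorithm or notation.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal map of the l1 penalty tau * ||.||_1, applied elementwise."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def bregman_surrogate_ista(X, y, lam, step=None, n_iter=500):
    """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by repeatedly minimizing a
    quadratic surrogate of the smooth loss anchored at the current iterate;
    each surrogate minimization is a single soft-thresholding step."""
    p = X.shape[1]
    if step is None:
        # 1/L, with L = ||X||_2^2 an upper bound on the loss curvature
        step = 1.0 / np.linalg.norm(X, 2) ** 2
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)                       # gradient of the smooth loss
        b = soft_threshold(b - step * grad, step * lam)
    return b
```

The momentum-based acceleration studied in the paper would add a relaxation/extrapolation step between iterations with carefully controlled parameters; that refinement is omitted from this sketch.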
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
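For context only, here is a generic sketch of an error-feedback step combined with a normalized update, assuming a top-k sparsifier as the compression operator; the paper's distributed setting, its precise normalization, and its $(L_0,L_1)$-smoothness analysis are not reproduced here.

```python
import numpy as np

def top_k(v, k):
    """Top-k sparsifier used here as an illustrative compression operator."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_normalized_step(w, e, grad, lr, k):
    """One error-feedback step with a normalized update: compress the
    error-corrected gradient, carry the compression residual forward,
    and move along the normalized compressed direction."""
    msg = top_k(grad + e, k)                  # compressed, error-corrected gradient
    e = e + grad - msg                        # residual kept for the next round
    w = w - lr * msg / (np.linalg.norm(msg) + 1e-12)
    return w, e
```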
arXiv Detail & Related papers (2024-10-22T10:19:27Z)
- Probabilistic Iterative Hard Thresholding for Sparse Learning [2.5782973781085383]
We present an approach towards solving expectation objective optimization problems with cardinality constraints.
We prove convergence of the underlying process, and demonstrate the performance on two Machine Learning problems.
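As a reference point for this entry, below is a minimal sketch of deterministic iterative hard thresholding for a cardinality-constrained least-squares instance; the probabilistic variant studied in the paper targets expectation objectives, so the full gradient would be replaced by a stochastic (mini-batch) estimate. The names and stepsize rule are illustrative assumptions.

```python
import numpy as np

def hard_threshold(b, s):
    """Keep the s largest-magnitude entries of b and zero out the rest."""
    out = np.zeros_like(b)
    idx = np.argsort(np.abs(b))[-s:]
    out[idx] = b[idx]
    return out

def iht_least_squares(X, y, s, step=None, n_iter=300):
    """Iterative hard thresholding for min 0.5*||y - X b||^2 s.t. ||b||_0 <= s."""
    p = X.shape[1]
    if step is None:
        step = 1.0 / np.linalg.norm(X, 2) ** 2        # conservative fixed stepsize
    b = np.zeros(p)
    for _ in range(n_iter):
        b = hard_threshold(b - step * X.T @ (X @ b - y), s)
    return b
```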
arXiv Detail & Related papers (2024-09-02T18:14:45Z)
- A Universal Class of Sharpness-Aware Minimization Algorithms [57.29207151446387]
We introduce a new class of sharpness measures, leading to new sharpness-aware objective functions.
We prove that these measures are universally expressive, allowing any function of the training loss Hessian matrix to be represented by choosing appropriate hyperparameters.
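For orientation, the sketch below implements the standard sharpness-aware minimization (SAM) update, which the paper generalizes through a broader class of sharpness measures; the radius, learning rate, and function names are illustrative assumptions and do not reflect the paper's new objectives.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One standard SAM update: take an ascent step of radius rho toward the
    locally worst-case perturbation, then descend using the gradient
    evaluated at the perturbed point."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)   # normalized ascent direction
    return w - lr * grad_fn(w + eps)              # descend from the perturbed point
```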
arXiv Detail & Related papers (2024-06-06T01:52:09Z)
- Smoothing the Edges: Smooth Optimization for Sparse Regularization using Hadamard Overparametrization [10.009748368458409]
We present a framework for smooth optimization of explicitly regularized objectives for (structured) sparsity.
Our method enables fully differentiable approximation-free optimization and is thus compatible with the ubiquitous gradient descent paradigm in deep learning.
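A hedged sketch of the underlying idea, assuming the classical Hadamard-product parametrization w = u * v: squared-l2 penalties on u and v turn an l1-regularized problem into a smooth one that plain gradient descent can handle, since minimizing (||u||^2 + ||v||^2)/2 over factorizations of w equals ||w||_1. The paper's exact parametrization and its structured-sparsity extensions may differ.

```python
import numpy as np

def smooth_lasso_gd(X, y, lam, lr=1e-3, n_iter=5000, seed=0):
    """Gradient descent on the smooth objective
        0.5*||y - X(u*v)||^2 + (lam/2)*(||u||^2 + ||v||^2),
    whose minimizers correspond to w = u*v with an effective penalty lam*||w||_1."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    u = 0.01 * rng.standard_normal(p)
    v = 0.01 * rng.standard_normal(p)
    for _ in range(n_iter):
        g_w = X.T @ (X @ (u * v) - y)              # gradient w.r.t. w = u*v
        u, v = u - lr * (g_w * v + lam * u), v - lr * (g_w * u + lam * v)
    return u * v
```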
arXiv Detail & Related papers (2023-07-07T13:06:12Z)
- Distributed Estimation and Inference for Semi-parametric Binary Response Models [8.309294338998539]
This paper studies the maximum score estimator of a semi-parametric binary choice model under a distributed computing environment.
An intuitive divide-and-conquer estimator is computationally expensive and restricted by a non-regular constraint on the number of machines.
arXiv Detail & Related papers (2022-10-15T23:06:46Z)
- Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives [1.7640556247739623]
We analyze algorithms for zeroth-order optimization of nonconvex composite objectives, focusing on the dependence on dimensionality.
This is achieved by exploiting the low-dimensional structure of the decision set using the mirror descent method with an entropy-like distance-generating function.
To improve the gradient estimation, we replace the classic Gaussian smoothing with a sampling method based on the Rademacher distribution and show that the mini-batch method copes with non-Euclidean geometry.
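A minimal sketch of a mini-batch two-point zeroth-order gradient estimator with Rademacher perturbation directions, in the spirit of the summary above; the smoothing parameter, batch size, and scaling are illustrative assumptions rather than the paper's exact estimator.

```python
import numpy as np

def zo_grad_rademacher(f, x, mu=1e-4, batch=16, rng=None):
    """Mini-batch two-point zeroth-order gradient estimate built from
    Rademacher (+/-1) perturbation directions instead of Gaussian ones."""
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros(x.size)
    for _ in range(batch):
        z = rng.choice([-1.0, 1.0], size=x.size)              # Rademacher direction
        g += (f(x + mu * z) - f(x - mu * z)) / (2.0 * mu) * z
    return g / batch
```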
arXiv Detail & Related papers (2022-08-09T07:36:25Z)
- Gaining Outlier Resistance with Progressive Quantiles: Fast Algorithms and Theoretical Studies [1.6457778420360534]
A framework of outlier-resistant estimation is introduced to robustify arbitrarily given loss functions.
A new technique is proposed to alleviate the requirement on the starting point, such that on regular datasets the number of data reestimations can be substantially reduced.
The obtained estimators, though not necessarily globally or even locally optimal, enjoy minimax rate optimality in both low and high dimensions.
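As a loose illustration only, not the paper's algorithm: the sketch below refits least squares on the observations with the smallest residuals while the retained quantile is tightened in stages, which conveys how a progressively adjusted quantile can reduce sensitivity to the starting point. Function names, the schedule, and the least-squares base fit are all assumptions.

```python
import numpy as np

def progressive_trimmed_ls(X, y, final_keep=0.8, n_stages=5, n_iter=20):
    """Illustrative trimmed least squares: at each stage keep the fraction
    of observations with the smallest residuals and refit, tightening the
    retained quantile from 1.0 down to final_keep (a simplified stand-in,
    not the paper's exact progressive-quantile scheme)."""
    n, p = X.shape
    b = np.linalg.lstsq(X, y, rcond=None)[0]                  # plain LS start
    for keep in np.linspace(1.0, final_keep, n_stages):
        m = max(int(np.ceil(keep * n)), p)                    # observations kept
        for _ in range(n_iter):
            idx = np.argsort(np.abs(y - X @ b))[:m]           # smallest residuals
            b = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    return b
```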
arXiv Detail & Related papers (2021-12-15T20:35:21Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment [52.02794488304448]
We propose a new distribution alignment method based on a log-likelihood ratio statistic and normalizing flows.
We experimentally verify that minimizing the resulting objective results in domain alignment that preserves the local structure of input domains.
arXiv Detail & Related papers (2020-03-26T22:10:04Z)
- Support recovery and sup-norm convergence rates for sparse pivotal estimation [79.13844065776928]
In high dimensional sparse regression, pivotal estimators are estimators for which the optimal regularization parameter is independent of the noise level.
We show minimax sup-norm convergence rates for non-smoothed and smoothed, single-task and multitask square-root Lasso-type estimators.
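For reference, a minimal sketch of the square-root Lasso objective behind such pivotal estimators: because the fit term is not squared, a theoretically valid tuning parameter of order sqrt(log p / n) does not depend on the unknown noise level. Scaling conventions vary across papers, so this particular form is an assumption.

```python
import numpy as np

def sqrt_lasso_objective(b, X, y, lam):
    """Square-root Lasso: ||y - X b||_2 / sqrt(n) + lam * ||b||_1."""
    n = X.shape[0]
    return np.linalg.norm(y - X @ b) / np.sqrt(n) + lam * np.abs(b).sum()
```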
arXiv Detail & Related papers (2020-01-15T16:11:04Z)