Pathwise optimization for bridge-type estimators and its applications
- URL: http://arxiv.org/abs/2412.04047v1
- Date: Thu, 05 Dec 2024 10:38:29 GMT
- Title: Pathwise optimization for bridge-type estimators and its applications
- Authors: Alessandro De Gregorio, Francesco Iafrate,
- Abstract summary: Pathwise methods allow to efficiently compute the full path for penalized estimators.
We apply these algorithms to the penalized estimation of processes observed at discrete times.
- Score: 49.1574468325115
- License:
- Abstract: Sparse parametric models are of great interest in statistical learning and are often analyzed by means of regularized estimators. Pathwise methods allow to efficiently compute the full solution path for penalized estimators, for any possible value of the penalization parameter $\lambda$. In this paper we deal with the pathwise optimization for bridge-type problems; i.e. we are interested in the minimization of a loss function, such as negative log-likelihood or residual sum of squares, plus the sum of $\ell^q$ norms with $q\in(0,1]$ involving adpative coefficients. For some loss functions this regularization achieves asymptotically the oracle properties (such as the selection consistency). Nevertheless, since the objective function involves nonconvex and nondifferentiable terms, the minimization problem is computationally challenging. The aim of this paper is to apply some general algorithms, arising from nonconvex optimization theory, to compute efficiently the path solutions for the adaptive bridge estimator with multiple penalties. In particular, we take into account two different approaches: accelerated proximal gradient descent and blockwise alternating optimization. The convergence and the path consistency of these algorithms are discussed. In order to assess our methods, we apply these algorithms to the penalized estimation of diffusion processes observed at discrete times. This latter represents a recent research topic in the field of statistics for time-dependent data.
Related papers
- Provable benefits of annealing for estimating normalizing constants:
Importance Sampling, Noise-Contrastive Estimation, and beyond [24.86929310909572]
We show that using the geometric path brings down the estimation error from an exponential to a function of the distance between the target and proposal.
We propose a two-step estimator to approximate the optimal path in an efficient way.
arXiv Detail & Related papers (2023-10-05T21:16:55Z) - Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming [53.63469275932989]
We consider online statistical inference of constrained nonlinear optimization problems.
We apply the Sequential Quadratic Programming (StoSQP) method to solve these problems.
arXiv Detail & Related papers (2022-05-27T00:34:03Z) - Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning ofproblem coefficients with a optimization prob-lem that uses the predicted coefficients.
We show how to directlyexpress the loss of the optimization problem in terms of thepredicted coefficients as a piece-wise linear function.
We propose a novel divide and algorithm to tackle optimization problems without this restriction and predict itscoefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for bound non-dimensional optimization.
We demonstrate two popular deep learning methods on the empirical advantages over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Robust estimation via generalized quasi-gradients [28.292300073453877]
We show why many recently proposed robust estimation problems are efficiently solvable.
We identify the existence of "generalized quasi-gradients"
We show that generalized quasi-gradients exist and construct efficient algorithms.
arXiv Detail & Related papers (2020-05-28T15:14:33Z) - Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms both achieving the optimal bound on excess loss and using $O(minn, n2/d)$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z) - The estimation error of general first order methods [12.472245917779754]
We consider two families of estimation problems: high-dimensional regression and low-dimensional matrix estimation.
We derive lower bounds the error that hold in the high-dimensional optimals in which both the number of observations and the number of parameters diverge.
These lower bounds sense that there exist algorithms whose estimation error matches the lower bounds up to sparseally negligible terms.
arXiv Detail & Related papers (2020-02-28T18:13:47Z) - Stochastic Optimization for Regularized Wasserstein Estimators [10.194798773447879]
We introduce an algorithm to solve a regularized version of the problem of Wasserstein estimators gradient, with a time per step which is sublinear in the natural dimensions.
We show that this algorithm can be extended to other tasks, including estimation of Wasserstein barycenters.
arXiv Detail & Related papers (2020-02-20T12:04:05Z) - Super-efficiency of automatic differentiation for functions defined as a
minimum [16.02151272607889]
In min-min optimization, one has to compute the gradient of a function defined as a minimum.
We study the error made by these estimators as a function of the optimization error.
arXiv Detail & Related papers (2020-02-10T13:23:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.