Efficient and robust high-dimensional sparse logistic regression via
nonlinear primal-dual hybrid gradient algorithms
- URL: http://arxiv.org/abs/2111.15426v1
- Date: Tue, 30 Nov 2021 14:16:48 GMT
- Title: Efficient and robust high-dimensional sparse logistic regression via
nonlinear primal-dual hybrid gradient algorithms
- Authors: Jérôme Darbon and Gabriel P. Langlois
- Abstract summary: We propose an iterative algorithm that provably computes a solution to a logistic regression problem regularized by an elastic net penalty.
This result improves on the known complexity bound of $O(\min(m^2n,mn^2)\log(1/\epsilon))$ for first-order optimization methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Logistic regression is a widely used statistical model to describe the
relationship between a binary response variable and predictor variables in data
sets. It is often used in machine learning to identify important predictor
variables. This task, variable selection, typically amounts to fitting a
logistic regression model regularized by a convex combination of $\ell_1$ and
$\ell_{2}^{2}$ penalties. Since modern big data sets can contain hundreds of
thousands to billions of predictor variables, variable selection methods depend
on efficient and robust optimization algorithms to perform well.
State-of-the-art algorithms for variable selection, however, were not
traditionally designed to handle big data sets; they either scale poorly in
size or are prone to produce unreliable numerical results. It therefore remains
challenging to perform variable selection on big data sets without access to
adequate and costly computational resources. In this paper, we propose a
nonlinear primal-dual algorithm that addresses these shortcomings.
Specifically, we propose an iterative algorithm that provably computes a
solution to a logistic regression problem regularized by an elastic net penalty
in $O(T(m,n)\log(1/\epsilon))$ operations, where $\epsilon \in (0,1)$ denotes
the tolerance and $T(m,n)$ denotes the number of arithmetic operations required
to perform matrix-vector multiplication on a data set with $m$ samples each
comprising $n$ features. This result improves on the known complexity bound of
$O(\min(m^2n,mn^2)\log(1/\epsilon))$ for first-order optimization methods such
as the classic primal-dual hybrid gradient or forward-backward splitting
methods.
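For concreteness, the optimization problem described in the abstract reads
$\min_{\beta \in \mathbb{R}^n} \frac{1}{m}\sum_{i=1}^{m} \log(1 + e^{-y_i x_i^\top \beta}) + \lambda\left(\alpha \|\beta\|_1 + \frac{1-\alpha}{2}\|\beta\|_2^2\right)$, with labels $y_i \in \{-1,+1\}$.
The sketch below implements the classic forward-backward splitting baseline that the abstract compares against, not the authors' nonlinear primal-dual hybrid gradient algorithm; the function name, step-size rule, regularization values, and synthetic data are illustrative assumptions.

import numpy as np

def elastic_net_logreg_fb(X, y, lam=0.1, alpha=0.9, n_iter=500):
    """Forward-backward splitting (proximal gradient) for elastic-net
    logistic regression; a classic first-order baseline, not the paper's
    nonlinear PDHG method."""
    m, n = X.shape
    beta = np.zeros(n)
    # Step size 1/L, where L <= ||X||_2^2 / (4m) bounds the Lipschitz
    # constant of the logistic-loss gradient.
    tau = 4.0 * m / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        margins = y * (X @ beta)
        sig = 0.5 * (1.0 - np.tanh(0.5 * margins))   # sigmoid(-margins), stable form
        grad = -(X.T @ (y * sig)) / m                # gradient of the smooth loss
        w = beta - tau * grad                        # forward (gradient) step
        # Backward step: prox of tau*lam*(alpha*l1 + (1-alpha)/2 * l2^2),
        # i.e. soft-thresholding followed by ridge shrinkage.
        beta = np.sign(w) * np.maximum(np.abs(w) - tau * lam * alpha, 0.0)
        beta /= 1.0 + tau * lam * (1.0 - alpha)
    return beta

# Tiny synthetic example: 5 relevant features out of 50.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
beta_true = np.zeros(50)
beta_true[:5] = 1.0
y = np.sign(X @ beta_true + 0.1 * rng.standard_normal(200))
print("selected features:", np.flatnonzero(np.abs(elastic_net_logreg_fb(X, y)) > 1e-3))

In practice, lam and alpha would be tuned (e.g. by cross-validation); each iteration above costs a couple of matrix-vector products with the data matrix, i.e. $O(T(m,n))$ operations per iteration in the notation of the abstract.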
Related papers
- Turnstile $\ell_p$ leverage score sampling with applications [56.403488578703865]
We develop a novel algorithm for sampling rows $a_i$ of a matrix $A \in \mathbb{R}^{n \times d}$, proportional to their $\ell_p$ norm, when $A$ is presented in a turnstile data stream.
Our algorithm not only returns the set of sampled row indexes, it also returns slightly perturbed rows $\tilde{a}_i \approx a_i$, and approximates their sampling probabilities up to $\varepsilon$ relative error.
For logistic regression, our framework yields the first algorithm that achieves a $
arXiv Detail & Related papers (2024-06-01T07:33:41Z) - Stochastic Optimization Algorithms for Instrumental Variable Regression with Streaming Data [17.657917523817243]
We develop and analyze algorithms for instrumental variable regression by viewing the problem as a conditional stochastic optimization problem.
In the context of least-squares instrumental variable regression, our algorithms neither require matrix inversions nor mini-batches.
We derive rates of convergence in expectation that are of order $\mathcal{O}(\log T/T)$ and $\mathcal{O}(1/T^{1-\iota})$ for any $\iota>0$.
arXiv Detail & Related papers (2024-05-29T19:21:55Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Conformalization of Sparse Generalized Linear Models [2.1485350418225244]
The conformal prediction method estimates a confidence set for $y_{n+1}$ that is valid for any finite sample size.
Although attractive, computing such a set is computationally infeasible in most regression problems.
We show how our path-following algorithm accurately approximates conformal prediction sets.
arXiv Detail & Related papers (2023-07-11T08:36:12Z) - Easy Differentially Private Linear Regression [16.325734286930764]
We study an algorithm which uses the exponential mechanism to select a model with high Tukey depth from a collection of non-private regression models.
We find that this algorithm obtains strong empirical performance in the data-rich setting.
arXiv Detail & Related papers (2022-08-15T17:42:27Z) - Mixability made efficient: Fast online multiclass logistic regression [68.8204255655161]
We show that mixability can be a powerful tool to obtain algorithms with optimal regret.
However, the resulting methods often suffer from high computational complexity, which has reduced their practical applicability.
arXiv Detail & Related papers (2021-10-08T08:22:05Z) - Linear Regression by Quantum Amplitude Estimation and its Extension to
Convex Optimization [0.0]
We propose a new quantum algorithm for linear regression, which has the complexity of $O(\epsilon^{-1})$ and keeps the logarithmic dependence on the number of data points $N_D$.
In this method, we overcome bottleneck parts in the calculation, which take the form of the sum over data points and therefore have the complexity proportional to $N_D$.
arXiv Detail & Related papers (2021-05-28T00:04:11Z) - List-Decodable Mean Estimation in Nearly-PCA Time [50.79691056481693]
We study the fundamental task of list-decodable mean estimation in high dimensions.
Our algorithm runs in time $\widetilde{O}(ndk)$ for all $k = O(\sqrt{d}) \cup \Omega(d)$, where $n$ is the size of the dataset.
A variant of our algorithm has runtime $\widetilde{O}(ndk)$ for all $k$, at the expense of an $O(\sqrt{\log k})$ factor in the recovery guarantee.
arXiv Detail & Related papers (2020-11-19T17:21:37Z) - Conditional Uncorrelation and Efficient Non-approximate Subset Selection
in Sparse Regression [72.84177488527398]
We consider sparse regression from the view of correlation, and propose the formula of conditional uncorrelation.
By the proposed method, the computational complexity is reduced from $O(\frac{1}{6}k^3+mk^2+mkd)$ to $O(\frac{1}{6}k^3+\frac{1}{2}mk^2)$ for each candidate subset in sparse regression.
arXiv Detail & Related papers (2020-09-08T20:32:26Z) - FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
arXiv Detail & Related papers (2020-06-15T21:55:34Z)
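As background for the FANOK entry above, the sketch below shows the standard dense construction of second-order Gaussian model-X knockoffs: knockoff copies are drawn from the conditional Gaussian law implied by a joint covariance with blocks $\Sigma$ on the diagonal and $\Sigma - D$ off the diagonal, where $D = \operatorname{diag}(s)$. This cubic-cost baseline only illustrates the object FANOK computes at scale; it is not the paper's linear-time algorithm, and the equicorrelated choice of $s$ and all names below are assumptions.

import numpy as np

def gaussian_knockoffs(X, mu, Sigma, seed=0):
    """Dense O(p^3) sampler for second-order Gaussian model-X knockoffs of
    rows of X ~ N(mu, Sigma); a baseline, not FANOK's linear-time method."""
    rng = np.random.default_rng(seed)
    p = Sigma.shape[0]
    # Conservative equicorrelated choice: one value s for every feature,
    # small enough that the joint covariance stays positive definite.
    s_val = 0.999 * min(2.0 * np.linalg.eigvalsh(Sigma)[0], np.min(np.diag(Sigma)))
    D = s_val * np.eye(p)
    Sigma_inv_D = np.linalg.solve(Sigma, D)      # Sigma^{-1} D
    # Conditional law of the knockoffs given X:
    #   mean = X - (X - mu) Sigma^{-1} D,   cov = 2 D - D Sigma^{-1} D.
    cond_mean = X - (X - mu) @ Sigma_inv_D
    cond_cov = 2.0 * D - D @ Sigma_inv_D
    cond_cov = 0.5 * (cond_cov + cond_cov.T)     # symmetrize for stability
    L = np.linalg.cholesky(cond_cov)
    return cond_mean + rng.standard_normal(X.shape) @ L.T

# Example on an AR(1)-correlated Gaussian design (assumed setup).
p = 20
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = np.random.default_rng(1).multivariate_normal(np.zeros(p), Sigma, size=100)
X_tilde = gaussian_knockoffs(X, np.zeros(p), Sigma)

Knockoff-based selection would then fit a sparse model on the augmented matrix $[X, \tilde{X}]$ and compare each feature's importance with that of its knockoff to compute a threshold that controls the false discovery rate.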