An efficient projection neural network for $\ell_1$-regularized logistic
regression
- URL: http://arxiv.org/abs/2105.05449v1
- Date: Wed, 12 May 2021 06:13:44 GMT
- Title: An efficient projection neural network for $\ell_1$-regularized logistic
regression
- Authors: Majid Mohammadi, Amir Ahooye Atashin, Damian A. Tamburri
- Abstract summary: This paper presents a simple projection neural network for $\ell_1$-regularized logistic regression.
The proposed neural network requires neither extra auxiliary variables nor any smooth approximation.
We also investigate the convergence of the proposed neural network using Lyapunov theory and show that it converges to a solution of the problem from any arbitrary initial value.
- Score: 10.517079029721257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: $\ell_1$ regularization has been used in logistic regression to circumvent overfitting and to use the estimated sparse coefficients for feature selection. The challenge of such regularization is that the $\ell_1$ norm is not differentiable, so standard algorithms for convex optimization are not directly applicable to this problem. This paper presents a simple projection neural network for $\ell_1$-regularized logistic regression. In contrast to many solvers available in the literature, the proposed neural network requires neither extra auxiliary variables nor any smooth approximation, and, thanks to the projection operator, its complexity is almost identical to that of gradient descent for logistic regression without $\ell_1$ regularization. We also investigate the convergence of the proposed neural network using Lyapunov theory and show that it converges to a solution of the problem from any arbitrary initial value. The proposed neural solution significantly outperforms state-of-the-art methods with respect to execution time and is competitive in terms of accuracy and AUROC.
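To make the role of the projection operator in the abstract above concrete, the following is a minimal sketch (not the authors' exact network) of a projection/proximal-style iteration for $\ell_1$-regularized logistic regression: each step takes a gradient step on the smooth logistic loss and then applies soft-thresholding, which can equivalently be written as $x - P_{[-\eta\lambda,\,\eta\lambda]}(x)$ with $P$ the projection onto a box. The synthetic data, step size, and stopping rule below are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(x, tau):
    # Prox of tau * ||.||_1; equivalently x - P(x), where P projects
    # elementwise onto the box [-tau, tau].
    return x - np.clip(x, -tau, tau)

def l1_logistic_regression(X, y, lam=0.1, eta=0.01, iters=5000, tol=1e-8):
    """Illustrative projection/proximal iteration for
    min_w (1/n) * sum_i log(1 + exp(-y_i * x_i^T w)) + lam * ||w||_1,
    with labels y in {-1, +1}. Not the paper's exact dynamics."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        # Gradient of the smooth logistic loss.
        grad = -(X.T @ (y * sigmoid(-y * (X @ w)))) / n
        # Gradient step followed by the l1 projection/prox step.
        w_new = soft_threshold(w - eta * grad, eta * lam)
        if np.linalg.norm(w_new - w) < tol:  # fixed point reached
            break
        w = w_new
    return w_new

# Toy usage with synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(200))
w_hat = l1_logistic_regression(X, y, lam=0.05)
print("nonzero coefficients:", np.flatnonzero(np.abs(w_hat) > 1e-6))
```

A fixed point of this update is exactly a minimizer of the regularized objective, which is the same fixed-point characterization that projection-type neural networks discretize.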
Related papers
- Convergence Rate Analysis of LION [54.28350823319057]
LION converges over $K$ iterations at a rate of $\mathcal{O}(\sqrt{d}\,K^{-1/4})$ measured by a gradient Karush-Kuhn-Tucker (KKT) stationarity criterion.
We show that LION can achieve lower loss and higher performance compared to standard SGD.
arXiv Detail & Related papers (2024-11-12T11:30:53Z) - Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Matching the Statistical Query Lower Bound for k-sparse Parity Problems with Stochastic Gradient Descent [83.85536329832722]
We show that stochastic gradient descent (SGD) can efficiently solve the $k$-sparse parity problem on a $d$-dimensional hypercube.
We then demonstrate how a neural network trained with SGD solves the $k$-sparse parity problem with small statistical error.
arXiv Detail & Related papers (2024-04-18T17:57:53Z) - Decoupled Weight Decay for Any $p$ Norm [1.1510009152620668]
We consider a simple yet effective approach to sparsification, based on Bridge, i.e. $L_p$, regularization during training.
We introduce a novel weight decay scheme, which generalizes the standard $L_2$ weight decay to any $p$ norm.
We empirically demonstrate that it leads to highly sparse networks, while maintaining performance comparable to standard $L_2$ regularization.
arXiv Detail & Related papers (2024-04-16T18:02:15Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image
Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Provable Identifiability of Two-Layer ReLU Neural Networks via LASSO
Regularization [15.517787031620864]
The territory of LASSO is extended to two-layer ReLU neural networks, a fashionable and powerful nonlinear regression model.
We show that the LASSO estimator can stably reconstruct the neural network and identify $\mathcal{S}^{\star}$ when the number of samples scales logarithmically.
Our theory lies in an extended Restricted Isometry Property (RIP)-based analysis framework for two-layer ReLU neural networks.
arXiv Detail & Related papers (2023-05-07T13:05:09Z) - spred: Solving $L_1$ Penalty with SGD [6.2255027793924285]
We propose to minimize a differentiable objective with an $L_1$ penalty using a simple reparametrization.
We prove that the reparametrization trick is completely "benign" with an exact, differentiable, nonconvex function.
arXiv Detail & Related papers (2022-10-03T20:07:51Z) - Bounding the Width of Neural Networks via Coupled Initialization -- A
Worst Case Analysis [121.9821494461427]
We show how to significantly reduce the number of neurons required for two-layer ReLU networks.
We also prove new lower bounds that improve upon prior work, and that under certain assumptions, are best possible.
arXiv Detail & Related papers (2022-06-26T06:51:31Z) - Generalized Quantile Loss for Deep Neural Networks [0.8594140167290096]
This note presents a simple way to add a count (or quantile) constraint to a regression neural network: given $n$ samples in the training set, it guarantees that the predictions of $m < n$ samples will be larger than the actual value (the label).
Unlike standard quantile regression networks, the presented method can be applied to any loss function and not necessarily to the standard quantile regression loss, which minimizes the mean absolute differences.
arXiv Detail & Related papers (2020-12-28T16:37:02Z) - Projection Neural Network for a Class of Sparse Regression Problems with
Cardinality Penalty [9.698438188398434]
We consider a class of sparse regression problems, whose objective function is the summation of a convex loss function and a cardinality penalty.
By constructing a smoothing function for the cardinality function, we propose a projected neural network and design a correction method for solving this problem.
The solution of the proposed neural network is unique, globally existent, bounded, and globally Lipschitz continuous.
arXiv Detail & Related papers (2020-04-02T08:05:20Z)
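As a generic illustration of the smoothing idea mentioned in the entry above, and only a hedged sketch rather than the construction used in that paper, the cardinality function $\|w\|_0 = \sum_j \mathbb{1}[w_j \neq 0]$ is often replaced by a smooth surrogate such as $\sum_j \bigl(1 - \exp(-w_j^2/\sigma^2)\bigr)$, which approaches the exact count as $\sigma \to 0$ and can be differentiated inside a projected-gradient loop.

```python
import numpy as np

def smoothed_cardinality(w, sigma=0.1):
    # Smooth surrogate of ||w||_0: each indicator 1[w_j != 0] is replaced
    # by 1 - exp(-w_j**2 / sigma**2); the sum tends to the exact count as
    # sigma -> 0. Generic illustration, not the cited paper's function.
    return float(np.sum(1.0 - np.exp(-(w ** 2) / sigma ** 2)))

def smoothed_cardinality_grad(w, sigma=0.1):
    # Gradient of the surrogate, usable as the penalty term inside a
    # projected-gradient / projection-neural-network update.
    return (2.0 * w / sigma ** 2) * np.exp(-(w ** 2) / sigma ** 2)

# Example: entries well above sigma each contribute ~1, near-zero entries barely count.
w = np.array([0.5, 0.0, -1.2, 0.0, 0.03])
print(smoothed_cardinality(w, sigma=0.1))  # ~2.09
```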
This list is automatically generated from the titles and abstracts of the papers listed on this site.