Learning Fast Approximations of Sparse Nonlinear Regression
- URL: http://arxiv.org/abs/2010.13490v1
- Date: Mon, 26 Oct 2020 11:31:08 GMT
- Title: Learning Fast Approximations of Sparse Nonlinear Regression
- Authors: Yuhai Song, Zhong Cao, Kailun Wu, Ziang Yan, Changshui Zhang
- Abstract summary: In this work, we bridge the gap by introducing the Nonlinear Learned Iterative Shrinkage Thresholding Algorithm (NLISTA).
Experiments on synthetic data corroborate our theoretical results and show our method outperforms state-of-the-art methods.
- Score: 50.00693981886832
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The idea of unfolding iterative algorithms as deep neural networks has been
widely applied in solving sparse coding problems, providing both solid
theoretical analysis in convergence rate and superior empirical performance.
However, for sparse nonlinear regression problems, a similar idea is rarely
exploited due to the complexity of nonlinearity. In this work, we bridge this
gap by introducing the Nonlinear Learned Iterative Shrinkage Thresholding
Algorithm (NLISTA), which can attain a linear convergence under suitable
conditions. Experiments on synthetic data corroborate our theoretical results
and show our method outperforms state-of-the-art methods.
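The unfolding idea the abstract builds on can be illustrated with a minimal linear sketch: classical ISTA iterations are "unrolled" into layers whose matrices and thresholds become learnable parameters (LISTA). The sketch below uses the standard ISTA initialization of those parameters and a linear measurement model; the nonlinear extension and trained weights of NLISTA are not reproduced here, and all variable names are illustrative.

```python
import numpy as np

def soft_threshold(x, theta):
    """Elementwise soft-thresholding: the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def lista_forward(y, W_e, W_t, thetas):
    """Run len(thetas) unrolled shrinkage-thresholding steps.

    y      : observation vector
    W_e    : input transform (in trained LISTA, a learned matrix; here A^T / L)
    W_t    : recurrent transform (here I - A^T A / L)
    thetas : per-layer thresholds (learned in LISTA; fixed in this sketch)
    """
    x = soft_threshold(W_e @ y, thetas[0])
    for theta in thetas[1:]:
        x = soft_threshold(W_e @ y + W_t @ x, theta)
    return x

# Toy example: recover a 3-sparse code from y = A x_true (linear case).
rng = np.random.default_rng(0)
n, m = 20, 10
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[2, 7, 11]] = [1.5, -2.0, 0.8]
y = A @ x_true

# ISTA-style initialization stands in for learned weights.
L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the data-fit gradient
W_e = A.T / L
W_t = np.eye(n) - (A.T @ A) / L
thetas = [0.05 / L] * 100              # fixed thresholds instead of learned ones

x_hat = lista_forward(y, W_e, W_t, thetas)
```

With these untrained ISTA weights the network is just 100 ISTA iterations; training the per-layer matrices and thresholds is what yields the fast approximations and, under suitable conditions, the linear convergence the abstract refers to.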
Related papers
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Non-Parametric Learning of Stochastic Differential Equations with Non-asymptotic Fast Rates of Convergence [65.63201894457404]
We propose a novel non-parametric learning paradigm for the identification of drift and diffusion coefficients of non-linear differential equations.
The key idea essentially consists of fitting a RKHS-based approximation of the corresponding Fokker-Planck equation to such observations.
arXiv Detail & Related papers (2023-05-24T20:43:47Z) - Can Decentralized Stochastic Minimax Optimization Algorithms Converge
Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its applications in a wide range of machine learning problems.
This paper develops two novel decentralized minimax optimization algorithms for the nonconvex-nonconcave problem.
arXiv Detail & Related papers (2023-04-24T02:19:39Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable components separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - On the generalization of learning algorithms that do not converge [54.122745736433856]
Generalization analyses of deep learning typically assume that the training converges to a fixed point.
Recent results indicate that in practice, the weights of deep neural networks optimized with gradient descent often oscillate indefinitely.
arXiv Detail & Related papers (2022-08-16T21:22:34Z) - A deep branching solver for fully nonlinear partial differential
equations [0.1474723404975345]
We present a multidimensional deep learning implementation of a branching algorithm for the numerical solution of fully nonlinear PDEs.
This approach is designed to tackle functional nonlinearities involving gradient terms of any orders.
arXiv Detail & Related papers (2022-03-07T09:46:46Z) - Lower Bounds on the Generalization Error of Nonlinear Learning Models [2.1030878979833467]
We study in this paper lower bounds for the generalization error of models derived from multi-layer neural networks, in the regime where the size of the layers is commensurate with the number of samples in the training data.
We show that unbiased estimators have unacceptable performance for such nonlinear networks in this regime.
We derive explicit generalization lower bounds for general biased estimators, in the cases of linear regression and of two-layered networks.
arXiv Detail & Related papers (2021-03-26T20:37:54Z) - Progressive Batching for Efficient Non-linear Least Squares [31.082253632197023]
Most improvements of the basic Gauss-Newton algorithm tackle convergence guarantees or leverage the sparsity of the underlying problem structure for computational speedup.
Our work borrows ideas from both machine learning and statistics, and we present an approach for non-linear least squares that guarantees convergence while at the same time significantly reducing the required amount of computation.
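The progressive-batching idea in this entry can be sketched generically: run damped Gauss-Newton steps while growing the subset of residual terms used per iteration, so early iterations are cheap and later ones use the full problem. This is only an illustrative sketch under a deterministic prefix-batch assumption, not the authors' algorithm; all function names and the schedule are invented for the example.

```python
import numpy as np

def gauss_newton_progressive(residual, jacobian, x0, data, schedule):
    """Gauss-Newton with a growing mini-batch of residual terms.

    residual(x, batch) -> r  (vector of per-datum residuals)
    jacobian(x, batch) -> J  (stacked per-datum Jacobian rows)
    schedule: list of batch sizes, one per iteration (assumed non-decreasing).
    """
    x = np.asarray(x0, dtype=float)
    n = len(data)
    for b in schedule:
        batch = data[:min(b, n)]        # deterministic prefix batch, for the sketch
        r = residual(x, batch)
        J = jacobian(x, batch)
        # Damped normal-equations step (small Levenberg-style damping for stability).
        step = np.linalg.solve(J.T @ J + 1e-8 * np.eye(x.size), J.T @ r)
        x = x - step
    return x

# Toy problem: fit y = exp(a * t) by nonlinear least squares on (t, y) pairs.
t = np.linspace(0.0, 1.0, 64)
y = np.exp(0.7 * t)
data = np.stack([t, y], axis=1)

res = lambda x, batch: np.exp(x[0] * batch[:, 0]) - batch[:, 1]
jac = lambda x, batch: (batch[:, 0] * np.exp(x[0] * batch[:, 0]))[:, None]

a_hat = gauss_newton_progressive(res, jac, np.array([0.0]),
                                 data, [8, 16, 32, 64, 64, 64])
```

Because the toy model fits the data exactly, every batch shares the same minimizer, so the growing-batch schedule only trades per-iteration cost against gradient quality; the paper's contribution is making that trade-off rigorous for the noisy case.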
arXiv Detail & Related papers (2020-10-21T13:00:04Z) - The role of optimization geometry in single neuron learning [12.891722496444036]
Recent experiments have demonstrated that the choice of optimization geometry can impact generalization performance when learning expressive neural networks.
We show how the interplay between optimization geometry and feature geometry determines out-of-sample performance.
arXiv Detail & Related papers (2020-06-15T17:39:44Z) - A Novel Learnable Gradient Descent Type Algorithm for Non-convex
Non-smooth Inverse Problems [3.888272676868008]
We propose a novel learnable gradient descent type algorithm to solve inverse problems, combining a general network architecture with learned components.
Results show that the proposed network outperforms state-of-the-art reconstruction methods on different image reconstruction problems in terms of efficiency and accuracy.
arXiv Detail & Related papers (2020-03-15T03:44:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.