Sparsest Univariate Learning Models Under Lipschitz Constraint
- URL: http://arxiv.org/abs/2112.13542v1
- Date: Mon, 27 Dec 2021 07:03:43 GMT
- Title: Sparsest Univariate Learning Models Under Lipschitz Constraint
- Authors: Shayan Aziznejad, Thomas Debarre, Michael Unser
- Abstract summary: We propose continuous-domain formulations for one-dimensional regression problems.
We control the Lipschitz constant explicitly using a user-defined upper bound.
We show that both problems admit global minimizers that are continuous and piecewise-linear.
- Score: 31.28451181040038
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Besides the minimization of the prediction error, two of the most desirable
properties of a regression scheme are stability and interpretability. Driven by
these principles, we propose continuous-domain formulations for one-dimensional
regression problems. In our first approach, we use the Lipschitz constant as a
regularizer, which results in an implicit tuning of the overall robustness of
the learned mapping. In our second approach, we control the Lipschitz constant
explicitly using a user-defined upper bound and make use of a
sparsity-promoting regularizer to favor simpler (and, hence, more
interpretable) solutions. The theoretical study of the latter formulation is
motivated in part by its equivalence, which we prove, with the training of a
Lipschitz-constrained two-layer univariate neural network with rectified linear
unit (ReLU) activations and weight decay. By proving representer theorems, we
show that both problems admit global minimizers that are continuous and
piecewise-linear (CPWL) functions. Moreover, we propose efficient algorithms
that find the sparsest solution of each problem: the CPWL mapping with the
least number of linear regions. Finally, we illustrate numerically the outcome
of our formulations.
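To make the neural-network side of the proved equivalence concrete, the following is a minimal sketch (not the authors' continuous-domain algorithm): it fits a two-layer univariate ReLU network with weight decay on the weights only, which yields a CPWL mapping, and it reports the simple Lipschitz upper bound sum_k |v_k||w_k| rather than enforcing the explicit user-defined cap of the second formulation.

```python
# Minimal sketch (not the authors' algorithm): two-layer univariate ReLU network
# with weight decay, the discrete counterpart of the paper's second formulation.
import torch

torch.manual_seed(0)
x = torch.linspace(-1, 1, 50).unsqueeze(1)            # 1D inputs
y = torch.sign(x) + 0.05 * torch.randn_like(x)        # noisy piecewise target

K, lam = 40, 1e-3                                     # hidden width, weight-decay strength
inner = torch.nn.Linear(1, K)
outer = torch.nn.Linear(K, 1)
opt = torch.optim.Adam(list(inner.parameters()) + list(outer.parameters()), lr=1e-2)

for _ in range(2000):
    opt.zero_grad()
    pred = outer(torch.relu(inner(x)))                # a CPWL mapping of x
    reg = inner.weight.pow(2).sum() + outer.weight.pow(2).sum()   # decay on weights only
    loss = (pred - y).pow(2).mean() + lam * reg
    loss.backward()
    opt.step()

# Crude Lipschitz certificate for the learned CPWL map: sum_k |v_k| * |w_k|.
lip_bound = (outer.weight.abs() @ inner.weight.abs()).item()
print(f"training loss {loss.item():.4f}, Lipschitz upper bound ~ {lip_bound:.2f}")
```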
Related papers
- GLinSAT: The General Linear Satisfiability Neural Network Layer By Accelerated Gradient Descent [12.409030267572243]
We first reformulate the neural network output projection problem as an entropy-regularized linear programming problem.
Based on an accelerated gradient descent algorithm with numerical performance enhancement, we present our architecture, GLinSAT, to solve the problem.
This is the first general linear satisfiability layer in which all the operations are differentiable and matrix-factorization-free.
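As a toy illustration of the entropy-regularized LP idea (not the GLinSAT layer itself), consider an LP over the probability simplex: adding a small negative-entropy term gives a closed-form softmax solution that sharpens toward the LP vertex as the regularization vanishes.

```python
# Illustration only (not the GLinSAT layer): entropy-regularized LP on the simplex,
#   min_x  c.x + eps * sum_i x_i*log(x_i)   s.t.  sum_i x_i = 1, x >= 0,
# whose unique solution is softmax(-c/eps); eps -> 0 recovers the LP vertex.
import numpy as np

def entropic_lp(c, eps):
    """Closed-form minimizer of the entropy-regularized LP over the simplex."""
    z = -c / eps
    z = z - z.max()                      # shift for numerical stability
    x = np.exp(z)
    return x / x.sum()

c = np.array([0.3, 0.1, 0.7])
for eps in (1.0, 0.1, 0.01):
    print(eps, entropic_lp(c, eps))      # mass concentrates on argmin(c) as eps -> 0
```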
arXiv Detail & Related papers (2024-09-26T03:12:53Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
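A hedged toy illustration of the mechanism (not the paper's analysis): on the bilinear game f(x, y) = xy, plain simultaneous gradient descent-ascent spirals outward, whereas linearly interpolating back toward the previous iterate after a few fast steps contracts toward the equilibrium.

```python
# Toy illustration (not the paper's setting): linear interpolation ("lookahead")
# stabilizes simultaneous gradient descent-ascent on the bilinear game f(x, y) = x*y.
import numpy as np

def gda_steps(x, y, eta, k):
    """k simultaneous gradient descent-ascent steps on f(x, y) = x * y."""
    for _ in range(k):
        x, y = x - eta * y, y + eta * x
    return x, y

eta, k, alpha = 0.1, 5, 0.5
x, y = 1.0, 1.0                          # interpolated iterate
xs, ys = 1.0, 1.0                        # plain GDA iterate, for comparison
for _ in range(200):
    xf, yf = gda_steps(x, y, eta, k)     # k fast inner steps
    x, y = (1 - alpha) * x + alpha * xf, (1 - alpha) * y + alpha * yf
    xs, ys = gda_steps(xs, ys, eta, 1)

print("interpolated distance to equilibrium:", np.hypot(x, y))   # shrinks toward 0
print("plain GDA distance to equilibrium:  ", np.hypot(xs, ys))  # grows
```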
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - The Inductive Bias of Flatness Regularization for Deep Matrix
Factorization [58.851514333119255]
This work takes the first step toward understanding the inductive bias of the minimum trace of the Hessian solutions in deep linear networks.
We show that for all depths greater than one, with the standard Restricted Isometry Property (RIP) on the measurements, minimizing the trace of the Hessian is approximately equivalent to minimizing the Schatten 1-norm of the corresponding end-to-end matrix parameters.
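In symbols, the claimed approximate equivalence (restated here in standard notation, with W_L...W_1 the end-to-end matrix of a depth-L linear network and L the training loss) reads:

```latex
\min_{\theta}\ \operatorname{tr}\!\big(\nabla_{\theta}^{2}\,\mathcal{L}(\theta)\big)
\;\approx\;
\min_{W_1,\dots,W_L}\ \big\| W_L W_{L-1}\cdots W_1 \big\|_{S_1},
\qquad
\|A\|_{S_1} \;=\; \sum_i \sigma_i(A).
```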
arXiv Detail & Related papers (2023-06-22T23:14:57Z) - Lipschitz Continuity Retained Binary Neural Network [52.17734681659175]
We introduce Lipschitz continuity as a rigorous criterion for defining the robustness of binary neural networks (BNNs).
We then propose to retain Lipschitz continuity as a regularization term to improve model robustness.
Our experiments show that this BNN-specific regularization method effectively strengthens the robustness of BNNs.
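For a generic illustration of a Lipschitz regularization term (not the paper's BNN-specific construction), one can penalize the product of per-layer spectral norms, a standard upper bound on a feed-forward network's Lipschitz constant:

```python
# Generic sketch (not the paper's BNN-specific term): penalize the product of
# per-layer spectral norms, estimated by a few power-iteration steps.
import torch

def spectral_norm(weight, iters=10):
    """Power-iteration estimate of the largest singular value of a 2D weight matrix."""
    v = torch.randn(weight.shape[1])
    for _ in range(iters):
        u = torch.nn.functional.normalize(weight @ v, dim=0)
        v = torch.nn.functional.normalize(weight.t() @ u, dim=0)
    return torch.dot(u, weight @ v)

def lipschitz_penalty(model):
    """Product of per-layer spectral norms: an upper bound on the Lipschitz constant."""
    bound = torch.tensor(1.0)
    for m in model.modules():
        if isinstance(m, torch.nn.Linear):
            bound = bound * spectral_norm(m.weight)
    return bound

model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
x, y = torch.randn(32, 8), torch.randn(32, 1)
loss = torch.nn.functional.mse_loss(model(x), y) + 1e-2 * lipschitz_penalty(model)
loss.backward()                                   # the penalty discourages large layer gains
```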
arXiv Detail & Related papers (2022-07-13T22:55:04Z) - Decentralized Feature-Distributed Optimization for Generalized Linear
Models [19.800898945436384]
We consider the "all-for-one" decentralized learning problem for generalized linear models.
The features of each sample are partitioned among several collaborating agents in a connected network, but only one agent observes the response variables.
We apply the Chambolle--Pock primal--dual algorithm to an equivalent saddle-point formulation of the problem.
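For reference, the generic Chambolle-Pock iteration for a saddle-point problem of the form min_x max_y <Kx, y> + g(x) - f*(y) is given below; the paper's decentralized instantiation is not reproduced here.

```latex
\begin{aligned}
y^{n+1} &= \operatorname{prox}_{\sigma f^{*}}\!\big(y^{n} + \sigma K \bar{x}^{n}\big),\\
x^{n+1} &= \operatorname{prox}_{\tau g}\!\big(x^{n} - \tau K^{*} y^{n+1}\big),\\
\bar{x}^{n+1} &= x^{n+1} + \theta\,\big(x^{n+1} - x^{n}\big),
\qquad \theta \in [0,1],\quad \sigma\tau\,\|K\|^{2} < 1.
\end{aligned}
```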
arXiv Detail & Related papers (2021-10-28T16:42:47Z) - Reinforcement Learning in Linear MDPs: Constant Regret and
Representation Selection [136.4014229319618]
We study the role of the representation of state-action value functions in regret minimization in finite-horizon Markov Decision Processes (MDPs) with linear structure.
We first derive a necessary condition on the representation, called universally spanning optimal features (UNISOFT), to achieve constant regret in any MDP with linear reward function.
arXiv Detail & Related papers (2021-10-27T22:07:08Z) - Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and a significant reduction in memory consumption.
However, they can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
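A minimal sketch of the underlying fixed-point view (using the simpler Euclidean contraction condition rather than the paper's non-Euclidean ones): an implicit layer returns the equilibrium z* = phi(W z* + U x + b), and a 1-Lipschitz phi with spectral norm ||W|| < 1 makes the iteration a contraction, so the equilibrium exists, is unique, and the forward pass converges.

```python
# Sketch of an implicit (equilibrium) layer evaluated by fixed-point iteration.
# Euclidean contraction condition ||W||_2 < 1 shown here; the paper uses non-Euclidean ones.
import numpy as np

rng = np.random.default_rng(0)
n, d = 16, 8
W = rng.standard_normal((n, n))
W *= 0.9 / np.linalg.norm(W, 2)          # enforce spectral norm ||W||_2 < 1 (well-posed)
U = rng.standard_normal((n, d))
b = rng.standard_normal(n)
x = rng.standard_normal(d)

z, err = np.zeros(n), np.inf
for k in range(500):                     # fixed-point iteration z <- phi(W z + U x + b)
    z_new = np.tanh(W @ z + U @ x + b)
    err = np.linalg.norm(z_new - z)
    z = z_new
    if err < 1e-10:
        break
print(f"equilibrium reached after {k + 1} iterations (residual {err:.1e})")
```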
arXiv Detail & Related papers (2021-06-06T18:05:02Z) - Lipschitz Bounded Equilibrium Networks [3.2872586139884623]
This paper introduces new parameterizations of equilibrium neural networks, i.e. networks defined by implicit equations.
The new parameterization admits a Lipschitz bound during training via unconstrained optimization.
In image classification experiments we show that the Lipschitz bounds are very accurate and improve robustness to adversarial attacks.
arXiv Detail & Related papers (2020-10-05T01:00:40Z) - Lipschitz Bounds and Provably Robust Training by Laplacian Smoothing [7.4769019455423855]
We formulate the adversarially robust learning problem as one of loss minimization with a Lipschitz constraint.
We show that the saddle point of the associated Lagrangian is characterized by a Poisson equation with weighted Laplace operator.
We design a provably robust training scheme using graph-based discretization of the input space and a primal-dual algorithm to converge to the Lagrangian's saddle point.
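In generic terms (a hedged restatement; the Poisson-equation characterization and the graph-based discretization are not reproduced here), the constrained learning problem and its Lagrangian are:

```latex
\min_{f}\ \mathbb{E}_{(x,y)}\big[\ell\big(f(x),\,y\big)\big]
\quad\text{s.t.}\quad \operatorname{Lip}(f)\le L,
\qquad
\mathcal{L}(f,\lambda)=\mathbb{E}\big[\ell(f(x),y)\big]+\lambda\big(\operatorname{Lip}(f)-L\big),
\quad \lambda\ge 0.
```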
arXiv Detail & Related papers (2020-06-05T22:02:21Z) - Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best-fitting ReLU with respect to square loss given draws from some unknown distribution.
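A minimal sketch of the objective only (not the paper's approximation scheme): fit a single ReLU unit to i.i.d. draws by (sub)gradient descent on the empirical square loss.

```python
# Sketch of the ReLU-regression objective (not the paper's approximation scheme):
# minimize (1/m) * sum_i (relu(<w, x_i>) - y_i)^2 over w by subgradient descent.
import numpy as np

rng = np.random.default_rng(1)
m, d = 500, 5
X = rng.standard_normal((m, d))
w_true = rng.standard_normal(d)
y = np.maximum(X @ w_true, 0.0) + 0.1 * rng.standard_normal(m)   # noisy draws

w = rng.standard_normal(d)               # random init (zero init gives a zero subgradient)
for _ in range(500):
    pre = X @ w
    resid = np.maximum(pre, 0.0) - y
    grad = (2.0 / m) * (X.T @ (resid * (pre > 0)))   # subgradient of the square loss
    w -= 0.1 * grad
print("empirical square loss:", np.mean((np.maximum(X @ w, 0.0) - y) ** 2))
```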
arXiv Detail & Related papers (2020-05-26T16:26:17Z) - Deep Neural Networks with Trainable Activations and Controlled Lipschitz
Constant [26.22495169129119]
We introduce a variational framework to learn the activation functions of deep neural networks.
Our aim is to increase the capacity of the network while controlling an upper bound on the Lipschitz constant.
We numerically compare our scheme with standard ReLU networks and their variations, PReLU and LeakyReLU.
arXiv Detail & Related papers (2020-01-17T12:32:55Z)