Robust and Provably Monotonic Networks
- URL: http://arxiv.org/abs/2112.00038v1
- Date: Tue, 30 Nov 2021 19:01:32 GMT
- Title: Robust and Provably Monotonic Networks
- Authors: Ouail Kitouni, Niklas Nolte, Mike Williams
- Abstract summary: We present a new method to constrain the Lipschitz constant of dense deep learning models.
We show how the algorithm was used to train a powerful, robust, and interpretable discriminator for heavy-flavor decays in the LHCb realtime data-processing system.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Lipschitz constant of the map between the input and output space
represented by a neural network is a natural metric for assessing the
robustness of the model. We present a new method to constrain the Lipschitz
constant of dense deep learning models that can also be generalized to other
architectures. The method relies on a simple weight normalization scheme during
training that ensures the Lipschitz constant of every layer is below an upper
limit specified by the analyst. A simple residual connection can then be used
to make the model monotonic in any subset of its inputs, which is useful in
scenarios where domain knowledge dictates such dependence. Examples can be
found in algorithmic fairness requirements or, as presented here, in the
classification of the decays of subatomic particles produced at the CERN Large
Hadron Collider. Our normalization is minimally constraining and allows the
underlying architecture to maintain higher expressiveness compared to other
techniques which aim to either control the Lipschitz constant of the model or
ensure its monotonicity. We show how the algorithm was used to train a
powerful, robust, and interpretable discriminator for heavy-flavor decays in
the LHCb real-time data-processing system.
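As a concrete illustration of the two ingredients described in the abstract (per-layer weight normalization and a monotonic residual connection), here is a minimal PyTorch-style sketch. It is an interpretation under stated assumptions, not the authors' implementation: it splits the overall Lipschitz budget as lam**(1/depth) per layer, bounds each layer with the infinity-operator norm (max absolute row sum), and uses hypothetical class names LipschitzLinear and MonotonicLipschitzNet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LipschitzLinear(nn.Module):
    """Dense layer whose weight is rescaled on the fly so that its
    infinity-operator norm (max absolute row sum) stays within a
    per-layer Lipschitz budget. A sketch; the paper's exact
    normalization scheme may differ in detail."""

    def __init__(self, in_features, out_features, budget=1.0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.budget = budget

    def forward(self, x):
        w = self.linear.weight
        norm = w.abs().sum(dim=1).max()             # ||W||_inf
        scale = torch.clamp(self.budget / norm, max=1.0)
        return F.linear(x, w * scale, self.linear.bias)


class MonotonicLipschitzNet(nn.Module):
    """Lipschitz-bounded MLP plus a residual term lam * sum(x_mono).
    Because the MLP is lam-Lipschitz w.r.t. the infinity norm, each
    partial derivative is bounded by lam in magnitude, so adding the
    residual makes the output non-decreasing in the chosen inputs
    (hypothetical class and argument names)."""

    def __init__(self, n_inputs, monotonic_idx, lam=1.0, hidden=64, depth=3):
        super().__init__()
        per_layer = lam ** (1.0 / depth)            # share of the overall bound
        dims = [n_inputs] + [hidden] * (depth - 1) + [1]
        self.layers = nn.ModuleList(
            LipschitzLinear(a, b, per_layer) for a, b in zip(dims[:-1], dims[1:])
        )
        self.monotonic_idx = list(monotonic_idx)
        self.lam = lam

    def forward(self, x):
        h = x
        for i, layer in enumerate(self.layers):
            h = layer(h)
            if i < len(self.layers) - 1:
                h = torch.relu(h)                   # any 1-Lipschitz activation
        return h + self.lam * x[:, self.monotonic_idx].sum(dim=1, keepdim=True)
```

For instance, a hypothetical call MonotonicLipschitzNet(n_inputs=10, monotonic_idx=[0, 3], lam=2.0) is non-decreasing in inputs 0 and 3 by construction. Note that the normalization only shrinks weights whose norm exceeds the budget, which is in the spirit of the "minimally constraining" property claimed in the abstract.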
Related papers
- Expressive Monotonic Neural Networks [1.0128808054306184]
The monotonic dependence of the outputs of a neural network on some of its inputs is a crucial inductive bias in many scenarios where domain knowledge dictates such behavior.
We propose a weight-constrained architecture with a single residual connection to achieve exact monotonic dependence in any subset of the inputs.
We show how the algorithm is used to train powerful, robust, and interpretable discriminators that achieve competitive performance.
arXiv Detail & Related papers (2023-07-14T17:59:53Z)
- Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration [122.51142131506639]
We introduce a precise, fast, and differentiable upper bound for the spectral norm of convolutional layers using circulant matrix theory (see the dense-matrix sketch of this style of bound after this list).
We show through a comprehensive set of experiments that our approach outperforms other state-of-the-art methods in terms of precision, computational cost, and scalability.
It proves highly effective for the Lipschitz regularization of convolutional neural networks, with competitive results against concurrent approaches.
arXiv Detail & Related papers (2023-05-25T15:32:21Z)
- Efficiently Computing Local Lipschitz Constants of Neural Networks via Bound Propagation [79.13041340708395]
Lipschitz constants are connected to many properties of neural networks, such as robustness, fairness, and generalization.
Existing methods for computing Lipschitz constants either produce relatively loose upper bounds or are limited to small networks.
We develop an efficient framework for computing the $\ell_\infty$ local Lipschitz constant of a neural network by tightly upper bounding the norm of the Clarke Jacobian.
arXiv Detail & Related papers (2022-10-13T22:23:22Z)
- Git Re-Basin: Merging Models modulo Permutation Symmetries [3.5450828190071655]
We show how simple algorithms can be used to fit large networks in practice.
We present the first (to our knowledge) demonstration of zero-barrier mode connectivity between independently trained models.
We also discuss shortcomings in the linear mode connectivity hypothesis.
arXiv Detail & Related papers (2022-09-11T10:44:27Z)
- Lipschitz Continuity Retained Binary Neural Network [52.17734681659175]
We introduce Lipschitz continuity as a rigorous criterion for defining the robustness of binary neural networks (BNNs).
We then propose retaining Lipschitz continuity as a regularization term to improve model robustness.
Our experiments show that this BNN-specific regularization method effectively strengthens the robustness of BNNs.
arXiv Detail & Related papers (2022-07-13T22:55:04Z)
- Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks offer improved accuracy and a significant reduction in memory consumption, but they can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
- On Lipschitz Regularization of Convolutional Layers using Toeplitz Matrix Theory [77.18089185140767]
Lipschitz regularity is established as a key property of modern deep learning, yet computing the exact value of the Lipschitz constant of a neural network is known to be NP-hard.
We introduce a new upper bound for convolutional layers that is both tight and easy to compute.
arXiv Detail & Related papers (2020-06-15T13:23:34Z)
- Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this transition from the kernel regime to the rich regime empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)
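Referring back to the Gram Iteration entry above, the following is a hedged, dense-matrix-only sketch of that style of spectral-norm bound (an illustration, not the cited paper's code, which targets convolutional layers through their circulant structure). Repeatedly forming W^T W squares every singular value, so the Frobenius norm of the iterate, raised to the power 1/2^n, upper-bounds the spectral norm and tightens with each step; the function name spectral_norm_upper_bound is hypothetical.

```python
import torch


def spectral_norm_upper_bound(w: torch.Tensor, n_iter: int = 6) -> torch.Tensor:
    """Differentiable upper bound on the largest singular value of a dense
    matrix via repeated, rescaled Gram squaring. Each step replaces W by
    W^T W (squaring all singular values), so after n steps
        sigma_max(W) <= ||W_n||_F ** (1 / 2**n),
    and the bound tightens monotonically toward the true spectral norm.
    Rescaling by the Frobenius norm keeps the iterates from overflowing;
    the accumulated factor is tracked in log space."""
    g = w
    log_scale = torch.zeros((), dtype=w.dtype, device=w.device)
    for _ in range(n_iter):
        fro = torch.linalg.matrix_norm(g)           # Frobenius norm
        log_scale = 2.0 * (log_scale + torch.log(fro))
        g = g / fro
        g = g.transpose(-2, -1) @ g                 # Gram squaring step
    log_bound = (log_scale + torch.log(torch.linalg.matrix_norm(g))) / 2 ** n_iter
    return torch.exp(log_bound)


# Illustrative usage: compare against the exact spectral norm.
w = torch.randn(64, 128)
print(spectral_norm_upper_bound(w).item(), torch.linalg.matrix_norm(w, ord=2).item())
```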