Constrained Monotonic Neural Networks
- URL: http://arxiv.org/abs/2205.11775v4
- Date: Wed, 31 May 2023 18:53:13 GMT
- Title: Constrained Monotonic Neural Networks
- Authors: Davor Runje, Sharath M. Shankaranarayana
- Abstract summary: Wider adoption of neural networks in many critical domains such as finance and healthcare is being hindered by the need to explain their predictions.
Monotonicity constraint is one of the most requested properties in real-world scenarios.
We show it can approximate any continuous monotone function on a compact subset of $\mathbb{R}^n$.
- Score: 0.685316573653194
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Wider adoption of neural networks in many critical domains such as finance
and healthcare is being hindered by the need to explain their predictions and
to impose additional constraints on them. Monotonicity constraint is one of the
most requested properties in real-world scenarios and is the focus of this
paper. One of the oldest ways to construct a monotonic fully connected neural
network is to constrain signs on its weights. Unfortunately, this construction
does not work with popular non-saturated activation functions as it can only
approximate convex functions. We show this shortcoming can be fixed by
constructing two additional activation functions from a typical unsaturated
monotonic activation function and employing each of them on a part of the
neurons. Our experiments show this approach of building monotonic neural
networks has better accuracy when compared to other state-of-the-art methods,
while being the simplest one in the sense of having the least number of
parameters, and not requiring any modifications to the learning procedure or
post-learning steps. Finally, we prove it can approximate any continuous
monotone function on a compact subset of $\mathbb{R}^n$.
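To make the construction concrete, below is a minimal, hedged sketch (not the paper's exact recipe): weights are kept non-negative via an absolute-value reparameterization, and the layer's units are split between the base convex activation rho(x) = ReLU(x) and its point-reflected concave counterpart -ReLU(-x). The choice of ReLU, the 50/50 split, and all names are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hedged sketch of a sign-constrained monotone dense layer.
class MonotoneDense(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Non-negative weights keep the pre-activation non-decreasing in x.
        w = self.linear.weight.abs()
        z = F.linear(x, w, self.linear.bias)
        # Splitting units between a convex and a concave monotone activation
        # lets stacked layers represent non-convex monotone functions,
        # which a pure ReLU + non-negative-weight stack cannot.
        h = z.shape[-1] // 2
        return torch.cat([F.relu(z[..., :h]), -F.relu(-z[..., h:])], dim=-1)

# Usage: two monotone hidden layers plus a sign-constrained linear head.
hidden = nn.Sequential(MonotoneDense(3, 16), MonotoneDense(16, 16))
head = nn.Linear(16, 1)
x = torch.randn(5, 3)
y = F.linear(hidden(x), head.weight.abs(), head.bias)  # non-decreasing in every input
```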
Related papers
- 1-Lipschitz Neural Networks are more expressive with N-Activations [19.858602457988194]
Small changes to a system's inputs should not result in large changes to its outputs.
We show that commonly used activation functions, such as MaxMin, unnecessarily restrict the class of representable functions.
We introduce the new N-activation function that is provably more expressive than currently popular activation functions.
arXiv Detail & Related papers (2023-11-10T15:12:04Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Neural Estimation of Submodular Functions with Applications to Differentiable Subset Selection [50.14730810124592]
Submodular functions and variants, through their ability to characterize diversity and coverage, have emerged as a key tool for data selection and summarization.
We propose FLEXSUBNET, a family of flexible neural models for both monotone and non-monotone submodular functions.
arXiv Detail & Related papers (2022-10-20T06:00:45Z)
- Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z)
- Learning a Single Neuron with Bias Using Gradient Descent [53.15475693468925]
We study the fundamental problem of learning a single neuron with a bias term.
We show that this is a significantly different and more challenging problem than the bias-less case.
arXiv Detail & Related papers (2021-06-02T12:09:55Z)
- Certified Monotonic Neural Networks [15.537695725617576]
We propose to certify the monotonicity of the general piece-wise linear neural networks by solving a mixed integer linear programming problem.
Our approach does not require human-designed constraints on the weight space and also yields more accurate approximation.
arXiv Detail & Related papers (2020-11-20T04:58:13Z)
- No one-hidden-layer neural network can represent multivariable functions [0.0]
In a function approximation with a neural network, an input dataset is mapped to an output index by optimizing the parameters of each hidden-layer unit.
We present constraints on the parameters and on the second derivative of the approximated function by constructing a continuum version of a one-hidden-layer neural network with the rectified linear unit (ReLU) activation function.
arXiv Detail & Related papers (2020-06-19T06:46:54Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L_1$ and $L_2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
- Counterexample-Guided Learning of Monotonic Neural Networks [32.73558242733049]
We focus on monotonicity constraints, which are common and require that the function's output increases with increasing values of specific input features.
We develop a counterexample-guided technique to provably enforce monotonicity constraints at prediction time.
We also propose a technique to use monotonicity as an inductive bias for deep learning.
arXiv Detail & Related papers (2020-06-16T01:04:26Z)
- On Sharpness of Error Bounds for Multivariate Neural Network Approximation [0.0]
The paper deals with best non-linear approximation by sums of ridge functions.
Error bounds are presented in terms of moduli of smoothness.
arXiv Detail & Related papers (2020-04-05T14:00:52Z)
- Exact Hard Monotonic Attention for Character-Level Transduction [76.66797368985453]
We show that neural sequence-to-sequence models that use non-monotonic soft attention often outperform popular monotonic models.
We develop a hard attention sequence-to-sequence model that enforces strict monotonicity and learns a latent alignment jointly while learning to transduce.
arXiv Detail & Related papers (2019-05-15T17:51:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.