Constrained Monotonic Neural Networks
- URL: http://arxiv.org/abs/2205.11775v4
- Date: Wed, 31 May 2023 18:53:13 GMT
- Title: Constrained Monotonic Neural Networks
- Authors: Davor Runje, Sharath M. Shankaranarayana
- Abstract summary: Wider adoption of neural networks in many critical domains such as finance and healthcare is being hindered by the need to explain their predictions.
Monotonicity constraint is one of the most requested properties in real-world scenarios.
We show it can approximate any continuous monotone function on a compact subset of $\mathbb{R}^n$.
- Score: 0.685316573653194
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Wider adoption of neural networks in many critical domains such as finance
and healthcare is being hindered by the need to explain their predictions and
to impose additional constraints on them. Monotonicity constraint is one of the
most requested properties in real-world scenarios and is the focus of this
paper. One of the oldest ways to construct a monotonic fully connected neural
network is to constrain signs on its weights. Unfortunately, this construction
does not work with popular non-saturated activation functions as it can only
approximate convex functions. We show this shortcoming can be fixed by
constructing two additional activation functions from a typical unsaturated
monotonic activation function and employing each of them on a part of the
neurons. Our experiments show this approach of building monotonic neural
networks has better accuracy when compared to other state-of-the-art methods,
while being the simplest one in the sense of having the least number of
parameters, and not requiring any modifications to the learning procedure or
post-learning steps. Finally, we prove it can approximate any continuous
monotone function on a compact subset of $\mathbb{R}^n$.
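To make the construction concrete, below is a minimal, hedged sketch (not the paper's exact recipe): weights are kept non-negative via an absolute-value reparameterization, and the layer's units are split between the base convex activation rho(x) = ReLU(x) and its point-reflected concave counterpart -ReLU(-x). The choice of ReLU, the 50/50 split, and all names are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hedged sketch of a sign-constrained monotone dense layer.
class MonotoneDense(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Non-negative weights keep the pre-activation non-decreasing in x.
        w = self.linear.weight.abs()
        z = F.linear(x, w, self.linear.bias)
        # Splitting units between a convex and a concave monotone activation
        # lets stacked layers represent non-convex monotone functions,
        # which a pure ReLU + non-negative-weight stack cannot.
        h = z.shape[-1] // 2
        return torch.cat([F.relu(z[..., :h]), -F.relu(-z[..., h:])], dim=-1)

# Usage: two monotone hidden layers plus a sign-constrained linear head.
hidden = nn.Sequential(MonotoneDense(3, 16), MonotoneDense(16, 16))
head = nn.Linear(16, 1)
x = torch.randn(5, 3)
y = F.linear(hidden(x), head.weight.abs(), head.bias)  # non-decreasing in every input
```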
Related papers
- 1-Lipschitz Neural Networks are more expressive with N-Activations [19.858602457988194]
Small changes to a system's inputs should not result in large changes to its outputs.
We show that commonly used activation functions, such as MaxMin, unnecessarily restrict the class of representable functions.
We introduce the new N-activation function that is provably more expressive than currently popular activation functions.
arXiv Detail & Related papers (2023-11-10T15:12:04Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Neural Estimation of Submodular Functions with Applications to Differentiable Subset Selection [50.14730810124592]
Submodular functions and variants, through their ability to characterize diversity and coverage, have emerged as a key tool for data selection and summarization.
We propose FLEXSUBNET, a family of flexible neural models for both monotone and non-monotone submodular functions.
arXiv Detail & Related papers (2022-10-20T06:00:45Z)
- Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z)
- Learning a Single Neuron with Bias Using Gradient Descent [53.15475693468925]
We study the fundamental problem of learning a single neuron with a bias term.
We show that this is a significantly different and more challenging problem than the bias-less case.
arXiv Detail & Related papers (2021-06-02T12:09:55Z)
- Certified Monotonic Neural Networks [15.537695725617576]
We propose to certify the monotonicity of the general piece-wise linear neural networks by solving a mixed integer linear programming problem.
Our approach does not require human-designed constraints on the weight space and also yields more accurate approximation.
arXiv Detail & Related papers (2020-11-20T04:58:13Z)
- No one-hidden-layer neural network can represent multivariable functions [0.0]
In a function approximation with a neural network, an input dataset is mapped to an output index by optimizing the parameters of each hidden-layer unit.
We present constraints on the parameters and on the second derivative of the approximated function by constructing a continuum version of a one-hidden-layer neural network with the rectified linear unit (ReLU) activation function.
arXiv Detail & Related papers (2020-06-19T06:46:54Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L_1$ and $L_2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
- Counterexample-Guided Learning of Monotonic Neural Networks [32.73558242733049]
We focus on monotonicity constraints, which are common and require that the function's output increases with increasing values of specific input features.
We develop a counterexample-guided technique to provably enforce monotonicity constraints at prediction time.
We also propose a technique to use monotonicity as an inductive bias for deep learning.
arXiv Detail & Related papers (2020-06-16T01:04:26Z)
- On Sharpness of Error Bounds for Multivariate Neural Network Approximation [0.0]
The paper deals with best non-linear approximation by sums of ridge functions.
Error bounds are presented in terms of moduli of smoothness.
arXiv Detail & Related papers (2020-04-05T14:00:52Z)
- Exact Hard Monotonic Attention for Character-Level Transduction [76.66797368985453]
We show that neural sequence-to-sequence models that use non-monotonic soft attention often outperform popular monotonic models.
We develop a hard attention sequence-to-sequence model that enforces strict monotonicity and learns a latent alignment jointly while learning to transduce.
arXiv Detail & Related papers (2019-05-15T17:51:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.