Generalized Quantile Loss for Deep Neural Networks
- URL: http://arxiv.org/abs/2012.14348v1
- Date: Mon, 28 Dec 2020 16:37:02 GMT
- Title: Generalized Quantile Loss for Deep Neural Networks
- Authors: Dvir Ben Or, Michael Kolomenkin, Gil Shabat
- Abstract summary: This note presents a simple way to add a count (or quantile) constraint to a regression neural net, such that given $n$ samples in the training set it guarantees that the prediction of $m<n$ samples will be larger than the actual value (the label).
Unlike standard quantile regression networks, the presented method can be applied to any loss function and not necessarily to the standard quantile regression loss, which minimizes the mean absolute differences.
- Score: 0.8594140167290096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This note presents a simple way to add a count (or quantile) constraint to a
regression neural net, such that given $n$ samples in the training set it
guarantees that the prediction of $m<n$ samples will be larger than the actual
value (the label). Unlike standard quantile regression networks, the presented
method can be applied to any loss function and not necessarily to the standard
quantile regression loss, which minimizes the mean absolute differences. Since
this count constraint has zero gradients almost everywhere, it cannot be
optimized using standard gradient descent methods. To overcome this problem, an
alternation scheme, which is based on standard neural network optimization
procedures, is presented with some theoretical analysis.
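The abstract gives only a high-level description, so the following PyTorch sketch is purely illustrative: it shows the standard pinball loss that the method generalizes beyond, and one plausible way a zero-gradient count constraint could be enforced between gradient steps. The constant-offset trick and all names (`pinball_loss`, `count_constraint_offset`, `pred`, `target`, `tau`, `m`) are assumptions made for illustration, not the authors' actual procedure.

```python
import torch


def pinball_loss(pred: torch.Tensor, target: torch.Tensor, tau: float) -> torch.Tensor:
    """Standard quantile (pinball) loss at level tau in (0, 1).

    Minimizing it drives roughly a tau-fraction of predictions below the
    labels; tau = 0.5 recovers mean-absolute-error (median) regression.
    """
    diff = target - pred
    return torch.mean(torch.maximum(tau * diff, (tau - 1.0) * diff))


def count_constraint_offset(pred: torch.Tensor, target: torch.Tensor, m: int) -> torch.Tensor:
    """Hypothetical constraint step for an alternation scheme.

    The count of over-predictions has zero gradient almost everywhere, so
    it cannot be reduced by gradient descent. One plausible correction
    (an assumption, not necessarily the paper's procedure) shifts every
    prediction by a constant so that exactly m of the n predictions end
    up larger than their labels.
    """
    with torch.no_grad():
        residual = (pred - target).flatten()   # > 0 means prediction > label
        n = residual.numel()
        assert 0 < m < n, "the abstract assumes m < n over-predictions"
        # The (n - m)-th smallest residual: subtracting it leaves exactly
        # the m largest residuals positive (assuming distinct values).
        threshold = torch.kthvalue(residual, n - m).values
        return -threshold                      # constant to add to predictions
```

In an alternation scheme of this flavor, one would interleave ordinary gradient steps on an arbitrary loss (e.g., MSE) with the constraint step above, for instance by folding the returned offset into the network's output bias; the note's theoretical analysis concerns such alternating procedures.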
Related papers
- Loss Minimization Yields Multicalibration for Large Neural Networks [16.047146428254592]
Multicalibration is a notion of fairness for predictors that requires them to provide calibrated predictions across a large set of protected groups.
We show that minimizing the squared loss over all neural nets of size $n$ implies multicalibration for all but a bounded number of unlucky values of $n$.
arXiv Detail & Related papers (2023-04-19T05:16:20Z) - Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a neural network with a Bayesian last layer (BLL), which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z) - Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data [63.34506218832164]
In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with leaky ReLU activations.
For gradient flow, we leverage recent work on the implicit bias of homogeneous neural networks to show that, asymptotically, gradient flow produces a neural network with rank at most two.
For gradient descent, provided the random initialization variance is small enough, we show that a single step of gradient descent suffices to drastically reduce the rank of the network, and that the rank remains small throughout training.
arXiv Detail & Related papers (2022-10-13T15:09:54Z) - Likelihood-Free Inference with Generative Neural Networks via Scoring Rule Minimization [0.0]
Inference methods yield posterior approximations for simulator models with intractable likelihood.
Many works trained neural networks to approximate either the intractable likelihood or the posterior directly.
Here, we propose to approximate the posterior with generative networks trained by Scoring Rule minimization.
arXiv Detail & Related papers (2022-05-31T13:32:55Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - Understanding the Under-Coverage Bias in Uncertainty Estimation [58.03725169462616]
Quantile regression tends to under-cover relative to the desired coverage level in reality.
We prove that quantile regression suffers from an inherent under-coverage bias.
Our theory reveals that this under-coverage bias stems from a certain high-dimensional parameter estimation error.
arXiv Detail & Related papers (2021-06-10T06:11:55Z) - Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and significant reduction in memory consumption.
They can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z) - An efficient projection neural network for $\ell_1$-regularized logistic regression [10.517079029721257]
This paper presents a simple projection neural network for $\ell_1$-regularized logistic regression.
The proposed neural network does not require any extra auxiliary variable nor any smooth approximation.
We also investigate the convergence of the proposed neural network using Lyapunov theory and show that it converges to a solution of the problem from an arbitrary initial value.
arXiv Detail & Related papers (2021-05-12T06:13:44Z) - Quantile regression with deep ReLU Networks: Estimators and minimax rates [4.522666263036413]
We study quantile regression with rectified linear unit (ReLU) neural networks.
We derive an upper bound on the expected mean squared error of a ReLU network.
These tight bounds imply that quantile regression with ReLU networks achieves minimax rates for broad collections of function types.
arXiv Detail & Related papers (2020-10-16T08:34:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.