Calibrating a Deep Neural Network with Its Predecessors
- URL: http://arxiv.org/abs/2302.06245v2
- Date: Tue, 23 May 2023 04:24:56 GMT
- Title: Calibrating a Deep Neural Network with Its Predecessors
- Authors: Linwei Tao, Minjing Dong, Daochang Liu, Changming Sun, Chang Xu
- Abstract summary: We study the limitations of early stopping and analyze the overfitting problem of a network considering each individual block.
We propose a novel regularization method, predecessor combination search (PCS), to improve calibration.
PCS achieves the state-of-the-art calibration performance on multiple datasets and architectures.
- Score: 39.3413000646559
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Confidence calibration - the process of calibrating the output probability
distribution of neural networks - is essential for safety-critical applications
of such networks. Recent works verify the link between mis-calibration and
overfitting. However, early stopping, as a well-known technique to mitigate
overfitting, fails to calibrate networks. In this work, we study the limitations
of early stopping and comprehensively analyze the overfitting problem of a
network considering each individual block. We then propose a novel
regularization method, predecessor combination search (PCS), to improve
calibration by searching a combination of best-fitting block predecessors,
where block predecessors are the corresponding network blocks with weight
parameters from earlier training stages. PCS achieves the state-of-the-art
calibration performance on multiple datasets and architectures. In addition,
PCS improves model robustness under dataset distribution shift.
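
The abstract does not spell out the search procedure, so the following is only a minimal PyTorch-flavoured sketch of the general idea under stated assumptions: per-block weight snapshots ("predecessors") are saved at earlier training epochs, and a simple random search picks one snapshot per block so as to minimize expected calibration error (ECE) on a validation set. The block naming, the random search, and the ECE scoring are illustrative, not the authors' exact PCS algorithm.

```python
# Hedged sketch: search over per-block weight snapshots ("predecessors") to
# improve calibration. Not the paper's exact procedure.
import copy
import random
import torch
import torch.nn.functional as F

def ece(logits, labels, n_bins=15):
    """Expected Calibration Error over equal-width confidence bins."""
    conf, pred = F.softmax(logits, dim=1).max(dim=1)
    acc = pred.eq(labels).float()
    bins = torch.linspace(0, 1, n_bins + 1)
    err = torch.zeros(1)
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            err += mask.float().mean() * (acc[mask].mean() - conf[mask].mean()).abs()
    return err.item()

def load_blocks(model, blocks, snapshots, choice):
    """Load, for each named block, the weights saved at its chosen epoch."""
    state = copy.deepcopy(snapshots[max(choice)])      # base: the latest chosen snapshot
    for name, epoch in zip(blocks, choice):
        for k, v in snapshots[epoch].items():
            if k.startswith(name + "."):               # overwrite this block's parameters
                state[k] = v
    model.load_state_dict(state)

def search_predecessors(model, blocks, snapshots, val_loader, n_trials=50):
    """Random search over per-block snapshot combinations, scored by validation ECE."""
    best_choice, best_ece = None, float("inf")
    epochs = sorted(snapshots)                          # snapshots: {epoch: state_dict}
    model.eval()
    for _ in range(n_trials):
        choice = [random.choice(epochs) for _ in blocks]
        load_blocks(model, blocks, snapshots, choice)
        logits, labels = [], []
        with torch.no_grad():
            for x, y in val_loader:
                logits.append(model(x))
                labels.append(y)
        score = ece(torch.cat(logits), torch.cat(labels))
        if score < best_ece:
            best_choice, best_ece = choice, score
    return best_choice, best_ece
```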
Related papers
- Concurrent Training and Layer Pruning of Deep Neural Networks [0.0]
We propose an algorithm capable of identifying and eliminating irrelevant layers of a neural network during the early stages of training.
We employ a structure using residual connections around nonlinear network sections that allow the flow of information through the network once a nonlinear section is pruned.
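
The residual bypass described above can be pictured with a small sketch (PyTorch assumed): a nonlinear section wrapped in a skip connection reduces to the identity once it is pruned, so information keeps flowing. The pruning criterion itself, which is the paper's contribution, is not shown.

```python
# Sketch of a residual wrapper around a prunable nonlinear section.
import torch.nn as nn

class PrunableSection(nn.Module):
    def __init__(self, dim, hidden):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
        self.pruned = False          # set True when the section is deemed irrelevant

    def forward(self, x):
        # Once pruned, the block reduces to the identity mapping via the skip path.
        return x if self.pruned else x + self.body(x)
```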
arXiv Detail & Related papers (2024-06-06T23:19:57Z)
- ACLS: Adaptive and Conditional Label Smoothing for Network Calibration [30.80635918457243]
Many approaches to network calibration adopt a regularization-based method that exploits a regularization term to smooth the miscalibrated confidences.
We present in this paper an in-depth analysis of existing regularization-based methods, providing a better understanding of how they affect network calibration.
We introduce a novel loss function, dubbed ACLS, that unifies the merits of existing regularization methods, while avoiding the limitations.
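
ACLS itself is not specified in this summary; as a reference point, the plain uniform label smoothing that such regularization-based methods build on can be written as follows (PyTorch assumed; `eps` is an illustrative hyperparameter). PyTorch's built-in `nn.CrossEntropyLoss(label_smoothing=...)` provides a close variant.

```python
# Plain uniform label smoothing as a calibration-oriented regularizer.
# ACLS adapts and conditions the smoothing; only the basic variant is shown.
import torch
import torch.nn.functional as F

def smoothed_cross_entropy(logits, target, eps=0.1):
    n_classes = logits.size(1)
    log_probs = F.log_softmax(logits, dim=1)
    # Soft targets: 1 - eps on the true class, eps spread over the other classes.
    soft = torch.full_like(log_probs, eps / (n_classes - 1))
    soft.scatter_(1, target.unsqueeze(1), 1.0 - eps)
    return -(soft * log_probs).sum(dim=1).mean()
```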
arXiv Detail & Related papers (2023-08-23T04:52:48Z)
- Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks [23.7125322065694]
Uncertainty quantification (UQ) is important for reliability assessment and enhancement of machine learning models.
We create statistically guaranteed schemes to principally characterize, and remove, the uncertainty of over-parameterized neural networks.
In particular, our approach, based on what we call a procedural-noise-correcting (PNC) predictor, removes the procedural uncertainty by using only one auxiliary network that is trained on a suitably labeled dataset.
arXiv Detail & Related papers (2023-06-09T05:15:53Z)
- Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism.
We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights.
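
As a rough illustration of a greedy path-following quantizer (not the exact GPFQ algorithm or its modified, provable variant), a single neuron can be quantized weight by weight while an error-feedback vector tracks the gap between the analog and quantized outputs on calibration data. The alphabet and data below are made-up assumptions.

```python
# Simplified single-neuron sketch of greedy, error-feedback quantization.
import numpy as np

def greedy_quantize_neuron(w, X, alphabet):
    """w: (N,) weights; X: (m, N) calibration inputs; alphabet: 1-D grid of levels."""
    q = np.zeros_like(w)
    u = np.zeros(X.shape[0])                      # running output error
    for t in range(w.shape[0]):
        x_t = X[:, t]
        target = u + w[t] * x_t
        # Best real coefficient in closed form, then snap to the nearest grid level.
        c = x_t @ target / (x_t @ x_t + 1e-12)
        q[t] = alphabet[np.argmin(np.abs(alphabet - c))]
        u = target - q[t] * x_t
    return q

# Illustrative usage with random data and a symmetric grid.
rng = np.random.default_rng(0)
w = rng.normal(size=64)
X = rng.normal(size=(256, 64))
q = greedy_quantize_neuron(w, X, np.linspace(-2, 2, 15))
```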
arXiv Detail & Related papers (2022-01-26T18:47:38Z)
- Uncertainty-Aware Deep Calibrated Salient Object Detection [74.58153220370527]
Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy.
These methods overlook the gap between network accuracy and prediction confidence, known as the confidence uncalibration problem.
We introduce an uncertainty-aware deep SOD network, and propose two strategies to prevent deep SOD networks from being overconfident.
arXiv Detail & Related papers (2020-12-10T23:28:36Z)
- Post-hoc Calibration of Neural Networks by g-Layers [51.42640515410253]
In recent years, there has been a surge of research on neural network calibration.
It is known that minimizing Negative Log-Likelihood (NLL) will lead to a calibrated network on the training set if the global optimum is attained.
We prove that even though the base network ($f$) does not reach the global optimum of NLL, by appending additional layers ($g$) and minimizing NLL over the parameters of $g$, one can obtain a calibrated network.
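
A minimal sketch of the $g$-layer idea, assuming PyTorch and an illustrative choice of $g$ (a single linear layer acting on the logits): the base network $f$ is frozen and NLL is minimized over the parameters of $g$ alone on held-out data.

```python
# Post-hoc calibration sketch: freeze f, train only the appended module g by NLL.
import torch
import torch.nn as nn

def calibrate_with_g(f, g, val_loader, epochs=10, lr=1e-3):
    f.eval()
    for p in f.parameters():
        p.requires_grad_(False)                  # base network stays fixed
    opt = torch.optim.Adam(g.parameters(), lr=lr)
    nll = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in val_loader:
            with torch.no_grad():
                z = f(x)                         # frozen base-network logits
            loss = nll(g(z), y)                  # NLL flows through g only
            opt.zero_grad()
            loss.backward()
            opt.step()
    return g

# e.g. g = nn.Linear(num_classes, num_classes)  # illustrative choice of g
```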
arXiv Detail & Related papers (2020-06-23T07:55:10Z)
- Intra Order-preserving Functions for Calibration of Multi-Class Neural Networks [54.23874144090228]
A common approach is to learn a post-hoc calibration function that transforms the output of the original network into calibrated confidence scores.
Previous post-hoc calibration techniques work only with simple calibration functions.
We propose a new neural network architecture that represents a class of intra order-preserving functions.
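
The paper's architecture is not described in this summary; the simplest member of the order-preserving family is temperature scaling, where dividing the logits by one positive scalar leaves their ranking unchanged. A hedged PyTorch sketch, fit on precomputed validation logits:

```python
# Temperature scaling: the simplest order-preserving post-hoc calibration map.
import torch
import torch.nn as nn

class TemperatureScaler(nn.Module):
    def __init__(self):
        super().__init__()
        self.log_t = nn.Parameter(torch.zeros(1))   # T = exp(log_t) > 0, init T = 1

    def forward(self, logits):
        return logits / self.log_t.exp()            # same ranking, rescaled confidence

def fit_temperature(logits, labels, lr=0.01, max_iter=200):
    scaler = TemperatureScaler()
    opt = torch.optim.LBFGS(scaler.parameters(), lr=lr, max_iter=max_iter)
    nll = nn.CrossEntropyLoss()
    def closure():
        opt.zero_grad()
        loss = nll(scaler(logits), labels)
        loss.backward()
        return loss
    opt.step(closure)
    return scaler
```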
arXiv Detail & Related papers (2020-03-15T12:57:21Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training of such models with a novel loss function and centroid updating scheme, and match the accuracy of softmax models.
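
A rough sketch of the single-forward-pass scoring step (PyTorch assumed): confidence is the largest RBF-kernel similarity between an input's features and a set of per-class centroids, and low similarity flags out-of-distribution inputs. The paper's training loss and centroid-update scheme are not shown; `sigma` and `threshold` are illustrative.

```python
# Centroid-based uncertainty scoring in one forward pass (rejection step only).
import torch

def centroid_scores(features, centroids, sigma=1.0):
    """features: (B, D); centroids: (C, D). Returns (B, C) RBF-kernel similarities."""
    d2 = torch.cdist(features, centroids).pow(2)
    return torch.exp(-d2 / (2 * sigma ** 2))

def predict_or_reject(features, centroids, threshold=0.5, sigma=1.0):
    k = centroid_scores(features, centroids, sigma)
    conf, pred = k.max(dim=1)
    pred[conf < threshold] = -1          # -1 marks rejected / out-of-distribution inputs
    return pred, conf
```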
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
- Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
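
A minimal multi-class focal loss, which down-weights the cross-entropy of already-confident examples by a factor (1 - p_t)^gamma (PyTorch assumed; the fixed `gamma` here stands in for whatever value or schedule the paper uses):

```python
# Minimal multi-class focal loss: cross-entropy scaled by (1 - p_t)^gamma,
# which penalizes over-confident easy examples.
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0):
    log_p = F.log_softmax(logits, dim=1)
    log_pt = log_p.gather(1, target.unsqueeze(1)).squeeze(1)   # log-prob of true class
    pt = log_pt.exp()
    return (-(1 - pt) ** gamma * log_pt).mean()
```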
arXiv Detail & Related papers (2020-02-21T17:35:50Z)