Related papers: Bias and Priors in Machine Learning Calibrations for High Energy Physics

Bias and Priors in Machine Learning Calibrations for High Energy Physics

URL: http://arxiv.org/abs/2205.05084v1
Date: Tue, 10 May 2022 18:00:00 GMT
Title: Bias and Priors in Machine Learning Calibrations for High Energy Physics
Authors: Rikab Gambhir, Benjamin Nachman, and Jesse Thaler
Abstract summary: We highlight the prior dependence of some machine learning-based calibration strategies. Recent proposals for both simulation-based and data-based calibrations inherit properties of the sample used for training. In the case of simulation-based calibration, we argue that our recently proposed Gaussian Ansatz approach can avoid some of the pitfalls of prior dependence.
Score: 1.5675763601034223
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Machine learning offers an exciting opportunity to improve the calibration of nearly all reconstructed objects in high-energy physics detectors. However, machine learning approaches often depend on the spectra of examples used during training, an issue known as prior dependence. This is an undesirable property of a calibration, which needs to be applicable in a variety of environments. The purpose of this paper is to explicitly highlight the prior dependence of some machine learning-based calibration strategies. We demonstrate how some recent proposals for both simulation-based and data-based calibrations inherit properties of the sample used for training, which can result in biases for downstream analyses. In the case of simulation-based calibration, we argue that our recently proposed Gaussian Ansatz approach can avoid some of the pitfalls of prior dependence, whereas prior-independent data-based calibration remains an open problem.

Related papers

Rethinking Early Stopping: Refine, Then Calibrate [49.966899634962374]
We present a novel variational formulation of the calibration-refinement decomposition.<n>We provide theoretical and empirical evidence that calibration and refinement errors are not minimized simultaneously during training.
arXiv Detail & Related papers (2025-01-31T15:03:54Z)
What Really Matters for Learning-based LiDAR-Camera Calibration [50.2608502974106]
This paper revisits the development of learning-based LiDAR-Camera calibration. We identify the critical limitations of regression-based methods with the widely used data generation pipeline. We also investigate how the input data format and preprocessing operations impact network performance.
arXiv Detail & Related papers (2025-01-28T14:12:32Z)
Beware of Calibration Data for Pruning Large Language Models [41.1689082093302]
Post-training pruning is a promising method that does not require resource-intensive iterative training. We show that the effects of calibration data even value more than designing advanced pruning strategies. Our preliminary exploration also discloses that using calibration data similar to the training data can yield better performance.
arXiv Detail & Related papers (2024-10-23T09:36:21Z)
Machine Learning for predicting chaotic systems [0.0]
We show that well-tuned simple methods, as well as untuned baseline methods, often outperform state-of-the-art deep learning models. These findings underscore the importance of matching prediction methods to data characteristics and available computational resources.
arXiv Detail & Related papers (2024-07-29T16:34:47Z)
Reassessing How to Compare and Improve the Calibration of Machine Learning Models [7.183341902583164]
A machine learning model is calibrated if its predicted probability for an outcome matches the observed frequency for that outcome conditional on the model prediction. We show that there exist trivial recalibration approaches that can appear seemingly state-of-the-art unless calibration and prediction metrics are accompanied by additional generalization metrics.
arXiv Detail & Related papers (2024-06-06T13:33:45Z)
Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression. These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization. We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods [9.662269016653296]
We consider an individual calibration objective for characterizing the quantiles of the prediction model. Existing methods have been largely and lack of statistical guarantee in terms of individual calibration. We propose simple nonparametric calibration methods that are agnostic of the underlying prediction model.
arXiv Detail & Related papers (2023-05-20T21:31:51Z)
Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks. We analyze problem statement, calibration definitions, and different approaches to evaluation. Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z)
Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance. We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance. Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
Variable-Based Calibration for Machine Learning Classifiers [11.9995808096481]
We introduce the notion of variable-based calibration to characterize calibration properties of a model. We find that models with near-perfect expected calibration error can exhibit significant miscalibration as a function of features of the data.
arXiv Detail & Related papers (2022-09-30T00:49:31Z)
Calibrate: Interactive Analysis of Probabilistic Model Output [5.444048397001003]
We present Calibrate, an interactive reliability diagram that is resistant to drawbacks in traditional approaches. We demonstrate the utility of Calibrate through use cases on both real-world and synthetic data.
arXiv Detail & Related papers (2022-07-27T20:01:27Z)
Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems [66.61691401921296]
This paper presents an investigation over several methods of score calibration for deep speaker embedding extractors. An additional focus of this research is to estimate the impact of score normalization on the calibration performance of the system.
arXiv Detail & Related papers (2022-03-28T21:22:22Z)
T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem. detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions. We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $ell$-Expected Error (ECE)
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it. We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.