Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models
- URL: http://arxiv.org/abs/2401.16692v2
- Date: Sat, 18 May 2024 03:51:18 GMT
- Title: Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models
- Authors: Yewen Fan, Nian Si, Xiangchen Song, Kun Zhang
- Abstract summary: There is a lack of focus on evaluating the performance of deep learning pipelines.
With the increased use of large datasets and complex models, the training process is run only once and the result is compared to previous benchmarks.
Traditional solutions, such as running the training process multiple times, are often infeasible due to computational constraints.
We introduce a novel metric framework, the Calibrated Loss Metric, designed to address this issue by reducing the variance present in its conventional counterpart.
- Score: 16.308958212406583
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The adoption of deep learning across various fields has been extensive, yet there is a lack of focus on evaluating the performance of deep learning pipelines. Typically, with the increased use of large datasets and complex models, the training process is run only once and the result is compared to previous benchmarks. This practice can lead to imprecise comparisons due to the variance in neural network evaluation metrics, which stems from the inherent randomness in the training process. Traditional solutions, such as running the training process multiple times, are often infeasible due to computational constraints. In this paper, we introduce a novel metric framework, the Calibrated Loss Metric, designed to address this issue by reducing the variance present in its conventional counterpart. Consequently, this new metric enhances the accuracy in detecting effective modeling improvements. Our approach is substantiated by theoretical justifications and extensive experimental validations within the context of Deep Click-Through Rate Prediction Models.
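As a rough illustration of the calibrate-then-calculate idea, the sketch below fits a simple one-dimensional calibrator on the evaluation scores before computing log loss, so that run-to-run miscalibration is removed from the metric. The choice of Platt scaling and all helper names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def calibrated_log_loss(scores, labels):
    """Log loss computed after recalibrating the raw model scores.

    Sketch of the calibration-then-calculation idea: fit a
    one-dimensional calibrator (here, Platt scaling) on the
    evaluation scores, then compute log loss on the calibrated
    probabilities, so run-to-run miscalibration does not leak
    into the metric.
    """
    eps = 1e-7
    p = np.clip(np.asarray(scores, dtype=float), eps, 1 - eps)
    logits = np.log(p / (1 - p)).reshape(-1, 1)  # back to logit space

    calibrator = LogisticRegression()  # learns sigmoid(a * logit + b)
    calibrator.fit(logits, labels)
    p_cal = calibrator.predict_proba(logits)[:, 1]
    return log_loss(labels, p_cal)

# Example usage: compare two training runs on the same evaluation set.
# metric_a = calibrated_log_loss(scores_run_a, labels)
# metric_b = calibrated_log_loss(scores_run_b, labels)
```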
Related papers
- Probabilistic Calibration by Design for Neural Network Regression [2.3020018305241337]
We introduce a novel end-to-end model training procedure called Quantile Recalibration Training.
We demonstrate the performance of our method in a large-scale experiment involving 57 regression datasets.
arXiv Detail & Related papers (2024-03-18T17:04:33Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Bayesian Deep Learning for Remaining Useful Life Estimation via Stein Variational Gradient Descent [14.784809634505903]
We show that Bayesian deep learning models trained via Stein variational gradient descent consistently outperform comparable baselines in both convergence speed and predictive performance.
We propose a method to enhance performance based on the uncertainty information provided by the Bayesian models.
arXiv Detail & Related papers (2024-02-02T02:21:06Z)
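For context on the entry above, here is a minimal NumPy sketch of the standard SVGD particle update with an RBF kernel; the paper's deep remaining-useful-life models and its uncertainty-based enhancement are not reproduced, and all names are illustrative.

```python
import numpy as np

def svgd_step(particles, grad_log_p, step_size=1e-2, bandwidth=1.0):
    """One Stein variational gradient descent update.

    particles: (n, d) array of parameter samples.
    grad_log_p: maps (n, d) particles to (n, d) gradients of the
                log posterior.
    """
    n = particles.shape[0]
    diffs = particles[:, None, :] - particles[None, :, :]   # (n, n, d)
    sq_dists = np.sum(diffs ** 2, axis=-1)
    k = np.exp(-sq_dists / (2 * bandwidth ** 2))             # RBF kernel

    grads = grad_log_p(particles)
    # Attractive term: kernel-smoothed log-posterior gradients.
    drive = k @ grads / n
    # Repulsive term: kernel gradient, keeps particles spread out.
    repulse = np.sum(k[:, :, None] * diffs, axis=1) / (n * bandwidth ** 2)
    return particles + step_size * (drive + repulse)
```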
- Stabilizing Subject Transfer in EEG Classification with Divergence Estimation [17.924276728038304]
We propose several graphical models to describe an EEG classification task.
We identify statistical relationships that should hold true in an idealized training scenario.
We design regularization penalties to enforce these relationships in two stages.
arXiv Detail & Related papers (2023-10-12T23:06:52Z)
- On double-descent in uncertainty quantification in overparametrized models [24.073221004661427]
Uncertainty quantification is a central challenge in reliable and trustworthy machine learning.
We show a trade-off between classification accuracy and calibration, unveiling a double-descent-like behavior in the calibration curve of optimally regularized estimators.
This is in contrast with the empirical Bayes method, which we show to be well calibrated in our setting despite the higher generalization error and overparametrization.
arXiv Detail & Related papers (2022-10-23T16:01:08Z)
- Deep Equilibrium Optical Flow Estimation [80.80992684796566]
Recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms.
These RNNs impose large computation and memory overheads and are not directly trained to model the stable estimate they are meant to converge to.
We propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer.
arXiv Detail & Related papers (2022-04-18T17:53:44Z)
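The entry above replaces a fixed number of unrolled recurrent updates with a direct solve for the update's fixed point. A minimal sketch of the forward pass follows, assuming naive fixed-point iteration; real DEQ models use faster root solvers (e.g., Anderson acceleration) and differentiate through the fixed point implicitly.

```python
import numpy as np

def deq_fixed_point(f, x, z0, tol=1e-6, max_iter=100):
    """Solve z* = f(z*, x) by fixed-point iteration.

    Instead of unrolling the update f a fixed number of steps,
    iterate until convergence and treat z* as the layer output.
    """
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z
```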
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
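As a rough sketch of the setup in the entry above, the code below pits a model against a parametric adversary whose positive, mean-one outputs act as likelihood-ratio weights over examples. The architectures, the weight parameterization, and the paper's three stabilizing ideas are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def dro_step(model, ratio_model, x, y, opt_model, opt_ratio):
    """One min-max step of DRO with a parametric likelihood ratio.

    model and ratio_model are torch.nn.Modules producing one logit
    per example; opt_model and opt_ratio are their optimizers.
    """
    losses = F.binary_cross_entropy_with_logits(
        model(x).squeeze(-1), y.float(), reduction="none")

    # Likelihood-ratio weights: positive, normalized to mean one.
    w = F.softplus(ratio_model(x).squeeze(-1))
    w = w / w.mean()
    robust_loss = (w * losses).mean()

    opt_model.zero_grad()
    opt_ratio.zero_grad()
    robust_loss.backward()
    opt_model.step()                  # model descends on the loss

    # Adversary ascends: flip its gradients before stepping.
    for p in ratio_model.parameters():
        if p.grad is not None:
            p.grad.neg_()
    opt_ratio.step()
    return robust_loss.item()
```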
- Calibrated and Sharp Uncertainties in Deep Learning via Simple Density Estimation [7.184701179854522]
This paper argues for reasoning about uncertainty in terms of calibration and sharpness and proposes simple algorithms for enforcing them in deep learning.
Our methods focus on the strongest notion of calibration--distribution calibration--and enforce it by fitting a low-dimensional density or quantile function with a neural estimator.
Empirically, we find that our methods improve predictive uncertainties on several tasks with minimal computational and implementation overhead.
arXiv Detail & Related papers (2021-12-14T06:19:05Z)
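To unpack the entry above: if a regression model were distribution calibrated, its probability integral transform (PIT) values would be uniform on [0, 1], so calibration can be enforced by fitting a one-dimensional map that makes them uniform. The empirical-CDF recalibrator below is an illustrative stand-in for the paper's neural density or quantile estimator.

```python
import numpy as np

def fit_quantile_recalibrator(pit_values):
    """Fit a recalibration map from held-out PIT values.

    Returns a function R such that R(F(y)) is approximately
    uniform, where F is the model's predictive CDF.
    """
    sorted_pit = np.sort(np.asarray(pit_values, dtype=float))

    def recalibrate(p):
        # Empirical CDF of the held-out PIT values, evaluated at p.
        return np.searchsorted(sorted_pit, p, side="right") / len(sorted_pit)

    return recalibrate
```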
- Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn.
We then show that distillation performs strongly for low-churn training compared to a number of recent baselines.
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
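A minimal sketch of the equivalence in the entry above: distilling from the previously deployed ("base") model adds a term that directly penalizes disagreement with its predictions, which acts as a soft constraint on predictive churn. The binary cross-entropy form and the mixing weight below are illustrative assumptions.

```python
import numpy as np

def churn_distillation_loss(student_p, teacher_p, labels, lam=0.5):
    """Label loss plus distillation from the base (teacher) model.

    The second term penalizes deviation from the teacher's
    predictions, i.e., predictive churn; lam trades label fit
    against churn.
    """
    eps = 1e-12
    ce_labels = -np.mean(labels * np.log(student_p + eps)
                         + (1 - labels) * np.log(1 - student_p + eps))
    ce_teacher = -np.mean(teacher_p * np.log(student_p + eps)
                          + (1 - teacher_p) * np.log(1 - student_p + eps))
    return (1 - lam) * ce_labels + lam * ce_teacher
```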
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
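For reference, marginal-likelihood based model selection scores a model by its evidence, integrating the likelihood over the prior; the estimation difficulty noted above stems from this high-dimensional integral:

```latex
p(\mathcal{D} \mid \mathcal{M}) = \int p(\mathcal{D} \mid \theta, \mathcal{M})\, p(\theta \mid \mathcal{M})\, d\theta,
\qquad
\mathcal{M}^{\star} = \operatorname*{arg\,max}_{\mathcal{M}} \, p(\mathcal{D} \mid \mathcal{M})
```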
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.