No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks
- URL: http://arxiv.org/abs/2104.00795v1
- Date: Thu, 1 Apr 2021 22:40:25 GMT
- Title: No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks
- Authors: Shyamgopal Karthik, Ameya Prabhu, Puneet K. Dokania, Vineet Gandhi
- Abstract summary: We use the Conditional Risk Minimization (CRM) framework for hierarchy-aware classification.
Given a cost matrix and a reliable estimate of likelihoods, CRM simply amends mistakes at inference time.
It significantly outperforms the state-of-the-art and consistently obtains large reductions in the average hierarchical distance of top-$k$ predictions.
- Score: 17.55334996757232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been increasing interest in building deep hierarchy-aware
classifiers that aim to quantify and reduce the severity of mistakes, and not
just reduce the number of errors. The idea is to exploit the label hierarchy
(e.g., the WordNet ontology) and consider graph distances as a proxy for
mistake severity. Surprisingly, on examining mistake-severity distributions of
the top-1 prediction, we find that current state-of-the-art hierarchy-aware
deep classifiers do not always show practical improvement over the standard
cross-entropy baseline in making better mistakes. The reason for the reduction
in average mistake-severity can be attributed to the increase in low-severity
mistakes, which may also explain the noticeable drop in their accuracy. To this
end, we use the classical Conditional Risk Minimization (CRM) framework for
hierarchy-aware classification. Given a cost matrix and a reliable estimate of
likelihoods (obtained from a trained network), CRM simply amends mistakes at
inference time; it needs no extra hyperparameters and requires adding just a
few lines of code to the standard cross-entropy baseline. It significantly
outperforms the state-of-the-art and consistently obtains large reductions in
the average hierarchical distance of top-$k$ predictions across datasets, with
very little loss in accuracy. CRM, because of its simplicity, can be used with
any off-the-shelf trained model that provides reliable likelihood estimates.
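The abstract's "few lines of code" claim can be illustrated concretely. A minimal sketch of CRM at inference time (not the authors' code; the toy cost matrix and probabilities below are invented for illustration): given class likelihoods from a trained network and a cost matrix of pairwise hierarchical distances, predict the class that minimizes expected cost rather than the class with the highest likelihood.

```python
import numpy as np

def crm_predict(probs, cost, k=1):
    """Conditional Risk Minimization at inference time.

    probs: (n_classes,) likelihood estimates from a trained classifier.
    cost:  (n_classes, n_classes) matrix where cost[i, j] is the penalty
           for predicting i when the true class is j (e.g. tree distance).
    Returns the k class indices with the lowest expected cost (risk).
    """
    risk = cost @ probs           # expected cost of predicting each class
    return np.argsort(risk)[:k]  # lowest-risk classes first

# Toy 4-class hierarchy: classes {0, 1} are siblings, {2, 3} are siblings,
# and the two groups are far apart (distance 4 across groups).
cost = np.array([[0, 2, 4, 4],
                 [2, 0, 4, 4],
                 [4, 4, 0, 2],
                 [4, 4, 2, 0]], dtype=float)
probs = np.array([0.40, 0.02, 0.30, 0.28])

# argmax picks class 0, but CRM picks class 2: class 2's sibling also
# carries probability mass, so its expected hierarchical cost is lower.
print(crm_predict(probs, cost, k=1))
```

Because CRM only post-processes the likelihood vector, it adds no hyperparameters and leaves training untouched, which is exactly why it composes with any off-the-shelf model.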
Related papers
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
arXiv Detail & Related papers (2023-10-26T04:54:39Z)
- Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification [10.719054378755981]
We present a novel approach for post-hoc correction called Hierarchical Ensembles (HiE).
HiE utilizes label hierarchy to improve the performance of fine-grained classification at test-time using the coarse-grained predictions.
Our approach brings notable gains in top-1 accuracy while significantly decreasing the severity of mistakes as training data decreases for the fine-grained classes.
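The blurb does not spell out HiE's exact combination rule, but the general idea of amending fine-grained predictions with a coarse classifier can be sketched as follows (a generic illustration, with a hypothetical fine-to-coarse class mapping, not the paper's method):

```python
import numpy as np

# Hypothetical label hierarchy: fine classes 0, 1 share coarse parent 0;
# fine classes 2, 3 share coarse parent 1. Not taken from the paper.
parent = np.array([0, 0, 1, 1])

def amend(fine_probs, coarse_probs, parent):
    """Reweight each fine-grained likelihood by the coarse classifier's
    confidence in that class's parent, then renormalize."""
    amended = fine_probs * coarse_probs[parent]
    return amended / amended.sum()

fine = np.array([0.35, 0.05, 0.32, 0.28])   # fine net slightly prefers class 0
coarse = np.array([0.2, 0.8])               # coarse net is confident in parent 1

# The coarse signal overrules the fine net's weak preference: the amended
# prediction moves to class 2, which lies under the trusted parent.
print(int(np.argmax(amend(fine, coarse, parent))))
```

The appeal of this family of methods, as with CRM above, is that the correction happens entirely at test time and reuses predictions the models already produce.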
arXiv Detail & Related papers (2023-02-01T10:55:27Z)
- Probable Domain Generalization via Quantile Risk Minimization [90.15831047587302]
Domain generalization (DG) seeks predictors which perform well on unseen test distributions.
We propose a new probabilistic framework for DG where the goal is to learn predictors that perform well with high probability.
arXiv Detail & Related papers (2022-07-20T14:41:09Z)
- Hierarchical Average Precision Training for Pertinent Image Retrieval [0.0]
This paper introduces HAPPIER, a new hierarchical average-precision (AP) training method for pertinent image retrieval.
HAPPIER is based on a new H-AP metric, which integrates the importance of errors and better evaluates rankings.
Experiments on 6 datasets show that HAPPIER significantly outperforms state-of-the-art methods for hierarchical retrieval.
arXiv Detail & Related papers (2022-07-05T07:55:18Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the Importance-Guided Stochastic Gradient Descent (IGSGD) method to train inference models on inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Automation for Interpretable Machine Learning Through a Comparison of Loss Functions to Regularisers [0.0]
This paper explores the use of the Fit to Median Error measure in machine learning regression automation.
It improves interpretability by regularising learnt input-output relationships to the conditional median.
Networks optimised for their Fit to Median Error are shown to approximate the ground truth more consistently.
arXiv Detail & Related papers (2021-06-07T08:50:56Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Generalizing Variational Autoencoders with Hierarchical Empirical Bayes [6.273154057349038]
We present Hierarchical Empirical Bayes Autoencoder (HEBAE), a computationally stable framework for probabilistic generative models.
Our key contributions are two-fold. First, we make gains by placing a hierarchical prior over the encoding distribution, enabling us to adaptively balance the trade-off between minimizing the reconstruction loss function and avoiding over-regularization.
arXiv Detail & Related papers (2020-07-20T18:18:39Z)
- Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance.
arXiv Detail & Related papers (2020-07-05T04:19:37Z)
- Understanding and Mitigating the Tradeoff Between Robustness and Accuracy [88.51943635427709]
Adversarial training augments the training set with perturbations to improve the robust error.
We show that the standard error could increase even when the augmented perturbations have noiseless observations from the optimal linear predictor.
arXiv Detail & Related papers (2020-02-25T08:03:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.