No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks
- URL: http://arxiv.org/abs/2104.00795v1
- Date: Thu, 1 Apr 2021 22:40:25 GMT
- Title: No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks
- Authors: Shyamgopal Karthik, Ameya Prabhu, Puneet K. Dokania, Vineet Gandhi
- Abstract summary: We use the Conditional Risk Minimization (CRM) framework for hierarchy-aware classification.
Given a cost matrix and a reliable estimate of likelihoods, CRM simply amends mistakes at inference time.
It significantly outperforms the state-of-the-art and consistently obtains large reductions in the average hierarchical distance of top-$k$ predictions.
- Score: 17.55334996757232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been increasing interest in building deep hierarchy-aware
classifiers that aim to quantify and reduce the severity of mistakes, and not
just reduce the number of errors. The idea is to exploit the label hierarchy
(e.g., the WordNet ontology) and consider graph distances as a proxy for
mistake severity. Surprisingly, on examining mistake-severity distributions of
the top-1 prediction, we find that current state-of-the-art hierarchy-aware
deep classifiers do not always show practical improvement over the standard
cross-entropy baseline in making better mistakes. The reason for the reduction
in average mistake-severity can be attributed to the increase in low-severity
mistakes, which may also explain the noticeable drop in their accuracy. To this
end, we use the classical Conditional Risk Minimization (CRM) framework for
hierarchy-aware classification. Given a cost matrix and a reliable estimate of
likelihoods (obtained from a trained network), CRM simply amends mistakes at
inference time; it needs no extra hyperparameters and requires adding just a
few lines of code to the standard cross-entropy baseline. It significantly
outperforms the state-of-the-art and consistently obtains large reductions in
the average hierarchical distance of top-$k$ predictions across datasets, with
very little loss in accuracy. CRM, because of its simplicity, can be used with
any off-the-shelf trained model that provides reliable likelihood estimates.
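The abstract's "few lines of code" claim can be illustrated concretely. A minimal sketch of CRM at inference time (not the authors' code; the toy cost matrix and probabilities below are invented for illustration): given class likelihoods from a trained network and a cost matrix of pairwise hierarchical distances, predict the class that minimizes expected cost rather than the class with the highest likelihood.

```python
import numpy as np

def crm_predict(probs, cost, k=1):
    """Conditional Risk Minimization at inference time.

    probs: (n_classes,) likelihood estimates from a trained classifier.
    cost:  (n_classes, n_classes) matrix where cost[i, j] is the penalty
           for predicting i when the true class is j (e.g. tree distance).
    Returns the k class indices with the lowest expected cost (risk).
    """
    risk = cost @ probs           # expected cost of predicting each class
    return np.argsort(risk)[:k]  # lowest-risk classes first

# Toy 4-class hierarchy: classes {0, 1} are siblings, {2, 3} are siblings,
# and the two groups are far apart (distance 4 across groups).
cost = np.array([[0, 2, 4, 4],
                 [2, 0, 4, 4],
                 [4, 4, 0, 2],
                 [4, 4, 2, 0]], dtype=float)
probs = np.array([0.40, 0.02, 0.30, 0.28])

# argmax picks class 0, but CRM picks class 2: class 2's sibling also
# carries probability mass, so its expected hierarchical cost is lower.
print(crm_predict(probs, cost, k=1))
```

Because CRM only post-processes the likelihood vector, it adds no hyperparameters and leaves training untouched, which is exactly why it composes with any off-the-shelf model.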
Related papers
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
arXiv Detail & Related papers (2023-10-26T04:54:39Z)
- Test-Time Amendment with a Coarse Classifier for Fine-Grained Classification [10.719054378755981]
We present a novel approach for post-hoc correction called Hierarchical Ensembles (HiE).
HiE utilizes label hierarchy to improve the performance of fine-grained classification at test-time using the coarse-grained predictions.
Our approach brings notable gains in top-1 accuracy while significantly decreasing the severity of mistakes as training data decreases for the fine-grained classes.
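The blurb does not spell out HiE's exact combination rule, but the general idea of amending fine-grained predictions with a coarse classifier can be sketched as follows (a generic illustration, with a hypothetical fine-to-coarse class mapping, not the paper's method):

```python
import numpy as np

# Hypothetical label hierarchy: fine classes 0, 1 share coarse parent 0;
# fine classes 2, 3 share coarse parent 1. Not taken from the paper.
parent = np.array([0, 0, 1, 1])

def amend(fine_probs, coarse_probs, parent):
    """Reweight each fine-grained likelihood by the coarse classifier's
    confidence in that class's parent, then renormalize."""
    amended = fine_probs * coarse_probs[parent]
    return amended / amended.sum()

fine = np.array([0.35, 0.05, 0.32, 0.28])   # fine net slightly prefers class 0
coarse = np.array([0.2, 0.8])               # coarse net is confident in parent 1

# The coarse signal overrules the fine net's weak preference: the amended
# prediction moves to class 2, which lies under the trusted parent.
print(int(np.argmax(amend(fine, coarse, parent))))
```

The appeal of this family of methods, as with CRM above, is that the correction happens entirely at test time and reuses predictions the models already produce.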
arXiv Detail & Related papers (2023-02-01T10:55:27Z)
- Probable Domain Generalization via Quantile Risk Minimization [90.15831047587302]
Domain generalization (DG) seeks predictors which perform well on unseen test distributions.
We propose a new probabilistic framework for DG where the goal is to learn predictors that perform well with high probability.
arXiv Detail & Related papers (2022-07-20T14:41:09Z)
- Hierarchical Average Precision Training for Pertinent Image Retrieval [0.0]
This paper introduces HAPPIER, a new hierarchical average-precision (AP) training method for pertinent image retrieval.
HAPPIER is based on a new H-AP metric, which integrates the importance of errors and better evaluates rankings.
Experiments on 6 datasets show that HAPPIER significantly outperforms state-of-the-art methods for hierarchical retrieval.
arXiv Detail & Related papers (2022-07-05T07:55:18Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the Importance-Guided Stochastic Gradient Descent (IGSGD) method to train inference models on inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Automation for Interpretable Machine Learning Through a Comparison of Loss Functions to Regularisers [0.0]
This paper explores the use of the Fit to Median Error measure in machine learning regression automation.
It improves interpretability by regularising learnt input-output relationships to the conditional median.
Networks optimised for their Fit to Median Error are shown to approximate the ground truth more consistently.
arXiv Detail & Related papers (2021-06-07T08:50:56Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Generalizing Variational Autoencoders with Hierarchical Empirical Bayes [6.273154057349038]
We present Hierarchical Empirical Bayes Autoencoder (HEBAE), a computationally stable framework for probabilistic generative models.
Our key contributions are two-fold. First, we make gains by placing a hierarchical prior over the encoding distribution, enabling us to adaptively balance the trade-off between minimizing the reconstruction loss function and avoiding over-regularization.
arXiv Detail & Related papers (2020-07-20T18:18:39Z)
- Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance.
arXiv Detail & Related papers (2020-07-05T04:19:37Z)
- Understanding and Mitigating the Tradeoff Between Robustness and Accuracy [88.51943635427709]
Adversarial training augments the training set with perturbations to improve the robust error.
We show that the standard error could increase even when the augmented perturbations have noiseless observations from the optimal linear predictor.
arXiv Detail & Related papers (2020-02-25T08:03:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.