A First Step Towards Distribution Invariant Regression Metrics
- URL: http://arxiv.org/abs/2009.05176v1
- Date: Thu, 10 Sep 2020 23:40:46 GMT
- Title: A First Step Towards Distribution Invariant Regression Metrics
- Authors: Mario Michael Krell and Bilal Wehbe
- Abstract summary: In classification, it has been stated repeatedly that performance metrics like the F-Measure and Accuracy are highly dependent on the class distribution.
We show that the same problem exists in regression. The distribution of odometry parameters in robotic applications can, for example, vary widely between recording sessions.
Here, we need regression algorithms that either perform equally well for all function values, or that focus on certain boundary regions like high speed.
- Score: 1.370633147306388
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Regression evaluation has been performed for decades. Some metrics have been
identified as robust against shifting and scaling of the data, but accounting for
different data distributions (the imbalance problem) is much harder to address,
even though it largely impacts the comparability of evaluations across different
datasets. In classification, it has been stated repeatedly that performance
metrics like the F-Measure and Accuracy are highly dependent on the class
distribution and that comparisons between different datasets with different
distributions are impossible. We show that the same problem exists in regression.
The distribution of odometry parameters in robotic applications can, for example,
vary widely between recording sessions. Here, we need regression algorithms that
either perform equally well for all function values or that focus on certain
boundary regions such as high speed. This has to be reflected in the evaluation
metric. We propose modifying established regression metrics by weighting with the
inverse distribution of the function values $Y$ or the samples $X$, using an
automatically tuned Gaussian kernel density estimator. We show on synthetic and
robotic data, in reproducible experiments, that classical metrics behave
incorrectly, whereas our new metrics are less sensitive to changing distributions,
especially when correcting by the marginal distribution in $X$. Our new evaluation
concept enables the comparison of results between datasets with different
distributions. Furthermore, it can reveal overfitting of a regression algorithm to
overrepresented target values. As a result, non-overfitting regression algorithms
are more likely to be chosen under our corrected metrics.
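A minimal sketch of the weighting idea described in the abstract: regression errors are reweighted by the inverse of an estimated density of the targets $Y$ (or, alternatively, the inputs $X$). It uses scipy's `gaussian_kde` with its default Scott's-rule bandwidth as a stand-in for the paper's automatically tuned estimator, and is an illustration rather than the authors' reference implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde


def density_weighted_errors(y_true, y_pred, x=None):
    """Regression errors weighted by inverse estimated density.

    By default the density of the targets y is used; pass x to correct
    by the marginal distribution of the inputs instead.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Quantity whose distribution we correct for: targets or inputs.
    ref = y_true if x is None else np.asarray(x, dtype=float)
    pts = ref.T if ref.ndim > 1 else ref
    kde = gaussian_kde(pts)          # Scott's rule bandwidth by default
    density = kde(pts)

    # Inverse-density weights, normalized to sum to one.
    w = 1.0 / np.clip(density, 1e-12, None)
    w /= w.sum()

    weighted_mse = np.sum(w * (y_true - y_pred) ** 2)
    weighted_mae = np.sum(w * np.abs(y_true - y_pred))
    return weighted_mse, weighted_mae


# Example: imbalanced targets (many slow samples, few high-speed ones).
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.5, 0.1, 900), rng.normal(3.0, 0.2, 100)])
y_hat = y + rng.normal(0.0, 0.1, y.size)
print(density_weighted_errors(y, y_hat))
```

Because the weights sum to one, the result stays on the same scale as an ordinary MSE/MAE while down-weighting overrepresented regions of the data.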
Related papers
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
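As background only, a plain scalar split-conformal baseline; the paper's semiparametric construction handles multiple correlated targets jointly, which this sketch does not attempt. The miscoverage level `alpha` and names are illustrative.

```python
import numpy as np


def split_conformal_interval(residuals_cal, y_pred_test, alpha=0.1):
    """Symmetric split-conformal intervals from calibration residuals.

    residuals_cal: |y - y_hat| on a held-out calibration set.
    y_pred_test: point predictions for new inputs.
    """
    n = len(residuals_cal)
    # Finite-sample corrected quantile level.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(residuals_cal, level)
    return y_pred_test - q, y_pred_test + q
```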
arXiv Detail & Related papers (2024-11-04T14:29:02Z) - Uncertainty Voting Ensemble for Imbalanced Deep Regression [20.176217123752465]
In this paper, we introduce UVOTE, a method for learning from imbalanced data.
We replace traditional regression losses with a negative log-likelihood loss, which additionally yields sample-wise aleatoric uncertainty estimates.
We show that UVOTE consistently outperforms the prior art, while at the same time producing better-calibrated uncertainty estimates.
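As a hedged illustration of the loss swap described above, here is a generic heteroscedastic Gaussian negative log-likelihood (not the UVOTE ensemble itself), where the model outputs a mean and a variance per sample:

```python
import numpy as np


def gaussian_nll(y, mu, var, eps=1e-6):
    """Per-sample negative log-likelihood of y under N(mu, var).

    Minimizing this trains the mean prediction and, through var,
    a sample-wise estimate of aleatoric uncertainty.
    """
    var = np.maximum(var, eps)  # guard against degenerate variances
    return 0.5 * (np.log(2 * np.pi * var) + (y - mu) ** 2 / var)
```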
arXiv Detail & Related papers (2023-05-24T14:12:21Z) - Intra-class Adaptive Augmentation with Neighbor Correction for Deep
Metric Learning [99.14132861655223]
We propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning.
We estimate intra-class variations for every class and generate adaptive synthetic samples to support hard-sample mining.
Our method significantly outperforms state-of-the-art methods, improving retrieval performance by 3%-6%.
arXiv Detail & Related papers (2022-11-29T14:52:38Z) - X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim to improve data efficiency for both classification and regression setups in deep learning.
To harness the strengths of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z) - RIFLE: Imputation and Robust Inference from Low Order Marginals [10.082738539201804]
We develop a statistical inference framework for regression and classification in the presence of missing data without imputation.
Our framework, RIFLE, estimates low-order moments of the underlying data distribution with corresponding confidence intervals to learn a distributionally robust model.
Our experiments demonstrate that RIFLE outperforms other benchmark algorithms when the percentage of missing values is high and/or when the number of data points is relatively small.
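A rough illustration of estimating low-order moments directly from incomplete data (pairwise-complete means and covariances, with NaNs marking missing values); RIFLE's confidence intervals and distributionally robust fitting are not reproduced here.

```python
import numpy as np


def pairwise_moments(X):
    """First and second moments from data with NaN-marked missing entries."""
    mu = np.nanmean(X, axis=0)               # column means from observed values
    d = X.shape[1]
    cov = np.full((d, d), np.nan)
    for i in range(d):
        for j in range(d):
            # Rows where both coordinates are observed (no imputation).
            both = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
            if both.sum() > 1:
                cov[i, j] = np.mean((X[both, i] - mu[i]) * (X[both, j] - mu[j]))
    return mu, cov
```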
arXiv Detail & Related papers (2021-09-01T23:17:30Z) - Human Pose Regression with Residual Log-likelihood Estimation [48.30425850653223]
We propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution.
RLE learns the change of the distribution instead of the unreferenced underlying distribution to facilitate the training process.
Compared to the conventional regression paradigm, regression with RLE brings a 12.4 mAP improvement on MSCOCO without any test-time overhead.
arXiv Detail & Related papers (2021-07-23T15:06:31Z) - Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
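As background for aggregating conditional quantile models, an illustrative sketch (not the paper's aggregation schemes): the standard pinball loss for scoring a quantile prediction, plus a simple weighted average of several models' quantile estimates.

```python
import numpy as np


def pinball_loss(y, q_pred, tau):
    """Average pinball (quantile) loss of predictions q_pred at level tau."""
    diff = y - q_pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))


def aggregate_quantiles(q_preds, weights=None):
    """Weighted average of per-model quantile predictions, shape (models, n)."""
    q_preds = np.asarray(q_preds, dtype=float)
    if weights is None:
        weights = np.full(q_preds.shape[0], 1.0 / q_preds.shape[0])
    return np.average(q_preds, axis=0, weights=weights)
```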
arXiv Detail & Related papers (2021-02-26T23:21:16Z) - Delving into Deep Imbalanced Regression [41.90687097747504]
We define Deep Imbalanced Regression (DIR) as learning from imbalanced data with continuous targets.
Motivated by the intrinsic difference between categorical and continuous label space, we propose distribution smoothing for both labels and features.
Our work fills the gap in benchmarks and techniques for practical imbalanced regression problems.
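A loose sketch of the label-distribution-smoothing idea for continuous targets: bin the labels, smooth the empirical density with a Gaussian kernel over the bins, and reweight samples by the inverse of the smoothed density. The bin count and kernel width below are arbitrary choices, not the paper's settings.

```python
import numpy as np


def lds_weights(y, n_bins=50, sigma_bins=2.0):
    """Inverse smoothed-label-density weights for imbalanced regression."""
    hist, edges = np.histogram(y, bins=n_bins, density=True)

    # Smooth the empirical label density with a Gaussian kernel over bins.
    centers = np.arange(n_bins)
    kernel = np.exp(-0.5 * ((centers - centers[:, None]) / sigma_bins) ** 2)
    kernel /= kernel.sum(axis=1, keepdims=True)
    smoothed = kernel @ hist

    # Map each sample to its bin and weight by the inverse smoothed density.
    idx = np.clip(np.digitize(y, edges[1:-1]), 0, n_bins - 1)
    w = 1.0 / np.clip(smoothed[idx], 1e-12, None)
    return w / w.mean()
```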
arXiv Detail & Related papers (2021-02-18T18:56:03Z) - Nonlinear Distribution Regression for Remote Sensing Applications [6.664736150040092]
In many remote sensing applications one wants to estimate variables or parameters of interest from observations.
Standard algorithms such as neural networks, random forests, or Gaussian processes are readily available to relate the two.
This paper introduces a nonlinear (kernel-based) method for distribution regression that solves the previous problems without making any assumption on the statistics of the grouped data.
arXiv Detail & Related papers (2020-12-07T22:04:43Z) - Learning from Aggregate Observations [82.44304647051243]
We study the problem of learning from aggregate observations where supervision signals are given to sets of instances.
We present a general probabilistic framework that accommodates a variety of aggregate observations.
Simple maximum likelihood solutions can be applied to various differentiable models.
arXiv Detail & Related papers (2020-04-14T06:18:50Z)