An Effectiveness Metric for Ordinal Classification: Formal Properties
and Experimental Results
- URL: http://arxiv.org/abs/2006.01245v1
- Date: Mon, 1 Jun 2020 20:35:46 GMT
- Title: An Effectiveness Metric for Ordinal Classification: Formal Properties
and Experimental Results
- Authors: Enrique Amigó, Julio Gonzalo, Stefano Mizzaro, Jorge
Carrillo-de-Albornoz
- Abstract summary: We propose a new metric for Ordinal Classification, the Closeness Evaluation Measure, rooted in Measurement Theory and Information Theory.
Our theoretical analysis and experimental results over both synthetic data and data from NLP shared tasks indicate that the proposed metric captures quality aspects from different traditional tasks simultaneously.
- Score: 9.602361044877426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Ordinal Classification tasks, items have to be assigned to classes that
have a relative ordering, such as positive, neutral, negative in sentiment
analysis. Remarkably, the most popular evaluation metrics for ordinal
classification tasks either ignore relevant information (for instance,
precision/recall on each of the classes ignores their relative ordering) or
assume additional information (for instance, Mean Absolute Error assumes
absolute distances between classes). In this paper we propose a new metric for
Ordinal Classification, the Closeness Evaluation Measure, which is rooted in
Measurement Theory and Information Theory. Our theoretical analysis and
experimental results over both synthetic data and data from NLP shared tasks
indicate that the proposed metric captures quality aspects from different
traditional tasks simultaneously. In addition, it generalizes some popular
classification (nominal scale) and error minimization (interval scale) metrics,
depending on the measurement scale in which it is instantiated.
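To make the proposal concrete, below is a minimal Python sketch of a closeness-based score in the spirit of CEM. The proximity definition used here (half the gold mass of the first class plus the full gold mass of every class between it and the second, under a base-2 log) is one reading of the paper's informational proximity, and the names `proximity` and `cem` are illustrative; consult the paper for the exact definition. Classes are assumed to be consecutive integers on the ordinal scale.

```python
import math
from collections import Counter

def proximity(a, b, gold_counts, n_total):
    """Informational proximity between ordinal classes a and b.
    One reading of the paper: the less probable it is that a gold item
    falls between a and b, the closer the two classes are. Half of
    class a's mass is counted; everything up to b counts in full."""
    lo, hi = min(a, b), max(a, b)
    mass = gold_counts.get(a, 0) / 2.0
    for c in range(lo, hi + 1):
        if c != a:
            mass += gold_counts.get(c, 0)
    return -math.log2(mass / n_total) if mass > 0 else 0.0

def cem(system, gold):
    """Closeness Evaluation Measure (sketch): proximity of system labels
    to gold labels, normalized by the proximity of gold to itself."""
    counts = Counter(gold)
    n = len(gold)
    num = sum(proximity(s, g, counts, n) for s, g in zip(system, gold))
    den = sum(proximity(g, g, counts, n) for g in gold)
    return num / den

# Example: 3-point ordinal scale (0=negative, 1=neutral, 2=positive).
gold = [0, 0, 1, 1, 1, 2, 2]
system = [0, 1, 1, 1, 2, 2, 2]
print(round(cem(system, gold), 3))
```

Because the proximity is computed from the observed gold distribution, instantiating it on a nominal or interval scale is what lets the measure generalize classification and error-minimization metrics, as the abstract notes.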
Related papers
- Improving the classification of extreme classes by means of loss regularisation and generalised beta distributions [8.640930010669042]
We propose a unimodal regularisation approach to improve the classification performance of the first and last classes.
Performance in the extreme classes is compared using a new metric that takes into account their sensitivities.
The results for the proposed metric show that the generalised beta distribution generally improves classification performance in the extreme classes.
arXiv Detail & Related papers (2024-07-17T08:57:42Z)
- $F_β$-plot -- a visual tool for evaluating imbalanced data classifiers [0.0]
The paper proposes a simple approach to analyzing the popular parametric metric $F_\beta$; a minimal sketch of the metric itself follows this entry.
For a given pool of analyzed classifiers, the plot indicates when a given model should be preferred depending on user requirements.
arXiv Detail & Related papers (2024-04-11T18:07:57Z)
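For reference, the parametric metric the plot is built around is the standard $F_\beta$; a minimal sketch (the visual tool itself is not reproduced here):

```python
def f_beta(precision, recall, beta=1.0):
    """F_beta: weighted harmonic mean of precision and recall.
    beta > 1 weights recall more heavily; beta < 1 favors precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Sweeps over beta like this underpin F_beta-plots for imbalanced data.
print([round(f_beta(0.8, 0.4, b), 3) for b in (0.5, 1.0, 2.0)])
```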
- Discordance Minimization-based Imputation Algorithms for Missing Values in Rating Data [4.100928307172084]
When multiple rating lists are combined or considered together, subjects often have missing ratings.
We propose analyses of missing-value patterns using six real-world data sets in various applications.
We propose optimization models and algorithms that minimize the total rating discordance across rating providers.
arXiv Detail & Related papers (2023-11-07T14:42:06Z)
- Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union [113.20223082664681]
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics; the underlying per-class IoU computation is sketched after this entry.
These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing.
Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
arXiv Detail & Related papers (2023-10-30T03:45:15Z)
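For context on the entry above, here is the standard per-class IoU that mIoU averages. The fine-grained variant hinted at in the code comment (averaging per image rather than pooling pixels dataset-wide) is shown as an illustrative assumption, not as the paper's exact definition.

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Per-class intersection-over-union for one pair of label masks.
    Classes absent from both masks are left as NaN and ignored later."""
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious[c] = inter / union
    return ious

# Dataset-wide mIoU pools pixels over all images; a finer-grained variant
# (illustrative) averages per-image IoUs instead, reducing large-object bias.
pred = np.array([[0, 0, 1], [1, 1, 2]])
gt = np.array([[0, 1, 1], [1, 2, 2]])
print(np.nanmean(per_class_iou(pred, gt, num_classes=3)))
```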
- Enriching Disentanglement: From Logical Definitions to Quantitative Metrics [59.12308034729482]
Disentangling the explanatory factors in complex data is a promising approach for data-efficient representation learning.
We establish relationships between logical definitions and quantitative metrics to derive theoretically grounded disentanglement metrics.
We empirically demonstrate the effectiveness of the proposed metrics by isolating different aspects of disentangled representations.
arXiv Detail & Related papers (2023-05-19T08:22:23Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Association Graph Learning for Multi-Task Classification with Category Shifts [68.58829338426712]
We focus on multi-task classification, where related classification tasks share the same label space and are learned simultaneously.
We learn an association graph to transfer knowledge among tasks for missing classes.
Our method consistently performs better than representative baselines.
arXiv Detail & Related papers (2022-10-10T12:37:41Z)
- Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics [64.81682222169113]
System-level correlations quantify how reliably an automatic summarization evaluation metric replicates human judgments of summary quality.
We identify two ways in which the definition of the system-level correlation is inconsistent with how metrics are used to evaluate systems in practice.
arXiv Detail & Related papers (2022-04-21T15:52:14Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first modern, precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Classification Performance Metric for Imbalance Data Based on Recall and Selectivity Normalized in Class Labels [0.0]
We introduce a new performance measure based on the harmonic mean of Recall and Selectivity, normalized in class labels; the plain harmonic mean is sketched after this entry.
This paper shows that the proposed performance measure has the right properties for imbalanced datasets.
arXiv Detail & Related papers (2020-06-23T20:38:48Z)
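For the entry above, a minimal sketch of the harmonic mean of Recall (true-positive rate) and Selectivity (true-negative rate); the paper's additional normalization over class labels is not reproduced here.

```python
def recall_selectivity_hmean(tp, fn, tn, fp):
    """Harmonic mean of Recall (TP rate) and Selectivity (TN rate).
    Unlike accuracy, it stays low when either class is poorly served."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    selectivity = tn / (tn + fp) if tn + fp else 0.0
    if recall + selectivity == 0:
        return 0.0
    return 2 * recall * selectivity / (recall + selectivity)

# Imbalanced example: 95 negatives, 5 positives; 3 positives are caught.
print(round(recall_selectivity_hmean(tp=3, fn=2, tn=90, fp=5), 3))
```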
- Classifier uncertainty: evidence, potential impact, and probabilistic treatment [0.0]
We present an approach to quantify the uncertainty of classification performance metrics based on a probability model of the confusion matrix; an illustrative sketch follows this entry.
We show that uncertainties can be surprisingly large and limit performance evaluation.
arXiv Detail & Related papers (2020-06-19T12:49:19Z)
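One simple way to realize a probability model of the confusion matrix, shown purely as an illustrative assumption rather than the paper's exact model, is a Dirichlet posterior over the four cells of a binary confusion matrix, sampled to propagate counting uncertainty into any derived metric:

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed binary confusion-matrix counts: [TP, FN, FP, TN].
counts = np.array([30, 10, 5, 55], dtype=float)

# Dirichlet posterior over cell probabilities (flat prior), sampled to
# propagate counting uncertainty into a derived metric such as accuracy.
samples = rng.dirichlet(counts + 1.0, size=10000)
accuracy = samples[:, 0] + samples[:, 3]  # P(TP) + P(TN)
lo, hi = np.percentile(accuracy, [2.5, 97.5])
print(f"accuracy 95% credible interval: [{lo:.3f}, {hi:.3f}]")
```

Even at these sample sizes the interval is noticeably wide, which is the entry's point: metric uncertainty can be large enough to limit performance evaluation.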
This list is automatically generated from the titles and abstracts of the papers on this site.