Splitting criteria for ordinal decision trees: an experimental study
- URL: http://arxiv.org/abs/2412.13697v2
- Date: Mon, 17 Feb 2025 18:53:15 GMT
- Title: Splitting criteria for ordinal decision trees: an experimental study
- Authors: Rafael Ayllón-Gavilán, Francisco José Martínez-Estudillo, David Guijo-Rubio, César Hervás-Martínez, Pedro Antonio Gutiérrez
- Abstract summary: Ordinal Classification (OC) is a machine learning field that addresses classification tasks where the labels exhibit a natural order.
OC takes the ordinal relationship into account, producing more accurate and relevant results.
This work conducts an experimental study of tree-based methodologies designed to capture ordinal relationships.
- Score: 6.575723870852787
- Abstract: Ordinal Classification (OC) is a machine learning field that addresses classification tasks where the labels exhibit a natural order. Unlike nominal classification, which treats all classes as equally distinct, OC takes the ordinal relationship into account, producing more accurate and relevant results. This is particularly critical in applications where the magnitude of classification errors has implications. Despite this, OC problems are often tackled using nominal methods, leading to suboptimal solutions. Although decision trees are one of the most popular classification approaches, ordinal tree-based approaches have received less attention when compared to other classifiers. This work conducts an experimental study of tree-based methodologies specifically designed to capture ordinal relationships. A comprehensive survey of ordinal splitting criteria is provided, standardising the notations used in the literature for clarity. Three ordinal splitting criteria, Ordinal Gini (OGini), Weighted Information Gain (WIG), and Ranking Impurity (RI), are compared to the nominal counterparts of the first two (Gini and information gain), by incorporating them into a decision tree classifier. An extensive repository considering 45 publicly available OC datasets is presented, supporting the first experimental comparison of ordinal and nominal splitting criteria using well-known OC evaluation metrics. Statistical analysis of the results highlights OGini as the most effective ordinal splitting criterion to date. Source code, datasets, and results are made available to the research community.
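To make the contrast between nominal and ordinal impurity concrete, here is a minimal sketch. It assumes formulations commonly given in the literature: Ordinal Gini computed from cumulative class proportions, and Ranking Impurity as a sum over class pairs weighted by their label distance. The paper standardises its own notation, which may differ from this, and WIG is left out because its weighting scheme is not described in the abstract. The function names and the toy label lists are hypothetical.

```python
from collections import Counter


def class_counts(labels, n_classes):
    """Count samples per class; classes are assumed to be coded 0..n_classes-1."""
    counts = Counter(labels)
    return [counts.get(q, 0) for q in range(n_classes)]


def gini(labels, n_classes):
    """Nominal Gini impurity: 1 - sum_q p_q^2 (ignores class order)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in class_counts(labels, n_classes))


def ordinal_gini(labels, n_classes):
    """Assumed Ordinal Gini: sum_q F_q * (1 - F_q), with F_q the cumulative proportion."""
    n = len(labels)
    total, cumulative = 0.0, 0.0
    for c in class_counts(labels, n_classes):
        cumulative += c / n
        total += cumulative * (1.0 - cumulative)
    return total


def ranking_impurity(labels, n_classes):
    """Assumed Ranking Impurity: sum over class pairs i < j of (j - i) * n_j * n_i."""
    counts = class_counts(labels, n_classes)
    return sum(
        (j - i) * counts[j] * counts[i]
        for j in range(n_classes)
        for i in range(j)
    )


# Two toy nodes with three ordered classes: one mixes adjacent classes,
# the other mixes the two extreme classes.
near = [0, 0, 1, 1]
far = [0, 0, 2, 2]

print(gini(near, 3), gini(far, 3))
print(ordinal_gini(near, 3), ordinal_gini(far, 3))
print(ranking_impurity(near, 3), ranking_impurity(far, 3))
```

Under these assumed definitions, the nominal Gini impurity cannot tell the two nodes apart, while both ordinal measures score the node mixing the extreme classes as more impure.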
Related papers
- ABCDE: Application-Based Cluster Diff Evals [49.1574468325115]
It aims to be practical: it allows items to have associated importance values that are application-specific, it is frugal in its use of human judgements when determining which clustering is better, and it can report metrics for arbitrary slices of items.
The approach to measuring the delta in the clustering quality is novel: instead of trying to construct an expensive ground truth up front and evaluating each clustering with respect to that, ABCDE samples questions for judgement on the basis of the actual diffs between the clusterings.
arXiv Detail & Related papers (2024-07-31T08:29:35Z)
- Improving the classification of extreme classes by means of loss regularisation and generalised beta distributions [8.640930010669042]
We propose a unimodal regularisation approach to improve the classification performance of the first and last classes.
Performance in the extreme classes is compared using a new metric that takes into account their sensitivities.
The results for the proposed metric show that the generalised beta distribution generally improves classification performance in the extreme classes.
arXiv Detail & Related papers (2024-07-17T08:57:42Z)
- Regularization-Based Methods for Ordinal Quantification [49.606912965922504]
We study the ordinal case, i.e., the case in which a total order is defined on the set of n>2 classes.
We propose a novel class of regularized OQ algorithms, which outperforms existing algorithms in our experiments.
arXiv Detail & Related papers (2023-10-13T16:04:06Z)
- Convolutional and Deep Learning based techniques for Time Series Ordinal Classification [7.047582157120573]
Time Series Ordinal Classification (TSOC) is the field covering this gap, yet unexplored in the literature.
This paper presents a first benchmarking of TSOC methodologies, exploiting the ordering of the target labels to boost the performance of current TSC state-of-the-art.
arXiv Detail & Related papers (2023-06-16T11:57:11Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Ordinal Causal Discovery [2.0305676256390934]
This paper proposes an identifiable ordinal causal discovery method that exploits the ordinal information contained in many real-world applications to uniquely identify the causal structure.
We show that the proposed ordinal causal discovery method has favorable and robust performance compared to state-of-the-art alternative methods in both ordinal categorical and non-categorical data.
arXiv Detail & Related papers (2022-01-19T03:11:26Z)
- Precision-Recall Curve (PRC) Classification Trees [5.503321733964237]
We propose a novel tree-based algorithm based on the area under the precision-recall curve (AUPRC) for variable selection in the classification context.
Our algorithm, named the "Precision-Recall Curve classification tree", or simply the "PRC classification tree", modifies two crucial stages in tree building.
arXiv Detail & Related papers (2020-11-15T22:31:06Z)
- Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes.
We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution.
Our method, termed Forest R-CNN, can serve as a plug-and-play module applicable to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
- Cooperative Bi-path Metric for Few-shot Learning [50.98891758059389]
We make two contributions to investigate the few-shot classification problem.
We report a simple and effective baseline trained on base classes in the way of traditional supervised learning.
We propose a cooperative bi-path metric for classification, which leverages the correlations between base classes and novel classes to further improve the accuracy.
arXiv Detail & Related papers (2020-08-10T11:28:52Z)
- An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results [9.602361044877426]
We propose a new metric for Ordinal Classification, Closeness Evaluation Measure, rooted in Measurement Theory and Information Theory.
Our theoretical analysis and experimental results over both synthetic data and data from NLP shared tasks indicate that the proposed metric captures quality aspects from different traditional tasks simultaneously.
arXiv Detail & Related papers (2020-06-01T20:35:46Z)
- Convolution-Weight-Distribution Assumption: Rethinking the Criteria of Channel Pruning [90.2947802490534]
We find two blind spots in the study of pruning criteria.
The ranks of the filters' Importance Scores are almost identical, resulting in similar pruned structures.
The filters' Importance Scores measured by some pruning criteria are too close to distinguish the network redundancy well.
arXiv Detail & Related papers (2020-04-24T09:54:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.