Related papers: Splitting criteria for ordinal decision trees: an experimental study

URL: http://arxiv.org/abs/2412.13697v2
Date: Mon, 17 Feb 2025 18:53:15 GMT
Title: Splitting criteria for ordinal decision trees: an experimental study
Authors: Rafael Ayllón-Gavilán, Francisco José Martínez-Estudillo, David Guijo-Rubio, César Hervás-Martínez, Pedro Antonio Gutiérrez
Abstract summary: Ordinal Classification (OC) is a machine learning field that addresses classification tasks where the labels exhibit a natural order. OC takes the ordinal relationship into account, producing more accurate and relevant results. This work conducts an experimental study of tree-based methodologies designed to capture ordinal relationships.
Score: 6.575723870852787
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Ordinal Classification (OC) is a machine learning field that addresses classification tasks where the labels exhibit a natural order. Unlike nominal classification, which treats all classes as equally distinct, OC takes the ordinal relationship into account, producing more accurate and relevant results. This is particularly critical in applications where the magnitude of classification errors has implications. Despite this, OC problems are often tackled using nominal methods, leading to suboptimal solutions. Although decision trees are one of the most popular classification approaches, ordinal tree-based approaches have received less attention when compared to other classifiers. This work conducts an experimental study of tree-based methodologies specifically designed to capture ordinal relationships. A comprehensive survey of ordinal splitting criteria is provided, standardising the notations used in the literature for clarity. Three ordinal splitting criteria, Ordinal Gini (OGini), Weighted Information Gain (WIG), and Ranking Impurity (RI), are compared to the nominal counterparts of the first two (Gini and information gain), by incorporating them into a decision tree classifier. An extensive repository considering 45 publicly available OC datasets is presented, supporting the first experimental comparison of ordinal and nominal splitting criteria using well-known OC evaluation metrics. Statistical analysis of the results highlights OGini as the most effective ordinal splitting criterion to date. Source code, datasets, and results are made available to the research community.
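
As an illustration of how an ordinal splitting criterion can differ from its nominal counterpart, the following minimal Python sketch compares the standard Gini impurity with one cumulative-distribution form of an ordinal Gini impurity. The cumulative formulation, the function names, and the toy labels are assumptions made for this sketch; the abstract does not state the paper's exact OGini definition, so this should not be read as the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence


def class_proportions(labels: Sequence[int], n_classes: int) -> List[float]:
    """Relative frequency of each class 0..n_classes-1 in a node."""
    if not labels:
        raise ValueError("a node must contain at least one label")
    counts = Counter(labels)
    return [counts.get(q, 0) / len(labels) for q in range(n_classes)]


def gini(labels: Sequence[int], n_classes: int) -> float:
    """Nominal Gini impurity: 1 - sum_q p_q^2 (ignores class order)."""
    p = class_proportions(labels, n_classes)
    return 1.0 - sum(pq * pq for pq in p)


def ordinal_gini(labels: Sequence[int], n_classes: int) -> float:
    """Assumed ordinal variant: sum over q < Q of F_q * (1 - F_q),
    where F_q is the cumulative proportion of classes up to q."""
    p = class_proportions(labels, n_classes)
    cumulative = 0.0
    total = 0.0
    for pq in p[:-1]:  # the last cumulative value is always 1 and adds 0
        cumulative += pq
        total += cumulative * (1.0 - cumulative)
    return total


# Two hypothetical nodes with three ordered classes (0 < 1 < 2).
adjacent = [0, 0, 1, 1]  # mixes neighbouring classes
distant = [0, 0, 2, 2]   # mixes the two extreme classes

print(gini(adjacent, 3), gini(distant, 3))
print(ordinal_gini(adjacent, 3), ordinal_gini(distant, 3))
```

In this toy case the nominal Gini impurity cannot tell the two nodes apart, while the cumulative variant assigns a higher impurity to the node that mixes distant classes, which is the kind of behaviour an ordinal criterion is meant to capture.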

Related papers

Quantifying Query Fairness Under Unawareness [82.33181164973365]
We introduce a robust fairness estimator based on quantification that effectively handles multiple sensitive attributes beyond binary classifications. Our method outperforms existing baselines across various sensitive attributes and is the first to establish a reliable protocol for measuring fairness under unawareness.
arXiv Detail & Related papers (2025-06-04T16:31:44Z)
Generalized Category Discovery via Reciprocal Learning and Class-Wise Distribution Regularization [6.696520328216944]
Generalized Category Discovery (GCD) aims to identify unlabeled samples by leveraging the base knowledge from labeled ones. Recent parametric-based methods suffer from inferior base discrimination due to unreliable self-supervision. We propose a Reciprocal Learning Framework (RLF) that introduces an auxiliary branch devoted to base classification.
arXiv Detail & Related papers (2025-06-03T00:12:39Z)
OrdRankBen: A Novel Ranking Benchmark for Ordinal Relevance in NLP [6.6002656593260225]
Benchmark datasets play a crucial role in providing standardized testbeds that ensure fair comparisons. Existing NLP ranking benchmarks typically use binary relevance labels or continuous relevance scores, neglecting ordinal relevance scores. We introduce OrdRankBen, a novel benchmark designed to capture multi-granularity relevance distinctions.
arXiv Detail & Related papers (2025-03-02T00:28:55Z)
ABCDE: Application-Based Cluster Diff Evals [49.1574468325115]
ABCDE aims to be practical: it allows items to have associated importance values that are application-specific, it is frugal in its use of human judgements when determining which clustering is better, and it can report metrics for arbitrary slices of items. The approach to measuring the delta in clustering quality is novel: instead of trying to construct an expensive ground truth up front and evaluating each clustering with respect to that, ABCDE samples questions for judgement on the basis of the actual diffs between the clusterings.
arXiv Detail & Related papers (2024-07-31T08:29:35Z)
Improving the classification of extreme classes by means of loss regularisation and generalised beta distributions [8.640930010669042]
We propose a unimodal regularisation approach to improve the classification performance of the first and last classes. Performance in the extreme classes is compared using a new metric that takes into account their sensitivities. The results for the proposed metric show that the generalised beta distribution generally improves classification performance in the extreme classes.
arXiv Detail & Related papers (2024-07-17T08:57:42Z)
Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs). Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations. CIE not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z)
Regularization-Based Methods for Ordinal Quantification [49.606912965922504]
We study the ordinal case, i.e., the case in which a total order is defined on the set of n>2 classes. We propose a novel class of regularized OQ algorithms, which outperforms existing algorithms in our experiments.
arXiv Detail & Related papers (2023-10-13T16:04:06Z)
Bipartite Ranking Fairness through a Model Agnostic Ordering Adjustment [54.179859639868646]
We propose a model agnostic post-processing framework xOrder for achieving fairness in bipartite ranking. xOrder is compatible with various classification models and ranking fairness metrics, including supervised and unsupervised fairness metrics. We evaluate our proposed algorithm on four benchmark data sets and two real-world patient electronic health record repositories.
arXiv Detail & Related papers (2023-07-27T07:42:44Z)
Convolutional and Deep Learning based techniques for Time Series Ordinal Classification [7.047582157120573]
Time Series Ordinal Classification (TSOC) is the field covering this gap, yet unexplored in the literature. This paper presents a first benchmarking of TSOC methodologies, exploiting the ordering of the target labels to boost the performance of current TSC state-of-the-art.
arXiv Detail & Related papers (2023-06-16T11:57:11Z)
Hierarchical confusion matrix for classification performance evaluation [0.0]
We develop the concept of a hierarchical confusion matrix and prove its applicability to all types of hierarchical classification problems. We use measures based on the novel confusion matrix to evaluate models within a benchmark for three real world hierarchical classification applications. The results outline the reasonability of this approach and its usefulness to evaluate hierarchical classification problems.
arXiv Detail & Related papers (2023-06-15T19:31:59Z)
Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples. We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem. We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
Ordinal Causal Discovery [2.0305676256390934]
This paper proposes an identifiable ordinal causal discovery method that exploits the ordinal information contained in many real-world applications to uniquely identify the causal structure. We show that the proposed ordinal causal discovery method has favorable and robust performance compared to state-of-the-art alternative methods in both ordinal categorical and non-categorical data.
arXiv Detail & Related papers (2022-01-19T03:11:26Z)
Precision-Recall Curve (PRC) Classification Trees [5.503321733964237]
We propose a novel tree-based algorithm based on the area under the precision-recall curve (AUPRC) for variable selection in the classification context. Our algorithm, named the "Precision-Recall Curve classification tree", or simply the "PRC classification tree", modifies two crucial stages in tree building.
arXiv Detail & Related papers (2020-11-15T22:31:06Z)
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes. We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution. Our method, termed Forest R-CNN, can serve as a plug-and-play module applicable to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
Cooperative Bi-path Metric for Few-shot Learning [50.98891758059389]
We make two contributions to investigate the few-shot classification problem. We report a simple and effective baseline trained on base classes in the way of traditional supervised learning. We propose a cooperative bi-path metric for classification, which leverages the correlations between base classes and novel classes to further improve the accuracy.
arXiv Detail & Related papers (2020-08-10T11:28:52Z)
Convolutional Ordinal Regression Forest for Image Ordinal Estimation [52.67784321853814]
We propose a novel ordinal regression approach, termed Convolutional Ordinal Regression Forest or CORF, for image ordinal estimation. The proposed CORF integrates ordinal regression and differentiable decision trees with a convolutional neural network for obtaining precise and stable global ordinal relationships. The effectiveness of the proposed CORF is verified on two image ordinal estimation tasks, showing significant improvements and better stability over the state-of-the-art ordinal regression methods.
arXiv Detail & Related papers (2020-08-07T10:41:17Z)
An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results [9.602361044877426]
We propose a new metric for Ordinal Classification, the Closeness Evaluation Measure, rooted in Measurement Theory and Information Theory. Our theoretical analysis and experimental results over both synthetic data and data from NLP shared tasks indicate that the proposed metric captures quality aspects from different traditional tasks simultaneously.
arXiv Detail & Related papers (2020-06-01T20:35:46Z)
Convolution-Weight-Distribution Assumption: Rethinking the Criteria of Channel Pruning [90.2947802490534]
We find two blind spots in the study of pruning criteria. The ranks of filters' Importance Score are almost identical, resulting in similar pruned structures. The filters' Importance Score measured by some pruning criteria are too close to distinguish the network redundancy well.
arXiv Detail & Related papers (2020-04-24T09:54:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.