Cost-Sensitive Stacking: an Empirical Evaluation
- URL: http://arxiv.org/abs/2301.01748v1
- Date: Wed, 4 Jan 2023 18:28:07 GMT
- Title: Cost-Sensitive Stacking: an Empirical Evaluation
- Authors: Natalie Lawrance and Marie-Anne Guerry and George Petrides
- Abstract summary: Cost-sensitive learning adapts classification algorithms to account for differences in misclassification costs.
There is no consensus in the literature as to what cost-sensitive stacking is.
Our experiments, conducted on twelve datasets, show that for best performance, both levels of stacking require cost-sensitive classification decisions.
- Score: 3.867363075280544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many real-world classification problems are cost-sensitive in nature, such
that the misclassification costs vary between data instances. Cost-sensitive
learning adapts classification algorithms to account for differences in
misclassification costs. Stacking is an ensemble method that uses predictions
from several classifiers as the training data for another classifier, which in
turn makes the final classification decision.
While a large body of empirical work exists where stacking is applied in
various domains, very few of these works take the misclassification costs into
account. In fact, there is no consensus in the literature as to what
cost-sensitive stacking is. In this paper we perform extensive experiments with
the aim of establishing what the appropriate setup for a cost-sensitive
stacking ensemble is. Our experiments, conducted on twelve datasets from a
number of application domains, using real, instance-dependent misclassification
costs, show that for best performance, both levels of stacking require
cost-sensitive classification decisions.
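To make the setup concrete, below is a minimal, illustrative sketch (not the paper's exact experimental protocol) of a two-level stacking ensemble in which both the base learners and the meta-learner turn probability estimates into decisions by minimising instance-dependent expected misclassification cost. The synthetic data, the cost vectors `cost_fp`/`cost_fn`, and the choice of scikit-learn models are assumptions for illustration only.

```python
# Illustrative sketch only: cost-sensitive decisions at both stacking levels,
# with made-up instance-dependent costs on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic binary task; cost_fp[i] / cost_fn[i] are hypothetical per-instance
# costs of a false positive / false negative on instance i.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
rng = np.random.default_rng(0)
cost_fp = rng.uniform(1, 5, size=len(y))
cost_fn = rng.uniform(1, 20, size=len(y))

X_tr, X_te, y_tr, y_te, fp_tr, fp_te, fn_tr, fn_te = train_test_split(
    X, y, cost_fp, cost_fn, test_size=0.3, random_state=0)

def cost_sensitive_decision(p1, c_fp, c_fn):
    """Predict 1 iff the expected cost of predicting 1 is lower than that of 0."""
    # Expected cost of predicting 1 is (1 - p1) * c_fp; of predicting 0 is p1 * c_fn.
    return ((1 - p1) * c_fp < p1 * c_fn).astype(int)

# Level 0: base learners supply cost-sensitive decisions as meta-features
# (out-of-fold probabilities on the training set to avoid label leakage).
base_models = [RandomForestClassifier(random_state=0),
               LogisticRegression(max_iter=1000)]
meta_train_cols, meta_test_cols = [], []
for model in base_models:
    p_oof = cross_val_predict(model, X_tr, y_tr, cv=5, method="predict_proba")[:, 1]
    meta_train_cols.append(cost_sensitive_decision(p_oof, fp_tr, fn_tr))
    model.fit(X_tr, y_tr)
    p_te = model.predict_proba(X_te)[:, 1]
    meta_test_cols.append(cost_sensitive_decision(p_te, fp_te, fn_te))

# Level 1: the meta-learner's probabilities are likewise turned into
# cost-sensitive decisions by minimising expected cost per instance.
meta = LogisticRegression().fit(np.column_stack(meta_train_cols), y_tr)
p_meta = meta.predict_proba(np.column_stack(meta_test_cols))[:, 1]
y_pred = cost_sensitive_decision(p_meta, fp_te, fn_te)

# Evaluate by total realised misclassification cost rather than accuracy.
total_cost = np.sum(np.where((y_pred == 1) & (y_te == 0), fp_te, 0.0) +
                    np.where((y_pred == 0) & (y_te == 1), fn_te, 0.0))
print("total misclassification cost:", total_cost)
```

Evaluating by realised cost rather than accuracy is what makes the two cost-sensitive decision steps matter: an accuracy-optimal threshold of 0.5 would ignore the per-instance costs entirely.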
Related papers
- Anomaly Detection using Ensemble Classification and Evidence Theory [62.997667081978825]
We present a novel approach for anomaly detection using ensemble classification and evidence theory.
A pool selection strategy is presented to build a solid ensemble classifier.
Uncertainty estimates are used to identify anomalies.
arXiv Detail & Related papers (2022-12-23T00:50:41Z)
- Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have been shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
arXiv Detail & Related papers (2022-02-07T13:08:11Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class (a minimal sketch is given after this list).
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Cost-Accuracy Aware Adaptive Labeling for Active Learning [9.761953860259942]
In many real settings, different labelers have different labeling costs and can yield different labeling accuracies.
We propose a new algorithm that jointly selects instances and labelers, taking their labeling costs and accuracies into account.
Our proposed algorithm demonstrates state-of-the-art performance on five UCI datasets and a real crowdsourcing dataset.
arXiv Detail & Related papers (2021-05-24T17:21:00Z)
- Cost-Based Budget Active Learning for Deep Learning [0.9732863739456035]
We propose Cost-Based Budget Active Learning (CBAL), which considers classification uncertainty as well as instance diversity in a population constrained by a budget.
A principled min-max approach is used to minimize both the labeling and decision cost of the selected instances.
arXiv Detail & Related papers (2020-12-09T17:42:44Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first precise high-dimensional asymptotic analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Classification with Rejection Based on Cost-sensitive Classification [83.50402803131412]
We propose a novel method of classification with rejection based on ensemble learning.
Experimental results demonstrate the usefulness of our proposed approach in clean, noisy, and positive-unlabeled classification.
arXiv Detail & Related papers (2020-10-22T14:05:05Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
- Misclassification cost-sensitive ensemble learning: A unifying framework [7.90398448280017]
Our contribution is a unifying framework that provides a comprehensive and insightful overview of cost-sensitive ensemble methods.
Our framework contains natural extensions and generalisations of ideas across methods, be it AdaBoost, Bagging or Random Forest.
arXiv Detail & Related papers (2020-07-14T21:18:33Z)
- Global Multiclass Classification and Dataset Construction via Heterogeneous Local Experts [37.27708297562079]
We show how to minimize the number of labelers while ensuring the reliability of the resulting dataset.
Experiments with the MNIST and CIFAR-10 datasets demonstrate the favorable accuracy of our aggregation scheme.
arXiv Detail & Related papers (2020-05-21T18:07:42Z)
- Angle-Based Cost-Sensitive Multicategory Classification [34.174072286426885]
We propose a novel angle-based cost-sensitive classification framework for multicategory classification without the sum-to-zero constraint.
To show the usefulness of the framework, two cost-sensitive multicategory boosting algorithms are derived as concrete instances.
arXiv Detail & Related papers (2020-03-08T00:42:15Z)
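As a companion to the resampling paper listed above, here is a minimal sketch of random over- and undersampling on a synthetic imbalanced dataset; the data, the class ratio, and the plain-NumPy implementation are assumptions for illustration only, not taken from that paper.

```python
# Illustrative sketch: balancing class counts by random over-/undersampling.
import numpy as np
from sklearn.datasets import make_classification

# Hypothetical imbalanced binary dataset (roughly 90% class 0, 10% class 1).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
rng = np.random.default_rng(0)
minority, majority = 1, 0
min_idx = np.flatnonzero(y == minority)
maj_idx = np.flatnonzero(y == majority)

# Oversampling: duplicate minority instances until both classes have equal size.
extra = rng.choice(min_idx, size=len(maj_idx) - len(min_idx), replace=True)
X_over = np.vstack([X, X[extra]])
y_over = np.concatenate([y, y[extra]])

# Undersampling: discard majority instances until both classes have equal size.
keep = np.concatenate([rng.choice(maj_idx, size=len(min_idx), replace=False), min_idx])
X_under, y_under = X[keep], y[keep]

print("oversampled class counts:", np.bincount(y_over))
print("undersampled class counts:", np.bincount(y_under))
```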