Ensemble pruning via an integer programming approach with diversity
constraints
- URL: http://arxiv.org/abs/2205.01088v1
- Date: Mon, 2 May 2022 17:59:11 GMT
- Title: Ensemble pruning via an integer programming approach with diversity
constraints
- Authors: Marcelo Antônio Mendes Bastos, Humberto Brandão César de
Oliveira, Cristiano Arbex Valle
- Abstract summary: In this paper, we consider a binary classification problem and propose an integer programming (IP) approach for selecting optimal classifier subsets.
We also propose constraints to ensure minimum diversity levels in the ensemble.
Our approach yields competitive results when compared to some of the best and most used pruning methods in the literature.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensemble learning combines multiple classifiers in the hope of obtaining
better predictive performance. Empirical studies have shown that ensemble
pruning, that is, choosing an appropriate subset of the available classifiers,
can lead to comparable or better predictions than using all classifiers. In
this paper, we consider a binary classification problem and propose an integer
programming (IP) approach for selecting optimal classifier subsets. We propose
a flexible objective function to adapt to desired criteria of different
datasets. We also propose constraints to ensure minimum diversity levels in the
ensemble. Despite the general case of IP being NP-hard, state-of-the-art
solvers are able to quickly obtain good solutions for datasets with up to 60000
data points. Our approach yields competitive results when compared to some of
the best and most used pruning methods in the literature.
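
The abstract does not state the objective function or the exact form of the diversity constraints, so the following is only a minimal sketch of the general idea: choose a binary selection of classifiers that maximizes majority-vote accuracy, subject to a minimum pairwise disagreement between selected classifiers. The accuracy objective, the disagreement measure, the names `prune_ensemble` and `min_diversity`, and the toy data are all assumptions made for illustration. Exhaustive enumeration stands in for an IP solver here, which is only practical for a handful of classifiers.

```python
from itertools import combinations


def majority_vote_accuracy(preds, labels, subset):
    """Fraction of samples where the majority vote of `subset` matches the label."""
    correct = 0
    for i, y in enumerate(labels):
        votes = sum(preds[j][i] for j in subset)
        # Ties (possible for even-sized subsets) are resolved towards class 0.
        vote = 1 if 2 * votes > len(subset) else 0
        correct += (vote == y)
    return correct / len(labels)


def disagreement(preds, a, b):
    """Fraction of samples on which classifiers a and b predict differently."""
    n = len(preds[a])
    return sum(preds[a][i] != preds[b][i] for i in range(n)) / n


def prune_ensemble(preds, labels, min_diversity=0.0):
    """Search every non-empty subset of classifiers (assumed small ensemble).

    A subset is feasible if every selected pair disagrees on at least
    `min_diversity` of the samples (an assumed diversity constraint).
    Returns the first feasible subset with the highest accuracy.
    """
    m = len(preds)
    best_subset, best_acc = None, -1.0
    for k in range(1, m + 1):
        for subset in combinations(range(m), k):
            pairs = combinations(subset, 2)
            if any(disagreement(preds, a, b) < min_diversity for a, b in pairs):
                continue  # violates the diversity constraint
            acc = majority_vote_accuracy(preds, labels, subset)
            if acc > best_acc:
                best_subset, best_acc = subset, acc
    return best_subset, best_acc


# Toy data (hypothetical): 4 classifiers, 6 samples, binary labels.
labels = [1, 0, 1, 1, 0, 0]
preds = [
    [1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

unconstrained = prune_ensemble(preds, labels)
constrained = prune_ensemble(preds, labels, min_diversity=0.5)
print(unconstrained, constrained)
```

On this toy data the diversity threshold changes which subset is selected, which illustrates how such constraints trade accuracy on the pruning set against diversity. For realistic ensemble sizes, the same selection variables and constraints would be handed to an integer programming solver instead of being enumerated.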
Related papers
- An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to capture potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
- Characterizing the Optimal 0-1 Loss for Multi-class Classification with a Test-time Attacker [57.49330031751386]
We find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset.
We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints.
arXiv Detail & Related papers (2023-02-21T15:17:13Z)
- An Evolutionary Approach for Creating of Diverse Classifier Ensembles [11.540822622379176]
We propose a framework for classifier selection and fusion based on a four-step protocol called CIF-E.
We implement and evaluate 24 varied ensemble approaches following the proposed CIF-E protocol.
Experiments show that the proposed evolutionary approach can outperform the state-of-the-art literature approaches in many well-known UCI datasets.
arXiv Detail & Related papers (2022-08-23T14:23:27Z)
- Ensemble Classifier Design Tuned to Dataset Characteristics for Network Intrusion Detection [0.0]
Two new algorithms are proposed to address the class overlap issue in the dataset.
The proposed design is evaluated for both binary and multi-category classification.
arXiv Detail & Related papers (2022-05-08T21:06:42Z)
- Set-valued prediction in hierarchical classification with constrained representation complexity [4.258263831866309]
We focus on hierarchical multi-class classification problems, where valid sets correspond to internal nodes of the hierarchy.
We propose three methods and evaluate them on benchmark datasets.
arXiv Detail & Related papers (2022-03-13T15:13:19Z)
- Gated recurrent units and temporal convolutional network for multilabel classification [122.84638446560663]
This work proposes a new ensemble method for managing multilabel classification.
The core of the proposed approach combines a set of gated recurrent units and temporal convolutional neural networks trained with variants of the Adam gradient optimization approach.
arXiv Detail & Related papers (2021-10-09T00:00:16Z)
- Scalable Optimal Classifiers for Adversarial Settings under Uncertainty [10.90668635921398]
We consider the problem of finding optimal classifiers in an adversarial setting where the class-1 data is generated by an attacker whose objective is not known to the defender.
We show that this low-dimensional characterization enables the development of a training method to compute provably approximately optimal classifiers in a scalable manner.
arXiv Detail & Related papers (2021-06-28T13:33:53Z)
- Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification [94.55805516167369]
We propose a new approach for binary classification from $m$ U-sets for $m \ge 2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
arXiv Detail & Related papers (2021-02-01T07:36:38Z)
- Global Multiclass Classification and Dataset Construction via Heterogeneous Local Experts [37.27708297562079]
We show how to minimize the number of labelers while ensuring the reliability of the resulting dataset.
Experiments with the MNIST and CIFAR-10 datasets demonstrate the favorable accuracy of our aggregation scheme.
arXiv Detail & Related papers (2020-05-21T18:07:42Z)
- Ranking a set of objects: a graph based least-square approach [70.7866286425868]
We consider the problem of ranking $N$ objects starting from a set of noisy pairwise comparisons provided by a crowd of equal workers.
We propose a class of non-adaptive ranking algorithms that rely on a least-squares intrinsic optimization criterion for the estimation of qualities.
arXiv Detail & Related papers (2020-02-26T16:19:09Z)
- Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.