Diversity-Aware Weighted Majority Vote Classifier for Imbalanced Data
- URL: http://arxiv.org/abs/2004.07605v1
- Date: Thu, 16 Apr 2020 11:27:50 GMT
- Title: Diversity-Aware Weighted Majority Vote Classifier for Imbalanced Data
- Authors: Anil Goyal and Jihed Khiari
- Abstract summary: We propose a diversity-aware ensemble learning based algorithm, DAMVI, to deal with imbalanced binary classification tasks.
We show the efficiency of the proposed approach with respect to state-of-the-art models on a predictive maintenance task, credit card fraud detection, webpage classification and medical applications.
- Score: 1.2944868613449219
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a diversity-aware ensemble learning based
algorithm, referred to as DAMVI, to deal with imbalanced binary classification
tasks. Specifically, after learning base classifiers, the algorithm i)
increases the weights of positive examples (minority class) which are "hard" to
classify with uniformly weighted base classifiers; and ii) then learns weights
over base classifiers by optimizing the PAC-Bayesian C-Bound that takes into
account the accuracy and diversity between the classifiers. We show the efficiency
of the proposed approach with respect to state-of-the-art models on a predictive
maintenance task, credit card fraud detection, webpage classification and
medical applications.
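As a rough illustration of step ii), the sketch below computes the empirical C-Bound of a weighted majority vote, using the standard form 1 - (first moment of the margin)^2 / (second moment of the margin). It is not the authors' implementation: the function name `empirical_c_bound`, the vote matrix layout, and the optional example weights `d` (a stand-in for the up-weighting of hard positive examples in step i) are assumptions made for this example, and the abstract does not specify the reweighting rule or the optimizer.

```python
import numpy as np


def empirical_c_bound(votes, y, q, d=None):
    """Empirical C-Bound of a q-weighted majority vote (illustrative sketch).

    votes : array of shape (n_classifiers, n_examples), entries in {-1, +1}
    y     : array of shape (n_examples,), labels in {-1, +1}
    q     : array of shape (n_classifiers,), non-negative weights summing to 1
    d     : optional example weights (hypothetical hook for step i);
            defaults to uniform and is normalized to sum to 1
    """
    votes = np.asarray(votes, dtype=float)
    y = np.asarray(y, dtype=float)
    q = np.asarray(q, dtype=float)
    n = y.shape[0]
    d = np.full(n, 1.0 / n) if d is None else np.asarray(d, dtype=float)
    d = d / d.sum()

    # Margin of the weighted vote on each example, in [-1, 1].
    margins = y * (q @ votes)
    first_moment = np.sum(d * margins)
    second_moment = np.sum(d * margins ** 2)

    # The bound is only meaningful when the average margin is positive.
    if first_moment <= 0:
        raise ValueError("C-Bound requires a positive first moment of the margin")
    return 1.0 - first_moment ** 2 / second_moment


# Three base classifiers, each wrong on a different example.
y = np.array([1, 1, -1, -1])
votes = np.array([
    [1, 1, -1, 1],
    [1, -1, -1, -1],
    [-1, 1, -1, -1],
])
q = np.full(3, 1.0 / 3)
print(empirical_c_bound(votes, y, q))  # approximately 0.25
```

In this toy case every base classifier errs on a different example, so the vote is diverse and the bound is low. Minimizing this quantity over `q` favors sets of classifiers that are both accurate and diverse.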
Related papers
- Bipartite Ranking Fairness through a Model Agnostic Ordering Adjustment [54.179859639868646]
We propose a model agnostic post-processing framework xOrder for achieving fairness in bipartite ranking.
xOrder is compatible with various classification models and ranking fairness metrics, including supervised and unsupervised fairness metrics.
We evaluate our proposed algorithm on four benchmark data sets and two real-world patient electronic health record repositories.
arXiv Detail & Related papers (2023-07-27T07:42:44Z) - Anomaly Detection using Ensemble Classification and Evidence Theory [62.997667081978825]
We present a novel approach for anomaly detection using ensemble classification and evidence theory.
A pool selection strategy is presented to build a solid ensemble classifier.
We use uncertainty for the anomaly detection approach.
arXiv Detail & Related papers (2022-12-23T00:50:41Z) - Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - Ensemble Classifier Design Tuned to Dataset Characteristics for Network
Intrusion Detection [0.0]
Two new algorithms are proposed to address the class overlap issue in the dataset.
The proposed design is evaluated for both binary and multi-category classification.
arXiv Detail & Related papers (2022-05-08T21:06:42Z) - Learning-From-Disagreement: A Model Comparison and Visual Analytics
Framework [21.055845469999532]
We propose a learning-from-disagreement framework to visually compare two classification models.
Specifically, we train a discriminator to learn from the disagreed instances.
We interpret the trained discriminator with the SHAP values of different meta-features.
arXiv Detail & Related papers (2022-01-19T20:15:35Z) - BALanCe: Deep Bayesian Active Learning via Equivalence Class Annealing [7.9107076476763885]
BALanCe is a deep active learning framework that mitigates the effect of uncertainty estimates.
Batch-BALanCe is a generalization of the sequential algorithm to the batched setting.
We show that Batch-BALanCe achieves state-of-the-art performance on several benchmark datasets for active learning.
arXiv Detail & Related papers (2021-12-27T15:38:27Z) - Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z) - No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z) - SetConv: A New Approach for Learning from Imbalanced Data [29.366843553056594]
We propose a set convolution operation and an episodic training strategy to extract a single representative for each class.
We prove that our proposed algorithm is permutation-invariant with respect to the order of inputs.
arXiv Detail & Related papers (2021-04-03T22:33:30Z) - Unbiased Subdata Selection for Fair Classification: A Unified Framework and Scalable Algorithms [0.8376091455761261]
We show that many classification models within this framework can be recast as mixed-integer convex programs.
We then show that in the proposed problem, when the classification outcomes are known, the resulting problem, "unbiased subdata selection," is strongly polynomial-solvable.
This motivates us to develop an iterative refining strategy (IRS) to solve the classification instances.
arXiv Detail & Related papers (2020-12-22T21:09:38Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model-agnostic post-processing framework for balancing ranking fairness and algorithm utility in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)