ASE: Anomaly Scoring Based Ensemble Learning for Imbalanced Datasets
- URL: http://arxiv.org/abs/2203.10769v2
- Date: Tue, 22 Mar 2022 03:55:30 GMT
- Title: ASE: Anomaly Scoring Based Ensemble Learning for Imbalanced Datasets
- Authors: Xiayu Liang, Ying Gao, Shanrong Xu
- Abstract summary: We propose a bagging ensemble learning framework based on an anomaly detection scoring system.
We show that our ensemble learning model can dramatically improve the performance of base estimators.
- Score: 3.214208422566496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, many industries, such as finance, medicine, and
manufacturing, apply classification algorithms to help solve problems in their
business. However, in real-life scenarios, positive examples make up only a
small fraction of all instances, and the resulting high imbalance ratio leads
to poor performance of existing classification models. To solve this problem,
we propose a bagging ensemble learning framework based on an anomaly detection
scoring system. We show that our ensemble learning model can dramatically
improve the performance of base estimators (e.g. Decision Tree, Multilayer
Perceptron, KNN) and is more efficient than other existing methods across a
wide range of imbalance ratios, data scales and data dimensions.
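The abstract does not spell out how the anomaly scores drive the bagging procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm: it assumes (hypothetically) that IsolationForest anomaly scores weight the sampling of majority-class examples when building each balanced bag. All function names here are illustrative.

```python
# Hypothetical sketch of an anomaly-scoring-guided bagging ensemble.
# The exact ASE algorithm is not specified in the abstract; here,
# IsolationForest scores rank majority-class samples, and each bag
# pairs all minority samples with a score-weighted majority subsample.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest
from sklearn.tree import DecisionTreeClassifier

def fit_ase_like_ensemble(X, y, n_estimators=10, seed=0):
    rng = np.random.default_rng(seed)
    minority, majority = X[y == 1], X[y == 0]
    # score_samples is high for typical points; negate so that a higher
    # value means the majority-class point looks more anomalous.
    scores = -IsolationForest(random_state=seed).fit(majority).score_samples(majority)
    weights = scores / scores.sum()  # sampling probabilities for each bag
    estimators = []
    for _ in range(n_estimators):
        idx = rng.choice(len(majority), size=len(minority), replace=True, p=weights)
        Xb = np.vstack([minority, majority[idx]])
        yb = np.concatenate([np.ones(len(minority)), np.zeros(len(minority))])
        estimators.append(DecisionTreeClassifier(random_state=seed).fit(Xb, yb))
    return estimators

def predict(estimators, X):
    # Aggregate base estimators by majority vote.
    votes = np.mean([est.predict(X) for est in estimators], axis=0)
    return (votes >= 0.5).astype(int)

X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
ens = fit_ase_like_ensemble(X, y)
recall = (predict(ens, X)[y == 1] == 1).mean()
```

Each bag pairs every minority sample with an equally sized, score-weighted majority subsample, so atypical majority points are resampled more often; whether the actual method up- or down-weights such points is an assumption here.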
Related papers
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion: adaptively setting the label-smoothing value during training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z) - On Improving the Algorithm-, Model-, and Data- Efficiency of Self-Supervised Learning [18.318758111829386]
We propose an efficient single-branch SSL method based on non-parametric instance discrimination.
We also propose a novel self-distillation loss that minimizes the KL divergence between the probability distribution and its square root version.
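The square-rooted self-distillation loss summarized above can be written down directly; this is a minimal numerical sketch, assuming the loss is KL(p || q) with q the renormalized element-wise square root of p (the summary leaves the exact KL direction unspecified).

```python
# Hypothetical sketch of the self-distillation loss summarized above:
# KL divergence between a probability vector p and its renormalized
# element-wise square root, a flattened version of p.
import numpy as np

def sqrt_self_distill_loss(p, eps=1e-12):
    q = np.sqrt(p)
    q = q / q.sum()            # renormalize sqrt(p) into a distribution
    return np.sum(p * np.log((p + eps) / (q + eps)))  # KL(p || q)

p = np.array([0.7, 0.2, 0.1])
loss = sqrt_self_distill_loss(p)
```

For a uniform p the square root is again uniform, so the loss vanishes; the sharper p is, the larger the penalty.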
arXiv Detail & Related papers (2024-04-30T06:39:04Z) - Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
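The summary gives no details of the iterative generation scheme, so the following is only a sketch of the core mixing step, assuming synthetic points are convex combinations of one minority and one majority sample with coefficients biased toward the minority side; the function name and the alpha parameter are illustrative, not the paper's.

```python
# Hypothetical sketch of the minority-majority mixing idea described above.
# Synthetic samples are convex combinations of one minority and one
# majority point, drawn closer to the minority side.
import numpy as np

def mix_minority_majority(X_min, X_maj, n_new, alpha=0.75, seed=0):
    rng = np.random.default_rng(seed)
    i = rng.integers(len(X_min), size=n_new)
    j = rng.integers(len(X_maj), size=n_new)
    # Mixing coefficients biased toward the minority sample (lam >= alpha).
    lam = rng.uniform(alpha, 1.0, size=(n_new, 1))
    return lam * X_min[i] + (1 - lam) * X_maj[j]

X_min = np.array([[0.0, 0.0], [0.2, 0.1]])
X_maj = np.array([[1.0, 1.0], [0.9, 1.2], [1.1, 0.8]])
X_new = mix_minority_majority(X_min, X_maj, n_new=5)
```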
arXiv Detail & Related papers (2023-08-28T18:48:34Z) - Machine Learning Based Missing Values Imputation in Categorical Datasets [2.5611256859404983]
This research investigated the use of machine learning algorithms to impute missing values in categorical datasets.
The emphasis was on ensemble models constructed using the Error Correction Output Codes framework.
Despite these encouraging results, deep learning for missing data imputation still faces obstacles, including the requirement for large amounts of labeled data.
arXiv Detail & Related papers (2023-06-10T03:29:48Z) - Parameterized Neural Networks for Finance [0.0]
We discuss and analyze a neural network architecture that enables learning a model class for a set of different data samples.
We apply the approach to one of the standard problems asset managers and banks are facing: the calibration of spread curves.
arXiv Detail & Related papers (2023-04-18T10:18:28Z) - A review of ensemble learning and data augmentation models for class imbalanced problems: combination, implementation and evaluation [0.196629787330046]
Class imbalance (CI) in classification problems arises when the number of observations belonging to one class is substantially lower than that of the other classes.
In this paper, we evaluate data augmentation and ensemble learning methods used to address prominent benchmark CI problems.
arXiv Detail & Related papers (2023-04-06T04:37:10Z) - Deep Negative Correlation Classification [82.45045814842595]
Existing deep ensemble methods naively train many different models and then aggregate their predictions.
We propose deep negative correlation classification (DNCC)
DNCC yields a deep classification ensemble where the individual estimator is both accurate and negatively correlated.
arXiv Detail & Related papers (2022-12-14T07:35:20Z) - DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z) - On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z) - Long-Tailed Recognition Using Class-Balanced Experts [128.73438243408393]
We propose an ensemble of class-balanced experts that combines the strength of diverse classifiers.
Our ensemble of class-balanced experts reaches results close to state-of-the-art and an extended ensemble establishes a new state-of-the-art on two benchmarks for long-tailed recognition.
arXiv Detail & Related papers (2020-04-07T20:57:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.