Noise-Resilient Ensemble Learning using Evidence Accumulation Clustering
- URL: http://arxiv.org/abs/2110.09212v1
- Date: Mon, 18 Oct 2021 11:52:45 GMT
- Title: Noise-Resilient Ensemble Learning using Evidence Accumulation Clustering
- Authors: Gaëlle Candel, David Naccache
- Abstract summary: Ensemble learning methods combine multiple algorithms performing the same task to build a group with superior quality.
These systems are well adapted to the distributed setup, where each peer or machine of the network hosts one algorithm and communicates its results to its peers.
However, the network can be corrupted, altering the prediction accuracy of a peer, which has a deleterious effect on the ensemble quality.
We propose a noise-resilient ensemble classification method, which helps to improve accuracy and correct random errors.
- Score: 1.7188280334580195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensemble Learning methods combine multiple algorithms performing the same
task to build a group with superior quality. These systems are well adapted to
the distributed setup, where each peer or machine of the network hosts one
algorithm and communicates its results to its peers. Ensemble learning methods
are naturally resilient to the absence of several peers thanks to the ensemble
redundancy. However, the network can be corrupted, altering the prediction
accuracy of a peer, which has a deleterious effect on the ensemble quality. In
this paper, we propose a noise-resilient ensemble classification method, which
helps to improve accuracy and correct random errors. The approach is inspired
by Evidence Accumulation Clustering, adapted to classification ensembles. We
compared it to the naive voter model over four multi-class datasets. Our model
showed greater resilience, allowing us to recover predictions under a very
high noise level. In addition, as the method is based on evidence
accumulation clustering, it is highly flexible and can combine classifiers
with different label definitions.
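As a rough illustration of the general idea only (not the authors' exact algorithm), the sketch below builds a co-association matrix from several peers' noisy label predictions, as in evidence accumulation clustering, and recovers consensus groups from it by average-link hierarchical clustering. The noise model, peer weighting, and handling of differing label definitions described in the paper are not reproduced; all names and parameters are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def coassociation_matrix(predictions):
    """Fraction of peers assigning each pair of samples the same label.

    predictions: array of shape (n_peers, n_samples) of integer labels.
    Only co-assignment matters, so peers may use different label vocabularies.
    """
    preds = np.asarray(predictions)
    co = np.zeros((preds.shape[1], preds.shape[1]))
    for p in preds:
        co += (p[:, None] == p[None, :])
    return co / preds.shape[0]


def consensus_groups(co, n_groups):
    """Cluster the co-association matrix into consensus groups (average link)."""
    dist = squareform(1.0 - co, checks=False)  # co-association -> condensed distance
    return fcluster(linkage(dist, method="average"), t=n_groups, criterion="maxclust")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true = np.repeat([0, 1, 2], 30)                    # 3 classes, 90 samples
    peers = np.tile(true, (7, 1))                      # 7 peers predicting
    noise = rng.random(peers.shape) < 0.3              # corrupt 30% of predictions
    peers[noise] = rng.integers(0, 3, noise.sum())
    groups = consensus_groups(coassociation_matrix(peers), n_groups=3)
    print(groups[:10], groups[30:40], groups[60:70])   # groups track the true classes
```

To turn the consensus groups back into class predictions, one would still map each group to a label, for instance by majority vote over the peers' outputs inside that group; that step, and the comparison against the naive voter model, are left out of this sketch.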
Related papers
- Graph-based Active Learning for Entity Cluster Repair [1.7453520331111723]
Cluster repair methods aim to determine errors in clusters and modify them so that each cluster consists of records representing the same entity.
Current cluster repair methodologies assume duplicate-free data sources, where each record from one source corresponds to a unique record from another.
Recent approaches apply clustering methods in combination with link categorization methods so they can be applied to data sources with duplicates.
We introduce a novel approach for cluster repair that utilizes graph metrics derived from the underlying similarity graphs.
arXiv Detail & Related papers (2024-01-26T16:42:49Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Anomaly Detection using Ensemble Classification and Evidence Theory [62.997667081978825]
We present a novel approach for anomaly detection using ensemble classification and evidence theory (a toy sketch of evidence combination follows this list).
A pool selection strategy is presented to build a solid ensemble classifier.
We use uncertainty for the anomaly detection approach.
arXiv Detail & Related papers (2022-12-23T00:50:41Z) - Overlapping oriented imbalanced ensemble learning method based on
projective clustering and stagewise hybrid sampling [22.32930261633615]
This paper proposes an ensemble learning algorithm based on dual clustering and stage-wise hybrid sampling (DCSHS)
The major advantage of our algorithm is that it can exploit the intersectionality of the CCS to realize the soft elimination of overlapping majority samples.
arXiv Detail & Related papers (2022-11-30T01:49:06Z) - Neural Active Learning on Heteroskedastic Distributions [29.01776999862397]
We demonstrate the catastrophic failure of active learning algorithms on heteroskedastic datasets.
We propose a new algorithm that incorporates a model difference scoring function for each data point to filter out the noisy examples and sample clean examples.
arXiv Detail & Related papers (2022-11-02T07:30:19Z) - Ensemble Classifier Design Tuned to Dataset Characteristics for Network
Intrusion Detection [0.0]
Two new algorithms are proposed to address the class overlap issue in the dataset.
The proposed design is evaluated for both binary and multi-category classification.
arXiv Detail & Related papers (2022-05-08T21:06:42Z) - Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art (a minimal nearest-prototype sketch follows this list).
arXiv Detail & Related papers (2021-10-22T01:55:01Z) - Unsupervised Clustered Federated Learning in Complex Multi-source
Acoustic Environments [75.8001929811943]
We introduce a realistic and challenging multi-source and multi-room acoustic environment.
We present an improved clustering control strategy that takes into account the variability of the acoustic scene.
The proposed approach is optimized using clustering-based measures and validated via a network-wide classification task.
arXiv Detail & Related papers (2021-06-07T14:51:39Z) - SetConv: A New Approach for Learning from Imbalanced Data [29.366843553056594]
We propose a set convolution operation and an episodic training strategy to extract a single representative for each class.
We prove that our proposed algorithm is invariant to the order of its inputs (permutation-invariant).
arXiv Detail & Related papers (2021-04-03T22:33:30Z) - Active Hybrid Classification [79.02441914023811]
This paper shows how crowds and machines can support each other in tackling classification problems.
We propose an architecture that orchestrates active learning and crowd classification and combines them in a virtuous cycle.
arXiv Detail & Related papers (2021-01-21T21:09:07Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking
Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
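For the "Anomaly Detection using Ensemble Classification and Evidence Theory" entry above, here is a minimal sketch of one evidence-theoretic ingredient, Dempster's rule of combination, applied to two ensemble members' mass functions; the class names and mass values are invented, and the paper's pool-selection strategy and uncertainty handling are not modelled.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions given as dicts {frozenset(hypotheses): mass}."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb                      # mass on disjoint hypotheses
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {h: m / (1.0 - conflict) for h, m in combined.items()}  # normalise


# Hypothetical masses from two ensemble members over the frame {normal, anomaly}.
m1 = {frozenset({"anomaly"}): 0.6, frozenset({"normal", "anomaly"}): 0.4}
m2 = {frozenset({"anomaly"}): 0.5, frozenset({"normal"}): 0.2,
      frozenset({"normal", "anomaly"}): 0.3}
print(dempster_combine(m1, m2))  # combined belief concentrates on {"anomaly"}
```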
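For the "Prototypical Classifier for Robust Class-Imbalanced Learning" entry, here is a minimal sketch of nearest-prototype prediction: each class prototype is the mean embedding of that class's training points, and test points are assigned to the closest prototype. The paper's embedding network is assumed and replaced here by precomputed feature vectors; training and calibration details are omitted.

```python
import numpy as np


def class_prototypes(features, labels):
    """One prototype per class: the mean embedding of that class's points."""
    classes = np.unique(labels)
    return classes, np.stack([features[labels == c].mean(axis=0) for c in classes])


def predict_nearest_prototype(features, classes, prototypes):
    """Assign each point to the class whose prototype is nearest (Euclidean)."""
    dists = np.linalg.norm(features[:, None, :] - prototypes[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.concatenate([rng.normal(0, 1, (50, 8)), rng.normal(3, 1, (5, 8))])
    y = np.array([0] * 50 + [1] * 5)                 # deliberately class-imbalanced
    classes, protos = class_prototypes(X, y)
    print((predict_nearest_prototype(X, classes, protos) == y).mean())
```

Since the minority-class prototype is estimated only from that class's own points rather than from a boundary dominated by the majority class, predictions stay comparable across classes, which is roughly the intuition behind the balanced predictions the entry describes.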
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.