Random Walk-steered Majority Undersampling
- URL: http://arxiv.org/abs/2109.12423v1
- Date: Sat, 25 Sep 2021 19:07:41 GMT
- Title: Random Walk-steered Majority Undersampling
- Authors: Payel Sadhukhan, Arjun Pakrashi, Brian Mac Namee
- Abstract summary: We propose Random Walk-steered Majority Undersampling (RWMaU).
RWMaU undersamples the majority points of a class-imbalanced dataset in order to balance the classes.
Empirical evaluation on 21 datasets with 3 classifiers demonstrates a substantial performance improvement of RWMaU over competing methods.
- Score: 10.227026799075215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we propose Random Walk-steered Majority Undersampling (RWMaU), which undersamples the majority points of a class-imbalanced dataset in order to balance the classes. Rather than merely marking the majority points that fall within the neighborhood of a few minority points, we are interested in perceiving the closeness of the majority points to the minority class as a whole. Random walk, a powerful tool for perceiving the proximities of connected points in a graph, is used to identify the majority points that lie close to the minority class of a class-imbalanced dataset. The visit frequencies and the order of visits of the majority points in the walks let us perceive an overall closeness of the majority points to the minority class. The majority points lying closest to the minority class are subsequently undersampled. Empirical evaluation on 21 datasets with 3 classifiers demonstrates a substantial performance improvement of RWMaU over competing methods.
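The abstract describes a concrete procedure: build a graph over the points, run random walks from minority points, score majority points by how often and how early the walks visit them, and undersample the top-scoring majority points. Below is a minimal, illustrative Python sketch of that idea, not the authors' implementation: the k-NN graph construction, the walk length, and the 1/(step+1) visit-order weighting are all assumptions made for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def rwmau_sketch(X, y, minority_label, k=5, n_walks=10, walk_len=20, seed=0):
    """Illustrative random-walk-guided majority undersampling.

    Scores each majority point by how often, and how early, random walks
    started from minority points visit it; the top-scoring majority points
    are then dropped until the two classes are balanced.
    """
    rng = np.random.default_rng(seed)
    # k-NN graph over all points (an assumed construction; the paper's
    # exact graph and transition probabilities may differ).
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)[1][:, 1:]

    minority_idx = np.flatnonzero(y == minority_label)
    majority_idx = np.flatnonzero(y != minority_label)

    scores = np.zeros(len(X))
    for start in minority_idx:
        for _ in range(n_walks):
            node = start
            for step in range(walk_len):
                node = rng.choice(nbrs[node])  # hop to a random neighbor
                if y[node] != minority_label:
                    # Visit frequency, weighted so that earlier visits count
                    # more (one plausible use of visit order; assumed here).
                    scores[node] += 1.0 / (step + 1)

    # Undersample: drop the majority points perceived closest to the minority
    # class until the class sizes match.
    n_remove = max(0, len(majority_idx) - len(minority_idx))
    ranked = majority_idx[np.argsort(scores[majority_idx])[::-1]]
    keep = np.setdiff1d(np.arange(len(X)), ranked[:n_remove])
    return X[keep], y[keep]
```

As a quick check, feeding the sketch an imbalanced dataset (e.g. sklearn's make_classification with weights=[0.9, 0.1]) returns a resampled (X, y) with equal class counts, with the removed majority points being those the walks reached most readily from the minority class.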
Related papers
- Confronting Discrimination in Classification: Smote Based on Marginalized Minorities in the Kernel Space for Imbalanced Data [0.0]
  We propose a novel classification oversampling approach based on the decision boundary and sample proximity relationships.
  We test the proposed method on a classic financial fraud dataset.
  arXiv Detail & Related papers (2024-02-13T04:03:09Z)
- Exploring Vacant Classes in Label-Skewed Federated Learning [113.65301899666645]
  Label skews, characterized by disparities in local label distribution across clients, pose a significant challenge in federated learning.
  This paper introduces FedVLS, a novel approach to label-skewed federated learning that integrates vacant-class distillation and logit suppression simultaneously.
  arXiv Detail & Related papers (2024-01-04T16:06:31Z)
- Adversarial Reweighting Guided by Wasserstein Distance for Bias Mitigation [24.160692009892088]
  Under-representation of minorities in the data makes the disparate treatment of subpopulations difficult to deal with during learning.
  We propose a novel adversarial reweighting method to address such representation bias.
  arXiv Detail & Related papers (2023-11-21T15:46:11Z)
- Generative Oversampling for Imbalanced Data via Majority-Guided VAE [15.93867386081279]
  We propose a novel over-sampling model, called Majority-Guided VAE (MGVAE), which generates new minority samples under the guidance of a majority-based prior.
  In this way, the newly generated minority samples can inherit the diversity and richness of the majority ones, thus mitigating overfitting in downstream tasks.
  arXiv Detail & Related papers (2023-02-14T06:35:23Z)
- Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels [87.48541631675889]
  We propose a two-stage clean samples identification method.
  First, we employ a class-level feature clustering procedure for the early identification of clean samples.
  Second, for the remaining clean samples that are close to the ground-truth class boundary, we propose a novel consistency-based classification method.
  arXiv Detail & Related papers (2022-07-29T04:54:57Z)
- Few-shot Forgery Detection via Guided Adversarial Interpolation [56.59499187594308]
  Existing forgery detection methods suffer from significant performance drops when applied to unseen novel forgery approaches.
  We propose Guided Adversarial Interpolation (GAI) to overcome the few-shot forgery detection problem.
  Our method is validated to be robust to choices of majority and minority forgery approaches.
  arXiv Detail & Related papers (2022-04-12T16:05:10Z)
- Relieving Long-tailed Instance Segmentation via Pairwise Class Balance [85.53585498649252]
  Long-tailed instance segmentation is a challenging task due to the extreme imbalance of training samples among classes.
  This imbalance causes severe biases of the head classes (with majority samples) against the tailed ones.
  We propose a novel Pairwise Class Balance (PCB) method, built upon a confusion matrix that is updated during training to accumulate the ongoing prediction preferences.
  arXiv Detail & Related papers (2022-01-08T07:48:36Z)
- The Majority Can Help The Minority: Context-rich Minority Oversampling for Long-tailed Classification [20.203461156516937]
  We propose a novel minority over-sampling method to augment diversified minority samples.
  Our key idea is to paste a foreground patch from a minority class onto a background image from a majority class having affluent contexts.
  Our method achieves state-of-the-art performance on various long-tailed classification benchmarks.
  arXiv Detail & Related papers (2021-12-01T10:58:30Z)
- Counterfactual-based minority oversampling for imbalanced classification [11.140929092818235]
  A key challenge of oversampling in imbalanced classification is that the generation of new minority samples often neglects the usage of majority classes.
  We present a new oversampling framework based on counterfactual theory.
  arXiv Detail & Related papers (2020-08-21T14:13:15Z)
- Contrastive Examples for Addressing the Tyranny of the Majority [83.93825214500131]
  We propose to create a balanced training dataset, consisting of the original dataset plus new data points in which the group memberships are intervened.
  We show that current generative adversarial networks are a powerful tool for learning these data points, called contrastive examples.
  arXiv Detail & Related papers (2020-04-14T14:06:44Z)
- M2m: Imbalanced Classification via Major-to-minor Translation [79.09018382489506]
  In most real-world scenarios, labeled training datasets are highly class-imbalanced, and deep neural networks struggle to generalize to a balanced testing criterion.
  In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples from more-frequent classes.
  Our experimental results on a variety of class-imbalanced datasets show that the proposed method significantly improves generalization on minority classes compared to other existing re-sampling or re-weighting methods.
  arXiv Detail & Related papers (2020-04-01T13:21:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.