Approximate Borderline Sampling using Granular-Ball for Classification Tasks
- URL: http://arxiv.org/abs/2506.02366v1
- Date: Tue, 03 Jun 2025 02:04:27 GMT
- Title: Approximate Borderline Sampling using Granular-Ball for Classification Tasks
- Authors: Qin Xie, Qinghua Zhang, Shuyin Xia
- Abstract summary: Recently, the sampling method based on granular-ball (GB) has shown promising performance in generality and noisy classification tasks. In this paper, an approximate borderline sampling method using GBs is proposed for classification tasks.
- Score: 7.340676924771606
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data sampling enhances classifier efficiency and robustness through data compression and quality improvement. Recently, the sampling method based on granular-ball (GB) has shown promising performance in generality and noisy classification tasks. However, some limitations remain, including the absence of borderline sampling strategies and issues with class boundary blurring or shrinking due to overlap between GBs. In this paper, an approximate borderline sampling method using GBs is proposed for classification tasks. First, a restricted diffusion-based GB generation (RD-GBG) method is proposed, which prevents GB overlaps by constrained expansion, preserving precise geometric representation of GBs via redefined ones. Second, based on the concept of heterogeneous nearest neighbor, a GB-based approximate borderline sampling (GBABS) method is proposed, which is the first general sampling method capable of both borderline sampling and improving the quality of class noise datasets. Additionally, since RD-GBG incorporates noise detection and GBABS focuses on borderline samples, GBABS performs outstandingly on class noise datasets without the need for an optimal purity threshold. Experimental results demonstrate that the proposed methods outperform the GB-based sampling method and several representative sampling methods. Our source code is publicly available at https://github.com/CherylTse/GBABS.
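As a rough illustration of the heterogeneous nearest neighbor idea mentioned in the abstract, the sketch below keeps only those points that are the nearest differently-labeled neighbor of at least one other point. It is a simplified stand-in under stated assumptions, not the authors' GBABS or RD-GBG implementation: it operates on raw points (or, equivalently, on ball centers with one label each) rather than on generated granular-balls, and the function name `borderline_indices` is hypothetical. The authors' actual code is at the repository linked in the abstract.

```python
import numpy as np


def borderline_indices(points, labels):
    """Return sorted indices of points that are the nearest
    differently-labeled neighbor of at least one other point.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)

    # With a single class there are no heterogeneous neighbors at all.
    if np.unique(labels).size < 2:
        return np.array([], dtype=int)

    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Exclude same-label pairs (including each point with itself),
    # so only differently-labeled points compete for "nearest".
    same_label = labels[:, None] == labels[None, :]
    dist[same_label] = np.inf

    # Each point votes for its heterogeneous nearest neighbor;
    # the voted-for points form the approximate borderline set.
    nearest_other_class = dist.argmin(axis=1)
    return np.unique(nearest_other_class)


# Two 1-D classes that meet between x = 2 and x = 3.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
print(borderline_indices(X, y).tolist())  # [2, 3]
```

In this toy case only the two points facing each other across the class boundary survive, which is the compression effect borderline sampling aims for. The pairwise distance matrix costs O(n^2) memory, so this form is only suitable for small inputs.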
Related papers
- LGBQPC: Local Granular-Ball Quality Peaks Clustering [51.58924743533048]
The density peaks clustering (DPC) algorithm has attracted considerable attention for its ability to detect arbitrarily shaped clusters. Recent advancements integrating granular-ball computing with DPC have led to the GB-based DPC algorithm, which improves computational efficiency. This paper proposes the local GB quality peaks clustering (LGBQPC) algorithm, which offers comprehensive improvements to GBDPC in both GB generation and clustering processes.
arXiv Detail & Related papers (2025-05-16T15:26:02Z)
- A robust three-way classifier with shadowed granular-balls based on justifiable granularity [53.39844791923145]
We construct a robust three-way classifier with shadowed GBs for uncertain data.
Our model demonstrates robustness in managing uncertain data and effectively mitigates classification risks.
arXiv Detail & Related papers (2024-07-03T08:54:45Z)
- Generation of Granular-Balls for Clustering Based on the Principle of Justifiable Granularity [51.58924743533048]
This article introduces a novel GB generation method for clustering tasks.
We define the coverage and specificity of a GB and introduce a comprehensive measure for assessing GB quality.
Compared to previous GB generation methods, the new method maximizes the overall quality of generated GBs.
arXiv Detail & Related papers (2024-05-11T04:21:32Z)
- MB-RACS: Measurement-Bounds-based Rate-Adaptive Image Compressed Sensing Network [65.1004435124796]
We propose a Measurement-Bounds-based Rate-Adaptive Image Compressed Sensing Network (MB-RACS) framework.
Our experiments demonstrate that the proposed MB-RACS method surpasses current leading methods.
arXiv Detail & Related papers (2024-01-19T04:40:20Z)
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z)
- GBG++: A Fast and Stable Granular Ball Generation Method for Classification [17.7229704582645]
Granular ball computing is an efficient, robust, and scalable learning method. The stability and efficiency of existing GBG methods need to be further improved. A fast and stable GBG (GBG++) method is proposed first.
arXiv Detail & Related papers (2023-05-29T04:00:19Z)
- Plug-and-Play split Gibbs sampler: embedding deep generative priors in Bayesian inference [12.91637880428221]
This paper introduces a plug-and-play sampling algorithm that leverages variable splitting to efficiently sample from a posterior distribution.
It divides the challenging task of posterior sampling into two simpler sampling problems.
Its performance is compared to recent state-of-the-art optimization and sampling methods.
arXiv Detail & Related papers (2023-04-21T17:17:51Z)
- GAN Based Boundary Aware Classifier for Detecting Out-of-distribution Samples [24.572516991009323]
We propose a GAN-based boundary-aware classifier (GBAC) that generates a closed hyperspace containing most in-distribution (ID) data.
Our method is based on the fact that a traditional neural network separates the feature space into several unclosed regions, which are not suitable for out-of-distribution (OOD) detection.
With GBAC as an auxiliary module, OOD data distributed outside the closed hyperspace is assigned a much lower score, allowing more effective OOD detection.
arXiv Detail & Related papers (2021-12-22T03:35:54Z)
- Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
arXiv Detail & Related papers (2021-10-20T02:25:25Z)
- UGRWO-Sampling for COVID-19 dataset: A modified random walk under-sampling approach based on graphs to imbalanced data classification [2.15242029196761]
This paper proposes a new RWO-Sampling (Random Walk Over-Sampling) based on graphs for imbalanced datasets.
Two schemes based on under-sampling and over-sampling methods are introduced to keep the proximity information robust to noise and outliers.
arXiv Detail & Related papers (2020-02-10T03:29:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.