A Semi-Supervised Adaptive Discriminative Discretization Method
Improving Discrimination Power of Regularized Naive Bayes
- URL: http://arxiv.org/abs/2111.10983v3
- Date: Wed, 5 Apr 2023 02:26:58 GMT
- Title: A Semi-Supervised Adaptive Discriminative Discretization Method
Improving Discrimination Power of Regularized Naive Bayes
- Authors: Shihe Wang, Jianfeng Ren and Ruibin Bai
- Abstract summary: We propose a semi-supervised adaptive discriminative discretization framework for naive Bayes.
It could better estimate the data distribution by utilizing both labeled data and unlabeled data through pseudo-labeling techniques.
The proposed method also significantly reduces the information loss during discretization by utilizing an adaptive discriminative discretization scheme.
- Score: 0.48342038441006785
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, many improved naive Bayes methods have been developed with enhanced
discrimination capabilities. Among them, regularized naive Bayes (RNB) produces
excellent performance by balancing the discrimination power and generalization
capability. Data discretization is important in naive Bayes. By grouping
similar values into one interval, the data distribution could be better
estimated. However, existing methods including RNB often discretize the data
into too few intervals, which may result in a significant information loss. To
address this problem, we propose a semi-supervised adaptive discriminative
discretization framework for naive Bayes, which could better estimate the data
distribution by utilizing both labeled data and unlabeled data through
pseudo-labeling techniques. The proposed method also significantly reduces the
information loss during discretization by utilizing an adaptive discriminative
discretization scheme, and hence greatly improves the discrimination power of
classifiers. The proposed RNB+, i.e., regularized naive Bayes utilizing the
proposed discretization framework, is systematically evaluated on a wide range
of machine-learning datasets. It significantly and consistently outperforms
state-of-the-art NB classifiers.
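As a concrete illustration of the pipeline the abstract describes (discretize continuous attributes, train naive Bayes, then pseudo-label unlabeled data and refit), here is a minimal sketch in plain Python. It is not the paper's method: it substitutes simple equal-frequency binning for the adaptive discriminative discretization scheme, and all data and names are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def equal_frequency_bins(values, n_bins):
    """Cut points for equal-frequency discretization of one attribute."""
    ordered = sorted(values)
    return [ordered[i * len(ordered) // n_bins] for i in range(1, n_bins)]

def discretize(value, cuts):
    """Map a continuous value to the index of its interval."""
    for i, c in enumerate(cuts):
        if value < c:
            return i
    return len(cuts)

class DiscretizedNB:
    """Naive Bayes over discretized attributes with Laplace smoothing."""
    def fit(self, X, y, n_bins=4):
        self.n_bins = n_bins
        self.cuts = [equal_frequency_bins(col, n_bins) for col in zip(*X)]
        self.classes = sorted(set(y))
        self.prior = Counter(y)
        self.counts = defaultdict(Counter)  # (attr, class) -> interval counts
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.counts[(j, label)][discretize(v, self.cuts[j])] += 1
        return self

    def predict(self, row):
        total = sum(self.prior.values())
        best, best_lp = None, float("-inf")
        for c in self.classes:
            lp = math.log(self.prior[c] / total)
            for j, v in enumerate(row):
                b = discretize(v, self.cuts[j])
                cnt = self.counts[(j, c)]
                lp += math.log((cnt[b] + 1) / (sum(cnt.values()) + self.n_bins))
            if lp > best_lp:
                best, best_lp = c, lp
        return best

# Toy data: two labeled clusters plus two unlabeled points.
labeled_X = [[1.0, 5.0], [1.2, 4.8], [3.0, 1.0], [3.2, 0.8]]
labeled_y = ["a", "a", "b", "b"]
unlabeled_X = [[1.1, 5.1], [3.1, 0.9]]

# Semi-supervised step: pseudo-label unlabeled rows with an initial model,
# then refit both the discretizer and the classifier on the enlarged set.
nb = DiscretizedNB().fit(labeled_X, labeled_y, n_bins=2)
pseudo_y = [nb.predict(r) for r in unlabeled_X]
nb = DiscretizedNB().fit(labeled_X + unlabeled_X, labeled_y + pseudo_y, n_bins=2)
print(nb.predict([1.05, 5.0]))  # → a
```

Refitting the cut points on the pseudo-labeled pool is the part that mirrors the abstract's claim: with more (pseudo-)labeled samples, the interval boundaries are estimated from a better approximation of the true data distribution.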
Related papers
- DRoP: Distributionally Robust Pruning [11.930434318557156]
We conduct the first systematic study of the impact of data pruning on classification bias of trained models.
We propose DRoP, a distributionally robust approach to pruning and empirically demonstrate its performance on standard computer vision benchmarks.
arXiv Detail & Related papers (2024-04-08T14:55:35Z)
- Uncertainty-Based Extensible Codebook for Discrete Federated Learning in Heterogeneous Data Silos [11.443755718706562]
Federated learning (FL) is aimed at leveraging vast distributed datasets.
Previous studies have explored discrete representations to enhance model generalization across minor distributional shifts.
We have identified that models derived from FL exhibit markedly increased uncertainty when applied to data silos with unfamiliar distributions.
arXiv Detail & Related papers (2024-02-29T06:13:10Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Boosting the Discriminant Power of Naive Bayes [17.43377106246301]
We propose a feature augmentation method employing a stacked auto-encoder to reduce the noise in the data and boost the discriminant power of naive Bayes.
The experimental results show that the proposed method significantly and consistently outperforms the state-of-the-art naive Bayes classifiers.
arXiv Detail & Related papers (2022-09-20T08:02:54Z)
- A Max-relevance-min-divergence Criterion for Data Discretization with Applications on Naive Bayes [22.079025650097932]
In many classification models, data is discretized to better estimate its distribution.
We propose a Max-Dependency-Min-Divergence (MDmD) criterion that maximizes both the discriminant information and generalization ability of the discretized data.
We propose a more practical solution, the Max-Relevance-Min-Divergence (MRmD) discretization scheme, in which each attribute is discretized separately by simultaneously maximizing the discriminant information and the generalization ability of the discretized data.
arXiv Detail & Related papers (2022-09-20T07:45:00Z)
- Adaptive Dimension Reduction and Variational Inference for Transductive Few-Shot Classification [2.922007656878633]
We propose a new clustering method based on Variational Bayesian inference, further improved by Adaptive Dimension Reduction.
Our proposed method significantly improves accuracy in the realistic unbalanced transductive setting on various Few-Shot benchmarks.
arXiv Detail & Related papers (2022-09-18T10:29:02Z)
- Mitigating Algorithmic Bias with Limited Annotations [65.060639928772]
When sensitive attributes are not disclosed or available, a small portion of the training data must be manually annotated to mitigate bias.
We propose Active Penalization Of Discrimination (APOD), an interactive framework to guide the limited annotations towards maximally eliminating the effect of algorithmic bias.
APOD shows comparable performance to fully annotated bias mitigation, which demonstrates that APOD could benefit real-world applications when sensitive information is limited.
arXiv Detail & Related papers (2022-07-20T16:31:19Z)
- Augmentation-Aware Self-Supervision for Data-Efficient GAN Training [68.81471633374393]
Training generative adversarial networks (GANs) with limited data is challenging because the discriminator is prone to overfitting.
We propose a novel augmentation-aware self-supervised discriminator that predicts the augmentation parameter of the augmented data.
We compare our method with state-of-the-art (SOTA) methods using the class-conditional BigGAN and unconditional StyleGAN2 architectures.
arXiv Detail & Related papers (2022-05-31T10:35:55Z)
- Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation [55.27563366506407]
We introduce a discriminator-free adversarial learning network (DALN) for unsupervised domain adaptation (UDA).
DALN achieves explicit domain alignment and category distinguishment through a unified objective.
DALN compares favorably against the existing state-of-the-art (SOTA) methods on a variety of public datasets.
arXiv Detail & Related papers (2022-04-08T04:40:18Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- On Positive-Unlabeled Classification in GAN [130.43248168149432]
This paper defines a positive and unlabeled classification problem for standard GANs.
It then leads to a novel technique to stabilize the training of the discriminator in GANs.
arXiv Detail & Related papers (2020-02-04T05:59:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.