Related papers: Boosting the Discriminant Power of Naive Bayes

URL: http://arxiv.org/abs/2209.09532v1
Date: Tue, 20 Sep 2022 08:02:54 GMT
Title: Boosting the Discriminant Power of Naive Bayes
Authors: Shihe Wang, Jianfeng Ren, Xiaoyu Lian, Ruibin Bai, Xudong Jiang
Abstract summary: We propose a feature augmentation method employing a stack auto-encoder to reduce the noise in the data and boost the discriminant power of naive Bayes. The experimental results show that the proposed method significantly and consistently outperforms the state-of-the-art naive Bayes classifiers.
Score: 17.43377106246301
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Naive Bayes has been widely used in many applications because of its simplicity and its ability to handle both numerical and categorical data. However, its lack of modeling of correlations between features limits its performance. In addition, noise and outliers in real-world datasets greatly degrade classification performance. In this paper, we propose a feature augmentation method employing a stack auto-encoder to reduce the noise in the data and boost the discriminant power of naive Bayes. The proposed stack auto-encoder consists of two auto-encoders for different purposes. The first encoder shrinks the initial features to derive a compact feature representation in order to remove the noise and redundant information. The second encoder boosts the discriminant power of the features by expanding them into a higher-dimensional space so that samples of different classes can be better separated there. By integrating the proposed feature augmentation method with the regularized naive Bayes, the discriminant power of the model is greatly enhanced. The proposed method is evaluated on a set of machine-learning benchmark datasets. The experimental results show that the proposed method significantly and consistently outperforms the state-of-the-art naive Bayes classifiers.
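
The shrink-then-expand pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes (4 to 2 to 6), the tanh encoder with a linear decoder, the plain SGD training loop, the toy data, and the use of Gaussian naive Bayes in place of the paper's regularized naive Bayes are all assumptions made for illustration, and the helper names (train_autoencoder, fit_gaussian_nb, predict_gaussian_nb) are hypothetical.

```python
import math
import random


def train_autoencoder(X, hidden, epochs=100, lr=0.01, seed=0):
    """Train a one-hidden-layer autoencoder with per-sample SGD.

    Encoder: tanh(W1 x + b1); decoder: W2 h + b2 (linear).
    Returns the encoder function only. Illustrative, not the paper's network.
    """
    rng = random.Random(seed)
    d = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(d)]
    b2 = [0.0] * d

    def encode(x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(W1, b1)]

    for _ in range(epochs):
        for x in X:
            h = encode(x)
            out = [sum(w * hj for w, hj in zip(row, h)) + b
                   for row, b in zip(W2, b2)]
            # Gradient of 0.5 * ||out - x||^2 with respect to the output.
            err = [o - xi for o, xi in zip(out, x)]
            # Back-propagate through the decoder and the tanh non-linearity.
            gh = [sum(err[i] * W2[i][j] for i in range(d)) * (1 - h[j] ** 2)
                  for j in range(hidden)]
            for i in range(d):
                for j in range(hidden):
                    W2[i][j] -= lr * err[i] * h[j]
                b2[i] -= lr * err[i]
            for j in range(hidden):
                for k in range(d):
                    W1[j][k] -= lr * gh[j] * x[k]
                b1[j] -= lr * gh[j]
    return encode


def fit_gaussian_nb(X, y):
    """Per-class log prior, feature means, and (smoothed) feature variances."""
    stats = {}
    for c in set(y):
        rows = [x for x, t in zip(X, y) if t == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                     for col, m in zip(cols, means)]
        stats[c] = (math.log(len(rows) / len(X)), means, variances)
    return stats


def predict_gaussian_nb(stats, x):
    """Return the class with the highest log posterior under independence."""
    def score(c):
        prior, means, variances = stats[c]
        return prior + sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                           for xi, m, v in zip(x, means, variances))
    return max(stats, key=score)


# Toy data (assumed): two classes, four noisy numerical features each.
rng = random.Random(0)
X = [[rng.gauss(2.0 * c, 1.0) for _ in range(4)] for c in (0, 1) for _ in range(20)]
y = [c for c in (0, 1) for _ in range(20)]

shrink = train_autoencoder(X, hidden=2, seed=1)   # first encoder: 4 -> 2, compact code
Z = [shrink(x) for x in X]
expand = train_autoencoder(Z, hidden=6, seed=2)   # second encoder: 2 -> 6, expanded code
F = [expand(z) for z in Z]

model = fit_gaussian_nb(F, y)
preds = [predict_gaussian_nb(model, f) for f in F]
```

Each auto-encoder here is trained on reconstruction alone and only its encoder half is kept; whether the paper trains the two stages jointly, or combines the expanded features with the original ones, is not stated in the abstract and is left open in this sketch.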

Related papers

Fractional Naive Bayes (FNB): non-convex optimization for a parsimonious weighted selective naive Bayes classifier [0.0]
We study supervised classification for datasets with a very large number of input variables. We propose a regularization of the model log-likelihood. The various proposed algorithms result in an optimization-based weighted naive Bayes scheme.
arXiv Detail & Related papers (2024-09-17T11:54:14Z)
Improved Out-of-Scope Intent Classification with Dual Encoding and Threshold-based Re-Classification [6.975902383951604]
Current methodologies face difficulties with the unpredictable distribution of outliers. We present the Dual Encoder for Threshold-Based Re-Classification (DETER) to address these challenges. Our model outperforms previous benchmarks, improving F1 score by up to 13% and 5% for known and unknown intents, respectively.
arXiv Detail & Related papers (2024-05-30T11:46:42Z)
XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification. XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations. Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only. We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task. A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks. Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z)
Improving the Robustness of Summarization Systems with Dual Augmentation [68.53139002203118]
A robust summarization system should be able to capture the gist of the document, regardless of the specific word choices or noise in the input. We first explore the summarization models' robustness against perturbations including word-level synonym substitution and noise. We propose SummAttacker, an efficient approach to generating adversarial samples based on language models.
arXiv Detail & Related papers (2023-06-01T19:04:17Z)
Ensemble Classifier Design Tuned to Dataset Characteristics for Network Intrusion Detection [0.0]
Two new algorithms are proposed to address the class overlap issue in the dataset. The proposed design is evaluated for both binary and multi-category classification.
arXiv Detail & Related papers (2022-05-08T21:06:42Z)
A Semi-Supervised Adaptive Discriminative Discretization Method Improving Discrimination Power of Regularized Naive Bayes [0.48342038441006785]
We propose a semi-supervised adaptive discriminative discretization framework for naive Bayes. It could better estimate the data distribution by utilizing both labeled data and unlabeled data through pseudo-labeling techniques. The proposed method also significantly reduces the information loss during discretization by utilizing an adaptive discriminative discretization scheme.
arXiv Detail & Related papers (2021-11-22T04:36:40Z)
Dual Adversarial Auto-Encoders for Clustering [152.84443014554745]
We propose Dual Adversarial Auto-encoder (Dual-AAE) for unsupervised clustering. By performing variational inference on the objective function of Dual-AAE, we derive a new reconstruction loss which can be optimized by training a pair of Auto-encoders. Experiments on four benchmarks show that Dual-AAE achieves superior performance over state-of-the-art clustering methods.
arXiv Detail & Related papers (2020-08-23T13:16:34Z)
Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval. We propose a pairwise loss function with discrete latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing. This new semantic hashing framework achieves superior performance compared to the state of the art.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.