Optimal partition of feature using Bayesian classifier
- URL: http://arxiv.org/abs/2304.14537v1
- Date: Thu, 27 Apr 2023 21:19:06 GMT
- Title: Optimal partition of feature using Bayesian classifier
- Authors: Sanjay Vishwakarma and Srinjoy Ganguly
- Abstract summary: In Naive Bayes, certain features are called independent features as they have no conditional correlation or dependency when predicting a classification.
We propose a novel technique called the Comonotone-Independence Classifier (CIBer), which is able to overcome the challenges posed by the Naive Bayes method.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Naive Bayesian classifier is a popular classification method employing
the Bayesian paradigm. The concept of having conditional dependence among input
variables sounds good in theory but can lead to a majority-vote-style
behaviour. Achieving conditional independence is often difficult, and violations
of it introduce decision biases into the estimates. In Naive Bayes, certain features
are called independent features as they have no conditional correlation or
dependency when predicting a classification. In this paper, we focus on the
optimal partition of features by proposing a novel technique called the
Comonotone-Independence Classifier (CIBer) which is able to overcome the
challenges posed by the Naive Bayes method. For different datasets, we clearly
demonstrate the efficacy of our technique, where we achieve lower error rates
and higher or equivalent accuracy compared to models such as Random Forests and
XGBoost.
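To make the independence assumption discussed above concrete, the sketch below contrasts a fully factorized Naive Bayes model with a partitioned variant in which correlated features are grouped and their class-conditional density is modelled jointly. It is only a generic illustration of partitioning features, not the authors' CIBer construction: the Kendall-tau grouping heuristic, its 0.8 threshold, and the Gaussian class-conditional densities are assumptions made for the example.
```python
# A minimal sketch (not the authors' CIBer implementation) of partitioning
# features: comonotone (highly rank-correlated) features are grouped and
# modelled jointly, while distinct groups are still treated as conditionally
# independent given the class. The Kendall-tau threshold and the Gaussian
# class-conditional densities are illustrative assumptions.
import numpy as np
from scipy.stats import kendalltau, multivariate_normal


def group_features(X, tau_threshold=0.8):
    """Greedily group feature indices whose pairwise |Kendall tau| exceeds the threshold."""
    unassigned = list(range(X.shape[1]))
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        group = [seed]
        for j in list(unassigned):
            tau, _ = kendalltau(X[:, seed], X[:, j])
            if abs(tau) >= tau_threshold:
                group.append(j)
                unassigned.remove(j)
        groups.append(group)
    return groups


def fit_grouped_gaussian_nb(X, y, groups):
    """Per class: a log-prior and one (possibly multivariate) Gaussian per feature group."""
    model = {}
    for c in np.unique(y):
        Xc = X[y == c]
        model[c] = {
            "log_prior": np.log(len(Xc) / len(X)),
            "dists": [
                multivariate_normal(
                    mean=Xc[:, g].mean(axis=0),
                    cov=np.atleast_2d(np.cov(Xc[:, g], rowvar=False)) + 1e-6 * np.eye(len(g)),
                )
                for g in groups
            ],
        }
    return model


def predict(model, groups, x):
    """Pick the class maximising log-prior + sum of per-group log-densities."""
    scores = {
        c: p["log_prior"] + sum(d.logpdf(x[g]) for d, g in zip(p["dists"], groups))
        for c, p in model.items()
    }
    return max(scores, key=scores.get)


# Example: two strongly comonotone features plus one independent feature.
rng = np.random.default_rng(0)
z = rng.normal(size=300)
X = np.column_stack([z, 2 * z + 0.05 * rng.normal(size=300), rng.normal(size=300)])
y = (z + X[:, 2] > 0).astype(int)
groups = group_features(X)                      # typically [[0, 1], [2]]
model = fit_grouped_gaussian_nb(X, y, groups)
print(groups, predict(model, groups, X[0]))
```
Standard Naive Bayes is recovered when every group holds a single feature; coarser partitions relax the independence assumption at the cost of estimating joint densities within each group.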
Related papers
- Fractional Naive Bayes (FNB): non-convex optimization for a parsimonious weighted selective naive Bayes classifier [0.0]
We consider supervised classification for datasets with a very large number of input variables.
We propose a regularization of the model log-likelihood.
The various proposed algorithms result in an optimization-based weighted naïve Bayes scheme.
arXiv Detail & Related papers (2024-09-17T11:54:14Z) - Variable selection for Na\"ive Bayes classification [2.8265531928694116]
The Na"ive Bayes has proven to be a tractable and efficient method for classification in multivariate analysis.
We propose a sparse version of the Na"ive Bayes that is characterized by three properties.
Our findings show that, when compared against well-referenced feature selection approaches, the proposed sparse Na"ive Bayes obtains competitive results.
arXiv Detail & Related papers (2024-01-31T18:01:36Z) - Leveraging Ensemble Diversity for Robust Self-Training in the Presence of Sample Selection Bias [5.698050337128548]
Self-training is a well-known approach for semi-supervised learning. It consists of iteratively assigning pseudo-labels to unlabeled data for which the model is confident and treating them as labeled examples.
For neural networks, softmax prediction probabilities are often used as a confidence measure, although they are known to be overconfident, even for wrong predictions.
We propose a novel confidence measure, called $\mathcal{T}$-similarity, built upon the prediction diversity of an ensemble of linear classifiers.
arXiv Detail & Related papers (2023-10-23T11:30:06Z) - When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - Robust Outlier Rejection for 3D Registration with Variational Bayes [70.98659381852787]
We develop a novel variational non-local network-based outlier rejection framework for robust alignment.
We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z) - Multi-Label Quantification [78.83284164605473]
Quantification, variously called "labelled prevalence estimation" or "learning to quantify", is the supervised learning task of generating predictors of the relative frequencies of the classes of interest in unsupervised data samples.
We propose methods for inferring estimators of class prevalence values that strive to leverage the dependencies among the classes of interest in order to predict their relative frequencies more accurately.
arXiv Detail & Related papers (2022-11-15T11:29:59Z) - Centrality and Consistency: Two-Stage Clean Samples Identification for
Learning with Instance-Dependent Noisy Labels [87.48541631675889]
We propose a two-stage clean samples identification method.
First, we employ a class-level feature clustering procedure for the early identification of clean samples.
Second, for the remaining clean samples that are close to the ground truth class boundary, we propose a novel consistency-based classification method.
arXiv Detail & Related papers (2022-07-29T04:54:57Z) - Is the Performance of My Deep Network Too Good to Be True? A Direct
Approach to Estimating the Bayes Error in Binary Classification [86.32752788233913]
In classification problems, the Bayes error can be used as a criterion to evaluate classifiers with state-of-the-art performance.
We propose a simple and direct Bayes error estimator, where we just take the mean of the labels that show uncertainty of the classes (see the sketch after this list).
Our flexible approach enables us to perform Bayes error estimation even for weakly supervised data.
arXiv Detail & Related papers (2022-02-01T13:22:26Z) - Improving usual Naive Bayes classifier performances with Neural Naive
Bayes based models [6.939768185086753]
This paper introduces the original Neural Naive Bayes, modeling the parameters of the classifier induced from the Naive Bayes with neural network functions.
We also introduce new Neural Pooled Markov Chain models, alleviating the independence condition.
arXiv Detail & Related papers (2021-11-14T10:42:26Z) - A new class of generative classifiers based on staged tree models [2.66269503676104]
Generative models for classification use the joint probability distribution of the class variable and the features to construct a decision rule.
Here we introduce a new class of generative classifiers, called staged tree classifiers, which formally account for context-specific independence.
An applied analysis to predict the fate of the passengers of the Titanic highlights the insights that the new class of generative classifiers can give.
arXiv Detail & Related papers (2020-12-26T19:30:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.