Factoring Multidimensional Data to Create a Sophisticated Bayes Classifier
- URL: http://arxiv.org/abs/2105.05181v1
- Date: Tue, 11 May 2021 16:34:12 GMT
- Title: Factoring Multidimensional Data to Create a Sophisticated Bayes Classifier
- Authors: Anthony LaTorre
- Abstract summary: We derive an explicit formula for calculating the marginal likelihood of a given factorization of a categorical dataset.
These likelihoods can be used to order all possible factorizations and select the "best" way to factor the overall distribution from which the dataset is drawn.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper we derive an explicit formula for calculating the marginal
likelihood of a given factorization of a categorical dataset. Since the
marginal likelihood is proportional to the posterior probability of the
factorization, these likelihoods can be used to order all possible
factorizations and select the "best" way to factor the overall distribution
from which the dataset is drawn. The best factorization can then be used to
construct a Bayes classifier which benefits from factoring out mutually
independent sets of variables.
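To make the selection procedure concrete, here is a minimal sketch assuming a uniform Dirichlet prior and the standard Dirichlet-multinomial marginal likelihood for each block of columns. It illustrates the general idea of ranking factorizations by marginal likelihood rather than reproducing the paper's explicit formula; in particular, estimating the number of possible joint categories from the observed per-column supports is an assumption of this sketch.

```python
# Toy sketch (not the paper's exact formula): rank factorizations of a
# categorical dataset by Dirichlet-multinomial marginal likelihoods,
# treating the blocks of each factorization as mutually independent.
from collections import Counter
from scipy.special import gammaln

def log_marginal_block(rows, block, alpha=1.0):
    """Dirichlet-multinomial log marginal likelihood of the joint counts over
    the columns in `block`, under a uniform Dirichlet(alpha) prior. The number
    of joint categories is estimated from the observed per-column supports
    (an assumption of this sketch)."""
    counts = Counter(tuple(r[c] for c in block) for r in rows)
    n_cells = 1
    for c in block:
        n_cells *= len({r[c] for r in rows})
    total = sum(counts.values())
    return (gammaln(alpha * n_cells) - gammaln(alpha * n_cells + total)
            + sum(gammaln(alpha + k) - gammaln(alpha) for k in counts.values()))

def log_marginal_factorization(rows, blocks, alpha=1.0):
    """Mutually independent blocks factor the joint, so their log marginals add."""
    return sum(log_marginal_block(rows, b, alpha) for b in blocks)

def partitions(cols):
    """All set partitions of the column indices (tractable only for a few columns)."""
    if len(cols) == 1:
        yield [cols]
        return
    head, rest = cols[0], cols[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[head] + p[i]] + p[i + 1:]
        yield [[head]] + p

def best_factorization(rows, n_cols, alpha=1.0):
    """Return the partition of columns with the highest total log marginal likelihood."""
    return max(partitions(list(range(n_cols))),
               key=lambda blocks: log_marginal_factorization(rows, blocks, alpha))
```

For example, with rows = [(0, 1, 0), (1, 1, 0), (0, 0, 1)], best_factorization(rows, 3) returns the highest-scoring partition of columns {0, 1, 2}. The winning blocks can then each be modelled conditionally on the class and their likelihoods multiplied with the class prior, in the spirit of the paper's factored Bayes classifier.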
Related papers
- Obtaining Explainable Classification Models using Distributionally Robust Optimization [12.511155426574563]
We study generalized linear models constructed using sets of feature value rules.
An inherent trade-off exists between the sparsity of a rule set and its prediction accuracy.
We propose a new formulation to learn an ensemble of rule sets that simultaneously addresses these competing factors.
arXiv Detail & Related papers (2023-11-03T15:45:34Z)
- On the Correlation between Random Variables and their Principal Components [0.0]
The article attempts to find an algebraic formula describing the correlation coefficients between random variables and the principal components representing them.
It is possible to apply this formula to optimize the number of principal components in Principal Component Analysis, as well as to optimize the number of factors in Factor Analysis.
arXiv Detail & Related papers (2023-10-09T20:35:38Z)
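The quantity studied in the article above can be checked numerically with a generic identity: for column-standardized data, the correlation between variable j and principal component k equals the loading v_jk scaled by the square root of that component's eigenvalue. The toy data and verification below are illustrative and not taken from that article.

```python
# Sketch: correlations between standardized variables and their principal
# components, compared against the loading-based identity
# corr(X_j, PC_k) = v_{jk} * sqrt(lambda_k) for unit-variance inputs.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))   # correlated toy data
Z = (X - X.mean(0)) / X.std(0)                            # standardize columns

# Eigendecomposition of the correlation matrix gives loadings and variances.
corr = np.corrcoef(Z, rowvar=False)
eigval, eigvec = np.linalg.eigh(corr)
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]

scores = Z @ eigvec                                       # principal components
empirical = np.array([[np.corrcoef(Z[:, j], scores[:, k])[0, 1]
                       for k in range(4)] for j in range(4)])
formula = eigvec * np.sqrt(eigval)                        # v_{jk} * sqrt(lambda_k)

print(np.allclose(empirical, formula))                    # True, up to rounding
```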
- Domain Generalization via Rationale Invariance [70.32415695574555]
This paper offers a new perspective to ease the challenge of domain generalization, which involves maintaining robust results even in unseen environments.
We propose treating the element-wise contributions to the final results as the rationale for making a decision and representing the rationale for each sample as a matrix.
Our experiments demonstrate that the proposed approach achieves competitive results across various datasets, despite its simplicity.
arXiv Detail & Related papers (2023-08-22T03:31:40Z)
- Disentanglement of Correlated Factors via Hausdorff Factorized Support [53.23740352226391]
We propose a relaxed disentanglement criterion - the Hausdorff Factorized Support (HFS) criterion - that encourages a factorized support, rather than a factorial distribution.
We show that the use of HFS consistently facilitates disentanglement and recovery of ground-truth factors across a variety of correlation settings and benchmarks.
arXiv Detail & Related papers (2022-10-13T20:46:42Z)
- Feature Selection via the Intervened Interpolative Decomposition and its Application in Diversifying Quantitative Strategies [4.913248451323163]
We propose a probabilistic model for computing an interpolative decomposition (ID) in which each column of the observed matrix has its own priority or importance.
We evaluate the proposed models on real-world datasets, including ten Chinese A-share stocks.
arXiv Detail & Related papers (2022-09-29T03:36:56Z)
- Factorizable Joint Shift in Multinomial Classification [3.3504365823045035]
We derive a representation of factorizable joint shift in terms of the source (training) distribution, the target (test) prior class probabilities and the target marginal distribution of the features.
Other results of the paper include correction formulae for the posterior class probabilities both under general dataset shift and factorizable joint shift.
arXiv Detail & Related papers (2022-07-29T07:21:44Z)
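As a point of reference for the correction formulae mentioned above, the sketch below implements only the classical label-shift special case, in which the feature-conditional distributions are unchanged and only the class priors move; the paper's factorizable joint shift results are more general, and the weight function used here is an assumption of this sketch.

```python
# Sketch: adjusting source-domain posterior class probabilities when only the
# class priors shift (label-shift special case of dataset shift):
# p_t(y | x) is proportional to p_s(y | x) * p_t(y) / p_s(y), renormalized.
import numpy as np

def correct_posteriors(posteriors_s, prior_s, prior_t):
    """posteriors_s: (n_samples, n_classes) source-model posteriors.
    prior_s, prior_t: source and target class prior vectors."""
    w = np.asarray(prior_t) / np.asarray(prior_s)        # per-class importance weights
    unnorm = posteriors_s * w                            # reweight each class column
    return unnorm / unnorm.sum(axis=1, keepdims=True)    # renormalize per sample

# Example: a confident source posterior is softened when the favored class
# becomes rarer in the target domain.
p_s = np.array([[0.8, 0.2]])
print(correct_posteriors(p_s, prior_s=[0.5, 0.5], prior_t=[0.2, 0.8]))
```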
- On the rate of convergence of a classifier based on a Transformer encoder [55.41148606254641]
The rate at which the classifier's misclassification probability converges to the optimal misclassification probability is analyzed.
It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model.
arXiv Detail & Related papers (2021-11-29T14:58:29Z)
- Degenerate Gaussian factors for probabilistic inference [0.0]
We propose a parametrised factor that enables inference on Gaussian networks where linear dependencies exist among the random variables.
By using this principled factor definition, degeneracies can be accommodated accurately and automatically at little additional computational cost.
arXiv Detail & Related papers (2021-04-30T13:58:29Z)
- Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking: "What is the expected utility of each agent when only a randomly selected sub-group of its observed entities is considered?"
arXiv Detail & Related papers (2020-06-07T18:28:41Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
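For context on the construction analyzed above, the sketch below builds an ensemble of LDA classifiers on random Gaussian projections and combines them by majority vote. The projection dimension, ensemble size, voting rule, and dataset are assumptions of this sketch, and the paper's consistent estimator of the misclassification probability is not implemented here.

```python
# Sketch: an ensemble of LDA classifiers, each trained on a different random
# Gaussian projection of the features; predictions are combined by majority vote.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=400, n_features=50, n_informative=10,
                           random_state=0)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

n_members, proj_dim = 25, 8
votes = np.zeros((len(X_test), 2))
for _ in range(n_members):
    R = rng.normal(size=(X.shape[1], proj_dim)) / np.sqrt(proj_dim)  # random projection
    lda = LinearDiscriminantAnalysis().fit(X_train @ R, y_train)
    pred = lda.predict(X_test @ R)
    votes[np.arange(len(X_test)), pred] += 1                         # accumulate votes

accuracy = (votes.argmax(axis=1) == y_test).mean()
print(f"ensemble accuracy: {accuracy:.3f}")
```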
- Supervised Quantile Normalization for Low-rank Matrix Approximation [50.445371939523305]
We learn the parameters of quantile normalization operators that can operate row-wise on the values of $X$ and/or of its factorization $UV$ to improve the quality of the low-rank representation of $X$ itself.
We demonstrate the applicability of these techniques on synthetic and genomics datasets.
arXiv Detail & Related papers (2020-02-08T21:06:02Z)
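To contrast with the supervised, learned normalization summarized above, here is an unsupervised baseline: a fixed row-wise quantile normalization followed by a rank-r truncated SVD of $X$. The normalization target, the rank, and the toy data are assumptions of this sketch; the paper instead learns the normalization operator's parameters.

```python
# Sketch: fixed (non-learned) row-wise quantile normalization of a matrix X,
# followed by a rank-r truncated SVD as the low-rank approximation.
import numpy as np

def quantile_normalize_rows(X):
    """Map each row onto the mean sorted profile, preserving within-row ranks
    (ties are broken arbitrarily in this simple version)."""
    order = np.argsort(X, axis=1)
    ranks = np.argsort(order, axis=1)               # rank of each entry in its row
    mean_profile = np.sort(X, axis=1).mean(axis=0)  # target value for each rank
    return mean_profile[ranks]

def truncated_svd(X, r):
    """Best rank-r approximation of X in the least-squares sense."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

rng = np.random.default_rng(0)
X = rng.gamma(shape=2.0, size=(20, 30))             # skewed toy data
Xq = quantile_normalize_rows(X)
approx = truncated_svd(Xq, r=3)
print(np.linalg.norm(Xq - approx) / np.linalg.norm(Xq))  # relative reconstruction error
```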