Binned semiparametric Bayesian networks
- URL: http://arxiv.org/abs/2506.21997v2
- Date: Tue, 01 Jul 2025 09:17:43 GMT
- Title: Binned semiparametric Bayesian networks
- Authors: Rafael Sojo, Javier Díaz-Rozo, Concha Bielza, Pedro Larrañaga
- Abstract summary: We introduce a new type of probabilistic semiparametric model that takes advantage of data binning to reduce the computational cost of kernel density estimation. Two new conditional probability distributions are developed for the new binned semiparametric Bayesian networks.
- Score: 3.6998629873543125
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper introduces a new type of probabilistic semiparametric model that takes advantage of data binning to reduce the computational cost of kernel density estimation in nonparametric distributions. Two new conditional probability distributions are developed for the new binned semiparametric Bayesian networks: the sparse binned kernel density estimation and the Fourier kernel density estimation. These two probability distributions address the curse of dimensionality, which typically impacts binned models, by using sparse tensors and by restricting the number of parent nodes in conditional probability calculations. To evaluate the proposal, we perform a complexity analysis and conduct several comparative experiments using synthetic data and datasets from the UCI Machine Learning Repository. The experiments include different binning rules, parent restrictions, grid sizes, and numbers of instances to get a holistic view of the model's behavior. As a result, our binned semiparametric Bayesian networks achieve structural learning and log-likelihood estimates with no statistically significant differences from the semiparametric Bayesian networks, but at much higher speed. Thus, the new binned semiparametric Bayesian networks prove to be a reliable and more efficient alternative to their non-binned counterparts.
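The main computational idea behind binning is that the data are collapsed onto a regular grid, so the kernel sum becomes a discrete convolution that is evaluated once over the grid (and, as in the Fourier variant, can be accelerated with the FFT). Below is a minimal one-dimensional sketch of this idea using linear binning and an FFT-based convolution; the function name, grid choices, and bandwidth rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.signal import fftconvolve

def binned_kde(x, grid_size=401, bandwidth=None):
    """Approximate a 1-D Gaussian KDE by linear binning plus one convolution."""
    n = len(x)
    if bandwidth is None:
        bandwidth = 1.06 * np.std(x) * n ** (-1 / 5)  # Silverman-style rule of thumb
    lo, hi = x.min() - 3 * bandwidth, x.max() + 3 * bandwidth
    grid = np.linspace(lo, hi, grid_size)
    delta = grid[1] - grid[0]

    # Linear binning: each point splits its unit mass between the two
    # nearest grid nodes, proportionally to proximity.
    pos = (x - lo) / delta
    left = np.floor(pos).astype(int)
    frac = pos - left
    counts = np.zeros(grid_size)
    np.add.at(counts, left, 1 - frac)
    np.add.at(counts, np.minimum(left + 1, grid_size - 1), frac)

    # Sample the Gaussian kernel at the grid spacing and convolve once:
    # the cost scales with the grid size, not with the number of data points.
    radius = int(np.ceil(4 * bandwidth / delta))
    u = np.arange(-radius, radius + 1) * delta
    kernel = np.exp(-0.5 * (u / bandwidth) ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    density = fftconvolve(counts, kernel, mode="same") / n
    return grid, density

rng = np.random.default_rng(0)
grid, dens = binned_kde(rng.normal(size=10_000))
print(dens.sum() * (grid[1] - grid[0]))  # ~1.0: the estimate integrates to one
```

Because the convolution touches only grid nodes, the cost depends on the grid size rather than on the number of instances, which is the source of the speed-up the abstract refers to.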
Related papers
- Two-sample comparison through additive tree models for density ratios [3.0262553206264893]
We propose algorithms for training additive tree models for the density ratio using a new loss function called the balancing loss. We show that due to the loss function's resemblance to an exponential family kernel, the new loss can serve as a pseudo-likelihood for which conjugate priors exist. We provide insights on the balancing loss through its close connection to the exponential loss in binary classification and to the variational form of f-divergence.
arXiv Detail & Related papers (2025-08-05T04:08:49Z)
- Mixture models for data with unknown distributions [0.6345523830122168]
We describe and analyze a broad class of mixture models for real-valued multivariate data. We return both a division of the data and an estimate of the distributions, effectively performing clustering and density estimation within each cluster at the same time. We demonstrate our methods with a selection of illustrative applications and give code implementing both algorithms.
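As a point of reference for this clustering-plus-density-estimation behavior, here is a minimal sketch using a standard Gaussian mixture from scikit-learn as a stand-in for the paper's more general component distributions; the data and all names are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Two overlapping clusters; a Gaussian mixture serves here as a simple
# stand-in for the paper's more general component distributions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (300, 2)), rng.normal(4, 1.5, (200, 2))])

gm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gm.predict(X)          # clustering: hard assignment per point
log_dens = gm.score_samples(X)  # density estimation: log p(x) under the mixture
resp = gm.predict_proba(X)      # soft responsibilities per cluster
print(labels[:5], log_dens[:5].round(2))
```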
arXiv Detail & Related papers (2025-02-26T22:42:40Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
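A hypothetical toy version of such a self-consuming loop for kernel density estimation: each generation fits a KDE to samples drawn from the previous generation's estimate, and the kernel bandwidth added at each round inflates the variance generation after generation. This only illustrates the error-propagation phenomenon and is not the paper's experimental setup.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)
data = rng.normal(size=2_000)  # generation 0: real data from N(0, 1)
for gen in range(5):
    kde = gaussian_kde(data)                       # fit the current generation
    data = kde.resample(2_000, seed=rng).ravel()   # next generation trains on synthetic samples
    print(f"gen {gen + 1}: sample std = {data.std():.3f} (true = 1.000)")
```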
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Learning Likelihood Ratios with Neural Network Classifiers [0.12277343096128711]
Approximations of the likelihood ratio may be computed using clever parametrizations of neural network-based classifiers.
We present a series of empirical studies detailing the performance of several common loss functionals and parametrizations of the classifier output.
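The classic parametrization behind this line of work is to train a classifier s(x) ≈ P(class = p | x) on samples from the two distributions; with balanced classes, the likelihood ratio is recovered as r(x) = s(x) / (1 - s(x)). A minimal sketch with a logistic-regression stand-in follows; the quadratic features are an illustrative choice so the model can represent the true log-ratio of two Gaussians.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
xp = rng.normal(0.0, 1.0, (5_000, 1))   # samples from p
xq = rng.normal(0.5, 1.2, (5_000, 1))   # samples from q

X = np.vstack([xp, xq])
y = np.concatenate([np.ones(len(xp)), np.zeros(len(xq))])  # 1 = "from p"

# Quadratic features: the log-ratio of two Gaussians is quadratic in x.
feats = np.hstack([X, X ** 2])
clf = LogisticRegression(max_iter=1_000).fit(feats, y)

# With balanced classes, s(x) = P(y=1 | x) gives r(x) = p(x)/q(x) = s/(1-s).
x = np.array([[0.0], [1.0]])
s = clf.predict_proba(np.hstack([x, x ** 2]))[:, 1]
print(s / (1 - s))  # estimated likelihood ratios at x = 0 and x = 1
```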
arXiv Detail & Related papers (2023-05-17T18:11:38Z)
- A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z)
- Information Theoretic Structured Generative Modeling [13.117829542251188]
A novel generative model framework called the structured generative model (SGM) is proposed that makes straightforward optimization possible.
The implementation employs a single neural network, driven by an orthonormal input and a single white-noise source, that is adapted to learn an infinite Gaussian mixture model.
Preliminary results show that SGM significantly improves on MINE estimation in terms of data efficiency and variance, on conventional and variational Gaussian mixture models, and in the training of adversarial networks.
arXiv Detail & Related papers (2021-10-12T07:44:18Z)
- Bias-Variance Tradeoffs in Single-Sample Binary Gradient Estimators [100.58924375509659]
The straight-through (ST) estimator has gained popularity due to its simplicity and efficiency.
Several techniques have been proposed to improve on ST while keeping the same low computational complexity.
We conduct a theoretical analysis of the bias and variance of these methods in order to understand the tradeoffs and verify the originally claimed properties.
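For reference, the ST estimator itself fits in a few lines: the forward pass uses a hard Bernoulli sample while the backward pass routes the gradient through the underlying probability, which is exactly where the analyzed bias comes from. A minimal PyTorch sketch with illustrative variable names:

```python
import torch

# Straight-through trick: value equals the hard sample b, but the gradient
# flows as if b were the soft probability p.
theta = torch.zeros(4, requires_grad=True)
p = torch.sigmoid(theta)
b = torch.bernoulli(p)                 # hard, non-differentiable sample
b_st = b.detach() + p - p.detach()     # forward: b, backward: grad of p

loss = (b_st - 1.0).pow(2).sum()
loss.backward()
print(theta.grad)                      # biased single-sample gradient estimate
```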
arXiv Detail & Related papers (2021-10-07T15:16:07Z)
- Semiparametric Bayesian Networks [5.205440005969871]
We introduce semiparametric Bayesian networks that combine parametric and nonparametric conditional probability distributions.
Their aim is to incorporate the bounded complexity of parametric models and the flexibility of nonparametric ones.
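The idea can be illustrated with a two-node network X → Y in which X gets a parametric (Gaussian) marginal and Y given X gets a nonparametric conditional built from kernel density estimates, p(y | x) = p̂(x, y) / p̂(x). This sketch uses SciPy's gaussian_kde as a simplified stand-in for the paper's conditional KDE distributions; the data and names are illustrative.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

# Hypothetical two-node network X -> Y: parametric marginal for X,
# nonparametric conditional for Y via p(y | x) = kde_xy(x, y) / kde_x(x).
rng = np.random.default_rng(0)
x = rng.normal(0, 1, 1_000)
y = np.sin(x) + rng.normal(0, 0.3, 1_000)  # nonlinear: a Gaussian CPD would misfit

mu, sigma = x.mean(), x.std()              # parametric node: fitted Gaussian
kde_xy = gaussian_kde(np.vstack([x, y]))   # nonparametric joint over (X, Y)
kde_x = gaussian_kde(x)                    # its X marginal

def log_lik(xv, yv):
    log_px = norm.logpdf(xv, mu, sigma)
    log_py_given_x = np.log(kde_xy([xv, yv])) - np.log(kde_x(xv))
    return log_px + log_py_given_x

print(float(log_lik(0.5, np.sin(0.5))))    # high-density point
print(float(log_lik(0.5, 3.0)))            # low-density point
```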
arXiv Detail & Related papers (2021-09-07T11:47:32Z)
- Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks offer improved accuracy and a significant reduction in memory consumption.
However, they can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
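One standard way to obtain well-posedness, which the paper's non-Euclidean contraction conditions generalize, is to constrain the equilibrium map to be a contraction so that fixed-point iteration converges to a unique solution. A minimal NumPy sketch for z = tanh(Wz + Ux + b), with an illustrative ℓ∞-norm constraint on W:

```python
import numpy as np

# Solve the implicit layer z = tanh(W z + U x + b) by fixed-point iteration.
# Scaling W so that its infinity-norm is below 1 makes the map a contraction
# (tanh is 1-Lipschitz), guaranteeing a unique equilibrium.
rng = np.random.default_rng(0)
n, d = 8, 3
W = rng.normal(size=(n, n))
W *= 0.9 / np.abs(W).sum(axis=1).max()   # enforce ||W||_inf < 1
U = rng.normal(size=(n, d))
b = rng.normal(size=n)

x = rng.normal(size=d)
z = np.zeros(n)
for it in range(100):
    z_new = np.tanh(W @ z + U @ x + b)
    if np.abs(z_new - z).max() < 1e-10:  # converged to the equilibrium
        break
    z = z_new
print(f"converged in {it} iterations")
```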
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
- Nonparametric Score Estimators [49.42469547970041]
Estimating the score from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models.
We provide a unifying view of these estimators under the framework of regularized nonparametric regression.
We propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.
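As a concrete baseline for what a score estimator computes, the score ∇x log p̂(x) of a Gaussian KDE has a closed form: a kernel-weighted average of (xi − x)/h². A minimal sketch follows; the function and variable names are illustrative, not the paper's estimators.

```python
import numpy as np

# Closed-form score of a 1-D Gaussian KDE:
#   grad_x log phat(x) = sum_i w_i(x) * (x_i - x) / h^2,
# where w_i(x) are the normalized kernel weights at the query point x.
def kde_score(x_query, data, h):
    diff = data[None, :] - x_query[:, None]             # (queries, n)
    logw = -0.5 * (diff / h) ** 2
    w = np.exp(logw - logw.max(axis=1, keepdims=True))  # stable softmax weights
    w /= w.sum(axis=1, keepdims=True)
    return (w * diff).sum(axis=1) / h ** 2

rng = np.random.default_rng(0)
data = rng.normal(size=5_000)
xq = np.array([-1.0, 0.0, 1.0])
print(kde_score(xq, data, h=0.3))  # for N(0, 1) the true score is -x
```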
arXiv Detail & Related papers (2020-05-20T15:01:03Z)