Bayesian Multivariate Density-Density Regression
- URL: http://arxiv.org/abs/2504.12617v2
- Date: Mon, 22 Sep 2025 19:38:42 GMT
- Title: Bayesian Multivariate Density-Density Regression
- Authors: Khai Nguyen, Yang Ni, Peter Mueller
- Abstract summary: We introduce a novel and scalable Bayesian framework for multivariate density-density regression (DDR). Our approach addresses the critical issue of distributions residing in spaces of differing dimensions. We show that Bayesian DDR provides robust fits, superior predictive performance compared to traditional methods, and valuable insights into complex biological interactions.
- Score: 25.35298354797079
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel and scalable Bayesian framework for multivariate density-density regression (DDR), designed to model relationships between multivariate distributions. Our approach addresses the critical issue of distributions residing in spaces of differing dimensions. We utilize a generalized Bayes framework, circumventing the need for a fully specified likelihood by employing the sliced Wasserstein distance to measure the discrepancy between fitted and observed distributions. This choice not only handles high-dimensional data and varying sample sizes efficiently but also facilitates a Metropolis-adjusted Langevin algorithm (MALA) for posterior inference. Furthermore, we establish the posterior consistency of our generalized Bayesian approach, ensuring that the posterior distribution concentrates around the true parameters as the sample size increases. Through simulations and application to a population-scale single-cell dataset, we show that Bayesian DDR provides robust fits, superior predictive performance compared to traditional methods, and valuable insights into complex biological interactions.
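For readers unfamiliar with the sliced Wasserstein distance used as the discrepancy above, the following is a minimal NumPy sketch of a Monte Carlo estimator. It is illustrative only and not the authors' implementation; the number of projections, the quantile grid used to compare samples of different sizes, and the function name are assumptions of the sketch.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, p=2, n_quantiles=50, rng=None):
    """Monte Carlo estimate of the sliced Wasserstein-p distance between two
    empirical distributions given as sample matrices X (n x d) and Y (m x d)."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # Draw random projection directions uniformly on the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples onto every direction.
    X_proj = X @ theta.T   # shape (n, n_projections)
    Y_proj = Y @ theta.T   # shape (m, n_projections)
    # The 1-D Wasserstein distance compares sorted projections; a common
    # quantile grid handles unequal sample sizes (an approximation).
    qs = np.linspace(0.0, 1.0, n_quantiles)
    Xq = np.quantile(X_proj, qs, axis=0)
    Yq = np.quantile(Y_proj, qs, axis=0)
    return float(np.mean(np.abs(Xq - Yq) ** p) ** (1.0 / p))
```

In a generalized Bayes setup of the kind described in the abstract, a discrepancy like this would typically replace the log-likelihood in a Gibbs-style posterior, with MALA exploring the resulting energy landscape.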
Related papers
- Bayesian Multiple Multivariate Density-Density Regression [25.35298354797079]
We propose the first approach for multiple multivariate density-density regression (MDDR). We define a fitted distribution using a sliced Wasserstein barycenter (SWB) of push-forwards of the predictors. Regression functions, which map predictors' supports to the response support, and barycenter weights are inferred within a generalized Bayes framework. We demonstrate MDDR in an application to inference for population-scale single-cell data.
arXiv Detail & Related papers (2026-01-06T01:21:20Z) - Efficient Covariance Estimation for Sparsified Functional Data [51.69796254617083]
The proposed Random-knots (Random-knots-Spatial) and B-spline (Bspline-Spatial) estimators of the covariance function are computationally efficient. Asymptotic pointwise results for the covariance are obtained for sparsified individual trajectories under some regularity conditions.
arXiv Detail & Related papers (2025-11-23T00:50:33Z) - Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges [68.98973318553983]
We propose a framework based on Dual Diffusion Implicit Bridges (DDIB) to learn the mapping between different data distributions. We integrate gene regulatory network (GRN) information to propagate perturbation signals in a biologically meaningful way. We also incorporate a masking mechanism to predict silent genes, improving the quality of generated profiles.
arXiv Detail & Related papers (2025-06-26T09:05:38Z) - Generative Distribution Embeddings [1.3252809892089024]
We introduce generative distribution embeddings (GDE), a framework that lifts autoencoders to the space of distributions. In GDEs, an encoder acts on sets of samples, and the decoder is replaced by a generator which aims to match the input distribution. We apply GDEs to six key problems in computational biology.
arXiv Detail & Related papers (2025-05-23T17:58:57Z) - Likelihood-Free Adaptive Bayesian Inference via Nonparametric Distribution Matching [2.0319002824093015]
We propose Adaptive Bayesian Inference (ABI), a framework that bypasses traditional data-space discrepancies. ABI transforms the problem of measuring divergence between posterior distributions into a tractable sequence of conditional quantile regression tasks. We demonstrate that ABI significantly outperforms data-based Wasserstein, summary-based ABC, and state-of-the-art likelihood-free simulators.
arXiv Detail & Related papers (2025-05-07T17:50:14Z) - Robust and Scalable Variational Bayes [2.014089835498735]
We propose a robust framework for variational Bayes (VB) that effectively handles outliers and contamination of arbitrary nature in large datasets. Our approach divides the dataset into disjoint subsets, computes the posterior for each subset, and applies VB approximation independently to these posteriors. This novel aggregation method yields the Variational Median Posterior (VM-Posterior) distribution.
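To make the divide-and-aggregate idea concrete, here is a rough Python sketch under strong simplifying assumptions: each subset posterior is summarized only by its mean (via a user-supplied `fit_vb_mean`, a hypothetical helper), and the robust aggregation is a plain geometric median computed by Weiszfeld iterations. The actual VM-Posterior construction in the paper may differ.

```python
import numpy as np

def geometric_median(points, n_iter=100, eps=1e-8):
    """Weiszfeld iterations for the geometric median of the rows of `points`."""
    z = points.mean(axis=0)
    for _ in range(n_iter):
        dist = np.linalg.norm(points - z, axis=1)
        w = 1.0 / np.maximum(dist, eps)
        z = (w[:, None] * points).sum(axis=0) / w.sum()
    return z

def vm_style_estimate(data, n_subsets, fit_vb_mean):
    """Divide the data into disjoint subsets, run a VB fit on each subset
    (summarized here only by its posterior mean), and aggregate robustly."""
    subsets = np.array_split(data, n_subsets)
    subset_means = np.stack([fit_vb_mean(chunk) for chunk in subsets])
    return geometric_median(subset_means)
```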
arXiv Detail & Related papers (2025-04-16T23:20:43Z) - A Bayesian Approach Toward Robust Multidimensional Ellipsoid-Specific Fitting [0.0]
This work presents a novel and effective method for fitting multidimensional ellipsoids to scattered data contaminated by noise and outliers.
We incorporate a uniform prior distribution to constrain the search for primitive parameters within an ellipsoidal domain.
We apply it to a wide range of practical applications such as microscopy cell counting, 3D reconstruction, geometric shape approximation, and magnetometer calibration tasks.
arXiv Detail & Related papers (2024-07-27T14:31:51Z) - Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Generative inpainting of incomplete Euclidean distance matrices of trajectories generated by a fractional Brownian motion [46.1232919707345]
Fractional Brownian motion (fBm) features both randomness and strong scale-free correlations.
Here we examine a zoo of diffusion-based inpainting methods on a specific dataset of corrupted images.
We find that the conditional diffusion generation readily reproduces the built-in correlations of fBm paths in different memory regimes.
arXiv Detail & Related papers (2024-04-10T14:22:16Z) - TIC-TAC: A Framework for Improved Covariance Estimation in Deep Heteroscedastic Regression [109.69084997173196]
Deep heteroscedastic regression involves jointly optimizing the mean and covariance of the predicted distribution using the negative log-likelihood.
Recent works show that this may result in sub-optimal convergence due to the challenges associated with covariance estimation.
We study two questions: (1) Does the predicted covariance truly capture the randomness of the predicted mean?
Our results show that not only does TIC accurately learn the covariance, it additionally facilitates an improved convergence of the negative log-likelihood.
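The joint mean/covariance objective referred to above is the multivariate Gaussian negative log-likelihood. A minimal sketch follows, assuming the network outputs a mean vector and a lower-triangular Cholesky factor of the covariance (a common parameterization, not necessarily the paper's TIC construction):

```python
import numpy as np

def gaussian_nll(y, mu, L):
    """Negative log-likelihood of observation y under N(mu, Sigma),
    with Sigma = L @ L.T and L the predicted lower-triangular Cholesky factor."""
    d = y.shape[-1]
    resid = np.linalg.solve(L, y - mu)            # L^{-1} (y - mu)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))    # log|Sigma|
    quad = resid @ resid                          # (y-mu)^T Sigma^{-1} (y-mu)
    return 0.5 * (quad + log_det + d * np.log(2.0 * np.pi))
```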
arXiv Detail & Related papers (2023-10-29T09:54:03Z) - Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariable, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z) - Graph Fourier MMD for Signals on Graphs [67.68356461123219]
We propose Graph Fourier MMD (GFMMD), a novel distance between distributions and signals on graphs.
GFMMD is defined via an optimal witness function that is both smooth on the graph and maximizes the difference in expectation.
We showcase it on graph benchmark datasets as well as on single cell RNA-sequencing data analysis.
arXiv Detail & Related papers (2023-06-05T00:01:17Z) - On counterfactual inference with unobserved confounding [36.18241676876348]
Given an observational study with $n$ independent but heterogeneous units, our goal is to learn the counterfactual distribution for each unit.
We introduce a convex objective that pools all $n$ samples to jointly learn all $n$ parameter vectors.
We derive sufficient conditions for compactly supported distributions to satisfy the logarithmic Sobolev inequality.
arXiv Detail & Related papers (2022-11-14T04:14:37Z) - Optimal Scaling for Locally Balanced Proposals in Discrete Spaces [65.14092237705476]
We show that the efficiency of Metropolis-Hastings (M-H) algorithms in discrete spaces can be characterized by an acceptance rate that is independent of the target distribution.
Knowledge of the optimal acceptance rate allows one to automatically tune the neighborhood size of a proposal distribution in a discrete space, directly analogous to step-size control in continuous spaces.
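A toy sketch of acceptance-rate-driven tuning in this spirit is shown below; the Robbins-Monro update, the `sample_step` callback, and the 0.574 default target are assumptions of the sketch rather than the paper's derived optimum.

```python
import numpy as np

def tune_neighborhood(sample_step, radius, target_rate=0.574,
                      n_adapt=1000, lr=0.05, rng=None):
    """Adapt an integer neighborhood radius so the empirical M-H acceptance
    rate approaches `target_rate`. `sample_step(radius, rng)` is assumed to
    run one M-H step and return True if the proposal was accepted."""
    rng = np.random.default_rng(rng)
    log_radius = np.log(radius)
    for t in range(1, n_adapt + 1):
        accepted = sample_step(max(1, int(round(np.exp(log_radius)))), rng)
        # Robbins-Monro update: enlarge the neighborhood when accepting too
        # often, shrink it when accepting too rarely.
        log_radius += (lr / np.sqrt(t)) * (float(accepted) - target_rate)
    return max(1, int(round(np.exp(log_radius))))
```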
arXiv Detail & Related papers (2022-09-16T22:09:53Z) - Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over distributions' properties, such as parameters, symmetry, and modality, yields a family of flexible distributions.
We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
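For context, the classic binary DRE construction that this framework generalizes estimates a density ratio from the odds of a probabilistic classifier. A minimal scikit-learn sketch (a textbook recipe, not this paper's multi-distribution method):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def binary_density_ratio(x_p, x_q):
    """Estimate r(x) = p(x) / q(x) from samples of p and q via the odds of a
    probabilistic classifier trained to separate the two sample sets."""
    X = np.vstack([x_p, x_q])
    y = np.concatenate([np.ones(len(x_p)), np.zeros(len(x_q))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    def ratio(x):
        proba = clf.predict_proba(x)      # columns ordered [class 0 (q), class 1 (p)]
        return (len(x_q) / len(x_p)) * proba[:, 1] / proba[:, 0]

    return ratio
```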
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z) - $k$-Variance: A Clustered Notion of Variance [23.57925128327]
We introduce $k$-variance, a generalization of variance built on the machinery of random bipartite matchings.
We provide in-depth analysis of this quantity in several key cases, including one-dimensional measures, clustered measures, and measures concentrated on low-dimensional subsets.
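A rough Monte Carlo sketch of the bipartite-matching machinery is given below, using SciPy's assignment solver; the squared-Euclidean cost, the number of repeats, and the omission of the paper's exact normalization constants are assumptions of this illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def k_variance_estimate(samples, k, n_repeats=200, rng=None):
    """Monte Carlo estimate of a k-variance-style quantity: the average
    optimal bipartite matching cost between two disjoint k-point subsamples."""
    rng = np.random.default_rng(rng)
    n = samples.shape[0]
    costs = []
    for _ in range(n_repeats):
        idx = rng.choice(n, size=2 * k, replace=False)
        X, Y = samples[idx[:k]], samples[idx[k:]]
        # Squared Euclidean cost between the two point clouds.
        C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
        rows, cols = linear_sum_assignment(C)
        costs.append(C[rows, cols].mean())
    # Normalization constants in the paper's definition may differ.
    return float(np.mean(costs))
```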
arXiv Detail & Related papers (2020-12-13T04:25:32Z) - Nonlinear Distribution Regression for Remote Sensing Applications [6.664736150040092]
In many remote sensing applications one wants to estimate variables or parameters of interest from observations.
Standard algorithms such as neural networks, random forests, or Gaussian processes are readily available to relate the two.
This paper introduces a nonlinear (kernel-based) method for distribution regression that solves the previous problems without making any assumption on the statistics of the grouped data.
arXiv Detail & Related papers (2020-12-07T22:04:43Z) - Information-Theoretic Bounds on Transfer Generalization Gap Based on Jensen-Shannon Divergence [42.275148861039895]
In transfer learning, training and testing data sets are drawn from different data distributions.
This work presents novel information-theoretic upper bounds on the average transfer generalization gap.
arXiv Detail & Related papers (2020-10-13T11:03:25Z) - Linear Optimal Transport Embedding: Provable Wasserstein classification for certain rigid transformations and perturbations [79.23797234241471]
Discriminating between distributions is an important problem in a number of scientific fields.
The Linear Optimal Transportation (LOT) embeds the space of distributions into an $L^2$-space.
We demonstrate the benefits of LOT on a number of distribution classification problems.
arXiv Detail & Related papers (2020-08-20T19:09:33Z) - Large scale analysis of generalization error in learning using margin based classification methods [2.436681150766912]
We derive the expression for the generalization error of a family of large-margin classifiers in the limit where both the sample size $n$ and the dimension $p$ grow large.
For two-layer neural networks, we reproduce the recently developed 'double descent' phenomenology for several classification models.
arXiv Detail & Related papers (2020-07-16T20:31:26Z) - VAE-KRnet and its applications to variational Bayes [4.9545850065593875]
We have proposed a generative model, called VAE-KRnet, for density estimation or approximation.
The VAE is used as a dimension reduction technique to capture the latent space, and KRnet is used to model the distribution of the latent variable.
VAE-KRnet can be used as a density model to approximate either data distribution or an arbitrary probability density function.
arXiv Detail & Related papers (2020-06-29T23:14:36Z) - Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning [175.34232468746245]
We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
arXiv Detail & Related papers (2020-02-20T22:28:53Z)