Uncertainty-Aware PCA for Arbitrarily Distributed Data Modeled by Gaussian Mixture Models
- URL: http://arxiv.org/abs/2508.13990v1
- Date: Tue, 19 Aug 2025 16:31:41 GMT
- Title: Uncertainty-Aware PCA for Arbitrarily Distributed Data Modeled by Gaussian Mixture Models
- Authors: Daniel Klötzl, Ozan Tastekin, David Hägele, Marina Evers, Daniel Weiskopf
- Abstract summary: Multidimensional data is often associated with uncertainties that are not well-described by normal distributions. We show how multidimensional distributions can be projected to a low-dimensional space using uncertainty-aware principal component analysis (UAPCA). We evaluate our approach by comparing the distributions in low-dimensional space obtained by our method and UAPCA to those obtained by sample-based projections.
- Score: 7.664298995468921
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multidimensional data is often associated with uncertainties that are not well-described by normal distributions. In this work, we describe how such distributions can be projected to a low-dimensional space using uncertainty-aware principal component analysis (UAPCA). We propose to model multidimensional distributions using Gaussian mixture models (GMMs) and derive the projection from a general formulation that allows projecting arbitrary probability density functions. The low-dimensional projections of the densities exhibit more details about the distributions and represent them more faithfully compared to UAPCA mappings. Further, we support including user-defined weights between the different distributions, which allows for varying the importance of the multidimensional distributions. We evaluate our approach by comparing the distributions in low-dimensional space obtained by our method and UAPCA to those obtained by sample-based projections.
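To make the pipeline concrete, here is a minimal sketch of the core idea as described in the abstract. It assumes the projection is taken from the top eigenvectors of the mixture's total covariance (via the law of total covariance) and that each Gaussian component is pushed through the resulting linear map; the function name `project_gmm` and the `importance` argument (standing in for the user-defined weights mentioned above) are illustrative assumptions, not the authors' implementation:

```python
# Minimal sketch: project an n-D Gaussian mixture model (GMM) to d dims.
# Assumptions (not taken from the paper): the projection comes from the
# top eigenvectors of the mixture's total covariance, and `importance`
# models the user-defined per-distribution weights from the abstract.
import numpy as np

def project_gmm(weights, means, covs, d=2, importance=None):
    """Project a K-component GMM in n dims to d dims.

    weights: (K,) mixture weights summing to 1
    means:   (K, n) component means
    covs:    (K, n, n) component covariances
    importance: optional (K,) user-defined weights (assumed mechanism)
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    w = weights if importance is None else weights * np.asarray(importance)
    w = w / w.sum()

    # Law of total covariance: total = E[within] + covariance of the means.
    mu = w @ means                                       # (n,)
    centered = means - mu                                # (K, n)
    between = np.einsum("k,ki,kj->ij", w, centered, centered)
    within = np.einsum("k,kij->ij", w, covs)
    total = within + between

    # Columns of W are the top-d eigenvectors (descending eigenvalues).
    _, evecs = np.linalg.eigh(total)
    W = evecs[:, ::-1][:, :d]                            # (n, d)

    # A linear image of a Gaussian is Gaussian, so the projected mixture
    # is again a GMM with means W^T mu_k and covariances W^T Sigma_k W.
    proj_means = means @ W                               # (K, d)
    proj_covs = np.einsum("ni,knm,mj->kij", W, covs, W)  # (K, d, d)
    return weights, proj_means, proj_covs, W

if __name__ == "__main__":
    # Toy 3-D mixture with two components, projected to 2-D.
    w = [0.6, 0.4]
    mu = [[0.0, 0.0, 0.0], [3.0, 1.0, -1.0]]
    cov = [np.diag([1.0, 0.5, 0.2]), np.diag([0.3, 1.5, 0.4])]
    _, pm, pc, W = project_gmm(w, mu, cov, d=2)
    print("projected means:\n", pm)
```

Because an affine image of a Gaussian is again Gaussian, the projected mixture is itself a GMM whose low-dimensional density can be evaluated in closed form; a sample-based check in the spirit of the evaluation above is to draw points from the original mixture, map them with the returned `W`, and compare their empirical density to the projected GMM.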
Related papers
- Nonparametric Estimation of Joint Entropy through Partitioned Sample-Spacing Method [0.0]
We propose a nonparametric estimator of multivariate joint entropy based on partitioned sample spacings (PSS). PSS consistently outperforms k-nearest neighbor estimators and achieves accuracy competitive with recent normalizing flow-based methods. These properties position PSS as a practical alternative to normalizing flow-based approaches, with broad potential in information-theoretic machine learning applications.
arXiv Detail & Related papers (2025-11-17T17:05:34Z) - Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers. We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - Distributional Matrix Completion via Nearest Neighbors in the Wasserstein Space [8.971989179518216]
Given a sparsely observed matrix of empirical distributions, we seek to impute the true distributions associated with both observed and unobserved matrix entries. We utilize tools from optimal transport to generalize the nearest neighbors method to the distributional setting.
arXiv Detail & Related papers (2024-10-17T00:50:17Z) - Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data [2.6499018693213316]
We introduce a novel generative model for the representation of joint probability distributions of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions.
arXiv Detail & Related papers (2024-06-06T21:58:33Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Scalable Counterfactual Distribution Estimation in Multivariate Causal Models [12.88471300865496]
We consider the problem of estimating the counterfactual joint distribution of multiple quantities of interest in a multivariate causal model.
We propose a method that alleviates both issues simultaneously by leveraging a robust latent one-dimensional subspace.
We demonstrate the advantages of our approach over existing methods on both synthetic and real-world data.
arXiv Detail & Related papers (2023-11-02T01:45:44Z) - Classification of Heavy-tailed Features in High Dimensions: a Superstatistical Approach [1.4469725791865984]
We characterise the learning of a mixture of two clouds of data points with generic centroids.
We study the generalisation performance of the obtained estimator, analyse the role of regularisation, and analytically characterise the separability transition.
arXiv Detail & Related papers (2023-04-06T07:53:05Z) - Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z) - Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over distributions' properties, such as parameters, symmetry, and modality, yields a family of flexible distributions.
We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z) - Personalized Trajectory Prediction via Distribution Discrimination [78.69458579657189]
Trajectory prediction is confronted with the dilemma of capturing the multi-modal nature of future dynamics.
We present a distribution discrimination (DisDis) method to predict personalized motion patterns.
Our method can be integrated with existing multi-modal predictive models as a plug-and-play module.
arXiv Detail & Related papers (2021-07-29T17:42:12Z) - A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
arXiv Detail & Related papers (2021-06-18T08:33:45Z) - A likelihood approach to nonparametric estimation of a singular distribution using deep generative models [4.329951775163721]
We investigate a likelihood approach to nonparametric estimation of a singular distribution using deep generative models.
We prove that a novel and effective solution exists by perturbing the data with an instance noise.
We also characterize the class of distributions that can be efficiently estimated via deep generative models.
arXiv Detail & Related papers (2021-05-09T23:13:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.