Dimensionality Reduction as Probabilistic Inference
- URL: http://arxiv.org/abs/2304.07658v2
- Date: Wed, 24 May 2023 21:56:08 GMT
- Title: Dimensionality Reduction as Probabilistic Inference
- Authors: Aditya Ravuri, Francisco Vargas, Vidhi Lalchand, Neil D. Lawrence
- Abstract summary: Dimensionality reduction (DR) algorithms compress high-dimensional data into a lower dimensional representation while preserving important features of the data.
We introduce the ProbDR variational framework, which interprets a wide range of classical DR algorithms as probabilistic inference algorithms.
- Score: 10.714603218784175
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Dimensionality reduction (DR) algorithms compress high-dimensional data into
a lower dimensional representation while preserving important features of the
data. DR is a critical step in many analysis pipelines as it enables
visualisation, noise reduction and efficient downstream processing of the data.
In this work, we introduce the ProbDR variational framework, which interprets a
wide range of classical DR algorithms as probabilistic inference algorithms.
ProbDR encompasses PCA, CMDS, LLE, LE, MVU, diffusion maps,
kPCA, Isomap, (t-)SNE, and UMAP. In our framework, a low-dimensional latent
variable is used to construct a covariance, precision, or a graph Laplacian
matrix, which can be used as part of a generative model for the data. Inference
is done by optimizing an evidence lower bound. We demonstrate the internal
consistency of our framework and show that it enables the use of probabilistic
programming languages (PPLs) for DR. Additionally, we illustrate that the
framework facilitates reasoning about unseen data and argue that our generative
models approximate Gaussian processes (GPs) on manifolds. By providing a
unified view of DR, our framework facilitates communication, reasoning about
uncertainties, model composition, and extensions, particularly when domain
knowledge is present.
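As a concrete illustration of the inference scheme sketched in the abstract, the snippet below (not the authors' code; the function name, optimiser settings, and hyperparameters are illustrative) implements the PCA/CMDS-style instance of this idea: a low-dimensional latent X parameterises a covariance C = XX^T + sigma^2 I over the data points, each data feature is modelled as a zero-mean Gaussian with that covariance, and the latent coordinates are fitted by gradient-based optimisation of the log marginal likelihood, which is what an ELBO reduces to under a point-mass (delta) variational posterior on X with a flat prior.

```python
# Minimal sketch, assuming the PCA-like case where a latent X builds the
# covariance C = X X^T + sigma^2 I of a zero-mean Gaussian model per feature.
import torch

def probdr_pca(Y, q=2, n_steps=500, lr=1e-2):
    """Y: (n_points, n_features) data tensor; returns a q-dimensional embedding X."""
    n = Y.shape[0]
    Yc = Y - Y.mean(dim=0)                           # centre the data
    X = torch.randn(n, q, requires_grad=True)        # latent coordinates to learn
    log_sigma = torch.zeros((), requires_grad=True)  # log noise scale
    opt = torch.optim.Adam([X, log_sigma], lr=lr)
    for _ in range(n_steps):
        opt.zero_grad()
        # Covariance over the n points, parameterised by the latent X (+ jitter)
        C = X @ X.T + (torch.exp(2 * log_sigma) + 1e-6) * torch.eye(n)
        dist = torch.distributions.MultivariateNormal(torch.zeros(n),
                                                      covariance_matrix=C)
        # One zero-mean Gaussian likelihood term per data feature (column of Yc)
        loss = -dist.log_prob(Yc.T).sum()
        loss.backward()
        opt.step()
    return X.detach()

# Example usage on synthetic data:
# X = probdr_pca(torch.randn(100, 20), q=2)
```

In the unified view described above, the other algorithms (LE, Isomap, t-SNE, UMAP, etc.) would correspond to different choices of how the latent variable constructs the covariance, precision, or graph Laplacian matrix and of the variational family, which is also where probabilistic programming languages become convenient.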
Related papers
- Out-of-Core Dimensionality Reduction for Large Data via Out-of-Sample Extensions [8.368145000145594]
Dimensionality reduction (DR) is a well-established approach for the visualization of high-dimensional data sets.
We propose the use of out-of-sample extensions to perform DR on large data sets.
We provide an evaluation of the projection quality of five common DR algorithms.
arXiv Detail & Related papers (2024-08-07T23:30:53Z)
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches through the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
- Directed Cyclic Graph for Causal Discovery from Multivariate Functional Data [15.26007975367927]
We introduce a functional linear structural equation model for causal structure learning.
To enhance interpretability, our model involves a low-dimensional causal embedded space.
We prove that the proposed model is causally identifiable under standard assumptions.
arXiv Detail & Related papers (2023-10-31T15:19:24Z)
- Simultaneous Dimensionality Reduction: A Data Efficient Approach for Multimodal Representations Learning [0.0]
We explore two primary classes of approaches to dimensionality reduction (DR): Independent Dimensionality Reduction (IDR) and Simultaneous Dimensionality Reduction (SDR).
In IDR, each modality is compressed independently, striving to retain as much variation within each modality as possible.
In SDR, one simultaneously compresses the modalities to maximize the covariation between the reduced descriptions while paying less attention to how much individual variation is preserved.
arXiv Detail & Related papers (2023-10-05T04:26:24Z)
- Feature Learning for Dimensionality Reduction toward Maximal Extraction of Hidden Patterns [25.558967594684056]
Dimensionality reduction (DR) plays a vital role in the visual analysis of high-dimensional data.
This paper presents a feature learning framework, FEALM, designed to generate an optimized set of data projections for nonlinear DR.
We develop interactive visualizations to assist comparison of the obtained DR results and interpretation of each result.
arXiv Detail & Related papers (2022-06-28T11:18:19Z)
- Design of Compressed Sensing Systems via Density-Evolution Framework for Structure Recovery in Graphical Models [10.667885727418705]
It has been shown that learning the structure of Bayesian networks from observational data is an NP-Hard problem.
We propose a novel density-evolution based framework for optimizing compressed linear measurement systems.
We show that the structure of a Gaussian Bayesian network (GBN) can indeed be recovered from the resulting compressed measurements.
arXiv Detail & Related papers (2022-03-17T22:16:38Z)
- BCDAG: An R package for Bayesian structure and Causal learning of Gaussian DAGs [77.34726150561087]
We introduce BCDAG, an R package for causal discovery and causal effect estimation from observational data.
Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, the number of variables in the dataset.
We then illustrate the main functions and algorithms on both real and simulated datasets.
arXiv Detail & Related papers (2022-01-28T09:30:32Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG).
Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data.
We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
arXiv Detail & Related papers (2021-12-06T03:35:21Z)
- Manifold Topology Divergence: a Framework for Comparing Data Manifolds [109.0784952256104]
We develop a framework for comparing data manifolds, aimed at the evaluation of deep generative models.
Based on the Cross-Barcode, we introduce the Manifold Topology Divergence score (MTop-Divergence).
We demonstrate that the MTop-Divergence accurately detects various degrees of mode-dropping, intra-mode collapse, mode invention, and image disturbance.
arXiv Detail & Related papers (2021-06-08T00:30:43Z)
- Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPNs).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.