Gaussian Determinantal Processes: a new model for directionality in data
- URL: http://arxiv.org/abs/2111.09990v1
- Date: Fri, 19 Nov 2021 00:57:33 GMT
- Title: Gaussian Determinantal Processes: a new model for directionality in data
- Authors: Subhro Ghosh, Philippe Rigollet
- Abstract summary: In this work, we investigate a parametric family of Gaussian DPPs with a clearly interpretable effect of parametric modulation on the observed points.
We show that parameter modulation impacts the observed points by introducing directionality in their repulsion structure, and the principal directions correspond to the directions of maximal dependency.
This model readily yields a novel and viable alternative to Principal Component Analysis (PCA) as a dimension reduction tool that favors directions along which the data is most spread out.
- Score: 10.591948377239921
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Determinantal point processes (a.k.a. DPPs) have recently become popular
tools for modeling the phenomenon of negative dependence, or repulsion, in
data. However, our understanding of an analogue of a classical parametric
statistical theory is rather limited for this class of models. In this work, we
investigate a parametric family of Gaussian DPPs with a clearly interpretable
effect of parametric modulation on the observed points. We show that parameter
modulation impacts the observed points by introducing directionality in their
repulsion structure, and the principal directions correspond to the directions
of maximal (i.e. the most long ranged) dependency.
This model readily yields a novel and viable alternative to Principal
Component Analysis (PCA) as a dimension reduction tool that favors directions
along which the data is most spread out. This methodological contribution is
complemented by a statistical analysis of a spiked model similar to that
employed for covariance matrices as a framework to study PCA. These theoretical
investigations unveil intriguing questions for further examination in random
matrix theory, stochastic geometry and related topics.
Related papers
- Induced Covariance for Causal Discovery in Linear Sparse Structures [55.2480439325792]
Causal models seek to unravel the cause-effect relationships among variables from observed data.
This paper introduces a novel causal discovery algorithm designed for settings in which variables exhibit linearly sparse relationships.
arXiv Detail & Related papers (2024-10-02T04:01:38Z) - Model-Based Reparameterization Policy Gradient Methods: Theory and
Practical Algorithms [88.74308282658133]
Reization (RP) Policy Gradient Methods (PGMs) have been widely adopted for continuous control tasks in robotics and computer graphics.
Recent studies have revealed that, when applied to long-term reinforcement learning problems, model-based RP PGMs may experience chaotic and non-smooth optimization landscapes.
We propose a spectral normalization method to mitigate the exploding variance issue caused by long model unrolls.
arXiv Detail & Related papers (2023-10-30T18:43:21Z) - SLEM: Machine Learning for Path Modeling and Causal Inference with Super
Learner Equation Modeling [3.988614978933934]
Causal inference is a crucial goal of science, enabling researchers to arrive at meaningful conclusions using observational data.
Path models, Structural Equation Models (SEMs) and Directed Acyclic Graphs (DAGs) provide a means to unambiguously specify assumptions regarding the causal structure underlying a phenomenon.
We propose Super Learner Equation Modeling, a path modeling technique integrating machine learning Super Learner ensembles.
arXiv Detail & Related papers (2023-08-08T16:04:42Z) - Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z) - On the Influence of Enforcing Model Identifiability on Learning dynamics
of Gaussian Mixture Models [14.759688428864159]
We propose a technique for extracting submodels from singular models.
Our method enforces model identifiability during training.
We show how the method can be applied to more complex models like deep neural networks.
arXiv Detail & Related papers (2022-06-17T07:50:22Z) - Mixed Effects Neural ODE: A Variational Approximation for Analyzing the
Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z) - Estimation of Bivariate Structural Causal Models by Variational Gaussian
Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z) - Surrogate-based variational data assimilation for tidal modelling [0.0]
Data assimilation (DA) is widely used to combine physical knowledge and observations.
In a context of climate change, old calibrations can not necessarily be used for new scenarios.
This raises the question of DA computational cost.
Two methods are proposed to replace the complex model by a surrogate.
arXiv Detail & Related papers (2021-06-08T07:39:38Z) - Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z) - Factor Analysis, Probabilistic Principal Component Analysis, Variational
Inference, and Variational Autoencoder: Tutorial and Survey [5.967999555890417]
This tutorial and survey paper on factor analysis, probabilistic Principal Component Analysis (PCA), variational inference, and Variational Autoencoder (VAE)
They asssume that every data point is generated from or caused by a low-dimensional latent factor.
For their inference and generative behaviour, these models can also be used for generation of new data points in the data space.
arXiv Detail & Related papers (2021-01-04T01:29:09Z) - Predicting Multidimensional Data via Tensor Learning [0.0]
We develop a model that retains the intrinsic multidimensional structure of the dataset.
To estimate the model parameters, an Alternating Least Squares algorithm is developed.
The proposed model is able to outperform benchmark models present in the forecasting literature.
arXiv Detail & Related papers (2020-02-11T11:57:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.