Statistical modeling of categorical trajectories with multivariate functional principal components
- URL: http://arxiv.org/abs/2502.09986v1
- Date: Fri, 14 Feb 2025 08:15:17 GMT
- Title: Statistical modeling of categorical trajectories with multivariate functional principal components
- Authors: Hervé Cardot, Caroline Peltier,
- Abstract summary: We turn the problem of statistical modeling of a categorical process into a multivariate functional data analysis issue.
We show how it can be easily extended to analyze experiments, such as temporal check-all-that-apply, in which two states or more can be observed at the same time.
- Score: 0.0
- License:
- Abstract: There are many examples in which the statistical units of interest are samples of a continuous time categorical random process, that is to say a continuous time stochastic process taking values in a finite state space. Without loosing any information, we associate to each state a binary random function, taking values in $\{0,1\}$, and turn the problem of statistical modeling of a categorical process into a multivariate functional data analysis issue. The (multivariate) covariance operator has nice interpretations in terms of departure from independence of the joint probabilities and the multivariate functional principal components are simple to interpret. Under the weak hypothesis assuming only continuity in probability of the $0-1$ trajectories, it is simple to build consistent estimators of the covariance kernel and perform multivariate functional principal components analysis. The sample paths being piecewise constant, with a finite number of jumps, this a rare case in functional data analysis in which the trajectories can be observed exhaustively. The approach is illustrated on a data set of sensory perceptions, considering different gustometer-controlled stimuli experiments. We show how it can be easily extended to analyze experiments, such as temporal check-all-that-apply, in which two states or more can be observed at the same time.
Related papers
- High-dimensional logistic regression with missing data: Imputation, regularization, and universality [7.167672851569787]
We study high-dimensional, ridge-regularized logistic regression.
We provide exact characterizations of both the prediction error and the estimation error.
arXiv Detail & Related papers (2024-10-01T21:41:21Z) - Logistic-beta processes for dependent random probabilities with beta marginals [58.91121576998588]
We propose a novel process called the logistic-beta process, whose logistic transformation yields a process with common beta marginals.
It can model dependence on both discrete and continuous domains, such as space or time, and has a flexible dependence structure through correlation kernels.
We illustrate the benefits through nonparametric binary regression and conditional density estimation examples, both in simulation studies and in a pregnancy outcome application.
arXiv Detail & Related papers (2024-02-10T21:41:32Z) - Approximating Counterfactual Bounds while Fusing Observational, Biased
and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Equivariance Allows Handling Multiple Nuisance Variables When Analyzing
Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z) - Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z) - Binary Independent Component Analysis via Non-stationarity [7.283533791778359]
We consider independent component analysis of binary data.
We start by assuming a linear mixing model in a continuous-valued latent space, followed by a binary observation model.
In stark contrast to the continuous-valued case, we prove non-identifiability of the model with few observed variables.
arXiv Detail & Related papers (2021-11-30T14:23:53Z) - Finite-Time Convergence Rates of Decentralized Stochastic Approximation
with Applications in Multi-Agent and Multi-Task Learning [16.09467599829253]
We study a data-driven approach for finding the root of an operator under noisy measurements.
A network of agents, each with its own operator and data observations, cooperatively find the fixed point of the aggregate operator over a decentralized communication graph.
Our main contribution is to provide a finite-time analysis of this decentralized approximation method when the data observed at each agent are sampled from a Markov process.
arXiv Detail & Related papers (2020-10-28T17:01:54Z) - Stable Prediction via Leveraging Seed Variable [73.9770220107874]
Previous machine learning methods might exploit subtly spurious correlations in training data induced by non-causal variables for prediction.
We propose a conditional independence test based algorithm to separate causal variables with a seed variable as priori, and adopt them for stable prediction.
Our algorithm outperforms state-of-the-art methods for stable prediction.
arXiv Detail & Related papers (2020-06-09T06:56:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.