MMM: Clustering Multivariate Longitudinal Mixed-type Data
- URL: http://arxiv.org/abs/2509.12166v1
- Date: Mon, 15 Sep 2025 17:30:31 GMT
- Title: MMM: Clustering Multivariate Longitudinal Mixed-type Data
- Authors: Francesco Amato, Julien Jacques
- Abstract summary: We introduce the Mixture of Mixed-Matrices (MMM) model. The model is able to handle continuous, ordinal, binary, nominal and count data. A real-world application on financial data is presented.
- Score: 0.2578242050187029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multivariate longitudinal data of mixed type are increasingly collected in many scientific domains. However, algorithms to cluster this kind of data remain scarce, owing to the difficulty of simultaneously modeling the within- and between-time dependence structures of multivariate mixed-type data. We introduce the Mixture of Mixed-Matrices (MMM) model: by reorganizing the data in a three-way structure and assuming that the non-continuous variables are observations of underlying latent continuous variables, the model relies on a mixture of matrix-variate normal distributions to perform clustering in the latent dimension. The MMM model is thus able to handle continuous, ordinal, binary, nominal and count data and to concurrently model the heterogeneity, the association among the responses and the temporal dependence structure in a parsimonious way, without assuming conditional independence. Inference is carried out through an MCMC-EM algorithm, which is detailed. An evaluation on synthetic data demonstrates the model's inference abilities. A real-world application to financial data is presented.
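The clustering step described in the abstract, a mixture of matrix-variate normal distributions, can be illustrated with a minimal sketch. This is not the authors' MCMC-EM algorithm: it assumes fully continuous data, treats the mixture parameters as already known, and only computes the posterior cluster probabilities for each observation. The layout (rows are variables, columns are time points), the function name `responsibilities` and the toy parameters are all assumptions made for illustration.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import matrix_normal


def responsibilities(X, weights, means, row_covs, col_covs):
    """Posterior cluster probabilities under a matrix-variate normal mixture.

    X has shape (n, p, T): n units, p variables (rows), T time points (columns).
    Returns an (n, K) array whose rows sum to 1.
    """
    n, K = X.shape[0], len(weights)
    log_post = np.empty((n, K))
    for k in range(K):
        dist = matrix_normal(mean=means[k], rowcov=row_covs[k], colcov=col_covs[k])
        # log(pi_k) + log density of each p x T matrix under component k
        log_post[:, k] = np.log(weights[k]) + dist.logpdf(X)
    # Normalize in log space for numerical stability
    return np.exp(log_post - logsumexp(log_post, axis=1, keepdims=True))


# Toy data (hypothetical): two well-separated clusters, p = 3 variables, T = 4 times
p, T, n_per = 3, 4, 20
means = [np.zeros((p, T)), np.full((p, T), 5.0)]
row_covs = [np.eye(p), np.eye(p)]   # association among the variables
col_covs = [np.eye(T), np.eye(T)]   # temporal dependence
weights = [0.5, 0.5]

X = np.concatenate([
    matrix_normal.rvs(mean=means[k], rowcov=row_covs[k], colcov=col_covs[k],
                      size=n_per, random_state=k)
    for k in range(2)
])
labels = np.repeat([0, 1], n_per)

R = responsibilities(X, weights, means, row_covs, col_covs)
assignments = R.argmax(axis=1)
```

Each component uses a separate row covariance (between variables) and column covariance (between time points), which is what makes the matrix-variate form more parsimonious than an unstructured covariance on the vectorized p·T observations. In the model described above, the non-continuous variables would first be mapped to latent continuous values, a step this sketch omits.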
Related papers
- UniDiff: A Unified Diffusion Framework for Multimodal Time Series Forecasting [90.47915032778366]
We propose UniDiff, a unified diffusion framework for multimodal time series forecasting. At its core lies a unified and parallel fusion module, where a single cross-attention mechanism integrates structural information from timestamps and semantic context from texts. Experiments on real-world benchmark datasets across eight domains demonstrate that the proposed UniDiff model achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-12-08T05:36:14Z)
- Beyond Marginals: Learning Joint Spatio-Temporal Patterns for Multivariate Anomaly Detection [2.893006778402251]
In time series data, an anomaly may be indicated by the simultaneous deviation of interrelated time series. Our approach addresses this by modeling joint dependencies in the latent space.
arXiv Detail & Related papers (2025-09-18T14:57:55Z)
- Hybrid Bernstein Normalizing Flows for Flexible Multivariate Density Regression with Interpretable Marginals [3.669506968635671]
Density regression models allow a comprehensive understanding of data by modeling the complete conditional probability distribution. In this paper, we combine MCTM with state-of-the-art and autoregressive NF to leverage the transparency of MCTM for modeling interpretable feature effects. We demonstrate our method's versatility in various numerical experiments and compare it with MCTM and other NF models on both simulated and real-world data.
arXiv Detail & Related papers (2025-05-20T10:17:07Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models [65.82630283336051]
We show that the space spanned by the combination of dimensions and attributes is insufficiently sampled by existing training schemes of diffusion generative models.
We present a simple fix to this problem by constructing processes that fully exploit the structures, hence the name ComboStoc.
arXiv Detail & Related papers (2024-05-22T15:23:10Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Mixture of Coupled HMMs for Robust Modeling of Multivariate Healthcare Time Series [7.5986411724707095]
We propose a novel class of models, a mixture of coupled hidden Markov models (M-CHMM).
To make the model learning feasible, we derive two algorithms to sample the sequences of the latent variables in the CHMM.
Compared to existing inference methods, our algorithms are computationally tractable, improve mixing, and allow for likelihood estimation.
arXiv Detail & Related papers (2023-11-14T02:55:37Z)
- Latent Processes Identification From Multi-View Time Series [17.33428123777779]
We propose a novel framework that employs the contrastive learning technique to invert the data generative process for enhanced identifiability.
MuLTI integrates a permutation mechanism that merges corresponding overlapped variables by establishing an optimal transport formula.
arXiv Detail & Related papers (2023-05-14T14:21:58Z)
- Mixed data Deep Gaussian Mixture Model: A clustering model for mixed datasets [0.0]
We introduce a model-based clustering method called Mixed Deep Gaussian Mixture Model (MDGMM).
This architecture is flexible and can be adapted to mixed as well as to continuous or non-continuous data.
Our model provides continuous low-dimensional representations of the data which can be a useful tool to visualize mixed datasets.
arXiv Detail & Related papers (2020-10-13T19:52:46Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- Graph Gamma Process Generalized Linear Dynamical Systems [60.467040479276704]
We introduce graph gamma process (GGP) linear dynamical systems to model real multivariate time series.
For temporal pattern discovery, the latent representation under the model is used to decompose the time series into a parsimonious set of multivariate sub-sequences.
We use the generated random graph, whose number of nonzero-degree nodes is finite, to define both the sparsity pattern and dimension of the latent state transition matrix.
arXiv Detail & Related papers (2020-07-25T04:16:34Z)
- Variational Conditional Dependence Hidden Markov Models for Skeleton-Based Action Recognition [7.9603223299524535]
This paper revisits conventional sequential modeling approaches, aiming to address the problem of capturing time-varying temporal dependency patterns.
We propose a different formulation of HMMs, whereby the dependence on past frames is dynamically inferred from the data.
We derive a tractable inference algorithm based on the forward-backward algorithm.
arXiv Detail & Related papers (2020-02-13T23:18:52Z)