Copula-Based Density Estimation Models for Multivariate Zero-Inflated
Continuous Data
- URL: http://arxiv.org/abs/2304.00537v1
- Date: Sun, 2 Apr 2023 13:43:37 GMT
- Title: Copula-Based Density Estimation Models for Multivariate Zero-Inflated
Continuous Data
- Authors: Keita Hamamoto
- Abstract summary: We propose two copula-based density estimation models that can cope with multivariate correlation among zero-inflated continuous variables.
To overcome the difficulty that the tied-data problem in zero-inflated data poses for copulas, we propose a new type of copula, the rectified Gaussian copula.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-inflated continuous data appear in many fields: a large share of
the observations are exactly zero, while the rest are continuously
distributed. Because such a distribution mixes discrete and continuous
components, statistical analysis is challenging, especially in the multivariate
case. In this paper, we propose two copula-based density estimation models that
can cope with multivariate correlation among zero-inflated continuous
variables. To overcome the difficulty that the tied-data problem in
zero-inflated data poses for copulas, we propose a new type of copula, the
rectified Gaussian copula, and present efficient methods for parameter
estimation and likelihood computation. Numerical experiments demonstrate the
superiority of our proposals over conventional density estimation
methods.
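The tied-data problem mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's rectified Gaussian copula, whose definition is not given on this page; it only assumes, for illustration, that zero-inflated variables arise by clipping correlated latent Gaussians at zero. The function name `sample_rectified_gaussian` and the parameters `corr` and `shift` are hypothetical.

```python
import numpy as np


def sample_rectified_gaussian(n, corr, shift, seed=0):
    """Draw n samples of max(0, z + shift) with z ~ N(0, corr).

    Illustrative assumption only: clipping a latent Gaussian at zero
    produces a point mass at exactly zero plus a continuous positive part.
    """
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(len(shift)), corr, size=n)
    return np.maximum(z + shift, 0.0)


# Two positively correlated latent variables (hypothetical parameters).
corr = np.array([[1.0, 0.6], [0.6, 1.0]])
shift = np.array([0.0, 0.5])
x = sample_rectified_gaussian(10_000, corr, shift)

# Share of exact zeros in each variable: these are the ties that make
# rank-based copula fitting ambiguous.
zero_fraction = (x == 0.0).mean(axis=0)

# Share of rows where both variables are zero at once.
both_zero = np.all(x == 0.0, axis=1).mean()

print("zero fraction per variable:", zero_fraction)
print("joint zero fraction:", both_zero)
```

Under these assumptions, a large fraction of each column is tied at exactly zero, and the joint zero fraction exceeds the product of the marginal zero fractions, so the zeros themselves carry dependence that a density model has to capture.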
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z)
- Kinetic Interacting Particle Langevin Monte Carlo [0.0]
This paper introduces and analyses interacting underdamped Langevin algorithms, for statistical inference in latent variable models.
We propose a diffusion process that evolves jointly in the space of parameters and latent variables.
We provide two explicit discretisations of this diffusion as practical algorithms to estimate parameters of statistical models.
arXiv Detail & Related papers (2024-07-08T09:52:46Z)
- Synthetic Tabular Data Validation: A Divergence-Based Approach [8.062368743143388]
Divergences quantify discrepancies between data distributions.
Traditional approaches calculate divergences independently for each feature.
We propose a novel approach that uses divergence estimation to overcome the limitations of marginal comparisons.
arXiv Detail & Related papers (2024-05-13T15:07:52Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Empirical Density Estimation based on Spline Quasi-Interpolation with
applications to Copulas clustering modeling [0.0]
Density estimation is a fundamental technique employed in various fields to model and to understand the underlying distribution of data.
In this paper we propose the mono-variate approximation of the density using quasi-interpolation.
The presented algorithm is validated on artificial and real datasets.
arXiv Detail & Related papers (2024-02-18T11:49:38Z)
- Anomaly Detection with Variance Stabilized Density Estimation [49.46356430493534]
We present a variance-stabilized density estimation problem for maximizing the likelihood of the observed samples.
To obtain a reliable anomaly detector, we introduce a spectral ensemble of autoregressive models for learning the variance-stabilized distribution.
We have conducted an extensive benchmark with 52 datasets, demonstrating that our method leads to state-of-the-art results.
arXiv Detail & Related papers (2023-06-01T11:52:58Z)
- Estimating Latent Population Flows from Aggregated Data via Inversing
Multi-Marginal Optimal Transport [57.16851632525864]
We study the problem of estimating latent population flows from aggregated count data.
This problem arises when individual trajectories are not available due to privacy issues or measurement fidelity.
We propose to estimate the transition flows from aggregated data by learning the cost functions of the MOT framework.
arXiv Detail & Related papers (2022-12-30T03:03:23Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Training Normalizing Flows from Dependent Data [31.42053454078623]
We propose a likelihood objective of normalizing flows incorporating dependencies between the data points.
We show that respecting dependencies between observations can improve empirical results on both synthetic and real-world data.
arXiv Detail & Related papers (2022-09-29T16:50:34Z)
- Linear Discriminant Analysis with High-dimensional Mixed Variables [10.774094462083843]
This paper develops a novel approach for classifying high-dimensional observations with mixed variables.
We overcome the challenge of having to split data into exponentially many cells.
Results on the estimation accuracy and the misclassification rates are established.
arXiv Detail & Related papers (2021-12-14T03:57:56Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)