Conditional canonical correlation estimation based on covariates with
random forests
- URL: http://arxiv.org/abs/2011.11555v2
- Date: Wed, 3 Feb 2021 22:55:03 GMT
- Title: Conditional canonical correlation estimation based on covariates with
random forests
- Authors: Cansu Alakus, Denis Larocque, Sebastien Jacquemont, Fanny Barlaam,
Charles-Olivier Martin, Kristian Agbogba, Sarah Lippe, Aurelie Labbe
- Abstract summary: We propose a new method called Random Forest with Canonical Correlation Analysis (RFCCA) to estimate the conditional canonical correlations between two sets of variables.
The proposed method and the global significance test are evaluated through simulation studies, which show that the method provides accurate canonical correlation estimates and that the test has well-controlled Type-1 error.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Investigating the relationships between two sets of variables helps to
understand their interactions and can be done with canonical correlation
analysis (CCA). However, the correlation between the two sets can sometimes
depend on a third set of covariates, often subject-related ones such as age,
gender, or other clinical measures. In this case, applying CCA to the whole
population is not optimal and methods to estimate conditional CCA, given the
covariates, can be useful. We propose a new method called Random Forest with
Canonical Correlation Analysis (RFCCA) to estimate the conditional canonical
correlations between two sets of variables given subject-related covariates.
The individual trees in the forest are built with a splitting rule specifically
designed to partition the data to maximize the canonical correlation
heterogeneity between child nodes. We also propose a significance test to
detect the global effect of the covariates on the relationship between two sets
of variables. The performance of the proposed method and the global
significance test is evaluated through simulation studies that show it provides
accurate canonical correlation estimations and well-controlled Type-1 error. We
also show an application of the proposed method with EEG data.
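
The splitting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`first_canonical_correlation`, `split_score`, `best_split`) are hypothetical, only a single covariate `z` is considered, and the size-weighted absolute difference used as the heterogeneity score is an assumed criterion that may differ from the rule in the paper. The sketch also assumes each child node has more rows than the total number of columns in the two variable sets and that the centered data have full column rank.

```python
import numpy as np


def first_canonical_correlation(X, Y):
    """Largest sample canonical correlation between the columns of X and Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases of the centered column spaces (full rank assumed).
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))


def split_score(X, Y, z, threshold):
    """Assumed heterogeneity score for splitting on covariate z at threshold."""
    left = z <= threshold
    right = ~left
    n_left, n_right = int(left.sum()), int(right.sum())
    # Require enough rows in each child for a stable estimate.
    min_size = X.shape[1] + Y.shape[1] + 1
    if n_left < min_size or n_right < min_size:
        return 0.0
    rho_left = first_canonical_correlation(X[left], Y[left])
    rho_right = first_canonical_correlation(X[right], Y[right])
    # Size-weighted absolute difference (an assumption, see the lead-in).
    return float(np.sqrt(n_left * n_right) * abs(rho_left - rho_right))


def best_split(X, Y, z):
    """Return (threshold, score) maximizing split_score over observed z values."""
    candidates = np.unique(z)[:-1]  # drop the maximum so the right child is non-empty
    scores = [split_score(X, Y, z, t) for t in candidates]
    best = int(np.argmax(scores))
    return float(candidates[best]), float(scores[best])
```

A full forest would repeat this search over randomly chosen covariates on bootstrap samples and grow trees recursively; the sketch covers only a single split on one covariate.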
Related papers
- Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables [13.12743473333296]
Estimating causal effects from nonexperimental data is a fundamental problem in many fields of science.
We propose a novel local learning approach for covariate selection in nonparametric causal effect estimation.
We validate our algorithm through extensive experiments on both synthetic and real-world data.
arXiv Detail & Related papers (2024-11-25T12:08:54Z)
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Covariance regression with random forests [0.0]
CovRegRF is implemented in a freely available R package on CRAN.
An application of the proposed method to thyroid disease data is also presented.
arXiv Detail & Related papers (2022-09-16T21:21:18Z)
- Multi-modality fusion using canonical correlation analysis methods: Application in breast cancer survival prediction from histology and genomics [16.537929113715432]
We study the use of canonical correlation analysis (CCA) and penalized variants of CCA for the fusion of two modalities.
We analytically show that, with known model parameters, posterior mean estimators that jointly use both modalities outperform arbitrary linear mixing of single modality posterior estimators in latent variable prediction.
arXiv Detail & Related papers (2021-11-27T21:18:01Z)
- Scalable Intervention Target Estimation in Linear Models [52.60799340056917]
Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets.
This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets.
The proposed algorithm can also be used to update a given observational Markov equivalence class into the interventional Markov equivalence class.
arXiv Detail & Related papers (2021-11-15T03:16:56Z)
- A Statistical Analysis of Summarization Evaluation Metrics using Resampling Methods [60.04142561088524]
We find that the confidence intervals are rather wide, demonstrating high uncertainty in how reliable automatic metrics truly are.
Although many metrics fail to show statistical improvements over ROUGE, two recent works, QAEval and BERTScore, do in some evaluation settings.
arXiv Detail & Related papers (2021-03-31T18:28:14Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Grouping effects of sparse CCA models in variable selection [6.196334136139173]
We analyze the grouping effect of the standard and simplified SCCA models in variable selection.
Our theoretical analysis shows that, for grouped variable selection, the simplified SCCA selects or deselects a group of variables together.
arXiv Detail & Related papers (2020-08-07T22:27:31Z)
- Probabilistic Canonical Correlation Analysis for Sparse Count Data [3.1753001245931323]
Canonical correlation analysis is an important technique for exploring the relationship between two sets of continuous variables.
We propose a model-based probabilistic approach for correlation and canonical correlation estimation for two sparse count data sets.
arXiv Detail & Related papers (2020-05-11T02:19:57Z)
- Learning from Aggregate Observations [82.44304647051243]
We study the problem of learning from aggregate observations where supervision signals are given to sets of instances.
We present a general probabilistic framework that accommodates a variety of aggregate observations.
Simple maximum likelihood solutions can be applied to various differentiable models.
arXiv Detail & Related papers (2020-04-14T06:18:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.