sJIVE: Supervised Joint and Individual Variation Explained
- URL: http://arxiv.org/abs/2102.13278v1
- Date: Fri, 26 Feb 2021 02:54:45 GMT
- Title: sJIVE: Supervised Joint and Individual Variation Explained
- Authors: Elise F. Palzer, Christine Wendt, Russell Bowler, Craig P. Hersh,
Sandra E. Safo, and Eric F. Lock
- Abstract summary: Analyzing multi-source data, which are multiple views of data on the same subjects, has become increasingly common in biomedical research.
We propose a method called supervised joint and individual variation explained (sJIVE) that can simultaneously identify shared (joint) and source-specific (individual) underlying structure.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Analyzing multi-source data, which are multiple views of data on the same
subjects, has become increasingly common in molecular biomedical research.
Recent methods have sought to uncover underlying structure and relationships
within and/or between the data sources, and other methods have sought to build
a predictive model for an outcome using all sources. However, existing methods
that do both are limited because they either (1) consider only data
structure shared by all datasets while ignoring structures unique to each
source, or (2) extract underlying structures first without considering
the outcome. We propose a method called supervised joint and individual
variation explained (sJIVE) that can simultaneously (1) identify shared (joint)
and source-specific (individual) underlying structure and (2) build a linear
prediction model for an outcome using these structures. These two components
are weighted to compromise between explaining variation in the multi-source
data and in the outcome. Simulations show sJIVE to outperform existing methods
when large amounts of noise are present in the multi-source data. An
application to data from the COPDGene study reveals gene expression and
proteomic patterns that are predictive of lung function. Functions to perform
sJIVE are included in the R.JIVE package, available online at
http://github.com/lockEF/r.jive .
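The trade-off described in the abstract, between explaining variation in the data sources and explaining the outcome, can be pictured with a small numerical sketch. The code below is an illustration under stated assumptions, not the R.JIVE implementation: the function name `weighted_objective`, its arguments, the weight `eta`, and the exact form of the loss (a weighted sum of squared reconstruction error and squared prediction error) are all hypothetical choices made here to mirror the abstract's wording. Each source is assumed to be a features-by-subjects matrix approximated by a joint part plus an individual part.

```python
import numpy as np


def weighted_objective(X_blocks, y, U_blocks, S_joint, W_blocks, S_indiv,
                       theta_joint, theta_indiv, eta):
    """Illustrative loss: eta * reconstruction error + (1 - eta) * prediction error.

    Assumed shapes (hypothetical notation, n subjects, K sources):
      X_blocks[k]  : (p_k, n)  data from source k
      U_blocks[k]  : (p_k, rJ) joint loadings,      S_joint    : (rJ, n) joint scores
      W_blocks[k]  : (p_k, rk) individual loadings, S_indiv[k] : (rk, n) individual scores
      theta_joint  : (rJ,)     theta_indiv[k] : (rk,)  linear prediction coefficients
      y            : (n,)      outcome
    """
    recon_error = 0.0
    y_hat = theta_joint @ S_joint  # contribution of the joint scores to the prediction
    for X, U, W, S_k, theta_k in zip(X_blocks, U_blocks, W_blocks,
                                     S_indiv, theta_indiv):
        # Squared Frobenius error of "joint + individual" reconstruction for this source
        recon_error += np.linalg.norm(X - U @ S_joint - W @ S_k, "fro") ** 2
        y_hat = y_hat + theta_k @ S_k  # contribution of this source's individual scores
    pred_error = np.sum((y - y_hat) ** 2)
    return eta * recon_error + (1.0 - eta) * pred_error


# Toy data: two sources, six subjects, generated exactly from the assumed structure.
rng = np.random.default_rng(0)
n = 6
S_joint = rng.normal(size=(2, n))
S_indiv = [rng.normal(size=(1, n)), rng.normal(size=(2, n))]
U_blocks = [rng.normal(size=(4, 2)), rng.normal(size=(3, 2))]
W_blocks = [rng.normal(size=(4, 1)), rng.normal(size=(3, 2))]
X_blocks = [U_blocks[k] @ S_joint + W_blocks[k] @ S_indiv[k] for k in range(2)]
theta_joint = rng.normal(size=2)
theta_indiv = [rng.normal(size=1), rng.normal(size=2)]
y = theta_joint @ S_joint + sum(theta_indiv[k] @ S_indiv[k] for k in range(2))

for eta in (0.1, 0.5, 0.9):
    loss = weighted_objective(X_blocks, y + 1.0, U_blocks, S_joint, W_blocks,
                              S_indiv, theta_joint, theta_indiv, eta)
    print(f"eta={eta}: loss={loss:.3f}")
```

In this sketch, a value of `eta` near 1 makes the loss almost ignore errors in the outcome, while a value near 0 makes it dominated by them. How the weight is actually chosen and how the components are estimated are described in the paper itself.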
Related papers
- Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z)
- Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
But acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z)
- Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
- DCID: Deep Canonical Information Decomposition [84.59396326810085]
We consider the problem of identifying the signal shared between two one-dimensional target variables.
We propose ICM, an evaluation metric which can be used in the presence of ground-truth labels.
We also propose Deep Canonical Information Decomposition (DCID) - a simple, yet effective approach for learning the shared variables.
arXiv Detail & Related papers (2023-06-27T16:59:06Z)
- Scalable Randomized Kernel Methods for Multiview Data Integration and Prediction [4.801208484529834]
We develop scalable randomized kernel methods for jointly associating data from multiple sources and simultaneously predicting an outcome or classifying a unit into one of two or more classes.
The proposed methods model nonlinear relationships in multiview data together with predicting a clinical outcome and are capable of identifying variables or groups of variables that best contribute to the relationships among the views.
arXiv Detail & Related papers (2023-04-10T16:14:42Z)
- Multi-View Independent Component Analysis with Shared and Individual Sources [0.0]
Independent component analysis (ICA) is a blind source separation method for linear disentanglement of independent latent sources from observed data.
We prove that the corresponding linear structure is identifiable, and the shared sources can be recovered, provided that sufficiently many diverse views and data points are available.
We show empirically that our objective recovers the sources in high dimensional settings, also in the case when the measurements are corrupted by noise.
arXiv Detail & Related papers (2022-10-05T08:23:05Z)
- Combining Observational and Randomized Data for Estimating Heterogeneous Treatment Effects [82.20189909620899]
Estimating heterogeneous treatment effects is an important problem across many domains.
Currently, most existing works rely exclusively on observational data.
We propose to estimate heterogeneous treatment effects by combining large amounts of observational data and small amounts of randomized data.
arXiv Detail & Related papers (2022-02-25T18:59:54Z)
- Causal Discovery in Linear Structural Causal Models with Deterministic Relations [27.06618125828978]
We focus on the task of causal discovery from observational data.
We derive a set of necessary and sufficient conditions for unique identifiability of the causal structure.
arXiv Detail & Related papers (2021-10-30T21:32:42Z)
- Variational Selective Autoencoder: Learning from Partially-Observed Heterogeneous Data [45.23338389559936]
We propose the variational selective autoencoder (VSAE) to learn representations from partially-observed heterogeneous data.
VSAE learns the latent dependencies in heterogeneous data by modeling the joint distribution of observed data, unobserved data, and the imputation mask.
It results in a unified model for various downstream tasks including data generation and imputation.
arXiv Detail & Related papers (2021-02-25T04:39:13Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.