Double-matched matrix decomposition for multi-view data
- URL: http://arxiv.org/abs/2105.03396v1
- Date: Fri, 7 May 2021 17:09:57 GMT
- Title: Double-matched matrix decomposition for multi-view data
- Authors: Dongbang Yuan and Irina Gaynanova
- Abstract summary: We consider the problem of extracting joint and individual signals from multi-view data, that is, data collected from different sources on matched samples.
Our proposed double-matched matrix decomposition simultaneously extracts joint and individual signals across subjects.
We apply our method to data from English Premier League soccer matches and find joint and individual multi-view signals that align with domain-specific knowledge.
- Score: 0.6091702876917281
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of extracting joint and individual signals from
multi-view data, that is, data collected from different sources on matched
samples. While existing methods for multi-view data decomposition explore
single matching of data by samples, we focus on double-matched multi-view data
(matched by both samples and source features). Our motivating example is the
miRNA data collected from both primary tumor and normal tissues of the same
subjects; the measurements from two tissues are thus matched both by subjects
and by miRNAs. Our proposed double-matched matrix decomposition simultaneously
extracts joint and individual signals across subjects, as well as
joint and individual signals across miRNAs. Our estimation approach takes
advantage of double-matching by formulating a new type of optimization problem
with explicit row space and column space constraints, for which we develop an
efficient iterative algorithm. Numerical studies indicate that taking advantage
of double-matching leads to superior signal estimation performance compared to
existing multi-view data decomposition based on single-matching. We apply our
method to miRNA data as well as data from the English Premier League soccer
matches, and find joint and individual multi-view signals that align with
domain-specific knowledge.
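To make the idea of double matching concrete, the following is a minimal sketch and not the authors' estimation algorithm. It simulates two views of the same size (the same subjects and the same features) that share one direction in their column spaces and one in their row spaces, then uses principal angles, a standard linear-algebra tool, to show that both shared directions are detectable. All names (`signal1`, `principal_angles`, the ranks, and the noise level) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 40  # subjects x features; both views share the same n and p


def orthonormal_columns(rows, cols):
    """Random matrix with orthonormal columns (Q factor of a Gaussian matrix)."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q


def principal_angles(a, b, rank):
    """Principal angles (radians, ascending) between the leading
    rank-`rank` column spaces of a and b."""
    ua = np.linalg.svd(a, full_matrices=False)[0][:, :rank]
    ub = np.linalg.svd(b, full_matrices=False)[0][:, :rank]
    cosines = np.linalg.svd(ua.T @ ub, compute_uv=False)
    return np.arccos(np.clip(cosines, 0.0, 1.0))


U = orthonormal_columns(n, 3)  # column 0 is joint, columns 1 and 2 are individual
V = orthonormal_columns(p, 3)  # same roles on the feature side

# Each view has rank 2: one joint component plus one individual component.
signal1 = 5.0 * np.outer(U[:, 0], V[:, 0]) + 3.0 * np.outer(U[:, 1], V[:, 1])
signal2 = 4.0 * np.outer(U[:, 0], V[:, 0]) + 2.0 * np.outer(U[:, 2], V[:, 2])

noise_sd = 0.01  # assumed noise level, chosen small for a clear illustration
X1 = signal1 + noise_sd * rng.standard_normal((n, p))
X2 = signal2 + noise_sd * rng.standard_normal((n, p))

# Matching by subjects: compare column spaces. Matching by features: compare row spaces.
col_angles = principal_angles(X1, X2, rank=2)
row_angles = principal_angles(X1.T, X2.T, rank=2)

print("column-space angles (rad):", col_angles)
print("row-space angles (rad):   ", row_angles)
```

In this simulation, the smallest angle in each list is close to zero (the joint direction) and the other is close to a right angle (the individual directions). A single-matched analysis would only examine one of the two comparisons.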
Related papers
- ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on the fly.
Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset.
Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z)
- Propensity Score Alignment of Unpaired Multimodal Data [0.0]
Multimodal representation learning techniques typically rely on paired samples to learn common representations.
This paper presents an approach to address the challenge of aligning unpaired samples across disparate modalities in multimodal representation learning.
arXiv Detail & Related papers (2024-04-02T02:36:21Z)
- Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only.
We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
- Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
- Scalable Randomized Kernel Methods for Multiview Data Integration and Prediction [4.801208484529834]
We develop scalable randomized kernel methods for jointly associating data from multiple sources and simultaneously predicting an outcome or classifying a unit into one of two or more classes.
The proposed methods model nonlinear relationships in multiview data together with predicting a clinical outcome and are capable of identifying variables or groups of variables that best contribute to the relationships among the views.
arXiv Detail & Related papers (2023-04-10T16:14:42Z)
- Align and Attend: Multimodal Summarization with Dual Contrastive Losses [57.83012574678091]
The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries.
Existing methods fail to leverage the temporal correspondence between different modalities and ignore the intrinsic correlation between different samples.
We introduce Align and Attend Multimodal Summarization (A2Summ), a unified multimodal transformer-based model which can effectively align and attend the multimodal input.
arXiv Detail & Related papers (2023-03-13T17:01:42Z)
- Linking data separation, visual separation, and classifier performance using pseudo-labeling by contrastive learning [125.99533416395765]
We argue that the performance of the final classifier depends on the data separation present in the latent space and visual separation present in the projection.
We demonstrate our results by classifying five challenging real-world image datasets of human intestinal parasites with only 1% supervised samples.
arXiv Detail & Related papers (2023-02-06T10:01:38Z)
- Data thinning for convolution-closed distributions [2.299914829977005]
We propose data thinning, an approach for splitting an observation into two or more independent parts that sum to the original observation.
We show that data thinning can be used to validate the results of unsupervised learning approaches.
arXiv Detail & Related papers (2023-01-18T02:47:41Z)
- Unsupervised Manifold Alignment with Joint Multidimensional Scaling [4.683612295430957]
We introduce Joint Multidimensional Scaling, which maps datasets from two different domains to a common low-dimensional Euclidean space.
Our approach integrates Multidimensional Scaling (MDS) and Wasserstein Procrustes analysis into a joint optimization problem.
We demonstrate the effectiveness of our approach in several applications, including joint visualization of two datasets, unsupervised heterogeneous domain adaptation, graph matching, and protein structure alignment.
arXiv Detail & Related papers (2022-07-06T21:02:42Z)
- AVIDA: Alternating method for Visualizing and Integrating Data [1.6637373649145604]
AVIDA is a framework for simultaneously performing data alignment and dimension reduction.
We show that AVIDA correctly aligns high-dimensional datasets without common features.
In general applications, other methods can be used for the alignment and dimension reduction modules.
arXiv Detail & Related papers (2022-05-31T22:36:10Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.