Double-matched matrix decomposition for multi-view data
- URL: http://arxiv.org/abs/2105.03396v1
- Date: Fri, 7 May 2021 17:09:57 GMT
- Title: Double-matched matrix decomposition for multi-view data
- Authors: Dongbang Yuan and Irina Gaynanova
- Abstract summary: We consider the problem of extracting joint and individual signals from multi-view data, that is, data collected from different sources on matched samples.
Our proposed double-matched matrix decomposition allows simultaneous extraction of joint and individual signals across subjects.
We apply our method to data from the English Premier League soccer matches and find joint and individual multi-view signals that align with domain-specific knowledge.
- Score: 0.6091702876917281
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of extracting joint and individual signals from
multi-view data, that is, data collected from different sources on matched
samples. While existing methods for multi-view data decomposition explore
single matching of data by samples, we focus on double-matched multi-view data
(matched by both samples and source features). Our motivating example is the
miRNA data collected from both primary tumor and normal tissues of the same
subjects; the measurements from two tissues are thus matched both by subjects
and by miRNAs. Our proposed double-matched matrix decomposition allows
simultaneous extraction of joint and individual signals across subjects, as well as
joint and individual signals across miRNAs. Our estimation approach takes
advantage of double-matching by formulating a new type of optimization problem
with explicit row space and column space constraints, for which we develop an
efficient iterative algorithm. Numerical studies indicate that taking advantage
of double-matching leads to superior signal estimation performance compared to
existing multi-view data decomposition based on single-matching. We apply our
method to miRNA data as well as data from the English Premier League soccer
matches, and find joint and individual multi-view signals that align with
domain-specific knowledge.
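To make the idea of double matching concrete, here is a minimal NumPy sketch. It is not the authors' estimation algorithm (the abstract describes an iterative method for a constrained optimization problem, which is not reproduced here). It is a naive illustration that assumes two noiseless views of the same shape, a known rank, and a hypothetical cosine threshold `tol`; all function names are made up for this example. It finds the directions shared by the two column spaces and by the two row spaces using principal angles, then splits each view into a joint and an individual part with respect to each space.

```python
import numpy as np


def leading_basis(X, rank):
    """Orthonormal basis for the span of the top-`rank` left singular vectors of X."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank]


def shared_subspace(B1, B2, tol=0.99):
    """Directions (nearly) common to the spans of two orthonormal bases.

    The singular values of B1.T @ B2 are the cosines of the principal angles;
    a cosine above `tol` (an assumed threshold) is treated as a shared direction.
    """
    U, cosines, Vt = np.linalg.svd(B1.T @ B2)
    k = int(np.sum(cosines > tol))
    if k == 0:
        return np.zeros((B1.shape[0], 0))
    # Average each pair of principal vectors, then re-orthonormalize.
    paired = (B1 @ U[:, :k] + B2 @ Vt[:k].T) / 2.0
    Q, _ = np.linalg.qr(paired)
    return Q


def naive_double_matched_split(X1, X2, rank, tol=0.99):
    """Joint column basis, joint row basis, and joint/individual parts per view."""
    col_joint = shared_subspace(leading_basis(X1, rank), leading_basis(X2, rank), tol)
    row_joint = shared_subspace(leading_basis(X1.T, rank), leading_basis(X2.T, rank), tol)
    P_col = col_joint @ col_joint.T  # projector onto the shared column space (n x n)
    P_row = row_joint @ row_joint.T  # projector onto the shared row space (p x p)
    parts = []
    for X in (X1, X2):
        parts.append({
            "joint_by_column_space": P_col @ X,
            "individual_by_column_space": X - P_col @ X,
            "joint_by_row_space": X @ P_row,
            "individual_by_row_space": X - X @ P_row,
        })
    return col_joint, row_joint, parts


# Synthetic double-matched data: both views are n x p, with matched rows and columns.
rng = np.random.default_rng(0)
n, p = 30, 20
# Orthonormal columns: index 0 is shared, index 1 is view-1 only, index 2 is view-2 only.
U_true, _ = np.linalg.qr(rng.standard_normal((n, 3)))
V_true, _ = np.linalg.qr(rng.standard_normal((p, 3)))
X1 = U_true[:, [0, 1]] @ np.array([[3.0, 1.0], [1.0, 2.0]]) @ V_true[:, [0, 1]].T
X2 = U_true[:, [0, 2]] @ np.array([[2.0, -1.0], [1.0, 3.0]]) @ V_true[:, [0, 2]].T

col_joint, row_joint, parts = naive_double_matched_split(X1, X2, rank=2)
print("joint column rank:", col_joint.shape[1])
print("joint row rank:", row_joint.shape[1])
```

With noisy data one would have to choose the ranks and the threshold carefully, and projecting the raw matrices would leave noise in the individual parts. Single-matched decompositions use only one of the two shared spaces, whereas the sketch uses both.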
Related papers
- A Framework for Fine-Tuning LLMs using Heterogeneous Feedback [69.51729152929413]
We present a framework for fine-tuning large language models (LLMs) using heterogeneous feedback.
First, we combine the heterogeneous feedback data into a single supervision format, compatible with methods like SFT and RLHF.
Next, given this unified feedback dataset, we extract a high-quality and diverse subset to obtain performance increases.
arXiv Detail & Related papers (2024-08-05T23:20:32Z)
- Empirical Bayes Linked Matrix Decomposition [0.0]
We propose an empirical variational Bayesian approach to this problem.
We describe an associated iterative imputation approach that is novel for the single-matrix context.
We show that the method performs very well under different scenarios with respect to recovering underlying low-rank signal.
arXiv Detail & Related papers (2024-08-01T02:13:11Z)
- Propensity Score Alignment of Unpaired Multimodal Data [3.8373578956681555]
Multimodal representation learning techniques typically rely on paired samples to learn common representations.
This paper presents an approach to address the challenge of aligning unpaired samples across disparate modalities in multimodal representation learning.
arXiv Detail & Related papers (2024-04-02T02:36:21Z)
- Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches to learning from data of a single class only.
We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
- Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
- Align and Attend: Multimodal Summarization with Dual Contrastive Losses [57.83012574678091]
The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries.
Existing methods fail to leverage the temporal correspondence between different modalities and ignore the intrinsic correlation between different samples.
We introduce Align and Attend Multimodal Summarization (A2Summ), a unified multimodal transformer-based model which can effectively align and attend the multimodal input.
arXiv Detail & Related papers (2023-03-13T17:01:42Z)
- Linking data separation, visual separation, and classifier performance using pseudo-labeling by contrastive learning [125.99533416395765]
We argue that the performance of the final classifier depends on the data separation present in the latent space and visual separation present in the projection.
We demonstrate our results by the classification of five real-world challenging image datasets of human intestinal parasites with only 1% supervised samples.
arXiv Detail & Related papers (2023-02-06T10:01:38Z)
- Data thinning for convolution-closed distributions [2.299914829977005]
We propose data thinning, an approach for splitting an observation into two or more independent parts that sum to the original observation.
We show that data thinning can be used to validate the results of unsupervised learning approaches.
arXiv Detail & Related papers (2023-01-18T02:47:41Z)
- Unsupervised Manifold Alignment with Joint Multidimensional Scaling [4.683612295430957]
We introduce Joint Multidimensional Scaling, which maps datasets from two different domains to a common low-dimensional Euclidean space.
Our approach integrates Multidimensional Scaling (MDS) and Wasserstein Procrustes analysis into a joint optimization problem.
We demonstrate the effectiveness of our approach in several applications, including joint visualization of two datasets, unsupervised heterogeneous domain adaptation, graph matching, and protein structure alignment.
arXiv Detail & Related papers (2022-07-06T21:02:42Z)
- AVIDA: Alternating method for Visualizing and Integrating Data [1.6637373649145604]
AVIDA is a framework for simultaneously performing data alignment and dimension reduction.
We show that AVIDA correctly aligns high-dimensional datasets without common features.
In general applications, other methods can be used for the alignment and dimension reduction modules.
arXiv Detail & Related papers (2022-05-31T22:36:10Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.