Single-Cell Multimodal Prediction via Transformers
- URL: http://arxiv.org/abs/2303.00233v3
- Date: Fri, 13 Oct 2023 15:32:57 GMT
- Title: Single-Cell Multimodal Prediction via Transformers
- Authors: Wenzhuo Tang, Hongzhi Wen, Renming Liu, Jiayuan Ding, Wei Jin, Yuying
Xie, Hui Liu, Jiliang Tang
- Abstract summary: We propose scMoFormer to model the complex interactions among different modalities.
scMoFormer won a Kaggle silver medal with the rank of 24/1221 (Top 2%) without ensemble in a NeurIPS 2022 competition.
- Score: 36.525050229323845
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The recent development of multimodal single-cell technology has made it
possible to acquire multiple omics data from individual cells, thereby
enabling a deeper understanding of cellular states and dynamics. Nevertheless,
the proliferation of multimodal single-cell data also introduces tremendous
challenges in modeling the complex interactions among different modalities.
Recently advanced methods focus on constructing static interaction graphs and
applying graph neural networks (GNNs) to learn from multimodal data. However,
such static graphs can be suboptimal as they do not take advantage of the
downstream task information; meanwhile, GNNs also have inherent limitations
when GNN layers are stacked deeply. To tackle these issues, in this work, we
investigate how to leverage transformers for multimodal single-cell data in an
end-to-end manner while exploiting downstream task information. In particular,
we propose scMoFormer, a framework that can readily incorporate external domain
knowledge and model interactions both within each modality and across modalities.
Extensive experiments demonstrate that scMoFormer achieves superior performance
on various benchmark datasets. Remarkably, scMoFormer won a Kaggle silver medal
with a rank of 24/1221 (top 2%) without ensembling in a NeurIPS 2022
competition. Our implementation is publicly available on GitHub.
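As a rough illustration of the kind of architecture the abstract describes, the minimal sketch below pairs self-attention within each modality with cross-attention across modalities to predict protein expression from RNA. All module names, token layout, and sizes are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a multimodal transformer in the spirit of scMoFormer:
# self-attention within each modality plus cross-attention across modalities.
# Names and hyperparameters are assumptions, not the authors' code.
import torch
import torch.nn as nn

class CrossModalBlock(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        # Self-attention models interactions within each modality.
        self.self_rna = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.self_prot = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        # Cross-attention models interactions across modalities.
        self.cross_rna = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_prot = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, rna, prot):
        rna, prot = self.self_rna(rna), self.self_prot(prot)
        rna = rna + self.cross_rna(rna, prot, prot, need_weights=False)[0]
        prot = prot + self.cross_prot(prot, rna, rna, need_weights=False)[0]
        return rna, prot

class MultimodalPredictor(nn.Module):
    """Toy RNA -> protein predictor; layout and sizes are illustrative only."""
    def __init__(self, n_genes: int, n_proteins: int, dim: int = 64):
        super().__init__()
        self.embed = nn.Linear(n_genes, dim)              # one token per cell
        self.prot_tokens = nn.Parameter(torch.randn(1, n_proteins, dim))
        self.block = CrossModalBlock(dim)
        self.head = nn.Linear(dim, 1)                     # expression per protein

    def forward(self, x):                                 # x: (cells, genes)
        rna = self.embed(x).unsqueeze(1)                  # (cells, 1, dim)
        prot = self.prot_tokens.expand(x.size(0), -1, -1)
        _, prot = self.block(rna, prot)
        return self.head(prot).squeeze(-1)                # (cells, n_proteins)

out = MultimodalPredictor(n_genes=2000, n_proteins=134)(torch.randn(8, 2000))
print(out.shape)  # torch.Size([8, 134])
```

In the actual framework, external domain knowledge enters through additional graph components; this sketch only shows the attention skeleton.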
Related papers
- U3M: Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation [63.31007867379312]
We introduce U3M: An Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation.
We employ feature fusion at multiple scales to ensure the effective extraction and integration of both global and local features.
Experimental results demonstrate that our approach achieves superior performance across multiple datasets.
arXiv Detail & Related papers (2024-05-24T08:58:48Z) - NativE: Multi-modal Knowledge Graph Completion in the Wild [51.80447197290866]
- NativE: Multi-modal Knowledge Graph Completion in the Wild [51.80447197290866]
We propose NativE, a comprehensive framework for multi-modal knowledge graph completion (MMKGC) in the wild.
It introduces a relation-guided dual adaptive fusion module that enables adaptive fusion of arbitrary modalities.
We construct a new benchmark called WildKGC with five datasets to evaluate our method.
arXiv Detail & Related papers (2024-03-28T03:04:00Z) - Bi-directional Adapter for Multi-modal Tracking [67.01179868400229]
We propose a novel multi-modal visual prompt tracking model based on a universal bi-directional adapter.
We develop a simple but effective light feature adapter to transfer modality-specific information from one modality to another.
Our model achieves superior tracking performance compared with both full fine-tuning methods and prompt learning-based methods.
arXiv Detail & Related papers (2023-12-17T05:27:31Z) - Deformable Mixer Transformer with Gating for Multi-Task Learning of
- Deformable Mixer Transformer with Gating for Multi-Task Learning of Dense Prediction [126.34551436845133]
CNNs and Transformers have their own advantages, and both have been widely used for dense prediction in multi-task learning (MTL).
We present a novel MTL model that combines the merits of deformable CNNs and query-based Transformers with shared gating for multi-task learning of dense prediction.
arXiv Detail & Related papers (2023-08-10T17:37:49Z) - Graph Neural Networks for Multimodal Single-Cell Data Integration [32.8390339109358]
- Graph Neural Networks for Multimodal Single-Cell Data Integration [32.8390339109358]
We present a general Graph Neural Network framework, scMoGNN, to tackle three tasks.
scMoGNN demonstrates superior results in all three tasks compared with state-of-the-art and conventional approaches.
arXiv Detail & Related papers (2022-03-03T17:59:02Z) - Progressive Multi-stage Interactive Training in Mobile Network for
- Progressive Multi-stage Interactive Training in Mobile Network for Fine-grained Recognition [8.727216421226814]
We propose a Progressive Multi-Stage Interactive training method with a Recursive Mosaic Generator (RMG-PMSI).
First, we propose a Recursive Mosaic Generator (RMG) that generates images with different granularities in different phases.
Then, the features of different stages pass through a Multi-Stage Interaction (MSI) module, which strengthens and complements the corresponding features of different stages.
Experiments on three prestigious fine-grained benchmarks show that RMG-PMSI can significantly improve the performance with good robustness and transferability.
arXiv Detail & Related papers (2021-12-08T10:50:03Z) - Graph Capsule Aggregation for Unaligned Multimodal Sequences [16.679793708015534]
- Graph Capsule Aggregation for Unaligned Multimodal Sequences [16.679793708015534]
We introduce Graph Capsule Aggregation (GraphCAGE) to model unaligned multimodal sequences with a graph-based neural model and a Capsule Network.
By converting sequence data into graphs, the aforementioned problems of RNNs are avoided.
In addition, the aggregation capability of the Capsule Network and the graph-based structure make our model interpretable and better at handling long-range dependencies.
arXiv Detail & Related papers (2021-08-17T10:04:23Z) - Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
The Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z) - Analyzing Unaligned Multimodal Sequence via Graph Convolution and Graph
- Analyzing Unaligned Multimodal Sequence via Graph Convolution and Graph Pooling Fusion [28.077474663199062]
We propose a novel model, termed Multimodal Graph, to investigate the effectiveness of graph neural networks (GNNs) in modeling multimodal sequential data.
Our graph-based model reaches state-of-the-art performance on two benchmark datasets.
arXiv Detail & Related papers (2020-11-27T06:12:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.