Depression Diagnosis and Analysis via Multimodal Multi-order Factor Fusion
- URL: http://arxiv.org/abs/2301.00254v1
- Date: Sat, 31 Dec 2022 17:13:06 GMT
- Title: Depression Diagnosis and Analysis via Multimodal Multi-order Factor Fusion
- Authors: Chengbo Yuan, Qianhui Xu and Yong Luo
- Abstract summary: Depression is a leading cause of death worldwide, and the diagnosis of depression is nontrivial.
We propose a multimodal multi-order factor fusion (MMFF) method for automatic diagnosis of depression.
- Score: 4.991507302519828
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Depression is a leading cause of death worldwide, and diagnosing it
is nontrivial. Multimodal learning is a popular solution for automatic
depression diagnosis, but existing works suffer from two main drawbacks: 1) the
high-order interactions between different modalities cannot be well exploited;
and 2) the interpretability of the models is weak. To remedy these drawbacks,
we propose a multimodal multi-order factor fusion (MMFF) method. Our method
exploits the high-order interactions between different modalities by extracting
and assembling modality factors under the guidance of a shared latent proxy. We
conduct extensive experiments on two recent and popular datasets, E-DAIC-WOZ
and CMDC, and the results show that our method achieves significantly better
performance than existing approaches. Moreover, by analyzing the
factor-assembly process, our model can intuitively show the contribution of
each factor, which helps us understand the fusion mechanism.
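The abstract does not detail the fusion computation, so the following is only a minimal sketch of the general idea, assuming each modality is first encoded into a fixed-size vector and that "multi-order" interactions are approximated by first-order (per-modality) and second-order (pairwise) factors, assembled with weights scored against a shared latent proxy. All class, variable, and dimension names are illustrative and not taken from the paper.

```python
import torch
import torch.nn as nn

class MultiOrderFactorFusion(nn.Module):
    """Illustrative sketch of multi-order factor fusion (not the authors' code)."""

    def __init__(self, dims, d_factor=64):
        super().__init__()
        # One projection per modality into a common factor space.
        self.proj = nn.ModuleList([nn.Linear(d, d_factor) for d in dims])
        self.proxy = nn.Parameter(torch.randn(d_factor))  # shared latent proxy
        self.classifier = nn.Linear(d_factor, 1)          # depression score / logit

    def forward(self, feats):
        # feats: list of per-modality tensors, each of shape (batch, dims[i]).
        first = [torch.tanh(p(x)) for p, x in zip(self.proj, feats)]   # 1st-order factors
        second = [first[i] * first[j]                                  # 2nd-order (pairwise) factors
                  for i in range(len(first)) for j in range(i + 1, len(first))]
        factors = torch.stack(first + second, dim=1)        # (batch, n_factors, d_factor)

        # Assembly: score each factor against the shared proxy; the softmax
        # weights expose how much each factor contributes to the fused vector.
        weights = torch.softmax(factors @ self.proxy, dim=1)   # (batch, n_factors)
        fused = (weights.unsqueeze(-1) * factors).sum(dim=1)   # (batch, d_factor)
        return self.classifier(fused), weights
```

Returning the assembly weights alongside the prediction mirrors the interpretability claim: the weight on each factor indicates how much a single modality or a modality pair contributed to the fused representation.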
Related papers
- A Depression Detection Method Based on Multi-Modal Feature Fusion Using Cross-Attention [3.4872769952628926]
Depression affects approximately 3.8% of the global population.
Over 75% of individuals in low- and middle-income countries remain untreated.
This paper introduces a novel method for detecting depression based on multi-modal feature fusion utilizing cross-attention.
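The summary names cross-attention as the fusion mechanism but not its exact form; a generic cross-modal attention block in PyTorch (illustrative only, not the paper's implementation) could look like:

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """One modality (query) attends to another (context); illustrative only."""

    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, query_seq, context_seq):
        # query_seq:   (batch, len_q, d_model), e.g. audio frame features
        # context_seq: (batch, len_c, d_model), e.g. text token features
        attended, _ = self.attn(query_seq, context_seq, context_seq)
        return self.norm(query_seq + attended)  # residual connection + layer norm
```

In practice, two such blocks are typically run in both directions (e.g. audio attending to text and text attending to audio) and their pooled outputs concatenated before the classifier.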
arXiv Detail & Related papers (2024-07-02T13:13:35Z)
- Confidence-aware multi-modality learning for eye disease screening [58.861421804458395]
We propose a novel multi-modality evidential fusion pipeline for eye disease screening.
It provides a measure of confidence for each modality and elegantly integrates the multi-modality information.
Experimental results on both public and internal datasets demonstrate that our model excels in robustness.
arXiv Detail & Related papers (2024-05-28T13:27:30Z)
- Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z)
- CANAMRF: An Attention-Based Model for Multimodal Depression Detection [7.266707571724883]
We present a Cross-modal Attention Network with Adaptive Multi-modal Recurrent Fusion (CANAMRF) for multimodal depression detection.
CANAMRF is constructed by a multimodal feature extractor, an Adaptive Multimodal Recurrent Fusion module, and a Hybrid Attention Module.
arXiv Detail & Related papers (2024-01-04T12:08:16Z)
- Joint Self-Supervised and Supervised Contrastive Learning for Multimodal MRI Data: Towards Predicting Abnormal Neurodevelopment [5.771221868064265]
We present a novel joint self-supervised and supervised contrastive learning method to learn robust latent feature representations from multimodal MRI data.
Our method can facilitate computer-aided diagnosis in clinical practice by harnessing the power of multimodal data.
arXiv Detail & Related papers (2023-12-22T21:05:51Z)
- Provable Dynamic Fusion for Low-Quality Multimodal Data [94.39538027450948]
Dynamic multimodal fusion emerges as a promising learning paradigm.
Despite its widespread use, theoretical justifications in this field are still notably lacking.
This paper provides a theoretical understanding of dynamic fusion under one of the most popular multimodal fusion frameworks, from the generalization perspective.
A novel multimodal fusion framework termed Quality-aware Multimodal Fusion (QMF) is proposed, which improves performance in terms of classification accuracy and model robustness.
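The exact quality estimate used by QMF is not given here; as a rough sketch, confidence-weighted fusion is often implemented by weighting each modality's prediction with a per-sample confidence, here taken (as an assumption) to be the negative entropy of the unimodal prediction:

```python
import torch

def quality_aware_fusion(logits_per_modality):
    """Confidence-weighted late fusion of unimodal class logits (illustrative).

    logits_per_modality: list of tensors, each of shape (batch, n_classes).
    The per-sample confidence is taken to be the negative predictive entropy
    of each modality -- an assumption standing in for QMF's quality estimate.
    """
    confidences = []
    for logits in logits_per_modality:
        probs = torch.softmax(logits, dim=-1)
        entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=-1)     # (batch,)
        confidences.append(-entropy)                                 # higher = more confident
    weights = torch.softmax(torch.stack(confidences, dim=1), dim=1)  # (batch, n_modalities)
    stacked = torch.stack(logits_per_modality, dim=1)                # (batch, n_modalities, n_classes)
    return (weights.unsqueeze(-1) * stacked).sum(dim=1)              # fused logits (batch, n_classes)
```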
arXiv Detail & Related papers (2023-06-03T08:32:35Z)
- Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention.
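The gating mechanism is only named above; a minimal sketch of dynamic modality gating, assuming a small gate network that softly weights the outputs of several crossmodal attention branches per sample (all names illustrative, not the authors' architecture):

```python
import torch
import torch.nn as nn

class DynamicModalityGate(nn.Module):
    """Softly weight the outputs of several crossmodal branches per sample (illustrative)."""

    def __init__(self, d_model=256, n_branches=3):
        super().__init__()
        self.gate = nn.Linear(d_model * n_branches, n_branches)

    def forward(self, branch_feats):
        # branch_feats: list of (batch, d_model) vectors, e.g. the pooled outputs
        # of audio->text, vision->text, and audio->vision attention branches.
        stacked = torch.stack(branch_feats, dim=1)              # (batch, n_branches, d_model)
        gate_logits = self.gate(stacked.flatten(start_dim=1))   # (batch, n_branches)
        weights = torch.softmax(gate_logits, dim=-1)
        return (weights.unsqueeze(-1) * stacked).sum(dim=1)     # (batch, d_model)
```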
arXiv Detail & Related papers (2023-05-23T01:24:15Z)
- Reliable Multimodality Eye Disease Screening via Mixture of Student's t Distributions [49.4545260500952]
We introduce a novel multimodality evidential fusion pipeline for eye disease screening, EyeMoSt.
The model estimates both the local uncertainty of each individual modality and the global uncertainty of the fused modality to produce reliable classification results.
Our experimental findings on both public and in-house datasets show that our model is more reliable than current methods.
arXiv Detail & Related papers (2023-03-17T06:18:16Z)
- Multi-modal Depression Estimation based on Sub-attentional Fusion [29.74171323437029]
Failure to diagnose and treat depression leaves over 280 million people worldwide suffering from this psychological disorder.
We tackle the task of automatically identifying depression from multi-modal data.
We introduce a sub-attention mechanism for linking heterogeneous information.
arXiv Detail & Related papers (2022-07-13T13:19:32Z)
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
The Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
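The summary does not specify the fusion operator; a minimal sketch of the bi-bimodal pairing, assuming text is paired with audio and with vision and each pair is fused by a simple projection (names and dimensions illustrative, not the authors' architecture):

```python
import torch
import torch.nn as nn

class BiBimodalFusion(nn.Module):
    """Fuse two bimodal pairs, (text, audio) and (text, vision), then combine (illustrative)."""

    def __init__(self, d=128):
        super().__init__()
        self.fuse_ta = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU())
        self.fuse_tv = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU())
        self.head = nn.Linear(2 * d, 1)  # e.g. a sentiment regression head

    def forward(self, text, audio, vision):
        # Each input: (batch, d). Text appears in both pairs because it usually
        # carries more information than the audio or vision modality.
        ta = self.fuse_ta(torch.cat([text, audio], dim=-1))
        tv = self.fuse_tv(torch.cat([text, vision], dim=-1))
        return self.head(torch.cat([ta, tv], dim=-1))
```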
arXiv Detail & Related papers (2021-07-28T23:33:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.