Related papers: 3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection

URL: http://arxiv.org/abs/2407.09020v2
Date: Mon, 15 Jul 2024 03:53:12 GMT
Title: 3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection
Authors: Rina Carines Cabral, Siwen Luo, Josiah Poon, Soyeon Caren Han
Abstract summary: We introduce a Multimodal and Multi-Teacher Knowledge Distillation model for Mental Health Classification. Unlike conventional approaches that often rely on simple concatenation to integrate diverse features, our model addresses the challenge of appropriately representing inputs of varying natures.
Score: 9.469887408109251
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The significance of mental health classification is paramount in contemporary society, where digital platforms serve as crucial sources for monitoring individuals' well-being. However, existing social media mental health datasets primarily consist of text-only samples, potentially limiting the efficacy of models trained on such data. Recognising that humans utilise cross-modal information to comprehend complex situations or issues, we present a novel approach to address the limitations of current methodologies. In this work, we introduce a Multimodal and Multi-Teacher Knowledge Distillation model for Mental Health Classification, leveraging insights from cross-modal human understanding. Unlike conventional approaches that often rely on simple concatenation to integrate diverse features, our model addresses the challenge of appropriately representing inputs of varying natures (e.g., texts and sounds). To mitigate the computational complexity associated with integrating all features into a single model, we employ a multimodal and multi-teacher architecture. By distributing the learning process across multiple teachers, each specialising in a particular feature extraction aspect, we enhance the overall mental health classification performance. Through experimental validation, we demonstrate the efficacy of our model in achieving improved performance. All relevant codes will be made available upon publication.
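
Illustrative sketch (not from the paper): the abstract describes distilling knowledge from several specialised teachers into one model but does not state the training objective. The Python snippet below shows one common, generic formulation of multi-teacher knowledge distillation, in which the student's loss combines a hard-label cross-entropy term with the average KL divergence to each teacher's temperature-softened predictions. The function names, the `temperature` and `alpha` parameters, and the equal weighting of teachers are all assumptions made for illustration; the authors' actual method may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def multi_teacher_kd_loss(student_logits, teacher_logits_list, label,
                          temperature=2.0, alpha=0.5):
    """Hypothetical multi-teacher distillation loss for a single sample.

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    distillation term, averaged equally over all teachers (an assumption).
    """
    # Hard-label cross-entropy on the student's unsoftened prediction.
    ce = -math.log(softmax(student_logits)[label])

    # Soft-target term: KL from each teacher to the student at temperature T.
    student_soft = softmax(student_logits, temperature)
    kd_terms = [
        kl_divergence(softmax(t, temperature), student_soft)
        for t in teacher_logits_list
    ]
    kd = sum(kd_terms) / len(kd_terms)

    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kd


# Example: one student and two teachers (e.g. a text teacher and an audio
# teacher) on a binary classification sample whose true class is 0.
loss = multi_teacher_kd_loss([2.0, 0.5], [[1.5, 0.2], [2.5, 1.0]], label=0)
print(f"loss = {loss:.4f}")
```

When every teacher agrees exactly with the student, the distillation term vanishes and the loss reduces to `alpha` times the cross-entropy; any disagreement adds a positive penalty.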

Related papers

Quantifying Cross-Modality Memorization in Vision-Language Models [86.82366725590508]
We study the unique characteristics of cross-modality memorization and conduct a systematic study centered on vision-language models. Our results reveal that facts learned in one modality transfer to the other, but a significant gap exists between recalling information in the source and target modalities.
arXiv Detail & Related papers (2025-06-05T16:10:47Z)
Promoting cross-modal representations to improve multimodal foundation models for physiological signals [3.630706646160043]
We use a masked autoencoding objective to pretrain a multimodal model. We show that the model learns representations that can be linearly probed for a diverse set of downstream tasks. We argue that explicit methods for inducing cross-modality may enhance multimodal pretraining strategies.
arXiv Detail & Related papers (2024-10-21T18:47:36Z)
PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development. We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
HEMM: Holistic Evaluation of Multimodal Foundation Models [91.60364024897653]
Multimodal foundation models can holistically process text alongside images, video, audio, and other sensory modalities. It is challenging to characterize and study progress in multimodal foundation models, given the range of possible modeling decisions, tasks, and domains.
arXiv Detail & Related papers (2024-07-03T18:00:48Z)
Advancing Multimodal Data Fusion in Pain Recognition: A Strategy Leveraging Statistical Correlation and Human-Centered Perspectives [0.3749861135832073]
This research presents a novel multimodal data fusion methodology for pain behavior recognition. We introduce two key innovations: 1) integrating data-driven statistical relevance weights into the fusion strategy, and 2) incorporating human-centric movement characteristics into multimodal representation learning. Our findings have significant implications for promoting patient-centered healthcare interventions and supporting explainable clinical decision-making.
arXiv Detail & Related papers (2024-03-30T11:13:18Z)
Modality-Aware and Shift Mixer for Multi-modal Brain Tumor Segmentation [12.094890186803958]
We present a novel Modality Aware and Shift Mixer that integrates intra-modality and inter-modality dependencies of multi-modal images for effective and robust brain tumor segmentation. Specifically, we introduce a Modality-Aware module according to neuroimaging studies for modeling the specific modality pair relationships at low levels, and a Modality-Shift module with specific mosaic patterns is developed to explore the complex relationships across modalities at high levels via the self-attention.
arXiv Detail & Related papers (2024-03-04T14:21:51Z)
Few-Shot Learning for Mental Disorder Detection: A Continuous Multi-Prompt Engineering Approach with Medical Knowledge Injection [10.913054876743729]
This study harnesses state-of-the-art AI technology for detecting mental disorders through user-generated textual content. We propose a novel method to address these challenges by leveraging large language models and continuous multi-prompt engineering.
arXiv Detail & Related papers (2024-01-16T13:54:43Z)
Learning Unseen Modality Interaction [54.23533023883659]
Multimodal learning assumes all modality combinations of interest are available during training to learn cross-modal correspondences. We pose the problem of unseen modality interaction and introduce a first solution. It exploits a module that projects the multidimensional features of different modalities into a common space with rich information preserved.
arXiv Detail & Related papers (2023-06-22T10:53:10Z)
A Simple and Flexible Modeling for Mental Disorder Detection by Learning from Clinical Questionnaires [0.2580765958706853]
We propose a novel approach that captures the semantic meanings directly from the text and compares them to symptom-related descriptions. Our detailed analysis shows that the proposed model is effective at leveraging domain knowledge, transferable to other mental disorders, and providing interpretable detection results.
arXiv Detail & Related papers (2023-06-05T15:23:55Z)
Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks. We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
Self-supervised multimodal neuroimaging yields predictive representations for a spectrum of Alzheimer's phenotypes [27.331511924585023]
This work presents a novel multi-scale coordinated framework for learning multiple representations from multimodal neuroimaging data. We propose a general taxonomy of informative inductive biases to capture unique and joint information in multimodal self-supervised fusion. We show that self-supervised models reveal disorder-relevant brain regions and multimodal links without access to the labels during pre-training.
arXiv Detail & Related papers (2022-09-07T01:37:19Z)
Multimodal foundation models are better simulators of the human brain [65.10501322822881]
We present a newly-designed multimodal foundation model pre-trained on 15 million image-text pairs. We find that both visual and lingual encoders trained multimodally are more brain-like compared with unimodal ones.
arXiv Detail & Related papers (2022-08-17T12:36:26Z)
DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models. Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.