Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition
- URL: http://arxiv.org/abs/2406.16968v2
- Date: Wed, 26 Jun 2024 01:54:51 GMT
- Title: Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition
- Authors: Kai Shao, Rui Wang, Yixue Hao, Long Hu, Min Chen, Hans Arno Jacobsen,
- Abstract summary: Depression based on physiological signals such as functional near-infrared spectroscopy (NIRS) and electroencephalogram (EEG) has made considerable progress.
In this paper, we introduce a multimodal physiological signals representation learning framework using architecture via multiscale contrasting for depression recognition (MRLM)
To enhance the learning of semantic representation associated with stimulation tasks, a semantic contrast module is proposed.
- Score: 18.65975882665568
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depression recognition based on physiological signals such as functional near-infrared spectroscopy (fNIRS) and electroencephalogram (EEG) has made considerable progress. However, most existing studies ignore the complementarity and semantic consistency of multimodal physiological signals under the same stimulation task in complex spatio-temporal patterns. In this paper, we introduce a multimodal physiological signals representation learning framework using Siamese architecture via multiscale contrasting for depression recognition (MRLMC). First, fNIRS and EEG are transformed into different but correlated data based on a time-domain data augmentation strategy. Then, we design a spatio-temporal contrasting module to learn the representation of fNIRS and EEG through weight-sharing multiscale spatio-temporal convolution. Furthermore, to enhance the learning of semantic representation associated with stimulation tasks, a semantic consistency contrast module is proposed, aiming to maximize the semantic similarity of fNIRS and EEG. Extensive experiments on publicly available and self-collected multimodal physiological signals datasets indicate that MRLMC outperforms the state-of-the-art models. Moreover, our proposed framework is capable of transferring to multimodal time series downstream tasks.
Related papers
- MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding [50.55024115943266]
We introduce a novel semantic alignment method of multi-subject fMRI signals using so-called MindFormer.
This model is specifically designed to generate fMRI-conditioned feature vectors that can be used for conditioning Stable Diffusion model for fMRI- to-image generation or large language model (LLM) for fMRI-to-text generation.
Our experimental results demonstrate that MindFormer generates semantically consistent images and text across different subjects.
arXiv Detail & Related papers (2024-05-28T00:36:25Z) - Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE)
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z) - MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z) - NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation [55.51412454263856]
This paper proposes to directly modulate the generation process of diffusion models using fMRI signals.
By training with about 67,000 fMRI-image pairs from various individuals, our model enjoys superior fMRI-to-image decoding capacity.
arXiv Detail & Related papers (2024-03-27T02:42:52Z) - Hypercomplex Multimodal Emotion Recognition from EEG and Peripheral
Physiological Signals [7.293063257956068]
We propose a hypercomplex multimodal network equipped with a novel fusion module comprising parameterized hypercomplex multiplications.
We perform classification of valence and arousal from electroencephalogram (EEG) and peripheral physiological signals, employing the publicly available database MAHNOB-HCI.
arXiv Detail & Related papers (2023-10-11T16:45:44Z) - TACOformer:Token-channel compounded Cross Attention for Multimodal
Emotion Recognition [0.951828574518325]
We propose a comprehensive perspective of multimodal fusion that integrates channel-level and token-level cross-modal interactions.
Specifically, we introduce a unified cross attention module called Token-chAnnel COmpound (TACO) Cross Attention.
We also propose a 2D position encoding method to preserve information about the spatial distribution of EEG signal channels.
arXiv Detail & Related papers (2023-06-23T16:28:12Z) - fMRI from EEG is only Deep Learning away: the use of interpretable DL to
unravel EEG-fMRI relationships [68.8204255655161]
We present an interpretable domain grounded solution to recover the activity of several subcortical regions from multichannel EEG data.
We recover individual spatial and time-frequency patterns of scalp EEG predictive of the hemodynamic signal in the subcortical nuclei.
arXiv Detail & Related papers (2022-10-23T15:11:37Z) - A Self-attention Guided Multi-scale Gradient GAN for Diversified X-ray
Image Synthesis [0.6308539010172307]
Generative Adversarial Networks (GANs) are utilized to address the data limitation problem via the generation of synthetic images.
Training challenges such as mode collapse, non-convergence, and instability degrade a GAN's performance in synthesizing diversified and high-quality images.
This work proposes an attention-guided multi-scale gradient GAN architecture to model the relationship between long-range dependencies of biomedical image features.
arXiv Detail & Related papers (2022-10-09T13:17:17Z) - Video-based Remote Physiological Measurement via Cross-verified Feature
Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features with non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and r signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z) - Augmenting interictal mapping with neurovascular coupling biomarkers by
structured factorization of epileptic EEG and fMRI data [3.2268407474728957]
We develop a novel structured matrix-tensor factorization for EEG-fMRI analysis.
We show that the extracted source signatures provide a sensitive localization of the ictal onset zone.
We also show that complementary localizing information can be derived from the spatial variation of the hemodynamic response.
arXiv Detail & Related papers (2020-04-29T13:27:45Z) - Multi-Scale Neural network for EEG Representation Learning in BCI [2.105172041656126]
We propose a novel deep multi-scale neural network that discovers feature representations in multiple frequency/time ranges.
By representing EEG signals withspectral-temporal information, the proposed method can be utilized for diverse paradigms.
arXiv Detail & Related papers (2020-03-02T04:06:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.