GSDNet: Revisiting Incomplete Multimodal-Diffusion from Graph Spectrum Perspective for Conversation Emotion Recognition
- URL: http://arxiv.org/abs/2506.12325v1
- Date: Sat, 14 Jun 2025 03:24:19 GMT
- Title: GSDNet: Revisiting Incomplete Multimodal-Diffusion from Graph Spectrum Perspective for Conversation Emotion Recognition
- Authors: Yuntao Shou, Jun Yao, Tao Meng, Wei Ai, Cen Chen, Keqin Li,
- Abstract summary: Multimodal emotion recognition in conversations aims to infer the speaker's emotional state by analyzing utterance information from multiple sources.<n>The modality missing problem severely limits the performance of MERC in practical scenarios.<n>We propose a novel Graph Spectral Diffusion Network (GSDNet) which maps Gaussian noise to the graph spectral space of missing modalities and recovers the missing data according to its original distribution.
- Score: 26.41302797345201
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multimodal emotion recognition in conversations (MERC) aims to infer the speaker's emotional state by analyzing utterance information from multiple sources (i.e., video, audio, and text). Compared with unimodality, a more robust utterance representation can be obtained by fusing complementary semantic information from different modalities. However, the modality missing problem severely limits the performance of MERC in practical scenarios. Recent work has achieved impressive performance on modality completion using graph neural networks and diffusion models, respectively. This inspires us to combine these two dimensions through the graph diffusion model to obtain more powerful modal recovery capabilities. Unfortunately, existing graph diffusion models may destroy the connectivity and local structure of the graph by directly adding Gaussian noise to the adjacency matrix, resulting in the generated graph data being unable to retain the semantic and topological information of the original graph. To this end, we propose a novel Graph Spectral Diffusion Network (GSDNet), which maps Gaussian noise to the graph spectral space of missing modalities and recovers the missing data according to its original distribution. Compared with previous graph diffusion methods, GSDNet only affects the eigenvalues of the adjacency matrix instead of destroying the adjacency matrix directly, which can maintain the global topological information and important spectral features during the diffusion process. Extensive experiments have demonstrated that GSDNet achieves state-of-the-art emotion recognition performance in various modality loss scenarios.
Related papers
- Graffe: Graph Representation Learning via Diffusion Probabilistic Models [25.28957372847043]
We introduce Graffe, a self-supervised diffusion model proposed for graph representation learning.<n>It features a graph encoder that distills a source graph into a compact representation, which serves as the condition to guide the denoising process of the diffusion decoder.
arXiv Detail & Related papers (2025-05-08T05:38:19Z) - MAPN: Enhancing Heterogeneous Sparse Graph Representation by Mamba-based Asynchronous Aggregation [4.114908634432608]
This paper introduces the Mamba-based Asynchronous Propagation Network (MAPN), which enhances the representation of heterogeneous sparse graphs.<n> MAPN consists of two primary components: node sequence generation and semantic information aggregation.<n>Extensive experiments across diverse datasets demonstrate the effectiveness of MAPN in graph embeddings for various downstream tasks.
arXiv Detail & Related papers (2025-02-23T06:02:31Z) - DiffGraph: Heterogeneous Graph Diffusion Model [16.65576765238224]
Graph Neural Networks (GNNs) have revolutionized graph-structured data modeling, yet traditional GNNs struggle with complex heterogeneous structures prevalent in real-world scenarios.<n>We present the Heterogeneous Graph Diffusion Model (DiffGraph), a pioneering framework that introduces an innovative cross-view denoising strategy.<n>At its core, DiffGraph features a sophisticated latent heterogeneous graph diffusion mechanism, implementing a novel forward and backward diffusion process for superior noise management.
arXiv Detail & Related papers (2025-01-04T15:30:48Z) - Advancing Graph Generation through Beta Diffusion [49.49740940068255]
Graph Beta Diffusion (GBD) is a generative model specifically designed to handle the diverse nature of graph data.
We propose a modulation technique that enhances the realism of generated graphs by stabilizing critical graph topology.
arXiv Detail & Related papers (2024-06-13T17:42:57Z) - Continuous Product Graph Neural Networks [5.703629317205571]
Multidomain data defined on multiple graphs holds significant potential in practical applications in computer science.
We introduce Continuous Product Graph Neural Networks (CITRUS) that emerge as a natural solution to the TPDEG.
We evaluate CITRUS on well-known traffic andtemporal weather forecasting datasets, demonstrating superior performance over existing approaches.
arXiv Detail & Related papers (2024-05-29T08:36:09Z) - Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding [51.75091298017941]
This paper proposes a novel Deep Manifold (Variational) Graph Auto-Encoder (DMVGAE/DMGAE) for attributed graph data.
The proposed method surpasses state-of-the-art baseline algorithms by a significant margin on different downstream tasks across popular datasets.
arXiv Detail & Related papers (2024-01-12T17:57:07Z) - Leveraging Graph Diffusion Models for Network Refinement Tasks [72.54590628084178]
We propose a novel graph generative framework, SGDM, based on subgraph diffusion.
Our framework not only improves the scalability and fidelity of graph diffusion models, but also leverages the reverse process to perform novel, conditional generation tasks.
arXiv Detail & Related papers (2023-11-29T18:02:29Z) - Supercharging Graph Transformers with Advective Diffusion [28.40109111316014]
This paper proposes AdvDIFFormer, a physics-inspired graph Transformer model designed to address this challenge.<n>We show that AdvDIFFormer has provable capability for controlling generalization error with topological shifts.<n> Empirically, the model demonstrates superiority in various predictive tasks across information networks, molecular screening and protein interactions.
arXiv Detail & Related papers (2023-10-10T08:40:47Z) - Dynamic Causal Explanation Based Diffusion-Variational Graph Neural
Network for Spatio-temporal Forecasting [60.03169701753824]
We propose a novel Dynamic Diffusion-al Graph Neural Network (DVGNN) fortemporal forecasting.
The proposed DVGNN model outperforms state-of-the-art approaches and achieves outstanding Root Mean Squared Error result.
arXiv Detail & Related papers (2023-05-16T11:38:19Z) - Text Enriched Sparse Hyperbolic Graph Convolutional Networks [21.83127488157701]
Graph Neural Networks (GNNs) and their hyperbolic variants provide a promising approach to encode such networks in a low-dimensional latent space.
We propose Text Enriched Sparse Hyperbolic Graph Convolution Network (TESH-GCN) to capture the graph's metapath structures using semantic signals.
Our model outperforms the current state-of-the-art approaches by a large margin on the task of link prediction.
arXiv Detail & Related papers (2022-07-06T00:23:35Z) - Learning Graph Structure from Convolutional Mixtures [119.45320143101381]
We propose a graph convolutional relationship between the observed and latent graphs, and formulate the graph learning task as a network inverse (deconvolution) problem.
In lieu of eigendecomposition-based spectral methods, we unroll and truncate proximal gradient iterations to arrive at a parameterized neural network architecture that we call a Graph Deconvolution Network (GDN)
GDNs can learn a distribution of graphs in a supervised fashion, perform link prediction or edge-weight regression tasks by adapting the loss function, and they are inherently inductive.
arXiv Detail & Related papers (2022-05-19T14:08:15Z) - Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z) - Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs.
We propose an approximative framework to such non-trivial ERGs that result in dyadic independence (i.e., edge independent) distributions.
Our methods are scalable to sparse graphs consisting of millions of nodes.
arXiv Detail & Related papers (2020-02-14T11:42:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.