Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models
- URL: http://arxiv.org/abs/2312.17679v2
- Date: Wed, 11 Sep 2024 21:39:09 GMT
- Title: Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models
- Authors: Kay Liu, Hengrui Zhang, Ziqing Hu, Fangxin Wang, Philip S. Yu
- Abstract summary: We introduce GODM, a novel data augmentation method for mitigating class imbalance in supervised graph outlier detection with latent diffusion models.
Our proposed method consists of three key components: (1) Variational Encoder maps the heterogeneous information inherent within the graph data into a unified latent space, (2) Graph Generator synthesizes graph data that are statistically similar to real outliers from latent space, and (3) Latent Diffusion Model learns the latent space distribution of real organic data by iterative denoising.
- Score: 39.33024157496401
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph outlier detection is a prominent task of research and application in the realm of graph neural networks. It identifies the outlier nodes that exhibit deviation from the majority in the graph. One of the fundamental challenges confronting supervised graph outlier detection algorithms is the prevalent issue of class imbalance, where the scarcity of outlier instances compared to normal instances often results in suboptimal performance. Conventional methods mitigate the imbalance by reweighting instances in the estimation of the loss function, assigning higher weights to outliers and lower weights to inliers. Nonetheless, these strategies are prone to overfitting and underfitting, respectively. Recently, generative models, especially diffusion models, have demonstrated their efficacy in synthesizing high-fidelity images. Despite their extraordinary generation quality, their potential in data augmentation for supervised graph outlier detection remains largely underexplored. To bridge this gap, we introduce GODM, a novel data augmentation method for mitigating class imbalance in supervised Graph Outlier detection with latent Diffusion Models. Specifically, our proposed method consists of three key components: (1) Variational Encoder maps the heterogeneous information inherent within the graph data into a unified latent space, (2) Graph Generator synthesizes graph data that are statistically similar to real outliers from latent space, and (3) Latent Diffusion Model learns the latent space distribution of real organic data by iterative denoising. Extensive experiments conducted on multiple datasets substantiate the effectiveness and efficiency of GODM. The case study further demonstrates the generation quality of our synthetic data. To foster accessibility and reproducibility, we encapsulate GODM into a plug-and-play package and release it on the Python Package Index (PyPI).
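The conventional reweighting baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not GODM and not code from the paper or its PyPI package; the function names `weighted_bce` and `inverse_frequency_weights`, the inverse-frequency weighting rule, and the toy labels and probabilities are all assumptions chosen for illustration.

```python
import math


def weighted_bce(probs, labels, pos_weight=1.0, neg_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy with a separate weight per class.

    probs  -- predicted outlier probabilities in [0, 1]
    labels -- 1 for outlier, 0 for inlier
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -neg_weight * math.log(1.0 - p)
    return total / len(labels)


def inverse_frequency_weights(labels):
    """Hypothetical weighting rule: weight each class by n / (2 * class count)."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return n / (2.0 * n_pos), n / (2.0 * n_neg)


# Toy imbalanced data (made up): nine inliers and one poorly scored outlier.
labels = [0] * 9 + [1]
probs = [0.1] * 9 + [0.3]

pos_w, neg_w = inverse_frequency_weights(labels)
plain = weighted_bce(probs, labels)
reweighted = weighted_bce(probs, labels, pos_w, neg_w)
print(pos_w, neg_w, plain, reweighted)
```

Under these assumptions the single outlier dominates the reweighted loss, which is one way to see why the abstract describes such strategies as prone to overfitting on the few outliers; augmentation instead adds synthetic outliers to the training set.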
Related papers
- Invariant Graph Learning Meets Information Bottleneck for Out-of-Distribution Generalization [9.116601683256317]
In this work, we propose a novel framework, called Invariant Graph Learning based on Information bottleneck theory (InfoIGL).
Specifically, InfoIGL introduces a redundancy filter to compress task-irrelevant information related to environmental factors.
Experiments on both synthetic and real-world datasets demonstrate that our method achieves state-of-the-art performance under OOD generalization.
arXiv Detail & Related papers (2024-08-03T07:38:04Z)
- Imbalanced Graph-Level Anomaly Detection via Counterfactual Augmentation and Feature Learning [1.3756846638796]
We propose an imbalanced GLAD method via counterfactual augmentation and feature learning.
We apply the model to brain disease datasets, which demonstrates the capability of our work.
arXiv Detail & Related papers (2024-07-13T13:40:06Z)
- ADA-GAD: Anomaly-Denoised Autoencoders for Graph Anomaly Detection [84.0718034981805]
We introduce a novel framework called Anomaly-Denoised Autoencoders for Graph Anomaly Detection (ADA-GAD).
In the first stage, we design a learning-free anomaly-denoised augmentation method to generate graphs with reduced anomaly levels.
In the next stage, the decoders are retrained for detection on the original graph.
arXiv Detail & Related papers (2023-12-22T09:02:01Z)
- Graph Out-of-Distribution Generalization with Controllable Data Augmentation [51.17476258673232]
Graph Neural Network (GNN) has demonstrated extraordinary performance in classifying graph properties.
Due to the selection bias of training and testing data, distribution deviation is widespread.
We propose OOD calibration to measure the distribution deviation of virtual samples.
arXiv Detail & Related papers (2023-08-16T13:10:27Z)
- Resisting Graph Adversarial Attack via Cooperative Homophilous Augmentation [60.50994154879244]
Recent studies show that Graph Neural Networks are vulnerable and easily fooled by small perturbations.
In this work, we focus on the emerging but critical attack, namely, Graph Injection Attack.
We propose a general defense framework CHAGNN against GIA through cooperative homophilous augmentation of graph data and model.
arXiv Detail & Related papers (2022-11-15T11:44:31Z)
- DAGAD: Data Augmentation for Graph Anomaly Detection [57.92471847260541]
This paper devises a novel Data Augmentation-based Graph Anomaly Detection (DAGAD) framework for attributed graphs.
A series of experiments on three datasets shows that DAGAD outperforms ten state-of-the-art baseline detectors on various commonly used metrics.
arXiv Detail & Related papers (2022-10-18T11:28:21Z)
- OOD-GNN: Out-of-Distribution Generalized Graph Neural Network [73.67049248445277]
Graph neural networks (GNNs) have achieved impressive performance when testing and training graph data come from the same distribution.
Existing GNNs lack out-of-distribution generalization abilities so that their performance substantially degrades when there exist distribution shifts between testing and training graph data.
We propose an out-of-distribution generalized graph neural network (OOD-GNN) for achieving satisfactory performance on unseen testing graphs whose distributions differ from those of the training graphs.
arXiv Detail & Related papers (2021-12-07T16:29:10Z)
- Issues with Propagation Based Models for Graph-Level Outlier Detection [16.980621769406916]
Graph-Level Outlier Detection (GLOD) is the task of identifying unusual graphs within a graph database.
This paper identifies and delves into a fundamental and intriguing issue with applying propagation based models to GLOD.
We find that the ROC-AUC performance of the models changes significantly depending on which class is down-sampled.
arXiv Detail & Related papers (2020-12-23T19:38:21Z)