Benchmarking Node Outlier Detection on Graphs
- URL: http://arxiv.org/abs/2206.10071v1
- Date: Tue, 21 Jun 2022 01:46:38 GMT
- Title: Benchmarking Node Outlier Detection on Graphs
- Authors: Kay Liu, Yingtong Dou, Yue Zhao, Xueying Ding, Xiyang Hu, Ruitong
Zhang, Kaize Ding, Canyu Chen, Hao Peng, Kai Shu, Lichao Sun, Jundong Li,
George H. Chen, Zhihao Jia, Philip S. Yu
- Abstract summary: Graph outlier detection is an emerging but crucial machine learning task with numerous applications.
We present the first comprehensive unsupervised node outlier detection benchmark for graphs called UNOD.
- Score: 90.29966986023403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph outlier detection is an emerging but crucial machine learning task with
numerous applications. Despite the proliferation of algorithms developed in
recent years, the lack of a standard and unified setting for performance
evaluation limits their advancement and usage in real-world applications. To
bridge this gap, we present (to the best of our knowledge) the first comprehensive
unsupervised node outlier detection benchmark for graphs, called UNOD, with the
following highlights: (1) evaluating fourteen methods with backbones spanning
from classical matrix factorization to the latest graph neural networks; (2)
benchmarking method performance with different types of injected outliers as
well as organic outliers on real-world datasets; (3) comparing the efficiency
and scalability of the algorithms by runtime and GPU memory usage on synthetic
graphs at different scales. Based on analyses of the extensive experimental
results, we discuss the pros and cons of current UNOD methods and point out
multiple crucial and promising future research directions.
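For context on highlight (2), the two injection schemes commonly used in this line of work are structural outliers (small groups of otherwise unrelated nodes that are densely wired together) and contextual outliers (nodes whose attributes are replaced with those of a dissimilar node). The sketch below illustrates both with plain PyTorch tensors; the function names, group sizes, and candidate counts are illustrative assumptions, not the benchmark's actual API or configuration.

```python
import torch

def inject_structural_outliers(edge_index, num_nodes, m=15, n=5, seed=0):
    """Illustrative structural injection: pick n disjoint groups of m nodes
    and fully connect each group, creating small dense cliques among nodes
    that are otherwise unrelated."""
    g = torch.Generator().manual_seed(seed)
    outliers = torch.randperm(num_nodes, generator=g)[: m * n]
    new_edges = []
    for grp in outliers.split(m):
        src = grp.repeat_interleave(m)          # every ordered pair in the group
        dst = grp.repeat(m)
        keep = src != dst                       # drop self-loops
        new_edges.append(torch.stack([src[keep], dst[keep]]))
    return torch.cat([edge_index] + new_edges, dim=1), outliers

def inject_contextual_outliers(x, num_outliers=50, k=50, seed=0):
    """Illustrative contextual injection: for each chosen node, sample k
    candidates and copy the features of the candidate farthest away in
    Euclidean distance, so the node no longer matches its neighborhood."""
    g = torch.Generator().manual_seed(seed)
    x = x.clone()
    outliers = torch.randperm(x.size(0), generator=g)[:num_outliers]
    for i in outliers:
        cand = torch.randperm(x.size(0), generator=g)[:k]
        far = cand[torch.cdist(x[i].unsqueeze(0), x[cand]).argmax()]
        x[i] = x[far]
    return x, outliers
```

In a benchmark of this kind, the indices returned by the two functions typically serve as ground-truth outlier labels, and detectors are scored (e.g., by ROC-AUC) on how well their outlier scores rank those nodes.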
Related papers
- Faster Inference Time for GNNs using coarsening [1.323700980948722]
Coarsening-based methods reduce a graph to a smaller one, resulting in faster computation.
No previous research has tackled the computational cost at inference time.
This paper presents a novel approach to improving the scalability of GNNs through subgraph-based techniques.
arXiv Detail & Related papers (2024-10-19T06:27:24Z) - Rethinking the Effectiveness of Graph Classification Datasets in Benchmarks for Assessing GNNs [7.407592553310068]
We propose an empirical protocol based on a fair benchmarking framework to investigate the performance discrepancy between simple methods and GNNs.
We also propose a novel metric to quantify dataset effectiveness by considering both dataset complexity and model performance.
Our findings shed light on the current understanding of benchmark datasets, and our new platform could fuel the future evolution of graph classification benchmarks.
arXiv Detail & Related papers (2024-07-06T08:33:23Z) - Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark [73.58840254552656]
Unsupervised graph-level anomaly detection (GLAD) and unsupervised graph-level out-of-distribution (OOD) detection have received significant attention in recent years.
We present a Unified Benchmark for unsupervised Graph-level OOD and anomaly Detection.
Our benchmark encompasses 35 datasets spanning four practical anomaly and OOD detection scenarios.
We conduct multi-dimensional analyses to explore the effectiveness, generalizability, robustness, and efficiency of existing methods.
arXiv Detail & Related papers (2024-06-21T04:07:43Z) - Three Revisits to Node-Level Graph Anomaly Detection: Outliers, Message
Passing and Hyperbolic Neural Networks [9.708651460086916]
In this paper, we revisit datasets and approaches for unsupervised node-level graph anomaly detection tasks.
Firstly, we introduce outlier injection methods that create more diverse and graph-based anomalies in graph datasets.
Secondly, we compare methods employing message passing against those without, uncovering an unexpected decline in performance.
arXiv Detail & Related papers (2024-03-06T19:42:34Z) - SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) of a pre-trained language model (LM) on the downstream task.
We then generate node embeddings using the last hidden states of the fine-tuned LM.
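A minimal sketch of this two-stage recipe, assuming Hugging Face `transformers` and `peft`; the model name, LoRA hyperparameters, and mean-pooling choice are placeholder assumptions for illustration, not SimTeG's actual configuration, and the supervised fine-tuning loop is omitted.

```python
import torch
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

# Stage 1 (sketched): attach LoRA adapters to a pre-trained LM for
# parameter-efficient fine-tuning on the downstream node texts/labels.
# The supervised training loop itself is omitted here.
model_name = "sentence-transformers/all-MiniLM-L6-v2"   # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
lm = get_peft_model(
    AutoModel.from_pretrained(model_name),
    LoraConfig(task_type=TaskType.FEATURE_EXTRACTION, r=8, lora_alpha=16,
               target_modules=["query", "value"]),
)

# Stage 2: node embeddings from the last hidden states of the (fine-tuned) LM,
# mean-pooled over valid tokens; these rows become input features for a GNN.
@torch.no_grad()
def embed(texts, batch_size=32):
    chunks = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(texts[i:i + batch_size], padding=True,
                          truncation=True, return_tensors="pt")
        hidden = lm(**batch).last_hidden_state            # (B, T, d)
        mask = batch["attention_mask"].unsqueeze(-1)      # (B, T, 1)
        chunks.append((hidden * mask).sum(1) / mask.sum(1))
    return torch.cat(chunks)
```

The appeal of this decoupling, presumably, is that the LM is queried once to produce fixed node features and the downstream GNN is then trained on those features, keeping the pipeline simple.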
arXiv Detail & Related papers (2023-08-03T07:00:04Z) - A Comprehensive Study on Large-Scale Graph Training: Benchmarking and
Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z) - Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z) - Bootstrapped Representation Learning on Graphs [37.62546075583656]
Current state-of-the-art self-supervised learning methods for graph neural networks (GNNs) are based on contrastive learning.
Inspired by BYOL, we present Bootstrapped Graph Latents (BGRL), a self-supervised graph representation learning method.
BGRL outperforms or matches the previous unsupervised state-of-the-art results on several established benchmark datasets.
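A minimal sketch of the BYOL-style objective this summary alludes to: an online encoder plus predictor is trained to match a stop-gradient target encoder across two augmented views with a negative cosine-similarity loss (no negative samples), while the target weights track the online weights by exponential moving average. The encoder, predictor, and augmentations below are placeholders, not BGRL's exact architecture.

```python
import copy
import torch
import torch.nn.functional as F

def byol_style_loss(online_enc, predictor, target_enc, view1, view2):
    """Symmetric negative cosine similarity between online predictions on one
    augmented view and stop-gradient target embeddings on the other view.
    view1/view2 are tuples of encoder inputs, e.g. (features, edge_index)."""
    def one_side(a, b):
        p = predictor(online_enc(*a))        # online branch: gradients flow
        with torch.no_grad():
            z = target_enc(*b)               # target branch: no gradients
        return -F.cosine_similarity(p, z, dim=-1).mean()
    return one_side(view1, view2) + one_side(view2, view1)

@torch.no_grad()
def ema_update(online_enc, target_enc, tau=0.99):
    """Target weights slowly track the online weights (exponential moving average)."""
    for p_o, p_t in zip(online_enc.parameters(), target_enc.parameters()):
        p_t.mul_(tau).add_((1.0 - tau) * p_o)

# Wiring (placeholder modules): the target encoder starts as a copy of the
# online one and is only ever updated via ema_update, never by backpropagation.
# target_enc = copy.deepcopy(online_enc).requires_grad_(False)
```

Because the target branch provides a stable but slowly moving regression objective, no negative pairs are required, which is what distinguishes this family from the contrastive methods mentioned above.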
arXiv Detail & Related papers (2021-02-12T13:36:39Z) - Heuristic Semi-Supervised Learning for Graph Generation Inspired by
Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points and consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)