LEGO-Learn: Label-Efficient Graph Open-Set Learning
- URL: http://arxiv.org/abs/2410.16386v1
- Date: Mon, 21 Oct 2024 18:01:11 GMT
- Title: LEGO-Learn: Label-Efficient Graph Open-Set Learning
- Authors: Haoyan Xu, Kay Liu, Zhengtao Yao, Philip S. Yu, Kaize Ding, Yue Zhao
- Abstract summary: Graph open-set learning (GOL) and out-of-distribution (OOD) detection train models that accurately classify known, in-distribution (ID) classes while identifying and handling previously unseen classes during inference.
This capability is critical for high-stakes, real-world applications where models frequently encounter unexpected data, including finance, security, and healthcare.
We propose LEGO-Learn, a novel framework that tackles open-set node classification on graphs within a given label budget by selecting the most informative ID nodes.
- Score: 46.62885412695813
- Abstract: How can we train graph-based models to recognize unseen classes while keeping labeling costs low? Graph open-set learning (GOL) and out-of-distribution (OOD) detection aim to address this challenge by training models that can accurately classify known, in-distribution (ID) classes while identifying and handling previously unseen classes during inference. This capability is critical for high-stakes, real-world applications where models frequently encounter unexpected data, including finance, security, and healthcare. However, current GOL methods assume access to many labeled ID samples, which is unrealistic for large-scale graphs due to high annotation costs. In this paper, we propose LEGO-Learn (Label-Efficient Graph Open-set Learning), a novel framework that tackles open-set node classification on graphs within a given label budget by selecting the most informative ID nodes. LEGO-Learn employs a GNN-based filter to identify and exclude potential OOD nodes and then selects highly informative ID nodes for labeling using the K-Medoids algorithm. To prevent the filter from discarding valuable ID examples, we introduce a classifier that differentiates between the C known ID classes and an additional class representing OOD nodes (hence, a C+1 classifier). This classifier uses a weighted cross-entropy loss to balance the removal of OOD nodes against the retention of informative ID nodes. Experimental results on four real-world datasets demonstrate that LEGO-Learn significantly outperforms leading methods, with up to a 6.62% improvement in ID classification accuracy and a 7.49% increase in AUROC for OOD detection.
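The pipeline described in the abstract (filter out likely OOD nodes, then pick nodes to label with K-Medoids, with a C+1 classifier trained under a weighted cross-entropy loss) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names, the OOD-probability threshold, the class weights, the weight-sum normalization, and the deterministic K-Medoids initialization are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math

def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted cross-entropy over C+1 classes (assumption: the last index is OOD).
    Normalizes by the sum of the weights of the true classes."""
    total = weight_sum = 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total -= w * math.log(max(p[y], 1e-12))
        weight_sum += w
    return total / weight_sum

def filter_likely_id(probs, ood_index, threshold=0.5):
    """Keep nodes whose predicted OOD probability is below a (hypothetical) threshold."""
    return [i for i, p in enumerate(probs) if p[ood_index] < threshold]

def k_medoids(points, k, n_iter=20):
    """Plain alternating K-Medoids with Euclidean distance; returns sorted medoid indices."""
    n = len(points)
    d = [[math.dist(points[a], points[b]) for b in range(n)] for a in range(n)]
    medoids = list(range(k))  # assumption: deterministic init with the first k points
    for _ in range(n_iter):
        # Assignment step: every medoid stays in its own cluster.
        clusters = {m: [m] for m in medoids}
        for i in range(n):
            if i not in clusters:
                clusters[min(medoids, key=lambda m: d[i][m])].append(i)
        # Update step: the new medoid minimizes total distance within its cluster.
        updated = [min(c, key=lambda a: sum(d[a][b] for b in c))
                   for c in clusters.values()]
        if sorted(updated) == sorted(medoids):
            break
        medoids = updated
    return sorted(medoids)

def select_nodes_to_label(probs, embeddings, budget, ood_index, threshold=0.5):
    """Filter likely-OOD nodes, then return up to `budget` medoid nodes to label."""
    kept = filter_likely_id(probs, ood_index, threshold)
    k = min(budget, len(kept))
    if k == 0:
        return []
    medoids = k_medoids([embeddings[i] for i in kept], k)
    return [kept[m] for m in medoids]

# Toy data: C = 2 ID classes plus one OOD class at index 2; node 3 looks OOD.
embeddings = [(0, 0), (0, 1), (1, 0), (5, 5), (10, 10), (10, 11), (11, 10)]
probs = [[0.8, 0.1, 0.1]] * 3 + [[0.1, 0.1, 0.8]] + [[0.1, 0.8, 0.1]] * 3
selected = select_nodes_to_label(probs, embeddings, budget=2, ood_index=2)
print(selected)
```

In this toy setup the node with a high OOD probability is removed before clustering, and one representative is chosen from each of the two remaining groups. Giving the OOD class a smaller weight in `class_weights` is one way to make the classifier less eager to discard ID nodes; the actual weighting scheme used by LEGO-Learn is not given in the abstract.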
Related papers
- GOODAT: Towards Test-time Graph Out-of-Distribution Detection [103.40396427724667]
Graph neural networks (GNNs) have found widespread application in modeling graph data across diverse domains.
Recent studies have explored graph OOD detection, often focusing on training a specific model or modifying the data on top of a well-trained GNN.
This paper introduces a data-centric, unsupervised, and plug-and-play solution that operates independently of training data and modifications of GNN architecture.
arXiv Detail & Related papers (2024-01-10T08:37:39Z) - ERASE: Error-Resilient Representation Learning on Graphs for Label Noise Tolerance [53.73316938815873]
We propose a method called ERASE (Error-Resilient representation learning on graphs for lAbel noiSe tolerancE) to learn representations with error tolerance.
ERASE combines prototype pseudo-labels with propagated denoised labels and updates representations with error resilience.
Our method outperforms multiple baselines by clear margins across a broad range of noise levels and scales well.
arXiv Detail & Related papers (2023-12-13T17:59:07Z) - Learning on Graphs with Out-of-Distribution Nodes [33.141867473074264]
Graph Neural Networks (GNNs) are state-of-the-art models for performing prediction tasks on graphs.
This work defines the problem of graph learning with out-of-distribution nodes.
We propose Out-of-Distribution Graph Attention Network (OODGAT), a novel GNN model which explicitly models the interaction between different kinds of nodes.
arXiv Detail & Related papers (2023-08-13T08:10:23Z) - Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection [134.05510658882278]
Cyclic-Bootstrap Labeling (CBL) is a novel weakly supervised object detection pipeline.
It uses a weighted exponential moving average strategy to take advantage of various refinement modules.
A novel class-specific ranking distillation algorithm is proposed to leverage the output of the weighted ensembled teacher network.
arXiv Detail & Related papers (2023-08-11T07:57:17Z) - Dissimilar Nodes Improve Graph Active Learning [27.78519071553204]
We introduce three dissimilarity-based information scores for active learning: feature dissimilarity score (FDS), structure dissimilarity score (SDS), and embedding dissimilarity score (EDS).
Our newly proposed scores boost the classification accuracy by 2.1% on average and are capable of generalizing to different Graph Neural Network architectures.
arXiv Detail & Related papers (2022-12-05T01:00:37Z) - Noise-robust Graph Learning by Estimating and Leveraging Pairwise Interactions [123.07967420310796]
This paper bridges the gap by proposing a pairwise framework for noisy node classification on graphs.
PI-GNN relies on the PI as a primary learning proxy in addition to the pointwise learning from the noisy node class labels.
Our proposed framework PI-GNN contributes two novel components: (1) a confidence-aware PI estimation model that adaptively estimates the PI labels, and (2) a decoupled training approach that leverages the estimated PI labels.
arXiv Detail & Related papers (2021-06-14T14:23:08Z) - Active Learning for Node Classification: The Additional Learning Ability from Unlabelled Nodes [33.97571297149204]
Given a limited labelling budget, active learning aims to improve performance by carefully choosing which nodes to label.
Our empirical study shows that existing active learning methods for node classification are considerably outperformed by a simple method.
We propose a novel latent space clustering-based active learning method for node classification (LSCALE).
arXiv Detail & Related papers (2020-12-13T13:59:48Z) - When Contrastive Learning Meets Active Learning: A Novel Graph Active Learning Paradigm with Self-Supervision [19.938379604834743]
This paper studies active learning (AL) on graphs, whose purpose is to discover the most informative nodes to maximize the performance of graph neural networks (GNNs).
Motivated by the success of contrastive learning (CL), we propose a novel paradigm that seamlessly integrates graph AL with CL.
Comprehensive, confounding-free experiments on five public datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2020-10-30T06:20:07Z) - DEAL: Deep Evidential Active Learning for Image Classification [0.0]
Active Learning (AL) is one approach to mitigate the problem of limited labeled data.
Recent AL methods for CNNs propose different solutions for the selection of instances to be labeled.
We propose a novel AL algorithm that efficiently learns from unlabeled data by capturing high prediction uncertainty.
arXiv Detail & Related papers (2020-07-22T11:14:23Z) - Inverse Graph Identification: Can We Identify Node Labels Given Graph Labels? [89.13567439679709]
Graph Identification (GI) has long been researched in graph learning and is essential in certain applications.
This paper defines a novel problem dubbed Inverse Graph Identification (IGI).
We propose a simple yet effective method that performs node-level message passing using a Graph Attention Network (GAT) under the GI protocol.
arXiv Detail & Related papers (2020-07-12T12:06:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.