Active Learning for Node Classification: The Additional Learning Ability
from Unlabelled Nodes
- URL: http://arxiv.org/abs/2012.07065v1
- Date: Sun, 13 Dec 2020 13:59:48 GMT
- Title: Active Learning for Node Classification: The Additional Learning Ability
from Unlabelled Nodes
- Authors: Juncheng Liu, Yiwei Wang, Bryan Hooi, Renchi Yang, Xiaokui Xiao
- Abstract summary: Given a limited labelling budget, active learning aims to improve performance by carefully choosing which nodes to label.
Our empirical study shows that existing active learning methods for node classification are considerably outperformed by a simple method.
We propose a novel latent space clustering-based active learning method for node classification (LSCALE).
- Score: 33.97571297149204
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Node classification on graph data is an important task in many practical
domains. However, it requires labels for training, which can be difficult or
expensive to obtain in practice. Given a limited labelling budget, active
learning aims to improve performance by carefully choosing which nodes to
label. Our empirical study shows that existing active learning methods for node
classification are considerably outperformed by a simple method which randomly
selects nodes to label and trains a linear classifier with labelled nodes and
unsupervised learning features. This indicates that existing methods do not
fully utilize the information present in unlabelled nodes as they only use
unlabelled nodes for label acquisition. In this paper, we utilize the
information in unlabelled nodes by using unsupervised learning features. We
propose a novel latent space clustering-based active learning method for node
classification (LSCALE). Specifically, to select nodes for labelling, our
method uses the K-Medoids clustering algorithm on a feature space based on the
dynamic combination of both unsupervised features and supervised features. In
addition, we design an incremental clustering module to avoid redundancy
between nodes selected at different steps. We conduct extensive experiments on
three public citation datasets and two co-authorship datasets, where our
proposed method LSCALE consistently outperforms the state-of-the-art
approaches by a large margin.
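The selection step described in the abstract can be sketched as a plain K-Medoids pass over a node-feature matrix. The following is a minimal illustration, not the paper's implementation: the helper name `k_medoids_select`, the random features, and the plain Euclidean distance are all assumptions, and LSCALE's dynamic combination of unsupervised and supervised features and its incremental clustering module are omitted.

```python
import numpy as np

def k_medoids_select(features, budget, n_iter=20, seed=0):
    """Pick `budget` representative nodes with a simple K-Medoids pass.

    `features` stands in for the combined unsupervised/supervised node
    embeddings described in the abstract; here it is just an
    (n_nodes, dim) array. The medoid indices are returned as the nodes
    to query for labels.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    # Pairwise Euclidean distances between all nodes.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    medoids = rng.choice(n, size=budget, replace=False)
    for _ in range(n_iter):
        # Assign every node to its nearest medoid.
        assign = dist[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for k in range(budget):
            members = np.where(assign == k)[0]
            if len(members) == 0:
                continue
            # The new medoid minimises total distance within the cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[k] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids

# Toy usage: 30 nodes with 8-dim embeddings, a labelling budget of 5.
X = np.random.default_rng(1).normal(size=(30, 8))
picked = k_medoids_select(X, budget=5)
print(sorted(picked.tolist()))
```

Each medoid is itself a node, so the selected set consists of real, maximally representative nodes, which is what makes medoid-based (rather than centroid-based) clustering natural for label acquisition.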
Related papers
- Inconsistency-Based Data-Centric Active Open-Set Annotation [6.652785290214744]
NEAT is a data-centric active learning method that actively annotates open-set data.
NEAT achieves significantly better performance than state-of-the-art active learning methods for active open-set annotation.
arXiv Detail & Related papers (2024-01-10T04:18:02Z)
- Contrastive Meta-Learning for Few-shot Node Classification [54.36506013228169]
Few-shot node classification aims to predict labels for nodes on graphs with only limited labeled nodes as references.
We create a novel contrastive meta-learning framework on graphs, named COSMIC, with two key designs.
arXiv Detail & Related papers (2023-06-27T02:22:45Z)
- Dissimilar Nodes Improve Graph Active Learning [27.78519071553204]
We introduce three dissimilarity-based information scores for active learning: feature dissimilarity score (FDS), structure dissimilarity score (SDS), and embedding dissimilarity score (EDS).
Our newly proposed scores boost the classification accuracy by 2.1% on average and are capable of generalizing to different Graph Neural Network architectures.
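The three scores above are only named in the summary. As one hedged illustration of the general idea, a feature dissimilarity score could rank unlabelled nodes by how far their features are from the already-labelled set; the helper name `feature_dissimilarity_scores` and the nearest-neighbour distance used here are assumptions, not the paper's definitions.

```python
import numpy as np

def feature_dissimilarity_scores(features, labelled_idx):
    """Score nodes by how dissimilar their features are to the
    labelled set (higher = more informative to label next)."""
    labelled = features[labelled_idx]
    diff = features[:, None, :] - labelled[None, :, :]
    d = np.sqrt((diff ** 2).sum(-1))   # (n_nodes, n_labelled)
    scores = d.min(axis=1)             # distance to nearest labelled node
    scores[labelled_idx] = -np.inf     # never re-select labelled nodes
    return scores

# Toy usage: 20 nodes with 4-dim features, nodes 0-2 already labelled.
X = np.random.default_rng(0).normal(size=(20, 4))
scores = feature_dissimilarity_scores(X, labelled_idx=np.array([0, 1, 2]))
pick = int(scores.argmax())
print(pick)
```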
arXiv Detail & Related papers (2022-12-05T01:00:37Z)
- LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is even highly competitive compared to the fully supervised counterpart with 100% labels.
arXiv Detail & Related papers (2022-10-14T19:13:36Z)
- Label-Enhanced Graph Neural Network for Semi-supervised Node Classification [32.64730237473914]
We present a label-enhanced learning framework for Graph Neural Networks (GNNs).
It first models each label as a virtual center for intra-class nodes and then jointly learns the representations of both nodes and labels.
Our approach not only smooths the representations of nodes belonging to the same class but also explicitly encodes the label semantics into the learning process of GNNs.
arXiv Detail & Related papers (2022-05-31T09:48:47Z)
- Information Gain Propagation: a new way to Graph Active Learning with Soft Labels [26.20597165750861]
Graph Neural Networks (GNNs) have achieved great success in various tasks, but their performance relies heavily on a large number of labeled nodes.
We propose GNN-based Active Learning (AL) methods to improve the labeling efficiency by selecting the most valuable nodes to label.
Our method significantly outperforms the state-of-the-art GNN-based AL methods in terms of both accuracy and labeling cost.
arXiv Detail & Related papers (2022-03-02T13:28:25Z)
- Dual-Refinement: Joint Label and Feature Refinement for Unsupervised Domain Adaptive Person Re-Identification [51.98150752331922]
Unsupervised domain adaptive (UDA) person re-identification (re-ID) is a challenging task because labels are missing for the target domain data.
We propose a novel approach, called Dual-Refinement, that jointly refines pseudo labels at the off-line clustering phase and features at the on-line training phase.
Our method outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-12-26T07:35:35Z)
- When Contrastive Learning Meets Active Learning: A Novel Graph Active Learning Paradigm with Self-Supervision [19.938379604834743]
This paper studies active learning (AL) on graphs, whose purpose is to discover the most informative nodes to maximize the performance of graph neural networks (GNNs).
Motivated by the success of contrastive learning (CL), we propose a novel paradigm that seamlessly integrates graph AL with CL.
Comprehensive, confounding-free experiments on five public datasets demonstrate the superiority of our method over state-of-the-art approaches.
arXiv Detail & Related papers (2020-10-30T06:20:07Z)
- PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data.
We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
arXiv Detail & Related papers (2020-10-19T17:59:30Z)
- Active Learning for Coreference Resolution using Discrete Annotation [76.36423696634584]
We improve upon pairwise annotation for active learning in coreference resolution.
We ask annotators to identify mention antecedents if a presented mention pair is deemed not coreferent.
In experiments with existing benchmark coreference datasets, we show that the signal from this additional question leads to significant performance gains per human-annotation hour.
arXiv Detail & Related papers (2020-04-28T17:17:11Z)
- Graph Inference Learning for Semi-supervised Classification [50.55765399527556]
We propose a Graph Inference Learning framework to boost the performance of semi-supervised node classification.
For learning the inference process, we introduce meta-optimization on structure relations from training nodes to validation nodes.
Comprehensive evaluations on four benchmark datasets demonstrate the superiority of our proposed GIL when compared against state-of-the-art methods.
arXiv Detail & Related papers (2020-01-17T02:52:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.