DiffusAL: Coupling Active Learning with Graph Diffusion for
Label-Efficient Node Classification
- URL: http://arxiv.org/abs/2308.00146v1
- Date: Mon, 31 Jul 2023 20:30:13 GMT
- Title: DiffusAL: Coupling Active Learning with Graph Diffusion for
Label-Efficient Node Classification
- Authors: Sandra Gilhuber, Julian Busch, Daniel Rotthues, Christian M. M. Frey
and Thomas Seidl
- Abstract summary: We introduce a novel active graph learning approach called DiffusAL, showing significant robustness in diverse settings.
Most of our calculations for acquisition and training can be precomputed, making DiffusAL more efficient than approaches that combine diverse selection criteria.
Our experiments on various benchmark datasets show that, unlike previous methods, our approach significantly outperforms random selection in 100% of all datasets and labeling budgets tested.
- Score: 1.0602247913671219
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Node classification is one of the core tasks on attributed graphs, but
successful graph learning solutions require sufficiently labeled data. To keep
annotation costs low, active graph learning focuses on selecting the most
informative subset of nodes to maximize label efficiency. However, deciding
which heuristic is best suited for an unlabeled graph to increase label
efficiency is a persistent challenge. Existing solutions either neglect
aligning the learned model and the sampling method or focus only on limited
selection aspects. They are thus sometimes worse than, or only on par with, random
sampling. In this work, we introduce a novel active graph learning approach
called DiffusAL, showing significant robustness in diverse settings. Toward
better transferability between different graph structures, we combine three
independent scoring functions to identify the most informative node samples for
labeling in a parameter-free way: i) Model Uncertainty, ii) Diversity
Component, and iii) Node Importance computed via graph diffusion heuristics.
Most of our calculations for acquisition and training can be precomputed,
making DiffusAL more efficient than approaches that combine diverse
selection criteria, and about as fast as simpler heuristics. Our experiments on
various benchmark datasets show that, unlike previous methods, our approach
significantly outperforms random selection in 100% of all datasets and labeling
budgets tested.
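The acquisition step described above lends itself to a compact sketch. The following Python is a hedged illustration, not the authors' implementation: the personalized-PageRank power iteration is a standard graph-diffusion heuristic, while the percentile-rank product used to combine the three scores, and all names (`ppr_importance`, `acquire`, `alpha=0.1`), are assumptions standing in for DiffusAL's exact parameter-free formulation.

```python
import numpy as np

def ppr_importance(A, alpha=0.1, iters=50):
    """Node importance via personalized PageRank diffusion (standard heuristic)."""
    n = A.shape[0]
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)
    P = A / deg                                   # row-stochastic transition matrix
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = alpha / n + (1 - alpha) * (P.T @ r)   # teleport + diffuse
    return r

def entropy(probs, eps=1e-12):
    """Predictive entropy per node from an (n, C) softmax matrix."""
    return -(probs * np.log(probs + eps)).sum(axis=1)

def acquire(A, probs, emb, labeled, budget):
    """Rank unlabeled nodes by a product of percentile-normalized scores."""
    imp = ppr_importance(A)
    unc = entropy(probs)
    if len(labeled):                              # diversity: distance to nearest labeled node
        d = np.linalg.norm(emb[:, None, :] - emb[labeled][None, :, :], axis=2)
        div = d.min(axis=1)
    else:
        div = np.ones(len(emb))
    rank = lambda s: np.argsort(np.argsort(s)) / max(len(s) - 1, 1)
    score = rank(imp) * rank(unc) * rank(div)     # parameter-free combination (assumed form)
    score[np.asarray(labeled, dtype=int)] = -1.0  # never re-select labeled nodes
    return np.argsort(-score)[:budget]
```

A call like `acquire(A, model_probs, embeddings, labeled_idx, budget=20)` would return the next batch of node indices to send to the annotator; note that `ppr_importance` depends only on the graph, which is what allows it to be precomputed once.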
Related papers
- Mitigating Label Noise on Graph via Topological Sample Selection [72.86862597508077]
We propose a Topological Sample Selection (TSS) method that boosts the informative sample selection process in a graph by utilising topological information.
We theoretically prove that our procedure minimizes an upper bound of the expected risk under the target clean distribution, and experimentally show the superiority of our method compared with state-of-the-art baselines.
arXiv Detail & Related papers (2024-03-04T11:24:51Z)
- A Simple and Scalable Graph Neural Network for Large Directed Graphs [11.792826520370774]
We investigate various combinations of node representations and edge direction awareness within an input graph.
In response, we propose a simple yet holistic classification method A2DUG.
We demonstrate that A2DUG performs stably well on various datasets and improves accuracy by up to 11.29 compared with state-of-the-art methods.
arXiv Detail & Related papers (2023-06-14T06:24:58Z)
- All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z)
- SMARTQUERY: An Active Learning Framework for Graph Neural Networks through Hybrid Uncertainty Reduction [25.77052028238513]
We propose a framework to learn a graph neural network with very few labeled nodes using a hybrid uncertainty reduction function.
We demonstrate the competitive performance of our method against state-of-the-art baselines with very few labeled nodes.
arXiv Detail & Related papers (2022-12-02T20:49:38Z)
- Enhancing Graph Contrastive Learning with Node Similarity [4.60032347615771]
Graph contrastive learning (GCL) is a representative framework for self-supervised learning.
GCL learns node representations by contrasting semantically similar nodes (positive samples) and dissimilar nodes (negative samples) with anchor nodes.
We propose an enhanced objective that contains all positive samples and no false-negative samples.
arXiv Detail & Related papers (2022-08-13T22:49:20Z)
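For the graph contrastive entry above, the sketch below shows one generic way false negatives can be suppressed: pairs whose raw-feature cosine similarity exceeds a threshold are treated as additional positives rather than negatives. The multi-positive InfoNCE form, the `0.8` threshold, and all names are illustrative assumptions, not the paper's enhanced objective.

```python
import numpy as np

def cosine(A, B):
    """Pairwise cosine similarities between rows of A and rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def similarity_aware_nce(z1, z2, x, tau=0.5, thresh=0.8):
    """z1, z2: (n, d) embeddings of two views; x: (n, f) raw node features."""
    sim = np.exp(cosine(z1, z2) / tau)       # cross-view similarity logits
    pos_mask = cosine(x, x) > thresh         # feature-similar pairs count as positives
    np.fill_diagonal(pos_mask, True)         # a node's other view is always a positive
    pos = (sim * pos_mask).sum(axis=1)
    loss = -np.log(pos / sim.sum(axis=1))    # multi-positive InfoNCE per anchor
    return loss.mean()
```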
- Information Gain Propagation: a new way to Graph Active Learning with Soft Labels [26.20597165750861]
Graph Neural Networks (GNNs) have achieved great success in various tasks, but their performance highly relies on a large number of labeled nodes.
We propose a GNN-based Active Learning (AL) method that improves labeling efficiency by selecting the most valuable nodes to label.
Our method significantly outperforms the state-of-the-art GNN-based AL methods in terms of both accuracy and labeling cost.
arXiv Detail & Related papers (2022-03-02T13:28:25Z)
- Improving Contrastive Learning on Imbalanced Seed Data via Open-World Sampling [96.8742582581744]
We present an open-world unlabeled data sampling framework called Model-Aware K-center (MAK).
MAK follows three simple principles: tailness, proximity, and diversity.
We demonstrate that MAK can consistently improve both the overall representation quality and the class balancedness of the learned features.
arXiv Detail & Related papers (2021-11-01T15:09:41Z)
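Since MAK is built around K-center selection, a generic greedy K-center routine is sketched below for orientation. MAK's actual scoring also folds in its tailness and proximity principles, which are omitted here, and the names (`k_center_greedy`, `seed_idx`) are illustrative.

```python
import numpy as np

def k_center_greedy(features, k, seed_idx=0):
    """Pick k points that greedily minimize the coverage radius of the set."""
    selected = [seed_idx]
    # Distance from every point to its nearest selected center so far.
    d = np.linalg.norm(features - features[seed_idx], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(d))              # farthest point = least covered
        selected.append(nxt)
        d = np.minimum(d, np.linalg.norm(features - features[nxt], axis=1))
    return selected
```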
- Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z)
- How to distribute data across tasks for meta-learning? [59.608652082495624]
We show that the optimal number of data points per task depends on the budget, but it converges to a unique constant value for large budgets.
Our results suggest a simple and efficient procedure for data collection.
arXiv Detail & Related papers (2021-03-15T15:38:47Z)
- Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
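The uncertainty half of such a strategy reduces, in its simplest form, to ranking samples by predictive entropy. The snippet below is that generic baseline only; the adversarial minimax training and the diversity term of the paper are not reproduced, and the names are illustrative.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    """Entropy of each row of an (n, C) softmax output matrix."""
    return -(probs * np.log(probs + eps)).sum(axis=1)

def most_uncertain(probs, budget):
    """Indices of the `budget` samples with the highest predictive entropy."""
    return np.argsort(-predictive_entropy(probs))[:budget]
```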
- Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation [37.742218733235084]
We propose a novel graph-based active learning algorithm for the task of node classification in attributed graphs.
Our algorithm uses graph-cognizant logistic regression, equivalent to a linearized graph convolutional network (GCN), for the prediction phase, and maximizes the expected error reduction in the query phase.
We conduct experiments on five public benchmark datasets, demonstrating a significant improvement over state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-09T18:00:53Z)
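Expected error reduction, the query criterion named above, can be sketched generically: for each candidate, retrain under every counterfactual label and keep the candidate whose expected post-retraining risk is lowest. The sketch below uses a plain scikit-learn LogisticRegression rather than the paper's graph-cognizant variant, assumes every class already appears in the labeled set, and uses illustrative names throughout.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def eer_query(X_lab, y_lab, X_pool):
    """Return the pool index whose labeling minimizes expected risk."""
    base = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    proba = base.predict_proba(X_pool)            # p(y | x) for each candidate
    best_idx, best_risk = -1, np.inf
    for i in range(len(X_pool)):
        rest = np.delete(X_pool, i, axis=0)
        risk = 0.0
        for j, y in enumerate(base.classes_):
            # Retrain as if candidate i carried label y ...
            clf = LogisticRegression(max_iter=1000).fit(
                np.vstack([X_lab, X_pool[i:i + 1]]), np.append(y_lab, y))
            # ... and accumulate the expected 0/1 risk on the remaining pool.
            risk += proba[i, j] * (1.0 - clf.predict_proba(rest).max(axis=1)).sum()
        if risk < best_risk:
            best_idx, best_risk = i, risk
    return best_idx
```

This brute-force version retrains one model per candidate and per class, which is why practical EER implementations rely on cheap, closed-form model updates rather than full refits.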