Extending Graph Condensation to Multi-Label Datasets: A Benchmark Study
- URL: http://arxiv.org/abs/2412.17961v1
- Date: Mon, 23 Dec 2024 20:18:53 GMT
- Title: Extending Graph Condensation to Multi-Label Datasets: A Benchmark Study
- Authors: Liangliang Zhang, Haoran Bao, Yao Ma
- Abstract summary: We extend graph condensation approaches to accommodate multi-label datasets.
We prove the effectiveness of our method through experiments on eight real-world multi-label graph datasets.
This benchmark for multi-label graph condensation offers substantial benefits for diverse real-world applications.
- Score: 6.714933246552199
- License:
- Abstract: As graph data grows increasingly complex, training graph neural networks (GNNs) on large-scale datasets presents significant challenges, including computational resource constraints, data redundancy, and transmission inefficiencies. While existing graph condensation techniques have shown promise in addressing these issues, they are predominantly designed for single-label datasets, where each node is associated with a single class label. However, many real-world applications, such as social network analysis and bioinformatics, involve multi-label graph datasets, where one node can have multiple related labels. To address this problem, we extend traditional graph condensation approaches to accommodate multi-label datasets by modifying synthetic dataset initialization and the condensation optimization. Through experiments on eight real-world multi-label graph datasets, we demonstrate the effectiveness of our method. In our experiments, the GCond framework, combined with K-Center initialization and binary cross-entropy loss (BCELoss), achieves the best performance overall. This benchmark for multi-label graph condensation not only enhances the scalability and efficiency of GNNs for multi-label graph data, but also offers substantial benefits for diverse real-world applications.
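The two ingredients the abstract singles out, K-Center initialization of the synthetic node set and a binary cross-entropy objective over multi-hot labels, can be sketched roughly as below. This is an illustrative sketch only: the function name `k_center_init` and the toy tensor shapes are assumptions, not the authors' code; the key difference from single-label GCond is that `BCEWithLogitsLoss` treats each class as an independent binary decision, so a node may carry several labels at once.

```python
import torch

def k_center_init(features, k):
    # Greedy K-Center: repeatedly pick the node farthest from all
    # chosen centers, so the k selected nodes cover the feature space.
    # These selected nodes seed the synthetic condensed dataset.
    centers = [0]  # arbitrary seed node
    dists = torch.cdist(features, features[centers]).min(dim=1).values
    for _ in range(k - 1):
        nxt = int(torch.argmax(dists))          # farthest remaining node
        centers.append(nxt)
        new_d = torch.cdist(features, features[[nxt]]).squeeze(1)
        dists = torch.minimum(dists, new_d)     # distance to nearest center
    return torch.tensor(centers)

# Multi-label objective: per-class sigmoid + BCE replaces the softmax
# cross-entropy of the single-label setting. Targets are multi-hot.
bce = torch.nn.BCEWithLogitsLoss()
logits = torch.randn(4, 5)                      # 4 synthetic nodes, 5 labels
targets = torch.randint(0, 2, (4, 5)).float()   # multi-hot label vectors
loss = bce(logits, targets)
```

In this setup the K-Center indices would select real nodes (features and multi-hot labels) to initialize the synthetic graph, after which the condensation optimization refines them against the BCE objective.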
Related papers
- Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs)
This framework provides a standardized setting to evaluate GNNs across diverse datasets.
We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z) - Predictive Query-based Pipeline for Graph Data [0.0]
Graph embedding techniques simplify the analysis and processing of large-scale graphs.
Several approaches, such as GraphSAGE, Node2Vec, and FastRP, offer efficient methods for generating graph embeddings.
By storing embeddings as node properties, it is possible to compare different embedding techniques and evaluate their effectiveness.
arXiv Detail & Related papers (2024-12-13T08:03:57Z) - Spectral Greedy Coresets for Graph Neural Networks [61.24300262316091]
The ubiquity of large-scale graphs in node-classification tasks hinders the real-world applications of Graph Neural Networks (GNNs)
This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs based on their spectral embeddings.
Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs.
arXiv Detail & Related papers (2024-05-27T17:52:12Z) - Multi-label Node Classification On Graph-Structured Data [7.892731722253387]
Graph Neural Networks (GNNs) have shown state-of-the-art improvements in node classification tasks on graphs.
A more general and realistic scenario in which each node could have multiple labels has so far received little attention.
We collect and release three real-world biological datasets and develop a multi-label graph generator.
arXiv Detail & Related papers (2023-04-20T15:34:20Z) - EGG-GAE: scalable graph neural networks for tabular data imputation [8.775728170359024]
We propose a novel EdGe Generation Graph AutoEncoder (EGG-GAE) for missing data imputation.
EGG-GAE works on randomly sampled mini-batches of the input data, and it automatically infers the best connectivity across the mini-batch for each architecture layer.
arXiv Detail & Related papers (2022-10-19T10:26:17Z) - A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
The best models for such data types in most standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z) - Condensing Graphs via One-Step Gradient Matching [50.07587238142548]
We propose a one-step gradient matching scheme, which performs gradient matching for only one single step without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
arXiv Detail & Related papers (2022-06-15T18:20:01Z) - Learning Semantic Segmentation from Multiple Datasets with Label Shifts [101.24334184653355]
This paper proposes UniSeg, an effective approach to automatically train models across multiple datasets with differing label spaces.
Specifically, we propose two losses that account for conflicting and co-occurring labels to achieve better generalization performance in unseen domains.
arXiv Detail & Related papers (2022-02-28T18:55:19Z) - Meta Propagation Networks for Graph Few-shot Semi-supervised Learning [39.96930762034581]
We propose a novel network architecture equipped with a novel meta-learning algorithm to solve this problem.
In essence, our framework Meta-PN infers high-quality pseudo labels on unlabeled nodes via a meta-learned label propagation strategy.
Our approach offers easy and substantial performance gains compared to existing techniques on various benchmark datasets.
arXiv Detail & Related papers (2021-12-18T00:11:56Z) - Graph Contrastive Learning Automated [94.41860307845812]
Graph contrastive learning (GraphCL) has emerged with promising representation learning performance.
The effectiveness of GraphCL hinges on ad-hoc data augmentations, which have to be manually picked per dataset.
This paper proposes a unified bi-level optimization framework to automatically, adaptively and dynamically select data augmentations when performing GraphCL on specific graph data.
arXiv Detail & Related papers (2021-06-10T16:35:27Z) - Semi-supervised Superpixel-based Multi-Feature Graph Learning for Hyperspectral Image Data [0.0]
We present a novel framework for the classification of Hyperspectral Image (HSI) data in light of a very limited amount of labelled data.
We propose a multi-stage edge-efficient semi-supervised graph learning framework for HSI data.
arXiv Detail & Related papers (2021-04-27T15:36:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.