Data-centric Graph Learning: A Survey
- URL: http://arxiv.org/abs/2310.04987v3
- Date: Thu, 21 Nov 2024 15:44:38 GMT
- Title: Data-centric Graph Learning: A Survey
- Authors: Yuxin Guo, Deyu Bo, Cheng Yang, Zhiyuan Lu, Zhongjian Zhang, Jixi Liu, Yufei Peng, Chuan Shi,
- Abstract summary: We propose a novel taxonomy based on the stages in the graph learning pipeline.
We analyze some potential problems embedded in graph data and discuss how to solve them in a data-centric manner.
- Score: 37.849198493911736
- License:
- Abstract: The history of artificial intelligence (AI) has witnessed the significant impact of high-quality data on various deep learning models, such as ImageNet for AlexNet and ResNet. Recently, instead of designing more complex neural architectures as model-centric approaches, the attention of AI community has shifted to data-centric ones, which focuses on better processing data to strengthen the ability of neural models. Graph learning, which operates on ubiquitous topological data, also plays an important role in the era of deep learning. In this survey, we comprehensively review graph learning approaches from the data-centric perspective, and aim to answer three crucial questions: (1) when to modify graph data, (2) what part of the graph data needs modification to unlock the potential of various graph models, and (3) how to safeguard graph models from problematic data influence. Accordingly, we propose a novel taxonomy based on the stages in the graph learning pipeline, and highlight the processing methods for different data structures in the graph data, i.e., topology, feature and label. Furthermore, we analyze some potential problems embedded in graph data and discuss how to solve them in a data-centric manner. Finally, we provide some promising future directions for data-centric graph learning.
Related papers
- Self-Supervised Graph Neural Networks for Enhanced Feature Extraction in Heterogeneous Information Networks [16.12856816023414]
This paper explores the applications and challenges of graph neural networks (GNNs) in processing complex graph data brought about by the rapid development of the Internet.
By introducing a self-supervisory mechanism, it is expected to improve the adaptability of existing models to the diversity and complexity of graph data.
arXiv Detail & Related papers (2024-10-23T07:14:37Z) - When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding
and Reasoning [54.84870836443311]
The paper presents a new paradigm for understanding and reasoning about graph data by integrating image encoding and multimodal technologies.
This approach enables the comprehension of graph data through an instruction-response format, utilizing GPT-4V's advanced capabilities.
The study evaluates this paradigm on various graph types, highlighting the model's strengths and weaknesses, particularly in Chinese OCR performance and complex reasoning tasks.
arXiv Detail & Related papers (2023-12-16T08:14:11Z) - Towards Data-centric Graph Machine Learning: Review and Outlook [120.64417630324378]
We introduce a systematic framework, Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of the graph data lifecycle.
A thorough taxonomy of each stage is presented to answer three critical graph-centric questions.
We pinpoint the future prospects of the DC-GML domain, providing insights to navigate its advancements and applications.
arXiv Detail & Related papers (2023-09-20T00:40:13Z) - A Survey of Imbalanced Learning on Graphs: Problems, Techniques, and
Future Directions [64.84521350148513]
Graphs represent interconnected structures prevalent in a myriad of real-world scenarios.
Effective graph analytics, such as graph learning methods, enables users to gain profound insights from graph data.
However, these methods often suffer from data imbalance, a common issue in graph data where certain segments possess abundant data while others are scarce.
This necessitates the emerging field of imbalanced learning on graphs, which aims to correct these data distribution skews for more accurate and representative learning outcomes.
arXiv Detail & Related papers (2023-08-26T09:11:44Z) - Curriculum Graph Machine Learning: A Survey [51.89783017927647]
curriculum graph machine learning (Graph CL) integrates the strength of graph machine learning and curriculum learning.
This paper comprehensively overview approaches on Graph CL and present a detailed survey of recent advances in this direction.
arXiv Detail & Related papers (2023-02-06T16:59:25Z) - State of the Art and Potentialities of Graph-level Learning [54.68482109186052]
Graph-level learning has been applied to many tasks including comparison, regression, classification, and more.
Traditional approaches to learning a set of graphs rely on hand-crafted features, such as substructures.
Deep learning has helped graph-level learning adapt to the growing scale of graphs by extracting features automatically and encoding graphs into low-dimensional representations.
arXiv Detail & Related papers (2023-01-14T09:15:49Z) - Data Augmentation for Deep Graph Learning: A Survey [66.04015540536027]
We first propose a taxonomy for graph data augmentation and then provide a structured review by categorizing the related work based on the augmented information modalities.
Focusing on the two challenging problems in DGL (i.e., optimal graph learning and low-resource graph learning), we also discuss and review the existing learning paradigms which are based on graph data augmentation.
arXiv Detail & Related papers (2022-02-16T18:30:33Z) - Machine Learning on Graphs: A Model and Comprehensive Taxonomy [22.73365477040205]
We bridge the gap between graph neural networks, network embedding and graph regularization models.
Specifically, we propose a Graph Decoder Model (GRAPHEDM), which generalizes popular algorithms for semi-supervised learning on graphs.
arXiv Detail & Related papers (2020-05-07T18:00:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.