Towards Data-centric Graph Machine Learning: Review and Outlook
- URL: http://arxiv.org/abs/2309.10979v1
- Date: Wed, 20 Sep 2023 00:40:13 GMT
- Title: Towards Data-centric Graph Machine Learning: Review and Outlook
- Authors: Xin Zheng, Yixin Liu, Zhifeng Bao, Meng Fang, Xia Hu, Alan Wee-Chung Liew, Shirui Pan
- Abstract summary: We introduce a systematic framework, Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of the graph data lifecycle.
A thorough taxonomy of each stage is presented to answer three critical graph-centric questions.
We pinpoint the future prospects of the DC-GML domain, providing insights to navigate its advancements and applications.
- Score: 120.64417630324378
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-centric AI, with its primary focus on the collection, management, and
utilization of data to drive AI models and applications, has attracted
increasing attention in recent years. In this article, we conduct an in-depth
and comprehensive review, offering a forward-looking outlook on the current
efforts in data-centric AI pertaining to graph data, the fundamental data
structure for representing and capturing intricate dependencies among massive
and diverse real-life entities. We introduce a systematic framework,
Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of
the graph data lifecycle, including graph data collection, exploration,
improvement, exploitation, and maintenance. A thorough taxonomy of each stage
is presented to answer three critical graph-centric questions: (1) how to
enhance graph data availability and quality; (2) how to learn from graph data
with limited availability and low quality; (3) how to build graph MLOps systems
from the graph data-centric view. Lastly, we pinpoint the future prospects of
the DC-GML domain, providing insights to navigate its advancements and
applications.
Related papers
- AutoG: Towards automatic graph construction from tabular data [60.877867570524884]
We introduce a set of datasets to formalize and evaluate graph construction methods.
We propose an LLM-based solution, AutoG, which automatically generates high-quality graph schemas without human intervention.
arXiv Detail & Related papers (2025-01-25T17:31:56Z)
- Towards Data-centric Machine Learning on Directed Graphs: a Survey [23.498557237805414]
We introduce a novel taxonomy for existing studies of directed graph learning.
We re-examine these methods from the data-centric perspective, with an emphasis on understanding and improving data representation.
We identify key opportunities and challenges within the field, offering insights that can guide future research and development in directed graph learning.
arXiv Detail & Related papers (2024-11-28T06:09:12Z)
- A Survey of Data-Efficient Graph Learning [16.053913182723143]
We introduce a novel concept of Data-Efficient Graph Learning (DEGL) as a research frontier.
We systematically review recent advances on several key aspects, including self-supervised graph learning, semi-supervised graph learning, and few-shot graph learning.
arXiv Detail & Related papers (2024-02-01T09:28:48Z)
- When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning [54.84870836443311]
The paper presents a new paradigm for understanding and reasoning about graph data by integrating image encoding and multimodal technologies.
This approach enables the comprehension of graph data through an instruction-response format, utilizing GPT-4V's advanced capabilities.
The study evaluates this paradigm on various graph types, highlighting the model's strengths and weaknesses, particularly in Chinese OCR performance and complex reasoning tasks.
arXiv Detail & Related papers (2023-12-16T08:14:11Z)
- Data-centric Graph Learning: A Survey [37.849198493911736]
We propose a novel taxonomy based on the stages in the graph learning pipeline.
We analyze some potential problems embedded in graph data and discuss how to solve them in a data-centric manner.
arXiv Detail & Related papers (2023-10-08T03:17:22Z)
- Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning.
We first review methods for generating privacy-preserving graph data.
Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z)
- Curriculum Graph Machine Learning: A Survey [51.89783017927647]
Curriculum graph machine learning (Graph CL) integrates the strengths of graph machine learning and curriculum learning.
This paper comprehensively overviews approaches to Graph CL and presents a detailed survey of recent advances in this direction.
arXiv Detail & Related papers (2023-02-06T16:59:25Z)
- Data Augmentation for Deep Graph Learning: A Survey [66.04015540536027]
We first propose a taxonomy for graph data augmentation and then provide a structured review by categorizing the related work based on the augmented information modalities.
Focusing on the two challenging problems in DGL (i.e., optimal graph learning and low-resource graph learning), we also discuss and review the existing learning paradigms which are based on graph data augmentation.
arXiv Detail & Related papers (2022-02-16T18:30:33Z)