Revisiting Link Prediction: A Data Perspective
- URL: http://arxiv.org/abs/2310.00793v2
- Date: Tue, 6 Feb 2024 06:00:53 GMT
- Title: Revisiting Link Prediction: A Data Perspective
- Authors: Haitao Mao, Juanhui Li, Harry Shomer, Bingheng Li, Wenqi Fan, Yao Ma,
Tong Zhao, Neil Shah, Jiliang Tang
- Abstract summary: Link prediction, a fundamental task on graphs, has proven indispensable in various applications, e.g., friend recommendation, protein analysis, and drug interaction prediction.
Evidence in existing literature underscores the absence of a universally best algorithm suitable for all datasets.
We recognize three fundamental factors critical to link prediction: local structural proximity, global structural proximity, and feature proximity.
- Score: 61.52668130971441
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Link prediction, a fundamental task on graphs, has proven indispensable in
various applications, e.g., friend recommendation, protein analysis, and drug
interaction prediction. However, since datasets span a multitude of domains,
they could have distinct underlying mechanisms of link formation. Evidence in
existing literature underscores the absence of a universally best algorithm
suitable for all datasets. In this paper, we endeavor to explore principles of
link prediction across diverse datasets from a data-centric perspective. We
recognize three fundamental factors critical to link prediction: local
structural proximity, global structural proximity, and feature proximity. We
then unearth relationships among those factors, where (i) global structural
proximity only shows effectiveness when local structural proximity is
deficient, and (ii) an incompatibility can be found between feature and structural
proximity. Such incompatibility leads to GNNs for Link Prediction (GNN4LP)
consistently underperforming on edges where the feature proximity factor
dominates. Inspired by these new insights from a data perspective, we offer
practical instruction for GNN4LP model design and guidelines for selecting
appropriate benchmark datasets for more comprehensive evaluations.
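To make the three factors named in the abstract concrete, the following is a minimal sketch that scores a candidate node pair with one simple heuristic per factor: common-neighbor count for local structural proximity, inverse shortest-path distance for global structural proximity, and cosine similarity of node features for feature proximity. These heuristics are illustrative stand-ins chosen here, not necessarily the measures used in the paper; the toy graph, the feature vectors, and all function names are hypothetical.

```python
import math
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Local structural proximity (assumed heuristic): number of shared neighbors."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def shortest_path_length(adj, u, v):
    """Breadth-first search distance between u and v; math.inf if disconnected."""
    if u == v:
        return 0
    seen = {u}
    queue = deque([(u, 0)])
    while queue:
        node, dist = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr == v:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return math.inf


def global_proximity(adj, u, v):
    """Global structural proximity (assumed heuristic): inverse shortest-path distance."""
    dist = shortest_path_length(adj, u, v)
    if dist == 0 or math.isinf(dist):
        return 0.0
    return 1.0 / dist


def feature_cosine(features, u, v):
    """Feature proximity (assumed heuristic): cosine similarity of node feature vectors."""
    a, b = features[u], features[v]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical toy graph: a triangle (0, 1, 2) with a path 2-3-4-5 attached.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
adj = build_adjacency(edges)

# Hypothetical 2-dimensional node features.
features = {
    0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0],
    3: [0.0, 1.0], 4: [1.0, 1.0], 5: [2.0, 0.0],
}

for pair in [(0, 3), (0, 5)]:
    u, v = pair
    print(pair,
          "local:", common_neighbors(adj, u, v),
          "global:", global_proximity(adj, u, v),
          "feature:", feature_cosine(features, u, v))
```

In this toy setup the pair (0, 5) shares no neighbors and is four hops apart, yet its features point in the same direction, so only the feature heuristic ranks it highly. This is the kind of edge where, according to the abstract, the feature proximity factor dominates.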
Related papers
- Towards Better Graph-based Cross-document Relation Extraction via Non-bridge Entity Enhancement and Prediction Debiasing [30.204313638661255]
Cross-document Relation Extraction aims to predict the relation between target entities located in different documents.
We propose a novel graph-based cross-document RE model with non-bridge entity enhancement and prediction debiasing.
arXiv Detail & Related papers (2024-06-24T11:08:28Z)
- Trust your Good Friends: Source-free Domain Adaptation by Reciprocal Neighborhood Clustering [50.46892302138662]
We address the source-free domain adaptation problem, where the source pretrained model is adapted to the target domain in the absence of source data.
Our method is based on the observation that target data, which might not align with the source domain classifier, still forms clear clusters.
We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood.
arXiv Detail & Related papers (2023-09-01T15:31:18Z)
- Variational Disentangled Graph Auto-Encoders for Link Prediction [10.390861526194662]
This paper proposes a novel framework with two variants, the disentangled graph auto-encoder (DGAE) and the variational disentangled graph auto-encoder (VDGAE).
The proposed framework infers the latent factors that cause edges in the graph and disentangles the representation into multiple channels corresponding to unique latent factors.
arXiv Detail & Related papers (2023-06-20T06:25:05Z)
- BSAL: A Framework of Bi-component Structure and Attribute Learning for Link Prediction [33.488229191263564]
We propose a bicomponent structural and attribute learning framework (BSAL) that is designed to adaptively leverage information from topology and feature spaces.
BSAL constructs a semantic topology via the node attributes and then gets the embeddings regarding the semantic view.
It provides a flexible and easy-to-implement solution to adaptively incorporate the information carried by the node attributes.
arXiv Detail & Related papers (2022-04-18T03:12:13Z)
- Handling Distribution Shifts on Graphs: An Invariance Perspective [77.14319095965058]
We formulate the OOD problem for node-level prediction on graphs.
We develop a new domain-invariant learning approach, named Explore-to-Extrapolate Risk Minimization.
We prove the validity of our method by theoretically showing its guarantee of a valid OOD solution.
arXiv Detail & Related papers (2022-02-05T02:31:01Z)
- Inter-domain Multi-relational Link Prediction [19.094154079752123]
When related graphs coexist, it is of great benefit to build a larger graph via integrating the smaller ones.
The integration requires predicting hidden relational connections between entities belonging to different graphs.
We propose a new approach to tackle the inter-domain link prediction problem by softly aligning the entity distributions between different domains.
arXiv Detail & Related papers (2021-06-11T05:10:31Z)
- Link Prediction on N-ary Relational Facts: A Graph-based Approach [18.01071110085996]
Link prediction on knowledge graphs (KGs) is a key research topic.
This paper considers link prediction upon n-ary relational facts and proposes a graph-based approach to this task.
arXiv Detail & Related papers (2021-05-18T12:40:35Z)
- Link Prediction on N-ary Relational Data Based on Relatedness Evaluation [61.61555159755858]
We propose a method called NaLP to conduct link prediction on n-ary relational data.
We represent each n-ary relational fact as a set of its role and role-value pairs.
Experimental results validate the effectiveness and merits of the proposed methods.
arXiv Detail & Related papers (2021-04-21T09:06:54Z)
- Should Graph Convolution Trust Neighbors? A Simple Causal Inference Method [114.48708191371524]
Graph Convolutional Network (GCN) is an emerging technique for information retrieval (IR) applications.
This work focuses on the local structure discrepancy of testing nodes, which has received little scrutiny.
We analyze the working mechanism of GCN with causal graph, estimating the causal effect of a node's local structure for the prediction.
arXiv Detail & Related papers (2020-10-22T15:21:47Z)
- Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning [85.6386289476598]
We develop a novel adversarial graph representation adaptation (AGRA) framework for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T15:00:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.