AEGIS: Authentic Edge Growth In Sparsity for Link Prediction in Edge-Sparse Bipartite Knowledge Graphs
- URL: http://arxiv.org/abs/2509.22017v1
- Date: Fri, 26 Sep 2025 07:51:40 GMT
- Title: AEGIS: Authentic Edge Growth In Sparsity for Link Prediction in Edge-Sparse Bipartite Knowledge Graphs
- Authors: Hugh Xuechen Liu, Kıvanç Tatar,
- Abstract summary: AEGIS is an edge-only augmentation framework that resamples existing training edges.<n>On text-rich GDP graph, semantic KNN achieves the largest AUC improvement and Brier score reduction.
- Score: 0.8594140167290097
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bipartite knowledge graphs in niche domains are typically data-poor and edge-sparse, which hinders link prediction. We introduce AEGIS (Authentic Edge Growth In Sparsity), an edge-only augmentation framework that resamples existing training edges -either uniformly simple or with inverse-degree bias degree-aware -thereby preserving the original node set and sidestepping fabricated endpoints. To probe authenticity across regimes, we consider naturally sparse graphs (game design pattern's game-pattern network) and induce sparsity in denser benchmarks (Amazon, MovieLens) via high-rate bond percolation. We evaluate augmentations on two complementary metrics: AUC-ROC (higher is better) and the Brier score (lower is better), using two-tailed paired t-tests against sparse baselines. On Amazon and MovieLens, copy-based AEGIS variants match the baseline while the semantic KNN augmentation is the only method that restores AUC and calibration; random and synthetic edges remain detrimental. On the text-rich GDP graph, semantic KNN achieves the largest AUC improvement and Brier score reduction, and simple also lowers the Brier score relative to the sparse control. These findings position authenticity-constrained resampling as a data-efficient strategy for sparse bipartite link prediction, with semantic augmentation providing an additional boost when informative node descriptions are available.
Related papers
- How does Graph Structure Modulate Membership-Inference Risk for Graph Neural Networks? [0.34546020643989767]
Graph neural networks (GNNs) have become the standard tool for encoding data and their complex relationships into continuous representations.<n>Their use in sensitive applications has raised concerns about the potential leakage of training data.<n>Research on privacy leakage in GNNs has largely been shaped by findings from non-graph domains.
arXiv Detail & Related papers (2026-01-23T19:08:36Z) - Semi-supervised Instruction Tuning for Large Language Models on Text-Attributed Graphs [62.544129365882014]
We propose a novel Semi-supervised Instruction Tuning pipeline for Graph Learning, named SIT-Graph.<n> SIT-Graph is model-agnostic and can be seamlessly integrated into any graph instruction tuning method that utilizes LLMs as the predictor.<n>Extensive experiments demonstrate that when incorporated into state-of-the-art graph instruction tuning methods, SIT-Graph significantly enhances their performance on text-attributed graph benchmarks.
arXiv Detail & Related papers (2026-01-19T08:10:53Z) - Pruning Spurious Subgraphs for Graph Out-of-Distribution Generalization [90.74916553208153]
We propose PrunE, the first pruning-based graph OOD method that eliminates spurious edges to improve OOD generalizability.<n>PrunE employs two regularization terms to prune spurious edges: 1) graph size constraint to exclude uninformative spurious edges, and 2) $epsilon$-probability alignment to further suppress the occurrence of spurious edges.
arXiv Detail & Related papers (2025-06-06T10:34:48Z) - Edge Contrastive Learning: An Augmentation-Free Graph Contrastive Learning Model [18.02317423788033]
Graph contrastive learning (GCL) aims to learn representations from unlabeled graph data in a self-supervised manner.<n>One of the primary obstacles of edge-based GCL is the heavy burden.<n>We propose AugmentationFree Edge Contrastive Learning (AFECL) to achieve edgeedge contrast.
arXiv Detail & Related papers (2024-12-15T06:16:01Z) - Two Heads Are Better Than One: Boosting Graph Sparse Training via
Semantic and Topological Awareness [80.87683145376305]
Graph Neural Networks (GNNs) excel in various graph learning tasks but face computational challenges when applied to large-scale graphs.
We propose Graph Sparse Training ( GST), which dynamically manipulates sparsity at the data level.
GST produces a sparse graph with maximum topological integrity and no performance degradation.
arXiv Detail & Related papers (2024-02-02T09:10:35Z) - Efficient Link Prediction via GNN Layers Induced by Negative Sampling [86.87385758192566]
Graph neural networks (GNNs) for link prediction can loosely be divided into two broad categories.<n>We propose a novel GNN architecture whereby the emphforward pass explicitly depends on emphboth positive (as is typical) and negative (unique to our approach) edges.<n>This is achieved by recasting the embeddings themselves as minimizers of a forward-pass-specific energy function that favors separation of positive and negative samples.
arXiv Detail & Related papers (2023-10-14T07:02:54Z) - Article Classification with Graph Neural Networks and Multigraphs [0.12499537119440243]
We propose a method to enhance the performance of article classification by enriching simple Graph Neural Network (GNN) pipelines with multi-graph representations.
fully supervised transductive node classification experiments are conducted on the Open Graph Benchmark OGBN-arXiv dataset and the PubMed diabetes dataset.
Results demonstrate that multi-graphs consistently improve the performance of a variety of GNN models compared to the default graphs.
arXiv Detail & Related papers (2023-09-20T14:18:04Z) - Training Robust Graph Neural Networks with Topology Adaptive Edge
Dropping [116.26579152942162]
Graph neural networks (GNNs) are processing architectures that exploit graph structural information to model representations from network data.
Despite their success, GNNs suffer from sub-optimal generalization performance given limited training data.
This paper proposes Topology Adaptive Edge Dropping to improve generalization performance and learn robust GNN models.
arXiv Detail & Related papers (2021-06-05T13:20:36Z) - Heuristic Semi-Supervised Learning for Graph Generation Inspired by
Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points, as well as consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z) - Self-Constructing Graph Convolutional Networks for Semantic Labeling [23.623276007011373]
We propose a novel architecture called the Self-Constructing Graph (SCG), which makes use of learnable latent variables to generate embeddings.
SCG can automatically obtain optimized non-local context graphs from complex-shaped objects in aerial imagery.
We demonstrate the effectiveness and flexibility of the proposed SCG on the publicly available ISPRS Vaihingen dataset.
arXiv Detail & Related papers (2020-03-15T21:55:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.