Related papers: AEGIS: Authentic Edge Growth In Sparsity for Link Prediction in Edge-Sparse Bipartite Knowledge Graphs

URL: http://arxiv.org/abs/2509.22017v1
Date: Fri, 26 Sep 2025 07:51:40 GMT
Title: AEGIS: Authentic Edge Growth In Sparsity for Link Prediction in Edge-Sparse Bipartite Knowledge Graphs
Authors: Hugh Xuechen Liu, Kıvanç Tatar
Abstract summary: AEGIS is an edge-only augmentation framework that resamples existing training edges. On the text-rich GDP graph, semantic KNN achieves the largest AUC improvement and Brier score reduction.
Score: 0.8594140167290097
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Bipartite knowledge graphs in niche domains are typically data-poor and edge-sparse, which hinders link prediction. We introduce AEGIS (Authentic Edge Growth In Sparsity), an edge-only augmentation framework that resamples existing training edges, either uniformly (simple) or with inverse-degree bias (degree-aware), thereby preserving the original node set and sidestepping fabricated endpoints. To probe authenticity across regimes, we consider naturally sparse graphs (the game-pattern network of game design patterns, GDP) and induce sparsity in denser benchmarks (Amazon, MovieLens) via high-rate bond percolation. We evaluate augmentations on two complementary metrics: AUC-ROC (higher is better) and the Brier score (lower is better), using two-tailed paired t-tests against sparse baselines. On Amazon and MovieLens, copy-based AEGIS variants match the baseline, while the semantic KNN augmentation is the only method that restores AUC and calibration; random and synthetic edges remain detrimental. On the text-rich GDP graph, semantic KNN achieves the largest AUC improvement and Brier score reduction, and the simple variant also lowers the Brier score relative to the sparse control. These findings position authenticity-constrained resampling as a data-efficient strategy for sparse bipartite link prediction, with semantic augmentation providing an additional boost when informative node descriptions are available.
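
Illustrative sketch (not the authors' code): the abstract describes resampling existing training edges either uniformly or with inverse-degree bias, and scoring calibration with the Brier score. The Python below shows one way this could look. The function names, the `mode` values, and in particular the weighting 1 / (deg(u) * deg(v)) are assumptions made for illustration; the abstract does not specify how the inverse-degree bias is computed.

```python
import random
from collections import Counter


def resample_edges(edges, n_extra, mode="simple", seed=0):
    """Return the original edges plus n_extra copies drawn with replacement.

    Only existing edges are copied, so no new nodes or endpoints appear.
    mode="simple" draws uniformly; mode="degree_aware" favors edges whose
    endpoints have low degree (assumed weighting, see lead-in).
    """
    rng = random.Random(seed)
    if mode == "simple":
        weights = None  # uniform sampling
    elif mode == "degree_aware":
        # Count degrees separately for the two sides of the bipartite graph.
        left_deg = Counter(u for u, _ in edges)
        right_deg = Counter(v for _, v in edges)
        # Assumed bias: weight inversely proportional to the degree product.
        weights = [1.0 / (left_deg[u] * right_deg[v]) for u, v in edges]
    else:
        raise ValueError(f"unknown mode: {mode}")
    extra = rng.choices(edges, weights=weights, k=n_extra)
    return list(edges) + extra


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy bipartite graph: games on the left, patterns on the right.
toy_edges = [("g1", "p1"), ("g1", "p2"), ("g2", "p1"), ("g3", "p3")]
augmented = resample_edges(toy_edges, n_extra=10, mode="degree_aware")
```

Because every added edge is a copy of a training edge, the node set of `augmented` is identical to that of `toy_edges`, which is the authenticity constraint the abstract emphasizes.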

Related papers

How does Graph Structure Modulate Membership-Inference Risk for Graph Neural Networks? [0.34546020643989767]
Graph neural networks (GNNs) have become the standard tool for encoding data and their complex relationships into continuous representations. Their use in sensitive applications has raised concerns about the potential leakage of training data. Research on privacy leakage in GNNs has largely been shaped by findings from non-graph domains.
arXiv Detail & Related papers (2026-01-23T19:08:36Z)
Semi-supervised Instruction Tuning for Large Language Models on Text-Attributed Graphs [62.544129365882014]
We propose a novel Semi-supervised Instruction Tuning pipeline for Graph Learning, named SIT-Graph. SIT-Graph is model-agnostic and can be seamlessly integrated into any graph instruction tuning method that utilizes LLMs as the predictor. Extensive experiments demonstrate that when incorporated into state-of-the-art graph instruction tuning methods, SIT-Graph significantly enhances their performance on text-attributed graph benchmarks.
arXiv Detail & Related papers (2026-01-19T08:10:53Z)
Pruning Spurious Subgraphs for Graph Out-of-Distribution Generalization [90.74916553208153]
We propose PrunE, the first pruning-based graph OOD method that eliminates spurious edges to improve OOD generalizability. PrunE employs two regularization terms to prune spurious edges: 1) a graph size constraint to exclude uninformative spurious edges, and 2) $\epsilon$-probability alignment to further suppress the occurrence of spurious edges.
arXiv Detail & Related papers (2025-06-06T10:34:48Z)
Edge Contrastive Learning: An Augmentation-Free Graph Contrastive Learning Model [18.02317423788033]
Graph contrastive learning (GCL) aims to learn representations from unlabeled graph data in a self-supervised manner. One of the primary obstacles of edge-based GCL is the heavy burden. We propose Augmentation-Free Edge Contrastive Learning (AFECL) to achieve edge-edge contrast.
arXiv Detail & Related papers (2024-12-15T06:16:01Z)
Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness [80.87683145376305]
Graph Neural Networks (GNNs) excel in various graph learning tasks but face computational challenges when applied to large-scale graphs. We propose Graph Sparse Training (GST), which dynamically manipulates sparsity at the data level. GST produces a sparse graph with maximum topological integrity and no performance degradation.
arXiv Detail & Related papers (2024-02-02T09:10:35Z)
Efficient Link Prediction via GNN Layers Induced by Negative Sampling [86.87385758192566]
Graph neural networks (GNNs) for link prediction can loosely be divided into two broad categories. We propose a novel GNN architecture whereby the forward pass explicitly depends on both positive (as is typical) and negative (unique to our approach) edges. This is achieved by recasting the embeddings themselves as minimizers of a forward-pass-specific energy function that favors separation of positive and negative samples.
arXiv Detail & Related papers (2023-10-14T07:02:54Z)
Article Classification with Graph Neural Networks and Multigraphs [0.12499537119440243]
We propose a method to enhance the performance of article classification by enriching simple Graph Neural Network (GNN) pipelines with multi-graph representations. Fully supervised transductive node classification experiments are conducted on the Open Graph Benchmark OGBN-arXiv dataset and the PubMed diabetes dataset. Results demonstrate that multi-graphs consistently improve the performance of a variety of GNN models compared to the default graphs.
arXiv Detail & Related papers (2023-09-20T14:18:04Z)
Training Robust Graph Neural Networks with Topology Adaptive Edge Dropping [116.26579152942162]
Graph neural networks (GNNs) are processing architectures that exploit graph structural information to model representations from network data. Despite their success, GNNs suffer from sub-optimal generalization performance given limited training data. This paper proposes Topology Adaptive Edge Dropping to improve generalization performance and learn robust GNN models.
arXiv Detail & Related papers (2021-06-05T13:20:36Z)
Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph. In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points and consistently outperforms the state of the art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)
Self-Constructing Graph Convolutional Networks for Semantic Labeling [23.623276007011373]
We propose a novel architecture called the Self-Constructing Graph (SCG), which makes use of learnable latent variables to generate embeddings. SCG can automatically obtain optimized non-local context graphs from complex-shaped objects in aerial imagery. We demonstrate the effectiveness and flexibility of the proposed SCG on the publicly available ISPRS Vaihingen dataset.
arXiv Detail & Related papers (2020-03-15T21:55:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.