Graph-Based Feature Augmentation for Predictive Tasks on Relational Datasets
- URL: http://arxiv.org/abs/2508.20986v1
- Date: Thu, 28 Aug 2025 16:44:04 GMT
- Title: Graph-Based Feature Augmentation for Predictive Tasks on Relational Datasets
- Authors: Lianpeng Qiao, Ziqi Cao, Kaiyu Feng, Ye Yuan, Guoren Wang
- Abstract summary: We propose an end-to-end automated feature augmentation framework, ReCoGNN. It enhances initial datasets using features extracted from multiple relational tables to support predictive tasks. ReCoGNN consistently outperforms existing methods on both classification and regression tasks.
- Score: 28.942521393654985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data has become a foundational asset driving innovation across domains such as finance, healthcare, and e-commerce. In these areas, predictive modeling over relational tables is commonly employed, with increasing emphasis on reducing manual effort through automated machine learning (AutoML) techniques. This raises an interesting question: can feature augmentation itself be automated so that it identifies and utilizes task-related relational signals? To address this challenge, we propose an end-to-end automated feature augmentation framework, ReCoGNN, which enhances initial datasets using features extracted from multiple relational tables to support predictive tasks. ReCoGNN first captures semantic dependencies within each table by modeling intra-table attribute relationships, enabling it to partition tables into structured, semantically coherent segments. It then constructs a heterogeneous weighted graph that represents inter-row relationships across all segments. Finally, ReCoGNN leverages message-passing graph neural networks to propagate information through the graph, guiding feature selection and augmenting the original dataset. Extensive experiments conducted on ten real-life and synthetic datasets demonstrate that ReCoGNN consistently outperforms existing methods on both classification and regression tasks.
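At its core, the pipeline the abstract describes (rows become nodes in a weighted graph, message passing aggregates neighbor information, and the aggregated vector augments the original features) reduces to weighted neighbor aggregation. The sketch below is a minimal, hypothetical illustration of that idea; the function name and data layout are assumptions, not the paper's implementation:

```python
# Minimal sketch of graph-based feature augmentation (illustrative only):
# each row is a node with a feature vector, edges carry weights, and one
# round of message passing appends the weighted mean of neighbor features
# to each row's original features.

def propagate(features, edges):
    """features: {node: [float, ...]}; edges: {node: [(neighbor, weight), ...]}.
    Returns each node's original features extended with the weighted mean
    of its neighbors' features (zeros if the node has no neighbors)."""
    dim = len(next(iter(features.values())))
    augmented = {}
    for node, feats in features.items():
        nbrs = edges.get(node, [])
        total_w = sum(w for _, w in nbrs)
        agg = [0.0] * dim
        if total_w > 0:
            for nbr, w in nbrs:
                for i, v in enumerate(features[nbr]):
                    agg[i] += w * v
            agg = [v / total_w for v in agg]
        augmented[node] = list(feats) + agg
    return augmented

rows = {"r1": [1.0, 0.0], "r2": [0.0, 2.0], "r3": [4.0, 4.0]}
graph = {"r1": [("r2", 1.0), ("r3", 1.0)], "r2": [("r1", 2.0)]}
aug = propagate(rows, graph)
print(aug["r1"])  # [1.0, 0.0, 2.0, 3.0]
```

A full system would stack several such rounds with learned weights and run them over a heterogeneous graph; this sketch only shows why the augmented rows carry task-relevant signal from related tables.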
Related papers
- A Pre-training Framework for Relational Data with Information-theoretic Principles [57.93973948947743]
We introduce Task Vector Estimation (TVE), a novel pre-training framework that constructs supervisory signals via set-based aggregation over relational graphs. TVE consistently outperforms traditional pre-training baselines. Our findings advocate for pre-training objectives that encode task heterogeneity and temporal structure as design principles for predictive modeling on relational databases.
arXiv Detail & Related papers (2025-07-14T00:17:21Z)
- Relational Deep Learning: Challenges, Foundations and Next-Generation Architectures [50.46688111973999]
Graph machine learning has led to a significant increase in the capabilities of models that learn on arbitrary graph-structured data. We present a new blueprint that enables end-to-end representation of 'relational entity graphs' without traditional feature engineering. We discuss key challenges including large-scale multi-table integration and the complexities of modeling temporal dynamics and heterogeneous data.
arXiv Detail & Related papers (2025-06-19T23:51:38Z)
- From Features to Structure: Task-Aware Graph Construction for Relational and Tabular Learning with GNNs [6.0757501646966965]
We introduce auGraph, a unified framework for task-aware graph augmentation. auGraph enhances base graph structures by selectively promoting attributes into nodes. It preserves the original data schema while injecting task-relevant structural signal.
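"Promoting attributes into nodes" is a concrete graph-construction move: a column's values become their own nodes, so rows sharing a value are linked through that value node rather than by dense row-to-row edges. The helper below is a hypothetical sketch of the idea only; the function and naming scheme are assumptions, not the auGraph API:

```python
# Illustrative attribute promotion (names are assumptions, not auGraph's):
# the chosen column's distinct values become nodes, and each row gets an
# edge to the node for its value, linking rows that share that value.

def promote_attribute(rows, column):
    """rows: list of dicts; column: attribute to promote.
    Returns (value_nodes, edges) where edges connect row ids to value nodes."""
    value_nodes = {}
    edges = []
    for idx, row in enumerate(rows):
        value = row[column]
        node = value_nodes.setdefault(value, f"{column}={value}")
        edges.append((f"row{idx}", node))
    return sorted(value_nodes.values()), edges

rows = [{"city": "Oslo"}, {"city": "Lima"}, {"city": "Oslo"}]
nodes, edges = promote_attribute(rows, "city")
print(nodes)  # ['city=Lima', 'city=Oslo']
```

The schema is untouched (rows keep their attributes); only the graph gains structure, which matches the blurb's claim of injecting structural signal while preserving the original data.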
arXiv Detail & Related papers (2025-06-02T20:42:53Z)
- Relation-Aware Graph Foundation Model [21.86954503656643]
Graph foundation models (GFMs) have emerged as a promising direction in graph learning. Unlike language models that rely on explicit token representations, graphs lack a well-defined unit for generalization. We propose REEF, a novel framework that leverages relation tokens as the basic units for GFMs.
arXiv Detail & Related papers (2025-05-17T14:34:41Z)
- Boosting Relational Deep Learning with Pretrained Tabular Models [18.34233986830027]
Graph Neural Networks (GNNs) offer a compelling alternative by inherently modeling these relationships. Our framework achieves up to $33\%$ performance improvement and a $526\times$ inference speedup compared to GNNs.
arXiv Detail & Related papers (2025-04-07T11:19:04Z)
- A Transfer Framework for Enhancing Temporal Graph Learning in Data-Scarce Settings [30.97142882931946]
Dynamic interactions between entities are prevalent in domains like social platforms, financial systems, healthcare, and e-commerce. Temporal graph neural networks (TGNNs) have achieved strong results for such predictive tasks but typically require extensive training data, which is often limited in real-world scenarios. We introduce a novel transfer approach that disentangles node representations from their associated features through a structured bipartite encoding mechanism.
arXiv Detail & Related papers (2025-03-02T11:10:29Z)
- RelGNN: Composite Message Passing for Relational Deep Learning [56.48834369525997]
We introduce RelGNN, a novel GNN framework specifically designed to leverage the unique structural characteristics of graphs built from relational databases. RelGNN is evaluated on 30 diverse real-world tasks from RelBench (Fey et al., 2024), and achieves state-of-the-art performance on the vast majority of tasks, with improvements of up to 25%.
arXiv Detail & Related papers (2025-02-10T18:58:40Z)
- DGNN: Decoupled Graph Neural Networks with Structural Consistency between Attribute and Graph Embedding Representations [62.04558318166396]
Graph neural networks (GNNs) demonstrate a robust capability for representation learning on graphs with complex structures.
A novel GNN framework, dubbed Decoupled Graph Neural Networks (DGNN), is introduced to obtain a more comprehensive embedding representation of nodes.
Experimental results on several graph benchmark datasets verify DGNN's superiority on the node classification task.
arXiv Detail & Related papers (2024-01-28T06:43:13Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- Dynamic Relation Discovery and Utilization in Multi-Entity Time Series Forecasting [92.32415130188046]
In many real-world scenarios, there can exist crucial yet implicit relations between entities.
We propose an attentional multi-graph neural network with automatic graph learning (A2GNN) in this work.
arXiv Detail & Related papers (2022-02-18T11:37:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.