Database Views as Explanations for Relational Deep Learning
- URL: http://arxiv.org/abs/2509.09482v1
- Date: Thu, 11 Sep 2025 14:11:48 GMT
- Title: Database Views as Explanations for Relational Deep Learning
- Authors: Agapi Rissaki, Ilias Fountalis, Wolfgang Gatterbauer, Benny Kimelfeld,
- Abstract summary: We present a novel framework for explaining machine-learning models over relational databases.<n>We develop algorithms that avoid the exhaustive search over the space of all databases.<n>Our approach is evaluated through an extensive empirical study on the RelBench collection.
- Score: 7.126902744514975
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, there has been significant progress in the development of deep learning models over relational databases, including architectures based on heterogeneous graph neural networks (hetero-GNNs) and heterogeneous graph transformers. In effect, such architectures state how the database records and links (e.g., foreign-key references) translate into a large, complex numerical expression, involving numerous learnable parameters. This complexity makes it hard to explain, in human-understandable terms, how a model uses the available data to arrive at a given prediction. We present a novel framework for explaining machine-learning models over relational databases, where explanations are view definitions that highlight focused parts of the database that mostly contribute to the model's prediction. We establish such global abductive explanations by adapting the classic notion of determinacy by Nash, Segoufin, and Vianu (2010). In addition to tuning the tradeoff between determinacy and conciseness, the framework allows controlling the level of granularity by adopting different fragments of view definitions, such as ones highlighting whole columns, foreign keys between tables, relevant groups of tuples, and so on. We investigate the realization of the framework in the case of hetero-GNNs. We develop heuristic algorithms that avoid the exhaustive search over the space of all databases. We propose techniques that are model-agnostic, and others that are tailored to hetero-GNNs via the notion of learnable masking. Our approach is evaluated through an extensive empirical study on the RelBench collection, covering a variety of domains and different record-level tasks. The results demonstrate the usefulness of the proposed explanations, as well as the efficiency of their generation.
Related papers
- Synthesize, Retrieve, and Propagate: A Unified Predictive Modeling Framework for Relational Databases [34.57267286892218]
We propose SRP, a unified predictive modeling framework that synthesizes features using the unary dependency.<n>SRP is designed to fully capture both the unary and the composite dependencies within a relational database.
arXiv Detail & Related papers (2025-08-10T07:59:41Z) - Relational Deep Learning: Challenges, Foundations and Next-Generation Architectures [50.46688111973999]
Graph machine learning has led to a significant increase in the capabilities of models that learn on arbitrary graph-structured data.<n>We present a new blueprint that enables end-to-end representation of'relational entity graphs' without traditional engineering feature.<n>We discuss key challenges including large-scale multi-table integration and the complexities of modeling temporal dynamics and heterogeneous data.
arXiv Detail & Related papers (2025-06-19T23:51:38Z) - Representation Learning for Tabular Data: A Comprehensive Survey [23.606506938919605]
Tabular data, structured as rows and columns, is among the most prevalent data types in machine learning classification and regression applications.<n>Deep Neural Networks (DNNs) have recently demonstrated promising results through their capability of representation learning.<n>We organize existing methods into three main categories according to their generalization capabilities.
arXiv Detail & Related papers (2025-04-17T17:58:23Z) - RelGNN: Composite Message Passing for Relational Deep Learning [56.48834369525997]
We introduce RelGNN, a novel GNN framework specifically designed to leverage the unique structural characteristics of the graphs built from relational databases.<n>RelGNN is evaluated on 30 diverse real-world tasks from Relbench (Fey et al., 2024), and achieves state-of-the-art performance on the vast majority tasks, with improvements of up to 25%.
arXiv Detail & Related papers (2025-02-10T18:58:40Z) - A Self-Explainable Heterogeneous GNN for Relational Deep Learning [13.201879122862703]
We present a self-explainable heterogeneous graph neural network (GNN) for relational data.<n>Our approach effectively identifies informative meta-paths that faithfully capture the model's reasoning mechanisms.<n>It significantly outperforms existing methods in both synthetic and real-world scenario.
arXiv Detail & Related papers (2024-11-30T16:02:14Z) - BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data [61.936320820180875]
Large language models (LLMs) have become increasingly pivotal across various domains.
BabelBench is an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution.
Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement.
arXiv Detail & Related papers (2024-10-01T15:11:24Z) - G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning [8.02547453169677]
We propose a novel Graph-based Structure-Aware Prompt Learning Model for commonsense reasoning, named G-SAP.
In particular, an evidence graph is constructed by integrating multiple knowledge sources, i.e. ConceptNet, Wikipedia, and Cambridge Dictionary.
The results reveal a significant advancement over the existing models, especially, with 6.12% improvement over the SoTA LM+GNNs model on the OpenbookQA dataset.
arXiv Detail & Related papers (2024-05-09T08:28:12Z) - Learning Federated Neural Graph Databases for Answering Complex Queries from Distributed Knowledge Graphs [53.03085605769093]
We propose to learn Federated Neural Graph DataBase (FedNGDB), a pioneering systematic framework that empowers privacy-preserving reasoning over multi-source graph data.<n>FedNGDB leverages federated learning to collaboratively learn graph representations across multiple sources, enriching relationships between entities, and improving the overall quality of graph data.
arXiv Detail & Related papers (2024-02-22T14:57:44Z) - Relational Deep Learning: Graph Representation Learning on Relational
Databases [69.7008152388055]
We introduce an end-to-end representation approach to learn on data laid out across multiple tables.
Message Passing Graph Neural Networks can then automatically learn across the graph to extract representations that leverage all data input.
arXiv Detail & Related papers (2023-12-07T18:51:41Z) - Neural Graph Reasoning: Complex Logical Query Answering Meets Graph
Databases [63.96793270418793]
Complex logical query answering (CLQA) is a recently emerged task of graph machine learning.
We introduce the concept of Neural Graph Database (NGDBs)
NGDB consists of a Neural Graph Storage and a Neural Graph Engine.
arXiv Detail & Related papers (2023-03-26T04:03:37Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into subspace, we show that our method can address the large-scale and out-of-sample problem.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.