A Locality Radius Framework for Understanding Relational Inductive Bias in Database Learning
- URL: http://arxiv.org/abs/2602.17092v1
- Date: Thu, 19 Feb 2026 05:31:03 GMT
- Title: A Locality Radius Framework for Understanding Relational Inductive Bias in Database Learning
- Authors: Aadi Joshi, Kavya Bhand
- Abstract summary: We introduce locality radius, a measure of the minimum structural neighborhood required to determine a prediction in relational schemas. We conduct a controlled empirical study across foreign key prediction, join cost estimation, blast radius regression, cascade impact classification, and additional graph-derived schema tasks. Results reveal a consistent bias-radius alignment effect.
- Score: 0.3058685580689604
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foreign key discovery and related schema-level prediction tasks are often modeled using graph neural networks (GNNs), implicitly assuming that relational inductive bias improves performance. However, it remains unclear when multi-hop structural reasoning is actually necessary. In this work, we introduce locality radius, a formal measure of the minimum structural neighborhood required to determine a prediction in relational schemas. We hypothesize that model performance depends critically on alignment between task locality radius and architectural aggregation depth. We conduct a controlled empirical study across foreign key prediction, join cost estimation, blast radius regression, cascade impact classification, and additional graph-derived schema tasks. Our evaluation includes multi-seed experiments, capacity-matched comparisons, statistical significance testing, scaling analysis, and synthetic radius-controlled benchmarks. Results reveal a consistent bias-radius alignment effect.
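To make the notion of a structural neighborhood concrete, here is a minimal sketch and not the authors' formal definition. It assumes a schema can be viewed as an undirected graph whose nodes are tables and whose edges are foreign-key links, and it relies on the standard fact that a message-passing GNN with k aggregation layers can only see a node's k-hop neighborhood. The toy schema, `k_hop_neighborhood`, and `min_radius_to_reach` are hypothetical names introduced for illustration only.

```python
from collections import deque


def k_hop_neighborhood(adjacency, start, k):
    """Return the set of nodes reachable from `start` in at most k hops (BFS)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            # Do not expand beyond the requested radius.
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


def min_radius_to_reach(adjacency, start, target):
    """Smallest k such that `target` lies in the k-hop neighborhood of `start`.

    Returns None if `target` is unreachable. This is only a proxy for the
    paper's locality radius, assuming the prediction depends on `target`.
    """
    for k in range(len(adjacency) + 1):
        if target in k_hop_neighborhood(adjacency, start, k):
            return k
    return None


# Hypothetical toy schema: undirected foreign-key links between tables.
schema = {
    "orders": ["customers", "order_items"],
    "customers": ["orders"],
    "order_items": ["orders", "products"],
    "products": ["order_items", "suppliers"],
    "suppliers": ["products"],
}

one_hop = k_hop_neighborhood(schema, "orders", 1)
radius = min_radius_to_reach(schema, "orders", "suppliers")
print(sorted(one_hop))
print(radius)
```

Under these assumptions, a prediction about `orders` that depends on `suppliers` would need at least as many aggregation layers as the value returned by `min_radius_to_reach`, while a task determined by immediate neighbors would not benefit from the extra depth.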
Related papers
- Understanding Generalization from Embedding Dimension and Distributional Convergence [13.491874401333021]
We study generalization from a representation-centric perspective and analyze how the geometry of learned embeddings controls predictive performance for a fixed trained model. We show that population risk can be bounded by two factors: (i) the intrinsic dimension of the embedding distribution, which determines the convergence rate of empirical embedding distribution to the population distribution in Wasserstein distance, and (ii) the sensitivity of the downstream mapping from embeddings to predictions, characterized by Lipschitz constants.
arXiv Detail & Related papers (2026-01-30T09:32:04Z) - On Discprecncies between Perturbation Evaluations of Graph Neural Network Attributions [49.8110352174327]
We assess attribution methods from a perspective not previously explored in the graph domain: retraining.
The core idea is to retrain the network on important (or not important) relationships as identified by the attributions.
We run our analysis on four state-of-the-art GNN attribution methods and five synthetic and real-world graph classification datasets.
arXiv Detail & Related papers (2024-01-01T02:03:35Z) - Pitfalls of Climate Network Construction: A Statistical Perspective [13.623860700196625]
We simulate time-dependent isotropic random fields on the sphere and apply common network construction techniques.
We find several ways in which the uncertainty stemming from the estimation procedure has major impact on network characteristics.
arXiv Detail & Related papers (2022-11-05T11:59:55Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Discovering Invariant Rationales for Graph Neural Networks [104.61908788639052]
Intrinsic interpretability of graph neural networks (GNNs) means finding a small subset of the input graph's features.
We propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs.
arXiv Detail & Related papers (2022-01-30T16:43:40Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z) - Prequential MDL for Causal Structure Learning with Neural Networks [9.669269791955012]
We show that the prequential minimum description length principle can be used to derive a practical scoring function for Bayesian networks.
We obtain plausible and parsimonious graph structures without relying on sparsity inducing priors or other regularizers which must be tuned.
We discuss how the prequential score relates to recent work that infers causal structure from the speed of adaptation when the observations come from a source undergoing distributional shift.
arXiv Detail & Related papers (2021-07-02T22:35:21Z) - Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)