Actionable Interpretability via Causal Hypergraphs: Unravelling Batch Size Effects in Deep Learning
- URL: http://arxiv.org/abs/2506.17826v1
- Date: Sat, 21 Jun 2025 21:38:43 GMT
- Title: Actionable Interpretability via Causal Hypergraphs: Unravelling Batch Size Effects in Deep Learning
- Authors: Zhongtian Sun, Anoushka Harit, Pietro Lio
- Abstract summary: We introduce a hypergraph-based causal framework, HGCNet, to uncover how batch size influences generalisation via gradient noise, minima sharpness, and model complexity. Using do-calculus, we quantify direct and mediated effects of batch size interventions, providing interpretable, causally grounded insights into optimisation.
- Score: 6.583734409076539
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While the impact of batch size on generalisation is well studied in vision tasks, its causal mechanisms remain underexplored in graph and text domains. We introduce a hypergraph-based causal framework, HGCNet, that leverages deep structural causal models (DSCMs) to uncover how batch size influences generalisation via gradient noise, minima sharpness, and model complexity. Unlike prior approaches based on static pairwise dependencies, HGCNet employs hypergraphs to capture higher-order interactions across training dynamics. Using do-calculus, we quantify direct and mediated effects of batch size interventions, providing interpretable, causally grounded insights into optimisation. Experiments on citation networks, biomedical text, and e-commerce reviews show that HGCNet outperforms strong baselines including GCN, GAT, PI-GNN, BERT, and RoBERTa. Our analysis reveals that smaller batch sizes causally enhance generalisation through increased stochasticity and flatter minima, offering actionable interpretability to guide training strategies in deep learning. This work positions interpretability as a driver of principled architectural and optimisation choices beyond post hoc analysis.
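To make the do-calculus step concrete, the toy script below estimates the total, direct, and mediated effects of a batch-size intervention in a small linear structural causal model. The variable names, coefficients, and linear form are illustrative assumptions for exposition; HGCNet's actual DSCM over hypergraphs is described in the paper.

```python
# Toy linear SCM illustrating the direct / mediated effect decomposition
# that do-calculus licenses. All variable names and coefficients are
# illustrative assumptions, not HGCNet's actual model.
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Exogenous "batch size" (log scale) and structural equations:
B = rng.uniform(4, 10, n)                                  # log2 batch size
N = 5.0 - 0.5 * B + rng.normal(0, 0.3, n)                  # gradient noise falls with B
S = 0.8 * B - 0.6 * N + rng.normal(0, 0.3, n)              # sharpness rises with B
G = 0.4 * S - 0.3 * N + 0.1 * B + rng.normal(0, 0.3, n)    # generalisation gap

def ols(y, *xs):
    """Least-squares coefficients of y on columns xs (plus intercept)."""
    X = np.column_stack([np.ones_like(y), *xs])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]  # drop intercept

# Because B is exogenous here, plain regression identifies interventional effects.
(total,) = ols(G, B)              # total effect of do(B)
direct, *_ = ols(G, B, N, S)      # controlled direct effect (mediators held fixed)
print(f"total effect  : {total:+.3f}")
print(f"direct effect : {direct:+.3f}")
print(f"mediated part : {total - direct:+.3f}")  # via noise and sharpness
```

In this toy model the run recovers a total effect of roughly +0.69, of which only about +0.1 is direct, mirroring the abstract's point that batch size acts on generalisation largely through mediators such as gradient noise and minima sharpness.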
Related papers
- A Recipe for Causal Graph Regression: Confounding Effects Revisited [10.615260306723536]
Causal graph learning (CGL) has risen as a promising approach for improving the generalizability of graph neural networks under out-of-distribution (OOD) scenarios.
We focus on tackling causal graph regression (CGR), a more challenging setting in graph learning.
We reflect on the predictive power of confounders in graph-level regression, and generalize classification-specific causal intervention techniques to regression through a lens of contrastive learning.
arXiv Detail & Related papers (2025-07-01T05:46:29Z)
- Inference Scaled GraphRAG: Improving Multi Hop Question Answering on Knowledge Graphs [15.036480111358369]
Large Language Models (LLMs) have achieved impressive capabilities in language understanding and generation.
They continue to underperform on knowledge-intensive reasoning tasks due to limited access to structured context and multi-hop information.
We introduce Inference-Scaled GraphRAG, a novel framework that enhances LLM-based graph reasoning by applying inference-time compute scaling.
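The summary does not say which inference-time scaling scheme is used; one pattern consistent with it is self-consistency over sampled graph traversals. The sketch below is a generic stand-in, with `GRAPH`, `sample_traversal`, and `answer` as hypothetical names rather than the paper's API.

```python
# Minimal self-consistency sketch for multi-hop graph QA.
# `sample_traversal` is a stand-in for an LLM-guided walk over the
# knowledge graph; the real Inference-Scaled GraphRAG mechanism may differ.
import random
from collections import Counter

GRAPH = {  # tiny toy knowledge graph: head -> [(relation, tail), ...]
    "Ada Lovelace": [("collaborated_with", "Charles Babbage")],
    "Charles Babbage": [("designed", "Analytical Engine")],
}

def sample_traversal(start: str, hops: int) -> str:
    """Randomly walk `hops` edges and return the terminal entity."""
    node = start
    for _ in range(hops):
        edges = GRAPH.get(node)
        if not edges:
            break
        _, node = random.choice(edges)
    return node

def answer(start: str, hops: int, samples: int = 16) -> str:
    """Scale inference-time compute: more samples give a more reliable vote."""
    votes = Counter(sample_traversal(start, hops) for _ in range(samples))
    return votes.most_common(1)[0][0]

print(answer("Ada Lovelace", hops=2))  # -> "Analytical Engine"
```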
arXiv Detail & Related papers (2025-06-24T19:31:03Z)
- Geometry-Aware Edge Pooling for Graph Neural Networks [20.080879481223924]
Graph Neural Networks (GNNs) have shown significant success for graph-based tasks.
Motivated by the prevalence of large datasets in real-world applications, pooling layers are crucial components of GNNs.
We propose novel graph pooling layers for structure-aware pooling via edge collapses.
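A minimal sketch of pooling via edge collapses follows, assuming a simple feature-distance score in place of the paper's geometry-aware criterion; `edge_collapse_pool` is a hypothetical helper for exposition only.

```python
# Minimal edge-collapse pooling sketch: greedily contract low-score edges,
# averaging endpoint features. The scoring rule here (feature distance) is
# an illustrative stand-in for the paper's geometry-aware criterion.
import numpy as np

def edge_collapse_pool(X, edges, ratio=0.5):
    """X: (n, d) node features; edges: list of (u, v); keep ~ratio*n nodes."""
    n = len(X)
    parent = list(range(n))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    # Cheap geometric score: prefer collapsing edges joining similar nodes.
    ranked = sorted(edges, key=lambda e: np.linalg.norm(X[e[0]] - X[e[1]]))
    clusters, target = n, max(1, int(ratio * n))
    for u, v in ranked:
        ru, rv = find(u), find(v)
        if ru != rv and clusters > target:
            parent[rv] = ru
            clusters -= 1
    # Pool features per cluster by averaging.
    roots = {find(i) for i in range(n)}
    remap = {r: k for k, r in enumerate(sorted(roots))}
    Xp = np.zeros((len(roots), X.shape[1]))
    counts = np.zeros(len(roots))
    for i in range(n):
        k = remap[find(i)]
        Xp[k] += X[i]
        counts[k] += 1
    return Xp / counts[:, None]

X = np.random.rand(6, 3)
print(edge_collapse_pool(X, [(0, 1), (1, 2), (3, 4), (4, 5)]).shape)  # (3, 3)
```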
arXiv Detail & Related papers (2025-06-13T12:01:46Z)
- Beyond Message Passing: Neural Graph Pattern Machine [50.78679002846741]
We introduce the Neural Graph Pattern Machine (GPM), a novel framework that bypasses message passing by learning directly from graph substructures.
GPM efficiently extracts, encodes, and prioritizes task-relevant graph patterns, offering greater expressivity and improved ability to capture long-range dependencies.
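The extract-encode-prioritise pipeline the summary describes can be sketched generically. The `PatternPooler` module below, and the idea of using pre-pooled random-walk features as patterns, are our assumptions, not GPM's actual architecture.

```python
# Sketch of an "extract -> encode -> prioritise" pattern pipeline, with
# pre-pooled substructure features standing in for GPM's actual sampler.
import torch
import torch.nn as nn

class PatternPooler(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
        self.score = nn.Linear(d, 1)     # learned pattern priority

    def forward(self, patterns):         # patterns: (num_patterns, d)
        h = self.encode(patterns)        # encode each substructure
        w = torch.softmax(self.score(h), dim=0)  # prioritise task-relevant ones
        return (w * h).sum(dim=0)        # graph-level representation

# Each "pattern" here is a pre-pooled feature vector of one sampled walk.
pooler = PatternPooler(d=16)
print(pooler(torch.randn(8, 16)).shape)  # -> torch.Size([16])
```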
arXiv Detail & Related papers (2025-01-30T20:37:47Z)
- Generalization Performance of Hypergraph Neural Networks [21.483543928698676]
We develop margin-based generalization bounds for four representative classes of hypergraph neural networks.
Our results reveal the manner in which hypergraph structure and spectral norms of the learned weights can affect the generalization bounds.
Our empirical study examines the relationship between the practical performance and theoretical bounds of the models over synthetic and real-world datasets.
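For orientation, margin-based bounds of the kind the summary mentions typically take the following generic shape; the hypergraph-dependent factor and the exact constants are in the paper and not reproduced here.

```latex
% Generic shape of a margin-based generalisation bound; the exact
% hypergraph-dependent constants are in the paper, not reproduced here.
\Pr\bigl[\operatorname{err}(f)\bigr]
  \;\le\; \widehat{R}_\gamma(f)
  \;+\; \widetilde{\mathcal{O}}\!\left(
      \frac{\prod_{\ell=1}^{L}\lVert W_\ell\rVert_2 \,\cdot\, C(\mathcal{H})}
           {\gamma\sqrt{m}}
    \right)
```

Here \(\widehat{R}_\gamma(f)\) is the empirical margin loss at margin \(\gamma\), \(m\) the sample size, \(\lVert W_\ell\rVert_2\) the spectral norms of the learned weights, and \(C(\mathcal{H})\) a factor depending on the hypergraph structure, matching the summary's claim that structure and spectral norms drive the bound.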
arXiv Detail & Related papers (2025-01-22T00:20:26Z)
- LAMP: Learnable Meta-Path Guided Adversarial Contrastive Learning for Heterogeneous Graphs [22.322402072526927]
Heterogeneous Graph Contrastive Learning (HGCL) usually requires pre-defined meta-paths.
LAMP integrates various meta-path sub-graphs into a unified and stable structure.
LAMP significantly outperforms existing state-of-the-art unsupervised models in terms of accuracy and robustness.
arXiv Detail & Related papers (2024-09-10T08:27:39Z)
- Amplify Graph Learning for Recommendation via Sparsity Completion [16.32861024767423]
Graph learning models have been widely deployed in collaborative filtering (CF) based recommendation systems.
Due to the issue of data sparsity, the graph structure of the original input lacks potential positive preference edges.
We propose an Amplify Graph Learning framework based on Sparsity Completion (AGL-SC).
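The completion step can be sketched generically: score unobserved user-item pairs with low-rank factors and add the most confident ones as edges. This is an illustrative stand-in, not AGL-SC's actual mechanism.

```python
# Minimal sparsity-completion sketch: score unobserved user-item pairs with
# factorised embeddings and add the most confident ones as new edges.
# A generic stand-in for AGL-SC's actual completion mechanism.
import numpy as np

rng = np.random.default_rng(0)
R = (rng.random((8, 10)) > 0.85).astype(float)   # sparse interaction matrix

# Crude low-rank factors via truncated SVD of the observed matrix.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 3
scores = (U[:, :k] * s[:k]) @ Vt[:k]             # predicted affinities

# Augment the graph with the top-N most confident missing edges.
missing = np.argwhere(R == 0)
top = missing[np.argsort(-scores[R == 0])[:5]]
R_aug = R.copy()
R_aug[tuple(top.T)] = 1.0
print(f"added {int(R_aug.sum() - R.sum())} completion edges")
```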
arXiv Detail & Related papers (2024-06-27T08:26:20Z)
- Revealing Decurve Flows for Generalized Graph Propagation [108.80758541147418]
This study addresses the limitations of the traditional analysis of message-passing, central to graph learning, by defining generalized propagation with directed and weighted graphs.
We include a preliminary exploration of learned propagation patterns in datasets, a first in the field.
arXiv Detail & Related papers (2024-02-13T14:13:17Z)
- Graph-level Protein Representation Learning by Structure Knowledge Refinement [50.775264276189695]
This paper focuses on learning representations at the whole-graph level in an unsupervised manner.
We propose a novel framework called Structure Knowledge Refinement (SKR) which uses data structure to determine the probability of whether a pair is positive or negative.
arXiv Detail & Related papers (2024-01-05T09:05:33Z)
- On the Expressiveness and Generalization of Hypergraph Neural Networks [77.65788763444877]
This extended abstract describes a framework for analyzing the expressiveness, learning, and (structural) generalization of hypergraph neural networks (HyperGNNs).
Specifically, we focus on how HyperGNNs can learn from finite datasets and generalize structurally to graph reasoning problems of arbitrary input sizes.
arXiv Detail & Related papers (2023-03-09T18:42:18Z)
- Position-aware Structure Learning for Graph Topology-imbalance by Relieving Under-reaching and Over-squashing [67.83086131278904]
Topology-imbalance is a graph-specific imbalance problem caused by the uneven topology positions of labeled nodes.
We propose a novel position-aware graph structure learning framework named PASTEL.
Our key insight is to enhance the connectivity of nodes within the same class to provide more supervision information.
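That key insight lends itself to a small sketch: connect labeled nodes that share a class so supervision reaches further. PASTEL's position-aware scoring is more involved; `intra_class_edges` is a hypothetical helper written only to illustrate the idea.

```python
# Sketch of the key insight (denser intra-class connectivity): add edges
# between labeled nodes that share a class. PASTEL's actual position-aware
# scoring is more involved; this only illustrates the supervision effect.
from itertools import combinations

def intra_class_edges(labels: dict[int, int]) -> list[tuple[int, int]]:
    """labels: node -> class for the labeled subset. Returns new edges."""
    by_class: dict[int, list[int]] = {}
    for node, c in labels.items():
        by_class.setdefault(c, []).append(node)
    return [e for nodes in by_class.values() for e in combinations(nodes, 2)]

print(intra_class_edges({0: 0, 3: 0, 5: 1, 7: 1, 9: 1}))
# -> [(0, 3), (5, 7), (5, 9), (7, 9)]
```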
arXiv Detail & Related papers (2022-08-17T14:04:21Z)
- GraphCoCo: Graph Complementary Contrastive Learning [65.89743197355722]
Graph Contrastive Learning (GCL) has shown promising performance in graph representation learning (GRL) without the supervision of manual annotations.
This paper proposes GraphCoCo, an effective graph complementary contrastive learning approach.
arXiv Detail & Related papers (2022-03-24T02:58:36Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Stacking many such layers, however, tends to degrade performance; several recent studies attribute this deterioration to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
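DAGNN's decoupling of the feature transform from propagation, combined with a learned per-hop gate, can be sketched as below; normalisation and training details are simplified, so treat this as a reading aid rather than a reference implementation.

```python
# Minimal sketch of DAGNN-style adaptive aggregation: decouple the MLP
# transform from propagation, then let a learned gate weight each hop.
import torch
import torch.nn as nn

class DAGNNLite(nn.Module):
    def __init__(self, d_in, d_out, hops=10):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU(),
                                 nn.Linear(d_out, d_out))
        self.gate = nn.Linear(d_out, 1)
        self.hops = hops

    def forward(self, X, A_hat):
        """X: (n, d_in) features; A_hat: (n, n) normalised adjacency."""
        h = self.mlp(X)
        reps = [h]
        for _ in range(self.hops):          # large receptive field with no
            reps.append(A_hat @ reps[-1])   # extra parameters per hop
        H = torch.stack(reps, dim=1)        # (n, hops+1, d_out)
        s = torch.sigmoid(self.gate(H))     # per-node, per-hop retainment
        return (s * H).sum(dim=1)           # adaptive combination

n, d = 5, 8
A_hat = torch.eye(n)                        # stand-in for D^-1/2 A D^-1/2
print(DAGNNLite(d, 4)(torch.randn(n, d), A_hat).shape)  # torch.Size([5, 4])
```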
arXiv Detail & Related papers (2020-07-18T01:11:14Z)