Related papers: Gradient Regularization-based Neural Granger Causality

Gradient Regularization-based Neural Granger Causality

URL: http://arxiv.org/abs/2507.11178v1
Date: Tue, 15 Jul 2025 10:35:29 GMT
Title: Gradient Regularization-based Neural Granger Causality
Authors: Meiliang Liu, Huiwen Dong, Xiaoxiao Yang, Yunfang Xu, Zijin Li, Zhengye Si, Xinyue Yang, Zhiwen Zhao,
Abstract summary: We propose Gradient Regularization-based Neural Granger Causality (GRNGC)<n> GRNGC requires only one time series prediction model and applies $L_1$ regularization to the gradient between model's input and output to infer Granger causality.<n> Numerical simulations on DREAM, Lorenz-96, fMRI, and CausalTime show that GRNGC outperforms existing baselines and significantly reduces computational overhead.
Score: 1.7365653221505928
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the advancement of deep learning technologies, various neural network-based Granger causality models have been proposed. Although these models have demonstrated notable improvements, several limitations remain. Most existing approaches adopt the component-wise architecture, necessitating the construction of a separate model for each time series, which results in substantial computational costs. In addition, imposing the sparsity-inducing penalty on the first-layer weights of the neural network to extract causal relationships weakens the model's ability to capture complex interactions. To address these limitations, we propose Gradient Regularization-based Neural Granger Causality (GRNGC), which requires only one time series prediction model and applies $L_{1}$ regularization to the gradient between model's input and output to infer Granger causality. Moreover, GRNGC is not tied to a specific time series forecasting model and can be implemented with diverse architectures such as KAN, MLP, and LSTM, offering enhanced flexibility. Numerical simulations on DREAM, Lorenz-96, fMRI BOLD, and CausalTime show that GRNGC outperforms existing baselines and significantly reduces computational overhead. Meanwhile, experiments on real-world DNA, Yeast, HeLa, and bladder urothelial carcinoma datasets further validate the model's effectiveness in reconstructing gene regulatory networks.

Related papers

The Law of Multi-Model Collaboration: Scaling Limits of Model Ensembling for Large Language Models [54.51795784459866]
We propose a theoretical framework of performance scaling for multi-model collaboration.<n>We show that multi-model systems follow a power-law scaling with respect to the total parameter count.<n> ensembles of heterogeneous model families achieve better performance scaling than those formed within a single model family.
arXiv Detail & Related papers (2025-12-29T09:55:12Z)
Jacobian Regularizer-based Neural Granger Causality [45.902407376192656]
We propose a Jacobian Regularizer-based Neural Granger Causality (JRNGC) approach. Our method eliminates the sparsity constraints of weights by leveraging an input-output Jacobian matrix regularizer. Our proposed approach achieves competitive performance with the state-of-the-art methods for learning summary Granger causality and full-time Granger causality.
arXiv Detail & Related papers (2024-05-14T17:13:50Z)
CGNSDE: Conditional Gaussian Neural Stochastic Differential Equation for Modeling Complex Systems and Data Assimilation [1.4322470793889193]
A new knowledge-based and machine learning hybrid modeling approach, called conditional neural differential equation (CGNSDE), is developed. In contrast to the standard neural network predictive models, the CGNSDE is designed to effectively tackle both forward prediction tasks and inverse state estimation problems.
arXiv Detail & Related papers (2024-04-10T05:32:03Z)
Contextualizing MLP-Mixers Spatiotemporally for Urban Data Forecast at Scale [54.15522908057831]
We propose an adapted version of the computationally-Mixer for STTD forecast at scale. Our results surprisingly show that this simple-yeteffective solution can rival SOTA baselines when tested on several traffic benchmarks. Our findings contribute to the exploration of simple-yet-effective models for real-world STTD forecasting.
arXiv Detail & Related papers (2023-07-04T05:19:19Z)
Learning CO$_2$ plume migration in faulted reservoirs with Graph Neural Networks [0.3914676152740142]
We develop a graph-based neural model for capturing the impact of faults on CO$$ plume migration. We demonstrate that our approach can accurately predict the temporal evolution of gas saturation and pore pressure in a synthetic reservoir with faults. This work highlights the potential of GNN-based methods to accurately and rapidly model subsurface flow with complex faults and fractures.
arXiv Detail & Related papers (2023-06-16T06:47:47Z)
Gibbs-Duhem-Informed Neural Networks for Binary Activity Coefficient Prediction [45.84205238554709]
We propose Gibbs-Duhem-informed neural networks for the prediction of binary activity coefficients at varying compositions. We include the Gibbs-Duhem equation explicitly in the loss function for training neural networks.
arXiv Detail & Related papers (2023-05-31T07:36:45Z)
A Momentum-Incorporated Non-Negative Latent Factorization of Tensors Model for Dynamic Network Representation [0.0]
A large-scale dynamic network (LDN) is a source of data in many big data-related applications. A Latent factorization of tensors (LFT) model efficiently extracts this time pattern. LFT models based on gradient descent (SGD) solvers are often limited by training schemes and have poor tail convergence. This paper proposes a novel nonlinear LFT model (MNNL) based on momentum-ind SGD to make training unconstrained and compatible with general training schemes.
arXiv Detail & Related papers (2023-05-04T12:30:53Z)
Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are a deep learning model that account for irregular observations. We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
MGNNI: Multiscale Graph Neural Networks with Implicit Layers [53.75421430520501]
implicit graph neural networks (GNNs) have been proposed to capture long-range dependencies in underlying graphs. We introduce and justify two weaknesses of implicit GNNs: the constrained expressiveness due to their limited effective range for capturing long-range dependencies, and their lack of ability to capture multiscale information on graphs at multiple resolutions. We propose a multiscale graph neural network with implicit layers (MGNNI) which is able to model multiscale structures on graphs and has an expanded effective range for capturing long-range dependencies.
arXiv Detail & Related papers (2022-10-15T18:18:55Z)
On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery has proposed to factorize the data generating process into a set of modules. We study the generalization and adaption performance of such modular neural causal models. Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
Deep Recurrent Modelling of Granger Causality with Latent Confounding [0.0]
We propose a deep learning-based approach to model non-linear Granger causality by directly accounting for latent confounders. We demonstrate the model performance on non-linear time series for which the latent confounder influences the cause and effect with different time lags.
arXiv Detail & Related papers (2022-02-23T03:26:22Z)
Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers. We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
A Biased Graph Neural Network Sampler with Near-Optimal Regret [57.70126763759996]
Graph neural networks (GNN) have emerged as a vehicle for applying deep network architectures to graph and relational data. In this paper, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem. We introduce a newly-designed reward function that introduces some degree of bias designed to reduce variance and avoid unstable, possibly-unbounded payouts.
arXiv Detail & Related papers (2021-03-01T15:55:58Z)
Neural Closure Models for Dynamical Systems [35.000303827255024]
We develop a novel methodology to learn non-Markovian closure parameterizations for low-fidelity models. New "neural closure models" augment low-fidelity models with neural delay differential equations (nDDEs) We show that using non-Markovian over Markovian closures improves long-term accuracy and requires smaller networks.
arXiv Detail & Related papers (2020-12-27T05:55:33Z)
Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function. We experimentally explore the training process of neural networks and detect overfitting. We find that the $L1$ and $L2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.