Inducing, Detecting and Characterising Neural Modules: A Pipeline for Functional Interpretability in Reinforcement Learning
- URL: http://arxiv.org/abs/2501.17077v2
- Date: Mon, 02 Jun 2025 10:38:54 GMT
- Title: Inducing, Detecting and Characterising Neural Modules: A Pipeline for Functional Interpretability in Reinforcement Learning
- Authors: Anna Soligo, Pietro Ferraro, David Boyle,
- Abstract summary: We show how encouraging sparsity and locality in network weights leads to the emergence of functional modules in RL policy networks.<n>Applying these methods to 2D and 3D MiniGrid environments reveals the consistent emergence of distinct navigational modules for different axes.
- Score: 1.597617022056624
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Interpretability is crucial for ensuring RL systems align with human values. However, it remains challenging to achieve in complex decision making domains. Existing methods frequently attempt interpretability at the level of fundamental model units, such as neurons or decision nodes: an approach which scales poorly to large models. Here, we instead propose an approach to interpretability at the level of functional modularity. We show how encouraging sparsity and locality in network weights leads to the emergence of functional modules in RL policy networks. To detect these modules, we develop an extended Louvain algorithm which uses a novel `correlation alignment' metric to overcome the limitations of standard network analysis techniques when applied to neural network architectures. Applying these methods to 2D and 3D MiniGrid environments reveals the consistent emergence of distinct navigational modules for different axes, and we further demonstrate how these functions can be validated through direct interventions on network weights prior to inference.
Related papers
- Neural Network Reprogrammability: A Unified Theme on Model Reprogramming, Prompt Tuning, and Prompt Instruction [55.914891182214475]
We introduce neural network reprogrammability as a unifying framework for model adaptation.<n>We present a taxonomy that categorizes such information manipulation approaches across four key dimensions.<n>We also analyze remaining technical challenges and ethical considerations.
arXiv Detail & Related papers (2025-06-05T05:42:27Z) - Certified Neural Approximations of Nonlinear Dynamics [52.79163248326912]
In safety-critical contexts, the use of neural approximations requires formal bounds on their closeness to the underlying system.<n>We propose a novel, adaptive, and parallelizable verification method based on certified first-order models.
arXiv Detail & Related papers (2025-05-21T13:22:20Z) - SymDQN: Symbolic Knowledge and Reasoning in Neural Network-based Reinforcement Learning [0.0]
We introduce SymDQN, a novel modular approach that augments the existing DuelDQN architecture.
The modules guide action policy learning and allow reinforcement learning agents to display behaviour consistent with reasoning about the environment.
We show that our architecture significantly improves learning, both in terms of performance and the precision of the agent.
arXiv Detail & Related papers (2025-04-03T14:51:11Z) - Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity [51.40558987254471]
Real-world applications of reinforcement learning often involve environments where agents operate on complex, high-dimensional observations.
This paper addresses the question of reinforcement learning under $textitgeneral$ latent dynamics from a statistical and algorithmic perspective.
arXiv Detail & Related papers (2024-10-23T14:22:49Z) - Spatial embedding promotes a specific form of modularity with low entropy and heterogeneous spectral dynamics [0.0]
Spatially embedded recurrent neural networks provide a promising avenue to study how modelled constraints shape the combined structural and functional organisation of networks over learning.
We show that it is possible to study these restrictions through entropic measures of the neural weights and eigenspectrum, across both rate and spiking neural networks.
This work deepens our understanding of constrained learning in neural networks, across coding schemes and tasks, where solutions to simultaneous structural and functional objectives must be accomplished in tandem.
arXiv Detail & Related papers (2024-09-26T10:00:05Z) - GFN: A graph feedforward network for resolution-invariant reduced operator learning in multifidelity applications [0.0]
This work presents a novel resolution-invariant model order reduction strategy for multifidelity applications.
We base our architecture on a novel neural network layer developed in this work, the graph feedforward network.
We exploit the method's capability of training and testing on different mesh sizes in an autoencoder-based reduction strategy for parametrised partial differential equations.
arXiv Detail & Related papers (2024-06-05T18:31:37Z) - Safe Deep Model-Based Reinforcement Learning with Lyapunov Functions [2.50194939587674]
We propose a new Model-based RL framework to enable efficient policy learning with unknown dynamics.
We introduce and explore a novel method for adding safety constraints for model-based RL during training and policy learning.
arXiv Detail & Related papers (2024-05-25T11:21:12Z) - Self-Supervised Interpretable End-to-End Learning via Latent Functional Modularity [2.163881720692685]
MoNet is a novel functionally modular network for self-supervised and interpretable end-to-end learning.
In real-world indoor environments, MoNet demonstrates effective visual autonomous navigation, outperforming baseline models by 7% to 28%.
arXiv Detail & Related papers (2024-02-21T15:17:20Z) - Foundations of Reinforcement Learning and Interactive Decision Making [81.76863968810423]
We present a unifying framework for addressing the exploration-exploitation dilemma using frequentist and Bayesian approaches.
Special attention is paid to function approximation and flexible model classes such as neural networks.
arXiv Detail & Related papers (2023-12-27T21:58:45Z) - A Neuromorphic Architecture for Reinforcement Learning from Real-Valued
Observations [0.34410212782758043]
Reinforcement Learning (RL) provides a powerful framework for decision-making in complex environments.
This paper presents a novel Spiking Neural Network (SNN) architecture for solving RL problems with real-valued observations.
arXiv Detail & Related papers (2023-07-06T12:33:34Z) - Harmonizing Feature Attributions Across Deep Learning Architectures:
Enhancing Interpretability and Consistency [2.2237337682863125]
This study examines the generalization of feature attributions across various deep learning architectures.
We aim to develop a more coherent and optimistic understanding of feature attributions.
Our findings highlight the potential for harmonized feature attribution methods to improve interpretability and foster trust in machine learning applications.
arXiv Detail & Related papers (2023-07-05T09:46:41Z) - ConCerNet: A Contrastive Learning Based Framework for Automated
Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of the DNN based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z) - Interpreting Neural Policies with Disentangled Tree Representations [58.769048492254555]
We study interpretability of compact neural policies through the lens of disentangled representation.
We leverage decision trees to obtain factors of variation for disentanglement in robot learning.
We introduce interpretability metrics that measure disentanglement of learned neural dynamics.
arXiv Detail & Related papers (2022-10-13T01:10:41Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z) - Towards Understanding the Link Between Modularity and Performance in Neural Networks for Reinforcement Learning [2.038038953957366]
We find that the amount of network modularity for optimal performance is likely entangled in complex relationships between many other features of the network and problem environment.
We used a classic neuroevolutionary algorithm which enables rich, automatic optimisation and exploration of neural network architectures.
arXiv Detail & Related papers (2022-05-13T05:18:18Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical representation of tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Meta-learning using privileged information for dynamics [66.32254395574994]
We extend the Neural ODE Process model to use additional information within the Learning Using Privileged Information setting.
We validate our extension with experiments showing improved accuracy and calibration on simulated dynamics tasks.
arXiv Detail & Related papers (2021-04-29T12:18:02Z) - Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
arXiv Detail & Related papers (2020-10-27T13:17:18Z) - Are Neural Nets Modular? Inspecting Functional Modularity Through
Differentiable Weight Masks [10.0444013205203]
Understanding if and how NNs are modular could provide insights into how to improve them.
Current inspection methods, however, fail to link modules to their functionality.
arXiv Detail & Related papers (2020-10-05T15:04:11Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs)
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.