MIND-Stack: Modular, Interpretable, End-to-End Differentiability for Autonomous Navigation
- URL: http://arxiv.org/abs/2505.21734v1
- Date: Tue, 27 May 2025 20:23:21 GMT
- Title: MIND-Stack: Modular, Interpretable, End-to-End Differentiability for Autonomous Navigation
- Authors: Felix Jahncke, Johannes Betz
- Abstract summary: We present MIND-Stack, a modular software stack consisting of a localization network and a Stanley Controller. Our approach enables the upstream localization module to reduce the downstream control error, extending its role beyond state estimation. We conduct experiments that demonstrate the ability of the localization module to reduce the downstream control loss through its end-to-end differentiability.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing robust, efficient navigation algorithms is challenging. Rule-based methods offer interpretability and modularity but struggle to learn from large datasets, while end-to-end neural networks excel at learning but lack transparency and modularity. In this paper, we present MIND-Stack, a modular software stack consisting of a localization network and a Stanley Controller with intermediate human-interpretable state representations and end-to-end differentiability. Our approach enables the upstream localization module to reduce the downstream control error, extending its role beyond state estimation. Unlike existing work on differentiable algorithms, which either lacks the modules needed to span from sensor input to actuator output or lacks a real-world implementation, MIND-Stack offers both. We conduct experiments that demonstrate the ability of the localization module to reduce the downstream control loss through its end-to-end differentiability, while offering better performance than state-of-the-art algorithms. We showcase sim-to-real capability by deploying the algorithm on a real-world embedded autonomous platform with limited computational power, and we demonstrate simultaneous training of the localization module and the controller toward a shared objective. While MIND-Stack already shows good results, we discuss incorporating additional modules from the autonomous navigation pipeline in future iterations of the framework, which promises even greater stability and performance.
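The abstract's central mechanism, backpropagating the downstream control loss through a differentiable Stanley controller into the upstream localization network, is easier to see in code. The Stanley law computes the steering angle as delta = heading_error + arctan(k * cross_track_error / velocity). Below is a minimal PyTorch sketch of this coupling; the module names, network sizes, and dummy training data are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a differentiable Stanley controller downstream of a
# learned localization network, so the control loss can backpropagate into the
# pose estimate. Names, shapes, and data are assumptions, not the paper's code.
import torch
import torch.nn as nn


class LocalizationNet(nn.Module):
    """Maps sensor features to a pose estimate (x, y, heading); a stand-in
    for the paper's localization network."""

    def __init__(self, sensor_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(sensor_dim, 128), nn.ReLU(), nn.Linear(128, 3)
        )

    def forward(self, sensor_features: torch.Tensor) -> torch.Tensor:
        return self.net(sensor_features)  # (batch, 3)


def stanley_steering(pose, path_point, path_heading, velocity, k=1.0):
    """Differentiable Stanley law: delta = heading_error + atan(k * e / v).
    Built from torch primitives only, so gradients flow through it."""
    x, y, heading = pose[:, 0], pose[:, 1], pose[:, 2]
    dx, dy = x - path_point[:, 0], y - path_point[:, 1]
    # Signed cross-track error: the offset projected onto the path normal.
    cross_track = -dx * torch.sin(path_heading) + dy * torch.cos(path_heading)
    heading_error = path_heading - heading
    return heading_error + torch.atan2(k * cross_track, velocity)


# One end-to-end training step: the localization network is optimized to
# reduce the *downstream* control error, not only its own estimation loss.
loc_net = LocalizationNet()
optimizer = torch.optim.Adam(loc_net.parameters(), lr=1e-3)

sensor_features = torch.randn(8, 64)   # dummy sensor batch
path_point = torch.zeros(8, 2)         # nearest reference-path points
path_heading = torch.zeros(8)          # reference headings
velocity = torch.full((8,), 5.0)       # vehicle speed in m/s
target_steering = torch.zeros(8)       # e.g. steering labels from a reference

optimizer.zero_grad()
pose = loc_net(sensor_features)
steering = stanley_steering(pose, path_point, path_heading, velocity)
control_loss = torch.mean((steering - target_steering) ** 2)
control_loss.backward()                # gradients reach the localization weights
optimizer.step()
```

Because every operation in stanley_steering is differentiable, the gradient of the control loss flows back into LocalizationNet, so the pose estimator learns to be accurate in the way that matters most for control, which is the coupling the paper exploits.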
Related papers
- LangTraj: Diffusion Model and Dataset for Language-Conditioned Trajectory Simulation
LangTraj is a language-conditioned scene-diffusion model that simulates the joint behavior of all agents in traffic scenarios. By conditioning on natural language inputs, LangTraj provides flexible and intuitive control over interactive behaviors. LangTraj demonstrates strong performance in realism, language controllability, and language-conditioned safety-critical simulation.
arXiv Detail & Related papers (2025-04-15T17:14:06Z)
- Inducing, Detecting and Characterising Neural Modules: A Pipeline for Functional Interpretability in Reinforcement Learning
We show how encouraging sparsity and locality in network weights leads to the emergence of functional modules in RL policy networks. Applying these methods to 2D and 3D MiniGrid environments reveals the consistent emergence of distinct navigational modules for different axes.
arXiv Detail & Related papers (2025-01-28T17:02:16Z)
- Self-Supervised Interpretable End-to-End Learning via Latent Functional Modularity
MoNet is a novel functionally modular network for self-supervised and interpretable end-to-end learning.
In real-world indoor environments, MoNet demonstrates effective visual autonomous navigation, outperforming baseline models by 7% to 28%.
arXiv Detail & Related papers (2024-02-21T15:17:20Z)
- LLM4Drive: A Survey of Large Language Models for Autonomous Driving
Large language models (LLMs) have demonstrated abilities including understanding context, reasoning logically, and generating answers.
In this paper, we systematically review the research line on Large Language Models for Autonomous Driving (LLM4AD).
arXiv Detail & Related papers (2023-11-02T07:23:33Z)
- Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models
We present an end-to-end, open-set (any environment/scene) autonomous driving approach capable of producing driving decisions from representations queryable by image and text.
Our approach demonstrates unparalleled results in diverse tests while achieving significantly greater robustness in out-of-distribution situations.
arXiv Detail & Related papers (2023-10-26T17:56:35Z)
- Modular Deep Learning
Transfer learning has recently become the dominant paradigm of machine learning.
It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference.
Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z)
- DiffStack: A Differentiable and Modular Control Stack for Autonomous Vehicles
We present DiffStack, a differentiable and modular stack for prediction, planning, and control.
Our results on the nuScenes dataset indicate that end-to-end training with DiffStack yields substantial improvements in open-loop and closed-loop planning metrics.
arXiv Detail & Related papers (2022-12-13T09:05:21Z)
- A unified software/hardware scalable architecture for brain-inspired computing based on self-organizing neural models
We develop an original brain-inspired neural model combining Self-Organizing Maps (SOMs) and Hebbian learning in the Reentrant SOM (ReSOM) model.
This work also demonstrates the distributed and scalable nature of the model through both simulation results and hardware execution on a dedicated FPGA-based platform.
arXiv Detail & Related papers (2022-01-06T22:02:19Z)
- Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration
We propose ALOE, a new algorithm for learning conditional and unconditional EBMs for discrete structured data.
We show that the energy function and sampler can be trained efficiently via a new variational form of power iteration.
We present an energy-model-guided fuzzer for software testing that achieves performance comparable to well-engineered fuzzing engines like libFuzzer.
arXiv Detail & Related papers (2020-11-10T19:31:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.