Related papers: SPAQ-DL-SLAM: Towards Optimizing Deep Learning-based SLAM for Resource-Constrained Embedded Platforms

SPAQ-DL-SLAM: Towards Optimizing Deep Learning-based SLAM for Resource-Constrained Embedded Platforms

URL: http://arxiv.org/abs/2409.14515v1
Date: Sun, 22 Sep 2024 16:19:47 GMT
Title: SPAQ-DL-SLAM: Towards Optimizing Deep Learning-based SLAM for Resource-Constrained Embedded Platforms
Authors: Niraj Pudasaini, Muhammad Abdullah Hanif, Muhammad Shafique,
Abstract summary: This paper presents a framework that applies Structured Pruning and Quantization (SPAQ) to the architecture of one of the state-ofthe-art DL-SLAM algorithms, DROID-SLAM. Our SPAQ-DROIDSLAM model, optimized version of DROID-SLAM model, achieves an 18.9% reduction in FLOPs and a 79.8% reduction in overall model size.
Score: 5.383130566626935
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Optimizing Deep Learning-based Simultaneous Localization and Mapping (DL-SLAM) algorithms is essential for efficient implementation on resource-constrained embedded platforms, enabling real-time on-board computation in autonomous mobile robots. This paper presents SPAQ-DL-SLAM, a framework that strategically applies Structured Pruning and Quantization (SPAQ) to the architecture of one of the state-ofthe-art DL-SLAM algorithms, DROID-SLAM, for resource and energy-efficiency. Specifically, we perform structured pruning with fine-tuning based on layer-wise sensitivity analysis followed by 8-bit post-training static quantization (PTQ) on the deep learning modules within DROID-SLAM. Our SPAQ-DROIDSLAM model, optimized version of DROID-SLAM model using our SPAQ-DL-SLAM framework with 20% structured pruning and 8-bit PTQ, achieves an 18.9% reduction in FLOPs and a 79.8% reduction in overall model size compared to the DROID-SLAM model. Our evaluations on the TUM-RGBD benchmark shows that SPAQ-DROID-SLAM model surpasses the DROID-SLAM model by an average of 10.5% on absolute trajectory error (ATE) metric. Additionally, our results on the ETH3D SLAM training benchmark demonstrate enhanced generalization capabilities of the SPAQ-DROID-SLAM model, seen by a higher Area Under the Curve (AUC) score and success in 2 additional data sequences compared to the DROIDSLAM model. Despite these improvements, the model exhibits performance variance on the distinct Vicon Room sequences from the EuRoC dataset, which are captured at high angular velocities. This varying performance at some distinct scenarios suggests that designing DL-SLAM algorithms taking operating environments and tasks in consideration can achieve optimal performance and resource efficiency for deployment in resource-constrained embedded platforms.

Related papers

Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs [51.21041884010009]
Ring-lite is a Mixture-of-Experts (MoE)-based large language model optimized via reinforcement learning (RL)<n>Our approach matches the performance of state-of-the-art (SOTA) small-scale reasoning models on challenging benchmarks.
arXiv Detail & Related papers (2025-06-17T17:12:34Z)
Dynamic Acoustic Model Architecture Optimization in Training for ASR [51.21112094223223]
DMAO is an architecture optimization framework that employs a grow-and-drop strategy to automatically reallocate parameters during training.<n>We evaluate DMAO through experiments with CTC onSpeech, TED-LIUM-v2 and Switchboard datasets.
arXiv Detail & Related papers (2025-06-16T07:47:34Z)
Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution [19.22142805041799]
Convolutional neural networks (CNNs) have been widely used in efficient image super-resolution. We propose Distillation-Supervised Convolutional Low-Rank Adaptation (DSCLoRA), which improves model performance without increasing architectural complexity or inference costs.
arXiv Detail & Related papers (2025-04-15T15:12:57Z)
Streaming Looking Ahead with Token-level Self-reward [50.699168440048716]
We propose a policy model with token-level self-reward modeling (TRM) capability to eliminate the need for external models and extra communication. In addition, we propose a streaming-looking-ahead (SLA) algorithm to further boost search efficiency with better parallelization. If we combine SLA with reinforcement fine-tuning techniques such as DPO, SLA achieves an overall win rate of 89.4%.
arXiv Detail & Related papers (2025-02-24T22:35:53Z)
DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs [46.443316184807145]
We introduce Dynamic Layer Operations (DLO), a novel approach for vertically scaling transformer-based Large Language Models (LLMs) Unlike traditional Mixture-of-Experts (MoE) methods that focus on extending the model width, our approach targets model depth, addressing the redundancy observed across layer representations for various input samples. Experimental results demonstrate that DLO not only outperforms the original unscaled models but also achieves comparable results to densely expanded models with significantly improved efficiency.
arXiv Detail & Related papers (2024-07-03T18:34:08Z)
Can AI be enabled to dynamical downscaling? A Latent Diffusion Model to mimic km-scale COSMO5.0\_CLM9 simulations [0.0]
Downscaling techniques are one of the most prominent applications of Deep Learning (DL) in Earth System Modeling. In this study, we apply a Latent Diffusion Model (LDM) to downscale ERA5 data over Italy up to a resolution of 2 km. Our goal is to demonstrate that recent advancements in generative modeling enable DL to deliver results comparable to those of numerical dynamical models.
arXiv Detail & Related papers (2024-06-19T15:20:28Z)
DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking and Loop-Closing [13.50980509878613]
Experimental evaluations on publicly available datasets demonstrate that DK-SLAM outperforms leading traditional and learning based SLAM systems. Our system employs a Model-Agnostic Meta-Learning (MAML) strategy to optimize the training of keypoint extraction networks. To mitigate cumulative positioning errors, DK-SLAM incorporates a novel online learning module that utilizes binary features for loop closure detection.
arXiv Detail & Related papers (2024-01-17T12:08:30Z)
A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization [54.113083217869516]
In this work, we first explore the computational redundancy part of the network. We then prune the redundancy blocks of the model and maintain the network performance. Thirdly, we propose a global-regional interactive (GRI) attention to speed up the computationally intensive attention part.
arXiv Detail & Related papers (2023-12-24T15:37:47Z)
A Neuromorphic Architecture for Reinforcement Learning from Real-Valued Observations [0.34410212782758043]
Reinforcement Learning (RL) provides a powerful framework for decision-making in complex environments. This paper presents a novel Spiking Neural Network (SNN) architecture for solving RL problems with real-valued observations.
arXiv Detail & Related papers (2023-07-06T12:33:34Z)
Scaling Pre-trained Language Models to Deeper via Parameter-efficient Architecture [68.13678918660872]
We design a more capable parameter-sharing architecture based on matrix product operator (MPO) MPO decomposition can reorganize and factorize the information of a parameter matrix into two parts. Our architecture shares the central tensor across all layers for reducing the model size.
arXiv Detail & Related papers (2023-03-27T02:34:09Z)
Using Detection, Tracking and Prediction in Visual SLAM to Achieve Real-time Semantic Mapping of Dynamic Scenarios [70.70421502784598]
RDS-SLAM can build semantic maps at object level for dynamic scenarios in real time using only one commonly used Intel Core i7 CPU. We evaluate RDS-SLAM in TUM RGB-D dataset, and experimental results show that RDS-SLAM can run with 30.3 ms per frame in dynamic scenarios.
arXiv Detail & Related papers (2022-10-10T11:03:32Z)
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective [142.36200080384145]
We propose a single objective which jointly optimize a latent-space model and policy to achieve high returns while remaining self-consistent. We demonstrate that the resulting algorithm matches or improves the sample-efficiency of the best prior model-based and model-free RL methods.
arXiv Detail & Related papers (2022-09-18T03:51:58Z)
Accelerating Deep Learning Model Inference on Arm CPUs with Ultra-Low Bit Quantization and Runtime [57.5143536744084]
High performance of deep learning models comes at the expense of high computational, storage and power requirements. We introduce Deeplite Neutrino for production-ready optimization of the models and Deeplite for deployment of ultra-low bit quantized models on Arm-based platforms.
arXiv Detail & Related papers (2022-07-18T15:05:17Z)
ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware. The proposed methodology extracts a set of models from micro- kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
A Hybrid Learner for Simultaneous Localization and Mapping [2.1041384320978267]
Simultaneous localization and mapping (SLAM) is used to predict the dynamic motion path of a moving platform. This work introduces a hybrid learning model that explores beyond feature fusion. It carries out weight enhancement of the front end feature extractor of the SLAM via mutation of different deep networks' top layers. The trajectory predictions from independently trained models are amalgamated to refine the location detail.
arXiv Detail & Related papers (2021-01-04T18:41:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.