From Perfect to Noisy World Simulation: Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking
- URL: http://arxiv.org/abs/2406.16850v1
- Date: Mon, 24 Jun 2024 17:57:05 GMT
- Title: From Perfect to Noisy World Simulation: Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking
- Authors: Xiaohao Xu, Tianyi Zhang, Sibo Wang, Xiang Li, Yongqi Chen, Ye Li, Bhiksha Raj, Matthew Johnson-Roberson, Xiaonan Huang,
- Abstract summary: Embodied agents require robust navigation systems to operate in unstructured environments.
We propose a novel, customizable pipeline for noisy data synthesis.
Our analysis uncovers the susceptibilities of both neural (NeRF) and non-neural SLAM models to disturbances.
- Score: 32.52171076424419
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Embodied agents require robust navigation systems to operate in unstructured environments, making the robustness of Simultaneous Localization and Mapping (SLAM) models critical to embodied agent autonomy. While real-world datasets are invaluable, simulation-based benchmarks offer a scalable approach for robustness evaluations. However, the creation of a challenging and controllable noisy world with diverse perturbations remains under-explored. To this end, we propose a novel, customizable pipeline for noisy data synthesis, aimed at assessing the resilience of multi-modal SLAM models against various perturbations. The pipeline comprises a comprehensive taxonomy of sensor and motion perturbations for embodied multi-modal (specifically RGB-D) sensing, categorized by their sources and propagation order, allowing for procedural composition. We also provide a toolbox for synthesizing these perturbations, enabling the transformation of clean environments into challenging noisy simulations. Utilizing the pipeline, we instantiate the large-scale Noisy-Replica benchmark, which includes diverse perturbation types, to evaluate the risk tolerance of existing advanced RGB-D SLAM models. Our extensive analysis uncovers the susceptibilities of both neural (NeRF and Gaussian Splatting -based) and non-neural SLAM models to disturbances, despite their demonstrated accuracy in standard benchmarks. Our code is publicly available at https://github.com/Xiaohao-Xu/SLAM-under-Perturbation.
Related papers
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Addressing Misspecification in Simulation-based Inference through Data-driven Calibration [43.811367860375825]
Recent work has demonstrated that model misspecification can harm simulation-based inference's reliability.
This work introduces robust posterior estimation (ROPE), a framework that overcomes model misspecification with a small real-world calibration set of ground truth parameter measurements.
arXiv Detail & Related papers (2024-05-14T16:04:39Z) - Koopman-Based Surrogate Modelling of Turbulent Rayleigh-BĂ©nard Convection [4.248022697109535]
We use a Koopman-inspired architecture called the Linear Recurrent Autoencoder Network (LRAN) for learning reduced-order dynamics in convection flows.
A traditional fluid dynamics method, the Kernel Dynamic Mode Decomposition (KDMD) is used to compare the LRAN.
We obtained more accurate predictions with the LRAN than with KDMD in the most turbulent setting.
arXiv Detail & Related papers (2024-05-10T12:15:02Z) - Customizable Perturbation Synthesis for Robust SLAM Benchmarking [33.74471840597803]
We propose a novel, customizable pipeline for noisy data synthesis.
This pipeline incorporates customizable hardware setups, software components, and perturbed environments.
We instantiate the Robust-SLAM benchmark, which includes diverse perturbation types, to evaluate the risk tolerance of existing advanced SLAM models.
arXiv Detail & Related papers (2024-02-12T23:49:40Z) - DeNoising-MOT: Towards Multiple Object Tracking with Severe Occlusions [52.63323657077447]
We propose DNMOT, an end-to-end trainable DeNoising Transformer for multiple object tracking.
Specifically, we augment the trajectory with noises during training and make our model learn the denoising process in an encoder-decoder architecture.
We conduct extensive experiments on the MOT17, MOT20, and DanceTrack datasets, and the experimental results show that our method outperforms previous state-of-the-art methods by a clear margin.
arXiv Detail & Related papers (2023-09-09T04:40:01Z) - Realistic Noise Synthesis with Diffusion Models [68.48859665320828]
Deep image denoising models often rely on large amount of training data for the high quality performance.
We propose a novel method that synthesizes realistic noise using diffusion models, namely Realistic Noise Synthesize Diffusor (RNSD)
RNSD can incorporate guided multiscale content, such as more realistic noise with spatial correlations can be generated at multiple frequencies.
arXiv Detail & Related papers (2023-05-23T12:56:01Z) - HaDR: Applying Domain Randomization for Generating Synthetic Multimodal
Dataset for Hand Instance Segmentation in Cluttered Industrial Environments [0.0]
This study uses domain randomization to generate a synthetic RGB-D dataset for training multimodal instance segmentation models.
We show that our approach enables the models to outperform corresponding models trained on existing state-of-the-art datasets.
arXiv Detail & Related papers (2023-04-12T13:02:08Z) - Bridging the Gap Between Clean Data Training and Real-World Inference
for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a textitgap between clean data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedding into similar vector space.
Experiments on the widely-used dataset, Snips, and large scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms the baseline models on real-world (noisy) corpus but also enhances the robustness, that is, it produces high-quality results under a noisy environment.
arXiv Detail & Related papers (2021-04-13T17:54:33Z) - Anomaly Detection of Time Series with Smoothness-Inducing Sequential
Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z) - Crowd Counting via Hierarchical Scale Recalibration Network [61.09833400167511]
We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting.
HSRNet models rich contextual dependencies and recalibrating multiple scale-associated information.
Our approach can ignore various noises selectively and focus on appropriate crowd scales automatically.
arXiv Detail & Related papers (2020-03-07T10:06:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.