Related papers: TivNe-SLAM: Dynamic Mapping and Tracking via Time-Varying Neural Radiance Fields

URL: http://arxiv.org/abs/2310.18917v4
Date: Mon, 18 Mar 2024 03:37:31 GMT
Title: TivNe-SLAM: Dynamic Mapping and Tracking via Time-Varying Neural Radiance Fields
Authors: Chengyao Duan, Zhiliu Yang
Abstract summary: We propose a time-varying representation to track and reconstruct dynamic scenes. For the tracking process, all input images are uniformly sampled, then progressively trained in a self-supervised paradigm. Experiments validate our method when compared to existing state-of-the-art NeRF-based methods.
Score: 0.1227734309612871
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Previous attempts to integrate Neural Radiance Fields (NeRF) into the Simultaneous Localization and Mapping (SLAM) framework either rely on the assumption of static scenes or require ground-truth camera poses, which impedes their application in real-world scenarios. In this paper, we propose a time-varying representation to track and reconstruct dynamic scenes. First, our framework maintains two processes simultaneously: a tracking process and a mapping process. For the tracking process, all input images are uniformly sampled, then progressively trained in a self-supervised paradigm. For the mapping process, we leverage motion masks to distinguish dynamic objects from the static background, and sample more pixels from dynamic areas. Second, the parameter optimization for both processes consists of two stages: the first stage associates time with 3D positions to convert the deformation field to the canonical field, and the second stage associates time with the embeddings of the canonical field to obtain colors and a Signed Distance Function (SDF). Lastly, we propose a novel keyframe selection strategy based on the overlapping rate. We evaluate our approach on two synthetic datasets and one real-world dataset. The experiments validate that our method achieves competitive results in both tracking and mapping when compared to existing state-of-the-art NeRF-based methods.
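
The mapping step in the abstract (use a motion mask to separate dynamic from static pixels, then sample more pixels from dynamic areas) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_pixels`, the `dynamic_fraction` parameter, and its default of 0.7 are assumptions made for illustration, since the abstract does not state a sampling ratio.

```python
import numpy as np


def sample_pixels(motion_mask, n_samples, dynamic_fraction=0.7, rng=None):
    """Return (rows, cols) of sampled pixels, biased toward dynamic regions.

    motion_mask: 2D array, truthy where a pixel is marked as dynamic.
    dynamic_fraction: hypothetical share of samples drawn from dynamic pixels
        (an illustrative assumption, not a value taken from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = np.asarray(motion_mask, dtype=bool)
    flat = mask.ravel()
    dyn_idx = np.flatnonzero(flat)    # flat indices of dynamic pixels
    sta_idx = np.flatnonzero(~flat)   # flat indices of static pixels

    if dyn_idx.size == 0 or sta_idx.size == 0:
        # Only one kind of pixel is present: sample uniformly over the image.
        chosen = rng.choice(flat.size, size=n_samples, replace=True)
    else:
        n_dyn = int(round(n_samples * dynamic_fraction))
        chosen = np.concatenate([
            rng.choice(dyn_idx, size=n_dyn, replace=True),
            rng.choice(sta_idx, size=n_samples - n_dyn, replace=True),
        ])
    # Convert flat indices back to (row, col) coordinates.
    return np.unravel_index(chosen, mask.shape)
```

With a mask whose dynamic region covers a small part of the image, most samples still land inside that region, which is the bias the abstract describes. Sampling with replacement keeps the sketch short; a real pipeline might sample without replacement or weight pixels by other criteria.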

Related papers

LocalDyGS: Multi-view Global Dynamic Scene Modeling via Adaptive Local Implicit Feature Decoupling [33.71658540929536]
LocalDyGS is a novel method to model dynamic videos from multi-view inputs for arbitrary viewpoints. Our method is competitive on various fine-scale datasets compared to state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2025-07-03T06:50:33Z)
DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion [53.70278210626701]
We propose a data-driven multi-view reasoning approach that directly infers 3D scene geometry and camera poses from multi-view images. Our framework, DiffusionSfM, parameterizes scene geometry and cameras as pixel-wise ray origins and endpoints in a global frame. We empirically validate DiffusionSfM on both synthetic and real datasets, demonstrating that it outperforms classical and learning-based approaches.
arXiv Detail & Related papers (2025-05-08T17:59:47Z)
Learning Appearance and Motion Cues for Panoptic Tracking [13.062016289815057]
Panoptic tracking enables pixel-level scene interpretation of videos by integrating instance tracking in panoptic segmentation. We propose a novel approach for panoptic tracking that simultaneously captures information and instance-specific appearance and motion features.
arXiv Detail & Related papers (2025-03-12T09:32:29Z)
NVFi: Neural Velocity Fields for 3D Physics Learning from Dynamic Videos [8.559809421797784]
We propose to simultaneously learn the geometry, appearance, and physical velocity of 3D scenes only from video frames. We conduct extensive experiments on multiple datasets, demonstrating the superior performance of our method over all baselines.
arXiv Detail & Related papers (2023-12-11T14:07:31Z)
DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation. Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details. Our experimental results achieve state-of-the-art performance on both synthetic data and real-world data tracking.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision [85.17951804790515]
EmerNeRF is a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes. It simultaneously captures scene geometry, appearance, motion, and semantics via self-bootstrapping. Our method achieves state-of-the-art performance in sensor simulation.
arXiv Detail & Related papers (2023-11-03T17:59:55Z)
DynaMoN: Motion-Aware Fast and Robust Camera Localization for Dynamic Neural Radiance Fields [71.94156412354054]
We propose Dynamic Motion-Aware Fast and Robust Camera Localization for Dynamic Neural Radiance Fields (DynaMoN). DynaMoN handles dynamic content for initial camera pose estimation and statics-focused ray sampling for fast and accurate novel-view synthesis. We extensively evaluate our approach on two real-world dynamic datasets, the TUM RGB-D dataset and the BONN RGB-D Dynamic dataset.
arXiv Detail & Related papers (2023-09-16T08:46:59Z)
3D Multi-Object Tracking with Differentiable Pose Estimation [0.0]
We propose a novel approach for joint 3D multi-object tracking and reconstruction from RGB-D sequences in indoor environments. We leverage those correspondences to inform a graph neural network to solve for the optimal, temporally-consistent 7-DoF pose trajectories of all objects. Our method improves the accumulated MOTA score for all test sequences by 24.8% over existing state-of-the-art methods.
arXiv Detail & Related papers (2022-06-28T06:46:32Z)
Neural Deformable Voxel Grid for Fast Optimization of Dynamic View Synthesis [63.25919018001152]
We propose a fast deformable radiance field method to handle dynamic scenes. Our method achieves comparable performance to D-NeRF using only 20 minutes for training.
arXiv Detail & Related papers (2022-06-15T17:49:08Z)
TimeLens: Event-based Video Frame Interpolation [54.28139783383213]
We introduce Time Lens, a novel method that leverages the advantages of both synthesis-based and flow-based approaches. We show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods.
arXiv Detail & Related papers (2021-06-14T10:33:47Z)
CDN-MEDAL: Two-stage Density and Difference Approximation Framework for Motion Analysis [3.337126420148156]
We propose a novel, two-stage method of change detection with two convolutional neural networks. Our two-stage framework contains approximately 3.5K parameters in total but still maintains rapid convergence to intricate motion patterns.
arXiv Detail & Related papers (2021-06-07T16:39:42Z)
Deep Learning based Virtual Point Tracking for Real-Time Target-less Dynamic Displacement Measurement in Railway Applications [0.0]
We propose virtual point tracking for real-time target-less dynamic displacement measurement, incorporating deep learning techniques and domain knowledge. We demonstrate our approach for a railway application, where the lateral displacement of the wheel on the rail is measured during operation.
arXiv Detail & Related papers (2021-01-17T16:19:47Z)
DOT: Dynamic Object Tracking for Visual SLAM [83.69544718120167]
DOT combines instance segmentation and multi-view geometry to generate masks for dynamic objects. To determine which objects are actually moving, DOT first segments instances of potentially dynamic objects and then, with the estimated camera motion, tracks such objects by minimizing the photometric reprojection error. Our results show that our approach significantly improves the accuracy and robustness of ORB-SLAM 2, especially in highly dynamic scenes.
arXiv Detail & Related papers (2020-09-30T18:36:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.