Empty Cities: a Dynamic-Object-Invariant Space for Visual SLAM
- URL: http://arxiv.org/abs/2010.07646v1
- Date: Thu, 15 Oct 2020 10:31:12 GMT
- Title: Empty Cities: a Dynamic-Object-Invariant Space for Visual SLAM
- Authors: Berta Bescos, Cesar Cadena, Jose Neira
- Abstract summary: We present a data-driven approach to obtain the static image of a scene, eliminating dynamic objects that might have been present at the time of traversing the scene with a camera.
We introduce an end-to-end deep learning framework to turn images of an urban environment into realistic static frames suitable for localization and mapping.
- Score: 6.693607456009373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we present a data-driven approach to obtain the static image of
a scene, eliminating dynamic objects that might have been present at the time
of traversing the scene with a camera. The general objective is to improve
vision-based localization and mapping tasks in dynamic environments, where the
presence (or absence) of different dynamic objects in different moments makes
these tasks less robust. We introduce an end-to-end deep learning framework to
turn images of an urban environment that include dynamic content, such as
vehicles or pedestrians, into realistic static frames suitable for localization
and mapping. This objective faces two main challenges: detecting the dynamic
objects, and inpainting the occluded static background. The first challenge is
addressed by the use of a convolutional network that learns a multi-class
semantic segmentation of the image. The second challenge is approached with a
generative adversarial model that, taking as input the original dynamic image
and the computed dynamic/static binary mask, is capable of generating the final
static image. This framework makes use of two new losses, one based on image
steganalysis techniques, useful to improve the inpainting quality, and another
one based on ORB features, designed to enhance feature matching between real
and hallucinated image regions. To validate our approach, we perform an
extensive evaluation on different tasks that are affected by dynamic entities,
i.e., visual odometry, place recognition and multi-view stereo, with the
hallucinated images. Code has been made available at
https://github.com/bertabescos/EmptyCities_SLAM.
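As a rough sketch of the two-stage pipeline the abstract describes, the following turns a multi-class segmentation into the dynamic/static binary mask and conditions the generator on it. Module names, class ids, and shapes are illustrative assumptions; the actual models live in the repository above.

```python
import torch

# Hypothetical class ids for dynamic categories (e.g., person, rider, car).
DYNAMIC_CLASSES = (11, 12, 13)

def hallucinate_static_frame(image, seg_net, generator):
    """image: (1, 3, H, W) tensor; seg_net: multi-class semantic
    segmentation CNN; generator: inpainting GAN generator that takes
    the dynamic image concatenated with a dynamic/static binary mask."""
    with torch.no_grad():
        labels = seg_net(image).argmax(dim=1, keepdim=True)  # (1, 1, H, W)
        mask = torch.zeros_like(labels, dtype=image.dtype)
        for c in DYNAMIC_CLASSES:
            mask = torch.where(labels == c, torch.ones_like(mask), mask)
        # The generator sees the original frame plus the mask (4 channels).
        return generator(torch.cat([image, mask], dim=1))
```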
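The ORB-feature loss itself is a training-time objective; as a non-differentiable stand-in, one can gauge how "matchable" a hallucinated frame is against the real one with plain OpenCV. This is an illustrative check, not the paper's implementation.

```python
import cv2
import numpy as np

def orb_consistency_score(real_bgr, inpainted_bgr, max_matches=100):
    """Mean Hamming distance of the best ORB matches between a real
    frame and its hallucinated static counterpart; lower means the
    inpainted regions are easier to match for feature-based SLAM."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(cv2.cvtColor(real_bgr, cv2.COLOR_BGR2GRAY), None)
    kp2, des2 = orb.detectAndCompute(cv2.cvtColor(inpainted_bgr, cv2.COLOR_BGR2GRAY), None)
    if des1 is None or des2 is None:
        return float("inf")  # no features to compare
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    if not matches:
        return float("inf")
    return float(np.mean([m.distance for m in matches[:max_matches]]))
```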
Related papers
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
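A minimal sketch of per-object occupancy over a voxel grid, assuming K object slots scored on a (D, H, W) grid; names and shapes are illustrative, not DynaVol-S's actual code.

```python
import torch

def object_centric_voxelize(slot_logits):
    """slot_logits: (K, D, H, W) raw scores for K object slots over a
    voxel grid. A softmax across slots yields per-object occupancy
    probabilities at each location, one disentangled grid per object."""
    probs = torch.softmax(slot_logits, dim=0)  # (K, D, H, W), sums to 1
    return [probs[k] for k in range(probs.shape[0])]
```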
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation [126.12940972028012]
We present HVC, a framework for self-supervised video object segmentation.
HVC extracts pseudo-dynamic signals from static images, enabling an efficient and scalable VOS model.
We propose a hybrid visual correspondence loss to learn joint static and dynamic consistency representations.
arXiv Detail & Related papers (2024-04-21T02:21:30Z)
- ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) methods attempt to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images and videos, i.e., zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
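One simple way to realize zoom-style in-and-out inference is multi-scale prediction averaging; the sketch below is a hedged illustration of that idea, not ZoomNeXt's actual collaborative pyramid architecture.

```python
import torch
import torch.nn.functional as F

def zoom_inference(model, image, scales=(0.5, 1.0, 1.5)):
    """Run a segmentation model at several zoom levels and average the
    logits after resizing back to the input resolution.
    image: (1, 3, H, W); model returns (1, 1, H', W') logits."""
    _, _, h, w = image.shape
    preds = []
    for s in scales:
        x = F.interpolate(image, scale_factor=s, mode="bilinear",
                          align_corners=False)
        y = model(x)
        preds.append(F.interpolate(y, size=(h, w), mode="bilinear",
                                   align_corners=False))
    return torch.stack(preds).mean(dim=0)
```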
arXiv Detail & Related papers (2023-10-31T06:11:23Z)
- DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2022-11-20T20:57:02Z)
- D$^2$NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from a Monocular Video [23.905013304668426]
Given a monocular video, segmenting and decoupling dynamic objects while recovering the static environment is a widely studied problem in machine intelligence.
We introduce Decoupled Dynamic Neural Radiance Field (D$^2$NeRF), a self-supervised approach that takes a monocular video and learns a 3D scene representation.
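A common way to composite a static field and a dynamic field at the same 3D sample is to sum their densities and density-weight their colors; the sketch below shows that generic two-field scheme, not necessarily D$^2$NeRF's exact formulation.

```python
import torch

def composite_two_fields(sigma_static, sigma_dyn, rgb_static, rgb_dyn):
    """sigma_*: (N,) densities; rgb_*: (N, 3) colors at the same samples.
    Total density is the sum; color is the density-weighted average."""
    sigma = sigma_static + sigma_dyn
    rgb = (sigma_static.unsqueeze(-1) * rgb_static +
           sigma_dyn.unsqueeze(-1) * rgb_dyn) / (sigma.unsqueeze(-1) + 1e-8)
    return sigma, rgb
```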
arXiv Detail & Related papers (2022-05-31T14:41:24Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- Multi-modal Visual Place Recognition in Dynamics-Invariant Perception Space [23.43468556831308]
This letter explores the use of multi-modal fusion of semantic and visual modalities to improve place recognition in dynamic environments.
We achieve this by first designing a novel deep learning architecture to generate the static semantic segmentation.
We then leverage the spatial-pyramid-matching model to encode the static semantic segmentation into feature vectors.
In parallel, the static image is encoded using the popular Bag-of-words model.
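For reference, a minimal Bag-of-Words encoder over a precomputed visual vocabulary might look like the following; this is illustrative, not the letter's implementation.

```python
import numpy as np

def bow_encode(descriptors, vocabulary):
    """descriptors: (N, D) local features from the static image;
    vocabulary: (K, D) visual-word centroids (e.g., from k-means).
    Returns an L2-normalized K-dim Bag-of-Words histogram."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)  # nearest visual word per feature
    hist = np.bincount(words, minlength=len(vocabulary)).astype(np.float64)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```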
arXiv Detail & Related papers (2021-05-17T13:14:52Z)
- DOT: Dynamic Object Tracking for Visual SLAM [83.69544718120167]
DOT combines instance segmentation and multi-view geometry to generate masks for dynamic objects.
To determine which objects are actually moving, DOT first segments instances of potentially dynamic objects and then, using the estimated camera motion, tracks them by minimizing the photometric reprojection error.
Our results show that our approach significantly improves the accuracy and robustness of ORB-SLAM 2, especially in highly dynamic scenes.
arXiv Detail & Related papers (2020-09-30T18:36:28Z)
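A minimal sketch of a photometric reprojection test in the spirit of DOT: back-project pixels with their depths, move them with the estimated camera motion, reproject into the next frame, and compare intensities. Variable names and conventions are illustrative, not DOT's implementation.

```python
import numpy as np

def photometric_error(img_t, img_t1, pts, depths, K, R, t):
    """pts: (N, 2) pixel coords in frame t; depths: (N,) depths;
    K: intrinsics; (R, t): estimated camera motion from t to t+1.
    A low residual suggests the object did not actually move."""
    rays = (np.linalg.inv(K) @ np.hstack([pts, np.ones((len(pts), 1))]).T).T
    X = rays * depths[:, None]          # 3D points in frame t
    Xw = (R @ X.T).T + t                # same points in frame t+1
    uv = (K @ Xw.T).T
    uv = uv[:, :2] / uv[:, 2:3]         # perspective divide
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    h, w = img_t1.shape[:2]
    ok = (Xw[:, 2] > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    src = img_t[pts[ok, 1].astype(int), pts[ok, 0].astype(int)]
    dst = img_t1[v[ok], u[ok]]
    return np.abs(src.astype(np.float64) - dst.astype(np.float64)).mean()
```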
- Removing Dynamic Objects for Static Scene Reconstruction using Light Fields [2.286041284499166]
Dynamic environments pose challenges to visual simultaneous localization and mapping (SLAM) algorithms.
Light Fields capture a bundle of light rays emerging from a single point in space, allowing us to see through dynamic objects by refocusing past them.
We present a method to synthesize a refocused image of the static background in the presence of dynamic objects.
arXiv Detail & Related papers (2020-03-24T19:05:17Z)
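Refocusing a light field past foreground occluders is classically done by shift-and-add over the sub-aperture views; below is a minimal sketch under that assumption, not necessarily the paper's exact method.

```python
import numpy as np

def synthetic_aperture_refocus(views, offsets, disparity):
    """views: list of (H, W) sub-aperture images; offsets: list of
    (du, dv) sub-aperture positions; disparity: pixel shift per unit
    offset for the chosen focal plane. Averaging the shifted views
    blurs out foreground occluders so the static background emerges."""
    acc = np.zeros_like(views[0], dtype=np.float64)
    for img, (du, dv) in zip(views, offsets):
        shift = (int(round(dv * disparity)), int(round(du * disparity)))
        acc += np.roll(img, shift, axis=(0, 1))
    return acc / len(views)
```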
This list is automatically generated from the titles and abstracts of the papers on this site.