Empty Cities: a Dynamic-Object-Invariant Space for Visual SLAM
- URL: http://arxiv.org/abs/2010.07646v1
- Date: Thu, 15 Oct 2020 10:31:12 GMT
- Title: Empty Cities: a Dynamic-Object-Invariant Space for Visual SLAM
- Authors: Berta Bescos, Cesar Cadena, Jose Neira
- Abstract summary: We present a data-driven approach to obtain the static image of a scene, eliminating dynamic objects that might have been present at the time of traversing the scene with a camera.
We introduce an end-to-end deep learning framework to turn images of an urban environment into realistic static frames suitable for localization and mapping.
- Score: 6.693607456009373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we present a data-driven approach to obtain the static image of
a scene, eliminating dynamic objects that might have been present at the time
of traversing the scene with a camera. The general objective is to improve
vision-based localization and mapping tasks in dynamic environments, where the
presence (or absence) of different dynamic objects in different moments makes
these tasks less robust. We introduce an end-to-end deep learning framework to
turn images of an urban environment that include dynamic content, such as
vehicles or pedestrians, into realistic static frames suitable for localization
and mapping. This objective faces two main challenges: detecting the dynamic
objects, and inpainting the occluded static background. The first challenge is
addressed by the use of a convolutional network that learns a multi-class
semantic segmentation of the image. The second challenge is approached with a
generative adversarial model that, taking as input the original dynamic image
and the computed dynamic/static binary mask, is capable of generating the final
static image. This framework makes use of two new losses, one based on image
steganalysis techniques, useful to improve the inpainting quality, and another
one based on ORB features, designed to enhance feature matching between real
and hallucinated image regions. To validate our approach, we perform an
extensive evaluation on different tasks that are affected by dynamic entities,
i.e., visual odometry, place recognition and multi-view stereo, with the
hallucinated images. Code has been made available at
https://github.com/bertabescos/EmptyCities_SLAM.
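As a rough sketch of the two-stage pipeline the abstract describes, the following turns a multi-class segmentation into the dynamic/static binary mask and conditions the generator on it. Module names, class ids, and shapes are illustrative assumptions; the actual models live in the repository above.

```python
import torch

# Hypothetical class ids for dynamic categories (e.g., person, rider, car).
DYNAMIC_CLASSES = (11, 12, 13)

def hallucinate_static_frame(image, seg_net, generator):
    """image: (1, 3, H, W) tensor; seg_net: multi-class semantic
    segmentation CNN; generator: inpainting GAN generator that takes
    the dynamic image concatenated with a dynamic/static binary mask."""
    with torch.no_grad():
        labels = seg_net(image).argmax(dim=1, keepdim=True)  # (1, 1, H, W)
        mask = torch.zeros_like(labels, dtype=image.dtype)
        for c in DYNAMIC_CLASSES:
            mask = torch.where(labels == c, torch.ones_like(mask), mask)
        # The generator sees the original frame plus the mask (4 channels).
        return generator(torch.cat([image, mask], dim=1))
```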
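The ORB-feature loss itself is a training-time objective; as a non-differentiable stand-in, one can gauge how "matchable" a hallucinated frame is against the real one with plain OpenCV. This is an illustrative check, not the paper's implementation.

```python
import cv2
import numpy as np

def orb_consistency_score(real_bgr, inpainted_bgr, max_matches=100):
    """Mean Hamming distance of the best ORB matches between a real
    frame and its hallucinated static counterpart; lower means the
    inpainted regions are easier to match for feature-based SLAM."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(cv2.cvtColor(real_bgr, cv2.COLOR_BGR2GRAY), None)
    kp2, des2 = orb.detectAndCompute(cv2.cvtColor(inpainted_bgr, cv2.COLOR_BGR2GRAY), None)
    if des1 is None or des2 is None:
        return float("inf")  # no features to compare
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    if not matches:
        return float("inf")
    return float(np.mean([m.distance for m in matches[:max_matches]]))
```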
Related papers
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
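A minimal sketch of per-object occupancy over a voxel grid, assuming K object slots scored on a (D, H, W) grid; names and shapes are illustrative, not DynaVol-S's actual code.

```python
import torch

def object_centric_voxelize(slot_logits):
    """slot_logits: (K, D, H, W) raw scores for K object slots over a
    voxel grid. A softmax across slots yields per-object occupancy
    probabilities at each location, one disentangled grid per object."""
    probs = torch.softmax(slot_logits, dim=0)  # (K, D, H, W), sums to 1
    return [probs[k] for k in range(probs.shape[0])]
```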
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation [126.12940972028012]
We present HVC, a framework for self-supervised video object segmentation.
HVC extracts pseudo-dynamic signals from static images, enabling an efficient and scalable VOS model.
We propose a hybrid visual correspondence loss to learn joint static and dynamic consistency representations.
arXiv Detail & Related papers (2024-04-21T02:21:30Z)
- ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) methods attempt to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images and videos, i.e., zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
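One simple way to realize zoom-style in-and-out inference is multi-scale prediction averaging; the sketch below is a hedged illustration of that idea, not ZoomNeXt's actual collaborative pyramid architecture.

```python
import torch
import torch.nn.functional as F

def zoom_inference(model, image, scales=(0.5, 1.0, 1.5)):
    """Run a segmentation model at several zoom levels and average the
    logits after resizing back to the input resolution.
    image: (1, 3, H, W); model returns (1, 1, H', W') logits."""
    _, _, h, w = image.shape
    preds = []
    for s in scales:
        x = F.interpolate(image, scale_factor=s, mode="bilinear",
                          align_corners=False)
        y = model(x)
        preds.append(F.interpolate(y, size=(h, w), mode="bilinear",
                                   align_corners=False))
    return torch.stack(preds).mean(dim=0)
```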
arXiv Detail & Related papers (2023-10-31T06:11:23Z)
- DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2022-11-20T20:57:02Z)
- D$^2$NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from a Monocular Video [23.905013304668426]
Given a monocular video, segmenting and decoupling dynamic objects while recovering the static environment is a widely studied problem in machine intelligence.
We introduce Decoupled Dynamic Neural Radiance Field (D$^2$NeRF), a self-supervised approach that takes a monocular video and learns a 3D scene representation.
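A common way to composite a static field and a dynamic field at the same 3D sample is to sum their densities and density-weight their colors; the sketch below shows that generic two-field scheme, not necessarily D$^2$NeRF's exact formulation.

```python
import torch

def composite_two_fields(sigma_static, sigma_dyn, rgb_static, rgb_dyn):
    """sigma_*: (N,) densities; rgb_*: (N, 3) colors at the same samples.
    Total density is the sum; color is the density-weighted average."""
    sigma = sigma_static + sigma_dyn
    rgb = (sigma_static.unsqueeze(-1) * rgb_static +
           sigma_dyn.unsqueeze(-1) * rgb_dyn) / (sigma.unsqueeze(-1) + 1e-8)
    return sigma, rgb
```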
arXiv Detail & Related papers (2022-05-31T14:41:24Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- Multi-modal Visual Place Recognition in Dynamics-Invariant Perception Space [23.43468556831308]
This letter explores the use of multi-modal fusion of semantic and visual modalities to improve place recognition in dynamic environments.
We achieve this by first designing a novel deep learning architecture to generate the static semantic segmentation.
We then leverage the spatial-pyramid-matching model to encode the static semantic segmentation into feature vectors.
In parallel, the static image is encoded using the popular Bag-of-words model.
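For reference, a minimal Bag-of-Words encoder over a precomputed visual vocabulary might look like the following; this is illustrative, not the letter's implementation.

```python
import numpy as np

def bow_encode(descriptors, vocabulary):
    """descriptors: (N, D) local features from the static image;
    vocabulary: (K, D) visual-word centroids (e.g., from k-means).
    Returns an L2-normalized K-dim Bag-of-Words histogram."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)  # nearest visual word per feature
    hist = np.bincount(words, minlength=len(vocabulary)).astype(np.float64)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```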
arXiv Detail & Related papers (2021-05-17T13:14:52Z)
- DOT: Dynamic Object Tracking for Visual SLAM [83.69544718120167]
DOT combines instance segmentation and multi-view geometry to generate masks for dynamic objects.
To determine which objects are actually moving, DOT first segments instances of potentially dynamic objects and then, using the estimated camera motion, tracks them by minimizing the photometric reprojection error.
Our results show that our approach significantly improves the accuracy and robustness of ORB-SLAM 2, especially in highly dynamic scenes.
arXiv Detail & Related papers (2020-09-30T18:36:28Z)
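A minimal sketch of a photometric reprojection test in the spirit of DOT: back-project pixels with their depths, move them with the estimated camera motion, reproject into the next frame, and compare intensities. Variable names and conventions are illustrative, not DOT's implementation.

```python
import numpy as np

def photometric_error(img_t, img_t1, pts, depths, K, R, t):
    """pts: (N, 2) pixel coords in frame t; depths: (N,) depths;
    K: intrinsics; (R, t): estimated camera motion from t to t+1.
    A low residual suggests the object did not actually move."""
    rays = (np.linalg.inv(K) @ np.hstack([pts, np.ones((len(pts), 1))]).T).T
    X = rays * depths[:, None]          # 3D points in frame t
    Xw = (R @ X.T).T + t                # same points in frame t+1
    uv = (K @ Xw.T).T
    uv = uv[:, :2] / uv[:, 2:3]         # perspective divide
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    h, w = img_t1.shape[:2]
    ok = (Xw[:, 2] > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    src = img_t[pts[ok, 1].astype(int), pts[ok, 0].astype(int)]
    dst = img_t1[v[ok], u[ok]]
    return np.abs(src.astype(np.float64) - dst.astype(np.float64)).mean()
```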
- Removing Dynamic Objects for Static Scene Reconstruction using Light Fields [2.286041284499166]
Dynamic environments pose challenges to visual simultaneous localization and mapping (SLAM) algorithms.
Light Fields capture a bundle of light rays emerging from a single point in space, allowing us to see through dynamic objects by refocusing past them.
We present a method to synthesize a refocused image of the static background in the presence of dynamic objects.
arXiv Detail & Related papers (2020-03-24T19:05:17Z)
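Refocusing a light field past foreground occluders is classically done by shift-and-add over the sub-aperture views; below is a minimal sketch under that assumption, not necessarily the paper's exact method.

```python
import numpy as np

def synthetic_aperture_refocus(views, offsets, disparity):
    """views: list of (H, W) sub-aperture images; offsets: list of
    (du, dv) sub-aperture positions; disparity: pixel shift per unit
    offset for the chosen focal plane. Averaging the shifted views
    blurs out foreground occluders so the static background emerges."""
    acc = np.zeros_like(views[0], dtype=np.float64)
    for img, (du, dv) in zip(views, offsets):
        shift = (int(round(dv * disparity)), int(round(du * disparity)))
        acc += np.roll(img, shift, axis=(0, 1))
    return acc / len(views)
```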
This list is automatically generated from the titles and abstracts of the papers on this site.