Street-View Image Generation from a Bird's-Eye View Layout
- URL: http://arxiv.org/abs/2301.04634v4
- Date: Tue, 13 Feb 2024 06:21:11 GMT
- Title: Street-View Image Generation from a Bird's-Eye View Layout
- Authors: Alexander Swerdlow, Runsheng Xu, Bolei Zhou
- Abstract summary: Bird's-Eye View (BEV) Perception has received increasing attention in recent years.
Data-driven simulation for autonomous driving has been a focal point of recent research.
We propose BEVGen, a conditional generative model that synthesizes realistic and spatially consistent surrounding images.
- Score: 95.36869800896335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bird's-Eye View (BEV) Perception has received increasing attention in recent
years as it provides a concise and unified spatial representation across views
and benefits a diverse set of downstream driving applications. At the same
time, data-driven simulation for autonomous driving has been a focal point of
recent research but with few approaches that are both fully data-driven and
controllable. Instead of using perception data from real-life scenarios, an
ideal model for simulation would generate realistic street-view images that
align with a given HD map and traffic layout, a task that is critical for
visualizing complex traffic scenarios and developing robust perception models
for autonomous driving. In this paper, we propose BEVGen, a conditional
generative model that synthesizes a set of realistic and spatially consistent
surrounding images that match the BEV layout of a traffic scenario. BEVGen
incorporates a novel cross-view transformation with spatial attention design
which learns the relationship between cameras and map views to ensure their
consistency. We evaluate the proposed model on the challenging NuScenes and
Argoverse 2 datasets. After training, BEVGen can accurately render road and
lane lines, as well as generate traffic scenes with diverse weather
conditions and times of day.
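The cross-view transformation described in the abstract can be read as camera-view tokens attending to a rasterized BEV layout so that each generated view stays consistent with the map region it observes. Below is a minimal, hypothetical PyTorch sketch of such a cross-view spatial-attention block; it is not the released BEVGen implementation, and the module name, token shapes, and dimensions are illustrative assumptions.
```python
# Illustrative sketch only: camera tokens cross-attend to BEV-layout tokens.
# Names, shapes, and dimensions are hypothetical, not BEVGen's actual code.
import torch
import torch.nn as nn


class CrossViewAttention(nn.Module):
    """Camera tokens query flattened BEV-layout tokens via multi-head attention."""

    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_cam = nn.LayerNorm(dim)
        self.norm_bev = nn.LayerNorm(dim)

    def forward(self, cam_tokens: torch.Tensor, bev_tokens: torch.Tensor) -> torch.Tensor:
        # cam_tokens: (B, N_cam * H * W, dim) flattened multi-camera feature tokens
        # bev_tokens: (B, H_bev * W_bev, dim) flattened BEV-layout tokens
        q = self.norm_cam(cam_tokens)
        kv = self.norm_bev(bev_tokens)
        out, _ = self.attn(q, kv, kv)   # cameras query the map representation
        return cam_tokens + out         # residual connection keeps image features


if __name__ == "__main__":
    B, n_cam, hw, dim = 2, 6, 16 * 16, 256
    cam = torch.randn(B, n_cam * hw, dim)   # tokens for 6 surrounding cameras
    bev = torch.randn(B, 32 * 32, dim)      # tokens for the BEV semantic layout
    fused = CrossViewAttention(dim)(cam, bev)
    print(fused.shape)                      # torch.Size([2, 1536, 256])
```
In a full model, the fused camera tokens would then drive a per-view image decoder; the sketch only shows how a spatial-attention layer can tie the camera and map views together.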
Related papers
- Learning autonomous driving from aerial imagery [67.06858775696453]
Photogrammetric simulators allow the synthesis of novel views through the transformation of pre-generated assets.
We use a Neural Radiance Field (NeRF) as an intermediate representation to synthesize novel views from the point of view of a ground vehicle.
arXiv Detail & Related papers (2024-10-18T05:09:07Z)
- From Bird's-Eye to Street View: Crafting Diverse and Condition-Aligned Images with Latent Diffusion Model [16.716345249091408]
We explore Bird's-Eye View generation, converting a BEV map into its corresponding multi-view street images.
Our approach comprises two main components: the Neural View Transformation and the Street Image Generation.
arXiv Detail & Related papers (2024-09-02T07:47:16Z)
- Camera Perspective Transformation to Bird's Eye View via Spatial Transformer Model for Road Intersection Monitoring [0.09208007322096533]
Road intersection monitoring and control research often utilize bird's eye view (BEV) simulators.
In real traffic settings, achieving a BEV akin to that in a simulator requires the deployment of drones or specific sensor mounting.
We introduce a novel deep-learning model that converts a single camera's perspective of a road intersection into a BEV.
arXiv Detail & Related papers (2024-08-10T15:01:19Z)
- Urban Scene Diffusion through Semantic Occupancy Map [49.20779809250597]
UrbanDiffusion is a 3D diffusion model conditioned on a Bird's-Eye View (BEV) map.
Our model learns the data distribution of scene-level structures within a latent space.
After training on real-world driving datasets, our model can generate a wide range of diverse urban scenes.
arXiv Detail & Related papers (2024-03-18T11:54:35Z)
- Synthesizing Traffic Datasets using Graph Neural Networks [2.444217495283211]
This paper introduces a novel methodology for bridging this 'sim-real' gap by creating photorealistic images from 2D traffic simulations and recorded junction footage.
We propose a novel image generation approach, integrating a Conditional Generative Adversarial Network with a Graph Neural Network (GNN) to facilitate the creation of realistic urban traffic images.
arXiv Detail & Related papers (2023-12-08T13:24:19Z)
- Deep Perspective Transformation Based Vehicle Localization on Bird's Eye View [0.49747156441456597]
Traditional approaches rely on installing multiple sensors to simulate the environment.
We propose an alternative solution by generating a top-down representation of the scene.
We present an architecture that transforms perspective view RGB images into bird's-eye-view maps with segmented surrounding vehicles.
arXiv Detail & Related papers (2023-11-12T10:16:42Z)
- Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
- Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images [128.881857704338]
We study the problem of extracting a directed graph representing the local road network in BEV coordinates, from a single onboard camera image.
We show that the method can be extended to detect dynamic objects on the BEV plane.
We validate our approach against powerful baselines and show that our network achieves superior performance.
arXiv Detail & Related papers (2021-10-05T12:40:33Z)
- SceneGen: Learning to Generate Realistic Traffic Scenes [92.98412203941912]
We present SceneGen, a neural autoregressive model of traffic scenes that eschews the need for rules and distributions.
We demonstrate SceneGen's ability to faithfully model distributions of real traffic scenes.
arXiv Detail & Related papers (2021-01-16T22:51:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.