SIMBAR: Single Image-Based Scene Relighting For Effective Data
Augmentation For Automated Driving Vision Tasks
- URL: http://arxiv.org/abs/2204.00644v1
- Date: Fri, 1 Apr 2022 18:11:43 GMT
- Title: SIMBAR: Single Image-Based Scene Relighting For Effective Data
Augmentation For Automated Driving Vision Tasks
- Authors: Xianling Zhang, Nathan Tseng, Ameerah Syed, Rohan Bhasin, Nikita
Jaipuria
- Abstract summary: This paper presents a novel image-based relighting pipeline, SIMBAR, that can work with a single image as input.
To the best of our knowledge, there is no prior work on scene relighting leveraging explicit geometric representations from a single image.
To further validate and quantify the benefit of leveraging SIMBAR for data augmentation in automated driving vision tasks, object detection and tracking experiments are conducted.
- Score: 2.974889834426778
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-world autonomous driving datasets comprise images aggregated from
different drives on the road. The ability to relight captured scenes to unseen
lighting conditions, in a controllable manner, presents an opportunity to
augment datasets with a richer variety of lighting conditions, similar to what
would be encountered in the real world. This paper presents a novel image-based
relighting pipeline, SIMBAR, that can work with a single image as input. To the
best of our knowledge, there is no prior work on scene relighting leveraging
explicit geometric representations from a single image. We present qualitative
comparisons with prior multi-view scene relighting baselines. To further
validate and quantify the benefit of leveraging SIMBAR for data
augmentation in automated driving vision tasks, object detection and tracking
experiments are conducted with a state-of-the-art method: a Multiple Object
Tracking Accuracy (MOTA) of 93.3% is achieved with CenterTrack on
SIMBAR-augmented KITTI, an impressive 9.0% relative improvement over the
baseline MOTA of 85.6% with CenterTrack on the original KITTI, with both models
trained from scratch and tested on Virtual KITTI. For more details and SIMBAR relit
datasets, please visit our project website (https://simbarv1.github.io/).
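To make the reported gain concrete: the 9.0% figure is a relative (not absolute) improvement, i.e. (93.3 - 85.6) / 85.6 ≈ 0.090. A minimal Python check of that arithmetic:

```python
# Relative MOTA improvement, using the numbers reported in the abstract.
baseline_mota = 85.6   # CenterTrack trained on original KITTI
augmented_mota = 93.3  # CenterTrack trained on SIMBAR-augmented KITTI

relative_gain = (augmented_mota - baseline_mota) / baseline_mota
print(f"relative improvement: {relative_gain:.1%}")  # -> 9.0%
```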
Related papers
- LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond [37.47964043913622] (2024-10-13)
We introduce a new dataset, LoLI-Street (Low-Light Images of Streets), with 33k paired low-light and well-exposed images from street scenes in developed cities.
The LoLI-Street dataset also features 1,000 real low-light test images for evaluating low-light image enhancement (LLIE) models under real-life conditions.
- BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement [56.97766265018334] (2024-07-03)
This paper introduces a low-light video dataset, consisting of 40 scenes with various motion scenarios under two distinct low-light conditions.
We provide fully registered ground truth data captured in normal light using a programmable motorized dolly and refine it via an image-based approach for pixel-wise frame alignment across different light levels.
Our experimental results demonstrate the significance of fully registered video pairs for low-light video enhancement (LLVE) and the comprehensive evaluation shows that the models trained with our dataset outperform those trained with the existing datasets.
- Augmented Reality based Simulated Data (ARSim) with multi-view consistency for AV perception networks [47.07188762367792] (2024-03-22)
We present ARSim, a framework designed to enhance real multi-view image data with 3D synthetic objects of interest.
We construct a simplified virtual scene using real data and strategically place 3D synthetic assets within it.
The resulting augmented multi-view consistent dataset is used to train a multi-camera perception network for autonomous vehicles.
- Car-Studio: Learning Car Radiance Fields from Single-View and Endless In-the-wild Images [16.075690774805622] (2023-07-26)
In this letter, we propose a pipeline for learning from unconstrained images and building a dataset from the processed images.
To meet the requirements of the simulator, we design a radiance field for the vehicle, a crucial part of the urban scene foreground.
Using datasets built from in-the-wild images, our method progressively provides controllable appearance editing.
- 3D Data Augmentation for Driving Scenes on Camera [50.41413053812315] (2023-03-18)
We propose a 3D data augmentation approach termed Drive-3DAug, aimed at augmenting camera-view driving scenes in 3D space.
We first utilize Neural Radiance Field (NeRF) to reconstruct the 3D models of background and foreground objects.
Then, augmented driving scenes can be obtained by placing the 3D objects, with adapted locations and orientations, within pre-defined valid regions of the backgrounds.
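A minimal sketch of that placement step, assuming a rectangular valid region on the ground plane and uniform sampling of location and yaw; the bounds, field names, and structure below are illustrative assumptions, not Drive-3DAug's actual interface:

```python
import random

# Hypothetical valid region on the ground plane, in metres (ego frame).
VALID_REGION = {"x": (-10.0, 10.0), "y": (5.0, 40.0)}

def sample_placement():
    """Sample an adapted location and orientation for one 3D object."""
    x = random.uniform(*VALID_REGION["x"])
    y = random.uniform(*VALID_REGION["y"])
    yaw_deg = random.uniform(0.0, 360.0)
    return {"x": x, "y": y, "yaw_deg": yaw_deg}

# One augmented scene = background plus a few reconstructed objects, each
# re-rendered at its sampled pose (the NeRF rendering itself is omitted here).
placements = [sample_placement() for _ in range(3)]
```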
- Street-View Image Generation from a Bird's-Eye View Layout [95.36869800896335] (2023-01-11)
Bird's-Eye View (BEV) Perception has received increasing attention in recent years.
Data-driven simulation for autonomous driving has been a focal point of recent research.
We propose BEVGen, a conditional generative model that synthesizes realistic and spatially consistent surrounding images.
- VISTA 2.0: An Open, Data-driven Simulator for Multimodal Sensing and Policy Learning for Autonomous Vehicles [131.2240621036954] (2021-11-23)
We present VISTA, an open source, data-driven simulator that integrates multiple types of sensors for autonomous vehicles.
Using high fidelity, real-world datasets, VISTA represents and simulates RGB cameras, 3D LiDAR, and event-based cameras.
We demonstrate the ability to train and test perception-to-control policies across each of the sensor types and showcase the power of this approach via deployment on a full-scale autonomous vehicle.
- Stereo Matching by Self-supervision of Multiscopic Vision [65.38359887232025] (2021-04-09)
We propose a new self-supervised framework for stereo matching utilizing multiple images captured at aligned camera positions.
A cross photometric loss, an uncertainty-aware mutual-supervision loss, and a new smoothness loss are introduced to optimize the network.
Our model obtains better disparity maps than previous unsupervised methods on the KITTI dataset.
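The combination of those three terms can be sketched as a weighted sum; a minimal PyTorch sketch, where the weights are placeholders and the smoothness term shown is the generic edge-aware variant rather than the paper's new formulation:

```python
import torch

def smoothness_loss(disparity, image):
    """Edge-aware first-order smoothness (a standard choice, assumed here)."""
    dx_d = (disparity[..., :, 1:] - disparity[..., :, :-1]).abs()
    dy_d = (disparity[..., 1:, :] - disparity[..., :-1, :]).abs()
    dx_i = (image[..., :, 1:] - image[..., :, :-1]).abs().mean(1, keepdim=True)
    dy_i = (image[..., 1:, :] - image[..., :-1, :]).abs().mean(1, keepdim=True)
    # Down-weight disparity gradients where the image itself has strong edges.
    return (dx_d * torch.exp(-dx_i)).mean() + (dy_d * torch.exp(-dy_i)).mean()

def total_loss(l_photo, l_mutual, l_smooth, w=(1.0, 1.0, 0.1)):
    """Weighted sum of the cross photometric, uncertainty-aware
    mutual-supervision, and smoothness terms; weights are illustrative."""
    return w[0] * l_photo + w[1] * l_mutual + w[2] * l_smooth
```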
- VIDIT: Virtual Image Dataset for Illumination Transfer [18.001635516017902] (2020-05-11)
We present a novel dataset, the Virtual Image dataset for Illumination Transfer (VIDIT).
VIDIT contains 300 virtual scenes used for training, where every scene is captured 40 times in total: from 8 equally-spaced azimuthal angles, each lit with 5 different illuminants.
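That capture grid (8 azimuths x 5 illuminants = 40 captures per scene) can be enumerated directly; in this sketch the 45-degree spacing follows from "equally-spaced", while the illuminant labels are hypothetical placeholders:

```python
from itertools import product

scenes = range(300)                                  # training scenes
azimuths = [i * 45 for i in range(8)]                # 0, 45, ..., 315 degrees
illuminants = [f"illuminant_{k}" for k in range(5)]  # placeholder labels

captures = list(product(scenes, azimuths, illuminants))
assert len(captures) == 300 * 8 * 5                  # 12,000 training images
```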
- SimAug: Learning Robust Representations from Simulation for Trajectory Prediction [78.91518036949918] (2020-04-04)
We propose a novel approach to learning robust representations by augmenting the simulation training data.
We show that SimAug achieves promising results on three real-world benchmarks using zero real training data.