Partially fake it till you make it: mixing real and fake thermal images
for improved object detection
- URL: http://arxiv.org/abs/2106.13603v1
- Date: Fri, 25 Jun 2021 12:56:09 GMT
- Title: Partially fake it till you make it: mixing real and fake thermal images
for improved object detection
- Authors: Francesco Bongini, Lorenzo Berlincioni, Marco Bertini, Alberto Del
Bimbo
- Abstract summary: We show the performance of the proposed system in the context of object detection in thermal videos.
Our single-modality detector achieves state-of-the-art results on the FLIR ADAS dataset.
- Score: 29.13557322147509
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this paper we propose a novel data augmentation approach for visual
content domains that have scarce training datasets, compositing synthetic 3D
objects within real scenes. We show the performance of the proposed system in
the context of object detection in thermal videos, a domain where 1) training
datasets are very limited compared to visible spectrum datasets and 2) creating
full realistic synthetic scenes is extremely cumbersome and expensive due to
the difficulty in modeling the thermal properties of the materials of the
scene. We compare different augmentation strategies, including state-of-the-art
approaches obtained through RL techniques, the injection of simulated data and
the employment of a generative model, and study how to best combine our
proposed augmentation with these other techniques. Experimental results
demonstrate the effectiveness of our approach, and our single-modality detector
achieves state-of-the-art results on the FLIR ADAS dataset.
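The compositing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a synthetic object has already been rendered to a small intensity patch with an alpha mask, and it leaves out the 3D rendering and the modeling of thermal appearance entirely. The names `composite_object` and `augment`, the list-of-lists image layout, and the toy values are hypothetical, chosen only for illustration.

```python
import random


def composite_object(scene, patch, alpha, top, left):
    """Alpha-blend `patch` into a copy of `scene`; return (image, bbox).

    scene, patch and alpha are row-major lists of lists of floats, with
    alpha values in [0, 1]. bbox is (x_min, y_min, x_max, y_max) over the
    pixels where alpha > 0, or None if no pixel was pasted.
    Hypothetical helper, not taken from the paper.
    """
    out = [row[:] for row in scene]  # leave the real frame untouched
    height, width = len(scene), len(scene[0])
    xs, ys = [], []
    for i, (p_row, a_row) in enumerate(zip(patch, alpha)):
        for j, (p, a) in enumerate(zip(p_row, a_row)):
            y, x = top + i, left + j
            if 0 <= y < height and 0 <= x < width and a > 0:
                out[y][x] = a * p + (1 - a) * out[y][x]
                ys.append(y)
                xs.append(x)
    bbox = (min(xs), min(ys), max(xs), max(ys)) if xs else None
    return out, bbox


def augment(scene, patch, alpha, rng):
    """Paste the patch at a random position that fits inside the scene."""
    top = rng.randrange(len(scene) - len(patch) + 1)
    left = rng.randrange(len(scene[0]) - len(patch[0]) + 1)
    return composite_object(scene, patch, alpha, top, left)


# Toy data: a uniform 6x8 "thermal" frame and a warmer 2x2 object patch.
scene = [[0.2] * 8 for _ in range(6)]
patch = [[0.9, 0.9], [0.9, 0.9]]
alpha = [[1.0, 0.5], [0.0, 1.0]]  # 0.0 marks a transparent pixel
image, bbox = composite_object(scene, patch, alpha, top=1, left=3)
```

Because the bounding box is derived from the alpha mask, each composited frame comes with a detection label at no extra annotation cost, which is what makes this kind of augmentation attractive when real labeled data is scarce.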
Related papers
- Realistic Surgical Image Dataset Generation Based On 3D Gaussian Splatting [3.5351922399745166]
This research introduces a novel method that employs 3D Gaussian Splatting to generate synthetic surgical datasets.
We developed a data recording system capable of acquiring images alongside tool and camera poses in a surgical scene.
Using this pose data, we synthetically replicate the scene, thereby enabling direct comparisons of the synthetic image quality.
arXiv Detail & Related papers (2024-07-20T11:20:07Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Mixed Diffusion for 3D Indoor Scene Synthesis [55.94569112629208]
We present MiDiffusion, a novel mixed discrete-continuous diffusion model architecture.
We represent a scene layout by a 2D floor plan and a set of objects, each defined by its category, location, size, and orientation.
Our experimental results demonstrate that MiDiffusion substantially outperforms state-of-the-art autoregressive and diffusion models in floor-conditioned 3D scene synthesis.
arXiv Detail & Related papers (2024-05-31T17:54:52Z)
- Hardness-Aware Scene Synthesis for Semi-Supervised 3D Object Detection [59.33188668341604]
3D object detection serves as the fundamental task of autonomous driving perception.
It is costly to obtain high-quality annotations for point cloud data.
We propose a hardness-aware scene synthesis (HASS) method to generate adaptive synthetic scenes.
arXiv Detail & Related papers (2024-05-27T17:59:23Z)
- Training Deep Learning Models with Hybrid Datasets for Robust Automatic Target Detection on real SAR images [0.13194391758295113]
We propose a Deep Learning approach to train ATD models with synthetic target signatures produced with the MOCEM simulator.
We train ATD models specifically tailored to bridge the domain gap between synthetic and real data.
Our approach can reach up to 90% of Average Precision on real data while exclusively using synthetic targets for training.
arXiv Detail & Related papers (2024-05-15T09:26:24Z)
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
- Augmented Reality based Simulated Data (ARSim) with multi-view consistency for AV perception networks [47.07188762367792]
We present ARSim, a framework designed to enhance real multi-view image data with 3D synthetic objects of interest.
We construct a simplified virtual scene using real data and strategically place 3D synthetic assets within it.
The resulting augmented multi-view consistent dataset is used to train a multi-camera perception network for autonomous vehicles.
arXiv Detail & Related papers (2024-03-22T17:49:11Z)
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
Our experimental results achieve state-of-the-art performance on both synthetic data and real-world data tracking.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
- WinSyn: A High Resolution Testbed for Synthetic Data [41.11481327112564]
We present WinSyn, a unique dataset and testbed for creating high-quality synthetic data with procedural modeling techniques.
The dataset contains high-resolution photographs of windows, selected from locations around the world, with 89,318 individual window crops showcasing diverse geometric and material characteristics.
We evaluate a procedural model by training semantic segmentation networks on both synthetic and real images and then comparing their performances on a shared test set of real images.
arXiv Detail & Related papers (2023-10-09T20:18:10Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.