CrowdSim2: an Open Synthetic Benchmark for Object Detectors
- URL: http://arxiv.org/abs/2304.05090v1
- Date: Tue, 11 Apr 2023 09:35:57 GMT
- Title: CrowdSim2: an Open Synthetic Benchmark for Object Detectors
- Authors: Paweł Foszner, Agnieszka Szczęsna, Luca Ciampi, Nicola Messina,
Adam Cygan, Bartosz Bizoń, Michał Cogiel, Dominik Golba, Elżbieta
Macioszek, Michał Staniszewski
- Abstract summary: This paper presents and publicly releases CrowdSim2, a new synthetic collection of images suitable for people and vehicle detection.
It consists of thousands of images gathered from various synthetic scenarios resembling the real world, where we varied some factors of interest.
We exploited this new benchmark as a testing ground for some state-of-the-art detectors, showing that our simulated scenarios can be a valuable tool for measuring their performance in a controlled environment.
- Score: 0.7223361655030193
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data scarcity has become one of the main obstacles to developing supervised
models based on Artificial Intelligence in Computer Vision. Indeed, Deep
Learning-based models systematically struggle when applied in new scenarios
never seen during training and may not be adequately tested in non-ordinary yet
crucial real-world situations. This paper presents and publicly releases
CrowdSim2, a new synthetic collection of images suitable for people and vehicle
detection, gathered from a simulator based on the Unity graphics engine. It
consists of thousands of images gathered from various synthetic scenarios
resembling the real world, where we varied some factors of interest, such as
the weather conditions and the number of objects in the scenes. The labels are
automatically collected and consist of bounding boxes that precisely localize
objects belonging to the two object classes, leaving out humans from the
annotation pipeline. We exploited this new benchmark as a testing ground for
some state-of-the-art detectors, showing that our simulated scenarios can be a
valuable tool for measuring their performance in a controlled environment.
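
Since the dataset ships automatically generated bounding-box labels, a natural way to use it is as an evaluation-only benchmark. The sketch below scores an off-the-shelf torchvision detector with the standard COCO protocol; the annotation file name, image paths, and category-id mapping are hypothetical, since the abstract does not specify CrowdSim2's release format.

```python
# Minimal evaluation sketch, assuming COCO-style JSON annotations.
# "crowdsim2_annotations.json" and the category-id mapping are placeholders;
# the actual CrowdSim2 release format is not specified in the abstract.
import torch
import torchvision
from PIL import Image
from torchvision.transforms.functional import to_tensor
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("crowdsim2_annotations.json")  # hypothetical label file
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

results = []
for img_id in coco_gt.getImgIds():
    info = coco_gt.loadImgs(img_id)[0]
    image = to_tensor(Image.open(info["file_name"]).convert("RGB"))
    with torch.no_grad():
        pred = model([image])[0]  # dict with "boxes", "labels", "scores"
    for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"]):
        x1, y1, x2, y2 = box.tolist()
        results.append({
            "image_id": img_id,
            # COCO-pretrained labels would need remapping to the two
            # CrowdSim2 classes (people, vehicles) in a real run.
            "category_id": int(label),
            "bbox": [x1, y1, x2 - x1, y2 - y1],  # COCO expects xywh
            "score": float(score),
        })

coco_eval = COCOeval(coco_gt, coco_gt.loadRes(results), iouType="bbox")
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints mAP under the standard COCO metrics
```

If the release tags each image with the varied factors (weather, number of objects), the same loop can be restricted to those subsets to measure detector performance per condition.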
Related papers
- Exploring Generative AI for Sim2Real in Driving Data Synthesis [6.769182994217369]
Driving simulators offer a solution by automatically generating various driving scenarios with corresponding annotations, but the simulation-to-reality (Sim2Real) domain gap remains a challenge.
This paper applies three different generative AI methods that leverage semantic label maps from a driving simulator as a bridge for creating realistic datasets.
Experiments show that although GAN-based methods are adept at generating high-quality images when provided with manually annotated labels, ControlNet produces synthetic datasets with fewer artefacts and more structural fidelity when using simulator-generated labels.
arXiv Detail & Related papers (2024-04-14T01:23:19Z)
- Reconstructing Objects in-the-wild for Realistic Sensor Simulation [41.55571880832957]
We present NeuSim, a novel approach that estimates accurate geometry and realistic appearance from sparse in-the-wild data.
We model the object appearance with a robust physics-inspired reflectance representation effective for in-the-wild data.
Our experiments show that NeuSim has strong view synthesis performance on challenging scenarios with sparse training views.
arXiv Detail & Related papers (2023-11-09T18:58:22Z)
- Towards 3D Scene Understanding by Referring Synthetic Models [65.74211112607315]
Existing methods typically rely on extensive annotations of real scene scans.
We explore how labelled synthetic models can supervise the recognition of object categories in real scene scans by mapping synthetic and real features into a unified feature space.
Experiments show that our method achieves an average mAP of 46.08% and 55.49% on the ScanNet and S3DIS datasets, respectively.
arXiv Detail & Related papers (2022-03-20T13:06:15Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
- Predicting Stable Configurations for Semantic Placement of Novel Objects [37.18437299513799]
Our goal is to enable robots to repose previously unseen objects according to learned semantic relationships in novel environments.
We build our models and training from the ground up to be tightly integrated with our proposed planning algorithm for semantic placement of unknown objects.
Our approach enables motion planning for semantic rearrangement of unknown objects in scenes with varying geometry from only RGB-D sensing.
arXiv Detail & Related papers (2021-08-26T23:05:05Z)
- Small Object Detection for Near Real-Time Egocentric Perception in a Manual Assembly Scenario [0.0]
We describe a near real-time small object detection pipeline for egocentric perception in a manual assembly scenario.
First, the context is recognized, then the small object of interest is detected.
We evaluate our pipeline on the augmented reality device Microsoft HoloLens 2.
arXiv Detail & Related papers (2021-06-11T13:59:44Z)
- DriveGAN: Towards a Controllable High-Quality Neural Simulation [147.6822288981004]
We introduce a novel high-quality neural simulator referred to as DriveGAN.
DriveGAN achieves controllability by disentangling different components without supervision.
We train DriveGAN on multiple datasets, including 160 hours of real-world driving data.
arXiv Detail & Related papers (2021-04-30T15:30:05Z)
- Synthesizing the Unseen for Zero-shot Object Detection [72.38031440014463]
We propose to synthesize visual features for unseen classes, so that the model learns both seen and unseen objects in the visual domain.
We use a novel generative model that uses class-semantics to not only generate the features but also to discriminatively separate them.
arXiv Detail & Related papers (2020-10-19T12:36:11Z)
- Benchmarking Unsupervised Object Representations for Video Sequences [111.81492107649889]
We compare the perceptual abilities of four object-centric approaches: ViMON, OP3, TBA and SCALOR.
Our results suggest that the architectures with unconstrained latent representations learn more powerful representations in terms of object detection, segmentation and tracking.
Our benchmark may provide fruitful guidance towards learning more robust object-centric video representations.
arXiv Detail & Related papers (2020-06-12T09:37:24Z)
- SVIRO: Synthetic Vehicle Interior Rear Seat Occupancy Dataset and Benchmark [11.101588888002045]
We release SVIRO, a synthetic dataset of scenes in the passenger compartment of ten different vehicles.
We analyze machine learning-based approaches for their generalization capacities and reliability when trained on a limited number of variations.
arXiv Detail & Related papers (2020-01-10T14:44:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.