Amodal Cityscapes: A New Dataset, its Generation, and an Amodal Semantic
Segmentation Challenge Baseline
- URL: http://arxiv.org/abs/2206.00527v1
- Date: Wed, 1 Jun 2022 14:38:33 GMT
- Title: Amodal Cityscapes: A New Dataset, its Generation, and an Amodal Semantic
Segmentation Challenge Baseline
- Authors: Jasmin Breitenstein and Tim Fingscheidt
- Abstract summary: We consider the task of amodal semantic segmentation and propose a generic way to generate datasets to train amodal semantic segmentation methods.
We use this approach to generate an amodal Cityscapes dataset, showing its applicability for amodal semantic segmentation in automotive environment perception.
- Score: 38.8592627329447
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Amodal perception refers to the human ability to imagine the entire shapes of
occluded objects. This gives humans an advantage in keeping track of everything
that is going on, especially in crowded situations. Typical perception
functions, however, lack amodal perception abilities and are therefore at a
disadvantage in situations with occlusions. Complex urban driving scenarios
often involve many different types of occlusions, making amodal perception an
important task to investigate for automated vehicles. In this paper, we
consider the task of amodal semantic segmentation and propose a generic way to
generate datasets for training amodal semantic segmentation methods. We use
this approach to generate an amodal Cityscapes dataset. Moreover, we propose
and evaluate a baseline method on Amodal Cityscapes, showing its applicability
for amodal semantic segmentation in automotive environment perception. We
provide the means to re-generate this dataset on GitHub.
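The abstract itself does not detail the generation procedure. As a rough illustration only, assuming a simple occlusion-compositing scheme with hypothetical helper names (not necessarily the authors' released pipeline), amodal ground truth can be obtained by pasting an occluder segment cut from a donor image onto a target image while keeping the target's original labels:

```python
import numpy as np

def make_amodal_sample(target_img, target_labels, occluder_rgb, occluder_mask,
                       occluder_class_id, top_left):
    """Illustrative occlusion compositing (hypothetical helper, not the exact
    procedure of the paper): paste an occluder segment cut from a donor image
    onto a target Cityscapes image.

    target_img        HxWx3 uint8 image; target_labels HxW semantic class IDs
    occluder_rgb      hxwx3 crop of the occluder; occluder_mask hxw bool
    occluder_class_id semantic class of the pasted occluder
    top_left          (row, col) paste position in the target image
    """
    y0, x0 = top_left
    h, w = occluder_mask.shape

    composite = target_img.copy()
    modal_labels = target_labels.copy()        # what is actually visible
    amodal_labels = target_labels.copy()       # original labels = amodal ground truth

    # Paste occluder pixels; slicing returns views into the copies, so the
    # boolean assignments modify them in place.
    composite[y0:y0 + h, x0:x0 + w][occluder_mask] = occluder_rgb[occluder_mask]
    modal_labels[y0:y0 + h, x0:x0 + w][occluder_mask] = occluder_class_id

    # A network trained on (composite, amodal_labels) must predict the classes
    # hidden behind the inserted occluder.
    return composite, modal_labels, amodal_labels
```

Repeating such compositing over pairs of Cityscapes images would yield pairs of composite images and amodal label maps; the generation scripts released by the authors on GitHub remain the authoritative reference.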
Related papers
- Amodal Ground Truth and Completion in the Wild [84.54972153436466]
We use 3D data to establish an automatic pipeline to determine authentic ground truth amodal masks for partially occluded objects in real images.
This pipeline is used to construct an amodal completion evaluation benchmark, MP3D-Amodal, consisting of a variety of object categories and labels.
arXiv Detail & Related papers (2023-12-28T18:59:41Z)
- TAO-Amodal: A Benchmark for Tracking Any Object Amodally [41.5396827282691]
We introduce TAO-Amodal, featuring 833 diverse categories in thousands of video sequences.
Our dataset includes amodal and modal bounding boxes for visible and partially or fully occluded objects, including those that are partially out of the camera frame.
arXiv Detail & Related papers (2023-12-19T18:58:40Z)
- AmodalSynthDrive: A Synthetic Amodal Perception Dataset for Autonomous Driving [10.928470926399566]
We introduce AmodalSynthDrive, a synthetic multi-task multi-modal amodal perception dataset.
The dataset provides multi-view camera images, 3D bounding boxes, LiDAR data, and odometry for 150 driving sequences.
AmodalSynthDrive supports multiple amodal scene understanding tasks, including the introduced amodal depth estimation.
arXiv Detail & Related papers (2023-09-12T19:46:15Z)
- Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement [49.888011242939385]
We propose a system for rearranging objects in a scene to achieve a desired object-scene placing relationship.
The pipeline generalizes to novel geometries, poses, and layouts of both scenes and objects.
arXiv Detail & Related papers (2023-07-10T17:56:06Z)
- Perceptual Score: What Data Modalities Does Your Model Perceive? [73.75255606437808]
We introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features.
We find that recent, more accurate multi-modal models for visual question-answering tend to perceive the visual data less than their predecessors.
Using the perceptual score also helps to analyze model biases by decomposing the score into data subset contributions.
arXiv Detail & Related papers (2021-10-27T12:19:56Z)
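As a simplified sketch only (the exact definition and normalization in the paper may differ), such a modality-reliance measure can be estimated by comparing accuracy on intact inputs with accuracy after permuting one modality across the batch; all function and argument names below are illustrative:

```python
import torch

@torch.no_grad()
def modality_reliance(model, images, questions, labels):
    """Simplified sketch of a perceptual-score-style measure: how much does
    accuracy drop when the visual modality is shuffled across the batch?
    Model interface and argument names are illustrative, not the paper's API."""
    def accuracy(imgs):
        preds = model(imgs, questions).argmax(dim=-1)
        return (preds == labels).float().mean().item()

    acc_intact = accuracy(images)
    perm = torch.randperm(images.shape[0])
    acc_permuted = accuracy(images[perm])   # images no longer match the questions
    return acc_intact - acc_permuted        # large drop = strong reliance on vision
```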
- AutoLay: Benchmarking amodal layout estimation for autonomous driving [18.152206533685412]
AutoLay is a dataset and benchmark for amodal layout estimation from monocular images.
In addition to fine-grained attributes such as lanes, sidewalks, and vehicles, we also provide semantically annotated 3D point clouds.
arXiv Detail & Related papers (2021-08-20T08:21:11Z)
- Amodal Segmentation through Out-of-Task and Out-of-Distribution Generalization with a Bayesian Model [19.235173141731885]
Amodal completion is a visual task that humans perform easily but which is difficult for computer vision algorithms.
We formulate amodal segmentation as an out-of-task and out-of-distribution generalization problem.
Our algorithm outperforms alternative methods that use the same supervision by a large margin.
arXiv Detail & Related papers (2020-10-25T18:01:26Z)
- Hidden Footprints: Learning Contextual Walkability from 3D Human Trails [70.01257397390361]
Current datasets only tell you where people are, not where they could be.
We first augment the set of valid, labeled walkable regions by propagating person observations between images, utilizing 3D information to create what we call hidden footprints.
We devise a training strategy designed for such sparse labels, combining a class-balanced classification loss with a contextual adversarial loss.
arXiv Detail & Related papers (2020-08-19T23:19:08Z)
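As a minimal sketch of the kind of objective described above, assuming a hypothetical discriminator module and weighting (not the paper's exact formulation), a class-balanced classification loss can be combined with an adversarial term as follows:

```python
import torch
import torch.nn.functional as F

def combined_walkability_loss(logits, sparse_labels, label_mask,
                              discriminator, pos_weight, adv_weight=0.1):
    """Sketch of a class-balanced classification loss plus an adversarial term
    for sparse walkability labels. Names and weights are illustrative.

    logits        BxHxW walkability scores
    sparse_labels BxHxW float {0, 1} labels, valid only where label_mask is True
    discriminator hypothetical module scoring predicted maps for plausibility
    pos_weight    up-weights the rare positive (walkable) class
    """
    # Class-balanced BCE, evaluated only on the sparsely labeled pixels.
    cls_loss = F.binary_cross_entropy_with_logits(
        logits[label_mask], sparse_labels[label_mask],
        pos_weight=torch.tensor(pos_weight))

    # Contextual adversarial term: predicted maps should look like plausible
    # walkability layouts to the discriminator (non-saturating generator loss).
    adv_scores = discriminator(torch.sigmoid(logits).unsqueeze(1))
    adv_loss = F.binary_cross_entropy_with_logits(
        adv_scores, torch.ones_like(adv_scores))

    return cls_loss + adv_weight * adv_loss
```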
- Future Urban Scenes Generation Through Vehicles Synthesis [90.1731992199415]
We propose a deep learning pipeline to predict the visual future appearance of an urban scene.
We follow a two-stage approach in which interpretable information is included in the loop and each actor is modelled independently.
We show the superiority of this approach over traditional end-to-end scene-generation methods on CityFlow.
arXiv Detail & Related papers (2020-07-01T08:40:16Z)