StandardSim: A Synthetic Dataset For Retail Environments
- URL: http://arxiv.org/abs/2202.02418v1
- Date: Fri, 4 Feb 2022 22:28:35 GMT
- Title: StandardSim: A Synthetic Dataset For Retail Environments
- Authors: Cristina Mata, Nick Locascio, Mohammed Azeem Sheikh, Kenny Kihara and Dan Fischetti
- Abstract summary: We present a large-scale synthetic dataset featuring annotations for semantic segmentation, instance segmentation, depth estimation, and object detection.
Our dataset provides multiple views per scene, enabling multi-view representation learning.
We benchmark widely-used models for segmentation and depth estimation on our dataset and show that our test set constitutes a difficult benchmark compared to current smaller-scale datasets.
- Score: 0.07874708385247352
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Autonomous checkout systems rely on visual and sensory inputs to carry out
fine-grained scene understanding in retail environments. Retail environments
present unique challenges compared to typical indoor scenes owing to the vast
number of densely packed, unique yet similar objects. The problem becomes even
more difficult when only RGB input is available, especially for data-hungry
tasks such as instance segmentation. To address the lack of datasets for
retail, we present StandardSim, a large-scale photorealistic synthetic dataset
featuring annotations for semantic segmentation, instance segmentation, depth
estimation, and object detection. Our dataset provides multiple views per
scene, enabling multi-view representation learning. Further, we introduce a
novel task central to autonomous checkout called change detection, requiring
pixel-level classification of takes, puts and shifts in objects over time. We
benchmark widely-used models for segmentation and depth estimation on our
dataset and show that our test set constitutes a difficult benchmark compared to
current smaller-scale datasets and that our training set provides models with
crucial information for autonomous checkout tasks.
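As a rough illustration of what pixel-level change detection labels could look like, the sketch below scores a predicted change map against a reference map using per-class intersection over union. The integer label ids, the `no_change` background class, the function name `per_class_iou`, and the choice of IoU as the metric are all assumptions made for this example; the abstract names the take, put and shift classes but does not specify an encoding or an evaluation protocol.

```python
# Hypothetical label ids: the abstract names the classes, not their encoding.
NO_CHANGE, TAKE, PUT, SHIFT = 0, 1, 2, 3
CLASS_NAMES = {NO_CHANGE: "no_change", TAKE: "take", PUT: "put", SHIFT: "shift"}


def per_class_iou(pred, target):
    """Return {class_name: IoU} for two equally sized 2D label maps.

    Classes that appear in neither map are omitted from the result.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("pred and target must have the same shape")

    inter = {c: 0 for c in CLASS_NAMES}
    union = {c: 0 for c in CLASS_NAMES}
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Agreeing pixel: counts toward both intersection and union.
                inter[p] += 1
                union[p] += 1
            else:
                # Disagreeing pixel: grows the union of both classes involved.
                union[p] += 1
                union[t] += 1
    return {CLASS_NAMES[c]: inter[c] / union[c] for c in CLASS_NAMES if union[c] > 0}


# Toy 2x3 change maps (made-up values, not taken from the dataset).
example_target = [[0, 1, 1], [0, 2, 3]]
example_pred = [[0, 1, 0], [0, 2, 2]]
example_scores = per_class_iou(example_pred, example_target)
```

Averaging the returned values would give a mean IoU over the classes present, which is one common way to summarize segmentation-style predictions.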
Related papers
- Generalized Few-Shot Semantic Segmentation in Remote Sensing: Challenge and Benchmark [18.636210870172675]
Few-shot semantic segmentation can encourage deep learning models to learn from few labelled examples for novel classes not seen during training.
The generalized few-shot segmentation setting has an additional challenge which encourages models not only to adapt to the novel classes but also to maintain strong performance on the training base classes.
We release the dataset augmenting OpenEarthMap with additional classes labelled for the generalized few-shot evaluation setting.
arXiv Detail & Related papers (2024-09-17T14:20:47Z)
- BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation [57.40024206484446]
We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models.
BVS supports a large number of adjustable parameters at the scene level.
We showcase three example application scenarios.
arXiv Detail & Related papers (2024-05-15T17:57:56Z)
- Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- Labeling Indoor Scenes with Fusion of Out-of-the-Box Perception Models [4.157013247909771]
We propose to leverage the recent advancements in state-of-the-art models for bottom-up segmentation (SAM), object detection (Detic), and semantic segmentation (MaskFormer).
We aim to develop a cost-effective labeling approach to obtain pseudo-labels for semantic segmentation and object instance detection in indoor environments.
We demonstrate the effectiveness of the proposed approach on the Active Vision dataset and the ADE20K dataset.
arXiv Detail & Related papers (2023-11-17T21:58:26Z)
- A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented, Temporal and Depth-aware design [77.34726150561087]
We conduct a survey on the most relevant and recent advances in Deep Semantic Segmentation in the context of vision for autonomous vehicles.
Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
arXiv Detail & Related papers (2023-03-08T01:29:55Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation [93.17367076148348]
We investigate the performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset.
We unveil that a major cause is the inaccurate classification of object proposals.
We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
arXiv Detail & Related papers (2020-07-23T12:49:07Z)
- SVIRO: Synthetic Vehicle Interior Rear Seat Occupancy Dataset and Benchmark [11.101588888002045]
We release SVIRO, a synthetic dataset for sceneries in the passenger compartment of ten different vehicles.
We analyze machine learning-based approaches for their generalization capacities and reliability when trained on a limited number of variations.
arXiv Detail & Related papers (2020-01-10T14:44:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.