A Neuromorphic Dataset for Object Segmentation in Indoor Cluttered Environment
- URL: http://arxiv.org/abs/2302.06301v1
- Date: Mon, 13 Feb 2023 12:02:51 GMT
- Title: A Neuromorphic Dataset for Object Segmentation in Indoor Cluttered Environment
- Authors: Xiaoqian Huang, Kachole Sanket, Abdulla Ayyad, Fariborz Baghaei Naeini, Dimitrios Makris, Yahya Zweiri
- Abstract summary: This paper proposes a new Event-based ESD dataset for object segmentation in an indoor environment.
Our proposed dataset comprises 145 sequences with 14,166 RGB frames that are manually annotated with instance masks.
In total, 21.88 million and 20.80 million events were collected from two event-based cameras in a stereographic configuration.
- Score: 3.6047642906482142
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: By taking advantage of event-based cameras, the issues of motion blur, low
dynamic range and low temporal sampling rate of standard cameras can all be addressed.
However, there is a lack of event-based datasets dedicated to the benchmarking
of segmentation algorithms, especially those that provide depth information
which is critical for segmentation in occluded scenes. This paper proposes a
new Event-based Segmentation Dataset (ESD), a high-quality 3D spatial and
temporal dataset for object segmentation in an indoor cluttered environment.
Our proposed dataset ESD comprises 145 sequences with 14,166 RGB frames that
are manually annotated with instance masks. In total, 21.88 million and 20.80
million events were collected from the two event-based cameras in a
stereographic configuration, respectively. To the best of our knowledge, this
densely annotated and 3D spatial-temporal event-based segmentation benchmark of
tabletop objects is the first of its kind. By releasing ESD, we expect to
provide the community with a challenging, high-quality segmentation benchmark.
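Raw event-camera output of the kind ESD collects is a stream of (x, y, t, polarity) tuples rather than frames. As a rough illustration only (not code from the paper; the sensor resolution, array layout, and function name are assumptions), a chunk of such events can be accumulated into a signed per-pixel count image for visualization or alignment with the annotated RGB frames:

```python
import numpy as np

def events_to_count_frame(events, width=346, height=260):
    """Accumulate a chunk of events into a signed per-pixel count image.

    `events` is an (N, 4) array of (x, y, t, polarity) rows, a common raw
    format for event cameras. Positive-polarity events add +1 at their
    pixel, negative-polarity events add -1.
    """
    frame = np.zeros((height, width), dtype=np.int32)
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    pol = np.where(events[:, 3] > 0, 1, -1)
    # np.add.at handles repeated pixel indices correctly, so multiple
    # events at the same pixel within the window are all counted.
    np.add.at(frame, (y, x), pol)
    return frame
```

Plain fancy-indexed assignment (`frame[y, x] += pol`) would silently drop repeated events at the same pixel, which is why the unbuffered `np.add.at` is used here.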
Related papers
- Finding Meaning in Points: Weakly Supervised Semantic Segmentation for Event Cameras [45.063747874243276]
We present EV-WSSS: a novel weakly supervised approach for event-based semantic segmentation.
The proposed framework performs asymmetric dual-student learning between 1) the original forward event data and 2) the longer reversed event data.
We show that the proposed method achieves substantial segmentation results even without relying on pixel-level dense ground truths.
arXiv Detail & Related papers (2024-07-15T20:00:50Z)
- DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition [51.96660522869841]
DailyDVS-200 is a benchmark dataset tailored for the event-based action recognition community.
It covers 200 action categories across real-world scenarios, recorded by 47 participants, and comprises more than 22,000 event sequences.
DailyDVS-200 is annotated with 14 attributes, ensuring a detailed characterization of the recorded actions.
arXiv Detail & Related papers (2024-07-06T15:25:10Z)
- PACE: A Large-Scale Dataset with Pose Annotations in Cluttered Environments [50.79058028754952]
PACE (Pose Annotations in Cluttered Environments) is a large-scale benchmark for pose estimation methods in cluttered scenarios.
The benchmark consists of 55K frames with 258K annotations across 300 videos, covering 238 objects from 43 categories.
PACE-Sim contains 100K photo-realistic simulated frames with 2.4M annotations across 931 objects.
arXiv Detail & Related papers (2023-12-23T01:38:41Z)
- Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation.
The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects.
Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
arXiv Detail & Related papers (2023-06-07T17:57:45Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
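The grid/pillar representation described above can be sketched as a simple voxelization of the event stream in x-y-t with separate positive and negative polarity channels. This is a minimal illustration under assumed shapes and binning, not the authors' implementation:

```python
import numpy as np

def events_to_pillars(events, width, height, time_bins, t0, t1):
    """Voxelize events into an x-y-t grid with separate polarity channels.

    `events` is an (N, 4) array of (x, y, t, polarity) rows with
    timestamps in [t0, t1). Output shape is (2, time_bins, height, width):
    channel 0 counts positive events, channel 1 counts negative events.
    """
    grid = np.zeros((2, time_bins, height, width), dtype=np.float32)
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    # Map each timestamp to an integer temporal bin index.
    tb = ((events[:, 2] - t0) / (t1 - t0) * time_bins).astype(int)
    tb = np.clip(tb, 0, time_bins - 1)
    ch = np.where(events[:, 3] > 0, 0, 1)
    # Unbuffered add so coincident events in the same voxel accumulate.
    np.add.at(grid, (ch, tb, y, x), 1.0)
    return grid
```

Each (x, y) column of the resulting tensor is one "pillar" through the temporal bins, which downstream layers can then treat as a dense 3D tensor representation.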
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
- SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor Environments [67.34330257205525]
In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner.
We present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments.
arXiv Detail & Related papers (2022-12-22T17:59:48Z)
- EVIMO2: An Event Camera Dataset for Motion Segmentation, Optical Flow, Structure from Motion, and Visual Inertial Odometry in Indoor Scenes with Monocular or Stereo Algorithms [10.058432912712396]
The dataset consists of 41 minutes of data from three 640×480 event cameras and one 2080×1552 classical color camera.
The dataset's 173 sequences are arranged into three categories.
Some sequences were recorded in low-light conditions where conventional cameras fail.
arXiv Detail & Related papers (2022-05-06T20:09:18Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Learning to Detect Objects with a 1 Megapixel Event Camera [14.949946376335305]
Event cameras encode visual information with high temporal precision, low data-rate, and high-dynamic range.
Due to the novelty of the field, the performance of event-based systems on many vision tasks is still lower compared to conventional frame-based solutions.
arXiv Detail & Related papers (2020-09-28T16:03:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.