DIOR: Dataset for Indoor-Outdoor Reidentification -- Long Range 3D/2D
Skeleton Gait Collection Pipeline, Semi-Automated Gait Keypoint Labeling and
Baseline Evaluation Methods
- URL: http://arxiv.org/abs/2309.12429v1
- Date: Thu, 21 Sep 2023 18:51:00 GMT
- Title: DIOR: Dataset for Indoor-Outdoor Reidentification -- Long Range 3D/2D
Skeleton Gait Collection Pipeline, Semi-Automated Gait Keypoint Labeling and
Baseline Evaluation Methods
- Authors: Yuyang Chen, Praveen Raj Masilamani, Bhavin Jawade, Srirangaraj
Setlur, Karthik Dantu
- Abstract summary: This paper introduces DIOR -- a framework for data collection and semi-automated annotation -- and provides a dataset with 14 subjects and 1.649 million RGB frames with 3D/2D skeleton gait labels.
We successfully achieve precise skeleton labeling on far-away subjects, even when their height is limited to a mere 20-25 pixels within an RGB frame.
- Score: 8.265408202637857
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been growing interest recently in the identification and
re-identification of people at long distances, such as from rooftop cameras,
UAV cameras, street cams, and others. Such recognition needs to go beyond the
face and use whole-body markers such as gait. However, datasets to train and
test such recognition algorithms are not widely available, and fewer still are
labeled. This paper introduces DIOR -- a framework for data collection and
semi-automated annotation -- and provides a dataset with 14 subjects and 1.649
million RGB frames with 3D/2D skeleton gait labels, including 200 thousand
frames from a long-range camera. Our approach leverages advanced 3D computer
vision techniques to attain pixel-level accuracy in indoor settings with
motion capture systems. For outdoor long-range settings, we remove the
dependency on motion capture systems and adopt a hybrid 3D computer vision and
learning pipeline that uses only four low-cost RGB cameras, achieving precise
skeleton labeling on far-away subjects even when their height is limited to a
mere 20-25 pixels within an RGB frame. On publication, we will make our
pipeline open for others to use.
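The abstract describes the outdoor labeling pipeline only at a high level. As an illustrative sketch of the underlying geometry, 3D skeleton labels can be recovered from several calibrated RGB cameras by linear (DLT) triangulation of per-joint 2D keypoint detections; everything below, including the function names, is an assumption rather than the authors' released code.

```python
# Hypothetical sketch of multi-view keypoint triangulation, the classical
# core of an RGB-only 3D skeleton labeling pipeline like the one DIOR
# describes. Projection matrices come from an upstream calibration step and
# 2D keypoints from a per-view pose estimator; both are assumed here.
import numpy as np

def triangulate_point(proj_mats, points_2d):
    """Triangulate one 3D point from its 2D observations via linear DLT.

    proj_mats: list of 3x4 camera projection matrices, one per camera.
    points_2d: list of (u, v) pixel observations in the same order.
    """
    A = []
    for P, (u, v) in zip(proj_mats, points_2d):
        # Each view contributes two linear constraints on the homogeneous
        # 3D point X: u * (P[2] @ X) = P[0] @ X and v * (P[2] @ X) = P[1] @ X.
        A.append(u * P[2] - P[0])
        A.append(v * P[2] - P[1])
    # The least-squares solution is the right singular vector associated
    # with the smallest singular value of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.stack(A))
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize

def triangulate_skeleton(proj_mats, keypoints_2d):
    """keypoints_2d: array of shape (num_cameras, num_joints, 2)."""
    num_joints = keypoints_2d.shape[1]
    return np.stack([triangulate_point(proj_mats, keypoints_2d[:, j])
                     for j in range(num_joints)])
```

With four cameras each joint is constrained by eight linear equations, and the reprojection residual of the recovered point could serve to flag unreliable detections for manual review -- one plausible place for a semi-automated annotation step to intervene.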
Related papers
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns memory-efficient dense 3D geometry and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Digital Twin Tracking Dataset (DTTD): A New RGB+Depth 3D Dataset for Longer-Range Object Tracking Applications [3.9776693020673677]
Digital twinning is the problem of augmenting real objects with their digital counterparts.
A critical component in a good digital-twin system is real-time, accurate 3D object tracking.
In this work, we create a novel RGB-D dataset, called the Digital Twin Tracking Dataset (DTTD).
arXiv Detail & Related papers (2023-02-12T20:06:07Z)
- LidarGait: Benchmarking 3D Gait Recognition with Point Clouds [18.22238384814974]
This work explores precise 3D gait features from point clouds and proposes a simple yet efficient 3D gait recognition framework, termed LidarGait.
Our proposed approach projects sparse point clouds into depth maps to learn the representations with 3D geometry information.
Due to the lack of point cloud datasets, we built the first large-scale LiDAR-based gait recognition dataset, SUSTech1K.
arXiv Detail & Related papers (2022-11-19T06:23:08Z)
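The LidarGait entry above names its key operation, projecting sparse point clouds into depth maps, without giving details. Below is a minimal sketch of that idea under an assumed pinhole camera model; the intrinsics, resolution, and function name are illustrative, not the paper's code.

```python
# Illustrative sketch of the "sparse point cloud -> depth map" idea behind
# LidarGait; the paper's actual projection model may differ.
import numpy as np

def points_to_depth_map(points, fx, fy, cx, cy, height, width):
    """Project an (N, 3) point cloud in camera coordinates (z forward)
    into a depth image under a pinhole model."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    valid = z > 0  # keep points in front of the virtual camera
    u = np.round(fx * x[valid] / z[valid] + cx).astype(int)
    v = np.round(fy * y[valid] / z[valid] + cy).astype(int)
    z = z[valid]
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]
    depth = np.full((height, width), np.inf, dtype=np.float32)
    np.minimum.at(depth, (v, u), z)  # nearest point wins per pixel
    depth[np.isinf(depth)] = 0.0     # pixels with no LiDAR return
    return depth
```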
- Synthehicle: Multi-Vehicle Multi-Camera Tracking in Virtual Cities [4.4855664250147465]
We present a massive synthetic dataset for multiple vehicle tracking and segmentation in multiple overlapping and non-overlapping camera views.
The dataset consists of 17 hours of labeled video material, recorded from 340 cameras in 64 diverse day, rain, dawn, and night scenes.
arXiv Detail & Related papers (2022-08-30T11:36:07Z)
- Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task [48.555440807415664]
We present Rope3D, the first high-diversity, challenging roadside perception 3D dataset, captured from a novel viewpoint.
The dataset consists of 50k images and over 1.5M 3D objects in various scenes.
We propose to leverage geometry constraints to resolve the inherent ambiguities caused by varying sensors and viewpoints.
arXiv Detail & Related papers (2022-03-25T12:13:23Z)
- Multi-modal 3D Human Pose Estimation with 2D Weak Supervision in Autonomous Driving [74.74519047735916]
3D human pose estimation (HPE) in autonomous vehicles (AV) differs from other use cases in several important respects.
Data collected for other use cases (such as virtual reality, gaming, and animation) may not be usable for AV applications.
We propose one of the first approaches to alleviate this problem in the AV setting.
arXiv Detail & Related papers (2021-12-22T18:57:16Z)
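The summary above does not spell out how 2D weak supervision enters the 3D objective. One common formulation, sketched here purely for illustration, projects the predicted 3D joints into the image and penalizes their distance to annotated 2D keypoints; the intrinsic matrix and loss form are assumptions, not the paper's stated method.

```python
# Hedged sketch of a 2D reprojection loss for weakly supervised 3D human
# pose estimation; the paper's actual multi-modal losses are not given in
# the summary above.
import numpy as np

def reprojection_loss(joints_3d, joints_2d, K):
    """joints_3d: (J, 3) predicted joints in camera coordinates.
    joints_2d: (J, 2) annotated 2D keypoints in pixels.
    K: 3x3 camera intrinsic matrix."""
    proj = (K @ joints_3d.T).T          # homogeneous pixel coordinates
    proj = proj[:, :2] / proj[:, 2:3]   # perspective divide
    return float(np.mean(np.linalg.norm(proj - joints_2d, axis=-1)))
```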
- Pedestrian Detection in 3D Point Clouds using Deep Neural Networks [2.6763498831034034]
We propose a PointNet++ architecture to detect pedestrians in dense 3D point clouds.
The aim is to explore the potential contribution of geometric information alone in pedestrian detection systems.
We also present a semi-automatic labeling system that transfers pedestrian and non-pedestrian labels from RGB images onto the 3D domain.
arXiv Detail & Related papers (2021-05-03T20:12:11Z)
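The semi-automatic labeling system described above transfers labels from RGB images onto the 3D domain. One straightforward way to do this, sketched below under an assumed pinhole model with 2D boxes in pixel coordinates, is to mark every point that projects inside a pedestrian box; the paper's actual transfer procedure may differ.

```python
# Hypothetical sketch of 2D-to-3D label transfer: points whose projection
# falls inside a pedestrian bounding box inherit the "pedestrian" label.
import numpy as np

def transfer_labels(points, boxes, fx, fy, cx, cy):
    """points: (N, 3) in camera coordinates; boxes: iterable of
    (u0, v0, u1, v1) pedestrian boxes in pixels. Returns (N,) bool mask."""
    z = points[:, 2]
    in_front = z > 0
    safe_z = np.where(in_front, z, 1.0)  # avoid dividing by zero
    u = fx * points[:, 0] / safe_z + cx
    v = fy * points[:, 1] / safe_z + cy
    labels = np.zeros(len(points), dtype=bool)
    for u0, v0, u1, v1 in boxes:
        labels |= in_front & (u >= u0) & (u <= u1) & (v >= v0) & (v <= v1)
    return labels
```

Since background points behind a pedestrian also project into the box, a scheme like this would plausibly need the human-in-the-loop cleanup that makes the system "semi-automatic".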
- Robust 2D/3D Vehicle Parsing in CVIS [54.825777404511605]
We present a novel approach to robustly detect and perceive vehicles in different camera views as part of a cooperative vehicle-infrastructure system (CVIS).
Our formulation is designed for arbitrary camera views and makes no assumptions about intrinsic or extrinsic parameters.
In practice, our approach outperforms SOTA methods on 2D detection, instance segmentation, and 6-DoF pose estimation.
arXiv Detail & Related papers (2021-03-11T03:35:05Z)
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module -- adaptive zooming.
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
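Adaptive zooming, as summarized above, analyzes each detected object at a canonical resolution regardless of distance. A rough sketch of the crop-and-resize idea follows; the output size, interpolation, and the disparity rescaling note are assumptions rather than ZoomNet's exact design.

```python
# Illustrative sketch of adaptive zooming: crop the detected 2D box from a
# stereo image and resize it to a fixed resolution, so far-away objects are
# processed at the same pixel density as nearby ones.
import cv2  # using OpenCV here is an assumption; any resize routine works

def adaptive_zoom(image, box, out_size=(224, 224)):
    """image: HxWx3 array; box: (u0, v0, u1, v1) in pixel coordinates."""
    h, w = image.shape[:2]
    u0, v0, u1, v1 = (int(round(c)) for c in box)
    u0, v0 = max(u0, 0), max(v0, 0)        # clamp the box to the image
    u1, v1 = min(u1, w), min(v1, h)
    crop = image[v0:v1, u0:u1]
    scale = out_size[1] / max(v1 - v0, 1)  # zoom factor applied to the crop
    zoomed = cv2.resize(crop, out_size, interpolation=cv2.INTER_LINEAR)
    # Disparities measured on zoomed left/right crops would need to be
    # divided by the scale factor to map back to original image pixels.
    return zoomed, scale
```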
- JRMOT: A Real-Time 3D Multi-Object Tracker and a New Large-Scale Dataset [34.609125601292]
We present JRMOT, a novel 3D MOT system that integrates information from RGB images and 3D point clouds to achieve real-time tracking performance.
As part of our work, we release the JRDB dataset, a novel large-scale 2D+3D dataset and benchmark.
The presented 3D MOT system demonstrates state-of-the-art performance against competing methods on the popular 2D tracking KITTI benchmark.
arXiv Detail & Related papers (2020-02-19T19:21:33Z)
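The JRMOT summary does not detail how image and point-cloud cues are combined. A generic, purely illustrative data-association step that folds 2D and 3D distances into a single assignment cost (the weights and distance metrics are assumptions) might look like this:

```python
# Hedged sketch of the data-association step in a 2D+3D multi-object
# tracker: match tracks to detections with the Hungarian algorithm over a
# combined image-plane and world-space cost.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks_2d, tracks_3d, dets_2d, dets_3d, w2d=0.5, w3d=0.5):
    """tracks_2d: (T, 2) and dets_2d: (D, 2) image-plane centers;
    tracks_3d: (T, 3) and dets_3d: (D, 3) world-space centers.
    Returns matched (track_index, detection_index) pairs."""
    cost_2d = np.linalg.norm(tracks_2d[:, None] - dets_2d[None], axis=-1)
    cost_3d = np.linalg.norm(tracks_3d[:, None] - dets_3d[None], axis=-1)
    rows, cols = linear_sum_assignment(w2d * cost_2d + w3d * cost_3d)
    return list(zip(rows, cols))
```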