TBD Pedestrian Data Collection: Towards Rich, Portable, and Large-Scale
Natural Pedestrian Data
- URL: http://arxiv.org/abs/2309.17187v2
- Date: Sun, 3 Mar 2024 20:54:36 GMT
- Title: TBD Pedestrian Data Collection: Towards Rich, Portable, and Large-Scale
Natural Pedestrian Data
- Authors: Allan Wang, Daisuke Sato, Yasser Corzo, Sonya Simkin, Abhijat Biswas,
Aaron Steinfeld
- Abstract summary: Social navigation and pedestrian behavior research has shifted towards machine learning-based methods.
For this, large-scale datasets that contain rich information are needed.
We describe a portable data collection system, coupled with a semi-autonomous labeling pipeline.
- Score: 5.582962886199554
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Social navigation and pedestrian behavior research has shifted towards
machine learning-based methods and converged on the topic of modeling
inter-pedestrian interactions and pedestrian-robot interactions. For this,
large-scale datasets that contain rich information are needed. We describe a
portable data collection system, coupled with a semi-autonomous labeling
pipeline. As part of the pipeline, we designed a label correction web app that
facilitates human verification of automated pedestrian tracking outcomes. Our
system enables large-scale data collection in diverse environments and fast
trajectory label production. Compared with existing pedestrian data collection
methods, our system contains three components: a combination of top-down and
ego-centric views, natural human behavior in the presence of a socially
appropriate "robot", and human-verified labels grounded in the metric space. To
the best of our knowledge, no prior data collection system has a combination of
all three components. We further introduce our ever-expanding dataset from the
ongoing data collection effort, the TBD Pedestrian Dataset, and show that our
collected data is larger in scale, contains richer information when compared to
prior datasets with human-verified labels, and supports new research
opportunities.
Related papers
- trajdata: A Unified Interface to Multiple Human Trajectory Datasets [32.93180256927027]
We present trajdata, a unified interface to multiple human trajectory datasets.
Trajdata provides a simple, uniform, and efficient representation and API for trajectory and map data.
arXiv Detail & Related papers (2023-07-26T02:45:59Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- Deep Learning and Handheld Augmented Reality Based System for Optimal Data Collection in Fault Diagnostics Domain [0.0]
This paper presents a novel human-machine interaction framework to perform fault diagnostics with minimal data.
Minimizing the required data will increase the practicability of data-driven models in diagnosing faults.
The proposed framework has provided 100% precision and recall on a novel dataset with only one instance of each fault condition.
arXiv Detail & Related papers (2022-06-15T19:15:26Z)
- Towards Rich, Portable, and Large-Scale Pedestrian Data Collection [6.250018240133604]
We propose a data collection system that is portable, which facilitates accessible large-scale data collection in diverse environments.
We introduce the first batch of data from the ongoing data collection effort, the TBD pedestrian dataset.
Compared with existing pedestrian datasets, our dataset contains three components: human-verified labels grounded in the metric space, a combination of top-down and perspective views, and naturalistic human behavior in the presence of a socially appropriate "robot".
arXiv Detail & Related papers (2022-03-03T19:28:10Z)
- Addressing Data Scarcity in Multimodal User State Recognition by Combining Semi-Supervised and Supervised Learning [1.1688030627514532]
We present a multimodal machine learning approach for detecting dis-/agreement and confusion states in a human-robot interaction environment.
We achieve an average F1-score of 81.1% for dis-/agreement detection with a small amount of labeled data and a large unlabeled data set.
arXiv Detail & Related papers (2022-02-08T10:41:41Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- JRDB-Act: A Large-scale Multi-modal Dataset for Spatio-temporal Action, Social Group and Activity Detection [54.696819174421584]
We introduce JRDB-Act, a multi-modal dataset that reflects a real distribution of human daily life actions in a university campus environment.
JRDB-Act has been densely annotated with atomic actions and comprises over 2.8M action labels.
JRDB-Act comes with social group identification annotations conducive to the task of grouping individuals based on their interactions in the scene.
arXiv Detail & Related papers (2021-06-16T14:43:46Z)
- REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter [52.117388513480435]
We present a new dataset named REGRAD to sustain the modeling of relationships among objects and grasps.
Our dataset is collected in both forms of 2D images and 3D point clouds.
Users are free to import their own object models to generate as much data as they want.
arXiv Detail & Related papers (2021-04-29T05:31:21Z)
- Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria that quantify the interestingness of traffic scenes.
Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object (TAO) dataset consists of 2,907 high-resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and give names to them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)