OpenPack: A Large-scale Dataset for Recognizing Packaging Works in IoT-enabled Logistic Environments
- URL: http://arxiv.org/abs/2212.11152v2
- Date: Sat, 20 Apr 2024 15:15:13 GMT
- Title: OpenPack: A Large-scale Dataset for Recognizing Packaging Works in IoT-enabled Logistic Environments
- Authors: Naoya Yoshimura, Jaime Morales, Takuya Maekawa, Takahiro Hara
- Abstract summary: We introduce a new large-scale dataset for packaging work recognition called OpenPack.
OpenPack contains 53.8 hours of multimodal sensor data, including acceleration data, keypoints, depth images, and readings from IoT-enabled devices.
We apply state-of-the-art human activity recognition techniques to the dataset and suggest future directions for complex work activity recognition studies.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unlike human daily activities, existing publicly available sensor datasets for work activity recognition in industrial domains are limited by difficulties in collecting realistic data as close collaboration with industrial sites is required. This also limits research on and development of methods for industrial applications. To address these challenges and contribute to research on machine recognition of work activities in industrial domains, in this study, we introduce a new large-scale dataset for packaging work recognition called OpenPack. OpenPack contains 53.8 hours of multimodal sensor data, including acceleration data, keypoints, depth images, and readings from IoT-enabled devices (e.g., handheld barcode scanners), collected from 16 distinct subjects with different levels of packaging work experience. We apply state-of-the-art human activity recognition techniques to the dataset and provide future directions of complex work activity recognition studies in the pervasive computing community based on the results. We believe that OpenPack will contribute to the sensor-based action/activity recognition community by providing challenging tasks. The OpenPack dataset is available at https://open-pack.github.io.
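The recognition setup described in the abstract — classifying work operations from multimodal streams such as acceleration data — can be illustrated with a minimal sliding-window baseline. This is a hedged sketch, not the paper's actual pipeline: the sampling rate, window length, and feature set below are illustrative assumptions.

```python
import numpy as np

def sliding_windows(acc, window=150, stride=75):
    """Split a (T, 3) tri-axial acceleration stream into overlapping windows.

    window=150 would correspond to ~5 s at an assumed 30 Hz sampling rate;
    the dataset's documented rate may differ.
    """
    return np.stack([acc[i:i + window]
                     for i in range(0, len(acc) - window + 1, stride)])

def window_features(windows):
    """Per-axis mean and standard deviation -- a common HAR baseline feature set."""
    return np.concatenate([windows.mean(axis=1), windows.std(axis=1)], axis=1)

# Synthetic stand-in for one subject's accelerometer recording.
rng = np.random.default_rng(0)
acc = rng.normal(size=(3000, 3))   # (T, 3): x, y, z acceleration
w = sliding_windows(acc)           # (num_windows, 150, 3)
feats = window_features(w)         # (num_windows, 6)
```

In a full pipeline, each window would be paired with an operation label and the feature matrix fed to a classifier; state-of-the-art methods replace the hand-crafted features with learned representations.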
Related papers
- CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments [8.177157078744571]
This paper presents a pioneering and comprehensive real-world multi-robot collaborative perception dataset.
It features raw sensor inputs, pose estimation, and optional high-level perception annotation.
We believe this work will unlock the potential research of high-level scene understanding through multi-modal collaborative perception in multi-robot settings.
arXiv Detail & Related papers (2024-05-23T15:59:48Z)
- IPAD: Industrial Process Anomaly Detection Dataset [71.39058003212614]
Video anomaly detection (VAD) is a challenging task aiming to recognize anomalies in video frames.
We propose a new dataset, IPAD, specifically designed for VAD in industrial scenarios.
This dataset covers 16 different industrial devices and contains over 6 hours of both synthetic and real-world video footage.
arXiv Detail & Related papers (2024-04-23T13:38:01Z)
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z)
- DOO-RE: A dataset of ambient sensors in a meeting room for activity recognition [2.2939897247190886]
We build a dataset collected from a meeting room equipped with ambient sensors.
The dataset, DOO-RE, includes data streams from various ambient sensor types such as Sound and Projector.
To the best of our knowledge, DOO-RE is the first dataset to support the recognition of both single and group activities in a real meeting room with reliable annotations.
arXiv Detail & Related papers (2024-01-17T04:21:04Z)
- Capture the Flag: Uncovering Data Insights with Large Language Models [90.47038584812925]
This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data.
We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset.
arXiv Detail & Related papers (2023-12-21T14:20:06Z)
- Cross-Domain HAR: Few Shot Transfer Learning for Human Activity Recognition [0.2944538605197902]
We present an approach for economic use of publicly available labeled HAR datasets for effective transfer learning.
We introduce a novel transfer learning framework, Cross-Domain HAR, which follows the teacher-student self-training paradigm.
We demonstrate the effectiveness of our approach for practically relevant few shot activity recognition scenarios.
arXiv Detail & Related papers (2023-10-22T19:13:25Z)
- Towards Packaging Unit Detection for Automated Palletizing Tasks [5.235268087662475]
We propose an approach to this challenging problem that is fully trained on synthetically generated data.
The proposed approach is able to handle sparse and low quality sensor data.
We conduct an extensive evaluation on real-world data with a wide range of different retail products.
arXiv Detail & Related papers (2023-08-11T15:37:38Z)
- RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot [56.130215236125224]
A key challenge in robotic manipulation in open domains is how to acquire diverse and generalizable skills for robots.
Recent research in one-shot imitation learning has shown promise in transferring trained policies to new tasks based on demonstrations.
This paper aims to unlock the potential for an agent to generalize to hundreds of real-world skills with multi-modal perception.
arXiv Detail & Related papers (2023-07-02T15:33:31Z)
- Single-Modal Entropy based Active Learning for Visual Question Answering [75.1682163844354]
We address Active Learning in the multi-modal setting of Visual Question Answering (VQA).
In light of the multi-modal inputs, image and question, we propose a novel method for effective sample acquisition.
Our novel idea is simple to implement, cost-efficient, and readily adaptable to other multi-modal tasks.
arXiv Detail & Related papers (2021-10-21T05:38:45Z)
- Batch Exploration with Examples for Scalable Robotic Reinforcement Learning [63.552788688544254]
Batch Exploration with Examples (BEE) explores relevant regions of the state-space guided by a modest number of human provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
arXiv Detail & Related papers (2020-05-29T21:50:38Z)
- IMUTube: Automatic Extraction of Virtual on-body Accelerometry from Video for Human Activity Recognition [12.91206329972949]
We introduce IMUTube, an automated processing pipeline to convert videos of human activity into virtual streams of IMU data.
These virtual IMU streams represent accelerometry at a wide variety of locations on the human body.
We show how the virtually-generated IMU data improves the performance of a variety of models on known HAR datasets.
arXiv Detail & Related papers (2020-05-29T21:50:38Z)
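The virtual accelerometry idea in the IMUTube entry above can be sketched in its simplest form: approximating linear acceleration at a tracked body keypoint by taking second finite differences of its position over time. This is a deliberate simplification under stated assumptions (positions already lifted to 3D, smoothed, and in metres); the actual IMUTube pipeline additionally handles 3D lifting, sensor orientation, and calibration.

```python
import numpy as np

def virtual_acceleration(positions, fps=30.0):
    """Approximate linear acceleration at a tracked keypoint.

    positions: (T, 3) keypoint trajectory in metres (assumed already lifted
    to 3D and smoothed -- steps a full video-to-IMU pipeline must handle).
    Returns (T-2, 3): second finite differences scaled by the frame rate.
    """
    dt = 1.0 / fps
    return np.diff(positions, n=2, axis=0) / dt**2

# Sanity check: constant-velocity motion should yield ~zero acceleration.
t = np.linspace(0.0, 1.0, 31)[:, None]      # 31 frames over 1 s
traj = t * np.array([[1.0, 0.0, 0.0]])      # keypoint moving 1 m/s along x
acc = virtual_acceleration(traj)            # shape (29, 3), values near zero
```

Streams like `acc`, generated for keypoints at many body locations, are what such a pipeline would feed to models trained on conventional IMU-based HAR datasets.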
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.