Challenges of the Creation of a Dataset for Vision Based Human Hand
Action Recognition in Industrial Assembly
- URL: http://arxiv.org/abs/2303.03716v1
- Date: Tue, 7 Mar 2023 07:57:12 GMT
- Title: Challenges of the Creation of a Dataset for Vision Based Human Hand
Action Recognition in Industrial Assembly
- Authors: Fabian Sturm, Elke Hergenroether, Julian Reinhardt, Petar Smilevski
Vojnovikj, Melanie Siegel
- Abstract summary: This dataset consists of 12 classes with 459,180 images in the basic version and 2,295,900 images after spatial augmentation.
It has an above-average duration and meets the technical and legal requirements for industrial assembly lines.
The recorded ground truth assembly classes were selected after extensive observation of real-world use cases.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents the Industrial Hand Action Dataset V1, an industrial
assembly dataset consisting of 12 classes with 459,180 images in the basic
version and 2,295,900 images after spatial augmentation. Compared to other
freely available datasets tested, it has an above-average duration and, in
addition, meets the technical and legal requirements for industrial assembly
lines. Furthermore, the dataset contains occlusions, hand-object interaction,
and various fine-grained human hand actions for industrial assembly tasks that
were not found in combination in examined datasets. The recorded ground truth
assembly classes were selected after extensive observation of real-world use
cases. A Gated Transformer Network, a state-of-the-art model from the
transformer domain, was adapted. With 18,269,959 trainable parameters, it
reached a test accuracy of 86.25% before hyperparameter tuning, demonstrating
that it is possible to train sequential deep learning models with this dataset.
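The two image counts quoted in the abstract can be related with a short arithmetic check. This is only a sketch based on the figures above: the abstract does not say how the augmented set is composed (for example, whether the original images are counted in it), and the per-class figure is a plain mean that assumes nothing about how images are actually distributed across the 12 classes.

```python
# Figures quoted in the abstract.
BASIC_IMAGES = 459_180        # basic version of the dataset
AUGMENTED_IMAGES = 2_295_900  # after spatial augmentation
NUM_CLASSES = 12


def augmentation_factor(basic: int, augmented: int) -> float:
    """Ratio of augmented image count to basic image count."""
    return augmented / basic


factor = augmentation_factor(BASIC_IMAGES, AUGMENTED_IMAGES)

# Mean images per class in the basic version; the real class
# distribution is not given in the abstract and may be unbalanced.
mean_per_class = BASIC_IMAGES / NUM_CLASSES

print(factor)          # 5.0
print(mean_per_class)  # 38265.0
```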
Related papers
- The Collection of a Human Robot Collaboration Dataset for Cooperative Assembly in Glovebox Environments [2.30069810310356]
Industry 4.0 introduced AI as a transformative solution for modernizing manufacturing processes. Its successor, Industry 5.0, envisions humans as collaborators and experts guiding these AI-driven solutions.
New techniques require algorithms capable of safe, real-time identification of human positions in a scene, particularly their hands, during collaborative assembly.
This dataset provides 1200 challenging examples to build applications toward hand and glove segmentation in industrial human collaboration scenarios.
arXiv Detail & Related papers (2024-07-19T19:56:53Z)
- DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition [51.96660522869841]
DailyDVS-200 is a benchmark dataset tailored for the event-based action recognition community.
It covers 200 action categories across real-world scenarios, recorded by 47 participants, and comprises more than 22,000 event sequences.
DailyDVS-200 is annotated with 14 attributes, ensuring a detailed characterization of the recorded actions.
arXiv Detail & Related papers (2024-07-06T15:25:10Z)
- Supervised Anomaly Detection for Complex Industrial Images [4.890533180388991]
We present a novel real-world industrial dataset comprising 5000 images, including 2000 instances of challenging real defects.
We also introduce the Segmentation-based Anomaly Detector (SegAD).
SegAD uses anomaly maps as well as segmentation maps to compute local statistics.
SegAD achieves state-of-the-art performance on both VAD and the VisA dataset (+0.4% AUROC).
arXiv Detail & Related papers (2024-05-08T10:47:28Z)
- IPAD: Industrial Process Anomaly Detection Dataset [71.39058003212614]
Video anomaly detection (VAD) is a challenging task aiming to recognize anomalies in video frames.
We propose a new dataset, IPAD, specifically designed for VAD in industrial scenarios.
This dataset covers 16 different industrial devices and contains over 6 hours of both synthetic and real-world video footage.
arXiv Detail & Related papers (2024-04-23T13:38:01Z)
- Towards Sim-to-Real Industrial Parts Classification with Synthetic Dataset [6.481744951262474]
We introduce a synthetic dataset that may serve as a preliminary testbed for the Sim-to-Real challenge.
It contains 17 objects of six industrial use cases, including isolated and assembled parts.
All the sample images come with and without random backgrounds and post-processing for evaluating the importance of domain randomization.
arXiv Detail & Related papers (2024-04-12T19:04:59Z)
- Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amounts of unlabelled data.
In this paper, we revisit transformer pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
- Investigation of the Impact of Synthetic Training Data in the Industrial Application of Terminal Strip Object Detection [4.327763441385371]
In this paper, we investigate the sim-to-real generalization performance of standard object detectors on the complex industrial application of terminal strip object detection.
We manually annotated 300 real images of terminal strips for the evaluation. The results show that it is crucial for the objects of interest to have the same scale in both domains.
arXiv Detail & Related papers (2024-03-06T18:33:27Z)
- Analog and Multi-modal Manufacturing Datasets Acquired on the Future Factories Platform [0.0]
Two industry-grade datasets are presented in this paper.
They were collected at the Future Factories Lab at the University of South Carolina on December 11th and 12th of 2023.
arXiv Detail & Related papers (2024-01-28T02:26:58Z)
- Synthetic Data for Object Classification in Industrial Applications [53.180678723280145]
In object classification, capturing a large number of images per object and in different conditions is not always possible.
This work explores the creation of artificial images using a game engine to cope with limited data in the training dataset.
arXiv Detail & Related papers (2022-12-09T11:43:04Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains.
We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)