The Collection of a Human Robot Collaboration Dataset for Cooperative Assembly in Glovebox Environments
- URL: http://arxiv.org/abs/2407.14649v1
- Date: Fri, 19 Jul 2024 19:56:53 GMT
- Title: The Collection of a Human Robot Collaboration Dataset for Cooperative Assembly in Glovebox Environments
- Authors: Shivansh Sharma, Mathew Huang, Sanat Nair, Alan Wen, Christina Petlowany, Juston Moore, Selma Wanna, Mitch Pryor
- Abstract summary: Industry 4.0 introduced AI as a transformative solution for modernizing manufacturing processes. Its successor, Industry 5.0, envisions humans as collaborators and experts guiding these AI-driven solutions.
New techniques require algorithms capable of safe, real-time identification of human positions in a scene, particularly their hands, during collaborative assembly.
This dataset provides 1200 challenging examples to build applications toward hand and glove segmentation in industrial human-robot collaboration scenarios.
- Score: 2.30069810310356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Industry 4.0 introduced AI as a transformative solution for modernizing manufacturing processes. Its successor, Industry 5.0, envisions humans as collaborators and experts guiding these AI-driven manufacturing solutions. Developing these techniques necessitates algorithms capable of safe, real-time identification of human positions in a scene, particularly their hands, during collaborative assembly. Although substantial efforts have curated datasets for hand segmentation, most focus on residential or commercial domains. Existing datasets targeting industrial settings predominantly rely on synthetic data, which we demonstrate does not effectively transfer to real-world operations. Moreover, these datasets lack uncertainty estimations critical for safe collaboration. Addressing these gaps, we present HAGS: Hand and Glove Segmentation Dataset. This dataset provides 1200 challenging examples to build applications toward hand and glove segmentation in industrial human-robot collaboration scenarios as well as assess out-of-distribution images, constructed via green screen augmentations, to determine ML-classifier robustness. We study state-of-the-art, real-time segmentation models to evaluate existing methods. Our dataset and baselines are publicly available: https://dataverse.tdl.org/dataset.xhtml?persistentId=doi:10.18738/T8/85R7KQ and https://github.com/UTNuclearRoboticsPublic/assembly_glovebox_dataset.
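The out-of-distribution split described in the abstract is constructed from green-screen captures. As a rough, hypothetical sketch of that idea (not the authors' actual augmentation pipeline: the OpenCV-based chroma keying, HSV thresholds, and file names below are assumptions), one could composite a novel background behind a green-screen frame like this:

```python
import cv2
import numpy as np

# Hypothetical chroma-key compositing sketch for building out-of-distribution
# test images from green-screen captures. Thresholds and file names are
# illustrative only; the HAGS construction procedure may differ.

def composite_green_screen(frame_bgr: np.ndarray, background_bgr: np.ndarray,
                           lower_hsv=(35, 80, 80), upper_hsv=(85, 255, 255)) -> np.ndarray:
    """Replace green-screen pixels in frame_bgr with pixels from background_bgr."""
    h, w = frame_bgr.shape[:2]
    background = cv2.resize(background_bgr, (w, h))
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    green_mask = cv2.inRange(hsv, np.array(lower_hsv, dtype=np.uint8),
                             np.array(upper_hsv, dtype=np.uint8))
    green_mask = cv2.medianBlur(green_mask, 5)          # suppress speckle in the mask
    foreground_mask = cv2.bitwise_not(green_mask)
    fg = cv2.bitwise_and(frame_bgr, frame_bgr, mask=foreground_mask)
    bg = cv2.bitwise_and(background, background, mask=green_mask)
    return cv2.add(fg, bg)

if __name__ == "__main__":
    frame = cv2.imread("glovebox_frame.png")            # illustrative file names
    novel_bg = cv2.imread("novel_background.png")
    cv2.imwrite("ood_image.png", composite_green_screen(frame, novel_bg))
```

Segmentation models trained on in-distribution backgrounds can then be evaluated on such composites, which is the role the paper assigns to its green-screen-augmented images when probing classifier robustness.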
Related papers
- Language Supervised Human Action Recognition with Salient Fusion: Construction Worker Action Recognition as a Use Case [8.26451988845854]
We introduce a novel approach to Human Action Recognition (HAR) based on skeleton and visual cues.
We employ learnable prompts for the language model conditioned on the skeleton modality to optimize feature representation.
We introduce a new dataset tailored for real-world robotic applications in construction sites, featuring visual, skeleton, and depth data modalities.
arXiv Detail & Related papers (2024-10-02T19:10:23Z)
- Efficient Data Collection for Robotic Manipulation via Compositional Generalization [70.76782930312746]
We show that policies can compose environmental factors from their data to succeed when encountering unseen factor combinations.
We propose better in-domain data collection strategies that exploit composition.
We provide videos at http://iliad.stanford.edu/robot-data-comp/.
arXiv Detail & Related papers (2024-03-08T07:15:38Z)
- Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
arXiv Detail & Related papers (2023-11-10T18:38:14Z)
- Exploiting Multimodal Synthetic Data for Egocentric Human-Object Interaction Detection in an Industrial Scenario [14.188006024550257]
EgoISM-HOI is a new multimodal dataset composed of synthetic EHOI images in an industrial environment with rich annotations of hands and objects.
Our study shows that exploiting synthetic data to pre-train the proposed method significantly improves performance when tested on real-world data.
To support research in this field, we publicly release the datasets, source code, and pre-trained models at https://iplab.dmi.unict.it/egoism-hoi.
arXiv Detail & Related papers (2023-06-21T09:56:55Z)
- TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
- Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances [76.34037366117234]
We introduce a new dataset called Robot Control Gestures (RoCoG-v2).
The dataset is composed of both real and synthetic videos from seven gesture classes.
We present results using state-of-the-art action recognition and domain adaptation algorithms.
arXiv Detail & Related papers (2023-03-17T23:23:55Z)
- COVERED, CollabOratiVE Robot Environment Dataset for 3D Semantic Segmentation [39.64058995273062]
This work develops a new dataset specifically designed for this use case, named "COVERED".
We provide a benchmark of current state-of-the-art (SOTA) algorithms on the dataset and demonstrate real-time semantic segmentation of a collaborative robot workspace using a multi-LiDAR system.
Our perception pipeline achieves a prediction point accuracy of >96% and a mean intersection over union (mIoU) of >92% at a reported throughput of 8-20 Hz; a minimal mIoU computation is sketched after this list.
arXiv Detail & Related papers (2023-02-24T14:24:58Z)
- Towards Multi-User Activity Recognition through Facilitated Training Data and Deep Learning for Human-Robot Collaboration Applications [2.3274633659223545]
This study proposes an alternative way of gathering multi-user activity data: recording single users separately and merging the recordings in post-processing.
Data collected in this way can be used for pairwise HRC settings and yields performance similar to training data recorded from groups of users under the same settings.
arXiv Detail & Related papers (2023-02-11T19:27:07Z)
- Video-based Pose-Estimation Data as Source for Transfer Learning in Human Activity Recognition [71.91734471596433]
Human Activity Recognition (HAR) using on-body devices identifies specific human actions in unconstrained environments.
Previous works demonstrated that transfer learning is a good strategy for addressing scenarios with scarce data.
This paper proposes using datasets intended for human-pose estimation as a source for transfer learning.
arXiv Detail & Related papers (2022-12-02T18:19:36Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- PeopleSansPeople: A Synthetic Data Generator for Human-Centric Computer Vision [3.5694949627557846]
We release a human-centric synthetic data generator PeopleSansPeople.
It contains simulation-ready 3D human assets, a parameterized lighting and camera system, and generates 2D and 3D bounding box, instance and semantic segmentation, and COCO pose labels.
arXiv Detail & Related papers (2021-12-17T02:33:31Z)
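Several of the works above, including the COVERED benchmark and the segmentation baselines of the main paper, report mean intersection over union. Below is a minimal, self-contained sketch of computing per-class IoU and mIoU from integer label maps; the 3-class labeling (background, bare hand, gloved hand) is an illustrative assumption, not any dataset's actual label scheme.

```python
import numpy as np

# Illustrative mIoU computation (not the authors' evaluation code).
def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """pred and target are integer label maps of identical shape."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:            # class absent in both maps; skip it
            continue
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else 0.0

# Example with assumed class ids: 0 = background, 1 = bare hand, 2 = gloved hand.
pred = np.random.randint(0, 3, size=(256, 256))
gt = np.random.randint(0, 3, size=(256, 256))
print(f"mIoU: {mean_iou(pred, gt, num_classes=3):.3f}")
```

Classes absent from both prediction and ground truth are skipped so they do not skew the mean; the evaluation scripts shipped with the individual datasets may handle such cases differently.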