How Object Information Improves Skeleton-based Human Action Recognition
in Assembly Tasks
- URL: http://arxiv.org/abs/2306.05844v1
- Date: Fri, 9 Jun 2023 12:18:14 GMT
- Title: How Object Information Improves Skeleton-based Human Action Recognition
in Assembly Tasks
- Authors: Dustin Aganian, Mona Köhler, Sebastian Baake, Markus Eisenbach, and
Horst-Michael Gross
- Abstract summary: We present a novel approach of integrating object information into skeleton-based action recognition.
We enhance two state-of-the-art methods by treating object centers as further skeleton joints.
Our research sheds light on the benefits of combining skeleton joints with object information for human action recognition in assembly tasks.
- Score: 12.349172146831506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the use of collaborative robots (cobots) in industrial manufacturing
continues to grow, human action recognition for effective human-robot
collaboration becomes increasingly important. This ability is crucial for
cobots to act autonomously and assist in assembly tasks. Recently,
skeleton-based approaches have often been used, as they tend to generalize better to
different people and environments. However, when processing skeletons alone,
information about the objects a human interacts with is lost. Therefore, we
present a novel approach of integrating object information into skeleton-based
action recognition. We enhance two state-of-the-art methods by treating object
centers as further skeleton joints. Our experiments on the assembly dataset
IKEA ASM show that our approach improves the performance of these
state-of-the-art methods to a large extent when combining skeleton joints with
objects predicted by a state-of-the-art instance segmentation model. Our
research sheds light on the benefits of combining skeleton joints with object
information for human action recognition in assembly tasks. We analyze the
effect of the object detector on the combination for action classification and
discuss the important factors that must be taken into account.
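The core idea of the paper, treating detected object centers as additional skeleton joints, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the function name, array shapes, and the choice of 17 COCO body joints with 3 assembly objects are assumptions made for the example.

```python
import numpy as np

def augment_skeleton_with_objects(skeleton, object_centers):
    """Append per-frame 2D object centers as additional pseudo-joints.

    skeleton:       (T, J, 2) array of 2D joint coordinates over T frames.
    object_centers: (T, K, 2) array of centers of K tracked objects, e.g.
                    computed from the masks of an instance segmentation model;
                    frames with missing detections would need to be filled
                    (e.g. with zeros or the last known center).
    Returns:        (T, J + K, 2) array that any joint-based action
                    recognition model can consume like a larger skeleton.
    """
    return np.concatenate([skeleton, object_centers], axis=1)

# Illustrative usage: 17 body joints plus 3 object centers over 30 frames.
skeleton = np.random.rand(30, 17, 2)
object_centers = np.random.rand(30, 3, 2)
augmented = augment_skeleton_with_objects(skeleton, object_centers)
print(augmented.shape)  # (30, 20, 2)
```

In this formulation the downstream model needs no architectural change beyond accepting the larger joint count, which is why the approach transfers directly to existing skeleton-based methods.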
Related papers
- SkeleTR: Towards Skeleton-based Action Recognition in the Wild [86.03082891242698]
SkeleTR is a new framework for skeleton-based action recognition.
It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions.
It then uses stacked Transformer encoders to capture person interactions that are important for action recognition in general scenarios.
arXiv Detail & Related papers (2023-09-20T16:22:33Z) - HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features are more contributive to interaction, we propose a Human-Guide Linking method to make sure the interaction decoder focuses on the human-centric regions.
Our proposed method achieves competitive performance on both the V-COCO and HICO-Det datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z) - Compositional Learning in Transformer-Based Human-Object Interaction
Detection [6.630793383852106]
The long-tailed distribution of labeled instances is a primary challenge in HOI detection.
Inspired by the nature of HOI triplets, some existing approaches adopt the idea of compositional learning.
We propose a transformer-based framework for compositional HOI learning.
arXiv Detail & Related papers (2023-08-11T06:41:20Z) - InterTracker: Discovering and Tracking General Objects Interacting with
Hands in the Wild [40.489171608114574]
Existing methods rely on frame-based detectors to locate interacting objects.
We propose to leverage hand-object interaction to track interactive objects.
Our proposed method outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2023-08-06T09:09:17Z) - Fusing Hand and Body Skeletons for Human Action Recognition in Assembly [13.24875937437949]
We propose a method in which less detailed body skeletons are combined with highly detailed hand skeletons.
This paper demonstrates the effectiveness of our proposed approach in enhancing action recognition in assembly scenarios.
arXiv Detail & Related papers (2023-07-18T13:18:52Z) - Full-Body Articulated Human-Object Interaction [61.01135739641217]
CHAIRS is a large-scale motion-captured f-AHOI dataset consisting of 16.2 hours of versatile interactions.
CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process.
By learning the geometrical relationships in HOI, we devise the first model that leverages human pose estimation.
arXiv Detail & Related papers (2022-12-20T19:50:54Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based
Action Recognition [88.34182299496074]
Action labels are available only for the source dataset and are unavailable for the target dataset during training.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
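The permutation-based pretext task described above can be sketched as follows. This is a hedged illustration of the general technique, not the paper's implementation; the segment count of 3, the toy sequence, and the function name are assumptions made for the example.

```python
from itertools import permutations
import numpy as np

# All possible orderings of 3 temporal segments: 3! = 6 classes.
PERMS = list(permutations(range(3)))

def make_permutation_sample(sequence, rng):
    """Build one self-supervised sample from a skeleton sequence.

    sequence: (T, ...) array with T divisible by 3 (for simplicity).
    Returns the sequence with its three equal temporal segments shuffled,
    plus the index of the applied ordering. The pretext classification
    task is to predict this index, which requires no action labels.
    """
    segments = np.split(sequence, 3, axis=0)
    label = int(rng.integers(len(PERMS)))
    order = PERMS[label]
    shuffled = np.concatenate([segments[i] for i in order], axis=0)
    return shuffled, label

# Illustrative usage with a toy (T=30, 1-feature) sequence.
rng = np.random.default_rng(0)
seq = np.arange(30).reshape(30, 1)
shuffled, label = make_permutation_sample(seq, rng)
print(shuffled.shape, label)
```

Training a classifier on such samples from both datasets encourages features that capture temporal structure rather than dataset-specific appearance, which is the intuition behind using it to reduce domain shift.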
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - Skeleton-Based Mutually Assisted Interacted Object Localization and
Human Action Recognition [111.87412719773889]
We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data.
Our method achieves the best or competitive performance with the state-of-the-art methods for human action recognition.
arXiv Detail & Related papers (2021-10-28T10:09:34Z) - Human-Robot Collaboration and Machine Learning: A Systematic Review of
Recent Research [69.48907856390834]
Human-robot collaboration (HRC) explores the interaction between humans and robots.
This paper proposes a thorough literature review of the use of machine learning techniques in the context of HRC.
arXiv Detail & Related papers (2021-10-14T15:14:33Z) - Simultaneous Learning from Human Pose and Object Cues for Real-Time
Activity Recognition [11.290467061493189]
We propose a novel approach to real-time human activity recognition, through simultaneously learning from observations of both human poses and objects involved in the human activity.
Our method outperforms previous methods and obtains real-time performance for human activity recognition with a processing speed of 104 Hz.
arXiv Detail & Related papers (2020-03-26T22:04:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.