Fine-grained Human Activity Recognition Using Virtual On-body
Acceleration Data
- URL: http://arxiv.org/abs/2211.01342v1
- Date: Wed, 2 Nov 2022 17:51:56 GMT
- Title: Fine-grained Human Activity Recognition Using Virtual On-body
Acceleration Data
- Authors: Zikang Leng, Yash Jain, Hyeokhyen Kwon, Thomas Plötz
- Abstract summary: IMUTube was originally designed to cover activities based on substantial body (part) movements.
This paper introduces a measure to quantitatively assess the subtlety of the human movements that underlie activities of interest.
We then perform a "stress-test" on IMUTube and explore for which activities with underlying subtle movements a cross-modality transfer approach works, and for which it does not.
- Score: 1.64882269584049
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work has demonstrated that virtual accelerometry data, extracted
from videos using cross-modality transfer approaches like IMUTube, is
beneficial for training complex and effective human activity recognition (HAR)
models. Systems like IMUTube were originally designed to cover activities that
are based on substantial body (part) movements. Yet, life is complex, and a
range of activities of daily living is based on only rather subtle movements,
which raises the question of the extent to which systems like IMUTube are also
of value for fine-grained HAR, i.e., when does IMUTube break? In this work we
first introduce a measure to quantitatively assess the subtlety of the human
movements that underlie activities of interest, the motion subtlety index
(MSI), which captures local pixel movements and pose changes in the vicinity
of target virtual sensor locations, and correlate it with the eventual
activity recognition accuracy. We then perform a "stress-test" on IMUTube and
explore for which activities with underlying subtle movements a cross-modality
transfer approach works, and for which it does not. As such, the work
presented in this paper
allows us to map out the landscape for IMUTube applications in practical
scenarios.
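The abstract does not spell out how the MSI is computed; purely as an illustrative, hedged sketch (not the authors' implementation), the subtlety of motion around a virtual sensor location could be scored from dense optical flow and then correlated with per-activity recognition accuracy. The function name, parameters, and the choice of Farneback flow below are assumptions:

    # Illustrative sketch only, not the paper's MSI: score local pixel motion
    # around a virtual sensor location and correlate it with accuracy.
    import cv2
    import numpy as np
    from scipy.stats import pearsonr

    def local_motion_score(frames, sensor_xy, window=32):
        """Mean dense optical-flow magnitude in a square window around a
        (hypothetical) virtual sensor pixel location, averaged over a clip."""
        x, y = sensor_xy
        prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
        magnitudes = []
        for frame in frames[1:]:
            cur = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            flow = cv2.calcOpticalFlowFarneback(prev, cur, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            y0, x0 = max(0, y - window), max(0, x - window)
            patch = flow[y0:y + window, x0:x + window]
            magnitudes.append(np.linalg.norm(patch, axis=-1).mean())
            prev = cur
        return float(np.mean(magnitudes))

    # Given one subtlety score and one recognition accuracy per activity class,
    # their relationship can be summarised with a correlation coefficient:
    # r, p = pearsonr(per_activity_scores, per_activity_accuracies)

A low score under such a measure would flag activities whose movements are subtle at the chosen sensor location, which is where a cross-modality transfer pipeline is most likely to break.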
Related papers
- ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning [59.64325421657381]
Humanoid whole-body loco-manipulation promises transformative capabilities for daily service and warehouse tasks.
We introduce ResMimic, a two-stage residual learning framework for precise and expressive humanoid control from human motion data.
Results show substantial gains in task success, training efficiency, and robustness over strong baselines.
arXiv Detail & Related papers (2025-10-06T17:47:02Z) - Recognizing Actions from Robotic View for Natural Human-Robot Interaction [52.00935005918032]
Natural Human-Robot Interaction (N-HRI) requires robots to recognize human actions at varying distances and states, regardless of whether the robot itself is in motion or stationary.
Existing benchmarks fail to address the unique complexities of N-HRI due to limited data, modalities, task categories, and diversity of subjects and environments.
We introduce a large-scale dataset (Action from Robotic View) for the perception-centric robotic views prevalent in mobile service robots.
arXiv Detail & Related papers (2025-07-30T09:48:34Z) - CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning [47.195002937893115]
CoMo aims to learn more informative continuous motion representations from diverse, internet-scale videos.
We introduce two new metrics for more robustly and affordably evaluating motion and for guiding motion learning methods.
CoMo exhibits strong zero-shot generalization, enabling it to generate continuous pseudo actions for previously unseen video domains.
arXiv Detail & Related papers (2025-05-22T17:58:27Z) - Watch Less, Feel More: Sim-to-Real RL for Generalizable Articulated Object Manipulation via Motion Adaptation and Impedance Control [7.986465090160508]
We present a novel RL-based pipeline equipped with variable impedance control and motion adaptation.
Our pipeline focuses on smooth and dexterous motion during zero-shot sim-to-real transfer.
To the best of our knowledge, our policy is the first to report an 84% success rate in the real world.
arXiv Detail & Related papers (2025-02-20T11:18:35Z) - Motion Generation Review: Exploring Deep Learning for Lifelike Animation with Manifold [4.853986914715961]
Human motion generation involves creating natural sequences of human body poses, widely used in gaming, virtual reality, and human-computer interaction.
Previous work has focused on motion generation based on signals like movement, music, text, or scene background.
Manifold learning offers a solution by reducing data dimensionality and capturing subspaces of effective motion.
arXiv Detail & Related papers (2024-12-12T08:27:15Z) - Cross-Domain HAR: Few Shot Transfer Learning for Human Activity
Recognition [0.2944538605197902]
We present an approach for economic use of publicly available labeled HAR datasets for effective transfer learning.
We introduce a novel transfer learning framework, Cross-Domain HAR, which follows the teacher-student self-training paradigm.
We demonstrate the effectiveness of our approach for practically relevant few shot activity recognition scenarios.
arXiv Detail & Related papers (2023-10-22T19:13:25Z) - Universal Humanoid Motion Representations for Physics-Based Control [71.46142106079292]
We present a universal motion representation that encompasses a comprehensive range of motor skills for physics-based humanoid control.
We first learn a motion imitator that can imitate all of human motion from a large, unstructured motion dataset.
We then create our motion representation by distilling skills directly from the imitator.
arXiv Detail & Related papers (2023-10-06T20:48:43Z) - milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing [11.541217629396373]
mmWave radars have gained popularity due to their privacy-friendly features.
We propose milliFlow, a novel deep learning approach to estimate scene flow as complementary motion information for mmWave point cloud.
arXiv Detail & Related papers (2023-06-29T15:06:21Z) - Event-Free Moving Object Segmentation from Moving Ego Vehicle [88.33470650615162]
Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving.
Most segmentation methods leverage motion cues obtained from optical flow maps.
We propose to exploit event cameras, which provide rich motion cues without relying on optical flow, for better video understanding.
arXiv Detail & Related papers (2023-04-28T23:43:10Z) - Differentiable Frequency-based Disentanglement for Aerial Video Action
Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z) - Transformer Inertial Poser: Attention-based Real-time Human Motion
Reconstruction from Sparse IMUs [79.72586714047199]
We propose an attention-based deep learning method to reconstruct full-body motion from six IMU sensors in real-time.
Our method achieves new state-of-the-art results both quantitatively and qualitatively, while being simple to implement and smaller in size.
arXiv Detail & Related papers (2022-03-29T16:24:52Z) - Contrastive Predictive Coding for Human Activity Recognition [5.766384728949437]
We introduce the Contrastive Predictive Coding framework to human activity recognition, which captures the long-term temporal structure of sensor data streams.
CPC-based pre-training is self-supervised, and the resulting learned representations can be integrated into standard activity recognition chains.
It leads to significantly improved recognition performance when only small amounts of labeled training data are available.
arXiv Detail & Related papers (2020-12-09T21:44:36Z) - Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial
Observability in Visual Navigation [62.22058066456076]
Reinforcement Learning (RL) represents a powerful tool for solving complex robotic tasks.
However, RL policies trained in simulation often do not work directly in the real world, which is known as the sim-to-real transfer problem.
We propose a method that learns on an observation space constructed by point clouds and environment randomization.
arXiv Detail & Related papers (2020-07-27T17:46:59Z) - IMUTube: Automatic Extraction of Virtual on-body Accelerometry from
Video for Human Activity Recognition [12.91206329972949]
We introduce IMUTube, an automated processing pipeline to convert videos of human activity into virtual streams of IMU data.
These virtual IMU streams represent accelerometry at a wide variety of locations on the human body; see the sketch after this list for the basic idea of deriving virtual acceleration from pose trajectories.
We show how the virtually-generated IMU data improves the performance of a variety of models on known HAR datasets.
arXiv Detail & Related papers (2020-05-29T21:50:38Z) - ZSTAD: Zero-Shot Temporal Activity Detection [107.63759089583382]
We propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected.
We design an end-to-end deep network based on R-C3D as the architecture for this solution.
Experiments on both the THUMOS14 and the Charades datasets show promising performance in terms of detecting unseen activities.
arXiv Detail & Related papers (2020-03-12T02:40:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.