Progressive Cross-modal Knowledge Distillation for Human Action
Recognition
- URL: http://arxiv.org/abs/2208.08090v1
- Date: Wed, 17 Aug 2022 06:06:03 GMT
- Title: Progressive Cross-modal Knowledge Distillation for Human Action
Recognition
- Authors: Jianyuan Ni, Anne H.H. Ngu, Yan Yan
- Abstract summary: We propose a novel Progressive Skeleton-to-sensor Knowledge Distillation (PSKD) model for solving the wearable sensor-based HAR problem.
Specifically, we construct multiple teacher models using data from both teacher (human skeleton sequence) and student (time-series accelerometer data) modalities.
- Score: 10.269019492921306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wearable sensor-based Human Action Recognition (HAR) has achieved remarkable
success recently. However, the accuracy of wearable sensor-based HAR still
lags far behind that of systems based on visual modalities (i.e., RGB video,
skeleton, and depth). Diverse input modalities can provide complementary cues
and thus improve HAR accuracy, but how to take advantage of multi-modal data
for wearable sensor-based HAR has rarely been explored. Currently, wearable
devices such as smartwatches can capture only a limited range of non-visual
modality data. This hinders multi-modal HAR, since such devices cannot use
visual and non-visual modality data simultaneously. Another major challenge
is how to efficiently utilize multi-modal data on wearable devices given
their limited computational resources.
In this work, we propose a novel Progressive Skeleton-to-sensor Knowledge
Distillation (PSKD) model which utilizes only time-series data, i.e.,
accelerometer data, from a smartwatch for solving the wearable sensor-based HAR
problem. Specifically, we construct multiple teacher models using data from
both teacher (human skeleton sequence) and student (time-series accelerometer
data) modalities. In addition, we propose an effective progressive learning
scheme to eliminate the performance gap between teacher and student models. We
also design a novel loss function, called Adaptive-Confidence Semantic (ACS)
loss, which allows the student model to adaptively select either one of the
teacher models or the ground-truth label to mimic. To demonstrate the
effectiveness of our proposed PSKD method, we conduct extensive experiments on
Berkeley-MHAD, UTD-MHAD, and MMAct datasets. The results confirm that the
proposed PSKD method achieves competitive performance compared to previous
mono-sensor-based HAR methods.
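The abstract does not give the ACS formulation in detail. As a rough sketch only, the following PyTorch function implements one plausible adaptive-confidence rule: each sample mimics whichever teacher is most confident on the true class, and falls back to the plain cross-entropy target when no teacher is confident. The function name, confidence heuristic, and threshold are assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def acs_style_loss(student_logits, teacher_logits_list, labels,
                   tau=2.0, conf_thresh=0.5):
    """Hypothetical adaptive-confidence distillation loss (a sketch of the
    idea behind ACS, not the paper's exact formulation)."""
    log_p_s = F.log_softmax(student_logits / tau, dim=1)

    kd_terms, confs = [], []
    for t_logits in teacher_logits_list:
        p_t = F.softmax(t_logits / tau, dim=1)
        # Teacher confidence on the true class, per sample.
        confs.append(p_t.gather(1, labels.unsqueeze(1)).squeeze(1))
        # Per-sample KL(teacher || student), scaled by tau^2 as is standard.
        kd_terms.append(F.kl_div(log_p_s, p_t, reduction="none").sum(dim=1) * tau ** 2)

    conf = torch.stack(confs, dim=1)   # (batch, num_teachers)
    kd = torch.stack(kd_terms, dim=1)
    best_conf, best_idx = conf.max(dim=1)
    kd_best = kd.gather(1, best_idx.unsqueeze(1)).squeeze(1)

    # Mimic the best teacher when it is confident enough; otherwise
    # fall back to the ground-truth label.
    ce = F.cross_entropy(student_logits, labels, reduction="none")
    use_teacher = (best_conf >= conf_thresh).float()
    return (use_teacher * kd_best + (1.0 - use_teacher) * ce).mean()
```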
Related papers
- Multi Teacher Privileged Knowledge Distillation for Multimodal Expression Recognition [58.41784639847413]
Human emotion is a complex phenomenon conveyed and perceived through facial expressions, vocal tones, body language, and physiological signals.
In this paper, a multi-teacher PKD (MT-PKDOT) method with self-distillation is introduced to align diverse teacher representations before distilling them to the student.
Results indicate that our proposed method can outperform SOTA PKD methods.
arXiv Detail & Related papers (2024-08-16T22:11:01Z)
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory [64.11870454160614]
We propose an efficient Adaptive HOI Detector with Concept-guided Memory (ADA-CM)
ADA-CM has two operating modes. The first mode makes it tunable without learning new parameters in a training-free paradigm.
Our proposed method achieves competitive results with state-of-the-art on the HICO-DET and V-COCO datasets with much less training time.
arXiv Detail & Related papers (2023-09-07T13:10:06Z)
- Human Activity Recognition Using Self-Supervised Representations of Wearable Data [0.0]
Development of accurate algorithms for human activity recognition (HAR) is hindered by the lack of large real-world labeled datasets.
Here we develop a 6-class HAR model with strong performance when evaluated on real-world datasets not seen during training.
arXiv Detail & Related papers (2023-04-26T07:33:54Z)
- Multi-Stage Based Feature Fusion of Multi-Modal Data for Human Activity Recognition [6.0306313759213275]
We propose a multi-modal framework that learns to effectively combine features from RGB Video and IMU sensors.
Our model is trained in two stages: in the first stage, each input encoder learns to effectively extract features.
We show significant improvements of 22% and 11% compared to video-only baselines, and of 20% and 12% on the MMAct dataset.
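As a loose illustration of a two-stage fusion scheme of this kind (a generic skeleton; the encoders, dimensions, and names are placeholders, not the paper's architecture):

```python
import torch
import torch.nn as nn

class TwoStageFusion(nn.Module):
    """Generic two-stage fusion skeleton; all sizes are placeholders."""

    def __init__(self, video_encoder, imu_encoder, feat_dim=256, num_classes=27):
        super().__init__()
        self.video_encoder = video_encoder  # any module: video -> (batch, feat_dim)
        self.imu_encoder = imu_encoder      # any module: IMU seq -> (batch, feat_dim)
        self.fusion_head = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, num_classes),
        )

    def forward(self, video, imu):
        f = torch.cat([self.video_encoder(video), self.imu_encoder(imu)], dim=1)
        return self.fusion_head(f)

def freeze_encoders(model: TwoStageFusion):
    # Stage 1 trains each encoder separately (e.g., with its own classifier);
    # stage 2 freezes them and trains only the fusion head.
    for enc in (model.video_encoder, model.imu_encoder):
        for p in enc.parameters():
            p.requires_grad = False
```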
arXiv Detail & Related papers (2022-11-08T15:48:44Z)
- DynImp: Dynamic Imputation for Wearable Sensing Data Through Sensory and Temporal Relatedness [78.98998551326812]
We argue that traditional methods have rarely made use of both the time-series dynamics of the data and the relatedness of features from different sensors.
We propose a model, termed DynImp, to handle missingness at different time points using nearest neighbors along the feature axis.
We show that the method can exploit multi-modality features from related sensors and also learn from historical time-series dynamics to reconstruct data under extreme missingness.
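A minimal NumPy sketch of neighbor-based imputation in this spirit (a generic illustration of filling gaps from nearest time steps; DynImp itself is a learned model and differs in detail):

```python
import numpy as np

def knn_impute_timesteps(x, mask, k=3):
    """Fill missing entries of a (time, features) array from its k nearest
    fully observed time steps, compared on the jointly observed features."""
    x = x.copy()
    complete = np.where(mask.all(axis=1))[0]   # fully observed rows
    if complete.size == 0:
        return x                               # nothing to anchor on here
    for t in np.where(~mask.all(axis=1))[0]:   # rows with any gap
        obs = mask[t]                          # features present at step t
        # Distance to each complete row over the observed features only.
        d = np.linalg.norm(x[complete][:, obs] - x[t, obs], axis=1)
        nn = complete[np.argsort(d)[:k]]
        x[t, ~obs] = x[nn][:, ~obs].mean(axis=0)  # average the neighbors
    return x
```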
arXiv Detail & Related papers (2022-09-26T21:59:14Z)
- Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation [66.25738680429463]
Knowledge Distillation (KD) for object detection aims to train a compact detector by transferring knowledge from a teacher model.
We propose inconsistent knowledge distillation (IKD) which aims to distill knowledge inherent in the teacher model's counter-intuitive perceptions.
Our method outperforms state-of-the-art KD baselines on one-stage, two-stage and anchor-free object detectors.
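One speculative reading of this idea, sketched below for classification for simplicity: up-weight the distillation term on samples where the teacher disagrees with the ground truth, so the student also learns from the teacher's counter-intuitive outputs. This is an assumption for illustration, not the IKD paper's actual detection method.

```python
import torch
import torch.nn.functional as F

def inconsistency_weighted_kd(student_logits, teacher_logits, labels, tau=2.0):
    """Illustrative only: emphasize samples where teacher and label disagree."""
    p_t = F.softmax(teacher_logits / tau, dim=1)
    log_p_s = F.log_softmax(student_logits / tau, dim=1)
    kd = F.kl_div(log_p_s, p_t, reduction="none").sum(dim=1) * tau ** 2
    # Double the weight on 'inconsistent' samples (teacher != ground truth).
    disagree = (teacher_logits.argmax(dim=1) != labels).float()
    return ((1.0 + disagree) * kd).mean()
```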
arXiv Detail & Related papers (2022-09-20T16:36:28Z)
- Deep Transfer Learning with Graph Neural Network for Sensor-Based Human Activity Recognition [12.51766929898714]
We devise a graph-inspired deep learning approach to sensor-based HAR.
We present a graph convolutional neural network with a multi-layer residual structure (ResGCNN) for these tasks.
Experimental results on the PAMAP2 and mHealth datasets demonstrate that ResGCNN is effective at capturing the characteristics of actions.
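A minimal residual graph-convolution block, as a generic sketch of the pattern (not the ResGCNN paper's exact architecture):

```python
import torch
import torch.nn as nn

class ResGCNLayer(nn.Module):
    """One residual graph-convolution block over sensor-channel nodes."""

    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, adj):
        # x: (nodes, dim) per-channel features; adj: (nodes, nodes) normalized
        # adjacency encoding which sensor channels are related.
        h = self.linear(adj @ x)
        return self.norm(torch.relu(h) + x)  # residual connection
```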
arXiv Detail & Related papers (2022-03-14T07:57:32Z)
- Cross-modal Knowledge Distillation for Vision-to-Sensor Action Recognition [12.682984063354748]
This study introduces an end-to-end Vision-to-Sensor Knowledge Distillation (VSKD) framework.
In this VSKD framework, only time-series data, i.e., accelerometer data, is needed from wearable devices during the testing phase.
This framework not only reduces the computational demands on edge devices, but also produces a learning model that closely matches the performance of the computationally expensive multi-modal approach.
arXiv Detail & Related papers (2021-10-08T15:06:38Z)
- Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
- A Deep Learning Method for Complex Human Activity Recognition Using Virtual Wearable Sensors [22.923108537119685]
Sensor-based human activity recognition (HAR) is now a research hotspot in multiple application areas.
We propose a novel deep learning-based method for complex HAR in real-world scenes.
The proposed method can surprisingly converge in a few iterations and achieve an accuracy of 91.15% on a real IMU dataset.
arXiv Detail & Related papers (2020-03-04T03:31:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.