SelfHAR: Improving Human Activity Recognition through Self-training with
Unlabeled Data
- URL: http://arxiv.org/abs/2102.06073v1
- Date: Thu, 11 Feb 2021 15:40:35 GMT
- Title: SelfHAR: Improving Human Activity Recognition through Self-training with
Unlabeled Data
- Authors: Chi Ian Tang, Ignacio Perez-Pozuelo, Dimitris Spathis, Soren Brage,
Nick Wareham and Cecilia Mascolo
- Abstract summary: SelfHAR is a semi-supervised model that learns to leverage unlabeled datasets to complement small labeled datasets.
Our approach combines teacher-student self-training, which distills the knowledge of unlabeled and labeled datasets.
SelfHAR is data-efficient, reaching similar performance using up to 10 times less labeled data compared to supervised approaches.
- Score: 9.270269467155547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning and deep learning have shown great promise in mobile sensing
applications, including Human Activity Recognition. However, the performance of
such models in real-world settings largely depends on the availability of large
datasets that capture diverse behaviors. Recently, studies in computer vision
and natural language processing have shown that leveraging massive amounts of
unlabeled data enables performance on par with state-of-the-art supervised
models.
In this work, we present SelfHAR, a semi-supervised model that effectively
learns to leverage unlabeled mobile sensing datasets to complement small
labeled datasets. Our approach combines teacher-student self-training, which
distills the knowledge of unlabeled and labeled datasets while allowing for
data augmentation, and multi-task self-supervision, which learns robust
signal-level representations by predicting distorted versions of the input.
We evaluated SelfHAR on various HAR datasets and showed state-of-the-art
performance over supervised and previous semi-supervised approaches, with up to
a 12% increase in F1 score using the same number of model parameters at
inference. Furthermore, SelfHAR is data-efficient, reaching similar performance
using up to 10 times less labeled data compared to supervised approaches. Our
work not only achieves state-of-the-art performance in a diverse set of HAR
datasets, but also sheds light on how pre-training tasks may affect downstream
performance.
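The two ingredients described above, teacher-student self-training with pseudo-label filtering and augmentation, plus self-supervision that predicts which distortion was applied to a signal, can be sketched roughly as follows. This is an illustrative sketch using NumPy arrays of sensor windows and scikit-learn-style model objects; the function names, augmentations, and confidence threshold are assumptions, not the paper's exact configuration.

```python
import numpy as np

def jitter(x, sigma=0.05):
    """Add Gaussian noise: one of the signal distortions the model can be
    trained to recognize in the self-supervision stage."""
    return x + np.random.normal(0.0, sigma, x.shape)

def scale(x, sigma=0.1):
    """Randomly rescale the signal magnitude per window."""
    return x * np.random.normal(1.0, sigma, (x.shape[0], 1, 1))

def make_self_supervised_batch(windows):
    """Build (input, transformation-label) pairs: the pretext task is to
    predict which distortion, if any, was applied to each window."""
    inputs, labels = [], []
    for w in windows:
        for label, transform in enumerate([lambda v: v, jitter, scale]):
            inputs.append(transform(w[None, ...])[0])
            labels.append(label)
    return np.stack(inputs), np.array(labels)

def self_train(teacher, student, labeled_x, labeled_y, unlabeled_x,
               confidence=0.5):
    """Teacher-student self-training: the teacher (trained on the small
    labeled set) pseudo-labels the unlabeled set; confident samples are
    augmented and merged into the student's training data."""
    teacher.fit(labeled_x, labeled_y)
    probs = teacher.predict_proba(unlabeled_x)
    keep = probs.max(axis=1) >= confidence      # drop low-confidence samples
    pseudo_x = jitter(unlabeled_x[keep])        # augment the kept samples
    pseudo_y = probs[keep].argmax(axis=1)
    student.fit(np.concatenate([labeled_x, pseudo_x]),
                np.concatenate([labeled_y, pseudo_y]))
    return student
```

Filtering pseudo-labels by teacher confidence before augmenting them is what limits how much label noise from the unlabeled set leaks into the student's training data.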
Related papers
- Less is More: High-value Data Selection for Visual Instruction Tuning [127.38740043393527]
We propose a high-value data selection approach TIVE, to eliminate redundancy within the visual instruction data and reduce the training cost.
Our approach using only about 15% data can achieve comparable average performance to the full-data fine-tuned model across eight benchmarks.
arXiv Detail & Related papers (2024-03-14T16:47:25Z)
- Combining Public Human Activity Recognition Datasets to Mitigate Labeled
Data Scarcity [1.274578243851308]
We propose a novel strategy to combine publicly available datasets with the goal of learning a generalized HAR model.
Our experimental evaluation, which includes experimenting with different state-of-the-art neural network architectures, shows that combining public datasets can significantly reduce the number of labeled samples required.
arXiv Detail & Related papers (2023-06-23T18:51:22Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Self-supervised Activity Representation Learning with Incremental Data:
An Empirical Study [7.782045150068569]
This research examines the impact of using a self-supervised representation learning model for time series classification tasks.
We analyzed the effect of varying the size, distribution, and source of the unlabeled data on the final classification performance across four public datasets.
arXiv Detail & Related papers (2023-05-01T01:39:55Z)
- Human Activity Recognition Using Self-Supervised Representations of
Wearable Data [0.0]
Development of accurate algorithms for human activity recognition (HAR) is hindered by the lack of large real-world labeled datasets.
Here we develop a 6-class HAR model with strong performance when evaluated on real-world datasets not seen during training.
arXiv Detail & Related papers (2023-04-26T07:33:54Z)
- SelfAct: Personalized Activity Recognition based on Self-Supervised and
Active Learning [0.688204255655161]
SelfAct is a novel framework for Human Activity Recognition (HAR) on wearable and mobile devices.
It combines self-supervised and active learning to mitigate problems such as intra- and inter-variability of activity execution.
Our experiments on two publicly available HAR datasets demonstrate that SelfAct achieves results close to or even better than the ones of fully supervised approaches.
arXiv Detail & Related papers (2023-04-19T09:39:11Z)
- Reinforcement Learning from Passive Data via Latent Intentions [86.4969514480008]
We show that passive data can still be used to learn features that accelerate downstream RL.
Our approach learns from passive data by modeling intentions.
Our experiments demonstrate the ability to learn from many forms of passive data, including cross-embodiment video data and YouTube videos.
arXiv Detail & Related papers (2023-04-10T17:59:05Z)
- Striving for data-model efficiency: Identifying data externalities on
group performance [75.17591306911015]
Building trustworthy, effective, and responsible machine learning systems hinges on understanding how differences in training data and modeling decisions interact to impact predictive performance.
We focus on a particular type of data-model inefficiency, in which adding training data from some sources can actually lower performance evaluated on key sub-groups of the population.
Our results indicate that data-efficiency is a key component of both accurate and trustworthy machine learning.
arXiv Detail & Related papers (2022-11-11T16:48:27Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
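At a glance, cluster-level pseudo-labelling works by clustering target-domain features and assigning each cluster a single label by averaging the model's per-sample predictions, which smooths out noisy individual predictions. A minimal sketch, assuming plain k-means in NumPy; the clustering choice and all names here are illustrative, not the paper's exact method.

```python
import numpy as np

def cluster_pseudo_labels(features, model_probs, n_clusters=7, n_iter=20,
                          rng=None):
    """Assign one pseudo-label per cluster instead of per sample."""
    rng = np.random.default_rng(rng)
    # Simple k-means on the feature vectors.
    centers = features[rng.choice(len(features), n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for k in range(n_clusters):
            if (assign == k).any():
                centers[k] = features[assign == k].mean(axis=0)
    # Cluster-level label: average the model's class probabilities within
    # each cluster and take the argmax for every member of that cluster.
    labels = np.empty(len(features), dtype=int)
    for k in range(n_clusters):
        mask = assign == k
        if mask.any():
            labels[mask] = model_probs[mask].mean(axis=0).argmax()
    return labels
```

A sample whose individual prediction disagrees with its cluster is overruled by the cluster average, which is the noise-smoothing effect that makes these pseudo-labels more reliable than per-sample ones.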
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To manage the size of the created dataset, we propose to apply a dataset distillation strategy to compress it into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.