WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition
- URL: http://arxiv.org/abs/2304.05088v3
- Date: Tue, 21 Nov 2023 16:35:26 GMT
- Title: WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition
- Authors: Marius Bock, Hilde Kuehne, Kristof Van Laerhoven, Michael Moeller
- Abstract summary: WEAR is an outdoor sports dataset for both vision- and inertial-based human activity recognition (HAR).
The dataset comprises data from 18 participants performing a total of 18 different workout activities, with untrimmed inertial (acceleration) and camera (egocentric video) data recorded at 10 different outdoor locations.
- Score: 25.113458430281632
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Though research has shown the complementarity of camera- and
inertial-based data, datasets that offer both egocentric video and inertial
sensor data remain scarce. In this paper, we introduce WEAR, an outdoor sports
dataset for both vision- and inertial-based human activity recognition (HAR).
The dataset comprises data from 18 participants performing a total of 18
different workout activities, with untrimmed inertial (acceleration) and camera
(egocentric video) data recorded at 10 different outdoor locations. Unlike
previous egocentric datasets, WEAR provides a challenging prediction scenario
marked by purposely introduced activity variations as well as a small overall
information overlap across modalities. Benchmark results obtained using each
modality separately show that the two modalities offer complementary strengths
and weaknesses in their prediction performance. Further, in light of the recent
success of temporal action localization models following the architecture
design of ActionFormer, we demonstrate their versatility by applying them in a
plain fashion to vision, inertial, and combined (vision + inertial) features as
input. The results demonstrate both the applicability of vision-based temporal
action localization models to inertial data and the viability of fusing both
modalities by simple concatenation of features, with the combined approach
(vision + inertial features) producing the highest mean average precision and a
close-to-best F1-score. The dataset and code to reproduce our experiments are
publicly available via: https://mariusbock.github.io/wear/
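The modality fusion described in the abstract amounts to concatenating the vision and inertial feature streams along the channel dimension before they are passed to the temporal action localization model. The snippet below is a minimal sketch of that step, not the authors' implementation; the feature dimensions, window count, and function name are illustrative assumptions.

```python
# Minimal sketch of vision + inertial feature fusion by simple concatenation.
# Shapes and dimensions are illustrative assumptions, not taken from the WEAR code.
import numpy as np

def concat_fusion(vision_feats: np.ndarray, inertial_feats: np.ndarray) -> np.ndarray:
    """Concatenate per-window vision and inertial features along the channel axis.

    Both inputs are assumed to be aligned to the same temporal grid:
      vision_feats   -- shape (num_windows, d_vision)
      inertial_feats -- shape (num_windows, d_inertial)
    Returns an array of shape (num_windows, d_vision + d_inertial), which would
    then serve as input to a temporal action localization model such as ActionFormer.
    """
    assert vision_feats.shape[0] == inertial_feats.shape[0], "temporal grids must match"
    return np.concatenate([vision_feats, inertial_feats], axis=-1)

# Example with made-up dimensions: 2048-d video clip features and 128-d
# sliding-window acceleration features over 500 temporally aligned windows.
vision = np.random.randn(500, 2048).astype(np.float32)
inertial = np.random.randn(500, 128).astype(np.float32)
fused = concat_fusion(vision, inertial)  # shape (500, 2176)
```

According to the abstract, this simple concatenation baseline is the configuration that produces the highest mean average precision and a close-to-best F1-score.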
Related papers
- DISCOVER: Data-driven Identification of Sub-activities via Clustering and Visualization for Enhanced Activity Recognition in Smart Homes [52.09869569068291]
We introduce DISCOVER, a method to discover fine-grained human sub-activities from unlabeled sensor data without relying on pre-segmentation.
We demonstrate its effectiveness through a re-annotation exercise on widely used HAR datasets.
arXiv Detail & Related papers (2025-02-11T20:02:24Z)
- Advancing Location-Invariant and Device-Agnostic Motion Activity Recognition on Wearable Devices [6.557453686071467]
We conduct a comprehensive evaluation of the generalizability of motion models across sensor locations.
Our analysis highlights this challenge and identifies key on-body locations for building location-invariant models.
We present deployable on-device motion models reaching 91.41% frame-level F1-score from a single model irrespective of sensor placements.
arXiv Detail & Related papers (2024-02-06T05:10:00Z)
- Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
arXiv Detail & Related papers (2023-11-10T18:38:14Z)
- VALERIE22 -- A photorealistic, richly metadata annotated dataset of urban environments [5.439020425819001]
The VALERIE tool pipeline is a synthetic data generator developed to contribute to the understanding of domain-specific factors.
The VALERIE22 dataset was generated with the VALERIE procedural tools pipeline providing a photorealistic sensor simulation.
The dataset provides a uniquely rich set of metadata, allowing extraction of specific scene and semantic features.
arXiv Detail & Related papers (2023-08-18T15:44:45Z)
- Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment [71.16699226211504]
We propose to learn fine-grained action features that are invariant to the viewpoints by aligning egocentric and exocentric videos in time.
To this end, we propose AE2, a self-supervised embedding approach with two key designs.
For evaluation, we establish a benchmark for fine-grained video understanding in the ego-exo context.
arXiv Detail & Related papers (2023-06-08T19:54:08Z)
- Do I Have Your Attention: A Large Scale Engagement Prediction Dataset and Baselines [9.896915478880635]
The degree of concentration, enthusiasm, optimism, and passion displayed by individuals while interacting with a machine is referred to as 'user engagement'.
To create engagement prediction systems that can work in real-world conditions, it is quintessential to learn from rich, diverse datasets.
Large scale multi-faceted engagement in the wild dataset EngageNet is proposed.
arXiv Detail & Related papers (2023-02-01T13:25:54Z)
- Towards Continual Egocentric Activity Recognition: A Multi-modal Egocentric Activity Dataset for Continual Learning [21.68009790164824]
We present a multi-modal egocentric activity dataset for continual learning named UESTC-MMEA-CL.
It contains synchronized data of videos, accelerometers, and gyroscopes, for 32 types of daily activities, performed by 10 participants.
Egocentric activity recognition results are reported when using the three modalities (RGB, acceleration, and gyroscope) separately and jointly.
arXiv Detail & Related papers (2023-01-26T04:32:00Z)
- Video-based Pose-Estimation Data as Source for Transfer Learning in Human Activity Recognition [71.91734471596433]
Human Activity Recognition (HAR) using on-body devices identifies specific human actions in unconstrained environments.
Previous works demonstrated that transfer learning is a good strategy for addressing scenarios with scarce data.
This paper proposes using datasets intended for human-pose estimation as a source for transfer learning.
arXiv Detail & Related papers (2022-12-02T18:19:36Z)
- Multi-Environment Pretraining Enables Transfer to Action Limited Datasets [129.24823721649028]
In reinforcement learning, available decision-making data is often not annotated with actions.
We propose combining large but sparsely annotated datasets from a target environment of interest with fully annotated datasets from various other source environments.
We show that utilizing even one additional environment dataset of sequential labelled data during IDM pretraining gives rise to substantial improvements in generating action labels for unannotated sequences.
arXiv Detail & Related papers (2022-11-23T22:48:22Z)
- HighlightMe: Detecting Highlights from Human-Centric Videos [52.84233165201391]
We present a domain- and user-preference-agnostic approach to detect highlightable excerpts from human-centric videos.
We use an autoencoder network equipped with spatial-temporal graph convolutions to detect human activities and interactions.
We observe a 4-12% improvement in the mean average precision of matching the human-annotated highlights over state-of-the-art methods.
arXiv Detail & Related papers (2021-10-05T01:18:15Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)