Semantically Encoding Activity Labels for Context-Aware Human Activity Recognition
- URL: http://arxiv.org/abs/2504.07916v1
- Date: Thu, 10 Apr 2025 17:30:07 GMT
- Title: Semantically Encoding Activity Labels for Context-Aware Human Activity Recognition
- Authors: Wen Ge, Guanyi Mou, Emmanuel O. Agu, Kyumin Lee
- Abstract summary: We propose SEAL, which leverages LMs to encode CA-HAR activity labels and capture their semantic relationships. Our research opens up new possibilities for integrating more advanced LMs into CA-HAR tasks.
- Score: 2.8132886759540146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prior work has primarily formulated CA-HAR as a multi-label classification problem, where model inputs are time-series sensor data and target labels are binary encodings representing whether a given activity or context occurs. These CA-HAR methods either predicted each label independently or manually imposed relationships using graphs. However, both strategies often neglect an essential aspect: activity labels have rich semantic relationships. For instance, walking, jogging, and running share similar movement patterns but differ in pace and intensity, indicating that they are semantically related. Consequently, prior CA-HAR methods often struggled to accurately capture these inherent and nuanced relationships, particularly on the noisily labeled datasets typical of CA-HAR, or in situations where the ideal sensor type is unavailable (e.g., recognizing speech without audio sensors). To address this limitation, we propose SEAL, which leverages LMs to encode CA-HAR activity labels and capture their semantic relationships. LMs generate vector embeddings that preserve rich semantic information from natural language. Our SEAL approach encodes input time-series sensor data from smart devices and their associated activity and context labels (text) as vector embeddings. During training, SEAL aligns the sensor data representations with their corresponding activity/context label embeddings in a shared embedding space. At inference time, SEAL performs a similarity search, returning the CA-HAR label whose embedding representation is closest to the input data. Although LMs have been widely explored in other domains, their potential in CA-HAR has, surprisingly, been underexplored, making our approach a novel contribution to the field. Our research opens up new possibilities for integrating more advanced LMs into CA-HAR tasks.
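To make the pipeline concrete, below is a minimal sketch of a SEAL-style setup, not the authors' implementation: the label list, the `all-MiniLM-L6-v2` sentence-embedding backbone, the CNN encoder architecture, and the temperature value are all illustrative assumptions.

```python
# Minimal sketch of the SEAL-style pipeline described above (illustrative, not
# the authors' code): encode activity labels with a language model, align a
# sensor encoder to those embeddings, and classify by nearest-neighbor search.
import torch
import torch.nn as nn
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer  # assumed LM backbone

LABELS = ["walking", "jogging", "running", "sitting", "talking on the phone"]

lm = SentenceTransformer("all-MiniLM-L6-v2")
label_emb = torch.tensor(lm.encode(LABELS))            # (L, 384), kept frozen
label_emb = F.normalize(label_emb, dim=-1)

class SensorEncoder(nn.Module):
    """Maps a (batch, channels, time) IMU window into the label embedding space."""
    def __init__(self, channels=6, dim=384):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(channels, 64, 5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 128, 5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Linear(128, dim)

    def forward(self, x):
        z = self.conv(x).squeeze(-1)
        return F.normalize(self.proj(z), dim=-1)

def alignment_loss(sensor_vec, target_multi_hot, temperature=0.1):
    """Pull sensor embeddings toward the embeddings of their active labels."""
    sims = sensor_vec @ label_emb.T / temperature        # (B, L) logits
    return F.binary_cross_entropy_with_logits(sims, target_multi_hot)

@torch.no_grad()
def predict(encoder, x, threshold=0.5):
    """Inference: similarity search against the label embedding table."""
    sims = torch.sigmoid(encoder(x) @ label_emb.T / 0.1)
    return sims > threshold
```

Because CA-HAR is multi-label, the sketch scores each label independently with a sigmoid over similarities; the paper's exact alignment loss and similarity-search procedure may differ.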
Related papers
- DISCOVER: Data-driven Identification of Sub-activities via Clustering and Visualization for Enhanced Activity Recognition in Smart Homes [52.09869569068291]
We introduce DISCOVER, a method to discover fine-grained human sub-activities from unlabeled sensor data without relying on pre-segmentation.
We demonstrate its effectiveness through a re-annotation exercise on widely used HAR datasets.
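As a rough illustration of this kind of pipeline (not the DISCOVER method itself), one could window unlabeled sensor streams, extract simple per-window statistics, cluster them, and project the clusters for visual inspection; window sizes, feature choices, and cluster counts below are placeholder assumptions.

```python
# Illustrative sketch of clustering-based sub-activity discovery: slide a
# window over unlabeled IMU data, embed each window with simple statistics,
# cluster, and map to 2-D for visual inspection.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

def window_features(signal, win=128, hop=64):
    """signal: (time, channels) -> per-window mean/std features."""
    feats = []
    for start in range(0, len(signal) - win + 1, hop):
        w = signal[start:start + win]
        feats.append(np.concatenate([w.mean(0), w.std(0)]))
    return np.stack(feats)

signal = np.random.randn(10_000, 6)             # stand-in for real IMU data
X = window_features(signal)
clusters = KMeans(n_clusters=8, n_init=10).fit_predict(X)
coords = TSNE(n_components=2).fit_transform(X)  # 2-D map for inspection
```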
arXiv Detail & Related papers (2025-02-11T20:02:24Z)
- Language-centered Human Activity Recognition [8.925867647929088]
Human Activity Recognition (HAR) using Inertial Measurement Unit (IMU) sensors is critical for applications in healthcare, safety, and industrial production.
However, variations in activity patterns, device types, and sensor placements create distribution gaps across datasets.
We propose LanHAR, a novel system that generates semantic interpretations of sensor readings and activity labels for cross-dataset HAR.
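A toy sketch of the underlying idea, with all details assumed rather than taken from the paper: describe a sensor reading in natural language, embed the description and the candidate labels with one text encoder, and pick the most semantically compatible label.

```python
# Rough sketch of matching sensor-reading interpretations to activity labels
# in a shared semantic text space (details assumed, not the LanHAR system).
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")
reading_desc = "periodic vertical acceleration at ~2 Hz, moderate amplitude"
labels = ["walking", "running", "sitting"]

scores = util.cos_sim(encoder.encode(reading_desc, convert_to_tensor=True),
                      encoder.encode(labels, convert_to_tensor=True))
print(labels[int(scores.argmax())])  # most semantically compatible label
```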
arXiv Detail & Related papers (2024-09-12T22:57:29Z)
- Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
The authors incorporate Self-Supervised Learning (SSL) into the model learning process and design a novel self-supervised Relation of Relation (R2) classification task.
They propose a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as joint optimization targets.
External knowledge from WordNet is used to obtain multi-aspect descriptions for label semantic learning.
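For the WordNet step, a minimal illustration of gathering multi-aspect descriptions per label might look like the following (the aspect choices are assumptions, and `nltk` must have the WordNet corpus downloaded):

```python
# Minimal illustration of pulling multi-aspect label descriptions from
# WordNet for label semantic learning (illustrative aspect selection).
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def label_descriptions(label):
    aspects = []
    for synset in wn.synsets(label):
        aspects.append({
            "definition": synset.definition(),
            "examples": synset.examples(),
            "hypernyms": [h.name() for h in synset.hypernyms()],
        })
    return aspects

print(label_descriptions("running")[0])
```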
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
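One plausible way to derive such a semantic prior, sketched here with CLIP's text encoder via Hugging Face `transformers` (the prompt template and the imputation idea are assumptions, not the paper's procedure):

```python
# Sketch of deriving a label-label semantic prior from CLIP's text encoder:
# unknown labels can borrow evidence from semantically close known labels.
import torch
from transformers import CLIPModel, CLIPTokenizer

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tok = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

labels = ["person", "bicycle", "car", "dog", "cat"]
inputs = tok([f"a photo of a {l}" for l in labels],
             padding=True, return_tensors="pt")
with torch.no_grad():
    t = model.get_text_features(**inputs)
t = torch.nn.functional.normalize(t, dim=-1)
prior = t @ t.T
# prior[i, j] ~ semantic affinity between labels i and j; rows with missing
# annotations can be imputed from their nearest labeled neighbors.
```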
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Unleashing the Power of Shared Label Structures for Human Activity Recognition [36.66107380956779]
We propose SHARE, a framework that takes into account shared structures of label names for different activities.
To exploit the shared structures, SHARE comprises an encoder for extracting features from input sensory time series and a decoder for generating label names as a token sequence.
We also propose three label augmentation techniques to help the model more effectively capture semantic structures across activities.
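A toy sketch of the decoder side of this idea follows; the vocabulary, dimensions, and module layout are illustrative, not SHARE's actual architecture.

```python
# Toy sketch of generating label *names* token by token: a sensor encoder
# produces a context vector, and a decoder emits the name (e.g. "open
# fridge"), so activities sharing words share supervision.
import torch
import torch.nn as nn

VOCAB = ["<s>", "</s>", "open", "close", "fridge", "door"]

class LabelNameDecoder(nn.Module):
    def __init__(self, ctx_dim=128, emb=64, hid=128):
        super().__init__()
        self.embed = nn.Embedding(len(VOCAB), emb)
        self.init = nn.Linear(ctx_dim, hid)  # context vector -> initial state
        self.gru = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, len(VOCAB))

    def forward(self, ctx, tokens):
        h0 = torch.tanh(self.init(ctx)).unsqueeze(0)  # (1, B, hid)
        y, _ = self.gru(self.embed(tokens), h0)
        return self.out(y)  # per-step logits over label-name tokens

# Training pairs like ("open fridge", "open door") share the token "open",
# which is the kind of shared label structure the summary describes.
```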
arXiv Detail & Related papers (2023-01-01T22:50:08Z)
- Using Language Model to Bootstrap Human Activity Recognition Ambient Sensors Based in Smart Homes [2.336163487623381]
We propose two Natural Language Processing embedding methods to enhance LSTM-based structures in activity-sequences classification tasks.
Results indicate that this approach provides useful information, such as a sensor organization map.
Our tests show that the embeddings can be pretrained on different datasets than the target one, enabling transfer learning.
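A hedged sketch of this bootstrap pattern, assuming gensim `Word2Vec` for the pretrained event embeddings and a plain LSTM classifier (the event names and dimensions are invented):

```python
# Sketch: treat ambient-sensor events as "words", pretrain word embeddings
# on event sequences (possibly from another home, enabling transfer), then
# feed the pretrained vectors to an LSTM activity classifier.
import torch
import torch.nn as nn
from gensim.models import Word2Vec

sequences = [["kitchen_motion", "fridge_door", "kitchen_motion"],
             ["bed_pressure", "bedroom_motion"]]        # toy event corpus
w2v = Word2Vec(sequences, vector_size=32, min_count=1)  # transferable embeddings

class EventLSTM(nn.Module):
    def __init__(self, emb_dim=32, hid=64, n_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hid, batch_first=True)
        self.head = nn.Linear(hid, n_classes)

    def forward(self, x):                 # x: (batch, time, emb_dim)
        _, (h, _) = self.lstm(x)
        return self.head(h[-1])
```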
arXiv Detail & Related papers (2021-11-23T21:21:14Z)
- R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching [58.72111690643359]
We propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching.
We first employ BERT to encode the input sentences from a global perspective.
Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective.
To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task.
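A compact sketch of how these pieces could fit together (layer sizes, pooling, and head design are assumptions; the paper's architecture likely differs in detail):

```python
# Compact sketch: BERT gives a global sentence view, a CNN head captures
# local keyword/phrase cues, and an auxiliary head predicts whether two
# sentence *pairs* share a relation (the self-supervised R2 task).
import torch
import torch.nn as nn
from transformers import AutoModel

class R2NetSketch(nn.Module):
    def __init__(self, n_relations=3):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        self.cnn = nn.Conv1d(768, 256, kernel_size=3, padding=1)
        self.rel_head = nn.Linear(768 + 256, n_relations)  # main task
        self.r2_head = nn.Linear(2 * (768 + 256), 2)       # same relation? (SSL)

    def encode(self, ids, mask):
        h = self.bert(input_ids=ids, attention_mask=mask).last_hidden_state
        global_vec = h[:, 0]                              # [CLS], global view
        local_vec = self.cnn(h.transpose(1, 2)).amax(-1)  # local n-gram view
        return torch.cat([global_vec, local_vec], dim=-1)

    def forward(self, ids_a, mask_a, ids_b, mask_b):
        za, zb = self.encode(ids_a, mask_a), self.encode(ids_b, mask_b)
        return self.rel_head(za), self.r2_head(torch.cat([za, zb], dim=-1))
```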
arXiv Detail & Related papers (2020-12-16T13:11:30Z)
- Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition [75.44233392355711]
The KGGR framework exploits prior knowledge of statistical label correlations together with deep neural networks.
It first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence.
Then, it introduces the label semantics to guide learning semantic-specific features.
It exploits a graph propagation network to explore graph node interactions.
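As a small illustration of the co-occurrence-graph-plus-propagation recipe (the normalization and single propagation step below are simplifications, not KGGR's exact formulation):

```python
# Sketch: build a label graph from co-occurrence statistics, attach label
# semantic features as nodes, and mix information between related labels
# with one graph-propagation step.
import torch

def cooccurrence_adjacency(Y, eps=1e-8):
    """Y: (samples, labels) multi-hot matrix -> row-normalized P(j | i)."""
    co = Y.T @ Y                           # co-occurrence counts
    counts = Y.sum(0, keepdim=True).T      # occurrences of each label i
    A = co / (counts + eps)
    A.fill_diagonal_(1.0)
    return A / A.sum(-1, keepdim=True)

Y = (torch.rand(1000, 20) > 0.8).float()   # stand-in annotations
A = cooccurrence_adjacency(Y)
H = torch.randn(20, 300)                   # label semantics (e.g., GloVe)
W = torch.nn.Linear(300, 300)
H_next = torch.relu(W(A @ H))              # one graph-propagation step
```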
arXiv Detail & Related papers (2020-09-20T15:05:29Z)
- ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection [101.56529337489417]
We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of <human, action, object> triplets in images.
We argue that multi-level consistencies among objects, actions and interactions are strong cues for generating semantic representations of rare or previously unseen HOIs.
Our model takes visual features of candidate human-object pairs and word embeddings of HOI labels as inputs, maps them into visual-semantic joint embedding space and obtains detection results by measuring their similarities.
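A minimal sketch of this similarity-based scoring, with the feature dimensions and projection assumed:

```python
# Sketch: project visual features of a candidate human-object pair into the
# space of HOI label embeddings and rank labels by similarity, which also
# covers unseen HOIs as long as their labels can be embedded.
import torch
import torch.nn.functional as F

visual_proj = torch.nn.Linear(1024, 300)   # pair features -> joint space
label_emb = F.normalize(torch.randn(600, 300), dim=-1)  # stand-in word
# embeddings of <human, action, object> labels

def score(pair_feat):
    z = F.normalize(visual_proj(pair_feat), dim=-1)
    return z @ label_emb.T                 # similarity to every HOI label
```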
arXiv Detail & Related papers (2020-08-14T09:11:18Z)
- Sequential Weakly Labeled Multi-Activity Localization and Recognition on Wearable Sensors using Recurrent Attention Networks [13.64024154785943]
We propose a recurrent attention network (RAN) to handle sequential weakly labeled multi-activity recognition and localization tasks.
Our RAN model can simultaneously infer multiple activity types from coarse-grained sequential weak labels.
This greatly reduces the burden of manual labeling.
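A toy sketch of a recurrent attention module in this spirit (structure assumed, not the paper's RAN):

```python
# Sketch: recurrent attention over a long weakly labeled window; attention
# weights both localize activities in time and pool features for the
# sequence-level multi-activity labels.
import torch
import torch.nn as nn

class RecurrentAttention(nn.Module):
    def __init__(self, in_dim=6, hid=64, n_classes=10):
        super().__init__()
        self.gru = nn.GRU(in_dim, hid, batch_first=True)
        self.attn = nn.Linear(hid, 1)
        self.head = nn.Linear(hid, n_classes)

    def forward(self, x):                      # x: (batch, time, channels)
        h, _ = self.gru(x)                     # per-step features
        a = torch.softmax(self.attn(h), dim=1) # where each activity occurs
        return self.head((a * h).sum(1)), a    # sequence labels + localization
```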
arXiv Detail & Related papers (2020-04-13T04:57:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.