Unleashing the Power of Shared Label Structures for Human Activity Recognition
- URL: http://arxiv.org/abs/2301.03462v2
- Date: Fri, 20 Oct 2023 00:27:00 GMT
- Title: Unleashing the Power of Shared Label Structures for Human Activity Recognition
- Authors: Xiyuan Zhang, Ranak Roy Chowdhury, Jiayun Zhang, Dezhi Hong, Rajesh K. Gupta, Jingbo Shang
- Abstract summary: We propose SHARE, a framework that takes into account shared structures of label names for different activities.
To exploit the shared structures, SHARE comprises an encoder for extracting features from input sensory time series and a decoder for generating label names as a token sequence.
We also propose three label augmentation techniques to help the model more effectively capture semantic structures across activities.
- Score: 36.66107380956779
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current human activity recognition (HAR) techniques regard activity labels as
integer class IDs without explicitly modeling the semantics of class labels. We
observe that different activity names often have shared structures. For
example, "open door" and "open fridge" both have "open" as the action; "kicking
soccer ball" and "playing tennis ball" both have "ball" as the object. Such
shared structures in label names can translate into similarity in the sensory
data, and modeling these common structures would help uncover knowledge across
different activities, especially for activities with limited samples. In this
paper, we propose SHARE, a HAR framework that takes into account shared
structures of label names for different activities. To exploit the shared
structures, SHARE comprises an encoder for extracting features from input
sensory time series and a decoder for generating label names as a token
sequence. We also propose three label augmentation techniques to help the model
more effectively capture semantic structures across activities, including a
basic token-level augmentation, and two enhanced embedding-level and
sequence-level augmentations utilizing the capabilities of pre-trained models.
SHARE outperforms state-of-the-art HAR models in extensive experiments on seven
HAR benchmark datasets. We also evaluate SHARE in few-shot learning and label
imbalance settings and observe an even more significant performance gap.
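To make the idea of "label names as a token sequence" concrete, here is a minimal sketch in plain Python. It is not SHARE's implementation: the whitespace tokenizer, the `<bos>`/`<eos>` markers, and the helper names `build_vocab`, `encode_label`, and `shared_tokens` are all assumptions made for illustration. Only the four label names come from the abstract. The sketch shows the kind of target sequence a decoder could be trained to generate, and how shared tokens such as "open" and "ball" link different activities.

```python
from collections import defaultdict

# Label names taken from the examples in the abstract.
LABELS = ["open door", "open fridge", "kicking soccer ball", "playing tennis ball"]


def build_vocab(labels):
    """Map each distinct word to an integer id.

    Assumption: whitespace tokenization plus <bos>/<eos> markers;
    the abstract does not specify the tokenizer SHARE uses.
    """
    vocab = {"<bos>": 0, "<eos>": 1}
    for name in labels:
        for token in name.split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode_label(name, vocab):
    """Turn a label name into the token-id sequence a decoder would emit."""
    return [vocab["<bos>"]] + [vocab[t] for t in name.split()] + [vocab["<eos>"]]


def shared_tokens(labels):
    """Return every token that appears in more than one label name."""
    owners = defaultdict(list)
    for name in labels:
        for token in set(name.split()):
            owners[token].append(name)
    return {tok: names for tok, names in owners.items() if len(names) > 1}


vocab = build_vocab(LABELS)
targets = {name: encode_label(name, vocab) for name in LABELS}
shared = shared_tokens(LABELS)

for name, ids in targets.items():
    print(f"{name!r:24} -> {ids}")
print("shared tokens:", shared)
```

With integer class IDs, "open door" and "open fridge" would be unrelated classes. As token sequences they share the id for "open", so a decoder that learns to emit that token for one activity can reuse it for the other.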
Related papers
- Semantically Encoding Activity Labels for Context-Aware Human Activity Recognition [2.8132886759540146]
We propose SEAL, which leverages LMs to encode CA-HAR activity labels to capture semantic relationships.
Our research opens up new possibilities for integrating more advanced LMs into CA-HAR tasks.
arXiv Detail & Related papers (2025-04-10T17:30:07Z)
- CoA: Chain-of-Action for Generative Semantic Labels [5.016605351534376]
The Chain-of-Action (CoA) method generates labels aligned with contextually relevant features of an image.
CoA is designed based on the observation that enriched and valuable contextual information improves generative performance during inference.
arXiv Detail & Related papers (2024-11-26T13:09:14Z)
- Multi-grained Label Refinement Network with Dependency Structures for Joint Intent Detection and Slot Filling [13.963083174197164]
The intent and semantic components of an utterance depend on the syntactic elements of the sentence.
In this paper, we investigate a multi-grained label refinement network, which utilizes dependency structures and label semantic embeddings.
To enhance syntactic representations, we introduce the dependency structures of sentences into our model through a graph attention layer.
arXiv Detail & Related papers (2022-09-09T07:27:38Z)
- Disentangled Action Recognition with Knowledge Bases [77.77482846456478]
We aim to improve the generalization ability of the compositional action recognition model to novel verbs or novel nouns.
Previous work utilizes verb-noun compositional action nodes in the knowledge graph, making it inefficient to scale.
We propose our approach: Disentangled Action Recognition with Knowledge-bases (DARK), which leverages the inherent compositionality of actions.
arXiv Detail & Related papers (2022-07-04T20:19:13Z)
- Label Semantics for Few Shot Named Entity Recognition [68.01364012546402]
We study the problem of few shot learning for named entity recognition.
We leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors.
Our model learns to match the representations of named entities computed by the first encoder with label representations computed by the second encoder.
arXiv Detail & Related papers (2022-03-16T23:21:05Z)
- BABEL: Bodies, Action and Behavior with English Labels [53.83774092560076]
We present BABEL, a large dataset with language labels describing the actions being performed in mocap sequences.
There are over 28k sequence labels, and 63k frame labels in BABEL, which belong to over 250 unique action categories.
We demonstrate the value of BABEL as a benchmark, and evaluate the performance of models on 3D action recognition.
arXiv Detail & Related papers (2021-06-17T17:51:14Z)
- Generate, Annotate, and Learn: Generative Models Advance Self-Training and Knowledge Distillation [58.64720318755764]
Semi-Supervised Learning (SSL) has seen success in many application domains, but this success often hinges on the availability of task-specific unlabeled data.
Knowledge distillation (KD) has enabled compressing deep networks and ensembles, achieving the best results when distilling knowledge on fresh task-specific unlabeled examples.
We present a general framework called "generate, annotate, and learn (GAL)" that uses unconditional generative models to synthesize in-domain unlabeled data.
arXiv Detail & Related papers (2021-06-11T05:01:24Z)
- FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding [118.32912239230272]
FineGym is a new action recognition dataset built on top of gymnastic videos.
It provides temporal annotations at both action and sub-action levels with a three-level semantic hierarchy.
This new level of granularity presents significant challenges for action recognition.
arXiv Detail & Related papers (2020-04-14T17:55:21Z)
- ActiLabel: A Combinatorial Transfer Learning Framework for Activity Recognition [14.605223647792862]
ActiLabel is a framework that learns structural similarities among events in an arbitrary domain and those of a different domain.
Experiments based on three public datasets demonstrate the superiority of ActiLabel over state-of-the-art transfer learning and deep learning methods.
arXiv Detail & Related papers (2020-03-16T19:19:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.