Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations
- URL: http://arxiv.org/abs/2009.07420v2
- Date: Thu, 4 Mar 2021 22:37:16 GMT
- Title: Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations
- Authors: Yanyi Zhang, Xinyu Li, Ivan Marsic
- Abstract summary: We introduce an approach to multi-label activity recognition that extracts independent feature descriptors for each activity and learns activity correlations.
Our method outperformed state-of-the-art approaches on four multi-label activity recognition datasets.
- Score: 15.356959177480965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-label activity recognition is designed for recognizing multiple
activities that are performed simultaneously or sequentially in each video.
Most recent activity recognition networks focus on single activities, assuming only one activity in each video. These networks extract features shared across all activities, which are not designed for the multi-label setting. We
introduce an approach to multi-label activity recognition that extracts
independent feature descriptors for each activity and learns activity
correlations. This structure can be trained end-to-end and plugged into any
existing network structures for video classification. Our method outperformed
state-of-the-art approaches on four multi-label activity recognition datasets.
To better understand the activity-specific features that the system generated,
we visualized them on the Charades dataset.
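The abstract describes two components: an independent feature descriptor per activity and a learned set of activity correlations, packaged as a head that can be plugged into existing video networks. Below is a minimal PyTorch-style sketch of that idea, not the authors' implementation; all layer sizes, module names, and the identity-initialized correlation matrix are assumptions.

```python
# Minimal sketch (not the authors' code): per-activity feature heads on top of a
# shared backbone feature, followed by a learned activity-correlation layer.
# All dimensions and module names are illustrative assumptions.
import torch
import torch.nn as nn

class ActivitySpecificHead(nn.Module):
    """Extracts an independent feature descriptor and logit for one activity."""
    def __init__(self, backbone_dim: int, feat_dim: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(backbone_dim, feat_dim), nn.ReLU())
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, x):                     # x: (batch, backbone_dim)
        feat = self.proj(x)                   # activity-specific descriptor
        return feat, self.classifier(feat)    # (batch, feat_dim), (batch, 1)

class MultiLabelActivityModule(nn.Module):
    """Plug-in head: one descriptor per activity plus a learned correlation matrix."""
    def __init__(self, backbone_dim: int, num_activities: int, feat_dim: int = 256):
        super().__init__()
        self.heads = nn.ModuleList(
            ActivitySpecificHead(backbone_dim, feat_dim) for _ in range(num_activities)
        )
        # Learnable activity-correlation matrix, initialized to identity
        self.correlation = nn.Parameter(torch.eye(num_activities))

    def forward(self, backbone_feat):         # backbone_feat: (batch, backbone_dim)
        logits = torch.cat([head(backbone_feat)[1] for head in self.heads], dim=1)
        # Re-weight each activity's logit using correlations with the others
        return logits @ self.correlation      # (batch, num_activities), use with BCE loss

# Usage: attach to any video backbone that yields a clip-level feature vector, e.g.
# head = MultiLabelActivityModule(backbone_dim=2048, num_activities=157)
# loss = nn.BCEWithLogitsLoss()(head(clip_features), multi_hot_labels)
```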
Related papers
- Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting [87.11995635760108]
The key to action counting is accurately locating each video's repetitive actions.
We propose a dual-branch network, SkimFocusNet, that works in a two-step manner.
arXiv Detail & Related papers (2024-06-13T05:15:52Z)
- Query by Activity Video in the Wild [52.42177539947216]
In current query-by-activity-video literature, a common assumption is that all activities have sufficient labelled examples when learning an embedding.
We propose a visual-semantic embedding network that explicitly deals with the imbalanced scenario for activity retrieval.
arXiv Detail & Related papers (2023-11-23T10:26:36Z)
- Automatic Interaction and Activity Recognition from Videos of Human Manual Demonstrations with Application to Anomaly Detection [0.0]
This paper exploits Scene Graphs to extract key interaction features from image sequences while simultaneously capturing motion patterns and context.
The method introduces event-based automatic video segmentation and clustering, which group similar events and detect whether a monitored activity is executed correctly.
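As a rough illustration of the event grouping and correctness check described above, the sketch below clusters per-event feature vectors and flags events that sit far from their cluster centroid; the scene-graph feature extraction itself is out of scope, and the distance-threshold rule is an assumption rather than the paper's actual criterion.

```python
# Illustrative sketch (assumptions, not the paper's pipeline): cluster per-event
# feature vectors and flag outlying events as potential anomalies, i.e. activities
# that may not have been executed correctly.
import numpy as np
from sklearn.cluster import KMeans

def group_and_check_events(event_features, n_event_types, z_thresh=2.0):
    """event_features: (num_events, feat_dim) vectors, e.g. derived from scene graphs."""
    km = KMeans(n_clusters=n_event_types, n_init=10, random_state=0).fit(event_features)
    labels = km.labels_
    dists = np.linalg.norm(event_features - km.cluster_centers_[labels], axis=1)
    anomalies = []
    for c in range(n_event_types):
        d = dists[labels == c]
        cutoff = d.mean() + z_thresh * d.std()          # per-cluster distance threshold
        anomalies.extend(np.flatnonzero((labels == c) & (dists > cutoff)).tolist())
    return labels, sorted(anomalies)

# Example with placeholder features standing in for scene-graph descriptors
feats = np.random.rand(40, 16)
groups, flagged = group_and_check_events(feats, n_event_types=4)
```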
arXiv Detail & Related papers (2023-04-19T16:15:23Z)
- A Multi-Task Deep Learning Approach for Sensor-based Human Activity Recognition and Segmentation [4.987833356397567]
We propose a new deep neural network to solve the two tasks simultaneously.
The proposed network adopts selective convolution and features multiscale windows to segment activities of long or short time durations.
Our proposed method outperforms the state-of-the-art methods both for activity recognition and segmentation.
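A minimal sketch of the general multi-task idea follows (a shared temporal encoder with a window-level recognition head and a per-timestep segmentation head); the paper's selective convolution and multiscale windows are not reproduced, and all layer sizes are illustrative.

```python
# Minimal multi-task sketch: shared 1-D convolutional encoder over sensor channels,
# one head for window-level activity recognition and one for per-timestep segmentation.
import torch
import torch.nn as nn

class MultiTaskHAR(nn.Module):
    def __init__(self, n_channels: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(                      # shared temporal encoder
            nn.Conv1d(n_channels, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.recognition = nn.Linear(hidden, n_classes)    # window-level activity label
        self.segmentation = nn.Conv1d(hidden, n_classes, kernel_size=1)  # per-timestep labels

    def forward(self, x):                                  # x: (batch, n_channels, time)
        h = self.encoder(x)                                # (batch, hidden, time)
        rec_logits = self.recognition(h.mean(dim=2))       # (batch, n_classes)
        seg_logits = self.segmentation(h)                  # (batch, n_classes, time)
        return rec_logits, seg_logits

# The two heads would be trained jointly, e.g. with a cross-entropy loss on each.
```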
arXiv Detail & Related papers (2023-03-20T13:34:28Z)
- Learning To Recognize Procedural Activities with Distant Supervision [96.58436002052466]
We consider the problem of classifying fine-grained, multi-step activities from long videos spanning up to several minutes.
Our method uses a language model to match noisy, automatically-transcribed speech from the video to step descriptions in the knowledge base.
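A minimal sketch of the matching step described above: the paper uses a language model, whereas this illustration substitutes TF-IDF cosine similarity as a stand-in for scoring noisy transcribed speech segments against step descriptions from a knowledge base.

```python
# Illustrative stand-in for the matching step (not the paper's language model):
# score transcribed speech segments against knowledge-base step descriptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def match_speech_to_steps(transcripts, step_descriptions):
    """Return, for each transcript segment, the index of the best-matching step."""
    vec = TfidfVectorizer().fit(transcripts + step_descriptions)
    sims = cosine_similarity(vec.transform(transcripts), vec.transform(step_descriptions))
    return sims.argmax(axis=1), sims.max(axis=1)   # pseudo-labels and confidences

# Toy example of producing distant-supervision labels
steps = ["crack the eggs into a bowl", "whisk the eggs", "pour into the pan"]
asr = ["now we whisk them together", "pour it in the hot pan"]
labels, scores = match_speech_to_steps(asr, steps)
```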
arXiv Detail & Related papers (2022-01-26T15:06:28Z)
- Learning Asynchronous and Sparse Human-Object Interaction in Videos [56.73059840294019]
Asynchronous-Sparse Interaction Graph Networks (ASSIGN) can automatically detect the structure of interaction events associated with entities in a video scene.
ASSIGN is tested on human-object interaction recognition and shows superior performance in segmenting and labeling human sub-activities and object affordances from raw videos.
arXiv Detail & Related papers (2021-03-03T23:43:55Z)
- Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization [40.517438760096056]
Temporally localizing activities within untrimmed videos has been extensively studied in recent years.
Despite recent advances, existing methods for weakly-supervised temporal activity localization struggle to recognize when an activity is not occurring.
arXiv Detail & Related papers (2020-07-13T19:33:24Z)
- Sequential Weakly Labeled Multi-Activity Localization and Recognition on Wearable Sensors using Recurrent Attention Networks [13.64024154785943]
We propose a recurrent attention network (RAN) to handle sequential weakly labeled multi-activity recognition and localization tasks.
Our RAN model can simultaneously infer multiple activity types from coarse-grained sequential weak labels, greatly reducing the burden of manual labeling.
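A minimal sketch of a recurrent attention model for multi-label recognition from sequential sensor data appears below; the layer choices (GRU encoder, single attention head, sigmoid multi-label output) are assumptions and not the RAN architecture from the paper.

```python
# Illustrative recurrent attention sketch for multi-label activity recognition
# from a weakly labeled sensor sequence; sizes and structure are assumptions.
import torch
import torch.nn as nn

class RecurrentAttention(nn.Module):
    def __init__(self, n_channels: int, n_activities: int, hidden: int = 128):
        super().__init__()
        self.rnn = nn.GRU(n_channels, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)                 # one attention score per timestep
        self.classifier = nn.Linear(hidden, n_activities)

    def forward(self, x):                                # x: (batch, time, n_channels)
        h, _ = self.rnn(x)                               # (batch, time, hidden)
        weights = torch.softmax(self.attn(h), dim=1)     # attention over timesteps
        context = (weights * h).sum(dim=1)               # attention-pooled summary
        return self.classifier(context)                  # multi-label logits (use BCE)
```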
arXiv Detail & Related papers (2020-04-13T04:57:09Z)
- Revisiting Few-shot Activity Detection with Class Similarity Control [107.79338380065286]
We present a framework for few-shot temporal activity detection based on proposal regression.
Our model is end-to-end trainable, takes into account the frame rate differences between few-shot activities and untrimmed test videos, and can benefit from additional few-shot examples.
arXiv Detail & Related papers (2020-03-31T22:02:38Z)
- ZSTAD: Zero-Shot Temporal Activity Detection [107.63759089583382]
We propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected.
We design an end-to-end deep network based on R-C3D as the architecture for this solution.
Experiments on both the THUMOS14 and the Charades datasets show promising performance in terms of detecting unseen activities.
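The sketch below illustrates only the zero-shot classification step implied by this setting: detected segment features are projected into a semantic space and scored against label embeddings of unseen classes. The R-C3D detection backbone and the real label embeddings are omitted; all tensors are placeholders, and the projection-plus-cosine scheme is an assumption rather than the paper's exact formulation.

```python
# Illustrative zero-shot classification step: compare projected segment features
# against semantic embeddings of unseen class names via cosine similarity.
import torch
import torch.nn.functional as F

def zero_shot_classify(segment_feats, proj, unseen_label_embs):
    """segment_feats: (num_segments, feat_dim); proj: (feat_dim, emb_dim);
    unseen_label_embs: (num_unseen_classes, emb_dim), e.g. word vectors of class names."""
    z = F.normalize(segment_feats @ proj, dim=1)
    e = F.normalize(unseen_label_embs, dim=1)
    scores = z @ e.t()                        # cosine similarity per unseen class
    return scores.argmax(dim=1), scores

# Placeholder tensors just to show the shapes involved
feats = torch.randn(5, 512)                   # features of detected segments
proj = torch.randn(512, 300)                  # projection learned on seen classes in practice
labels = torch.randn(10, 300)                 # semantic embeddings of unseen classes
pred, sc = zero_shot_classify(feats, proj, labels)
```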
arXiv Detail & Related papers (2020-03-12T02:40:36Z)