HabitAction: A Video Dataset for Human Habitual Behavior Recognition
- URL: http://arxiv.org/abs/2408.13463v1
- Date: Sat, 24 Aug 2024 04:40:31 GMT
- Title: HabitAction: A Video Dataset for Human Habitual Behavior Recognition
- Authors: Hongwu Li, Zhenliang Zhang, Wei Wang
- Abstract summary: Human habitual behaviors (HHBs) hold significant importance for analyzing a person's personality, habits, and psychological changes.
In this work, we build a novel video dataset to demonstrate various HHBs.
The dataset contains 30 categories of habitual behaviors, comprising more than 300,000 frames and 6,899 action instances.
- Score: 3.7478789114676108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human Action Recognition (HAR) is a crucial task in computer vision that supports a range of downstream tasks, such as understanding human behaviors. Owing to the complexity of human behaviors, many highly valuable behaviors are not yet covered by the available HAR datasets, e.g., human habitual behaviors (HHBs). HHBs hold significant importance for analyzing a person's personality, habits, and psychological changes. To address this gap, we build a novel video dataset that demonstrates various HHBs. The behaviors in the proposed dataset reflect internal mental states and specific emotions of the characters; e.g., crossing one's arms suggests shielding oneself from perceived threats. The dataset contains 30 categories of habitual behaviors, comprising more than 300,000 frames and 6,899 action instances. Since these behaviors usually appear in small local parts of human action videos, existing action recognition methods have difficulty handling such local features. We therefore also propose a two-stream model that uses both human skeletons and RGB appearances. Experimental results demonstrate that our proposed method substantially outperforms existing action recognition methods on the proposed dataset.
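The two-stream design can be pictured with a minimal sketch; this is not the authors' code, and the joint count, feature dimensions, and fusion choice below are illustrative assumptions. One branch consumes skeleton keypoints, the other per-frame RGB appearance features, and a late-fusion classifier combines them:

```python
# Minimal sketch of a two-stream skeleton + RGB model (illustrative only).
import torch
import torch.nn as nn

class TwoStreamHHB(nn.Module):
    def __init__(self, num_classes=30, feat_dim=256):
        super().__init__()
        # Skeleton stream: per-frame joint coordinates, assumed 17 joints x (x, y).
        self.skeleton_stream = nn.Sequential(
            nn.Linear(17 * 2, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )
        # RGB stream: per-frame appearance features, e.g. from a 2D CNN backbone.
        self.rgb_stream = nn.Sequential(
            nn.Linear(512, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )
        # Late fusion of temporally pooled stream features.
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, skeletons, rgb_feats):
        # skeletons: (B, T, 34); rgb_feats: (B, T, 512)
        s = self.skeleton_stream(skeletons).mean(dim=1)  # temporal average pool
        r = self.rgb_stream(rgb_feats).mean(dim=1)
        return self.classifier(torch.cat([s, r], dim=-1))
```

In practice a temporal model (e.g., a graph convolution over the skeleton or a 3D CNN on RGB) would replace the plain average pooling; the sketch only shows the two-stream late-fusion structure the abstract describes.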
Related papers
- Modeling User Preferences via Brain-Computer Interfacing [54.3727087164445]
We use Brain-Computer Interfacing technology to infer users' preferences, their attentional correlates towards visual content, and their associations with affective experience.
We link these to relevant applications, such as information retrieval, personalized steering of generative models, and crowdsourcing population estimates of affective experiences.
arXiv Detail & Related papers (2024-05-15T20:41:46Z)
- Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
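A common way to measure such transferability, sketched below under assumed names (`backbone` and `loader` are placeholders, not this paper's confirmed protocol), is a linear probe: freeze the pre-trained backbone and fit only a linear classifier on the downstream dataset.

```python
# Minimal linear-probe sketch for measuring representation transfer.
import torch
import torch.nn as nn

def linear_probe(backbone, feat_dim, num_classes, loader, epochs=5, lr=1e-3):
    backbone.eval()  # backbone stays frozen; only the head is trained
    head = nn.Linear(feat_dim, num_classes)
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    for _ in range(epochs):
        for clips, labels in loader:
            with torch.no_grad():
                feats = backbone(clips)  # (B, feat_dim) clip features
            loss = nn.functional.cross_entropy(head(feats), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return head
```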
arXiv Detail & Related papers (2023-11-10T18:38:14Z)
- PoseAction: Action Recognition for Patients in the Ward using Deep Learning Approaches [0.0]
We propose using computer vision (CV) and deep learning (DL) methods for detecting subjects and recognizing their actions.
We utilize OpenPose as an accurate subject detector for recognizing the positions of human subjects in the video stream.
We employ AlphAction's Asynchronous Interaction Aggregation (AIA) network to predict the actions of detected subjects.
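The overall flow is detect-then-recognize. The sketch below is a hypothetical skeleton of such a pipeline, not the PoseAction code: `detect_subjects` stands in for an OpenPose-style person detector and `predict_actions` for an AIA-style action head; both are placeholder implementations.

```python
# Hypothetical detect-then-recognize pipeline (placeholders, not a real API).
from typing import List, Tuple
import numpy as np

Box = Tuple[int, int, int, int]  # (x, y, w, h)

def detect_subjects(frame: np.ndarray) -> List[Box]:
    # Placeholder: a real system would run an OpenPose-style detector here.
    h, w = frame.shape[:2]
    return [(0, 0, w, h)]  # one full-frame box as a stand-in

def predict_actions(frames: List[np.ndarray], boxes: List[Box]) -> List[str]:
    # Placeholder: a real system would run an AIA-style network that pools
    # features around each box over the whole clip.
    return ["unknown"] * len(boxes)

def process_clip(frames: List[np.ndarray]) -> List[str]:
    # Detect subjects on the middle keyframe, then classify each subject's
    # action using the surrounding frames as temporal context.
    keyframe = frames[len(frames) // 2]
    boxes = detect_subjects(keyframe)
    return predict_actions(frames, boxes)
```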
arXiv Detail & Related papers (2023-10-05T03:33:35Z)
- Human-centric Scene Understanding for 3D Large-scale Scenarios [52.12727427303162]
We present a large-scale multi-modal dataset for human-centric scene understanding, dubbed HuCenLife.
Our HuCenLife can benefit many 3D perception tasks, such as segmentation, detection, and action recognition.
arXiv Detail & Related papers (2023-07-26T08:40:46Z)
- Video-based Pose-Estimation Data as Source for Transfer Learning in Human Activity Recognition [71.91734471596433]
Human Activity Recognition (HAR) using on-body devices identifies specific human actions in unconstrained environments.
Previous works demonstrated that transfer learning is a good strategy for addressing scenarios with scarce data.
This paper proposes using datasets intended for human-pose estimation as a source for transfer learning.
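One plausible bridge between the two modalities, sketched below as an assumption on my part rather than the paper's confirmed method, is to derive virtual inertial signals from pose sequences: differentiating a joint's position twice approximates the acceleration an on-body sensor at that joint would measure.

```python
# Sketch: virtual accelerometer signal from one joint's pose trajectory.
import numpy as np

def virtual_accel(joint_xyz: np.ndarray, fps: float) -> np.ndarray:
    """Approximate accelerometer readings from a (T, 3) joint trajectory."""
    dt = 1.0 / fps
    vel = np.gradient(joint_xyz, dt, axis=0)  # finite-difference velocity
    return np.gradient(vel, dt, axis=0)       # finite-difference acceleration
```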
arXiv Detail & Related papers (2022-12-02T18:19:36Z)
- MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain [23.598727613908853]
We present MECCANO, a dataset of egocentric videos for studying human behavior understanding in industrial-like settings.
The multimodality is characterized by the presence of gaze signals, depth maps and RGB videos acquired simultaneously with a custom headset.
The dataset has been explicitly labeled for fundamental tasks in human behavior understanding from a first-person view.
arXiv Detail & Related papers (2022-09-19T00:52:42Z)
- Bodily Behaviors in Social Interaction: Novel Annotations and State-of-the-Art Evaluation [0.0]
We present BBSI, the first set of annotations of complex Bodily Behaviors embedded in continuous Social Interactions.
Based on previous work in psychology, we manually annotated 26 hours of spontaneous human behavior.
We adapt the Pyramid Dilated Attention Network (PDAN), a state-of-the-art approach for human action detection.
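PDAN itself builds on dilated attention over per-frame features; the sketch below substitutes a plain dilated temporal convolution as a simpler stand-in for the core idea that stacking layers with growing dilation enlarges the temporal receptive field (all names and sizes are illustrative assumptions).

```python
# Illustrative dilated temporal block (a simplification, not PDAN itself).
import torch.nn as nn

class DilatedTemporalBlock(nn.Module):
    def __init__(self, channels: int, dilation: int):
        super().__init__()
        # kernel 3 with padding == dilation preserves the sequence length
        self.conv = nn.Conv1d(channels, channels, kernel_size=3,
                              padding=dilation, dilation=dilation)
        self.act = nn.ReLU()

    def forward(self, x):  # x: (B, C, T) per-frame feature sequence
        return self.act(self.conv(x)) + x  # residual connection
```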
arXiv Detail & Related papers (2022-07-26T11:24:00Z)
- Video Action Detection: Analysing Limitations and Challenges [70.01260415234127]
We analyze existing datasets on video action detection and discuss their limitations.
We perform a bias study that analyzes a key property differentiating videos from static images: the temporal aspect.
These extreme experiments reveal biases that have crept into existing methods in spite of careful modeling.
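A simple probe of the temporal aspect, sketched below under assumed inputs (`model`, `clips`, and `labels` are placeholders), is to compare accuracy on ordered clips against the same clips with their frames shuffled; a small gap suggests the model relies on static appearance rather than motion.

```python
# Sketch: measure how much a video model actually depends on frame order.
import torch

@torch.no_grad()
def temporal_bias_gap(model, clips, labels):
    # clips: (B, T, C, H, W) video tensor; labels: (B,) class indices
    ordered = model(clips).argmax(dim=-1)
    idx = torch.randperm(clips.shape[1])       # random frame permutation
    shuffled = model(clips[:, idx]).argmax(dim=-1)
    acc_ordered = (ordered == labels).float().mean().item()
    acc_shuffled = (shuffled == labels).float().mean().item()
    return acc_ordered - acc_shuffled  # small gap => model ignores time
```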
arXiv Detail & Related papers (2022-04-17T00:42:14Z)
- Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training, we swap neural and behavioral data across animals that seem to be performing similar actions.
arXiv Detail & Related papers (2021-12-02T12:45:46Z)
- A Grid-based Representation for Human Action Recognition [12.043574473965318]
Human action recognition (HAR) in videos is a fundamental research topic in computer vision.
We propose a novel method for action recognition that efficiently encodes the most discriminative appearance information of an action.
Our method is tested on several benchmark datasets, demonstrating that it accurately recognizes human actions.
arXiv Detail & Related papers (2020-10-17T18:25:00Z)