A Grid-based Representation for Human Action Recognition
- URL: http://arxiv.org/abs/2010.08841v2
- Date: Thu, 29 Oct 2020 14:39:08 GMT
- Title: A Grid-based Representation for Human Action Recognition
- Authors: Soufiane Lamghari, Guillaume-Alexandre Bilodeau, Nicolas Saunier
- Abstract summary: Human action recognition (HAR) in videos is a fundamental research topic in computer vision.
We propose a novel method for action recognition that efficiently encodes the most discriminative appearance information of an action.
Our method is tested on several benchmark datasets, demonstrating that our model can accurately recognize human actions.
- Score: 12.043574473965318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human action recognition (HAR) in videos is a fundamental research topic in
computer vision. It consists mainly in understanding actions performed by
humans from a sequence of visual observations. In recent years, HAR has
witnessed significant progress, especially with the emergence of deep learning
models. However, most existing approaches for action recognition rely on
information that is not always relevant to this task, and are limited in how
they fuse temporal information. In this paper, we propose a novel method for
human action recognition that efficiently encodes the most discriminative
appearance information of an action, with explicit attention on
representative pose features, into a new compact grid representation. Our GRAR
(Grid-based Representation for Action Recognition) method is tested on several
benchmark datasets, demonstrating that our model can accurately recognize human
actions despite intra-class appearance variations and occlusion challenges.
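The abstract describes tiling the most discriminative appearance information of an action into a single compact grid image. The paper's exact selection and attention mechanisms are not reproduced here; the following is a minimal sketch of the general idea, assuming frames have already been cropped around the actor (e.g. using pose keypoints): uniformly sample a fixed number of frames from the clip, resize each to a common cell size, and tile them into one grid image that a standard 2D CNN could consume. The function name and parameters are illustrative, not the authors' API.

```python
import numpy as np

def build_grid_representation(frames, grid_size=2, cell_hw=(64, 64)):
    """Tile uniformly sampled video frames into one compact grid image.

    frames: list of HxWx3 uint8 arrays (assumed already cropped around
    the actor; the pose-based attention step from the paper is omitted).
    Returns an array of shape (grid_size*h, grid_size*w, 3).
    """
    n_cells = grid_size * grid_size
    # Uniformly sample n_cells frame indices across the clip.
    idx = np.linspace(0, len(frames) - 1, n_cells).round().astype(int)
    h, w = cell_hw
    rows = []
    for r in range(grid_size):
        row = []
        for c in range(grid_size):
            f = frames[idx[r * grid_size + c]]
            # Nearest-neighbour resize via index sampling (avoids an
            # OpenCV dependency for this sketch).
            ys = np.linspace(0, f.shape[0] - 1, h).astype(int)
            xs = np.linspace(0, f.shape[1] - 1, w).astype(int)
            row.append(f[ys][:, xs])
        rows.append(np.concatenate(row, axis=1))
    return np.concatenate(rows, axis=0)
```

A 2x2 grid over a clip thus compresses its temporal extent into a single image, so temporal fusion happens implicitly in the spatial layout rather than through a recurrent or 3D model.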
Related papers
- Are Visual-Language Models Effective in Action Recognition? A Comparative Study [22.97135293252601]
This paper provides a large-scale study of, and insights into, state-of-the-art vision foundation models.
It compares their transfer ability onto zero-shot and frame-wise action recognition tasks.
Experiments are conducted on recent fine-grained, human-centric action recognition datasets.
arXiv Detail & Related papers (2024-10-22T16:28:21Z)
- A Comprehensive Review of Few-shot Action Recognition [64.47305887411275]
Few-shot action recognition aims to address the high cost and impracticality of manually labeling complex and variable video data.
It requires accurately classifying human actions in videos using only a few labeled examples per class.
arXiv Detail & Related papers (2024-07-20T03:53:32Z)
- Self-Explainable Affordance Learning with Embodied Caption [63.88435741872204]
We introduce Self-Explainable Affordance learning (SEA) with embodied caption.
SEA enables robots to articulate their intentions and bridge the gap between explainable vision-language caption and visual affordance learning.
We propose a novel model to effectively combine affordance grounding with self-explanation in a simple but efficient manner.
arXiv Detail & Related papers (2024-04-08T15:22:38Z)
- Video-based Human Action Recognition using Deep Learning: A Review [4.976815699476327]
Human action recognition is an important application domain in computer vision.
Deep learning has been given particular attention by the computer vision community.
This paper presents an overview of the current state-of-the-art in action recognition using video analysis with deep learning techniques.
arXiv Detail & Related papers (2022-08-07T17:12:12Z)
- ActAR: Actor-Driven Pose Embeddings for Video Action Recognition [12.043574473965318]
Human action recognition (HAR) in videos is one of the core tasks of video understanding.
We propose a new method that simultaneously learns to efficiently recognize human actions in the infrared spectrum.
arXiv Detail & Related papers (2022-04-19T05:12:24Z)
- Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition [111.87412719773889]
We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data.
Our method achieves the best or competitive performance with the state-of-the-art methods for human action recognition.
arXiv Detail & Related papers (2021-10-28T10:09:34Z)
- Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and Contrastive Meta-Learning [51.03781020616402]
Fine-grained action recognition is attracting increasing attention due to the emerging demand of specific action understanding in real-world applications.
We propose a few-shot fine-grained action recognition problem, aiming to recognize novel fine-grained actions with only a few samples given for each class.
Although progress has been made in coarse-grained actions, existing few-shot recognition methods encounter two issues handling fine-grained actions.
arXiv Detail & Related papers (2021-08-15T02:21:01Z)
- Recent Progress in Appearance-based Action Recognition [73.6405863243707]
Action recognition is a task to identify various human actions in a video.
Recent appearance-based methods have achieved promising progress towards accurate action recognition.
arXiv Detail & Related papers (2020-11-25T10:18:12Z)
- What Can You Learn from Your Muscles? Learning Visual Representation from Human Interactions [50.435861435121915]
We use human interaction and attention cues to investigate whether we can learn better representations compared to visual-only representations.
Our experiments show that our "muscly-supervised" representation outperforms a visual-only state-of-the-art method MoCo.
arXiv Detail & Related papers (2020-10-16T17:46:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.