A Training Method For VideoPose3D With Ideology of Action Recognition
- URL: http://arxiv.org/abs/2206.06430v1
- Date: Mon, 13 Jun 2022 19:25:27 GMT
- Title: A Training Method For VideoPose3D With Ideology of Action Recognition
- Authors: Hao Bai
- Abstract summary: This research shows a faster and more flexible training method for VideoPose3D based on action recognition.
It can handle both action-oriented and common pose-estimation problems.
- Score: 0.9949781365631559
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Action recognition and pose estimation from videos are closely related tasks for understanding human motion, yet most of the literature focuses on solving pose-estimation tasks in isolation from action recognition. This research presents a faster and more flexible training method for VideoPose3D based on action recognition. The model is fed the same type of action as the type to be estimated, and different action types can be trained separately. Evidence shows that, for common pose-estimation tasks, the model requires a relatively small amount of data to achieve results comparable to the original research, and for action-oriented tasks it outperforms the original research by 4.5% on the velocity error of MPJPE with a limited receptive field size and training epochs. The model can handle both action-oriented and common pose-estimation problems.
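The abstract's core idea, grouping training data by action type and fitting one pose lifter per action, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dataset is synthetic, and a least-squares linear lifter stands in for VideoPose3D's temporal convolutional network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: each sample is an action label, a (frames, joints*2)
# array of 2D keypoints, and a (frames, joints*3) array of 3D targets.
def make_sample(action):
    frames, joints = 8, 17
    x = rng.normal(size=(frames, joints * 2))  # 2D pose sequence
    y = rng.normal(size=(frames, joints * 3))  # 3D ground truth
    return action, x, y

dataset = [make_sample(a) for a in ["walk", "sit", "walk", "sit"]]

# The training method described in the abstract: partition samples by
# action type and fit one lifting model per action, rather than a single
# model for all actions.
def fit_lifter(samples):
    X = np.vstack([x for _, x, _ in samples])
    Y = np.vstack([y for _, _, y in samples])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)  # linear stand-in model
    return W

models = {}
for action in {a for a, _, _ in dataset}:
    models[action] = fit_lifter([s for s in dataset if s[0] == action])

# At inference time, select the model matching the clip's action type.
action, x, _ = dataset[0]
pred_3d = x @ models[action]
print(sorted(models), pred_3d.shape)
```

Training separate models per action type is what allows each model to specialize, which is how the method gets away with a smaller receptive field and fewer epochs.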
Related papers
- DOAD: Decoupled One Stage Action Detection Network [77.14883592642782]
Localizing people and recognizing their actions from videos is a challenging task towards high-level video understanding.
Existing methods are mostly two-stage based, with one stage for person bounding box generation and the other stage for action recognition.
We present a decoupled one-stage network, dubbed DOAD, to improve the efficiency of spatio-temporal action detection.
arXiv Detail & Related papers (2023-04-01T08:06:43Z) - STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model achieves comparable performance while using far fewer trainable parameters, with high speed in both training and inference.
arXiv Detail & Related papers (2021-07-15T02:53:11Z) - Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn the effective salient object detection model based on the manual annotation on a few training images only.
We name this task as the few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z) - Skeleton-DML: Deep Metric Learning for Skeleton-Based One-Shot Action Recognition [0.5161531917413706]
One-shot action recognition allows the recognition of human-performed actions with only a single training example.
This can influence human-robot-interaction positively by enabling the robot to react to previously unseen behaviour.
We propose a novel image-based skeleton representation that performs well in a metric learning setting.
arXiv Detail & Related papers (2020-12-26T22:31:11Z) - Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization [33.36330493757669]
We introduce a novel representation learning method to disentangle pose-dependent as well as view-dependent factors from 2D human poses.
The method trains a network using cross-view mutual information (CV-MIM) which maximizes mutual information of the same pose performed from different viewpoints.
CV-MIM outperforms other competing methods by a large margin in the single-shot cross-view setting.
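The CV-MIM objective can be illustrated with an InfoNCE-style contrastive loss, a standard lower bound on mutual information: embeddings of the same pose seen from different viewpoints are treated as positives, all other pairs as negatives. The embeddings below are random stand-ins for encoder outputs; this is an illustrative assumption, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: n poses, each observed from two viewpoints and embedded into
# d-dimensional vectors (a real encoder network would produce these).
n, d = 4, 8
view_a = rng.normal(size=(n, d))
view_b = view_a + 0.05 * rng.normal(size=(n, d))  # same poses, other view

def info_nce(za, zb, temperature=0.1):
    """InfoNCE-style loss: minimizing it maximizes a lower bound on the
    mutual information between the two views' embeddings."""
    za = za / np.linalg.norm(za, axis=1, keepdims=True)
    zb = zb / np.linalg.norm(zb, axis=1, keepdims=True)
    logits = za @ zb.T / temperature              # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # positives on the diagonal

loss = info_nce(view_a, view_b)
print(round(float(loss), 4))
```

Because the positives lie on the diagonal of the similarity matrix, driving the loss down forces view-dependent factors out of the representation, which is what enables the single-shot cross-view recognition result.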
arXiv Detail & Related papers (2020-12-02T18:55:35Z) - Action similarity judgment based on kinematic primitives [48.99831733355487]
We investigate to what extent a computational model based on kinematics can determine action similarity.
The chosen model has its roots in developmental robotics and performs action classification based on learned kinematic primitives.
The results show that both the model and human performance are highly accurate in an action similarity task based on kinematic-level features.
arXiv Detail & Related papers (2020-08-30T13:58:47Z) - Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z) - Delving into 3D Action Anticipation from Streaming Videos [99.0155538452263]
Action anticipation aims to recognize an action from a partial observation.
We introduce several complementary evaluation metrics and present a basic model based on frame-wise action classification.
We also explore multi-task learning strategies by incorporating auxiliary information from two aspects: the full action representation and the class-agnostic action label.
arXiv Detail & Related papers (2019-06-15T10:30:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.