Shifting Perspective to See Difference: A Novel Multi-View Method for
Skeleton based Action Recognition
- URL: http://arxiv.org/abs/2209.02986v1
- Date: Wed, 7 Sep 2022 08:20:37 GMT
- Title: Shifting Perspective to See Difference: A Novel Multi-View Method for
Skeleton based Action Recognition
- Authors: Ruijie Hou, Yanran Li, Ningyu Zhang, Yulin Zhou, Xiaosong Yang, Zhao
Wang
- Abstract summary: Skeleton-based human action recognition is a longstanding challenge due to its complex dynamics.
We propose a conceptually simple yet effective Multi-view strategy that recognizes actions from a collection of dynamic view features.
Our module can work seamlessly with the existing action classification model.
- Score: 22.004971546763162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Skeleton-based human action recognition is a longstanding challenge due to
its complex dynamics. Some fine-grain details of the dynamics play a vital role
in classification. The existing work largely focuses on designing incremental
neural networks with more complicated adjacent matrices to capture the details
of joints relationships. However, they still have difficulties distinguishing
actions that have broadly similar motion patterns but belong to different
categories. Interestingly, we found that the subtle differences in motion
patterns can be significantly amplified and become easy for audience to
distinct through specified view directions, where this property haven't been
fully explored before. Drastically different from previous work, we boost the
performance by proposing a conceptually simple yet effective Multi-view
strategy that recognizes actions from a collection of dynamic view features.
Specifically, we design a novel Skeleton-Anchor Proposal (SAP) module which
contains a Multi-head structure to learn a set of views. For feature learning
of different views, we introduce a novel Angle Representation to transform the
actions under different views and feed the transformations into the baseline
model. Our module can work seamlessly with the existing action classification
model. Incorporated with baseline models, our SAP module exhibits clear
performance gains on many challenging benchmarks. Moreover, comprehensive
experiments show that our model consistently beats down the state-of-the-art
and remains effective and robust especially when dealing with corrupted data.
Related code will be available on https://github.com/ideal-idea/SAP .
Related papers
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z) - UniFS: Universal Few-shot Instance Perception with Point Representations [36.943019984075065]
We propose UniFS, a universal few-shot instance perception model that unifies a wide range of instance perception tasks.
Our approach makes minimal assumptions about the tasks, yet it achieves competitive results compared to highly specialized and well optimized specialist models.
arXiv Detail & Related papers (2024-04-30T09:47:44Z) - An Efficient General-Purpose Modular Vision Model via Multi-Task
Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z) - Multi-Modal Few-Shot Temporal Action Detection [157.96194484236483]
Few-shot (FS) and zero-shot (ZS) learning are two different approaches for scaling temporal action detection to new classes.
We introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered as a marriage of FS-TAD and ZS-TAD.
arXiv Detail & Related papers (2022-11-27T18:13:05Z) - Modular Networks Prevent Catastrophic Interference in Model-Based
Multi-Task Reinforcement Learning [0.8883733362171032]
We study whether model-based multi-task reinforcement learning benefits from shared dynamics models in a similar way model-free methods do from shared policy networks.
Using a single dynamics model, we see clear evidence of task confusion and reduced performance.
As a remedy, enforcing an internal structure for the learned dynamics model by training isolated sub-networks for each task notably improves performance.
arXiv Detail & Related papers (2021-11-15T12:31:31Z) - Visual Concept Reasoning Networks [93.99840807973546]
A split-transform-merge strategy has been broadly used as an architectural constraint in convolutional neural networks for visual recognition tasks.
We propose to exploit this strategy and combine it with our Visual Concept Reasoning Networks (VCRNet) to enable reasoning between high-level visual concepts.
Our proposed model, VCRNet, consistently improves the performance by increasing the number of parameters by less than 1%.
arXiv Detail & Related papers (2020-08-26T20:02:40Z) - Bowtie Networks: Generative Modeling for Joint Few-Shot Recognition and
Novel-View Synthesis [39.53519330457627]
We propose a novel task of joint few-shot recognition and novel-view synthesis.
We aim to simultaneously learn an object classifier and generate images of that type of object from new viewpoints.
We focus on the interaction and cooperation between a generative model and a discriminative model.
arXiv Detail & Related papers (2020-08-16T19:40:56Z) - S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards exploiting dynamic structure that are capable of simultaneously exploiting both modular andtemporal structures.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z) - Dynamic Feature Integration for Simultaneous Detection of Salient
Object, Edge and Skeleton [108.01007935498104]
In this paper, we solve three low-level pixel-wise vision problems, including salient object segmentation, edge detection, and skeleton extraction.
We first show some similarities shared by these tasks and then demonstrate how they can be leveraged for developing a unified framework.
arXiv Detail & Related papers (2020-04-18T11:10:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.