ConvGRU in Fine-grained Pitching Action Recognition for Action Outcome
Prediction
- URL: http://arxiv.org/abs/2008.07819v1
- Date: Tue, 18 Aug 2020 09:27:17 GMT
- Title: ConvGRU in Fine-grained Pitching Action Recognition for Action Outcome
Prediction
- Authors: Tianqi Ma, Lin Zhang, Xiumin Diao, Ou Ma
- Abstract summary: Fine-grained action recognition is significant in many fields such as human-robot interaction, intelligent traffic management, sports training, and health care.
In this paper, we explore the performance of the convolutional gated recurrent unit (ConvGRU) method on a fine-grained action recognition task.
Based on sequences of RGB images of human actions, the proposed approach achieved 79.17% accuracy.
- Score: 4.073910992747716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prediction of the action outcome is a new challenge for a robot
collaboratively working with humans. With the impressive progress in video
action recognition in recent years, fine-grained action recognition from video
data has become a new concern. Fine-grained action recognition detects subtle
differences between actions at a finer granularity and is significant in many
fields such as human-robot interaction, intelligent traffic management, sports
training, and health care. Because different outcomes are closely
connected to subtle differences in actions, fine-grained action recognition
is a practical method for action outcome prediction. In this paper, we explore
the performance of the convolutional gated recurrent unit (ConvGRU) method on a
fine-grained action recognition task: predicting the outcomes of ball pitching.
Based on sequences of RGB images of human actions, the proposed approach
achieved 79.17% accuracy, which exceeds the current
state-of-the-art result. We also compared different network implementations and
showed the influence of different image sampling methods, fusion
methods, pre-training, and other factors. Finally, we discussed the advantages and
limitations of ConvGRU in such action outcome prediction and fine-grained
action recognition tasks.
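The abstract names ConvGRU but does not spell out the update rule. As a rough illustration only, the sketch below implements one common formulation of a ConvGRU step (a GRU whose matrix products are replaced by 2D convolutions) in plain Python. It is not the authors' network: the single-channel feature maps, 3x3 kernels, omitted bias terms, the gate convention h_t = (1 - z) * h_{t-1} + z * candidate, and every name in it (conv2d_same, convgru_step, run_sequence, random_kernel, the parameter keys) are assumptions made for this example.

```python
import math
import random


def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (output keeps input size)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def combine(fn, *maps):
    """Apply fn elementwise across equally sized 2D maps."""
    return [[fn(*vals) for vals in zip(*rows)] for rows in zip(*maps)]


def convgru_step(x, h, p):
    """One ConvGRU update. p maps hypothetical names (Wz, Uz, ...) to kernels; biases omitted."""
    # Update gate z and reset gate r, both computed with convolutions.
    z = combine(lambda a, b: sigmoid(a + b), conv2d_same(x, p["Wz"]), conv2d_same(h, p["Uz"]))
    r = combine(lambda a, b: sigmoid(a + b), conv2d_same(x, p["Wr"]), conv2d_same(h, p["Ur"]))
    # Candidate state uses the reset-gated previous hidden state.
    rh = combine(lambda a, b: a * b, r, h)
    cand = combine(lambda a, b: math.tanh(a + b), conv2d_same(x, p["Wh"]), conv2d_same(rh, p["Uh"]))
    # Blend previous state and candidate (assumed convention; some papers swap z and 1 - z).
    return combine(lambda zz, hh, cc: (1.0 - zz) * hh + zz * cc, z, h, cand)


def run_sequence(frames, params):
    """Feed a list of equally sized 2D frames through the cell, starting from a zero state."""
    hidden = [[0.0] * len(frames[0][0]) for _ in frames[0]]
    for frame in frames:
        hidden = convgru_step(frame, hidden, params)
    return hidden


def random_kernel(rng, size=3):
    """Hypothetical initializer: uniform weights in [-0.5, 0.5]."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(size)] for _ in range(size)]


if __name__ == "__main__":
    rng = random.Random(0)
    params = {name: random_kernel(rng) for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
    # Three toy 4x4 "frames" stand in for per-frame feature maps.
    frames = [[[rng.random() for _ in range(4)] for _ in range(4)] for _ in range(3)]
    final_state = run_sequence(frames, params)
    print(len(final_state), len(final_state[0]))
```

Because the hidden state keeps its spatial layout, the final state could be passed to a classifier head to predict the pitching outcome; how the paper actually fuses frames and produces the prediction is not stated in the abstract.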
Related papers
- FinePseudo: Improving Pseudo-Labelling through Temporal-Alignablity for Semi-Supervised Fine-Grained Action Recognition [57.17966905865054]
Real-life applications of action recognition often require a fine-grained understanding of subtle movements.
Existing semi-supervised action recognition has mainly focused on coarse-grained action recognition.
We propose an Alignability-Verification-based Metric learning technique to effectively discriminate between fine-grained action pairs.
arXiv Detail & Related papers (2024-09-02T20:08:06Z)
- Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations.
We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z)
- Learning Manipulation by Predicting Interaction [85.57297574510507]
We propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction.
The experimental results demonstrate that MPI exhibits remarkable improvement by 10% to 64% compared with previous state-of-the-art in real-world robot platforms.
arXiv Detail & Related papers (2024-06-01T13:28:31Z)
- Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy [12.257725479880458]
Action recognition has become one of the popular research topics in computer vision.
We propose a multi-view attention consistency method that computes the similarity between two attentions from two different views of the action videos.
Our approach applies the idea of Neural Radiance Field to implicitly render the features from novel views when training on single-view datasets.
arXiv Detail & Related papers (2024-05-02T14:43:21Z)
- Unsupervised Learning of Effective Actions in Robotics [0.9374652839580183]
Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's actions.
We propose an unsupervised algorithm to discretize a continuous motion space and generate "action prototypes".
We evaluate our method on a simulated stair-climbing reinforcement learning task.
arXiv Detail & Related papers (2024-04-03T13:28:52Z)
- DOAD: Decoupled One Stage Action Detection Network [77.14883592642782]
Localizing people and recognizing their actions from videos is a challenging task towards high-level video understanding.
Existing methods are mostly two-stage based, with one stage for person bounding box generation and the other stage for action recognition.
We present a decoupled one-stage network, dubbed DOAD, to improve the efficiency of spatio-temporal action detection.
arXiv Detail & Related papers (2023-04-01T08:06:43Z)
- Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and Contrastive Meta-Learning [51.03781020616402]
Fine-grained action recognition is attracting increasing attention due to the emerging demand of specific action understanding in real-world applications.
We propose a few-shot fine-grained action recognition problem, aiming to recognize novel fine-grained actions with only few samples given for each class.
Although progress has been made in coarse-grained actions, existing few-shot recognition methods encounter two issues handling fine-grained actions.
arXiv Detail & Related papers (2021-08-15T02:21:01Z)
- Object and Relation Centric Representations for Push Effect Prediction [18.990827725752496]
Pushing is an essential non-prehensile manipulation skill used for tasks ranging from pre-grasp manipulation to scene rearrangement.
We propose a graph neural network based framework for effect prediction and parameter estimation of pushing actions.
Our framework is validated both in real and simulated environments containing different shaped multi-part objects connected via different types of joints and objects with different masses.
arXiv Detail & Related papers (2021-02-03T15:09:12Z)
- Recent Progress in Appearance-based Action Recognition [73.6405863243707]
Action recognition is a task to identify various human actions in a video.
Recent appearance-based methods have achieved promising progress towards accurate action recognition.
arXiv Detail & Related papers (2020-11-25T10:18:12Z)
- Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction [11.285529781751984]
We propose an attention-oriented multi-level network framework to meet the need for real-time interaction.
Specifically, a Pre-Attention network is employed to roughly focus on the interactor in the scene at low resolution.
The other compact CNN receives the extracted skeleton sequence as input for action recognition.
arXiv Detail & Related papers (2020-07-02T12:41:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.