Accurate Temporal Action Proposal Generation with Relation-Aware Pyramid
Network
- URL: http://arxiv.org/abs/2003.04145v1
- Date: Mon, 9 Mar 2020 13:47:36 GMT
- Title: Accurate Temporal Action Proposal Generation with Relation-Aware Pyramid
Network
- Authors: Jialin Gao, Zhixiang Shi, Jiani Li, Guanshuo Wang, Yufeng Yuan,
Shiming Ge, and Xi Zhou
- Abstract summary: We propose a Relation-aware pyramid Network (RapNet) to generate highly accurate temporal action proposals.
In RapNet, a novel relation-aware module is introduced to exploit bi-directional long-range relations between local features for context distilling.
- Score: 29.7640925776191
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate temporal action proposals play an important role in detecting
actions from untrimmed videos. The existing approaches have difficulties in
capturing global contextual information and simultaneously localizing actions
with different durations. To this end, we propose a Relation-aware pyramid
Network (RapNet) to generate highly accurate temporal action proposals. In
RapNet, a novel relation-aware module is introduced to exploit bi-directional
long-range relations between local features for context distilling. This
embedded module enhances the RapNet in terms of its multi-granularity temporal
proposal generation ability, given predefined anchor boxes. We further
introduce a two-stage adjustment scheme to refine the proposal boundaries and
measure their confidence in containing an action with snippet-level actionness.
Extensive experiments on the challenging ActivityNet and THUMOS14 benchmarks
demonstrate that RapNet generates more accurate proposals than the existing
state-of-the-art methods.
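The abstract refers to predefined anchor boxes at multiple temporal granularities. As a rough illustration only, the sketch below places anchors of several scales along a video timeline and scores them against a ground-truth segment with temporal Intersection-over-Union (tIoU). The function names, the cell layout, the scales, and the example segment are all assumptions made for this sketch; they are not taken from the paper and do not reproduce RapNet's actual anchor configuration.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def make_anchors(duration: float, num_cells: int, scales: List[float]) -> List[Segment]:
    """Place one anchor per scale at the centre of each temporal cell.

    Hypothetical layout: the video is split into `num_cells` equal cells and
    each scale is a multiple of the cell length.
    """
    cell = duration / num_cells
    anchors: List[Segment] = []
    for i in range(num_cells):
        centre = (i + 0.5) * cell
        for scale in scales:
            half = 0.5 * scale * cell
            # Clip each anchor to the extent of the video.
            anchors.append((max(0.0, centre - half), min(duration, centre + half)))
    return anchors


def tiou(a: Segment, b: Segment) -> float:
    """Temporal IoU: overlap length divided by the length of the union."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


# Illustrative values only: a 10-second video, 5 cells, two anchor scales.
anchors = make_anchors(duration=10.0, num_cells=5, scales=[1.0, 2.0])
ground_truth = (3.0, 6.0)
best = max(anchors, key=lambda seg: tiou(seg, ground_truth))
print(best, tiou(best, ground_truth))
```

In this toy setup the longer anchors cover the 3-second action better than the short ones, which is the usual motivation for mixing several anchor scales when actions vary in duration.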
Related papers
- Faster Learning of Temporal Action Proposal via Sparse Multilevel
Boundary Generator [9.038216757761955]
Temporal action localization in videos presents significant challenges in the field of computer vision.
We propose a novel framework, Sparse Multilevel Boundary Generator (SMBG), which enhances the boundary-sensitive method with boundary classification and action completeness regression.
Our method is evaluated on two popular benchmarks, ActivityNet-1.3 and THUMOS14, and is shown to achieve state-of-the-art performance with a better inference speed (2.47x BSN++, 2.12x DBG).
arXiv Detail & Related papers (2023-03-06T14:26:56Z)
- Temporal Action Proposal Generation with Background Constraint [25.783837570359267]
Temporal action proposal generation (TAPG) is a challenging task that aims to locate action instances in untrimmed videos with temporal boundaries.
To evaluate the confidence of proposals, existing works typically predict an action score for each proposal, supervised by the temporal Intersection-over-Union (tIoU) between the proposal and the ground truth.
In this paper, we propose a general auxiliary Background Constraint to further suppress low-quality proposals.
arXiv Detail & Related papers (2021-12-15T09:20:49Z)
- Augmented Transformer with Adaptive Graph for Temporal Action Proposal
Generation [79.98992138865042]
We present an augmented transformer with adaptive graph network (ATAG) to exploit both long-range and local temporal contexts for TAPG.
Specifically, we enhance the vanilla transformer by equipping it with a snippet actionness loss and a front block; the result is dubbed the augmented transformer.
An adaptive graph convolutional network (GCN) is proposed to build local temporal context by mining the position information and difference between adjacent features.
arXiv Detail & Related papers (2021-03-30T02:01:03Z)
- Temporal Context Aggregation Network for Temporal Action Proposal
Refinement [93.03730692520999]
Temporal action proposal generation is a challenging yet important task in the video understanding field.
Current methods still suffer from inaccurate temporal boundaries and inferior confidence used for retrieval.
We propose TCANet to generate high-quality action proposals through "local and global" temporal context aggregation.
arXiv Detail & Related papers (2021-03-24T12:34:49Z)
- Two-Stream Consensus Network for Weakly-Supervised Temporal Action
Localization [94.37084866660238]
We present a Two-Stream Consensus Network (TSCN) to simultaneously address these challenges.
The proposed TSCN features an iterative refinement training method, where a frame-level pseudo ground truth is iteratively updated.
We propose a new attention normalization loss to encourage the predicted attention to act like a binary selection, and promote the precise localization of action instance boundaries.
arXiv Detail & Related papers (2020-10-22T10:53:32Z)
- BSN++: Complementary Boundary Regressor with Scale-Balanced Relation
Modeling for Temporal Action Proposal Generation [85.13713217986738]
We present BSN++, a new framework which exploits complementary boundary regressor and relation modeling for temporal proposal generation.
Not surprisingly, the proposed BSN++ ranked 1st on the CVPR19 ActivityNet challenge leaderboard for the temporal action localization task.
arXiv Detail & Related papers (2020-09-15T07:08:59Z)
- Revisiting Anchor Mechanisms for Temporal Action Localization [126.96340233561418]
This paper proposes a novel anchor-free action localization module that assists action localization by temporal points.
By combining the proposed anchor-free module with a conventional anchor-based module, we propose a novel action localization framework, called A2Net.
The cooperation between anchor-free and anchor-based modules achieves superior performance to the state-of-the-art on THUMOS14.
arXiv Detail & Related papers (2020-08-22T13:39:29Z)
- Complementary Boundary Generator with Scale-Invariant Relation Modeling
for Temporal Action Localization: Submission to ActivityNet Challenge 2020 [66.4527310659592]
This report presents an overview of our solution used in the submission to ActivityNet Challenge 2020 Task 1.
We decouple the temporal action localization task into two stages (i.e. proposal generation and classification) and enrich the proposal diversity.
Our proposed scheme achieves state-of-the-art performance on the temporal action localization task, with 42.26 average mAP on the challenge testing set.
arXiv Detail & Related papers (2020-07-20T04:35:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.