BSN++: Complementary Boundary Regressor with Scale-Balanced Relation
Modeling for Temporal Action Proposal Generation
- URL: http://arxiv.org/abs/2009.07641v5
- Date: Mon, 1 Mar 2021 08:01:49 GMT
- Title: BSN++: Complementary Boundary Regressor with Scale-Balanced Relation
Modeling for Temporal Action Proposal Generation
- Authors: Haisheng Su, Weihao Gan, Wei Wu, Yu Qiao, Junjie Yan
- Abstract summary: We present BSN++, a new framework which exploits a complementary boundary regressor and relation modeling for temporal proposal generation.
Not surprisingly, the proposed BSN++ ranked 1st on the CVPR19 ActivityNet challenge leaderboard for the temporal action localization task.
- Score: 85.13713217986738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating human action proposals in untrimmed videos is an important yet
challenging task with wide applications. Current methods often suffer from
noisy boundary locations and low-quality confidence scores used for proposal
retrieval. In this paper, we present BSN++, a new framework which exploits a
complementary boundary regressor and relation modeling for temporal proposal
generation. First, we propose a novel boundary regressor based on the
complementary characteristics of the starting and ending boundary classifiers.
Specifically, we utilize a U-shaped architecture with nested skip connections
to capture rich contexts and introduce a bi-directional boundary matching
mechanism to improve boundary precision. Second, to account for the
proposal-proposal relations ignored in previous methods, we devise a proposal
relation block which includes two self-attention modules, one operating on
position and one on channel. Furthermore, we find that data imbalance
inevitably exists in the positive/negative proposals and in temporal durations,
which harms model performance on tail distributions. To relieve this issue,
we introduce a scale-balanced re-sampling strategy. Extensive experiments on
two popular benchmarks, ActivityNet-1.3 and THUMOS14, demonstrate that BSN++
achieves state-of-the-art performance. Not surprisingly, the proposed BSN++
ranked 1st on the CVPR19 ActivityNet challenge leaderboard for the temporal
action localization task.
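The abstract names a scale-balanced re-sampling strategy but does not spell it out here. As a rough illustration only, one simple way to rebalance proposals by temporal duration is to bucket them by length and oversample each bucket (with replacement) to the size of the largest one. The function name, the bin count, and the oversampling rule below are assumptions made for this sketch, not the procedure from the paper.

```python
import random
from collections import defaultdict


def scale_balanced_resample(proposals, num_bins=4, seed=0):
    """Oversample proposals so every duration bin is equally represented.

    proposals: list of (start, end) pairs with times normalized to [0, 1].
    NOTE: hypothetical helper for illustration; not the BSN++ implementation.
    """
    rng = random.Random(seed)

    # Group proposals into equal-width duration bins.
    bins = defaultdict(list)
    for start, end in proposals:
        duration = end - start
        index = min(int(duration * num_bins), num_bins - 1)
        bins[index].append((start, end))

    if not bins:
        return []

    # Draw the same number of samples from every non-empty bin.
    per_bin = max(len(members) for members in bins.values())
    balanced = []
    for index in sorted(bins):
        balanced.extend(rng.choices(bins[index], k=per_bin))
    return balanced


if __name__ == "__main__":
    skewed = [(0.0, 0.1)] * 6 + [(0.05, 0.95)]
    print(len(scale_balanced_resample(skewed)))
```

With six short proposals and one long proposal as input, the long proposal is drawn as often as the short ones, so the tail of the duration distribution is no longer under-represented in a training batch.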
Related papers
- Faster Learning of Temporal Action Proposal via Sparse Multilevel
Boundary Generator [9.038216757761955]
Temporal action localization in videos presents significant challenges in the field of computer vision.
We propose a novel framework, Sparse Multilevel Boundary Generator (SMBG), which enhances the boundary-sensitive method with boundary classification and action completeness regression.
Our method is evaluated on two popular benchmarks, ActivityNet-1.3 and THUMOS14, and is shown to achieve state-of-the-art performance with faster inference (2.47x the speed of BSN++, 2.12x that of DBG).
arXiv Detail & Related papers (2023-03-06T14:26:56Z)
- Semi-Supervised Temporal Action Detection with Proposal-Free Masking [134.26292288193298]
We propose a novel Semi-supervised Temporal action detection model based on PropOsal-free Temporal mask (SPOT).
SPOT outperforms state-of-the-art alternatives, often by a large margin.
arXiv Detail & Related papers (2022-07-14T16:58:47Z)
- Temporal Action Proposal Generation with Background Constraint [25.783837570359267]
Temporal action proposal generation (TAPG) is a challenging task that aims to locate action instances in untrimmed videos with temporal boundaries.
To evaluate the confidence of proposals, existing works typically predict an action score for each proposal, supervised by the temporal Intersection-over-Union (tIoU) between the proposal and the ground truth.
In this paper, we propose a general auxiliary Background Constraint to further suppress low-quality proposals.
arXiv Detail & Related papers (2021-12-15T09:20:49Z)
- Adaptive Proposal Generation Network for Temporal Sentence Localization
in Videos [58.83440885457272]
We address the problem of temporal sentence localization in videos (TSLV).
Traditional methods follow a top-down framework which localizes the target segment with pre-defined segment proposals.
We propose an Adaptive Proposal Generation Network (APGN) to maintain the segment-level interaction while improving efficiency.
arXiv Detail & Related papers (2021-09-14T02:02:36Z)
- Temporal Context Aggregation Network for Temporal Action Proposal
Refinement [93.03730692520999]
Temporal action proposal generation is a challenging yet important task in the video understanding field.
Current methods still suffer from inaccurate temporal boundaries and inferior confidence used for retrieval.
We propose TCANet to generate high-quality action proposals through "local and global" temporal context aggregation.
arXiv Detail & Related papers (2021-03-24T12:34:49Z)
- Learning Salient Boundary Feature for Anchor-free Temporal Action
Localization [81.55295042558409]
Temporal action localization is an important yet challenging task in video understanding.
We propose the first purely anchor-free temporal localization method.
Our model includes (i) an end-to-end trainable basic predictor, (ii) a saliency-based refinement module, and (iii) several consistency constraints.
arXiv Detail & Related papers (2021-03-24T12:28:32Z)
- Boundary Content Graph Neural Network for Temporal Action Proposal
Generation [16.42008388422392]
Temporal action proposal generation plays an important role in video action understanding.
We propose a novel Boundary Content Graph Neural Network (BC-GNN) to model the insightful relations between the boundary and action content of temporal proposals.
BC-GNN outperforms previous state-of-the-art methods in both temporal action proposal and temporal action detection tasks.
arXiv Detail & Related papers (2020-08-04T09:35:11Z)
- Complementary Boundary Generator with Scale-Invariant Relation Modeling
for Temporal Action Localization: Submission to ActivityNet Challenge 2020 [66.4527310659592]
This report presents an overview of our solution used in the submission to ActivityNet Challenge 2020 Task 1.
We decouple the temporal action localization task into two stages (i.e. proposal generation and classification) and enrich the proposal diversity.
Our proposed scheme achieves the state-of-the-art performance on the temporal action localization task with 42.26 average mAP on the challenge testing set.
arXiv Detail & Related papers (2020-07-20T04:35:40Z)
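The Background Constraint entry above refers to the temporal Intersection-over-Union (tIoU) between a proposal and the ground truth. For readers unfamiliar with the measure, a minimal sketch follows: it treats each segment as a (start, end) interval and divides the overlap length by the length of the union. The function name and the tuple representation are assumptions made for this illustration.

```python
def temporal_iou(proposal, ground_truth):
    """Temporal IoU of two (start, end) segments given in the same time unit.

    NOTE: illustrative helper; names and input format are assumptions.
    """
    p_start, p_end = proposal
    g_start, g_end = ground_truth

    # Overlap length is zero when the segments are disjoint.
    intersection = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    union = (p_end - p_start) + (g_end - g_start) - intersection

    # Guard against degenerate zero-length segments.
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    print(temporal_iou((0.0, 10.0), (5.0, 15.0)))
```

For a proposal spanning 0 to 10 seconds and a ground-truth action spanning 5 to 15 seconds, the overlap is 5 seconds and the union is 15 seconds, giving a tIoU of one third.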
This list is automatically generated from the titles and abstracts of the papers on this site.