Boundary Discretization and Reliable Classification Network for Temporal Action Detection
- URL: http://arxiv.org/abs/2310.06403v4
- Date: Fri, 7 Jun 2024 10:19:33 GMT
- Title: Boundary Discretization and Reliable Classification Network for Temporal Action Detection
- Authors: Zhenying Fang, Jun Yu, Richang Hong,
- Abstract summary: Temporal action detection aims to recognize the action category and determine each action instance's starting and ending time in untrimmed videos.
Mixed methods have achieved remarkable performance by seamlessly merging anchor-based and anchor-free approaches.
We propose a novel Boundary Discretization and Reliable Classification Network (BDRC-Net) that addresses the issues above by introducing boundary discretization and reliable classification modules.
- Score: 39.17204328036531
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal action detection aims to recognize the action category and determine each action instance's starting and ending time in untrimmed videos. The mixed methods have achieved remarkable performance by seamlessly merging anchor-based and anchor-free approaches. Nonetheless, there are still two crucial issues within the mixed framework: (1) Brute-force merging and handcrafted anchor design hinder the substantial potential and practicality of the mixed methods. (2) Within-category predictions show a significant abundance of false positives. In this paper, we propose a novel Boundary Discretization and Reliable Classification Network (BDRC-Net) that addresses the issues above by introducing boundary discretization and reliable classification modules. Specifically, the boundary discretization module (BDM) elegantly merges anchor-based and anchor-free approaches in the form of boundary discretization, eliminating the need for the traditional handcrafted anchor design. Furthermore, the reliable classification module (RCM) predicts reliable global action categories to reduce false positives. Extensive experiments conducted on different benchmarks demonstrate that our proposed method achieves competitive detection performance. The code will be released at https://github.com/zhenyingfang/BDRC-Net.
Related papers
- Few-shot Open Relation Extraction with Gaussian Prototype and Adaptive Margin [15.118656235473921]
Few-shot relation extraction with none-of-the-above (FsRE with NOTA) aims at predicting labels in few-shot scenarios with unknown classes.
We propose a novel framework based on Gaussian prototype and adaptive margin named GPAM for FsRE with NOTA.
arXiv Detail & Related papers (2024-10-27T03:16:09Z) - Boundary-Aware Proposal Generation Method for Temporal Action
Localization [23.79359799496947]
TAL aims to find the categories and temporal boundaries of actions in an untrimmed video.
Most TAL methods rely heavily on action recognition models that are sensitive to action labels rather than temporal boundaries.
We propose a Boundary-Aware Proposal Generation (BAPG) method with contrastive learning.
arXiv Detail & Related papers (2023-09-25T01:41:09Z) - Temporal Action Localization with Enhanced Instant Discriminability [66.76095239972094]
Temporal action detection (TAD) aims to detect all action boundaries and their corresponding categories in an untrimmed video.
We propose a one-stage framework named TriDet to resolve imprecise predictions of action boundaries by existing methods.
Experimental results demonstrate the robustness of TriDet and its state-of-the-art performance on multiple TAD datasets.
arXiv Detail & Related papers (2023-09-11T16:17:50Z) - Semi-Supervised Temporal Action Detection with Proposal-Free Masking [134.26292288193298]
We propose a novel Semi-supervised Temporal action detection model based on PropOsal-free Temporal mask (SPOT)
SPOT outperforms state-of-the-art alternatives, often by a large margin.
arXiv Detail & Related papers (2022-07-14T16:58:47Z) - Learning Salient Boundary Feature for Anchor-free Temporal Action
Localization [81.55295042558409]
Temporal action localization is an important yet challenging task in video understanding.
We propose the first purely anchor-free temporal localization method.
Our model includes (i) an end-to-end trainable basic predictor, (ii) a saliency-based refinement module, and (iii) several consistency constraints.
arXiv Detail & Related papers (2021-03-24T12:28:32Z) - Mixup-CAM: Weakly-supervised Semantic Segmentation via Uncertainty
Regularization [73.03956876752868]
We propose a principled and end-to-end train-able framework to allow the network to pay attention to other parts of the object.
Specifically, we introduce the mixup data augmentation scheme into the classification network and design two uncertainty regularization terms to better interact with the mixup strategy.
arXiv Detail & Related papers (2020-08-03T21:19:08Z) - Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.