TriDet: Temporal Action Detection with Relative Boundary Modeling
- URL: http://arxiv.org/abs/2303.07347v2
- Date: Thu, 16 Mar 2023 11:26:39 GMT
- Title: TriDet: Temporal Action Detection with Relative Boundary Modeling
- Authors: Dingfeng Shi, Yujie Zhong, Qiong Cao, Lin Ma, Jia Li, Dacheng Tao
- Abstract summary: Existing methods often suffer from imprecise boundary predictions due to ambiguous action boundaries in videos.
We propose a novel Trident-head to model the action boundary via an estimated relative probability distribution around the boundary.
TriDet achieves state-of-the-art performance on three challenging benchmarks.
- Score: 85.49834276225484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a one-stage framework TriDet for temporal action
detection. Existing methods often suffer from imprecise boundary predictions
due to the ambiguous action boundaries in videos. To alleviate this problem, we
propose a novel Trident-head to model the action boundary via an estimated
relative probability distribution around the boundary. In the feature pyramid
of TriDet, we propose an efficient Scalable-Granularity Perception (SGP) layer
to mitigate the rank loss problem of self-attention that takes place in the
video features and aggregate information across different temporal
granularities. Benefiting from the Trident-head and the SGP-based feature
pyramid, TriDet achieves state-of-the-art performance on three challenging
benchmarks: THUMOS14, HACS and EPIC-KITCHENS 100, with lower computational
costs, compared to previous methods. For example, TriDet hits an average mAP of
$69.3\%$ on THUMOS14, outperforming the previous best by $2.5\%$, but with only
$74.6\%$ of its latency. The code is released at
https://github.com/sssste/TriDet.
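The abstract says the Trident-head models a boundary through an estimated relative probability distribution around it. One plausible reading, sketched below, is that each instant scores a small set of neighbouring bins, the scores are normalised with a softmax, and the boundary offset is taken as the expected bin index. This is a minimal illustration and not the authors' released implementation: the function names, the bin layout, and the input logits are all hypothetical, and any additional branches of the real head are omitted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_boundary_offset(bin_logits):
    """Expected distance (in bins) from an instant to a boundary.

    bin_logits[b] is a hypothetical score that the boundary lies
    b bins away from the current instant (assumption, not from the paper).
    """
    probs = softmax(bin_logits)
    return sum(b * p for b, p in enumerate(probs))


def decode_segment(t, start_logits, end_logits):
    """Turn per-instant bin scores into a (start, end) pair around instant t."""
    start = t - expected_boundary_offset(start_logits)
    end = t + expected_boundary_offset(end_logits)
    return start, end


if __name__ == "__main__":
    # Made-up scores: the start is most likely 2 bins back, the end 1 bin ahead.
    print(decode_segment(10, [0.1, 0.5, 2.0, 0.3], [0.2, 1.8, 0.4]))
```

Taking an expectation over bins, rather than regressing a single scalar, lets an ambiguous boundary spread its probability mass over several neighbouring positions, which matches the motivation stated in the abstract.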
Related papers
- Tangential Randomization in Linear Bandits (TRAiL): Guaranteed Inference and Regret Bounds [1.03590082373586]
We propose and analyze TRAiL, a regret-optimal forced exploration algorithm for linear bandits.
TRAiL ensures an $\Omega(\sqrt{T})$ growth in the inference quality, measured via the minimum eigenvalue of the design (regressor) matrix.
We characterize an $\Omega(\sqrt{T})$ minimax lower bound for any algorithm on the expected regret.
arXiv Detail & Related papers (2024-11-19T01:08:13Z)
- Temporal Action Localization with Enhanced Instant Discriminability [66.76095239972094]
Temporal action detection (TAD) aims to detect all action boundaries and their corresponding categories in an untrimmed video.
We propose a one-stage framework named TriDet to resolve imprecise predictions of action boundaries by existing methods.
Experimental results demonstrate the robustness of TriDet and its state-of-the-art performance on multiple TAD datasets.
arXiv Detail & Related papers (2023-09-11T16:17:50Z)
- Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models [96.76758318732308]
We show that the recently proposed Deep Equilibrium Model (DEQ) can be naturally adapted to this form of computation.
Our Landmark DEQ (LDEQ) achieves state-of-the-art performance on the WFLW facial landmark dataset.
arXiv Detail & Related papers (2023-04-02T19:08:02Z)
- Revisiting Weighted Strategy for Non-stationary Parametric Bandits [82.1942459195896]
This paper revisits the weighted strategy for non-stationary parametric bandits.
We propose a refined analysis framework, which produces a simpler weight-based algorithm.
Our new framework can be used to improve regret bounds of other parametric bandits.
arXiv Detail & Related papers (2023-03-05T15:11:14Z)
- Post-Processing Temporal Action Detection [134.26292288193298]
Temporal Action Detection (TAD) methods typically take a pre-processing step in converting an input varying-length video into a fixed-length snippet representation sequence.
This pre-processing step would temporally downsample the video, reducing the inference resolution and hampering the detection performance in the original temporal resolution.
We introduce a novel model-agnostic post-processing method without model redesign and retraining.
arXiv Detail & Related papers (2022-11-27T19:50:37Z)
- A Coarse-to-Fine Instance Segmentation Network with Learning Boundary Representation [10.967299485260163]
Boundary-based instance segmentation has drawn much attention because of its attractive efficiency.
Existing methods suffer from the difficulty in long-distance regression.
We propose a coarse-to-fine module to address the problem.
arXiv Detail & Related papers (2021-06-18T16:37:28Z)
- An Efficient Algorithm For Generalized Linear Bandit: Online Stochastic Gradient Descent and Thompson Sampling [83.48992319018147]
We consider the contextual bandit problem, where a player sequentially makes decisions based on past observations to maximize the cumulative reward.
A natural way to resolve this problem is to apply online stochastic gradient descent (SGD) so that the per-step time and memory complexity can be reduced to constant.
In this work, we show that online SGD can be applied to the generalized linear bandit problem.
The proposed SGD-TS algorithm, which uses a single-step SGD update to exploit past information, achieves $\tilde{O}(\sqrt{T})$ regret with the total time complexity that
arXiv Detail & Related papers (2020-06-07T01:12:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.