Spatio-Temporal Action Detection Under Large Motion
- URL: http://arxiv.org/abs/2209.02250v1
- Date: Tue, 6 Sep 2022 06:55:26 GMT
- Title: Spatio-Temporal Action Detection Under Large Motion
- Authors: Gurkirt Singh, Vasileios Choutas, Suman Saha, Fisher Yu, Luc Van Gool
- Abstract summary: We study the performance of cuboid-aware feature aggregation in action detection under large motion.
We propose to enhance actor representation under large motion by tracking actors and performing temporal feature aggregation along the respective tracks.
We find that track-aware feature aggregation consistently achieves a large improvement in action detection performance compared to the cuboid-aware baseline.
- Score: 86.3220533375967
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current methods for spatiotemporal action tube detection often extend a
bounding box proposal at a given keyframe into a 3D temporal cuboid and pool
features from nearby frames. However, such pooling fails to accumulate
meaningful spatiotemporal features if the position or shape of the actor shows
large 2D motion and variability through the frames, due to large camera motion,
large actor shape deformation, fast actor action and so on. In this work, we
aim to study the performance of cuboid-aware feature aggregation in action
detection under large motion. Further, we propose to enhance actor feature
representation under large motion by tracking actors and performing temporal
feature aggregation along the respective tracks. We define the actor motion
with intersection-over-union (IoU) between the boxes of action tubes/tracks at
various fixed time scales. An action with large motion results in lower IoU
over time, while slower actions maintain higher IoU. We find that
track-aware feature aggregation consistently achieves a large improvement in
action detection performance, especially for actions under large motion
compared to the cuboid-aware baseline. As a result, we also report
state-of-the-art results on the large-scale MultiSports dataset.
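To make the contrast concrete, here is a minimal sketch of cuboid pooling versus track-aware pooling, not the authors' implementation: it assumes (T, C, H, W) per-frame clip features, float boxes in (x1, y1, x2, y2) format, and torchvision's roi_align for box cropping; function names and the 7x7 output size are illustrative.

```python
import torch
from torchvision.ops import roi_align

def cuboid_pool(features, keyframe_box, out_size=7):
    # Cuboid baseline: reuse the keyframe box on every frame, which mixes in
    # background once the actor drifts away from that box.
    rois = [keyframe_box[None]] * features.shape[0]   # one (1, 4) box per frame
    crops = roi_align(features, rois, output_size=out_size)  # (T, C, 7, 7)
    return crops.mean(dim=0)

def track_pool(features, track_boxes, out_size=7):
    # Track-aware variant: pool each frame at the actor's tracked box instead,
    # so the aggregated feature follows the actor under large motion.
    rois = [b[None] for b in track_boxes]             # T boxes, one per frame
    crops = roi_align(features, rois, output_size=out_size)
    return crops.mean(dim=0)
```

The IoU-based motion definition also lends itself to a short sketch; the helper names and the fixed time scales below are assumptions, not values from the paper.

```python
def box_iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def track_motion(track, time_scales=(4, 8, 16)):
    """Mean IoU between boxes dt frames apart: lower IoU means larger motion."""
    return {
        dt: sum(box_iou(track[t], track[t + dt])
                for t in range(len(track) - dt)) / (len(track) - dt)
        for dt in time_scales if len(track) > dt
    }
```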
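For example, a track with boxes that overlap heavily 16 frames apart would score near 1.0 (slow action), while a sprinting actor's track might score near 0.0 at the same scale, placing it in the large-motion regime where track-aware aggregation helps most.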
Related papers
- ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking [4.250337979548885]
We propose a motion-based MOT approach with an enhanced temporal motion predictor, ETTrack.
Specifically, the motion predictor integrates a transformer model and a Temporal Convolutional Network (TCN) to capture short-term and long-term motion patterns.
We show that ETTrack achieves competitive performance compared with state-of-the-art trackers on DanceTrack and SportsMOT.
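As a rough illustration of the summary above, the following speculative sketch combines a transformer encoder (long-term patterns) with a small temporal convolutional branch (short-term patterns) to predict the next box offset from a box history; this is not ETTrack's actual architecture, and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class MotionPredictor(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Linear(4, dim)  # per-step box/offset -> feature
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.tcn = nn.Sequential(          # dilated 1D convs over time
            nn.Conv1d(dim, dim, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
            nn.Conv1d(dim, dim, kernel_size=3, padding=1),
        )
        self.head = nn.Linear(2 * dim, 4)  # predict the next box offset

    def forward(self, boxes):              # boxes: (B, T, 4) track history
        x = self.embed(boxes)               # (B, T, dim)
        long_term = self.transformer(x)[:, -1]               # (B, dim)
        short_term = self.tcn(x.transpose(1, 2))[:, :, -1]   # (B, dim)
        return self.head(torch.cat([long_term, short_term], dim=-1))
```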
arXiv Detail & Related papers (2024-05-24T17:51:33Z) - Instantaneous Perception of Moving Objects in 3D [86.38144604783207]
The perception of 3D motion of surrounding traffic participants is crucial for driving safety.
We propose to leverage local occupancy completion of object point clouds to densify the shape cue, and mitigate the impact of swimming artifacts.
Extensive experiments demonstrate superior performance compared to standard 3D motion estimation approaches.
arXiv Detail & Related papers (2024-05-05T01:07:24Z) - Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z) - Behavior Recognition Based on the Integration of Multigranular Motion
Features [17.052997301790693]
We propose a novel behavior recognition method based on the integration of multigranular (IMG) motion features.
We evaluate our model on several action recognition benchmarks such as HMDB51, Something-Something and UCF101.
arXiv Detail & Related papers (2022-03-07T02:05:26Z) - Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred
Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video.
Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z) - TSI: Temporal Saliency Integration for Video Action Recognition [32.18535820790586]
We propose a Temporal Saliency Integration (TSI) block, which mainly contains a Salient Motion Excitation (SME) module and a Cross-scale Temporal Integration (CTI) module.
SME aims to highlight the motion-sensitive area through local-global motion modeling.
CTI is designed to perform multi-scale temporal modeling through a group of separate 1D convolutions.
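A loose sketch of cross-scale temporal integration as the summary describes it: a group of separate 1D convolutions with different kernel sizes over the time axis, then fused. The depthwise grouping, channel counts, and averaging fusion are assumptions, not TSI's actual design.

```python
import torch
import torch.nn as nn

class CrossScaleTemporalIntegration(nn.Module):
    def __init__(self, channels, kernel_sizes=(1, 3, 5)):
        super().__init__()
        # One separate (depthwise) temporal conv per scale; odd kernels with
        # padding k // 2 keep the sequence length unchanged.
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        )

    def forward(self, x):            # x: (B, C, T) per-frame feature sequence
        return sum(branch(x) for branch in self.branches) / len(self.branches)
```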
arXiv Detail & Related papers (2021-06-02T11:43:49Z) - MAT: Motion-Aware Multi-Object Tracking [9.098793914779161]
In this paper, we propose Motion-Aware Tracker (MAT), focusing more on various motion patterns of different objects.
Experiments on the challenging MOT16 and MOT17 benchmarks demonstrate that our MAT approach achieves superior performance by a large margin.
arXiv Detail & Related papers (2020-09-10T11:51:33Z) - PAN: Towards Fast Action Recognition via Learning Persistence of
Appearance [60.75488333935592]
Most state-of-the-art methods heavily rely on dense optical flow as motion representation.
In this paper, we shed light on fast action recognition by lifting the reliance on optical flow.
We design a novel motion cue called Persistence of Appearance (PA).
In contrast to optical flow, our PA focuses more on distilling the motion information at boundaries.
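Purely as a loose illustration, a PA-style cue could be approximated by accumulating squared per-channel differences between adjacent frames' low-level feature maps; the paper's actual formulation may differ, and the function below is a sketch, not PAN's implementation.

```python
import torch

def pa_motion_cue(feats):
    """feats: (T, C, H, W) low-level features -> (T-1, H, W) motion maps."""
    diff = feats[1:] - feats[:-1]   # temporal difference between frames
    return (diff ** 2).sum(dim=1)   # responds strongly at moving boundaries
```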
arXiv Detail & Related papers (2020-08-08T07:09:54Z) - Motion-Attentive Transition for Zero-Shot Video Object Segmentation [99.44383412488703]
We present a Motion-Attentive Transition Network (MATNet) for zero-shot video object segmentation.
An asymmetric attention block, called Motion-Attentive Transition (MAT), is designed within a two-stream encoder.
In this way, the encoder becomes deeply interleaved, allowing for closely hierarchical interactions between object motion and appearance.
arXiv Detail & Related papers (2020-03-09T16:58:42Z)