Top-1 Solution of Multi-Moments in Time Challenge 2019
- URL: http://arxiv.org/abs/2003.05837v2
- Date: Fri, 13 Mar 2020 11:53:24 GMT
- Title: Top-1 Solution of Multi-Moments in Time Challenge 2019
- Authors: Manyuan Zhang, Hao Shao, Guanglu Song, Yu Liu, Junjie Yan
- Abstract summary: We conduct several experiments with the popular image-based action recognition methods TRN, TSN, and TSM.
A novel temporal interlacing network is proposed for fast and accurate recognition.
We ensemble all of the above models and achieve 67.22% on the validation set and 60.77% on the test set, ranking 1st on the final leaderboard.
- Score: 56.15819266653481
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this technical report, we briefly introduce the solutions of our team
'Efficient' for the Multi-Moments in Time challenge at ICCV 2019. We first
conduct several experiments with the popular image-based action recognition methods
TRN, TSN, and TSM. Then a novel temporal interlacing network is proposed
for fast and accurate recognition. Beyond these, the SlowFast network and its
variants are explored. Finally, we ensemble all of the above models and achieve
67.22% on the validation set and 60.77% on the test set, which ranks 1st on
the final leaderboard. In addition, we release a new code repository for video
understanding that unifies state-of-the-art 2D and 3D methods based on
PyTorch. The solution for the challenge is also included in the repository,
which is available at https://github.com/Sense-X/X-Temporal.
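As a concrete illustration of the 2D methods involved: the temporal shift operation at the core of TSM (which the temporal interlacing network extends with learned, interleaved shifts) and score-level ensembling can be sketched in PyTorch. The fold ratio and the plain averaging below are illustrative assumptions, not the exact X-Temporal configuration.

```python
import torch

def temporal_shift(x: torch.Tensor, n_segments: int, fold_div: int = 8) -> torch.Tensor:
    """TSM-style shift: move a fraction of channels one step along time.

    x is a batch of frame features shaped (N * T, C, H, W), with
    T = n_segments frames per video. fold_div = 8 (shift 1/8 of the
    channels backward and 1/8 forward) is the value commonly used in
    TSM and is assumed here for illustration.
    """
    nt, c, h, w = x.shape
    n = nt // n_segments
    x = x.view(n, n_segments, c, h, w)
    fold = c // fold_div

    out = torch.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]                  # shift back in time
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]  # shift forward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # rest unchanged
    return out.view(nt, c, h, w)

def ensemble_logits(per_model_logits: list[torch.Tensor]) -> torch.Tensor:
    # Score-level fusion by plain averaging; the report does not state
    # the fusion weights, so equal weights are an assumption.
    return torch.stack(per_model_logits).mean(dim=0)
```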
Related papers
- Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning [52.25026952905702]
We introduce a method called the Expansion mechanism, which processes the input unconstrained by the number of elements in the sequence.
By doing so, the model can learn more effectively than with traditional attention-based approaches.
arXiv Detail & Related papers (2022-08-13T02:50:35Z)
- Multi-Modal and Multi-Factor Branching Time Active Inference [2.513785998932353]
Two versions of branching time active inference (BTAI) based on Monte-Carlo tree search have been developed.
However, these two versions of BTAI still suffer from exponential complexity w.r.t. the number of observed and latent variables being modelled.
In this paper, we resolve this limitation by allowing the modelling of several observations, each of them having its own likelihood mapping.
The inference algorithm then exploits the factorisation of the likelihood and transition mappings to accelerate the computation of the posterior.
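A minimal discrete-state sketch of what such a factorised likelihood buys: with one likelihood mapping per observation modality, the posterior update becomes a product of small per-modality terms rather than one joint likelihood table. All names and sizes here are illustrative.

```python
import numpy as np

def factored_posterior(prior, likelihoods, observations):
    """Posterior over a discrete latent state with factorised likelihoods.

    prior:        (S,) prior over S states.
    likelihoods:  list of (O_m, S) matrices, one per observation modality,
                  i.e. each modality has its own likelihood mapping.
    observations: list of observed indices o_m, one per modality.
    """
    post = prior.copy()
    for A, o in zip(likelihoods, observations):
        post *= A[o]            # evidence from modality m: P(o_m | s)
    return post / post.sum()    # normalise

# Example: 3 states, two modalities with 2 and 4 possible observations.
prior = np.full(3, 1 / 3)
A1 = np.array([[0.9, 0.2, 0.1], [0.1, 0.8, 0.9]])
A2 = np.random.dirichlet(np.ones(4), size=3).T  # (4, 3), columns sum to 1
print(factored_posterior(prior, [A1, A2], [0, 2]))
```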
arXiv Detail & Related papers (2022-06-24T22:07:21Z)
- HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction [10.02342218798102]
We introduce our solution to the Occupancy and Flow Prediction challenge in the Open Challenges at CVPR 2022.
We have developed a novel hierarchical spatial-temporal network featured with spatial-temporal encoders, a multi-scale aggregator enriched with latent variables, and a hierarchical 3D decoder.
Our method achieves a Flow-Grounded Occupancy AUC of 0.8389 and outperforms all the other teams on the leaderboard.
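A structural sketch of such an encoder / aggregator / decoder pipeline is given below; every layer choice and shape is an illustrative assumption, not the published HOPE architecture.

```python
import torch
import torch.nn as nn

class EncoderAggregatorDecoderSketch(nn.Module):
    """Illustrative skeleton: multi-scale encoders, an aggregator
    enriched with a latent variable, and a decoder producing
    occupancy/flow-style maps. All shapes are invented."""
    def __init__(self, in_ch=32, dims=(64, 128, 256), latent=16, out_ch=2):
        super().__init__()
        self.encoders = nn.ModuleList()
        ch = in_ch
        for d in dims:  # spatial(-temporal) encoders at several scales
            self.encoders.append(nn.Sequential(
                nn.Conv2d(ch, d, 3, stride=2, padding=1), nn.ReLU()))
            ch = d
        self.to_latent = nn.Conv2d(dims[-1], latent, 1)
        self.aggregate = nn.Conv2d(dims[-1] + latent, dims[-1], 1)
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=8, mode='bilinear', align_corners=False),
            nn.Conv2d(dims[-1], out_ch, 1))

    def forward(self, x):
        for enc in self.encoders:
            x = enc(x)
        z = self.to_latent(x)                     # latent enrichment
        x = self.aggregate(torch.cat([x, z], 1))  # multi-scale aggregation
        return self.decoder(x)                    # map back to full resolution
```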
arXiv Detail & Related papers (2022-06-21T05:25:58Z)
- Two-Stream Consensus Network: Submission to HACS Challenge 2021 Weakly-Supervised Learning Track [78.64815984927425]
The goal of weakly-supervised temporal action localization is to temporally locate and classify actions of interest in untrimmed videos.
We adopt the two-stream consensus network (TSCN) as the main framework in this challenge.
Our solution ranked 2nd in this challenge, and we hope our method can serve as a baseline for future academic research.
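The consensus idea can be sketched as a late fusion of the per-stream temporal class activation sequences, whose fused scores serve as pseudo ground truth for refining each stream; the plain average and threshold below are assumptions, not TSCN's exact scheme.

```python
import torch

def two_stream_consensus(rgb_cas: torch.Tensor,
                         flow_cas: torch.Tensor,
                         threshold: float = 0.5):
    """Fuse per-stream temporal class activation sequences of shape (T, C).

    Returns the frame-level consensus scores and binary pseudo-labels
    that can supervise each individual stream.
    """
    fused = 0.5 * (rgb_cas + flow_cas)           # frame-level consensus
    pseudo_labels = (fused > threshold).float()  # pseudo ground truth
    return fused, pseudo_labels
```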
arXiv Detail & Related papers (2021-06-21T03:36:36Z)
- Weakly-Supervised Temporal Action Localization Through Local-Global Background Modeling [30.104982661371164]
We present our 2021 HACS Challenge - Weakly-supervised Learning Track solution, which builds on BaSNet to address the above problem.
Specifically, we first adopt pre-trained CSN, SlowFast, TDN, and ViViT models as feature extractors to obtain feature sequences.
Then our proposed Local-Global Background Modeling Network (LGBM-Net) is trained to localize instances by using only video-level labels.
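Training from video-level labels alone typically reduces to pooling a temporal class activation sequence into one score per class; a common top-k mean-pooling baseline is sketched below (whether LGBM-Net pools exactly this way is an assumption).

```python
import torch

def video_level_scores(cas: torch.Tensor, k_ratio: float = 0.125) -> torch.Tensor:
    """Aggregate a temporal class activation sequence (T, C) into one
    video-level logit per class via top-k mean pooling, so the model can
    be trained with video-level labels only (e.g. BCEWithLogitsLoss).
    The k_ratio value is an illustrative assumption.
    """
    t = cas.shape[0]
    k = max(1, int(t * k_ratio))
    topk, _ = cas.topk(k, dim=0)  # (k, C) highest activations per class
    return topk.mean(dim=0)       # (C,) video-level logits
```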
arXiv Detail & Related papers (2021-06-20T02:58:45Z)
- Anchor-Free Person Search [127.88668724345195]
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images.
Most existing works employ two-stage detectors like Faster R-CNN, which yield encouraging accuracy but incur high computational overhead.
We present the Feature-Aligned Person Search Network (AlignPS), the first anchor-free framework to efficiently tackle this challenging task.
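In an anchor-free design, every feature-map location directly predicts a detection plus, for person search, an identity embedding. The sketch below shows that head structure with invented channel sizes; it is not the actual AlignPS head.

```python
import torch
import torch.nn as nn

class AnchorFreeSearchHead(nn.Module):
    """Per-pixel detection plus a re-id embedding, in the spirit of an
    anchor-free person search head. Channel sizes and the single-level
    head are simplifying assumptions."""
    def __init__(self, in_ch=256, emb_dim=256):
        super().__init__()
        self.cls = nn.Conv2d(in_ch, 1, 3, padding=1)          # person vs background
        self.box = nn.Conv2d(in_ch, 4, 3, padding=1)          # l, t, r, b distances
        self.reid = nn.Conv2d(in_ch, emb_dim, 3, padding=1)   # identity embedding

    def forward(self, feat):
        emb = nn.functional.normalize(self.reid(feat), dim=1)
        return self.cls(feat), self.box(feat), emb
```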
arXiv Detail & Related papers (2021-03-22T07:04:29Z)
- Recurrent Multi-view Alignment Network for Unsupervised Surface Registration [79.72086524370819]
Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data.
We propose to represent the non-rigid transformation with a point-wise combination of several rigid transformations.
We also introduce a differentiable loss function that measures the 3D shape similarity on the projected multi-view 2D depth images.
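The point-wise combination of rigid transformations can be written compactly: each point is moved by every rigid transform and the results are blended with per-point weights, giving a smooth non-rigid deformation from a small set of rigid ones. A sketch, with all tensor names assumed:

```python
import torch

def blend_rigid_transforms(points, rotations, translations, weights):
    """Non-rigid warp as a point-wise combination of rigid transforms.

    points:       (N, 3) source points.
    rotations:    (K, 3, 3) rotation matrices.
    translations: (K, 3) translation vectors.
    weights:      (N, K) per-point blend weights (rows sum to 1).
    """
    # (K, N, 3): apply every rigid transform to every point.
    moved = torch.einsum('kij,nj->kni', rotations, points) + translations[:, None, :]
    # (N, 3): per-point weighted combination of the K results.
    return torch.einsum('nk,kni->ni', weights, moved)
```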
arXiv Detail & Related papers (2020-11-24T14:22:42Z)
- Challenge report: VIPriors Action Recognition Challenge [14.080142383692417]
Action recognition has attracted much research attention because of its wide range of applications, but it remains challenging.
In this paper, we study previous methods and propose our own.
We use a fast but effective way to extract motion features from videos by using residual frames as input.
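Residual frames are simply differences between consecutive frames, a cheap stand-in for optical flow that emphasises motion; a minimal sketch follows (boundary handling and normalisation are assumptions, not the challenge entry's exact preprocessing).

```python
import torch

def residual_frames(clip: torch.Tensor) -> torch.Tensor:
    """Turn a clip of shape (T, C, H, W) into residual frames: the
    difference between consecutive frames. One frame is lost at the
    temporal boundary.
    """
    return clip[1:] - clip[:-1]  # (T - 1, C, H, W)
```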
arXiv Detail & Related papers (2020-07-16T08:40:31Z)
- DeepMark++: Real-time Clothing Detection at the Edge [55.41644538483948]
We propose a single-stage approach to deliver rapid clothing detection and keypoint estimation.
Our solution is based on the multi-target network CenterNet, and we introduce several powerful post-processing techniques to enhance performance.
Our most accurate model achieves results comparable to state-of-the-art solutions on the DeepFashion2 dataset.
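For context, CenterNet-style post-processing decodes detections by treating a 3x3 max-pool as non-maximum suppression over the predicted heatmaps; the sketch below shows that standard decode step only, not the additional DeepMark++ techniques.

```python
import torch
import torch.nn.functional as F

def decode_heatmap(heatmap: torch.Tensor, k: int = 100):
    """CenterNet-style peak extraction from a (C, H, W) heatmap of
    sigmoid scores: keep local maxima via max-pool NMS, then take the
    top-k peaks. k = 100 is an illustrative default.
    """
    c, h, w = heatmap.shape
    pooled = F.max_pool2d(heatmap[None], 3, stride=1, padding=1)[0]
    peaks = heatmap * (pooled == heatmap)   # keep local maxima only
    scores, idx = peaks.view(-1).topk(k)
    classes = idx // (h * w)
    ys, xs = (idx % (h * w)) // w, (idx % (h * w)) % w
    return scores, classes, ys, xs
```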
arXiv Detail & Related papers (2020-06-01T04:36:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences arising from its use.