1st place solution for AVA-Kinetics Crossover in ActivityNet Challenge 2020
- URL: http://arxiv.org/abs/2006.09116v1
- Date: Tue, 16 Jun 2020 12:52:59 GMT
- Title: 1st place solution for AVA-Kinetics Crossover in ActivityNet Challenge 2020
- Authors: Siyu Chen, Junting Pan, Guanglu Song, Manyuan Zhang, Hao Shao, Ziyi
Lin, Jing Shao, Hongsheng Li, Yu Liu
- Abstract summary: This report introduces our winning solution to the spatio-temporal action localization track, AVA-Kinetics Crossover, in ActivityNet Challenge 2020.
We describe technical details for the new AVA-Kinetics dataset, together with some experimental results.
Without any bells and whistles, we achieved 39.62 mAP on the test set of AVA-Kinetics, which outperforms other entries by a large margin.
- Score: 43.81722332148899
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This technical report introduces our winning solution to the spatio-temporal
action localization track, AVA-Kinetics Crossover, in ActivityNet Challenge
2020. Our entry is mainly based on Actor-Context-Actor Relation Network. We
describe technical details for the new AVA-Kinetics dataset, together with some
experimental results. Without any bells and whistles, we achieved 39.62 mAP on
the test set of AVA-Kinetics, which outperforms other entries by a large
margin. Code will be available at: https://github.com/Siyu-C/ACAR-Net.
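The mAP figure reported above is the standard frame-level mean average precision used on AVA-style benchmarks: an AP per action class, averaged over classes. A minimal sketch of that computation, assuming the usual "sum of precision at each positive" AP estimator (function names are illustrative, not from the ACAR-Net codebase):

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one action class: rank detections by score, then average
    the precision measured at each true-positive position."""
    order = np.argsort(-np.asarray(scores))
    labels = np.asarray(labels)[order]
    n_pos = labels.sum()
    if n_pos == 0:
        return 0.0
    tp = np.cumsum(labels)
    precision = tp / np.arange(1, len(labels) + 1)
    return float((precision * labels).sum() / n_pos)

def mean_ap(per_class):
    """mAP: unweighted mean of per-class APs."""
    return sum(average_precision(s, l) for s, l in per_class) / len(per_class)
```

In the actual benchmark, a detection counts as a positive only if its box overlaps a ground-truth box above an IoU threshold; that matching step is omitted here for brevity.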
Related papers
- 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation [81.50620771207329]
We investigate the effectiveness of static-dominant data and frame sampling on referring video object segmentation (RVOS).
Our solution achieves a J&F score of 0.5447 in the competition phase and ranks 1st in the MeViS track of the PVUW Challenge.
arXiv Detail & Related papers (2024-06-11T08:05:26Z)
- 1st Place Solution of The Robust Vision Challenge (RVC) 2022 Semantic Segmentation Track [67.56316745239629]
This report describes the winning solution to the semantic segmentation task of the Robust Vision Challenge on ECCV 2022.
Our method adopts the FAN-B-Hybrid model as the encoder and uses Segformer as the segmentation framework.
The proposed method could serve as a strong baseline for the multi-domain segmentation task and benefit future works.
arXiv Detail & Related papers (2022-10-23T20:52:22Z)
- UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022 [69.67841335302576]
This report presents a brief description of our winning solution to the AVA Active Speaker Detection (ASD) task at ActivityNet Challenge 2022.
Our underlying model UniCon+ continues to build on our previous work, the Unified Context Network (UniCon) and Extended UniCon.
We augment the architecture with a simple GRU-based module that allows information of recurring identities to flow across scenes.
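The GRU-based module described here carries information about a recurring identity from one scene to the next. A hedged numpy sketch of the underlying GRU update such a module could use (weight names are illustrative and not taken from the UniCon+ code):

```python
import numpy as np

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU update for a single identity's feature stream.
    The update gate z decides how much of the previous hidden state
    (the identity's history across earlier scenes) is carried forward."""
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    z = sigmoid(Wz @ x + Uz @ h)              # update gate
    r = sigmoid(Wr @ x + Ur @ h)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
    return (1 - z) * h + z * h_tilde          # blended new state
```

Running this step over an identity's per-scene features, in temporal order, lets evidence from earlier appearances inform later predictions.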
arXiv Detail & Related papers (2022-06-22T06:11:07Z)
- "Knights": First Place Submission for VIPriors21 Action Recognition Challenge at ICCV 2021 [39.990872080183884]
This report presents "Knights" to solve the action recognition task on a small subset of Kinetics400ViPriors.
Our approach has 3 main components: state-of-the-art Temporal Contrastive self-supervised pretraining, video transformer models, and optical flow modality.
arXiv Detail & Related papers (2021-10-14T22:47:31Z)
- NTIRE 2021 Multi-modal Aerial View Object Classification Challenge [88.89190054948325]
We introduce the first Challenge on Multi-modal Aerial View Object Classification (MAVOC) in conjunction with the NTIRE 2021 workshop at CVPR.
This challenge is composed of two different tracks using EO and SAR imagery.
We discuss the top methods submitted for this competition and evaluate their results on our blind test set.
arXiv Detail & Related papers (2021-07-02T16:55:08Z)
- An Empirical Study of Vehicle Re-Identification on the AI City Challenge [19.13038665501964]
Track 2 is a vehicle re-identification (ReID) task with both real-world and synthetic data.
In this challenge we mainly focus on four points: training data, unsupervised domain-adaptive (UDA) training, post-processing, and model ensembling.
With the aforementioned techniques, our method achieves a 0.7445 mAP score, taking first place in the competition.
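A common form of model ensembling in ReID is to fuse the embedding each model produces for a vehicle image. An illustrative sketch under that assumption (not the competition code): L2-normalize each model's embedding, average, and re-normalize so cosine distances remain comparable.

```python
import numpy as np

def ensemble_embeddings(per_model_feats):
    """Fuse (num_images, dim) embeddings from several ReID models:
    normalize each model's output, average across models, re-normalize."""
    normed = [f / np.linalg.norm(f, axis=-1, keepdims=True)
              for f in per_model_feats]
    fused = np.mean(normed, axis=0)
    return fused / np.linalg.norm(fused, axis=-1, keepdims=True)
```

Normalizing before averaging keeps any one model's embedding scale from dominating the fused representation.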
arXiv Detail & Related papers (2021-05-20T12:20:52Z)
- The AVA-Kinetics Localized Human Actions Video Dataset [124.41706958756049]
This paper describes the AVA-Kinetics localized human actions video dataset.
The dataset is collected by annotating videos from the Kinetics-700 dataset using the AVA annotation protocol.
The dataset contains over 230k clips annotated with the 80 AVA action classes for each of the humans in key-frames.
arXiv Detail & Related papers (2020-05-01T04:17:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.