NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to
the Ego4D Moment Queries Challenge 2023
- URL: http://arxiv.org/abs/2307.02025v1
- Date: Wed, 5 Jul 2023 05:23:49 GMT
- Title: NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to
the Ego4D Moment Queries Challenge 2023
- Authors: Lin Sui, Fangzhou Mu, Yin Li
- Abstract summary: This report describes our submission to the Ego4D Moment Queries Challenge 2023.
Our submission extends ActionFormer, the latest method for temporal action localization.
Our solution is ranked 2nd on the public leaderboard with 26.62% average mAP and 45.69% Recall@1x at tIoU=0.5 on the test set, significantly outperforming the strong baseline from the 2023 challenge.
- Score: 8.674624972031387
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This report describes our submission to the Ego4D Moment Queries Challenge
2023. Our submission extends ActionFormer, the latest method for temporal action
localization. Our extension combines an improved ground-truth assignment
strategy during training and a refined version of SoftNMS at inference time.
Our solution is ranked 2nd on the public leaderboard with 26.62% average mAP
and 45.69% Recall@1x at tIoU=0.5 on the test set, significantly outperforming
the strong baseline from the 2023 challenge. Our code is available at
https://github.com/happyharrycn/actionformer_release.
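Since the headline finding is that the (Soft)NMS configuration matters for Moment Queries performance, a minimal sketch of Gaussian SoftNMS over 1D temporal segments may help make the abstract concrete. This is a generic illustration, not the authors' refined variant; the function names and the `sigma` and `score_thresh` parameters are illustrative assumptions.

```python
import numpy as np

def temporal_iou(seg, segs):
    """tIoU between one [start, end] segment and an (N, 2) array of segments."""
    inter_start = np.maximum(seg[0], segs[:, 0])
    inter_end = np.minimum(seg[1], segs[:, 1])
    inter = np.clip(inter_end - inter_start, 0.0, None)
    union = (seg[1] - seg[0]) + (segs[:, 1] - segs[:, 0]) - inter
    return inter / np.maximum(union, 1e-8)

def soft_nms_1d(segs, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian SoftNMS for temporal action proposals.

    segs: (N, 2) array of [start, end]; scores: (N,) confidences.
    Returns the kept segments and their decayed scores.
    """
    segs = segs.astype(float).copy()
    scores = scores.astype(float).copy()
    keep_segs, keep_scores = [], []
    while len(segs) > 0:
        top = np.argmax(scores)
        seg = segs[top]
        keep_segs.append(seg)
        keep_scores.append(scores[top])
        segs = np.delete(segs, top, axis=0)
        scores = np.delete(scores, top)
        if len(segs) == 0:
            break
        # Decay scores of overlapping segments instead of discarding them outright.
        ious = temporal_iou(seg, segs)
        scores = scores * np.exp(-(ious ** 2) / sigma)
        # Prune segments whose score fell below the (assumed) threshold.
        mask = scores > score_thresh
        segs, scores = segs[mask], scores[mask]
    return np.array(keep_segs), np.array(keep_scores)
```

In this sketch, `score_thresh` and `sigma` play the role of the NMS threshold highlighted in the title: looser settings keep more candidates after suppression, which plausibly trades off metrics such as average mAP against Recall@1x, though the paper's specific refinement may differ.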
Related papers
- 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation [81.50620771207329]
We investigate the effectiveness of static-dominant data and frame sampling on referring video object segmentation (RVOS).
Our solution achieves a J&F score of 0.5447 in the competition phase and ranks 1st in the MeViS track of the PVUW Challenge.
arXiv Detail & Related papers (2024-06-11T08:05:26Z)
- NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results [126.78130602974319]
This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4).
The challenge involves generating corresponding high-resolution (HR) images, magnified by a factor of four, from low-resolution (LR) inputs.
The aim of the challenge is to obtain designs/solutions with the most advanced SR performance.
arXiv Detail & Related papers (2024-04-15T13:45:48Z)
- Lightweight Boosting Models for User Response Prediction Using Adversarial Validation [2.4040470282119983]
The ACM RecSys Challenge 2023, organized by ShareChat, aims to predict the probability of the app being installed.
This paper describes the lightweight solution to this challenge.
arXiv Detail & Related papers (2023-10-05T13:57:05Z)
- The 2nd Place Solution for 2023 Waymo Open Sim Agents Challenge [8.821526792549648]
We propose a simple yet effective autoregressive method for simulating multi-agent behaviors.
Our submission, named MTR+++, achieves 0.4697 on the Realism Meta metric in the 2023 Waymo Open Sim Agents Challenge (WOSAC).
In addition, a modified model based on MTR, named MTR_E, is proposed after the challenge; it achieves a better score of 0.4911 and ranks 3rd on the WOSAC leaderboard as of June 25, 2023.
arXiv Detail & Related papers (2023-06-28T04:33:12Z)
- GroundNLQ @ Ego4D Natural Language Queries Challenge 2023 [73.12670280220992]
To accurately ground natural language queries in a video, an effective egocentric feature extractor and a powerful grounding model are required.
We leverage a two-stage pre-training strategy to train egocentric feature extractors and the grounding model on video narrations.
In addition, we introduce a novel grounding model GroundNLQ, which employs a multi-modal multi-scale grounding module.
arXiv Detail & Related papers (2023-06-27T07:27:52Z)
- Action Sensitivity Learning for the Ego4D Episodic Memory Challenge 2023 [41.10032280192564]
This report presents the ReLER submission to two tracks of the Ego4D Episodic Memory Benchmark at CVPR 2023.
The solution builds on our proposed Action Sensitivity Learning (ASL) framework to better capture the differing information carried by individual frames.
arXiv Detail & Related papers (2023-06-15T14:50:17Z)
- Where a Strong Backbone Meets Strong Features -- ActionFormer for Ego4D Moment Queries Challenge [7.718326034763966]
Our submission builds on ActionFormer, the state-of-the-art backbone for temporal action localization, and a trio of strong video features from SlowFast, Omnivore, and EgoVLP.
Our solution is ranked 2nd on the public leaderboard with 21.76% average mAP on the test set, which is nearly three times higher than the official baseline.
arXiv Detail & Related papers (2022-11-16T17:43:26Z)
- A Simple Transformer-Based Model for Ego4D Natural Language Queries Challenge [8.674624972031387]
This report describes our submission to the Ego4D Natural Language Queries (NLQ) Challenge.
Our solution inherits the point-based event representation from our prior work on temporal action localization, and develops a Transformer-based model for video grounding.
Without bells and whistles, our submission based on a single model achieves 12.64% Mean R@1 and is ranked 2nd on the public leaderboard.
arXiv Detail & Related papers (2022-11-16T06:33:37Z)
- NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results [279.8098140331206]
The NTIRE 2022 challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low-resolution and corresponding high-resolution images.
The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics.
arXiv Detail & Related papers (2022-05-11T17:58:54Z)
- Top-1 Solution of Multi-Moments in Time Challenge 2019 [56.15819266653481]
We conduct several experiments with popular image-based action recognition methods: TRN, TSN, and TSM.
A novel temporal interlacing network is proposed for fast and accurate recognition.
We ensemble all the above models and achieve 67.22% on the validation set and 60.77% on the test set, which ranks 1st on the final leaderboard.
arXiv Detail & Related papers (2020-03-12T15:11:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.