FBK-HUPBA Submission to the EPIC-Kitchens Action Recognition 2020
Challenge
- URL: http://arxiv.org/abs/2006.13725v1
- Date: Wed, 24 Jun 2020 13:41:17 GMT
- Title: FBK-HUPBA Submission to the EPIC-Kitchens Action Recognition 2020
Challenge
- Authors: Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
- Abstract summary: We describe the technical details of our submission to the EPIC-Kitchens Action Recognition 2020 Challenge.
Our submission achieved a top-1 action recognition accuracy of 40.0% on the S1 setting and 25.71% on the S2 setting, using only RGB.
- Score: 43.8525418821458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this report we describe the technical details of our submission to the
EPIC-Kitchens Action Recognition 2020 Challenge. To participate in the
challenge we deployed spatio-temporal feature extraction and aggregation models
we have developed recently: Gate-Shift Module (GSM) [1] and EgoACO, an
extension of Long Short-Term Attention (LSTA) [2]. We design an ensemble of GSM
and EgoACO model families with different backbones and pre-training to generate
the prediction scores. Our submission, visible on the public leaderboard with
team name FBK-HUPBA, achieved a top-1 action recognition accuracy of 40.0% on
S1 setting, and 25.71% on S2 setting, using only RGB.
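As a rough illustration of the ensembling step described in the abstract, the sketch below averages per-clip verb and noun prediction scores over several models and derives a top-1 action. This is a minimal sketch in Python under stated assumptions: the uniform weighting, the dict layout, the class counts (97 verbs, 300 nouns), and the independent verb/noun argmax are illustrative choices, not details taken from the report.

    import numpy as np

    def ensemble_scores(model_outputs):
        """Average verb/noun class scores over an ensemble of models.

        model_outputs: list of dicts (one per model) mapping clip_id ->
            {"verb": np.ndarray of shape (97,), "noun": np.ndarray of shape (300,)}
        Uniform averaging is an assumption; the report does not specify weights.
        """
        fused = {}
        for clip_id in model_outputs[0]:
            fused[clip_id] = {
                "verb": np.mean([m[clip_id]["verb"] for m in model_outputs], axis=0),
                "noun": np.mean([m[clip_id]["noun"] for m in model_outputs], axis=0),
            }
        return fused

    def top1_action(clip_scores):
        """Return the highest-scoring (verb, noun) pair for one clip.

        Taking independent argmaxes over verbs and nouns is one common
        convention; the exact action-scoring rule of the submission may differ.
        """
        return int(np.argmax(clip_scores["verb"])), int(np.argmax(clip_scores["noun"]))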
Related papers
- Predicting the Next Action by Modeling the Abstract Goal [18.873728614415946]
We present an action anticipation model that leverages goal information for the purpose of reducing the uncertainty in future predictions.
We derive a novel concept called abstract goal which is conditioned on observed sequences of visual features for action anticipation.
Our method obtains impressive results on the very challenging Epic-Kitchens55 (EK55), EK100, and EGTEA Gaze+ datasets.
arXiv Detail & Related papers (2022-09-12T06:52:42Z)
- NVIDIA-UNIBZ Submission for EPIC-KITCHENS-100 Action Anticipation Challenge 2022 [13.603712913129506]
We describe the technical details of our submission for the EPIC-KITCHENS-100 action anticipation challenge.
Our models, the higher-order recurrent space-time transformer and the message-passing neural network with edge learning, are both recurrent-based architectures that observe only 2.5 seconds of inference context to form the action anticipation prediction.
By averaging the prediction scores from a set of models compiled with our proposed training pipeline, we achieved strong performance on the test set: 19.61% overall mean top-5 recall (see the metric sketch after this list), recorded as second place on the public leaderboard.
arXiv Detail & Related papers (2022-06-22T06:34:58Z)
- End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation [86.41437210485932]
We aim at advancing zero-shot HOI detection to detect both seen and unseen HOIs simultaneously.
We propose a novel end-to-end zero-shot HOI Detection framework via vision-language knowledge distillation.
Our method outperforms the previous SOTA by 8.92% on unseen mAP and 10.18% on overall mAP.
arXiv Detail & Related papers (2022-04-01T07:27:19Z)
- SAIC_Cambridge-HuPBA-FBK Submission to the EPIC-Kitchens-100 Action Recognition Challenge 2021 [80.05652375838073]
This report presents the technical details of our submission to the EPIC-Kitchens-100 Action Recognition Challenge 2021.
Our submission, visible on the public leaderboard, achieved a top-1 action recognition accuracy of 44.82%, using only RGB.
arXiv Detail & Related papers (2021-10-06T16:29:47Z)
- TransAction: ICL-SJTU Submission to EPIC-Kitchens Action Anticipation Challenge 2021 [42.35018041385645]
We developed a hierarchical attention model for action anticipation.
In terms of Mean Top-5 Recall of action, our submission with team name ICL-SJTU achieved 13.39%.
It is noteworthy that our submission ranked 1st in terms of verb class in all three (sub)sets.
arXiv Detail & Related papers (2021-07-28T10:42:47Z)
- Two-Stream Consensus Network: Submission to HACS Challenge 2021 Weakly-Supervised Learning Track [78.64815984927425]
The goal of weakly-supervised temporal action localization is to temporally locate and classify action of interest in untrimmed videos.
We adopt the two-stream consensus network (TSCN) as the main framework in this challenge.
Our solution ranked 2nd in this challenge, and we hope our method can serve as a baseline for future academic research.
arXiv Detail & Related papers (2021-06-21T03:36:36Z)
- Joint COCO and Mapillary Workshop at ICCV 2019: COCO Instance Segmentation Challenge Track [87.90450014797287]
MegDetV2 works in a two-pass fashion, first to detect instances then to obtain segmentation.
On the COCO-2019 detection/instance-segmentation test-dev dataset, our system achieves 61.0/53.1 mAP, which surpassed our 2018 winning results by 5.0/4.2 respectively.
arXiv Detail & Related papers (2020-10-06T04:49:37Z)
- Rescaling Egocentric Vision [48.57283024015145]
This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-KITCHENS.
The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M frames, 90K actions in 700 variable-length videos.
Compared to its previous version, EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete annotations of fine-grained actions (+128% more action segments).
arXiv Detail & Related papers (2020-06-23T18:28:04Z)
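For reference, the class-mean top-5 recall quoted by the anticipation entries above can be sketched as follows in Python. This is an illustrative approximation only: the official EPIC-KITCHENS evaluation adds details (such as many-shot class filtering and unseen/tail splits) that are omitted here.

    import numpy as np

    def mean_top5_recall(scores, labels):
        """Class-mean top-5 recall (sketch).

        scores: (N, C) array of prediction scores; labels: (N,) integer labels.
        For each class occurring in `labels`, compute the fraction of its samples
        whose true class is among the 5 highest-scoring classes, then average
        the per-class recalls.
        """
        top5 = np.argsort(-scores, axis=1)[:, :5]      # 5 best classes per sample
        hits = (top5 == labels[:, None]).any(axis=1)   # per-sample top-5 hit
        per_class = [hits[labels == c].mean() for c in np.unique(labels)]
        return float(np.mean(per_class))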