Technical Report for Argoverse Challenges on 4D Occupancy Forecasting
- URL: http://arxiv.org/abs/2311.15660v1
- Date: Mon, 27 Nov 2023 09:40:53 GMT
- Title: Technical Report for Argoverse Challenges on 4D Occupancy Forecasting
- Authors: Pengfei Zheng, Kanokphan Lertniphonphan, Feng Chen, Siwei Chen,
Bingchuan Sun, Jun Xie, Zhepeng Wang
- Abstract summary: Our solution consists of a strong LiDAR-based Bird's Eye View (BEV) encoder with temporal fusion and a two-stage decoder.
The solution was tested on the Argoverse 2 sensor dataset to evaluate the occupancy state 3 seconds in the future.
Our solution achieved 18% lower L1 Error (3.57) than the baseline and took 1st place in the 4D Occupancy Forecasting task in Argoverse Challenges at CVPR 2023.
- Score: 32.43324720856606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This report presents our Le3DE2E_Occ solution for 4D Occupancy Forecasting in
Argoverse Challenges at CVPR 2023 Workshop on Autonomous Driving (WAD). Our
solution consists of a strong LiDAR-based Bird's Eye View (BEV) encoder with
temporal fusion and a two-stage decoder, which combines a DETR head and a UNet
decoder. The solution was tested on the Argoverse 2 sensor dataset to evaluate
the occupancy state 3 seconds in the future. Our solution achieved 18% lower L1
Error (3.57) than the baseline and took 1st place in the 4D Occupancy
Forecasting task in Argoverse Challenges at CVPR 2023.
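The abstract reports results as an L1 Error and a relative improvement over a baseline. As a minimal sketch of how such numbers can be computed, assume the L1 error is the mean absolute difference between predicted and reference depths along a set of rays. The function names `l1_depth_error` and `relative_reduction` and all depth values below are hypothetical illustrations, not taken from the report or from the challenge toolkit.

```python
def l1_depth_error(pred_depths, true_depths):
    """Mean absolute difference between predicted and reference depths (metres)."""
    if len(pred_depths) != len(true_depths):
        raise ValueError("depth lists must have the same length")
    if not pred_depths:
        raise ValueError("need at least one ray")
    total = sum(abs(p - t) for p, t in zip(pred_depths, true_depths))
    return total / len(pred_depths)


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Made-up depths for four rays, for illustration only.
pred = [10.0, 22.5, 7.0, 41.0]
true = [12.0, 20.5, 7.5, 40.0]
error = l1_depth_error(pred, true)  # (2.0 + 2.0 + 0.5 + 1.0) / 4 = 1.375
```

A statement such as "18% lower L1 Error" then corresponds to `relative_reduction(baseline_error, new_error)` being about 0.18 for the two errors being compared.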
Related papers
- Technical Report for CVPR 2024 WeatherProof Dataset Challenge: Semantic Segmentation on Paired Real Data [9.128113804878959]
This challenge aims at semantic segmentation of images degraded by various degrees of weather from all around the world.
We introduced a pre-trained large-scale vision foundation model: InternImage, and trained it using images with different levels of noise.
As a result, we achieved 2nd place in the challenge with 45.1 mIOU and fewer submissions than the other winners.
arXiv Detail & Related papers (2024-06-09T17:08:07Z)
- A Two-Stage Adverse Weather Semantic Segmentation Method for WeatherProof Challenge CVPR 2024 Workshop UG2+ [10.069192320623031]
We propose a two-stage deep learning framework for the WeatherProof dataset challenge.
In the challenge, our solution achieved a competitive score of 0.43 on the Mean Intersection over Union (mIoU) metric, securing a respectable rank of 4th.
arXiv Detail & Related papers (2024-06-08T16:22:26Z)
- The Third Monocular Depth Estimation Challenge [134.16634233789776]
This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC).
The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings.
The challenge winners drastically improved 3D F-Score performance, from 17.51% to 23.72%.
arXiv Detail & Related papers (2024-04-25T17:59:59Z)
- NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results [126.78130602974319]
This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4).
The challenge involves generating corresponding high-resolution (HR) images, magnified by a factor of four, from low-resolution (LR) inputs.
The aim of the challenge is to obtain designs/solutions with the most advanced SR performance.
arXiv Detail & Related papers (2024-04-15T13:45:48Z)
- Technical Report for Argoverse Challenges on Unified Sensor-based Detection, Tracking, and Forecasting [14.44580354496143]
We propose a unified network that incorporates three tasks, including detection, tracking, and forecasting.
This solution adopts a strong Bird's Eye View (BEV) encoder with spatial and temporal fusion and generates unified representations for multi-tasks.
We achieved 1st place in Detection, Tracking, and Forecasting on the E2E Forecasting track in Argoverse Challenges at CVPR 2023 WAD.
arXiv Detail & Related papers (2023-11-27T08:25:23Z)
- FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation [79.41536932037822]
Proposal builds upon FB-BEV, a cutting-edge camera-based bird's-eye view perception design using forward-backward projection.
Designs and optimization result in a state-of-the-art mIoU score of 54.19% on the nuScenes dataset, ranking 1st in the challenge track.
arXiv Detail & Related papers (2023-07-04T05:55:54Z)
- AVATAR submission to the Ego4D AV Transcription Challenge [79.21857972093332]
Our pipeline is based on AVATAR, a state-of-the-art encoder-decoder model for AV-ASR that performs early fusion of spectrograms and RGB images.
Our final method achieves a WER of 68.40 on the challenge test set, outperforming the baseline by 43.7%, and winning the challenge.
arXiv Detail & Related papers (2022-11-18T01:03:30Z)
- Where a Strong Backbone Meets Strong Features -- ActionFormer for Ego4D Moment Queries Challenge [7.718326034763966]
Our submission builds on ActionFormer, the state-of-the-art backbone for temporal action localization, and a trio of strong video features from SlowFast, Omnivore and Ego.
Our solution is ranked 2nd on the public leaderboard with 21.76% average mAP on the test set, which is nearly three times higher than the official baseline.
arXiv Detail & Related papers (2022-11-16T17:43:26Z)
- Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge [57.647371468876116]
We introduce our real-time 2D object detection system for the realistic autonomous driving scenario.
Our detector is built on a newly designed YOLO model, called YOLOX.
On the Argoverse-HD dataset, our system achieves 41.0 streaming AP, which surpassed second place by 7.8/6.1 on detection-only track/fully track, respectively.
arXiv Detail & Related papers (2021-07-27T06:36:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.