Related papers: HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction

URL: http://arxiv.org/abs/2206.10118v1
Date: Tue, 21 Jun 2022 05:25:58 GMT
Title: HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction
Authors: Yihan Hu, Wenxin Shao, Bo Jiang, Jiajie Chen, Siqi Chai, Zhening Yang, Jingyu Qian, Helong Zhou, Qiang Liu
Abstract summary: We introduce our solution to the Occupancy and Flow Prediction challenge in the Waymo Open Dataset Challenges at CVPR 2022. We have developed a novel hierarchical spatial-temporal network featuring spatial-temporal encoders, a multi-scale aggregator enriched with latent variables, and a hierarchical 3D decoder. Our method achieves a Flow-Grounded Occupancy AUC of 0.8389 and outperforms all the other teams on the leaderboard.
Score: 10.02342218798102
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this report, we introduce our solution to the Occupancy and Flow Prediction challenge in the Waymo Open Dataset Challenges at CVPR 2022, which ranks 1st on the leaderboard. We have developed a novel hierarchical spatial-temporal network featured with spatial-temporal encoders, a multi-scale aggregator enriched with latent variables, and a recursive hierarchical 3D decoder. We use multiple losses including focal loss and modified flow trace loss to efficiently guide the training process. Our method achieves a Flow-Grounded Occupancy AUC of 0.8389 and outperforms all the other teams on the leaderboard.

Related papers

DELTAv2: Accelerating Dense 3D Tracking [79.63990337419514]
We propose a novel algorithm for accelerating dense long-term 3D point tracking in videos. We introduce a coarse-to-fine strategy that begins tracking with a small subset of points and progressively expands the set of tracked trajectories. The newly added trajectories are initialized using a learnable module, which is trained end-to-end alongside the tracking network.
arXiv Detail & Related papers (2025-08-02T03:15:47Z)
FLARES: Fast and Accurate LiDAR Multi-Range Semantic Segmentation [52.89847760590189]
3D scene understanding is a critical yet challenging task in autonomous driving. Recent methods leverage the range-view representation to improve processing efficiency. We re-design the workflow for range-view-based LiDAR semantic segmentation.
arXiv Detail & Related papers (2025-02-13T12:39:26Z)
ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Prediction [89.89610257714006]
Existing methods prioritize higher accuracy to cater to the demands of these tasks. We introduce a series of targeted improvements for 3D semantic occupancy prediction and flow estimation. Our purely convolutional architecture framework, named ALOcc, achieves an optimal tradeoff between speed and accuracy.
arXiv Detail & Related papers (2024-11-12T11:32:56Z)
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency. We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs). We show that our system and method can achieve a 1.45-9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
Neural Eulerian Scene Flow Fields [59.57980592109722]
EulerFlow works out-of-the-box without tuning across multiple domains. It exhibits emergent 3D point tracking behavior by solving its estimated ODE over long-time horizons. It outperforms all prior art on the Argoverse 2 2024 Scene Flow Challenge.
arXiv Detail & Related papers (2024-10-02T20:56:45Z)
A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera [0.8576354642891824]
Event-based data are commonly encountered in edge computing environments where efficiency and low latency are critical. To interface with such data and leverage their rich temporal features, we propose a causal convolutional network. We apply our model to the AIS 2024 event-based eye tracking challenge, reaching a score of 0.9916 p10 accuracy on the Kaggle private test set.
arXiv Detail & Related papers (2024-04-13T00:13:20Z)
Active search and coverage using point-cloud reinforcement learning [50.741409008225766]
This paper presents an end-to-end deep reinforcement learning solution for target search and coverage. We show that deep hierarchical feature learning works for RL and that by using farthest point sampling (FPS) we can reduce the number of points. We also show that multi-head attention for point clouds helps the agent learn faster but converges to the same outcome.
arXiv Detail & Related papers (2023-12-18T18:16:30Z)
Spatio-Temporal Contrastive Self-Supervised Learning for POI-level Crowd Flow Inference [23.8192952068949]
We present a novel Contrastive Self-learning framework for Spatio-Temporal data (CSST). Our approach begins with the construction of a spatial adjacency graph based on the Points of Interest (POIs) and their respective distances. We adopt a swapped prediction approach to anticipate the representation of the target subgraph from similar instances. Our experiments, conducted on two real-world datasets, demonstrate that the CSST pre-trained on extensive noisy data consistently outperforms models trained from scratch.
arXiv Detail & Related papers (2023-09-06T02:51:24Z)
Long-Short Temporal Co-Teaching for Weakly Supervised Video Anomaly Detection [14.721615285883423]
Weakly supervised video anomaly detection (WS-VAD) is a challenging problem that aims to learn VAD models only with video-level annotations. Our proposed method is able to better deal with anomalies with varying durations as well as subtle anomalies.
arXiv Detail & Related papers (2023-03-31T13:28:06Z)
Pyramid Correlation based Deep Hough Voting for Visual Object Tracking [16.080776515556686]
We introduce a voting-based classification-only tracking algorithm named Pyramid Correlation based Deep Hough Voting (PCDHV for short). Specifically, we construct a Pyramid Correlation module to equip the embedded feature with fine-grained local structures and global spatial contexts. The Deep Hough Voting module then takes over, integrating long-range dependencies of pixels to perceive corners.
arXiv Detail & Related papers (2021-10-15T10:37:00Z)
Hierarchical Attention Learning of Scene Flow in 3D Point Clouds [28.59260783047209]
This paper studies the problem of scene flow estimation from two consecutive 3D point clouds. A novel hierarchical neural network with double attention is proposed for learning the correlation of point features in adjacent frames. Experiments show that the proposed network outperforms state-of-the-art methods for 3D scene flow estimation.
arXiv Detail & Related papers (2020-10-12T14:56:08Z)
2nd Place Scheme on Action Recognition Track of ECCV 2020 VIPriors Challenges: An Efficient Optical Flow Stream Guided Framework [57.847010327319964]
We propose a data-efficient framework that can train the model from scratch on small datasets. Specifically, by introducing a 3D central difference convolution operation, we propose a novel C3D neural network-based two-stream framework. We show that our method can achieve a promising result even without a model pre-trained on large-scale datasets.
arXiv Detail & Related papers (2020-08-10T09:50:28Z)
Learning to Hash with Graph Neural Networks for Recommender Systems [103.82479899868191]
Graph representation learning has attracted much attention in supporting high quality candidate search at scale. Despite its effectiveness in learning embedding vectors for objects in the user-item interaction network, the computational costs to infer users' preferences in continuous embedding space are tremendous. We propose a simple yet effective discrete representation learning framework to jointly learn continuous and discrete codes.
arXiv Detail & Related papers (2020-03-04T06:59:56Z)
