Related papers: SelfOccFlow: Towards end-to-end self-supervised 3D Occupancy Flow prediction

URL: http://arxiv.org/abs/2602.23894v1
Date: Fri, 27 Feb 2026 10:42:01 GMT
Title: SelfOccFlow: Towards end-to-end self-supervised 3D Occupancy Flow prediction
Authors: Xavier Timoneda, Markus Herb, Fabian Duerr, Daniel Goehring
Abstract summary: Estimating 3D occupancy and motion at the vehicle's surroundings is essential for autonomous driving. Existing approaches jointly learn geometry and motion but rely on expensive 3D occupancy and flow annotations. We propose a self-supervised method for 3D occupancy flow estimation that eliminates the need for human-produced annotations or external flow supervision.
Score: 2.012425476229879
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Estimating 3D occupancy and motion at the vehicle's surroundings is essential for autonomous driving, enabling situational awareness in dynamic environments. Existing approaches jointly learn geometry and motion but rely on expensive 3D occupancy and flow annotations, velocity labels from bounding boxes, or pretrained optical flow models. We propose a self-supervised method for 3D occupancy flow estimation that eliminates the need for human-produced annotations or external flow supervision. Our method disentangles the scene into separate static and dynamic signed distance fields and learns motion implicitly through temporal aggregation. Additionally, we introduce a strong self-supervised flow cue derived from features' cosine similarities. We demonstrate the efficacy of our 3D occupancy flow method on SemanticKITTI, KITTI-MOT, and nuScenes.

Related papers

An Efficient Occupancy World Model via Decoupled Dynamic Flow and Image-assisted Training [50.71892161377806]
DFIT-OccWorld is an efficient 3D occupancy world model that leverages decoupled dynamic flow and an image-assisted training strategy. Our model forecasts future dynamic voxels by warping existing observations using voxel flow, whereas static voxels are easily obtained through pose transformation.
arXiv Detail & Related papers (2024-12-18T12:10:33Z)
Let Occ Flow: Self-Supervised 3D Occupancy Flow Prediction [14.866463843514156]
Let Occ Flow is the first self-supervised work for joint 3D occupancy and occupancy flow prediction using only camera inputs. Our approach incorporates a novel attention-based temporal fusion module to capture dynamic object dependencies. Our method extends differentiable rendering to 3D volumetric flow fields.
arXiv Detail & Related papers (2024-07-10T12:20:11Z)
SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving [18.88208422580103]
Scene flow estimation predicts the 3D motion at each point in successive LiDAR scans. Current state-of-the-art methods require annotated data to train scene flow networks. We propose SeFlow, a self-supervised method that integrates efficient dynamic classification into a learning-based scene flow pipeline.
arXiv Detail & Related papers (2024-07-01T18:22:54Z)
Self-Supervised 3D Scene Flow Estimation and Motion Prediction using Local Rigidity Prior [100.98123802027847]
We investigate self-supervised 3D scene flow estimation and class-agnostic motion prediction on point clouds. We generate pseudo scene flow labels for self-supervised learning through piecewise rigid motion estimation. Our method achieves new state-of-the-art performance in self-supervised scene flow learning.
arXiv Detail & Related papers (2023-10-17T14:06:55Z)
Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants. Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene. This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
DEFLOW: Self-supervised 3D Motion Estimation of Debris Flow [19.240172015210586]
We propose DEFLOW, a model for 3D motion estimation of debris flows. We adopt a novel multi-level sensor fusion architecture and self-supervision to incorporate the inductive biases of the scene. Our model achieves state-of-the-art optical flow and depth estimation on our dataset, and fully automates the motion estimation for debris flows.
arXiv Detail & Related papers (2023-04-05T16:40:14Z)
Weakly Supervised Learning of Rigid 3D Scene Flow [81.37165332656612]
We propose a data-driven scene flow estimation algorithm exploiting the observation that many 3D scenes can be explained by a collection of agents moving as rigid bodies. We showcase the effectiveness and generalization capacity of our method on four different autonomous driving datasets.
arXiv Detail & Related papers (2021-02-17T18:58:02Z)
IntentNet: Learning to Predict Intention from Raw Sensor Data [86.74403297781039]
In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor as well as dynamic maps of the environment. Our multi-task model achieves better accuracy than the respective separate modules while saving computation, which is critical to reducing reaction time in self-driving applications.
arXiv Detail & Related papers (2021-01-20T00:31:52Z)
Do not trust the neighbors! Adversarial Metric Learning for Self-Supervised Scene Flow Estimation [0.0]
Scene flow is the task of estimating 3D motion vectors to individual points of a dynamic 3D scene. We propose a 3D scene flow benchmark and a novel self-supervised setup for training flow models. We find that our setup is able to keep motion coherence and preserve local geometries, which many self-supervised baselines fail to grasp.
arXiv Detail & Related papers (2020-11-01T17:41:32Z)
Self-Supervised Learning of Non-Rigid Residual Flow and Ego-Motion [63.18340058854517]
We present an alternative method for end-to-end scene flow learning by joint estimation of non-rigid residual flow and ego-motion flow for dynamic 3D scenes. We extend the supervised framework with self-supervisory signals based on the temporal consistency property of a point cloud sequence.
arXiv Detail & Related papers (2020-09-22T11:39:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.