Self-SuperFlow: Self-supervised Scene Flow Prediction in Stereo
Sequences
- URL: http://arxiv.org/abs/2206.15296v1
- Date: Thu, 30 Jun 2022 13:55:17 GMT
- Title: Self-SuperFlow: Self-supervised Scene Flow Prediction in Stereo
Sequences
- Authors: Katharina Bendig, René Schuster, Didier Stricker
- Abstract summary: In this paper, we explore the extension of a self-supervised loss for scene flow prediction.
Regarding the KITTI scene flow benchmark, our method outperforms the corresponding supervised pre-training of the same network.
- Score: 12.650574326251023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, deep neural networks have shown exceptional
capabilities in addressing many computer vision tasks, including scene flow
prediction. However, most of these advances depend on the availability of a
vast amount of dense, per-pixel ground truth annotations, which are very
difficult to obtain for real-life scenarios. Therefore, synthetic data is
often relied upon for supervision, resulting in a representation gap between
the training and test data. Even though a great quantity of unlabeled
real-world data is available, there is a significant lack of self-supervised
methods for scene flow prediction. Hence, we explore the extension of a
self-supervised loss based on the Census transform and occlusion-aware
bidirectional displacements to the problem of scene flow prediction. On the
KITTI scene flow benchmark, our method outperforms the corresponding
supervised pre-training of the same network and shows improved generalization
capabilities while achieving much faster convergence.
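The Census transform mentioned in the abstract compares each pixel to its local neighborhood and encodes the result as a binary descriptor, which makes a photometric loss robust to brightness changes. Below is a minimal NumPy sketch of a census-based matching loss with an occlusion mask; the paper's actual loss is differentiable and implemented in a deep learning framework, and the function names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def census_transform(img, window=3):
    """Binary descriptor: compare each pixel to its neighbors in a window."""
    h, w = img.shape
    r = window // 2
    padded = np.pad(img, r, mode="edge")
    bits = []
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue  # skip the center pixel
            neighbor = padded[r + dy:r + dy + h, r + dx:r + dx + w]
            bits.append((neighbor > img).astype(np.uint8))
    return np.stack(bits, axis=-1)  # shape (H, W, window*window - 1)

def census_loss(img1, img2_warped, occlusion_mask=None):
    """Hamming distance between census descriptors, averaged over
    visible pixels (occlusion_mask: 1 = visible, 0 = occluded)."""
    c1 = census_transform(img1)
    c2 = census_transform(img2_warped)
    hamming = np.abs(c1.astype(np.int32) - c2.astype(np.int32)).sum(axis=-1)
    if occlusion_mask is None:
        occlusion_mask = np.ones_like(hamming, dtype=np.float64)
    return (hamming * occlusion_mask).sum() / (occlusion_mask.sum() + 1e-8)
```

Masking out occluded pixels is the key "occlusion-aware" ingredient: pixels that leave the frame or are covered by another object have no valid correspondence, so their photometric error would otherwise corrupt the loss.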
Related papers
- Physics-guided Active Sample Reweighting for Urban Flow Prediction [75.24539704456791]
Urban flow prediction is a spatio-temporal modeling task that estimates the throughput of transportation services like buses, taxis, and ride-hailing services.
Some recent prediction solutions bring remedies with the notion of physics-guided machine learning (PGML).
We develop a physics-guided network (PN) and propose a data-aware framework, Physics-guided Active Sample Reweighting (P-GASR).
arXiv Detail & Related papers (2024-07-18T15:44:23Z) - Adapting to Length Shift: FlexiLength Network for Trajectory Prediction [53.637837706712794]
Trajectory prediction plays an important role in various applications, including autonomous driving, robotics, and scene understanding.
Existing approaches mainly focus on developing compact neural networks to increase prediction precision on public datasets, typically employing a standardized input duration.
We introduce a general and effective framework, the FlexiLength Network (FLN), to enhance the robustness of existing trajectory prediction against varying observation periods.
arXiv Detail & Related papers (2024-03-31T17:18:57Z) - DiffSF: Diffusion Models for Scene Flow Estimation [17.512660491303684]
We propose DiffSF that combines transformer-based scene flow estimation with denoising diffusion models.
We show that the diffusion process greatly increases the robustness of predictions compared to prior approaches.
By sampling multiple times with different initial states, the denoising process predicts multiple hypotheses, which enables measuring the output uncertainty.
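The multi-hypothesis idea above can be sketched simply: run the learned denoising process from several different initial noise samples, then take the spread across hypotheses as an uncertainty measure. The `denoise_fn` below is a hypothetical stand-in for the trained reverse diffusion process, not DiffSF's actual interface.

```python
import numpy as np

def predict_with_uncertainty(denoise_fn, num_samples=8, shape=(100, 3), seed=0):
    """Sample the diffusion model several times from different initial noise;
    the per-point variance across hypotheses serves as an uncertainty measure."""
    rng = np.random.default_rng(seed)
    hypotheses = np.stack(
        [denoise_fn(rng.standard_normal(shape)) for _ in range(num_samples)]
    )  # shape (num_samples, num_points, 3)
    flow_estimate = hypotheses.mean(axis=0)             # aggregated prediction
    uncertainty = hypotheses.var(axis=0).mean(axis=-1)  # per-point variance
    return flow_estimate, uncertainty
```

A deterministic model would yield zero variance; a model that is genuinely uncertain about a point produces divergent hypotheses there, and the variance flags it.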
arXiv Detail & Related papers (2024-03-08T14:06:15Z) - Multi-Body Neural Scene Flow [37.31530794244607]
We show that multi-body rigidity can be achieved without the cumbersome and brittle strategy of constraining the $SE(3)$ parameters of each rigid body.
This is achieved by regularizing the scene flow optimization to encourage isometry in flow predictions for rigid bodies.
We conduct extensive experiments on real-world datasets and demonstrate that our approach outperforms the state-of-the-art in 3D scene flow and long-term point-wise 4D trajectory prediction.
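The isometry regularization described above can be illustrated with a small sketch: for points belonging to one putative rigid body, the flow should preserve pairwise distances. This is an assumed simplification of the paper's regularizer, shown here only to convey the idea.

```python
import numpy as np

def isometry_loss(points, flow):
    """Encourage the flow within one (putative) rigid body to be an isometry:
    pairwise distances between points should be preserved after warping."""
    warped = points + flow
    d_before = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    d_after = np.linalg.norm(warped[:, None] - warped[None, :], axis=-1)
    return np.abs(d_after - d_before).mean()
```

A pure translation (and, more generally, any rigid motion) incurs zero penalty, while deformations such as scaling are penalized, which is exactly the behavior a rigid-body prior should have.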
arXiv Detail & Related papers (2023-10-16T11:37:53Z) - Implicit Occupancy Flow Fields for Perception and Prediction in
Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z) - GFlowOut: Dropout with Generative Flow Networks [76.59535235717631]
Monte Carlo Dropout has been widely used as a relatively cheap way for approximate inference.
Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference.
GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks.
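For context, Monte Carlo Dropout keeps dropout active at inference time and averages many stochastic forward passes; the spread across passes approximates predictive uncertainty. The sketch below uses a single linear layer with a fixed dropout rate; it illustrates standard MC Dropout only, not GFlowOut's learned mask posterior.

```python
import numpy as np

def mc_dropout_predict(weights, x, p=0.5, num_passes=100, seed=0):
    """Monte Carlo Dropout: keep dropout active at inference and average
    many stochastic forward passes through a single linear layer."""
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(num_passes):
        mask = rng.random(x.shape) > p              # sample a dropout mask
        dropped = np.where(mask, x, 0.0) / (1 - p)  # inverted-dropout scaling
        outputs.append(dropped @ weights)
    outputs = np.stack(outputs)
    return outputs.mean(axis=0), outputs.std(axis=0)
```

Each pass samples an independent mask from a fixed Bernoulli distribution; GFlowOut's contribution is to replace that fixed distribution with a posterior over masks learned via GFlowNets.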
arXiv Detail & Related papers (2022-10-24T03:00:01Z) - How many Observations are Enough? Knowledge Distillation for Trajectory
Forecasting [31.57539055861249]
Current state-of-the-art models usually rely on a "history" of past tracked locations to predict a plausible sequence of future locations.
We conceive a novel distillation strategy that allows a knowledge transfer from a teacher network to a student one.
We show that a properly defined teacher supervision allows a student network to perform comparably to state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-09T15:05:39Z) - Neural Scene Flow Prior [30.878829330230797]
Before the deep learning revolution, many perception algorithms were based on runtime optimization in conjunction with a strong prior/regularization penalty.
This paper revisits the scene flow problem that relies predominantly on runtime optimization and strong regularization.
A central innovation here is the inclusion of a neural scene flow prior, which uses the architecture of neural networks as a new type of implicit regularizer.
arXiv Detail & Related papers (2021-11-01T20:44:12Z) - Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras output brightness changes in the form of a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z) - Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal.
We show that by incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, still using the self-supervised formulation and not relying on any additional sensors.
arXiv Detail & Related papers (2020-09-16T14:35:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.