Learning to Estimate Single-View Volumetric Flow Motions without 3D
Supervision
- URL: http://arxiv.org/abs/2302.14470v1
- Date: Tue, 28 Feb 2023 10:26:02 GMT
- Title: Learning to Estimate Single-View Volumetric Flow Motions without 3D
Supervision
- Authors: Erik Franz (1), Barbara Solenthaler (2 and 3), Nils Thuerey (1) ((1)
Technical University of Munich (TUM), (2) ETH Zurich, (3) TUM - Institute for
Advanced Study)
- Abstract summary: We show that it is possible to train the corresponding networks without requiring any 3D ground truth for training.
In the absence of ground truth data we can train our model with observations from real-world capture setups instead of relying on synthetic reconstructions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the challenging problem of jointly inferring the 3D flow and
volumetric densities moving in a fluid from a monocular input video with a deep
neural network. Despite the complexity of this task, we show that it is
possible to train the corresponding networks without requiring any 3D ground
truth for training. In the absence of ground truth data we can train our model
with observations from real-world capture setups instead of relying on
synthetic reconstructions. We make this unsupervised training approach possible
by first generating an initial prototype volume which is then moved and
transported over time without the need for volumetric supervision. Our approach
relies purely on image-based losses, an adversarial discriminator network, and
regularization. Our method can estimate long-term sequences in a stable manner,
while achieving closely matching targets for inputs such as rising smoke
plumes.
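The transport-and-compare idea in the abstract can be illustrated with a toy example. This is a minimal sketch under strong simplifying assumptions, not the authors' implementation: the function names `advect`, `render`, and `image_loss` are hypothetical, the advection is a nearest-neighbour semi-Lagrangian step, and rendering is a plain orthographic sum. The neural networks, the adversarial discriminator, and the regularization mentioned in the abstract are omitted.

```python
import numpy as np

def advect(density, velocity, dt=1.0):
    """Semi-Lagrangian advection with nearest-neighbour sampling (assumed scheme).

    density:  (D, H, W) scalar volume
    velocity: (3, D, H, W) velocity in voxels per step, ordered (z, y, x)
    """
    shape = density.shape
    grid = np.stack(np.meshgrid(*[np.arange(n) for n in shape], indexing="ij"))
    # Trace each voxel centre backwards along the flow to find its source voxel.
    src = grid - dt * velocity
    # Round to the nearest voxel and clamp to the volume bounds.
    idx = [np.clip(np.rint(src[a]).astype(int), 0, shape[a] - 1) for a in range(3)]
    return density[idx[0], idx[1], idx[2]]

def render(density, axis=0):
    """Orthographic projection: integrate density along the viewing axis."""
    return density.sum(axis=axis)

def image_loss(density, target_image):
    """Mean squared error between the rendered volume and a target image."""
    return float(np.mean((render(density) - target_image) ** 2))

# A single voxel of "smoke" moved one voxel along the y axis by a uniform flow.
volume = np.zeros((8, 8, 8))
volume[4, 2, 4] = 1.0
flow = np.zeros((3, 8, 8, 8))
flow[1] = 1.0

# The target is a 2D observation only; no 3D ground truth is involved.
target = np.zeros((8, 8))
target[3, 4] = 1.0

loss_before = image_loss(volume, target)
loss_after = image_loss(advect(volume, flow), target)
```

In this toy setting the loss is computed purely from a 2D projection, so a flow that moves the density to the observed position lowers the loss without any volumetric supervision.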
Related papers
- ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Prediction [89.89610257714006]
Existing methods prioritize higher accuracy to cater to the demands of these tasks.
We introduce a series of targeted improvements for 3D semantic occupancy prediction and flow estimation.
Our purely convolutional architecture framework, named ALOcc, achieves an optimal tradeoff between speed and accuracy.
arXiv Detail & Related papers (2024-11-12T11:32:56Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- NCP: Neural Correspondence Prior for Effective Unsupervised Shape Matching [31.61255365182462]
We present Neural Correspondence Prior (NCP), a new paradigm for computing correspondences between 3D shapes.
Our approach is fully unsupervised and can lead to high-quality correspondences even in challenging cases.
We show that NCP is data-efficient and fast, and that it achieves state-of-the-art results on many tasks.
arXiv Detail & Related papers (2023-01-14T07:22:18Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- OReX: Object Reconstruction from Planar Cross-sections Using Neural Fields [10.862993171454685]
OReX is a method for 3D shape reconstruction from slices alone, featuring a Neural Field as the prior.
A modest neural network is trained on the input planes to return an inside/outside estimate for a given 3D coordinate, yielding a powerful prior that induces smoothness and self-similarities.
We offer an iterative estimation architecture and a hierarchical input sampling scheme that encourage coarse-to-fine training, allowing the training process to focus on high frequencies at later stages.
arXiv Detail & Related papers (2022-11-23T11:44:35Z)
- RAUM-VO: Rotational Adjusted Unsupervised Monocular Visual Odometry [0.0]
We present RAUM-VO, an approach based on a model-free epipolar constraint for frame-to-frame motion estimation.
RAUM-VO shows a considerable accuracy improvement compared to other unsupervised pose networks on the KITTI dataset.
arXiv Detail & Related papers (2022-03-14T15:03:24Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
- Auto-Rectify Network for Unsupervised Indoor Depth Estimation [119.82412041164372]
We establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth.
We propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning.
Our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset.
arXiv Detail & Related papers (2020-06-04T08:59:17Z)
- Self-Supervised Monocular Scene Flow Estimation [27.477810324117016]
We propose a novel monocular scene flow method that yields competitive accuracy and real-time performance.
By taking an inverse problem view, we design a single convolutional neural network (CNN) that successfully estimates depth and 3D motion simultaneously.
arXiv Detail & Related papers (2020-04-08T17:55:54Z)