ADFactory: An Effective Framework for Generalizing Optical Flow with
Nerf
- URL: http://arxiv.org/abs/2311.04246v2
- Date: Tue, 14 Nov 2023 08:52:57 GMT
- Title: ADFactory: An Effective Framework for Generalizing Optical Flow with
Nerf
- Authors: Han Ling
- Abstract summary: We introduce a novel optical flow training framework: automatic data factory (ADF).
ADF only requires RGB images as input to effectively train the optical flow network on the target data domain.
We use advanced NeRF technology to reconstruct scenes from photo groups collected by a monocular camera.
We screen the generated labels from multiple aspects, such as optical flow matching accuracy, radiance field confidence, and depth consistency.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A significant challenge facing current optical flow methods is the difficulty
of generalizing well to the real world. This is mainly due to the high
cost of hand-crafted datasets, while existing self-supervised methods are limited
by indirect losses and occlusions, resulting in fuzzy outcomes. To address this
challenge, we introduce a novel optical flow training framework: automatic data
factory (ADF). ADF only requires RGB images as input to effectively train the
optical flow network on the target data domain. Specifically, we use advanced
NeRF technology to reconstruct scenes from photo groups collected by a
monocular camera, and then calculate optical flow labels between camera pose
pairs based on the rendering results. To eliminate erroneous labels caused by
defects in the scene reconstructed by NeRF, we screen the generated labels
from multiple aspects, such as optical flow matching accuracy, radiance field
confidence, and depth consistency. The filtered labels can be directly used for
network supervision. Experimentally, the generalization ability of ADF on KITTI
surpasses existing self-supervised optical flow and monocular scene flow
algorithms. In addition, ADF achieves impressive results in real-world
zero-shot generalization evaluations and surpasses most supervised methods.
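The abstract only outlines the label-generation pipeline at a high level. The snippet below is a minimal, hedged sketch (not the authors' released code) of how dense optical flow labels could be derived from NeRF-rendered depth and a known relative camera pose, together with a simple depth-consistency filter of the kind the abstract mentions; all names and parameters (depth1, depth2, K, R, t, rel_tol) are illustrative assumptions.

```python
# Illustrative sketch only: derives flow labels from rendered depth and camera poses,
# then filters them with a depth-consistency check. Not the ADFactory implementation.
import numpy as np

def flow_from_depth_and_pose(depth1, K, R, t):
    """Project every pixel of view 1 into view 2 using rendered depth and relative pose.

    depth1 : (H, W) depth rendered for view 1
    K      : (3, 3) camera intrinsics shared by both views (assumption)
    R, t   : rotation (3, 3) and translation (3,) taking view-1 points into view 2
    Returns the flow label (H, W, 2) and the depth of each warped point in view 2.
    """
    H, W = depth1.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))             # pixel grid of view 1
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)            # homogeneous pixel coords
    rays = pix @ np.linalg.inv(K).T                             # back-project to camera rays
    pts1 = rays * depth1[..., None]                             # 3-D points in view-1 frame
    pts2 = pts1 @ R.T + t                                       # move points into view-2 frame
    proj = pts2 @ K.T                                           # project into view-2 image
    uv2 = proj[..., :2] / np.clip(proj[..., 2:3], 1e-6, None)
    flow = uv2 - np.stack([u, v], axis=-1).astype(np.float64)   # displacement = flow label
    return flow, pts2[..., 2]

def depth_consistency_mask(warped_z, depth2, flow, rel_tol=0.05):
    """Keep labels whose warped depth agrees with the depth rendered at view 2."""
    H, W = depth2.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    u2 = np.clip(np.round(u + flow[..., 0]).astype(int), 0, W - 1)
    v2 = np.clip(np.round(v + flow[..., 1]).astype(int), 0, H - 1)
    d2 = depth2[v2, u2]                                         # depth sampled at target pixel
    return np.abs(d2 - warped_z) < rel_tol * np.maximum(d2, 1e-6)
```

In this sketch, each source-view pixel is back-projected with its rendered depth, transformed into the target camera frame, and re-projected; the displacement between original and re-projected pixel coordinates serves as the flow label, and labels whose warped depth disagrees with the depth rendered at the target view are discarded, mirroring one of the filtering criteria described above.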
Related papers
- Generalizable Non-Line-of-Sight Imaging with Learnable Physical Priors [52.195637608631955]
Non-line-of-sight (NLOS) imaging has attracted increasing attention due to its potential applications.
Existing NLOS reconstruction approaches are constrained by the reliance on empirical physical priors.
We introduce a novel learning-based solution, comprising two key designs: Learnable Path Compensation (LPC) and Adaptive Phasor Field (APF).
arXiv Detail & Related papers (2024-09-21T04:39:45Z)
- RFTrans: Leveraging Refractive Flow of Transparent Objects for Surface Normal Estimation and Manipulation [50.10282876199739]
This paper introduces RFTrans, an RGB-D-based method for surface normal estimation and manipulation of transparent objects.
It integrates the RFNet, which predicts refractive flow, object mask, and boundaries, followed by the F2Net, which estimates surface normal from the refractive flow.
A real-world robot grasping task achieves an 83% success rate, showing that refractive flow can help enable direct sim-to-real transfer.
arXiv Detail & Related papers (2023-11-21T07:19:47Z)
- Improving Lens Flare Removal with General Purpose Pipeline and Multiple Light Sources Recovery [69.71080926778413]
Flare artifacts can degrade image visual quality and downstream computer vision tasks.
Current methods do not consider automatic exposure and tone mapping in the image signal processing (ISP) pipeline.
We propose a solution to improve the performance of lens flare removal by revisiting the ISP and designing a more reliable light source recovery strategy.
arXiv Detail & Related papers (2023-08-31T04:58:17Z)
- Optical Flow for Autonomous Driving: Applications, Challenges and Improvements [0.9023847175654602]
We propose and evaluate training strategies to improve a learning-based optical flow algorithm.
While trained with synthetic data, the model demonstrates strong capabilities to generalize to real world fisheye data.
We propose a novel, generic semi-supervised framework that significantly boosts the performance of existing methods in low light.
arXiv Detail & Related papers (2023-01-11T12:01:42Z)
- GraspNeRF: Multiview-based 6-DoF Grasp Detection for Transparent and Specular Objects Using Generalizable NeRF [7.47805672405939]
We propose a multiview RGB-based 6-DoF grasp detection network, GraspNeRF, to achieve material-agnostic object grasping in clutter.
Compared to existing NeRF-based 3-DoF grasp detection methods, our system can perform zero-shot NeRF construction with sparse RGB inputs and reliably detect 6-DoF grasps, all in real time.
For training data, we generate a large-scale photorealistic domain-randomized synthetic dataset of grasping in cluttered tabletop scenes.
arXiv Detail & Related papers (2022-10-12T20:31:23Z)
- LWGNet: Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval [14.588976801396576]
We propose a hybrid model-driven residual network that combines the knowledge of the forward imaging system with a deep data-driven network.
Unlike other conventional unrolling techniques, LWGNet uses fewer stages while performing on par with or even better than existing traditional and deep learning techniques.
This improvement in performance for low-bit depth and low-cost sensors has the potential to bring down the cost of FPM imaging setup significantly.
arXiv Detail & Related papers (2022-08-08T17:22:54Z)
- Dense Optical Flow from Event Cameras [55.79329250951028]
We propose to incorporate feature correlation and sequential processing into dense optical flow estimation from event cameras.
Our proposed approach computes dense optical flow and reduces the end-point error by 23% on MVSEC.
arXiv Detail & Related papers (2021-08-24T07:39:08Z)
- Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution [51.274657266928315]
We propose a PSF aware plug-and-play deep network, which takes the aberrant image and PSF map as input and produces the latent high quality version via incorporating lens-specific deep priors.
Specifically, we pre-train a base model from a set of diverse lenses and then adapt it to a given lens by quickly refining the parameters.
arXiv Detail & Related papers (2021-04-07T12:00:38Z)
- FD-GAN: Generative Adversarial Networks with Fusion-discriminator for Single Image Dehazing [48.65974971543703]
We propose a fully end-to-end Generative Adversarial Networks with Fusion-discriminator (FD-GAN) for image dehazing.
Our model can generate more natural and realistic dehazed images with less color distortion and fewer artifacts.
Experiments have shown that our method reaches state-of-the-art performance on both public synthetic datasets and real-world images.
arXiv Detail & Related papers (2020-01-20T04:36:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.