Focus on defocus: bridging the synthetic to real domain gap for depth
estimation
- URL: http://arxiv.org/abs/2005.09623v1
- Date: Tue, 19 May 2020 17:52:37 GMT
- Title: Focus on defocus: bridging the synthetic to real domain gap for depth
estimation
- Authors: Maxim Maximov, Kevin Galim and Laura Leal-Taixé
- Abstract summary: We tackle the issue of closing the synthetic-real domain gap by using domain invariant defocus blur as direct supervision.
We leverage defocus cues by using a permutation invariant convolutional neural network that encourages the network to learn from the differences between images with a different point of focus.
We are able to train our model completely on synthetic data and directly apply it to a wide range of real-world images.
- Score: 9.023847175654602
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-driven depth estimation methods struggle to generalize outside
their training scenes due to the immense variability of real-world scenes.
This problem can be partially addressed by utilising synthetically generated
images, but closing the synthetic-real domain gap is far from trivial. In this
paper, we tackle this issue by using domain invariant defocus blur as direct
supervision. We leverage defocus cues by using a permutation invariant
convolutional neural network that encourages the network to learn from the
differences between images with a different point of focus. Our proposed
network uses the defocus map as an intermediate supervisory signal. We are able
to train our model completely on synthetic data and directly apply it to a wide
range of real-world images. We evaluate our model on synthetic and real
datasets, showing compelling generalization results and state-of-the-art depth
prediction.
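The permutation-invariant idea in the abstract can be sketched minimally: apply one shared encoder to every image in the focal stack, then aggregate with a symmetric operation (here, a mean) so the output is independent of the order of the focus slices. This is an illustrative NumPy stand-in, not the paper's actual architecture; the single 3x3 convolution standing in for the shared CNN branch is an assumption for brevity.

```python
import numpy as np

def encode(image, weights):
    """Shared per-image 'encoder': one 3x3 valid convolution.

    Stands in for the shared CNN branch that is applied with identical
    weights to every image in the focal stack.
    """
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(image[i:i + 3, j:j + 3] * weights)
    return out

def permutation_invariant_forward(stack, weights):
    """Encode each focal-stack image with shared weights, then pool with
    a symmetric operation (mean) so the result does not depend on the
    order in which the focus slices are presented."""
    features = [encode(img, weights) for img in stack]
    return np.mean(features, axis=0)
```

Because the pooling is symmetric, shuffling the focal stack leaves the output unchanged, which is the property that lets the network focus on differences between focus settings rather than their ordering.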
Related papers
- Domain Generalization for In-Orbit 6D Pose Estimation [14.624172952608653]
We introduce a novel, end-to-end, neural-based architecture for spacecraft pose estimation.
We demonstrate that our method effectively closes the domain gap, achieving state-of-the-art accuracy on the widespread SPEED+ dataset.
arXiv Detail & Related papers (2024-06-17T17:01:20Z)
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
- Towards Real-World Focus Stacking with Deep Learning [97.34754533628322]
We introduce a new dataset consisting of 94 high-resolution bursts of raw images with focus bracketing.
This dataset is used to train the first deep learning algorithm for focus stacking capable of handling bursts of sufficient length for real-world applications.
arXiv Detail & Related papers (2023-11-29T17:49:33Z)
- Aberration-Aware Depth-from-Focus [20.956132508261664]
We investigate the domain gap caused by off-axis aberrations that will affect the decision of the best-focused frame in a focal stack.
We then explore bridging this domain gap through aberration-aware training (AAT)
Our approach involves a lightweight network that models lens aberrations at different positions and focus distances, which is then integrated into the conventional network training pipeline.
arXiv Detail & Related papers (2023-03-08T15:21:33Z)
- Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z)
- Grasp-Oriented Fine-grained Cloth Segmentation without Real Supervision [66.56535902642085]
This paper tackles the problem of fine-grained region detection in deformed clothes using only a depth image.
We define up to 6 semantic regions of varying extent, including edges on the neckline, sleeve cuffs, and hem, plus top and bottom grasping points.
We introduce a U-net based network to segment and label these parts.
We show that training our network solely with synthetic data and the proposed DA yields results competitive with models trained on real data.
arXiv Detail & Related papers (2021-10-06T16:31:20Z)
- Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
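The transform-consistency insight above can be sketched with a toy example: if registering the query against different reference images yields relative transforms that are all correct, composing each with its reference's absolute pose must give the same absolute query pose, so disagreement between those compositions is a usable loss. This is a simplified 2D-rigid-transform illustration with hypothetical function names, not the paper's actual formulation.

```python
import numpy as np

def se2(theta, tx, ty):
    """2D rigid transform (rotation theta, translation tx, ty) as a 3x3
    homogeneous matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

def absolute_pose(relative_query_to_ref, ref_absolute):
    """Absolute query pose implied by a registration against one reference."""
    return ref_absolute @ relative_query_to_ref

def transform_consistency_loss(abs_poses):
    """Penalize disagreement between the absolute poses obtained from
    different reference images (Frobenius distance to their mean)."""
    mean = np.mean(abs_poses, axis=0)
    return float(sum(np.linalg.norm(p - mean) for p in abs_poses))
```

With perfect relative registrations the loss vanishes regardless of which references are used, which is exactly why no ground-truth correspondences are needed to supervise the network.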
arXiv Detail & Related papers (2020-11-01T19:24:27Z)
- Domain Decluttering: Simplifying Images to Mitigate Synthetic-Real Domain Shift and Improve Depth Estimation [16.153683223016973]
We develop an attention module that learns to identify and remove difficult out-of-domain regions in real images.
Visualizing the removed regions provides interpretable insights into the synthetic-real domain gap.
arXiv Detail & Related papers (2020-02-27T14:28:56Z)
- Focus on Semantic Consistency for Cross-domain Crowd Understanding [34.560447389853614]
Some domain adaptation algorithms try to alleviate this by training models with synthetic data.
We find that numerous estimation errors in the background areas impede the performance of existing methods.
In this paper, we propose a domain adaptation method to eliminate these errors.
arXiv Detail & Related papers (2020-02-20T08:51:05Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
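The receptive-field growth that motivates the dense dilated convolutions in the inpainting entry above can be verified with a short calculation: a stride-1 convolution with kernel size k and dilation d has an effective kernel of d*(k - 1) + 1, and stacking layers adds the effective extents. This is a generic illustration of why dilation enlarges receptive fields, not code from that paper.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 convolutions.

    Each layer with kernel size k and dilation d covers an effective
    extent of d * (k - 1) + 1 pixels; stacking layers grows the
    receptive field by d * (k - 1) per layer.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += d * (k - 1)
    return rf
```

For example, four 3x3 convolutions with dilations 1, 2, 4, 8 reach a 31-pixel receptive field, versus only 9 pixels for the same four layers undilated.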
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.