Dense Depth Distillation with Out-of-Distribution Simulated Images
- URL: http://arxiv.org/abs/2208.12464v3
- Date: Fri, 8 Dec 2023 04:05:02 GMT
- Title: Dense Depth Distillation with Out-of-Distribution Simulated Images
- Authors: Junjie Hu and Chenyou Fan and Mete Ozay and Hualie Jiang and Tin Lun Lam
- Abstract summary: We study data-free knowledge distillation (KD) for monocular depth estimation (MDE).
KD learns a lightweight model for real-world depth perception tasks by compressing it from a trained teacher model while lacking training data in the target domain.
We show that our method outperforms the baseline KD by a good margin and even achieves slightly better performance with as few as 1/6 of training images.
- Score: 30.79756881887895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study data-free knowledge distillation (KD) for monocular depth estimation
(MDE), which learns a lightweight model for real-world depth perception tasks
by compressing it from a trained teacher model while lacking training data in
the target domain. Owing to the essential difference between image
classification and dense regression, previous methods of data-free KD are not
applicable to MDE. To strengthen its applicability in real-world tasks, in this
paper, we propose to apply KD with out-of-distribution simulated images. The
major challenges to be resolved are i) lacking prior information about scene
configurations of real-world training data and ii) domain shift between
simulated and real-world images. To cope with these difficulties, we propose a
tailored framework for depth distillation. The framework generates new training
samples for embracing a multitude of possible object arrangements in the target
domain and utilizes a transformation network to efficiently adapt them to the
feature statistics preserved in the teacher model. Through extensive
experiments on various depth estimation models and two different datasets, we
show that our method outperforms the baseline KD by a good margin and even
achieves slightly better performance with as few as 1/6 of training images,
demonstrating a clear superiority.
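The abstract does not spell out the distillation objective. As a minimal sketch of what pixel-wise depth distillation generally looks like, the snippet below assumes an L1 (mean absolute error) loss between the teacher's and the student's predicted depth maps. The function name `distillation_loss`, the nested-list depth maps, and the choice of L1 are illustrative assumptions, not the authors' formulation.

```python
def distillation_loss(teacher_depth, student_depth):
    """Mean absolute difference between two equally sized depth maps.

    Both arguments are 2D lists (rows of per-pixel depth values).
    This is a generic L1 distillation term, assumed for illustration only.
    """
    if len(teacher_depth) != len(student_depth):
        raise ValueError("depth maps must have the same number of rows")

    total = 0.0
    count = 0
    for t_row, s_row in zip(teacher_depth, student_depth):
        if len(t_row) != len(s_row):
            raise ValueError("depth map rows must have the same length")
        for t, s in zip(t_row, s_row):
            # The student is penalised for every pixel where it
            # deviates from the teacher's prediction.
            total += abs(t - s)
            count += 1

    if count == 0:
        raise ValueError("depth maps must not be empty")
    return total / count


# Hypothetical 2x2 depth maps (values in metres).
teacher = [[1.0, 2.0], [3.0, 4.0]]
student = [[1.5, 2.0], [2.0, 4.0]]
loss = distillation_loss(teacher, student)
```

In the data-free setting described above, the inputs fed to both networks would be simulated images rather than the original training data; the loss itself needs no ground-truth depth, only the teacher's output.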
Related papers
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
- Unsupervised Deep Learning-based Pansharpening with Jointly-Enhanced Spectral and Spatial Fidelity [4.425982186154401]
We propose a new deep learning-based pansharpening model that fully exploits the potential of this approach.
The proposed model features a novel loss function that jointly promotes the spectral and spatial quality of the pansharpened data.
Experiments on a large variety of test images, performed in challenging scenarios, demonstrate that the proposed method compares favorably with the state of the art.
arXiv Detail & Related papers (2023-07-26T17:25:28Z)
- Delving Deeper into Data Scaling in Masked Image Modeling [145.36501330782357]
We conduct an empirical study on the scaling capability of masked image modeling (MIM) methods for visual recognition.
Specifically, we utilize the web-collected Coyo-700M dataset.
Our goal is to investigate how the performance changes on downstream tasks when scaling with different sizes of data and models.
arXiv Detail & Related papers (2023-05-24T15:33:46Z)
- DELAD: Deep Landweber-guided deconvolution with Hessian and sparse prior [0.22940141855172028]
We present a model for non-blind image deconvolution that incorporates the classic iterative method into a deep learning application.
We build our network based on the iterative Landweber deconvolution algorithm, which is integrated with trainable convolutional layers to enhance the recovered image structures and details.
arXiv Detail & Related papers (2022-09-30T11:15:03Z)
- Single Image Internal Distribution Measurement Using Non-Local Variational Autoencoder [11.985083962982909]
This paper proposes a novel image-specific solution, namely non-local variational autoencoder (NLVAE).
NLVAE is introduced as a self-supervised strategy that reconstructs high-resolution images using disentangled information from the non-local neighbourhood.
Experimental results from seven benchmark datasets demonstrate the effectiveness of the NLVAE model.
arXiv Detail & Related papers (2022-04-02T18:43:55Z)
- An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
- DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning [122.51237307910878]
We develop methods for few-shot image classification from a new perspective of optimal matching between image regions.
We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations.
To generate the important weights of elements in the formulation, we design a cross-reference mechanism.
arXiv Detail & Related papers (2020-03-15T08:13:16Z)