Multi-modal Bifurcated Network for Depth Guided Image Relighting
- URL: http://arxiv.org/abs/2105.00690v2
- Date: Wed, 5 May 2021 02:13:15 GMT
- Title: Multi-modal Bifurcated Network for Depth Guided Image Relighting
- Authors: Hao-Hsiang Yang and Wei-Ting Chen and Hao-Lun Luo and Sy-Yen Kuo
- Abstract summary: We propose a deep learning-based method called multi-modal bifurcated network (MBNet) for depth guided image relighting.
This model extracts the image and the depth features by the bifurcated network in the encoder.
Experiments conducted on the VIDIT dataset show that the proposed solution obtains 1st place in terms of SSIM and PMS.
- Score: 13.857410735989301
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image relighting aims to recalibrate the illumination setting in an image. In this paper, we propose a deep learning-based method called multi-modal bifurcated network (MBNet) for depth guided image relighting. That is, given an image and the corresponding depth map, our network generates a new image with the given illuminant angle and color temperature. The model extracts image and depth features with a bifurcated network in the encoder. To use the two feature streams effectively, we adopt dynamic dilated pyramid modules in the decoder. Moreover, to increase the variety of training data, we propose a novel data processing pipeline that enlarges the training set. Experiments conducted on the VIDIT dataset show that the proposed solution obtains 1st place in terms of SSIM and PMS in the NTIRE 2021 Depth Guided One-to-one Relighting Challenge.
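The two-branch design described in the abstract can be sketched as follows. This is an illustrative simplification, not the authors' MBNet: the class name, branch depths, and fusion are assumptions, and the dynamic dilated pyramid modules are replaced here with fixed-dilation convolutions merged by plain concatenation.

```python
# Hypothetical sketch of a bifurcated encoder with a dilated-pyramid decoder.
import torch
import torch.nn as nn

class BifurcatedRelightNet(nn.Module):
    def __init__(self, feat=16):
        super().__init__()
        # Encoder: two parallel branches, one for the RGB image, one for depth.
        self.rgb_branch = nn.Sequential(
            nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
        self.depth_branch = nn.Sequential(
            nn.Conv2d(1, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
        # Decoder: parallel dilated convolutions with growing receptive
        # fields (a fixed stand-in for the dynamic dilated pyramid modules).
        self.pyramid = nn.ModuleList([
            nn.Conv2d(2 * feat, feat, 3, padding=d, dilation=d)
            for d in (1, 2, 4)])
        # Project the merged pyramid features back to a relit RGB image.
        self.out = nn.Conv2d(3 * feat, 3, 1)

    def forward(self, rgb, depth):
        fused = torch.cat(
            [self.rgb_branch(rgb), self.depth_branch(depth)], dim=1)
        pyr = torch.cat([torch.relu(p(fused)) for p in self.pyramid], dim=1)
        return self.out(pyr)

net = BifurcatedRelightNet()
relit = net(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
```

Setting `padding=d` with `dilation=d` for a 3x3 kernel keeps the spatial size unchanged, so the three pyramid outputs can be concatenated directly.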
Related papers
- Enhancing Low-light Light Field Images with A Deep Compensation Unfolding Network [52.77569396659629]
This paper presents the deep compensation unfolding network (DCUNet) for restoring light field (LF) images captured under low-light conditions.
The framework uses the intermediate enhanced result to estimate the illumination map, which is then employed in the unfolding process to produce a new enhanced result.
To properly leverage the unique characteristics of LF images, this paper proposes a pseudo-explicit feature interaction module.
arXiv Detail & Related papers (2023-08-10T07:53:06Z) - PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks [83.08625720856445]
Deep learning tasks often require annotations that are too time-consuming for human operators to produce.
In this paper, we introduce PromptMix, a method for artificially boosting the size of existing datasets.
We show that PromptMix can significantly increase the performance of lightweight networks by up to 26%.
arXiv Detail & Related papers (2023-01-30T14:15:47Z) - Enhancing Low-Light Images in Real World via Cross-Image Disentanglement [58.754943762945864]
We propose a new low-light image enhancement dataset consisting of misaligned training images with real-world corruptions.
Our model achieves state-of-the-art performances on both the newly proposed dataset and other popular low-light datasets.
arXiv Detail & Related papers (2022-01-10T03:12:52Z) - Single Plane-Wave Imaging using Physics-Based Deep Learning [2.1410799064827226]
In plane-wave imaging, multiple unfocused ultrasound waves are transmitted into a medium of interest from different angles.
Deep learning methods have been proposed to improve ultrasound imaging.
We propose a data-to-image architecture that incorporates a wave-physics-based image formation algorithm in-between deep convolutional neural networks.
arXiv Detail & Related papers (2021-09-08T14:06:29Z) - RigNet: Repetitive Image Guided Network for Depth Completion [20.66405067066299]
Recent approaches mainly focus on image guided learning to predict dense results.
However, blurry image guidance and object structures in depth still impede the performance of image guided frameworks.
We explore a repetitive design in our image guided network to sufficiently and gradually recover depth values.
Our method achieves state-of-the-art results on the NYUv2 dataset and ranks 1st on the KITTI benchmark at the time of submission.
arXiv Detail & Related papers (2021-07-29T08:00:33Z) - S3Net: A Single Stream Structure for Depth Guided Image Relighting [13.201978111555817]
We propose a deep learning-based network with a single-stream structure, called S3Net, for depth guided image relighting.
Experiments performed on a challenging benchmark show that the proposed model achieves the 3rd highest SSIM in the NTIRE 2021 Depth Guided Any-to-any Relighting Challenge.
arXiv Detail & Related papers (2021-05-03T08:33:53Z) - Degrade is Upgrade: Learning Degradation for Low-light Image Enhancement [52.49231695707198]
We investigate the intrinsic degradation and relight the low-light image while refining the details and color in two steps.
Inspired by the color image formulation, we first estimate the degradation from low-light inputs to simulate the distortion of environment illumination color, and then refine the content to recover the loss of diffuse illumination color.
Our proposed method surpasses the SOTA by 0.95 dB in PSNR on the LOL1000 dataset and 3.18% in mAP on the ExDark dataset.
arXiv Detail & Related papers (2021-03-19T04:00:27Z) - Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration [77.1056200937214]
We study the formation of the DP pair which links the blur and the depth information.
We propose an end-to-end DDDNet (DP-based Depth and Deblur Network) to jointly estimate the depth and restore the image.
arXiv Detail & Related papers (2020-12-01T06:53:57Z) - Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt graph propagation to capture the observed spatial contexts.
We then apply an attention mechanism to the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce a symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z) - Learning light field synthesis with Multi-Plane Images: scene encoding as a recurrent segmentation task [30.058283056074426]
This paper addresses the problem of view synthesis from large baseline light fields by turning a sparse set of input views into a Multi-plane Image (MPI).
Because available datasets are scarce, we propose a lightweight network that does not require extensive training.
Our model does not learn to estimate RGB layers but only encodes the scene geometry within MPI alpha layers, which comes down to a segmentation task.
arXiv Detail & Related papers (2020-02-12T14:35:54Z)
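Several entries above report results in SSIM and PSNR. A minimal numpy sketch of both metrics is given below; note that the SSIM here is the global (single-window) simplification, whereas the NTIRE benchmarks use the standard windowed variant, and `L` denotes the dynamic range of the pixel values.

```python
import numpy as np

def ssim_global(x, y, L=1.0):
    # Global SSIM: one mean/variance/covariance over the whole image,
    # with the usual stabilizing constants C1 = (0.01 L)^2, C2 = (0.03 L)^2.
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def psnr(x, y, L=1.0):
    # Peak signal-to-noise ratio in dB: 10 * log10(L^2 / MSE).
    mse = ((x - y) ** 2).mean()
    return 10 * np.log10(L ** 2 / mse)
```

For identical images, `ssim_global` returns 1.0, and a uniform offset of 0.1 on images in [0, 1] yields a PSNR of 20 dB, which makes the functions easy to sanity-check.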
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.