Detail-Preserving Transformer for Light Field Image Super-Resolution
- URL: http://arxiv.org/abs/2201.00346v1
- Date: Sun, 2 Jan 2022 12:33:23 GMT
- Title: Detail-Preserving Transformer for Light Field Image Super-Resolution
- Authors: Shunzhou Wang, Tianfei Zhou, Yao Lu, Huijun Di
- Abstract summary: We put forth a novel formulation built upon Transformers, by treating light field super-resolution as a sequence-to-sequence reconstruction task.
We propose a detail-preserving Transformer (termed DPT), leveraging gradient maps of the light field to guide the sequence learning.
DPT consists of two branches, each associated with a Transformer that learns from the original or the gradient image sequence.
- Score: 15.53525700552796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, numerous algorithms have been developed to tackle the problem of
light field super-resolution (LFSR), i.e., super-resolving low-resolution light
fields to gain high-resolution views. Despite delivering encouraging results,
these approaches are all convolution-based, and are naturally weak in the global
relation modeling of sub-aperture images that is necessary to characterize the
inherent structure of light fields. In this paper, we put forth a novel
formulation built upon Transformers, by treating LFSR as a sequence-to-sequence
reconstruction task. In particular, our model regards sub-aperture images of
each vertical or horizontal angular view as a sequence, and establishes
long-range geometric dependencies within each sequence via a spatial-angular
locally-enhanced self-attention layer, which maintains the locality of each
sub-aperture image as well. Additionally, to better recover image details, we
propose a detail-preserving Transformer (termed DPT), which leverages gradient
maps of the light field to guide the sequence learning. DPT consists of two
branches, each associated with a Transformer that learns from the original or
the gradient image sequence. The two branches are finally fused to obtain
comprehensive feature representations for reconstruction. Evaluations are
conducted on a number of light field datasets, including real-world scenes and
synthetic data. The proposed method achieves superior performance compared
with other state-of-the-art schemes. Our code is publicly available at:
https://github.com/BITszwang/DPT.
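The gradient-map guidance described in the abstract is easy to picture: a gradient map highlights edges, giving the second branch an explicit high-frequency signal. Below is a minimal sketch using plain NumPy finite differences; the abstract does not specify DPT's exact gradient operator, so this is an illustrative stand-in.

```python
import numpy as np

def gradient_map(img: np.ndarray) -> np.ndarray:
    """Gradient-magnitude map of a 2-D sub-aperture image.

    A minimal sketch of the guidance signal described in the
    abstract; DPT's actual gradient operator may differ.
    """
    gy, gx = np.gradient(img.astype(np.float64))  # per-axis finite differences
    return np.hypot(gx, gy)                       # gradient magnitude

# Toy example: a vertical step edge yields strong responses near
# the edge and zero response in the flat regions on either side.
img = np.zeros((4, 4))
img[:, 2:] = 1.0
g = gradient_map(img)
```

In DPT, one such map per sub-aperture view would form the gradient image sequence fed to the second branch alongside the original sequence.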
Related papers
- TransY-Net: Learning Fully Transformer Networks for Change Detection of Remote Sensing Images [64.63004710817239]
We propose a novel Transformer-based learning framework named TransY-Net for remote sensing image CD.
It improves the feature extraction from a global view and combines multi-level visual features in a pyramid manner.
Our proposed method achieves a new state-of-the-art performance on four optical and two SAR image CD benchmarks.
arXiv Detail & Related papers (2023-10-22T07:42:19Z)
- Light Field Diffusion for Single-View Novel View Synthesis [32.59286750410843]
Single-view novel view synthesis (NVS) is important but challenging in computer vision.
Recent advancements in NVS have leveraged Denoising Diffusion Probabilistic Models (DDPMs) for their exceptional ability to produce high-fidelity images.
We present Light Field Diffusion (LFD), a novel conditional diffusion-based approach that transcends the conventional reliance on camera pose matrices.
arXiv Detail & Related papers (2023-09-20T03:27:06Z)
- Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network [69.96295927854042]
Low-light environments usually lead to less informative large-scale dark areas.
We propose to integrate the effectiveness of gamma correction with the strong modelling capacities of deep networks.
Because the exponential operation introduces high computational complexity, we propose to use a Taylor series to approximate gamma correction.
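The approximation this summary mentions can be sketched directly: gamma correction computes i**gamma = exp(gamma * ln(i)), and the exponential can be replaced by a truncated Taylor series. The term count below is an illustrative assumption, not a value taken from the paper.

```python
import math

def gamma_taylor(i: float, gamma: float, terms: int = 8) -> float:
    """Approximate i**gamma (0 < i <= 1) via a truncated Taylor
    expansion of exp(gamma * ln(i)); 'terms' is an illustrative
    choice, not taken from the paper."""
    x = gamma * math.log(i)
    return sum(x**k / math.factorial(k) for k in range(terms))

# For mid-range intensities the truncation error is small.
approx = gamma_taylor(0.5, 2.2)
exact = 0.5 ** 2.2
```

More terms are needed as the intensity approaches zero, since |gamma * ln(i)| grows and the truncated series converges more slowly there.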
arXiv Detail & Related papers (2023-08-16T08:46:51Z)
- Enhancing Low-light Light Field Images with A Deep Compensation Unfolding Network [52.77569396659629]
This paper presents the deep compensation network unfolding (DCUNet) for restoring light field (LF) images captured under low-light conditions.
The framework uses the intermediate enhanced result to estimate the illumination map, which is then employed in the unfolding process to produce a new enhanced result.
To properly leverage the unique characteristics of LF images, this paper proposes a pseudo-explicit feature interaction module.
arXiv Detail & Related papers (2023-08-10T07:53:06Z)
- Physics-Informed Ensemble Representation for Light-Field Image Super-Resolution [12.156009287223382]
We analyze the coordinate transformation of the light field (LF) imaging process to reveal the geometric relationship in the LF images.
We introduce a new LF subspace of virtual-slit images (VSI) that provide sub-pixel information complementary to sub-aperture images.
To super-resolve image structures from undersampled LF data, we propose a geometry-aware decoder, named EPIXformer.
arXiv Detail & Related papers (2023-05-31T16:27:00Z)
- Progressively-connected Light Field Network for Efficient View Synthesis [69.29043048775802]
We present a Progressively-connected Light Field network (ProLiF) for the novel view synthesis of complex forward-facing scenes.
ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses.
arXiv Detail & Related papers (2022-07-10T13:47:20Z)
- Light Field Reconstruction Using Convolutional Network on EPI and Extended Applications [78.63280020581662]
A novel convolutional neural network (CNN)-based framework is developed for light field reconstruction from a sparse set of views.
We demonstrate the high performance and robustness of the proposed framework compared with state-of-the-art algorithms.
arXiv Detail & Related papers (2021-03-24T08:16:32Z)
- Light Field Spatial Super-resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization [99.96632216070718]
Light field (LF) images acquired by hand-held devices usually suffer from low spatial resolution.
The high-dimensional spatiality characteristic and complex geometrical structure of LF images make the problem more challenging than traditional single-image SR.
We propose a novel learning-based LF framework, in which each view of an LF image is first individually super-resolved.
arXiv Detail & Related papers (2020-04-05T14:39:57Z)
- Learning light field synthesis with Multi-Plane Images: scene encoding as a recurrent segmentation task [30.058283056074426]
This paper addresses the problem of view synthesis from large baseline light fields by turning a sparse set of input views into a Multi-plane Image (MPI).
Because available datasets are scarce, we propose a lightweight network that does not require extensive training.
Our model does not learn to estimate RGB layers but only encodes the scene geometry within MPI alpha layers, which comes down to a segmentation task.
arXiv Detail & Related papers (2020-02-12T14:35:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.