Segmentation Guided Sparse Transformer for Under-Display Camera Image
Restoration
- URL: http://arxiv.org/abs/2403.05906v1
- Date: Sat, 9 Mar 2024 13:11:59 GMT
- Title: Segmentation Guided Sparse Transformer for Under-Display Camera Image
Restoration
- Authors: Jingyun Xue, Tao Wang, Jun Wang, Kaihao Zhang, Wenhan Luo, Wenqi Ren,
Zikun Liu, Hyunhee Park, Xiaochun Cao
- Abstract summary: Under-Display Camera (UDC) is an emerging technology that achieves full-screen display by hiding the camera under the display panel.
In this paper, we observe that when using the Vision Transformer for UDC degraded image restoration, the global attention samples a large amount of redundant information and noise.
We propose a Segmentation Guided Sparse Transformer method (SGSFormer) for the task of restoring high-quality images from UDC degraded images.
- Score: 91.65248635837145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Under-Display Camera (UDC) is an emerging technology that achieves
full-screen display by hiding the camera under the display panel. However, the
current implementation of UDC causes serious degradation. The incident light
required for camera imaging undergoes attenuation and diffraction when passing
through the display panel, leading to various artifacts in UDC imaging.
Presently, the prevailing UDC image restoration methods predominantly utilize
convolutional neural network architectures, whereas Transformer-based methods
have exhibited superior performance in the majority of image restoration tasks.
This is attributed to the Transformer's capability to sample global features
for the local reconstruction of images, thereby achieving high-quality image
restoration. In this paper, we observe that when using the Vision Transformer
for UDC degraded image restoration, the global attention samples a large amount
of redundant information and noise. Furthermore, compared to the ordinary
Transformer employing dense attention, the Transformer utilizing sparse
attention can alleviate the adverse impact of redundant information and noise.
Building upon this discovery, we propose a Segmentation Guided Sparse
Transformer method (SGSFormer) for the task of restoring high-quality images
from UDC degraded images. Specifically, we utilize sparse self-attention to
filter out redundant information and noise, directing the model's attention to
focus on the features more relevant to the degraded regions in need of
reconstruction. Moreover, we integrate the instance segmentation map as prior
information to guide the sparse self-attention in filtering and focusing on the
correct regions.
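The sparse self-attention the abstract describes can be illustrated with a top-k attention sketch: each query keeps only its k highest-scoring keys and masks out the rest, so redundant or noisy tokens receive zero weight. This is a minimal numpy sketch of the general top-k idea, not the authors' SGSFormer module; the function name, shapes, and choice of k are illustrative.

```python
import numpy as np

def topk_sparse_attention(q, k, v, topk=4):
    """Toy single-head sparse self-attention: each query attends only
    to its top-k highest-scoring keys; all other scores are masked to
    -inf before the softmax. q, k, v have shape (n_tokens, d).
    Hypothetical helper, not the paper's exact implementation."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)              # (n, n) dense scores
    # threshold each row at its k-th largest score
    kth = np.sort(scores, axis=-1)[:, -topk][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    # softmax over the surviving (sparse) entries only
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))
    attn = e / e.sum(axis=-1, keepdims=True)
    return attn @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))               # 8 tokens, 16 dims
out = topk_sparse_attention(x, x, x, topk=4)
print(out.shape)                               # (8, 16)
```

In the paper's setting, the segmentation map would additionally steer which tokens survive the masking; here the selection is purely score-based.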
Related papers
- IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions [26.09373405194564]
We present an efficient image processing transformer architecture with hierarchical attentions, called IPT-V2.
We adopt a focal context self-attention (FCSA) and a global grid self-attention (GGSA) to obtain adequate token interactions in local and global receptive fields.
Our proposed IPT-V2 achieves state-of-the-art results on various image processing tasks, covering denoising, deblurring, and deraining, and obtains a much better trade-off between performance and computational complexity than previous methods.
arXiv Detail & Related papers (2024-03-31T10:01:20Z)
- Image Reconstruction using Enhanced Vision Transformer [0.08594140167290097]
We propose a novel image reconstruction framework which can be used for tasks such as image denoising, deblurring or inpainting.
The model proposed in this project is based on Vision Transformer (ViT) that takes 2D images as input and outputs embeddings.
We incorporate four additional optimization techniques in the framework to improve the model reconstruction capability.
arXiv Detail & Related papers (2023-07-11T02:14:18Z)
- Image Deblurring by Exploring In-depth Properties of Transformer [86.7039249037193]
We leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing the performance measured by the quantitative metrics.
By comparing transformer features between the recovered image and the target, the pretrained transformer provides high-resolution, blur-sensitive semantic information.
One loss term regards the features as vectors and computes the discrepancy between representations extracted from the recovered and target images in Euclidean space.
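The Euclidean feature discrepancy described above can be sketched as a mean-squared distance between flattened feature tensors. This is a minimal numpy illustration under stated assumptions: the arrays stand in for token embeddings from a frozen pretrained ViT, which the real method would extract; shapes and the noise scale are illustrative.

```python
import numpy as np

def feature_l2_loss(feat_rec, feat_tgt):
    """Perceptual-style loss: flatten transformer features to vectors
    and take their mean squared Euclidean discrepancy. feat_* stand in
    for pretrained-ViT token embeddings of shape (n_tokens, d)."""
    diff = feat_rec.reshape(-1) - feat_tgt.reshape(-1)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(1)
f_target = rng.standard_normal((197, 768))    # ViT-Base-like token grid
f_recovered = f_target + 0.1 * rng.standard_normal((197, 768))
loss = feature_l2_loss(f_recovered, f_target)
print(round(loss, 4))                          # ≈ 0.01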
arXiv Detail & Related papers (2023-03-24T14:14:25Z)
- Recursive Generalization Transformer for Image Super-Resolution [108.67898547357127]
We propose the Recursive Generalization Transformer (RGT) for image SR, which can capture global spatial information and is suitable for high-resolution images.
We combine the RG-SA with local self-attention to enhance the exploitation of the global context.
Our RGT outperforms recent state-of-the-art methods quantitatively and qualitatively.
arXiv Detail & Related papers (2023-03-11T10:44:44Z)
- Modular Degradation Simulation and Restoration for Under-Display Camera [21.048590332029995]
Under-display camera (UDC) provides an elegant solution for full-screen smartphones.
UDC captured images suffer from severe degradation since sensors lie under the display.
We propose a modular network dubbed MPGNet trained using the generative adversarial network (GAN) framework for simulating UDC imaging.
arXiv Detail & Related papers (2022-09-23T07:36:07Z)
- UDC-UNet: Under-Display Camera Image Restoration via U-Shape Dynamic Network [13.406025621307132]
Under-Display Camera (UDC) has been widely exploited to help smartphones realize full screen display.
As the screen could inevitably affect the light propagation process, the images captured by the UDC system usually contain flare, haze, blur, and noise.
In this paper, we propose a new deep model, namely UDC-UNet, to address the UDC image restoration problem with the known Point Spread Function (PSF) in HDR scenes.
arXiv Detail & Related papers (2022-09-05T07:41:44Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
While convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data, Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z)
- Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)
- PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion [37.993611194758195]
We propose a Patch Pyramid Transformer (PPT) to address the issue of extracting semantic information from an image.
The experimental results demonstrate its superior performance against the state-of-the-art fusion approaches.
arXiv Detail & Related papers (2021-07-29T13:57:45Z)
- Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip Connection Network [80.67717076541956]
Under-Display Camera (UDC) systems provide a true bezel-less and notch-free viewing experience on smartphones.
In a typical UDC system, the pixel array attenuates and diffracts the incident light on the camera, resulting in significant image quality degradation.
In this work, we aim to analyze and tackle the aforementioned degradation problems.
arXiv Detail & Related papers (2021-04-19T18:41:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.