Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal
- URL: http://arxiv.org/abs/2207.07335v1
- Date: Fri, 15 Jul 2022 08:21:53 GMT
- Title: Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal
- Authors: Xuhao Jiang, Weimin Tan, Ri Cheng, Shili Zhou and Bo Yan
- Abstract summary: Under stereo settings, the performance of image JPEG artifacts removal can be further improved by exploiting the additional information provided by a second view.
We propose a novel parallax transformer network (PTNet) to integrate the information from stereo image pairs for stereo image JPEG artifacts removal.
- Score: 17.289890973937318
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Under stereo settings, the performance of image JPEG artifacts removal can be
further improved by exploiting the additional information provided by a second
view. However, incorporating this information for stereo image JPEG artifacts
removal is a huge challenge, since the existing compression artifacts make
pixel-level view alignment difficult. In this paper, we propose a novel
parallax transformer network (PTNet) to integrate the information from stereo
image pairs for stereo image JPEG artifacts removal. Specifically, a
well-designed symmetric bi-directional parallax transformer module is proposed
to match features with similar textures between different views instead of
pixel-level view alignment. Due to the issues of occlusions and boundaries, a
confidence-based cross-view fusion module is proposed to achieve better feature
fusion for both views, where the cross-view features are weighted with
confidence maps. In particular, we adopt a coarse-to-fine design for the
cross-view interaction, leading to better performance. Comprehensive
experimental results demonstrate that our PTNet effectively removes
compression artifacts and achieves superior performance compared with other
state-of-the-art methods.
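Below is a minimal PyTorch sketch of the two ideas the abstract describes: matching features with similar textures between views via row-wise cross-attention (rectified stereo correspondences lie on the same horizontal line), and fusing the cross-view features weighted by a confidence map. This is an illustrative sketch under assumptions, not the authors' PTNet implementation; the class and variable names (ParallaxCrossAttention, ConfidenceFusion, etc.) are hypothetical, and the confidence here is a simple attention-peakedness proxy rather than the paper's learned confidence maps.

```python
# Minimal sketch (not the paper's code) of cross-view matching and
# confidence-weighted fusion for rectified stereo features.
import torch
import torch.nn as nn


class ParallaxCrossAttention(nn.Module):
    """Match target-view features against source-view features along each row."""

    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, kernel_size=1)
        self.key = nn.Conv2d(channels, channels, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat_tgt, feat_src):
        # feat_*: (B, C, H, W). Attention is restricted to the same row,
        # since rectified stereo correspondences share a horizontal line.
        b, c, h, w = feat_tgt.shape
        q = self.query(feat_tgt).permute(0, 2, 3, 1).reshape(b * h, w, c)
        k = self.key(feat_src).permute(0, 2, 1, 3).reshape(b * h, c, w)
        v = self.value(feat_src).permute(0, 2, 3, 1).reshape(b * h, w, c)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)        # (B*H, W, W)
        warped = (attn @ v).reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Crude confidence proxy: how peaked each attention row is
        # (occluded or boundary pixels tend to have flat, low-peak rows).
        conf = attn.max(dim=-1).values.reshape(b, 1, h, w)
        return warped, conf


class ConfidenceFusion(nn.Module):
    """Fuse a view's own features with confidence-weighted cross-view features."""

    def __init__(self, channels):
        super().__init__()
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, feat_own, feat_cross, conf):
        return self.merge(torch.cat([feat_own, conf * feat_cross], dim=1))


# Symmetric usage: the same modules are applied in both directions.
left, right = torch.randn(1, 64, 32, 96), torch.randn(1, 64, 32, 96)
attend, fuse = ParallaxCrossAttention(64), ConfidenceFusion(64)
r2l, conf_l = attend(left, right)     # right-view info aligned to the left view
l2r, conf_r = attend(right, left)     # left-view info aligned to the right view
left_out = fuse(left, r2l, conf_l)    # (1, 64, 32, 96)
right_out = fuse(right, l2r, conf_r)  # (1, 64, 32, 96)
```

In the paper's coarse-to-fine design, this kind of cross-view interaction would be repeated at multiple feature resolutions; that aspect is omitted here for brevity.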
Related papers
- OAPT: Offset-Aware Partition Transformer for Double JPEG Artifacts Removal [11.880153842710776]
We propose an Offset-Aware Partition Transformer for double JPEG artifacts removal, termed OAPT.
Our analysis of double JPEG compression shows that it results in up to four distinct patterns within each 8x8 block (see the brief worked example after this list).
Our OAPT consists of two components: a compression offset predictor and an image reconstructor.
arXiv Detail & Related papers (2024-08-21T09:47:54Z)
- SGDFormer: One-stage Transformer-based Architecture for Cross-Spectral Stereo Image Guided Denoising [11.776198596143931]
We propose a one-stage transformer-based architecture, named SGDFormer, for cross-spectral Stereo image Guided Denoising.
Our transformer block contains a noise-robust cross-attention (NRCA) module and a spatially variant feature fusion (SVFF) module.
Thanks to the above design, our SGDFormer can restore artifact-free images with fine structures, and achieves state-of-the-art performance on various datasets.
arXiv Detail & Related papers (2024-03-30T12:55:19Z)
- Content-aware Masked Image Modeling Transformer for Stereo Image Compression [15.819672238043786]
We propose a stereo image compression framework, named CAMSIC.
CAMSIC transforms each image into a latent representation and employs a powerful decoder-free Transformer entropy model.
Experiments show that our framework achieves state-of-the-art rate-distortion performance on two stereo image datasets.
arXiv Detail & Related papers (2024-03-13T13:12:57Z)
- Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts (a minimal decomposition sketch appears after this list).
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
arXiv Detail & Related papers (2023-09-18T11:06:42Z)
- HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691]
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-09-11T05:17:55Z)
- Cross-View Hierarchy Network for Stereo Image Super-Resolution [14.574538513341277]
Stereo image super-resolution aims to improve the quality of high-resolution stereo image pairs by exploiting complementary information across views.
We propose a novel method, named Cross-View Hierarchy Network for Stereo Image Super-Resolution (CVHSSR).
CVHSSR achieves better stereo image super-resolution performance than other state-of-the-art methods while using fewer parameters.
arXiv Detail & Related papers (2023-04-13T03:11:30Z)
- Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image [141.10227079090419]
We propose a Multi-Projection Fusion and Refinement Network (MPFR-Net) to detect salient objects in 360° omnidirectional images.
MPFR-Net uses the equirectangular projection image and four corresponding cube-unfolding images as inputs.
Experimental results on two omnidirectional datasets demonstrate that the proposed approach outperforms the state-of-the-art methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-12-23T14:50:40Z)
- MAT: Mask-Aware Transformer for Large Hole Image Inpainting [79.67039090195527]
We present a novel model for large hole inpainting, which unifies the merits of transformers and convolutions.
Experiments demonstrate the state-of-the-art performance of the new model on multiple benchmark datasets.
arXiv Detail & Related papers (2022-03-29T06:36:17Z)
- Cross-View Panorama Image Synthesis [68.35351563852335]
We propose a novel adversarial feedback GAN framework named PanoGAN.
PanoGAN enables high-quality panorama image generation with more convincing details than state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-22T15:59:44Z)
- PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion [37.993611194758195]
We propose a Patch Pyramid Transformer (PPT) to address the issue of extracting semantic information from an image.
The experimental results demonstrate its superior performance against the state-of-the-art fusion approaches.
arXiv Detail & Related papers (2021-07-29T13:57:45Z)
- Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD).
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z)
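As referenced in the OAPT entry above, here is a small worked example of why a spatial offset between two JPEG compressions yields up to four distinct patterns per 8x8 block: each block of the second compression grid can straddle up to four blocks of the first grid. This only illustrates the counting argument and is not OAPT code; the function name and the specific offsets are chosen for the example.

```python
# Count how many blocks of the first JPEG grid cover one 8x8 block of the
# second grid, given a spatial offset (dx, dy) between the two compressions.
import numpy as np

def patterns_per_block(dx: int, dy: int) -> int:
    # dx, dy assumed to be in [0, 7].
    ys, xs = np.mgrid[0:8, 0:8]
    # Label each pixel by which block of the first (shifted) grid it falls in.
    labels = ((ys + dy) // 8) * 2 + ((xs + dx) // 8)
    return len(np.unique(labels))

print(patterns_per_block(0, 0))  # 1 -> grids aligned, a single pattern
print(patterns_per_block(3, 0))  # 2 -> horizontal offset only
print(patterns_per_block(3, 5))  # 4 -> offsets in both directions
```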
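The UFAFormer entry above mentions decomposing images into frequency sub-bands with the discrete wavelet transform. The snippet below is a generic, minimal illustration of that decomposition using PyWavelets; it is not UFAFormer code, and the Haar wavelet, single decomposition level, and random input are assumptions made only for this example.

```python
# Single-level 2D DWT: one low-frequency approximation plus three
# high-frequency detail sub-bands (horizontal, vertical, diagonal),
# where subtle high-frequency artifacts are easier to isolate.
import numpy as np
import pywt

image = np.random.rand(64, 64)                 # stand-in for a grayscale crop
cA, (cH, cV, cD) = pywt.dwt2(image, 'haar')    # approximation + detail bands
print(cA.shape, cH.shape, cV.shape, cD.shape)  # each sub-band is (32, 32)
```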
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.