Thinking inside the Convolution for Image Inpainting: Reconstructing Texture via Structure under Global and Local Side
- URL: http://arxiv.org/abs/2602.03013v1
- Date: Tue, 03 Feb 2026 02:30:07 GMT
- Title: Thinking inside the Convolution for Image Inpainting: Reconstructing Texture via Structure under Global and Local Side
- Authors: Haipeng Liu, Yang Wang, Biao Qian, Yong Rui, Meng Wang
- Abstract summary: Image inpainting has made substantial progress owing to the encoder-decoder pipeline. Recent studies intuitively identify the high-frequency structure and low-frequency texture to be extracted by CNNs from the encoder. In this paper, we systematically answer whether and how the structure and texture feature maps can mutually help to alleviate the information loss during convolutional downsampling.
- Score: 19.6934948102931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image inpainting has made substantial progress owing to the encoder-decoder pipeline. The encoder uses Convolutional Neural Networks (CNNs) with convolutional downsampling to semantically inpaint the masked regions from the known regions, and the decoder applies an upsampling process to produce the final inpainting output. Recent studies intuitively identify the high-frequency structure and low-frequency texture to be extracted by CNNs from the encoder, which are subsequently used for a desirable upsampling recovery. However, existing methods overlook the information loss that both structure and texture feature maps suffer during convolutional downsampling, and hence produce a non-ideal upsampling output. In this paper, we systematically answer whether and how the structure and texture feature maps can mutually help to alleviate the information loss during convolutional downsampling. Given the structure and texture feature maps, we adopt a statistical normalization and denormalization strategy for reconstruction guidance during the convolutional downsampling process. Extensive experimental results validate its advantages over state-of-the-art methods on images from low to high resolution, including 256*256 and 512*512, an advantage that holds in particular when all the encoders are substituted by ours. Our code is available at https://github.com/htyjers/ConvInpaint-TSGL
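The abstract names a statistical normalization and denormalization strategy but gives no formula. As a rough illustration only, the sketch below assumes the common per-channel mean and standard deviation form: one feature map is normalized to zero mean and unit variance, then rescaled with the statistics of a guiding feature map. The function names, the `(C, H, W)` layout, and the choice of which map guides which are assumptions for illustration, not the paper's actual design.

```python
import numpy as np

def channel_stats(feat, eps=1e-5):
    """Per-channel mean and std of a (C, H, W) feature map."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = np.sqrt(feat.var(axis=(1, 2), keepdims=True) + eps)
    return mean, std

def normalize(feat, eps=1e-5):
    """Remove a feature map's own per-channel statistics."""
    mean, std = channel_stats(feat, eps)
    return (feat - mean) / std

def denormalize(normalized, guide, eps=1e-5):
    """Re-inject the per-channel statistics of a guiding feature map."""
    mean, std = channel_stats(guide, eps)
    return normalized * std + mean

# Hypothetical feature maps standing in for texture and structure features.
rng = np.random.default_rng(0)
texture = rng.normal(2.0, 3.0, size=(4, 8, 8))
structure = rng.normal(-1.0, 0.5, size=(4, 8, 8))

# Texture content carrying the structure map's channel statistics.
guided = denormalize(normalize(texture), structure)
```

After this step, `guided` keeps the spatial pattern of `texture` while its per-channel mean and standard deviation match those of `structure`, up to the small `eps` term.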
Related papers
- Neural Graphics Texture Compression Supporting Random Access [34.974631096947284]
We introduce a novel approach to texture set compression that integrates traditional GPU texture representation and NIC techniques.
We propose an asymmetric auto-encoder framework that employs a convolutional encoder to capture detailed information in a bottleneck-latent space.
Experimental results demonstrate that this approach provides much better results than conventional texture compression.
arXiv Detail & Related papers (2024-05-06T19:44:13Z)
- Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering [47.78392889256976]
Paint-it is a text-driven high-fidelity texture map synthesis method for 3D rendering.
Paint-it synthesizes texture maps from a text description by synthesis-through-optimization, exploiting Score-Distillation Sampling (SDS).
We show that DC-PBR inherently schedules the optimization curriculum according to texture frequency and naturally filters out the noisy signals from SDS.
arXiv Detail & Related papers (2023-12-18T17:17:08Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers [3.8831062015253055]
We introduce a Single-Image Super-Resolution (SISR) approach to enhance the detection of structural and textural features in surveillance images.
Our approach incorporates sub-pixel convolution layers and a loss function that uses an Optical Character Recognition (OCR) model for feature extraction.
Our results show that our approach for reconstructing these low-resolution synthesized images outperforms existing ones in both quantitative and qualitative measures.
arXiv Detail & Related papers (2023-05-27T00:17:19Z)
- CoordFill: Efficient High-Resolution Image Inpainting via Parameterized Coordinate Querying [52.91778151771145]
In this paper, we try to break the limitations for the first time thanks to the recent development of continuous implicit representation.
Experiments show that the proposed method achieves real-time performance on 2048$\times$2048 images using a single GTX 2080 Ti GPU.
arXiv Detail & Related papers (2023-03-15T11:13:51Z)
- RADAM: Texture Recognition through Randomized Aggregated Encoding of Deep Activation Maps [12.35895883721336]
We propose a new method named Random encoding of Aggregated Deep Activation Maps (RADAM).
RADAM extracts rich texture representations without ever changing the backbone.
Our results suggest that pre-trained backbones may not require additional fine-tuning for texture recognition if their learned representations are better encoded.
arXiv Detail & Related papers (2023-03-08T13:09:03Z)
- The Devil Is in the Details: Window-based Attention for Image Compression [58.1577742463617]
Most existing learned image compression models are based on Convolutional Neural Networks (CNNs).
In this paper, we study the effects of multiple kinds of attention mechanisms on local feature learning, then introduce a more straightforward yet effective window-based local attention block.
The proposed window-based attention is flexible and can work as a plug-and-play component to enhance CNN and Transformer models.
arXiv Detail & Related papers (2022-03-16T07:55:49Z)
- SDWNet: A Straight Dilated Network with Wavelet Transformation for Image Deblurring [23.86692375792203]
Image deblurring is a computer vision problem that aims to recover a sharp image from a blurred image.
Our model uses dilated convolution to obtain a large receptive field with high spatial resolution.
We propose a novel module using the wavelet transform, which effectively helps the network to recover clear high-frequency texture details.
arXiv Detail & Related papers (2021-10-12T07:58:10Z)
- Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
- Over-and-Under Complete Convolutional RNN for MRI Reconstruction [57.95363471940937]
Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture.
We propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network (CRNN).
The proposed method achieves significant improvements over compressed sensing and popular deep learning-based methods with fewer trainable parameters.
arXiv Detail & Related papers (2021-06-16T15:56:34Z)
- Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operators for digital images.
We propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem.
arXiv Detail & Related papers (2020-06-22T04:04:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.