CR-Fill: Generative Image Inpainting with Auxiliary Contextual
Reconstruction
- URL: http://arxiv.org/abs/2011.12836v2
- Date: Wed, 31 Mar 2021 11:47:51 GMT
- Title: CR-Fill: Generative Image Inpainting with Auxiliary Contextual
Reconstruction
- Authors: Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel
- Abstract summary: We propose to teach such patch-borrowing behavior to an attention-free generator by joint training of an auxiliary contextual reconstruction task.
The auxiliary branch can be seen as a learnable loss function, where the query-reference feature similarity and a reference-based reconstructor are jointly optimized with the inpainting generator.
Experimental results demonstrate that the proposed inpainting model compares favourably against the state-of-the-art in terms of quantitative and visual performance.
- Score: 143.7271816543372
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent deep generative inpainting methods use attention layers to allow the
generator to explicitly borrow feature patches from the known region to
complete a missing region. Because there is no supervision signal for the
correspondence between missing and known regions, the attention layer may fail
to find proper reference features, which often leads to artifacts in the
results. Moreover, computing pair-wise similarity across the entire feature map
at inference time brings significant computational overhead. To address these
issues, we
propose to teach such patch-borrowing behavior to an attention-free generator
by joint training of an auxiliary contextual reconstruction task, which
encourages the generated output to be plausible even when reconstructed by
surrounding regions. The auxiliary branch can be seen as a learnable loss
function, termed the contextual reconstruction (CR) loss, in which the
query-reference feature similarity and a reference-based reconstructor are
jointly optimized with the inpainting generator. The auxiliary branch (i.e.,
the CR loss) is needed only during training; only the inpainting generator is
required at inference time. Experimental results demonstrate that the
proposed inpainting model compares favourably against the state-of-the-art in
terms of quantitative and visual performance.
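To make the mechanism concrete, here is a minimal sketch of a contextual-reconstruction-style loss, assuming PyTorch. The function name, the softmax-attention formulation, and the use of ground-truth encoder features in place of the paper's reference-based reconstructor are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def cr_loss(feat, mask, target_feat):
    """feat: generator features (B, C, H, W); mask: (B, 1, H, W), 1 = missing;
    target_feat: features of the ground-truth image from the same encoder."""
    b, c, h, w = feat.shape
    f = F.normalize(feat.flatten(2), dim=1)         # (B, C, HW), unit norm per location
    m = mask.flatten(2)                             # (B, 1, HW)
    sim = torch.bmm(f.transpose(1, 2), f)           # (B, HW, HW) query-reference similarity
    sim = sim.masked_fill(m.bool(), float("-inf"))  # references come from the known region only
    attn = F.softmax(sim, dim=-1)                   # each query borrows from known locations
    recon = torch.bmm(feat.flatten(2), attn.transpose(1, 2)).view(b, c, h, w)
    # the generator is penalized when features reconstructed purely from the
    # known region drift away from the ground-truth features inside the hole
    return F.l1_loss(recon * mask, target_feat * mask)

Because both the similarity and the reconstruction are differentiable, gradients flow back into the generator; this is what lets the auxiliary branch act as a learnable loss and be dropped entirely at inference time.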
Related papers
- Freqformer: Image-Demoiréing Transformer via Efficient Frequency Decomposition [83.40450475728792]
We present Freqformer, a Transformer-based framework specifically designed for image demoiréing through targeted frequency separation.
Our method performs an effective frequency decomposition that explicitly splits moiré patterns into high-frequency, spatially-localized textures and low-frequency, scale-robust color distortions.
Experiments on various demoiréing benchmarks demonstrate that Freqformer achieves state-of-the-art performance with a compact model size.
arXiv Detail & Related papers (2025-05-25T12:23:10Z)
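As a rough illustration of the frequency-separation idea (a generic sketch, not Freqformer's actual decomposition), a Gaussian low-pass can isolate the slowly-varying color component while the residual carries the high-frequency, spatially-localized texture:

import torch
import torch.nn.functional as F

def split_frequencies(img, kernel_size=21, sigma=5.0):
    """img: (B, C, H, W). Returns (low_freq, high_freq) with img == low + high."""
    c = img.shape[1]
    coords = torch.arange(kernel_size, dtype=img.dtype, device=img.device) - kernel_size // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    kx = g.view(1, 1, 1, -1).expand(c, 1, 1, kernel_size)  # separable horizontal pass
    ky = g.view(1, 1, -1, 1).expand(c, 1, kernel_size, 1)  # separable vertical pass
    pad = kernel_size // 2
    low = F.conv2d(F.pad(img, (pad, pad, 0, 0), mode="reflect"), kx, groups=c)
    low = F.conv2d(F.pad(low, (0, 0, pad, pad), mode="reflect"), ky, groups=c)
    return low, img - low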
- DGTR: Distributed Gaussian Turbo-Reconstruction for Sparse-View Vast Scenes [81.56206845824572]
Novel-view synthesis (NVS) approaches play a critical role in vast scene reconstruction.
Few-shot methods often struggle with poor reconstruction quality in vast environments.
This paper presents DGTR, a novel distributed framework for efficient Gaussian reconstruction for sparse-view vast scenes.
arXiv Detail & Related papers (2024-11-19T07:51:44Z)
- Context Enhancement with Reconstruction as Sequence for Unified Unsupervised Anomaly Detection [68.74469657656822]
Unsupervised anomaly detection (AD) aims to train robust detection models using only normal samples.
Recent research focuses on a unified unsupervised AD setting in which only one model is trained for all classes.
We introduce a novel Reconstruction as Sequence (RAS) method, which enhances the contextual correspondence during feature reconstruction.
arXiv Detail & Related papers (2024-09-10T07:37:58Z)
- Multi-feature Reconstruction Network using Crossed-mask Restoration for Unsupervised Industrial Anomaly Detection [4.742650815342744]
Unsupervised anomaly detection is of great significance for quality inspection in industrial manufacturing.
In this paper, we propose a multi-feature reconstruction network, MFRNet, using crossed-mask restoration.
Our method is highly competitive with, or significantly outperforms, other state-of-the-art methods on four publicly available datasets and one self-made dataset.
arXiv Detail & Related papers (2024-04-20T05:13:56Z)
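The crossed-mask idea can be sketched as follows (a minimal illustration assuming PyTorch; the checkerboard pattern, the patch size, and `restorer` are hypothetical stand-ins, not MFRNet's design): two complementary masks hide disjoint regions, each masked view is restored, and the predictions are fused so that every pixel is reconstructed from context only.

import torch

def crossed_mask_restore(restorer, x, patch=16):
    """x: (B, C, H, W); restorer: any image-to-image network."""
    b, c, h, w = x.shape
    gy = torch.arange(h, device=x.device) // patch
    gx = torch.arange(w, device=x.device) // patch
    m = ((gy[:, None] + gx[None, :]) % 2).to(x.dtype).view(1, 1, h, w)
    r1 = restorer(x * m)        # must fill in the pixels hidden by (1 - m)
    r2 = restorer(x * (1 - m))  # must fill in the pixels hidden by m
    # keep each prediction only where its input was masked out; comparing the
    # fused restoration with x then yields a per-pixel anomaly score
    return r1 * (1 - m) + r2 * m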
- UGMAE: A Unified Framework for Graph Masked Autoencoders [67.75493040186859]
We propose UGMAE, a unified framework for graph masked autoencoders.
We first develop an adaptive feature mask generator to account for the unique significance of nodes.
We then design a ranking-based structure reconstruction objective, jointly with feature reconstruction, to capture holistic graph information.
arXiv Detail & Related papers (2024-02-12T19:39:26Z)
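A minimal sketch of the adaptive-masking step (assuming PyTorch and dense tensors; the sampling scheme and zero token are illustrative, not UGMAE's released code):

import torch

def adaptive_mask(x, scores, ratio=0.5):
    """x: (N, D) node features; scores: (N,) learned node-significance logits.
    Samples the masked set in proportion to significance instead of uniformly."""
    n = x.shape[0]
    k = int(ratio * n)
    idx = torch.multinomial(torch.softmax(scores, dim=0), k, replacement=False)
    mask = torch.zeros(n, dtype=torch.bool, device=x.device)
    mask[idx] = True
    x_masked = x.clone()
    x_masked[mask] = 0.0  # masked node features replaced by a zero token
    return x_masked, mask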
- Panoramic Image Inpainting With Gated Convolution And Contextual Reconstruction Loss [19.659176149635417]
We propose a panoramic image inpainting framework that consists of a Face Generator, a Cube Generator, a side branch, and two discriminators.
The proposed method is compared with state-of-the-art (SOTA) methods on the SUN360 Street View dataset in terms of PSNR and SSIM.
arXiv Detail & Related papers (2024-02-05T11:58:08Z)
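For reference, the PSNR used in such comparisons is 10 * log10(MAX^2 / MSE); a minimal sketch assuming inputs in [0, 1]:

import torch

def psnr(pred, target, max_val=1.0):
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)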
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to Stable Diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Decomposing and Coupling Saliency Map for Lesion Segmentation in Ultrasound Images [10.423431415758655]
The complex scenario of ultrasound images, in which adjacent tissues share similar intensity with the lesion and even contain richer texture patterns, poses a unique challenge for accurate lesion segmentation.
This work presents a decomposition-coupling network, called DC-Net, to deal with this challenge in a (foreground-background) saliency map disentanglement-fusion manner.
The proposed method is evaluated on two ultrasound lesion segmentation tasks, which demonstrates the remarkable performance improvement over existing state-of-the-art methods.
arXiv Detail & Related papers (2023-08-02T05:02:30Z)
- BGaitR-Net: Occluded Gait Sequence Reconstruction with temporally constrained model for gait recognition [1.151614782416873]
We develop novel deep learning-based algorithms to identify occluded frames in an input sequence.
We then reconstruct these frames by exploiting the temporal information present in the gait sequence.
Our LSTM-based model reconstructs occlusion and generates frames that are temporally consistent with the periodic pattern of a gait cycle.
arXiv Detail & Related papers (2021-10-18T18:28:18Z)
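A compact sketch of filling occluded frames from temporal context with an LSTM (dimensions and modules assumed for illustration; this is not BGaitR-Net's architecture):

import torch
import torch.nn as nn

class SeqReconstructor(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, dim)

    def forward(self, frames, occluded):
        """frames: (B, T, D) per-frame embeddings; occluded: (B, T) bool mask."""
        x = frames.masked_fill(occluded.unsqueeze(-1), 0.0)  # blank occluded frames
        h, _ = self.lstm(x)
        rec = self.head(h)
        # keep observed frames as-is, fill occluded ones from temporal context
        return torch.where(occluded.unsqueeze(-1), rec, frames)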
- Pixel-wise Dense Detector for Image Inpainting [34.721991959357425]
Recent GAN-based image inpainting approaches adopt an averaging strategy to discriminate the generated image and output a single scalar.
We propose a novel detection-based generative framework for image inpainting, which adopts the min-max strategy in an adversarial process.
Experiments on multiple public datasets show the superior performance of the proposed framework.
arXiv Detail & Related papers (2020-11-04T13:45:27Z)
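The contrast with a scalar discriminator can be sketched as a PatchGAN-style network that emits one real/fake logit per spatial location (layer sizes assumed for illustration, not the paper's architecture):

import torch
import torch.nn as nn

class DenseDiscriminator(nn.Module):
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 3, padding=1),  # one logit per location
        )

    def forward(self, x):
        return self.net(x)  # (B, 1, H/4, W/4) dense map rather than a single scalar

A per-pixel adversarial loss over this map gives the generator spatially localized feedback instead of one averaged signal per image.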
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
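A densely connected dilated-convolution block of the kind described can be sketched as follows (channel widths and dilation rates are assumptions for illustration, not the paper's exact configuration):

import torch
import torch.nn as nn

class DenseDilatedBlock(nn.Module):
    def __init__(self, ch=64, rates=(1, 2, 4, 8)):
        super().__init__()
        # conv i sees the concatenation of the input and all previous outputs,
        # so the effective receptive field grows rapidly with depth
        self.convs = nn.ModuleList(
            nn.Conv2d(ch * (i + 1), ch, 3, padding=r, dilation=r)
            for i, r in enumerate(rates)
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(self.act(conv(torch.cat(feats, dim=1))))
        return feats[-1]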