Related papers: PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement

URL: http://arxiv.org/abs/2512.03247v1
Date: Tue, 02 Dec 2025 21:35:57 GMT
Title: PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement
Authors: Haitian Zheng, Yuan Yao, Yongsheng Yu, Yuqian Zhou, Jiebo Luo, Zhe Lin,
Abstract summary: PixPerfect is a pixel-level refinement framework that delivers seamless, high-fidelity local edits across diverse LDM architectures and tasks. Experiments on inpainting, object removal, and insertion benchmarks demonstrate that PixPerfect substantially enhances perceptual fidelity and downstream editing performance.
Score: 52.21370023312275
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Latent Diffusion Models (LDMs) have markedly advanced the quality of image inpainting and local editing. However, the inherent latent compression often introduces pixel-level inconsistencies, such as chromatic shifts, texture mismatches, and visible seams along editing boundaries. Existing remedies, including background-conditioned latent decoding and pixel-space harmonization, usually fail to fully eliminate these artifacts in practice and do not generalize well across different latent representations or tasks. We introduce PixPerfect, a pixel-level refinement framework that delivers seamless, high-fidelity local edits across diverse LDM architectures and tasks. PixPerfect leverages (i) a differentiable discriminative pixel space that amplifies and suppresses subtle color and texture discrepancies, (ii) a comprehensive artifact simulation pipeline that exposes the refiner to realistic local editing artifacts during training, and (iii) a direct pixel-space refinement scheme that ensures broad applicability across diverse latent representations and tasks. Extensive experiments on inpainting, object removal, and insertion benchmarks demonstrate that PixPerfect substantially enhances perceptual fidelity and downstream editing performance, establishing a new standard for robust and high-fidelity localized image editing.

Related papers

Edge-Aware Image Manipulation via Diffusion Models with a Novel Structure-Preservation Loss [32.26030534230571]
We propose a novel Structure Preservation Loss (SPL) to quantify structural differences between input and edited images. Our training-free approach integrates SPL directly into the diffusion model's generative process to ensure structural fidelity. Experiments confirm SPL enhances structural fidelity, delivering state-of-the-art performance in latent-diffusion-based image editing.
arXiv Detail & Related papers (2026-01-23T11:06:51Z)
Local-Global Context-Aware and Structure-Preserving Image Super-Resolution [23.87231269881077]
Pretrained text-to-image models, such as Stable Diffusion, have exhibited strong capabilities in synthesizing realistic image content. We propose a contextually precise image super-resolution framework that effectively maintains both local and global pixel relationships.
arXiv Detail & Related papers (2025-10-11T07:17:31Z)
DiffTex: Differentiable Texturing for Architectural Proxy Models [63.370581207280004]
We propose an automated method for generating realistic texture maps for architectural proxy models at the texel level from unordered photographs. Our approach establishes correspondences between texels on a UV map and pixels in the input images, with each texel's color computed as a weighted blend of associated pixel values.
arXiv Detail & Related papers (2025-09-27T14:39:53Z)
IntrinsicEdit: Precise generative image manipulation in intrinsic space [53.404235331886255]
We introduce a versatile, generative workflow that operates in an intrinsic-image latent space. We address key challenges of identity preservation and intrinsic-channel entanglement. We enable precise, efficient editing with automatic resolution of global illumination effects.
arXiv Detail & Related papers (2025-05-13T18:24:15Z)
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models [73.34674816016211]
Edify Image is a family of diffusion models capable of generating photorealistic image content with pixel-perfect accuracy. Edify Image supports a wide range of applications, including text-to-image synthesis, 4K upsampling, ControlNets, 360 HDR panorama generation, and finetuning for image customization.
arXiv Detail & Related papers (2024-11-11T16:58:31Z)
Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization. This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image Editing [9.333499287832202]
PFB-Diff is a Progressive Feature Blending method for Diffusion-based image editing. PFB-Diff seamlessly integrates text-guided generated content into the target image through multi-level feature blending. Our method demonstrates its superior performance in terms of editing accuracy and image quality without the need for fine-tuning or training.
arXiv Detail & Related papers (2023-06-28T11:10:20Z)
Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations. We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.