DreamPainter: Image Background Inpainting for E-commerce Scenarios
- URL: http://arxiv.org/abs/2508.02155v1
- Date: Mon, 04 Aug 2025 07:54:37 GMT
- Title: DreamPainter: Image Background Inpainting for E-commerce Scenarios
- Authors: Sijie Zhao, Jing Cheng, Yaoyao Wu, Hao Xu, Shaohui Jiao
- Abstract summary: We introduce DreamPainter, a novel framework that incorporates text prompts for control and reference image information as an additional control signal. Our approach significantly outperforms state-of-the-art methods, maintaining high product consistency while effectively integrating both text prompt and reference image information.
- Score: 9.12444106077783
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although diffusion-based image generation has been widely explored and applied, background generation tasks in e-commerce scenarios still face significant challenges. The first challenge is to ensure that the generated products are consistent with the given product inputs while maintaining a reasonable spatial arrangement, harmonious shadows, and reflections between foreground products and backgrounds. Existing inpainting methods fail to address this due to the lack of domain-specific data. The second challenge involves the limitation of relying solely on text prompts for image control, as effectively integrating visual information to achieve precise control in inpainting tasks remains underexplored. To address these challenges, we introduce DreamEcom-400K, a high-quality e-commerce dataset containing accurate product instance masks, background reference images, text prompts, and aesthetically pleasing product images. Based on this dataset, we propose DreamPainter, a novel framework that not only utilizes text prompts for control but also flexibly incorporates reference image information as an additional control signal. Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art methods, maintaining high product consistency while effectively integrating both text prompt and reference image information.
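As a rough illustration of the control setup DreamPainter targets (text-prompted background inpainting plus a reference image as an extra visual signal), the minimal sketch below wires an off-the-shelf inpainting pipeline to an IP-Adapter in diffusers. This is not the paper's implementation; the model IDs, adapter weights, and file names are illustrative assumptions.

```python
# A minimal sketch, assuming diffusers with IP-Adapter support: text-prompted
# background inpainting with a reference image as an extra control signal.
# This approximates the control setup described above and is NOT DreamPainter
# itself; model IDs, adapter weights, and file names are assumptions.
import torch
from PIL import Image
from diffusers import AutoPipelineForInpainting

pipe = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
)
pipe.set_ip_adapter_scale(0.6)  # weight of the reference-image signal

product = Image.open("product.png").convert("RGB")      # foreground product shot
mask = Image.open("background_mask.png").convert("L")   # white = background to repaint
reference = Image.open("reference_scene.png").convert("RGB")

result = pipe(
    prompt="product on a marble countertop, soft studio lighting",
    image=product,
    mask_image=mask,
    ip_adapter_image=reference,  # visual control beyond the text prompt
).images[0]
result.save("composited.png")
```

A purpose-built system like DreamPainter learns its reference conditioning on domain data (DreamEcom-400K); the generic adapter here only approximates that kind of control.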
Related papers
- MagicEraser: Erasing Any Objects via Semantics-Aware Control [40.683569840182926]
We introduce MagicEraser, a diffusion model-based framework tailored for the object erasure task.
MagicEraser achieves fine and effective control of content generation while mitigating undesired artifacts.
arXiv Detail & Related papers (2024-10-14T07:03:14Z)
- E-Commerce Inpainting with Mask Guidance in Controlnet for Reducing Overcompletion [13.67619785783182]
This paper systematically analyzes and addresses a core pain point in diffusion model generation: overcompletion.
Our method has achieved promising results in practical applications, and we hope this work can serve as an inspiring technical report for the field.
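For readers unfamiliar with the setup, a mask-guided ControlNet inpainting call looks roughly like the sketch below. It follows the public diffusers recipe rather than the authors' code; model IDs and file names are assumptions.

```python
# Hedged sketch of mask-guided ControlNet inpainting with diffusers, in the
# spirit of the approach above; not the authors' implementation.
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline

def make_inpaint_condition(image: Image.Image, mask: Image.Image) -> torch.Tensor:
    # Standard helper from the diffusers docs: mark pixels to repaint with -1
    # so the inpaint ControlNet knows which region is masked.
    img = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    m = np.array(mask.convert("L")).astype(np.float32) / 255.0
    img[m > 0.5] = -1.0
    return torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

product = Image.open("product.png").convert("RGB")   # hypothetical input
mask = Image.open("mask.png").convert("L")           # white = background to regenerate

result = pipe(
    prompt="clean studio background with soft shadows",
    image=product,
    mask_image=mask,
    control_image=make_inpaint_condition(product, mask),
).images[0]
result.save("inpainted.png")
```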
arXiv Detail & Related papers (2024-09-15T10:10:13Z)
- Improving Text-guided Object Inpainting with Semantic Pre-inpainting [95.17396565347936]
We decompose the typical single-stage object inpainting into two cascaded processes: semantic pre-inpainting and high-fidelity object generation.
To achieve this, we cascade a Transformer-based semantic inpainter and an object inpainting diffusion model, leading to a novel CAscaded Transformer-Diffusion framework.
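A conceptual sketch of such a cascade, with placeholder modules standing in for the Transformer-based semantic inpainter and the diffusion inpainter, might look like the following; the shapes and internals are invented purely for illustration.

```python
# Conceptual sketch of the two-stage cascade described above: stage 1 predicts
# semantics for the masked region, stage 2 generates pixels conditioned on
# them. Module internals are placeholders, not the paper's architecture.
import torch
import torch.nn as nn

class SemanticInpainter(nn.Module):
    """Stage 1: predict semantic features for the masked region (stub)."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.net = nn.Conv2d(4, dim, 3, padding=1)  # RGB + mask -> semantics

    def forward(self, image, mask):
        return self.net(torch.cat([image, mask], dim=1))

class ObjectInpaintingDiffusion(nn.Module):
    """Stage 2: denoise conditioned on the predicted semantics (stub)."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.net = nn.Conv2d(3 + dim, 3, 3, padding=1)

    def forward(self, noisy_image, semantics):
        return self.net(torch.cat([noisy_image, semantics], dim=1))

image = torch.randn(1, 3, 64, 64)
mask = torch.ones(1, 1, 64, 64)   # 1 = region to inpaint
semantics = SemanticInpainter()(image, mask)          # stage 1
denoised = ObjectInpaintingDiffusion()(torch.randn(1, 3, 64, 64), semantics)  # stage 2
```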
arXiv Detail & Related papers (2024-09-12T17:55:37Z)
- Visual Text Generation in the Wild [67.37458807253064]
We propose a visual text generator (termed SceneVTG) which can produce high-quality text images in the wild.
The proposed SceneVTG significantly outperforms traditional rendering-based methods and recent diffusion-based methods in terms of fidelity and reasonability.
The generated images provide superior utility for tasks involving text detection and text recognition.
arXiv Detail & Related papers (2024-07-19T09:08:20Z)
- RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting [63.567363455092234]
RefFusion is a novel 3D inpainting method based on a multi-scale personalization of an image inpainting diffusion model to the given reference view.
Our framework achieves state-of-the-art results for object removal while maintaining high controllability.
arXiv Detail & Related papers (2024-04-16T17:50:02Z)
- Locate, Assign, Refine: Taming Customized Promptable Image Inpainting [22.163855501668206]
We introduce the multimodal promptable image inpainting project: a new task, model, and data for taming customized image inpainting.
We propose LAR-Gen, a novel approach for image inpainting that enables seamless inpainting of specific regions in an image corresponding to the mask prompt.
Our LAR-Gen adopts a coarse-to-fine manner to ensure context consistency with the source image, subject identity consistency, local semantic consistency with the text description, and smoothness consistency.
arXiv Detail & Related papers (2024-03-28T16:07:55Z)
- SPIRE: Semantic Prompt-Driven Image Restoration [66.26165625929747]
We develop SPIRE, a Semantic and restoration Prompt-driven Image Restoration framework.
Our approach is the first framework that supports fine-level instruction through language-based quantitative specification of the restoration strength.
Our experiments demonstrate the superior restoration performance of SPIRE compared to the state of the art.
arXiv Detail & Related papers (2023-12-18T17:02:30Z)
- DreamInpainter: Text-Guided Subject-Driven Image Inpainting with Diffusion Models [37.133727797607676]
This study introduces Text-Guided Subject-Driven Image Inpainting.
We compute dense subject features to ensure accurate subject replication.
We employ a discriminative token selection module to eliminate redundant subject details.
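A discriminative token selection module of this kind can be approximated by scoring each subject token and keeping only the top-k, discarding redundant detail. The sketch below is a generic PyTorch rendition under assumed shapes, not the paper's exact module.

```python
# Minimal sketch of discriminative token selection: score each dense subject
# token and keep the top-k. Shapes are hypothetical assumptions.
import torch
import torch.nn as nn

class TokenSelector(nn.Module):
    def __init__(self, dim: int, keep: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # per-token importance score
        self.keep = keep

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, dim) dense subject features
        scores = self.score(tokens).squeeze(-1)        # (batch, num_tokens)
        idx = scores.topk(self.keep, dim=1).indices    # top-k per sample
        idx = idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
        return tokens.gather(1, idx)                   # (batch, keep, dim)

subject_tokens = torch.randn(2, 257, 768)  # e.g. ViT patch tokens of the subject
selected = TokenSelector(dim=768, keep=64)(subject_tokens)
print(selected.shape)  # torch.Size([2, 64, 768])
```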
arXiv Detail & Related papers (2023-12-05T22:23:19Z)
- LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts [60.54912319612113]
Diffusion-based generative models have significantly advanced text-to-image generation but encounter challenges when processing lengthy and intricate text prompts.
We present a novel approach leveraging Large Language Models (LLMs) to extract critical components from text prompts.
Our evaluation on complex prompts featuring multiple objects demonstrates a substantial improvement in recall compared to baseline diffusion models.
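The core idea, using an LLM to parse a long prompt into per-object descriptions and a rough layout before generation, can be sketched as follows. The prompt template, JSON schema, and model choice are assumptions for illustration, not the paper's exact blueprint.

```python
# Illustrative sketch: ask an LLM to decompose a long prompt into objects
# plus rough layout, then drive generation per region. The schema and model
# are assumptions; assumes the model returns valid JSON.
import json
from openai import OpenAI  # any chat-completion client would do

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

long_prompt = (
    "A cozy cabin interior with a red armchair by the window, a sleeping "
    "cat on a wool rug, and a steaming mug on a wooden side table."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{
        "role": "user",
        "content": (
            "Extract every object from this scene description. Return JSON: "
            '[{"object": ..., "description": ..., "bbox": [x0,y0,x1,y1]}] '
            "with bbox in relative [0,1] coordinates.\n\n" + long_prompt
        ),
    }],
)
blueprint = json.loads(response.choices[0].message.content)
for item in blueprint:
    print(item["object"], item["bbox"])  # feed to region-wise generation
```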
arXiv Detail & Related papers (2023-10-16T17:57:37Z)
- Context-Aware Image Inpainting with Learned Semantic Priors [100.99543516733341]
We introduce pretext tasks that are semantically meaningful for estimating the missing content.
We propose a context-aware image inpainting model, which adaptively integrates global semantics and local features.
arXiv Detail & Related papers (2021-06-14T08:09:43Z)
- PerceptionGAN: Real-world Image Construction from Provided Text through Perceptual Understanding [11.985768957782641]
We propose a method to improve the realism of generated images by incorporating perceptual understanding into the discriminator module.
We show that the perceptual information included in the initial image is improved while modeling image distribution at multiple stages.
More importantly, the proposed method can be integrated into the pipeline of other state-of-the-art text-based-image-generation models.
arXiv Detail & Related papers (2020-07-02T09:23:08Z)
- Very Long Natural Scenery Image Prediction by Outpainting [96.8509015981031]
Outpainting has received less attention than inpainting due to two challenges.
The first challenge is keeping spatial and content consistency between the generated images and the original input.
The second challenge is maintaining high quality in the generated results.
arXiv Detail & Related papers (2019-12-29T16:29:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.