Related papers: MRStyle: A Unified Framework for Color Style Transfer with Multi-Modality Reference

URL: http://arxiv.org/abs/2409.05250v1
Date: Mon, 9 Sep 2024 00:01:48 GMT
Title: MRStyle: A Unified Framework for Color Style Transfer with Multi-Modality Reference
Authors: Jiancheng Huang, Yu Gao, Zequn Jie, Yujie Zhong, Xintong Han, Lin Ma
Abstract summary: We introduce MRStyle, a framework that enables color style transfer using multi-modality reference, including image and text. For text reference, we align the text feature of stable diffusion priors with the style feature of our IRStyle to perform text-guided color style transfer (TRStyle). Our TRStyle method is highly efficient in both training and inference, producing notable open-set text-guided transfer results.
Score: 32.64957647390327
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we introduce MRStyle, a comprehensive framework that enables color style transfer using multi-modality reference, including image and text. To achieve a unified style feature space for both modalities, we first develop a neural network called IRStyle, which generates stylized 3D lookup tables for image reference. This is accomplished by integrating an interaction dual-mapping network with a combined supervised learning pipeline, resulting in three key benefits: elimination of visual artifacts, efficient handling of high-resolution images with low memory usage, and maintenance of style consistency even in situations with significant color style variations. For text reference, we align the text feature of stable diffusion priors with the style feature of our IRStyle to perform text-guided color style transfer (TRStyle). Our TRStyle method is highly efficient in both training and inference, producing notable open-set text-guided transfer results. Extensive experiments in both image and text settings demonstrate that our proposed method outperforms the state-of-the-art in both qualitative and quantitative evaluations.
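The abstract says IRStyle outputs stylized 3D lookup tables (LUTs) rather than stylized pixels. As a rough illustration of why that suits high-resolution images, the sketch below applies an already-built 3D LUT to one RGB pixel with trilinear interpolation. It is a generic LUT lookup, not the paper's network or code; the names `identity_lut`, `gain_lut`, and `apply_lut` are hypothetical, and the LUT here is filled by hand instead of being predicted from a reference.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]
Lut3D = List[List[List[Color]]]  # indexed as lut[r][g][b], values in [0, 1]


def identity_lut(size: int) -> Lut3D:
    """Build a LUT that maps every color to itself (size must be >= 2)."""
    step = size - 1
    return [[[(r / step, g / step, b / step) for b in range(size)]
             for g in range(size)]
            for r in range(size)]


def gain_lut(size: int, gains: Sequence[float]) -> Lut3D:
    """Hypothetical 'style': scale each channel by a fixed gain in [0, 1]."""
    base = identity_lut(size)
    return [[[tuple(c * k for c, k in zip(cell, gains)) for cell in row]
             for row in plane]
            for plane in base]


def apply_lut(lut: Lut3D, pixel: Color) -> Color:
    """Map one RGB pixel (channels in [0, 1]) through the LUT.

    The output is the trilinear blend of the 8 grid entries surrounding
    the pixel, so the per-pixel cost does not depend on image content.
    """
    size = len(lut)
    # Clamp to [0, 1], then scale to grid coordinates in [0, size - 1].
    coords = [min(max(c, 0.0), 1.0) * (size - 1) for c in pixel]
    # Lower corner of the enclosing cell; capped so lower + 1 stays in range.
    lower = [min(int(c), size - 2) for c in coords]
    frac = [c - l for c, l in zip(coords, lower)]

    out = [0.0, 0.0, 0.0]
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                weight = ((frac[0] if dr else 1.0 - frac[0])
                          * (frac[1] if dg else 1.0 - frac[1])
                          * (frac[2] if db else 1.0 - frac[2]))
                corner = lut[lower[0] + dr][lower[1] + dg][lower[2] + db]
                for k in range(3):
                    out[k] += weight * corner[k]
    return (out[0], out[1], out[2])


if __name__ == "__main__":
    warm = gain_lut(9, (1.0, 0.9, 0.7))  # made-up warm tint
    print(apply_lut(warm, (0.5, 0.5, 0.5)))
```

Because the table is small and fixed once generated, the same lookup can be run over every pixel of an image of any resolution, which is consistent with the low-memory behavior the abstract claims for LUT-based transfer.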

Related papers

Improved 3D Scene Stylization via Text-Guided Generative Image Editing with Region-Based Control [47.14550252881733]
We introduce techniques that enhance the quality of 3D stylization while maintaining view consistency and providing optional region-controlled style transfer. Our method achieves stylization by re-training an initial 3D representation using stylized multi-view 2D images of the source views. We propose Multi-Region Importance-Weighted Sliced Wasserstein Distance Loss, allowing styles to be applied to distinct image regions using segmentation masks from off-the-shelf models.
arXiv Detail & Related papers (2025-09-04T15:01:01Z)
ReStyle3D: Scene-Level Appearance Transfer with Semantic Correspondences [33.06053818091165]
ReStyle3D is a framework for scene-level appearance transfer from a single style image to a real-world scene represented by multiple views. It combines explicit semantic correspondences with multi-view consistency to achieve precise and coherent stylization. Our code, pretrained models, and dataset will be publicly released to support new applications in interior design, virtual staging, and 3D-consistent stylization.
arXiv Detail & Related papers (2025-02-14T18:54:21Z)
FAGStyle: Feature Augmentation on Geodesic Surface for Zero-shot Text-guided Diffusion Image Style Transfer [2.3293561091456283]
The goal of image style transfer is to render an image guided by a style reference while maintaining the original content. We introduce FAGStyle, a zero-shot text-guided diffusion image style transfer method. Our approach enhances inter-patch information interaction by incorporating the Sliding Window Crop technique.
arXiv Detail & Related papers (2024-08-20T04:20:11Z)
ArtWeaver: Advanced Dynamic Style Integration via Diffusion Model [73.95608242322949]
Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images. We present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion to address challenges such as misinterpreted styles and inconsistent semantics.
arXiv Detail & Related papers (2024-05-24T07:19:40Z)
StyleMamba : State Space Model for Efficient Text-driven Image Style Transfer [9.010012117838725]
StyleMamba is an efficient image style transfer framework that translates text prompts into corresponding visual styles. Existing text-guided stylization requires hundreds of training iterations and takes a lot of computing resources.
arXiv Detail & Related papers (2024-05-08T12:57:53Z)
Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z)
A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework. We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature. Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z)
DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization [66.42741426640633]
DiffStyler is a dual diffusion processing architecture to control the balance between the content and style of diffused results. We propose a content image-based learnable noise on which the reverse denoising process is based, enabling the stylization results to better preserve the structure information of the content image.
arXiv Detail & Related papers (2022-11-19T12:30:44Z)
Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning. Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z)
TediGAN: Text-Guided Diverse Face Image Generation and Manipulation [52.83401421019309]
TediGAN is a framework for multi-modal image generation and manipulation with textual descriptions. The StyleGAN inversion module maps real images to the latent space of a well-trained StyleGAN. The visual-linguistic similarity module learns text-image matching by mapping the image and text into a common embedding space. Instance-level optimization is used for identity preservation in manipulation.
arXiv Detail & Related papers (2020-12-06T16:20:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.