Related papers: ColorGPT: Leveraging Large Language Models for Multimodal Color Recommendation

URL: http://arxiv.org/abs/2508.08987v1
Date: Tue, 12 Aug 2025 14:56:11 GMT
Title: ColorGPT: Leveraging Large Language Models for Multimodal Color Recommendation
Authors: Ding Xia, Naoto Inoue, Qianru Qiu, Kotaro Kikuchi
Abstract summary: We explore the use of pretrained Large Language Models (LLMs) and their commonsense reasoning capabilities for color recommendation. Our approach primarily targeted color palette completion by recommending colors based on a set of given colors and accompanying context. Our method can be extended to full palette generation, producing an entire color palette corresponding to a provided textual description.
Score: 4.714111142188893
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Colors play a crucial role in the design of vector graphic documents by enhancing visual appeal, facilitating communication, improving usability, and ensuring accessibility. In this context, color recommendation involves suggesting appropriate colors to complete or refine a design when one or more colors are missing or require alteration. Traditional methods often struggled with these challenges due to the complex nature of color design and the limited data availability. In this study, we explored the use of pretrained Large Language Models (LLMs) and their commonsense reasoning capabilities for color recommendation, raising the question: Can pretrained LLMs serve as superior designers for color recommendation tasks? To investigate this, we developed a robust, rigorously validated pipeline, ColorGPT, that was built by systematically testing multiple color representations and applying effective prompt engineering techniques. Our approach primarily targeted color palette completion by recommending colors based on a set of given colors and accompanying context. Moreover, our method can be extended to full palette generation, producing an entire color palette corresponding to a provided textual description. Experimental results demonstrated that our LLM-based pipeline outperformed existing methods in terms of color suggestion accuracy and the distribution of colors in the color palette completion task. For the full palette generation task, our approach also yielded improvements in color diversity and similarity compared to current techniques.

Related papers

Color3D: Controllable and Consistent 3D Colorization with Personalized Colorizer [58.94607850223466]
We present Color3D, a highly adaptable framework for colorizing both static and dynamic 3D scenes from monochromatic inputs. Our approach is able to preserve color diversity and steerability while ensuring cross-view and cross-time consistency.
arXiv Detail & Related papers (2025-10-11T10:21:19Z)
Exploring Palette based Color Guidance in Diffusion Models [5.80330969550483]
We propose a novel approach to enhance color scheme control by integrating color palettes as a separate guidance mechanism alongside prompt instructions. Our results demonstrate that incorporating palette guidance significantly improves the model's ability to generate images with desired color schemes.
arXiv Detail & Related papers (2025-08-12T09:02:10Z)
MangaNinja: Line Art Colorization with Precise Reference Following [84.2001766692797]
MangaNinja specializes in the task of reference-guided line art colorization. We incorporate two thoughtful designs to ensure precise character detail transcription: a patch shuffling module to facilitate correspondence learning between the reference color image and the target line art, and a point-driven control scheme to enable fine-grained color matching.
arXiv Detail & Related papers (2025-01-14T18:59:55Z)
L-C4: Language-Based Video Colorization for Creative and Consistent Color [59.069498113050436]
We present Language-based video colorization for Creative and Consistent Colors (L-C4). Our model is built upon a pre-trained cross-modality generative model. We propose temporally deformable attention to prevent flickering or color shifts, and cross-clip fusion to maintain long-term color consistency.
arXiv Detail & Related papers (2024-10-07T12:16:21Z)
What Color Scheme is More Effective in Assisting Readers to Locate Information in a Color-Coded Article? [9.50572374662018]
Large Language Models (LLMs) have streamlined document coding, enabling simple automatic text labeling with various schemes. This has the potential to make color-coding more accessible and benefit more users. We conducted a user study assessing various color schemes' effectiveness in LLM-coded text documents. Results showed non-analogous and yellow-inclusive color schemes improved performance, with the latter also preferred by participants.
arXiv Detail & Related papers (2024-08-12T21:04:16Z)
Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644]
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model. We present an effective way to encode user strokes to enable precise local color manipulation. We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
arXiv Detail & Related papers (2024-02-16T17:51:13Z)
Multimodal Color Recommendation in Vector Graphic Documents [14.287758028119788]
We propose a multimodal masked color model that integrates both color and textual contexts to provide text-aware color recommendation for graphic documents. Our proposed model comprises self-attention networks to capture the relationships between colors in multiple palettes, and cross-attention networks that incorporate both color and CLIP-based text representations.
arXiv Detail & Related papers (2023-08-08T08:17:39Z)
L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors [62.80068955192816]
We propose a unified model to perform language-based colorization with any-level descriptions. We leverage the pretrained cross-modality generative model for its robust language understanding and rich color priors. With the proposed novel sampling strategy, our model achieves instance-aware colorization in diverse and complex scenarios.
arXiv Detail & Related papers (2023-05-24T14:57:42Z)
PalGAN: Image Colorization with Palette Generative Adversarial Networks [51.59276436217957]
We propose a new GAN-based colorization approach, PalGAN, integrated with palette estimation and chromatic attention. PalGAN outperforms state-of-the-art methods in quantitative evaluation and visual comparison, delivering notably diverse, contrastive, and edge-preserving appearances.
arXiv Detail & Related papers (2022-10-20T12:28:31Z)
Color Recommendation for Vector Graphic Documents based on Multi-Palette Representation [12.71266194474117]
We extract multiple color palettes from each visual element in a graphic document, and then combine them into a color sequence. We train the model and build a color recommendation system on a large-scale dataset of vector graphic documents.
arXiv Detail & Related papers (2022-09-22T07:06:17Z)
Towards Vivid and Diverse Image Colorization with Generative Color Prior [17.087464490162073]
Recent deep-learning-based methods could automatically colorize images at a low cost. We aim at recovering vivid colors by leveraging the rich and diverse color priors encapsulated in a pretrained Generative Adversarial Network (GAN). Thanks to the powerful generative color prior and delicate designs, our method could produce vivid colors with a single forward pass.
arXiv Detail & Related papers (2021-08-19T17:49:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.