Simple Lines, Big Ideas: Towards Interpretable Assessment of Human Creativity from Drawings
- URL: http://arxiv.org/abs/2511.12880v2
- Date: Thu, 20 Nov 2025 02:10:22 GMT
- Title: Simple Lines, Big Ideas: Towards Interpretable Assessment of Human Creativity from Drawings
- Authors: Zihao Lin, Zhenshan Shi, Sasa Zhao, Hanwei Zhu, Lingyu Zhu, Baoliang Chen, Lei Mo
- Abstract summary: We propose a data-driven framework for automatic and interpretable creativity assessment from drawings. Motivated by the cognitive evidence proposed in [6] that creativity can emerge from both what is drawn (content) and how it is drawn (style), we reinterpret the creativity score as a function of these two complementary dimensions.
- Score: 18.09092203643732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Assessing human creativity through visual outputs, such as drawings, plays a critical role in fields including psychology, education, and cognitive science. However, current assessment practices still rely heavily on expert-based subjective scoring, which is both labor-intensive and inherently subjective. In this paper, we propose a data-driven framework for automatic and interpretable creativity assessment from drawings. Motivated by the cognitive evidence proposed in [6] that creativity can emerge from both what is drawn (content) and how it is drawn (style), we reinterpret the creativity score as a function of these two complementary dimensions. Specifically, we first augment an existing creativity-labeled dataset with additional annotations targeting content categories. Based on the enriched dataset, we further propose a conditional model that predicts content, style, and ratings simultaneously. In particular, a conditional learning mechanism enables the model to adapt its visual feature extraction by dynamically tuning it to creativity-relevant signals conditioned on the drawing's stylistic and semantic cues. Experimental results demonstrate that our model achieves state-of-the-art performance compared to existing regression-based approaches and offers interpretable visualizations that align well with human judgments. The code and annotations will be made publicly available at https://github.com/WonderOfU9/CSCA_PRCV_2025
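The conditional content/style/rating scheme the abstract describes can be illustrated with a minimal sketch. This is not the authors' released code (see the GitHub link above); all dimensions, weights, and the FiLM-style modulation below are hypothetical stand-ins chosen only to show how content and style predictions could condition the feature extraction that feeds a creativity score.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sizes: visual feature dim, content categories, style categories.
D, C, S = 16, 5, 3

# Random matrices standing in for trained parameters.
W_content = rng.normal(size=(D, C))
W_style = rng.normal(size=(D, S))
W_gamma = rng.normal(size=(C + S, D)) * 0.1
W_beta = rng.normal(size=(C + S, D)) * 0.1
w_score = rng.normal(size=D)

def predict(feat):
    """Jointly predict content, style, and a creativity score for one drawing."""
    p_content = softmax(feat @ W_content)        # what is drawn
    p_style = softmax(feat @ W_style)            # how it is drawn
    cond = np.concatenate([p_content, p_style])  # conditioning signal
    gamma, beta = cond @ W_gamma, cond @ W_beta  # feature-wise modulation
    modulated = feat * (1.0 + gamma) + beta      # tune features to creativity cues
    score = float(modulated @ w_score)           # scalar creativity rating
    return p_content, p_style, score

pc, ps, s = predict(rng.normal(size=D))
```

The key design point mirrored here is that the rating head does not see the raw visual features directly: the content and style predictions first reweight those features, so the scoring pathway adapts to what and how the drawing depicts.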
Related papers
- Fine-Tuning a Large Vision-Language Model for Artwork's Scoring and Critique [11.787232686718367]
We propose a framework for automated creativity assessment of human paintings by fine-tuning the vision-language model Qwen2-VL-7B with multi-task learning. Our dataset contains 1000 human-created paintings scored on a 1-100 scale and paired with a short human-written description. Experiments show strong accuracy, achieving Pearson r > 0.97 and an MAE of about 3.95 on the 100-point scale.
arXiv Detail & Related papers (2026-02-09T19:52:16Z) - Bridging Cognitive Gap: Hierarchical Description Learning for Artistic Image Aesthetics Assessment [51.40989269202702]
The aesthetic quality assessment task is crucial for developing a human-aligned quantitative evaluation system for AIGC. We propose ArtQuant, an aesthetics assessment framework for artistic images that couples isolated aesthetic dimensions through description generation. Our approach achieves state-of-the-art performance on several datasets while requiring only 33% of conventional training epochs.
arXiv Detail & Related papers (2025-12-29T12:18:26Z) - CreativityPrism: A Holistic Benchmark for Large Language Model Creativity [64.18257552903151]
Creativity is often seen as a hallmark of human intelligence, yet there is still no holistic framework to evaluate the creativity of large language models across diverse scenarios. We propose CreativityPrism, an evaluation analysis framework that decomposes creativity into three dimensions: quality, novelty, and diversity.
arXiv Detail & Related papers (2025-10-23T00:22:10Z) - TraitSpaces: Towards Interpretable Visual Creativity for Human-AI Co-Creation [0.0]
Drawing on interviews with practicing artists and theories from psychology, we define 12 traits that capture affective, symbolic, cultural, and ethical dimensions of creativity. Traits such as Environmental Dialogicity and Redemptive Arc are predicted with high reliability. By linking cultural-aesthetic insights with computational modeling, our work aims not to reduce creativity to numbers, but to offer shared language and interpretable tools for artists, researchers, and AI systems to collaborate meaningfully.
arXiv Detail & Related papers (2025-09-29T06:24:18Z) - Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art [61.28133495240179]
We propose a novel task of aesthetics alignment which seeks to align user-specified aesthetics with the T2I generation output. Inspired by how artworks provide an invaluable perspective to approach aesthetics, we codify visual aesthetics using the compositional framework artists employ. We demonstrate that T2I DMs can effectively offer 10 compositional controls through user-specified PoA conditions.
arXiv Detail & Related papers (2025-03-15T06:58:09Z) - APDDv2: Aesthetics of Paintings and Drawings Dataset with Artist Labeled Scores and Comments [45.57709215036539]
We introduce the Aesthetics of Paintings and Drawings dataset (APDD), the first comprehensive collection of paintings encompassing 24 distinct artistic categories and 10 aesthetic attributes.
APDDv2 boasts an expanded image corpus and improved annotation quality, featuring detailed language comments.
We present an updated version of the Art Assessment Network for Specific Painting Styles, denoted as ArtCLIP. Experimental validation demonstrates the superior performance of this revised model in the realm of aesthetic evaluation, surpassing its predecessor in accuracy and efficacy.
arXiv Detail & Related papers (2024-11-13T11:46:42Z) - Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models [0.65268245109828]
We introduce the notion of contextual diversity for active learning (CDAL).
We propose a data repair algorithm to curate contextually fair data to reduce model bias.
We are working on developing an image retrieval system for wildlife camera trap images and a reliable warning system for poor-quality rural roads.
arXiv Detail & Related papers (2024-11-04T09:43:33Z) - Computational Modeling of Artistic Inspiration: A Framework for Predicting Aesthetic Preferences in Lyrical Lines Using Linguistic and Stylistic Features [8.205321096201095]
Artistic inspiration plays a crucial role in producing works that resonate deeply with audiences.
This work proposes a novel framework for computationally modeling artistic preferences in different individuals.
Our framework outperforms an out-of-the-box LLaMA-3-70b, a state-of-the-art open-source language model, by nearly 18 points.
arXiv Detail & Related papers (2024-10-03T18:10:16Z) - How Do You Perceive My Face? Recognizing Facial Expressions in Multi-Modal Context by Modeling Mental Representations [5.895694050664867]
We introduce a novel approach for facial expression classification that goes beyond simple classification tasks.
Our model accurately classifies a perceived face and synthesizes the corresponding mental representation perceived by a human when observing a face in context.
We evaluate synthesized expressions in a human study, showing that our model effectively produces approximations of human mental representations.
arXiv Detail & Related papers (2024-09-04T09:32:40Z) - Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z) - Impressions: Understanding Visual Semiotics and Aesthetic Impact [66.40617566253404]
We present Impressions, a novel dataset through which to investigate the semiotics of images.
We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images.
This dataset significantly improves their ability to model impressions and aesthetic evaluations of images through fine-tuning and few-shot adaptation.
arXiv Detail & Related papers (2023-10-27T04:30:18Z) - Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models [64.24227572048075]
We propose a Knowledge-Aware Prompt Tuning (KAPT) framework for vision-language models.
Our approach takes inspiration from human intelligence in which external knowledge is usually incorporated into recognizing novel categories of objects.
arXiv Detail & Related papers (2023-08-22T04:24:45Z) - Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
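The zero-shot "look and feel" scoring this entry describes can be sketched at a high level: compare an image embedding against a pair of antonym text prompts and softmax the similarities. The sketch below is not the paper's implementation; the `embed` function is a deterministic stand-in for CLIP's real image/text encoders, and the prompt strings are illustrative.

```python
import numpy as np

def embed(x, dim=8):
    """Stand-in for a CLIP encoder: maps any string to a unit vector."""
    seed = sum(ord(c) for c in x)  # deterministic, hash-free seeding
    v = np.random.default_rng(seed).normal(size=dim)
    return v / np.linalg.norm(v)

def zero_shot_score(image_id, pos="Good photo.", neg="Bad photo."):
    """Quality score as softmax over antonym-prompt similarities."""
    img = embed(image_id)
    sims = np.array([img @ embed(pos), img @ embed(neg)])
    probs = np.exp(sims) / np.exp(sims).sum()
    return float(probs[0])  # probability mass on the positive prompt

score = zero_shot_score("drawing_001.png")
```

With real CLIP encoders in place of `embed`, the same antonym-prompt softmax yields a score in (0, 1) without any task-specific training, which is what makes the assessment zero-shot.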
arXiv Detail & Related papers (2022-07-25T17:58:16Z) - Drawing out of Distribution with Neuro-Symbolic Generative Models [49.79371715591122]
Drawing out of Distribution is a neuro-symbolic generative model of stroke-based drawing.
DooD operates directly on images, requires no supervision or expensive test-time inference.
We evaluate DooD on its ability to generalise across both data and tasks.
arXiv Detail & Related papers (2022-06-03T21:40:22Z) - Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning [91.58529629419135]
We consider how to characterise visual groupings discovered automatically by deep neural networks.
We introduce two concepts, visual learnability and describability, that can be used to quantify the interpretability of arbitrary image groupings.
arXiv Detail & Related papers (2020-10-27T18:41:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.