Related papers: Computational Modeling of Artistic Inspiration: A Framework for Predicting Aesthetic Preferences in Lyrical Lines Using Linguistic and Stylistic Features

URL: http://arxiv.org/abs/2410.02881v1
Date: Thu, 3 Oct 2024 18:10:16 GMT
Title: Computational Modeling of Artistic Inspiration: A Framework for Predicting Aesthetic Preferences in Lyrical Lines Using Linguistic and Stylistic Features
Authors: Gaurav Sahu, Olga Vechtomova
Abstract summary: Artistic inspiration plays a crucial role in producing works that resonate deeply with audiences. This work proposes a novel framework for computationally modeling artistic preferences in different individuals. Our framework outperforms an out-of-the-box LLaMA-3-70b, a state-of-the-art open-source language model, by nearly 18 points.
Score: 8.205321096201095
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Artistic inspiration remains one of the least understood aspects of the creative process. It plays a crucial role in producing works that resonate deeply with audiences, but the complexity and unpredictability of aesthetic stimuli that evoke inspiration have eluded systematic study. This work proposes a novel framework for computationally modeling artistic preferences in different individuals through key linguistic and stylistic properties, with a focus on lyrical content. In addition to the framework, we introduce EvocativeLines, a dataset of annotated lyric lines, categorized as either "inspiring" or "not inspiring," to facilitate the evaluation of our framework across diverse preference profiles. Our computational model leverages the proposed linguistic and poetic features and applies a calibration network on top of it to accurately forecast artistic preferences among different creative individuals. Our experiments demonstrate that our framework outperforms an out-of-the-box LLaMA-3-70b, a state-of-the-art open-source language model, by nearly 18 points. Overall, this work contributes an interpretable and flexible framework that can be adapted to analyze any type of artistic preferences that are inherently subjective across a wide spectrum of skill levels.

Related papers

WordCraft: Interactive Artistic Typography with Attention Awareness and Noise Blending [12.655120187133779]
Artistic typography aims to stylize input characters with visual effects that are both creative and legible. Traditional approaches rely heavily on manual design, while recent generative models, particularly diffusion-based methods, have enabled automated character stylization. We introduce WordCraft, an interactive artistic typography system that integrates diffusion models to address these limitations.
arXiv Detail & Related papers (2025-07-13T10:49:09Z)
Calligrapher: Freestyle Text Image Customization [72.71919410487881]
Calligrapher is a novel diffusion-based framework that integrates advanced text customization with artistic typography. By automating high-quality, visually consistent typography, Calligrapher surpasses traditional models.
arXiv Detail & Related papers (2025-06-30T17:59:06Z)
Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art [61.28133495240179]
We propose a novel task of aesthetics alignment which seeks to align user-specified aesthetics with the T2I generation output. Inspired by how artworks provide an invaluable perspective to approach aesthetics, we codify visual aesthetics using the compositional framework artists employ. We demonstrate that T2I DMs can effectively offer 10 compositional controls through user-specified PoA conditions.
arXiv Detail & Related papers (2025-03-15T06:58:09Z)
A Critical Assessment of Modern Generative Models' Ability to Replicate Artistic Styles [0.0]
This paper presents a critical assessment of the style replication capabilities of contemporary generative models. We examine how effectively these models reproduce traditional artistic styles while maintaining structural integrity and compositional balance. The analysis is based on a new large dataset of AI-generated works imitating artistic styles of the past.
arXiv Detail & Related papers (2025-02-21T07:00:06Z)
Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning [14.405750888492735]
Image Aesthetic Assessment (IAA) is a vital and intricate task that entails analyzing and assessing an image's aesthetic values. Traditional methods of IAA often concentrate on a single aesthetic task and suffer from inadequate labeled datasets. We propose a comprehensive aesthetic MLLM capable of nuanced aesthetic insight.
arXiv Detail & Related papers (2024-12-16T16:35:35Z)
Textual Aesthetics in Large Language Models [80.09790024030525]
We introduce a pipeline for aesthetics polishing and help construct a textual aesthetics dataset named TexAes. We propose a textual aesthetics-powered fine-tuning method based on direct preference optimization, termed TAPO. Our experiments demonstrate that using textual aesthetics data and employing the TAPO fine-tuning method not only improves aesthetic scores but also enhances performance on general evaluation datasets.
arXiv Detail & Related papers (2024-11-05T09:22:08Z)
Towards Visual Text Design Transfer Across Languages [49.78504488452978]
We introduce a novel task of Multimodal Style Translation (MuST-Bench). MuST-Bench is a benchmark designed to evaluate the ability of visual text generation models to perform translation across different writing systems. In response, we introduce SIGIL, a framework for multimodal style translation that eliminates the need for style descriptions.
arXiv Detail & Related papers (2024-10-24T15:15:01Z)
VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models [53.59400446543756]
Artistic typography is a technique to visualize the meaning of input character in an imaginable and readable manner. We introduce a dual-branch, training-free method called VitaGlyph, enabling flexible artistic typography with controllable geometry changes.
arXiv Detail & Related papers (2024-10-02T16:48:47Z)
Style-based Clustering of Visual Artworks and the Play of Neural Style-Representations [2.4374097382908477]
Clustering artworks based on style can have many potential real-world applications like art recommendations, style-based search and retrieval. We argue that clustering artworks based on style is largely an unaddressed problem.
arXiv Detail & Related papers (2024-09-12T17:44:07Z)
MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis [65.78359025027457]
MetaDesigner revolutionizes artistic typography by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement. A comprehensive feedback mechanism harnesses insights from multimodal models and user evaluations to refine and enhance the design process iteratively. Empirical validations highlight MetaDesigner's capability to effectively serve diverse WordArt applications, consistently producing aesthetically appealing and context-sensitive results.
arXiv Detail & Related papers (2024-06-28T11:58:26Z)
ORACLE: Leveraging Mutual Information for Consistent Character Generation with LoRAs in Diffusion Models [3.7599363231894185]
We introduce a novel framework designed to produce consistent character representations from a single text prompt. Our framework outperforms existing methods in generating characters with consistent visual identities.
arXiv Detail & Related papers (2024-06-04T23:39:08Z)
CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images. However, adapting these models for artistic image editing presents two significant challenges. We build the innovative unified framework CreativeSynth, which is based on a diffusion model with the ability to coordinate multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z)
Evaluating Large Language Model Creativity from a Literary Perspective [13.672268920902187]
This paper assesses the potential for large language models to serve as assistive tools in the creative writing process. We develop interactive and multi-voice prompting strategies that interleave background descriptions, instructions that guide composition, samples of text in the target style, and critical discussion of the given samples.
arXiv Detail & Related papers (2023-11-30T16:46:25Z)
Impressions: Understanding Visual Semiotics and Aesthetic Impact [66.40617566253404]
We present Impressions, a novel dataset through which to investigate the semiotics of images. We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images. This dataset significantly improves their ability to model impressions and aesthetic evaluations of images through fine-tuning and few-shot adaptation.
arXiv Detail & Related papers (2023-10-27T04:30:18Z)
ALADIN-NST: Self-supervised disentangled representation learning of artistic style through Neural Style Transfer [60.6863849241972]
We learn a representation of visual artistic style more strongly disentangled from the semantic content depicted in an image. We show that strongly addressing the disentanglement of style and content leads to large gains in style-specific metrics.
arXiv Detail & Related papers (2023-04-12T10:33:18Z)
Incorporating Stylistic Lexical Preferences in Generative Language Models [10.62343151429147]
We present an approach to induce certain target-author attributes by incorporating continuous multi-dimensional lexical preferences of an author into generative language models. Our experiments demonstrate that the proposed approach can generate text that distinctively aligns with a given target author's lexical style.
arXiv Detail & Related papers (2020-10-22T09:24:05Z)
