Exploring the Potential of Large Language Models in Artistic Creation:
Collaboration and Reflection on Creative Programming
- URL: http://arxiv.org/abs/2402.09750v1
- Date: Thu, 15 Feb 2024 07:00:06 GMT
- Authors: Anqi Wang, Zhizhuo Yin, Yulu Hu, Yuanyuan Mao, Pan Hui
- Abstract summary: We compare two common collaboration approaches: invoking the entire program and decomposing it into multiple subtasks.
Our findings show that the two methods stimulate different kinds of reflection in artists.
Our work reveals the artistic potential of LLMs in creative coding.
- Score: 10.57792673254363
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, the potential of large language models (LLMs) has been widely
exploited to assist programming. However, current research does not explore the
artistic potential of LLMs in creative coding within artist-AI collaboration. Our
work probes the types of reflection artists engage in during the creation process
under such collaboration. We compare two common collaboration approaches: invoking
the entire program at once and decomposing it into multiple subtasks. Our findings
show that the two approaches stimulate different reflections in artists. They also
show how reflection type correlates with user performance, user satisfaction, and
subjective experience across the two collaboration approaches, drawing on two
methods: experimental data and qualitative interviews. In this sense, our work
reveals the artistic potential of LLMs in creative coding. Meanwhile, we provide
a critical lens on human-AI collaboration from the artists' perspective and
offer design suggestions for future work on AI-assisted creative tasks.
Related papers
- Equivalence: An analysis of artists' roles with Image Generative AI from Conceptual Art perspective through an interactive installation design practice [16.063735487844628]
This study explores how artists interact with advanced text-to-image Generative AI models.
To exemplify this framework, a case study titled "Equivalence" converts users' speech input into continuously evolving paintings.
This work aims to broaden our understanding of artists' roles and foster a deeper appreciation for the creative aspects inherent in artwork created with Image Generative AI.
arXiv Detail & Related papers (2024-04-29T02:45:23Z)
- Generating Human-Centric Visual Cues for Human-Object Interaction Detection via Large Vision-Language Models [59.611697856666304]
Human-object interaction (HOI) detection aims at detecting human-object pairs and predicting their interactions.
We propose three prompts with VLM to generate human-centric visual cues within an image from multiple perspectives of humans.
We develop a transformer-based multimodal fusion module with multitower architecture to integrate visual cue features into the instance and interaction decoders.
arXiv Detail & Related papers (2023-11-26T09:11:32Z)
- Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models [55.20626448358655]
This study explores universal interaction recognition in an open-world setting through the use of Vision-Language (VL) foundation models and large language models (LLMs).
Our design includes an HO Prompt-guided Decoder (HOPD), which facilitates the association of high-level relation representations in the foundation model with various HO pairs within the image.
For open-category interaction recognition, our method supports either of two input types: interaction phrase or interpretive sentence.
arXiv Detail & Related papers (2023-11-07T08:27:32Z)
- Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation [19.62178304006683]
We argue that current interaction paradigms fall short, guiding users towards rapid convergence on a limited set of ideas.
We propose a framework that facilitates the structured generation of design space in which users can seamlessly explore, evaluate, and synthesize a multitude of responses.
arXiv Detail & Related papers (2023-10-19T17:53:14Z)
- Interactive Neural Painting [66.9376011879115]
This paper proposes the first approach for Interactive Neural Painting (NP).
We propose I-Paint, a novel method based on a conditional transformer Variational AutoEncoder (VAE) architecture with a two-stage decoder.
Our experiments show that our approach provides good stroke suggestions and compares favorably to the state of the art.
arXiv Detail & Related papers (2023-07-31T07:02:00Z)
- "It Felt Like Having a Second Mind": Investigating Human-AI Co-creativity in Prewriting with Large Language Models [20.509651636971864]
This study investigates human-LLM collaboration patterns and dynamics during prewriting.
During collaborative prewriting, there appears to be a three-stage iterative Human-AI Co-creativity process.
arXiv Detail & Related papers (2023-07-20T16:55:25Z)
- Inspire creativity with ORIBA: Transform Artists' Original Characters into Chatbots through Large Language Model [4.984601297028257]
This research delves into the intersection of illustration art and artificial intelligence (AI).
By examining the impact of AI on the creative process and the boundaries of authorship, we aim to enhance human-AI interactions in creative fields.
arXiv Detail & Related papers (2023-06-16T11:25:44Z)
- MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks [59.09343552273045]
We propose a decoder-only model for multimodal tasks that is surprisingly effective at jointly learning these disparate vision-language tasks.
We demonstrate that joint learning of these diverse objectives is simple, effective, and maximizes the weight-sharing of the model across these tasks.
Our model achieves the state of the art on image-text and text-image retrieval, video question answering and open-vocabulary detection tasks, outperforming much larger and more extensively trained foundational models.
arXiv Detail & Related papers (2023-03-29T16:42:30Z)
- Pathway to Future Symbiotic Creativity [76.20798455931603]
We propose a classification of the creative system with a hierarchy of 5 classes, showing the pathway of creativity evolving from a mimic-human artist to a Machine artist in its own right.
In art creation, machines must understand humans' mental states, including desires, appreciation, and emotions; in turn, humans need to understand machines' creative capabilities and limitations.
We propose a novel framework for building future Machine artists, which comes with the philosophy that a human-compatible AI system should be based on the "human-in-the-loop" principle.
arXiv Detail & Related papers (2022-08-18T15:12:02Z)
- UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes [91.24112204588353]
We introduce UViM, a unified approach capable of modeling a wide range of computer vision tasks.
In contrast to previous models, UViM has the same functional form for all tasks.
We demonstrate the effectiveness of UViM on three diverse and challenging vision tasks.
arXiv Detail & Related papers (2022-05-20T17:47:59Z)
- Understanding and Creating Art with AI: Review and Outlook [12.614901374282868]
Technologies related to artificial intelligence (AI) strongly influence research and creative practices in the visual arts.
This paper provides an integrated review of two facets of AI and art: 1) AI is used for art analysis and employed on digitized artwork collections; 2) AI is used for creative purposes and generating novel artworks.
In relation to the role of AI in creating art, we address various practical and theoretical aspects of AI Art and consolidate related works that deal with those topics in detail.
arXiv Detail & Related papers (2021-02-18T01:38:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.