Assessing and Understanding Creativity in Large Language Models
- URL: http://arxiv.org/abs/2401.12491v1
- Date: Tue, 23 Jan 2024 05:19:47 GMT
- Title: Assessing and Understanding Creativity in Large Language Models
- Authors: Yunpu Zhao, Rui Zhang, Wenyi Li, Di Huang, Jiaming Guo, Shaohui Peng,
Yifan Hao, Yuanbo Wen, Xing Hu, Zidong Du, Qi Guo, Ling Li and Yunji Chen
- Abstract summary: This paper aims to establish an efficient framework for assessing the level of creativity in large language models (LLMs)
By adapting the Torrance Tests of Creative Thinking, the research evaluates the creative performance of various LLMs across 7 tasks.
We found that the creativity of LLMs primarily falls short in originality, while excelling in elaboration.
- Score: 33.37237667182931
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of natural language processing, the rapid development of large
language model (LLM) has attracted more and more attention. LLMs have shown a
high level of creativity in various tasks, but the methods for assessing such
creativity are inadequate. The assessment of LLM creativity needs to consider
differences from humans, requiring multi-dimensional measurement while
balancing accuracy and efficiency. This paper aims to establish an efficient
framework for assessing the level of creativity in LLMs. By adapting the
modified Torrance Tests of Creative Thinking, the research evaluates the
creative performance of various LLMs across 7 tasks, emphasizing 4 criteria
including Fluency, Flexibility, Originality, and Elaboration. In this context,
we develop a comprehensive dataset of 700 questions for testing and an
LLM-based evaluation method. In addition, this study presents a novel analysis
of LLMs' responses to diverse prompts and role-play situations. We found that
the creativity of LLMs primarily falls short in originality, while excelling in
elaboration. Besides, the use of prompts and the role-play settings of the
model significantly influence creativity. Additionally, the experimental
results also indicate that collaboration among multiple LLMs can enhance
originality. Notably, our findings reveal a consensus between human evaluations
and LLMs regarding the personality traits that influence creativity. The
findings underscore the significant impact of LLM design on creativity and
bridges artificial intelligence and human creativity, offering insights into
LLMs' creativity and potential applications.
Related papers
- Understanding the Role of LLMs in Multimodal Evaluation Benchmarks [77.59035801244278]
This paper investigates the role of the Large Language Model (LLM) backbone in Multimodal Large Language Models (MLLMs) evaluation.
Our study encompasses four diverse MLLM benchmarks and eight state-of-the-art MLLMs.
Key findings reveal that some benchmarks allow high performance even without visual inputs and up to 50% of error rates can be attributed to insufficient world knowledge in the LLM backbone.
arXiv Detail & Related papers (2024-10-16T07:49:13Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - Benchmarking Language Model Creativity: A Case Study on Code Generation [17.56712029335294]
creativity consists of at least two key characteristics: emphconvergent thinking (purposefulness to achieve a given goal) and emphdivergent thinking (adaptability to new environments or constraints) citeprunco 2003critical
We introduce a framework for quantifying LLM creativity that incorporates the two characteristics.
This is achieved by (1) Denial Prompting pushes LLMs to come up with more creative solutions to a given problem by incrementally imposing new constraints on the previous solution, and (2) defining and computing the NeoGauge metric which examines both convergent and divergent thinking in the generated creative
arXiv Detail & Related papers (2024-07-12T05:55:22Z) - Divergent Creativity in Humans and Large Language Models [37.67363469600804]
The recent surge in the capabilities of Large Language Models has led to claims that they are approaching a level of creativity akin to human capabilities.
We leverage recent advances in creativity science to build a framework for in-depth analysis of divergent creativity in both state-of-the-art LLMs and a substantial dataset of 100,000 humans.
arXiv Detail & Related papers (2024-05-13T22:37:52Z) - LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play [43.55248812883912]
Large language models (LLMs) have shown exceptional proficiency in natural language processing but often fall short of generating creative and original responses to open-ended questions.
We propose LLM Discussion, a three-phase discussion framework that facilitates vigorous and diverging idea exchanges.
We evaluate the efficacy of the proposed framework with the Alternative Uses Test, Similarities Test, Instances Test, and Scientific Creativity Test.
arXiv Detail & Related papers (2024-05-10T10:19:14Z) - Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvements of Large Language Models.
It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop.
Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z) - Survey on Factuality in Large Language Models: Knowledge, Retrieval and
Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs)
As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z) - User-Controlled Knowledge Fusion in Large Language Models: Balancing
Creativity and Hallucination [5.046007553593371]
Large Language Models (LLMs) generate diverse, relevant, and creative responses.
Striking a balance between the LLM's imaginative capabilities and its adherence to factual information is a key challenge.
This paper presents an innovative user-controllable mechanism that modulates the balance between an LLM's imaginative capabilities and its adherence to factual information.
arXiv Detail & Related papers (2023-07-30T06:06:35Z) - A Survey on Evaluation of Large Language Models [87.60417393701331]
Large language models (LLMs) are gaining increasing popularity in both academia and industry.
This paper focuses on three key dimensions: what to evaluate, where to evaluate, and how to evaluate.
arXiv Detail & Related papers (2023-07-06T16:28:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.