Divergent Creativity in Humans and Large Language Models
- URL: http://arxiv.org/abs/2405.13012v2
- Date: Tue, 01 Jul 2025 19:34:19 GMT
- Title: Divergent Creativity in Humans and Large Language Models
- Authors: Antoine Bellemare-Pepin, François Lespinasse, Philipp Thölke, Yann Harel, Kory Mathewson, Jay A. Olson, Yoshua Bengio, Karim Jerbi
- Abstract summary: Large Language Models (LLMs) have led to claims that they are approaching a level of creativity akin to human capabilities. We leverage recent advances in computational creativity to analyze semantic divergence in both state-of-the-art LLMs and a dataset of 100,000 humans. We found evidence that LLMs can surpass average human performance on the Divergent Association Task, and approach human creative writing abilities.
- Score: 37.67363469600804
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The recent surge of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLMs' semantic diversity, particularly in comparison to human divergent thinking. To bridge this gap, we leverage recent advances in computational creativity to analyze semantic divergence in both state-of-the-art LLMs and a substantial dataset of 100,000 humans. We found evidence that LLMs can surpass average human performance on the Divergent Association Task, and approach human creative writing abilities, though they fall short of the typical performance of highly creative humans. Notably, even the top performing LLMs are still largely surpassed by highly creative individuals, underscoring a ceiling that current LLMs still fail to surpass. Our human-machine benchmarking framework addresses the polemic surrounding the imminent replacement of human creative labour by AI, disentangling the quality of the respective creative linguistic outputs using established objective measures. While prompting deeper exploration of the distinctive elements of human inventive thought compared to those of AI systems, we lay out a series of techniques to improve their outputs with respect to semantic diversity, such as prompt design and hyper-parameter tuning.
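The Divergent Association Task scored in the abstract rates semantic divergence as the average pairwise distance among a set of unrelated nouns. As a rough illustration only, the sketch below computes a DAT-style score from word vectors; the published task uses GloVe embeddings and ten nouns, whereas here tiny hand-made toy vectors stand in so the example runs standalone.

```python
# DAT-style score sketch: mean pairwise cosine distance among word vectors,
# scaled to a 0-200 range as in the original task. The toy vectors below are
# illustrative stand-ins, not real embeddings.
import itertools
import math

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def dat_score(vectors):
    """Average pairwise cosine distance, scaled by 100 (range 0-200)."""
    pairs = list(itertools.combinations(vectors, 2))
    return 100.0 * sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)

# Toy embeddings: two near-synonyms plus one unrelated word.
emb = {
    "cat":   [1.0, 0.1, 0.0],
    "dog":   [0.9, 0.2, 0.0],   # close to "cat" -> small distance
    "arrow": [0.0, 0.0, 1.0],   # orthogonal -> large distance
}
score = dat_score(list(emb.values()))
```

A list of mutually unrelated words scores higher than a list of near-synonyms, which is the property the benchmark exploits when comparing LLM and human word lists.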
Related papers
- Beyond Divergent Creativity: A Human-Based Evaluation of Creativity in Large Language Models [6.036586911740041]
Large language models (LLMs) are increasingly used in verbal creative tasks. The widely used Divergent Association Task (DAT) focuses on novelty, ignoring appropriateness. We evaluate a range of state-of-the-art LLMs on DAT and show that their scores on the task are lower than those of two baselines that do not possess any creative abilities.
arXiv Detail & Related papers (2026-01-28T12:41:32Z) - Deep Associations, High Creativity: A Simple yet Effective Metric for Evaluating Large Language Models [0.3580891736370874]
We propose PACE, asking LLMs to generate Association Chains to evaluate their creativity. PACE minimizes the risk of data contamination and offers a straightforward, highly efficient evaluation.
arXiv Detail & Related papers (2025-10-14T03:26:28Z) - Pixels, Patterns, but No Poetry: To See The World like Humans [33.773551676022514]
State-of-the-art MLLMs exhibit catastrophic failures on our perceptual tasks, which are trivial for humans. This paper shifts focus from reasoning to perception.
arXiv Detail & Related papers (2025-07-21T21:50:16Z) - From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning [63.25540801694765]
Large Language Models (LLMs) demonstrate striking linguistic abilities, yet whether they strike the same compression-meaning balance remains unclear. We apply the Information Bottleneck principle to quantitatively compare how LLMs and humans navigate this compression-meaning trade-off.
arXiv Detail & Related papers (2025-05-21T16:29:00Z) - Cooking Up Creativity: A Cognitively-Inspired Approach for Enhancing LLM Creativity through Structured Representations [53.950760059792614]
Large Language Models (LLMs) excel at countless tasks, yet struggle with creativity.
We introduce a novel approach that couples LLMs with structured representations and cognitively inspired manipulations to generate more creative and diverse ideas.
We demonstrate our approach in the culinary domain with DishCOVER, a model that generates creative recipes.
arXiv Detail & Related papers (2025-04-29T11:13:06Z) - Probing and Inducing Combinational Creativity in Vision-Language Models [52.76981145923602]
Recent advances in Vision-Language Models (VLMs) have sparked debate about whether their outputs reflect combinational creativity.
We propose the Identification-Explanation-Implication (IEI) framework, which decomposes creative processes into three levels.
To validate this framework, we curate CreativeMashup, a high-quality dataset of 666 artist-generated visual mashups annotated according to the IEI framework.
arXiv Detail & Related papers (2025-04-17T17:38:18Z) - How Deep is Love in LLMs' Hearts? Exploring Semantic Size in Human-like Cognition [75.11808682808065]
This study investigates whether large language models (LLMs) exhibit similar tendencies in understanding semantic size.
Our findings reveal that multi-modal training is crucial for LLMs to achieve more human-like understanding.
Lastly, we examine whether LLMs are influenced by attention-grabbing headlines with larger semantic sizes in a real-world web shopping scenario.
arXiv Detail & Related papers (2025-03-01T03:35:56Z) - Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs [17.44511150123112]
Large Language Models (LLMs) have shown significant limitations in understanding creative content. We revisit this challenge by decomposing humor understanding into three components and systematically improving each. Our refined approach achieves 82.4% accuracy in caption ranking, significantly improving upon the previous 67% benchmark.
arXiv Detail & Related papers (2025-02-27T18:29:09Z) - A Causality-aware Paradigm for Evaluating Creativity of Multimodal Large Language Models [100.16387798660833]
Oogiri game is a creativity-driven task requiring humor and associative thinking.
LoTbench is an interactive, causality-aware evaluation framework.
Results show that while most LLMs exhibit constrained creativity, the performance gap between LLMs and humans is not insurmountable.
arXiv Detail & Related papers (2025-01-25T09:11:15Z) - Humanlike Cognitive Patterns as Emergent Phenomena in Large Language Models [2.9312156642007294]
We systematically review Large Language Models' capabilities across three important cognitive domains: decision-making biases, reasoning, and creativity. On decision-making, our synthesis reveals that while LLMs demonstrate several human-like biases, some biases observed in humans are absent. On reasoning, advanced LLMs like GPT-4 exhibit deliberative reasoning akin to human System-2 thinking, while smaller models fall short of human-level performance. A distinct dichotomy emerges in creativity: while LLMs excel in language-based creative tasks, such as storytelling, they struggle with divergent thinking tasks that require real-world context.
arXiv Detail & Related papers (2024-12-20T02:26:56Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - Benchmarking Language Model Creativity: A Case Study on Code Generation [17.56712029335294]
Creativity consists of at least two key characteristics: convergent thinking (purposefulness to achieve a given goal) and divergent thinking (adaptability to new environments or constraints) (Runco, 2003).
We introduce a framework for quantifying LLM creativity that incorporates the two characteristics.
This is achieved by (1) Denial Prompting, which pushes LLMs to come up with more creative solutions to a given problem by incrementally imposing new constraints on the previous solution, and (2) the NeoGauge metric, which examines both convergent and divergent thinking in the generated creative solutions.
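The Denial Prompting loop described above can be sketched as follows. Note the `generate` and `find_construct` callables are hypothetical placeholders standing in for a real LLM call and a solution analyzer; the toy stand-ins below only illustrate the control flow.

```python
# Sketch of a Denial Prompting loop: each round bans a construct found in the
# previous solution and re-prompts, forcing increasingly unconventional answers.
def denial_prompting(problem, generate, find_construct, rounds=3):
    constraints = []
    solutions = []
    for _ in range(rounds):
        prompt = problem
        if constraints:
            prompt += " Do not use: " + ", ".join(constraints) + "."
        solution = generate(prompt)
        solutions.append(solution)
        banned = find_construct(solution)   # e.g. "for-loop", "recursion"
        if banned is None:                  # nothing left to deny
            break
        constraints.append(banned)
    return solutions

# Toy stand-ins: a "model" that switches technique once its favorite is banned.
def toy_generate(prompt):
    return "recursion" if "for-loop" in prompt else "for-loop"

def toy_find(solution):
    return solution if solution == "for-loop" else None

outputs = denial_prompting("Sum a list.", toy_generate, toy_find)
# outputs -> ["for-loop", "recursion"]
```

The sequence of solutions produced this way is what a metric like NeoGauge would then score for convergent correctness and divergent variety.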
arXiv Detail & Related papers (2024-07-12T05:55:22Z) - Creativity Has Left the Chat: The Price of Debiasing Language Models [1.223779595809275]
We investigate the unintended consequences of Reinforcement Learning from Human Feedback on the creativity of Large Language Models (LLMs).
Our findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation.
arXiv Detail & Related papers (2024-06-08T22:14:51Z) - Characterising the Creative Process in Humans and Large Language Models [6.363158395541767]
We provide an automated method to characterise how humans and LLMs explore semantic spaces on the Alternate Uses Task.
We use sentence embeddings to identify response categories and compute semantic similarities, which we use to generate jump profiles.
Our results corroborate earlier work in humans reporting both persistent (deep search in few semantic spaces) and flexible (broad search across multiple semantic spaces) pathways to creativity.
Though LLMs as a population match human profiles, their relationship with creativity is different, where the more flexible models score higher on creativity.
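The jump profiles above mark where a respondent leaves one semantic category for another. A minimal sketch, assuming responses have already been embedded: a jump is flagged whenever the similarity between consecutive responses drops below a threshold. Real analyses use sentence-embedding models; tiny hand-made vectors stand in here.

```python
# Jump-profile sketch: 1 marks a jump to a new semantic category (flexible
# search), 0 marks staying within the current one (persistent search).
import math

def cosine_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def jump_profile(embeddings, threshold=0.5):
    return [
        1 if cosine_sim(a, b) < threshold else 0
        for a, b in zip(embeddings, embeddings[1:])
    ]

# Three responses in one semantic neighbourhood, then a jump to a new one.
emb = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
profile = jump_profile(emb)
# profile -> [0, 0, 1]
```

Counting the 1s in such a profile gives a simple flexibility score, which is the quantity the paper relates to creativity across humans and models.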
arXiv Detail & Related papers (2024-05-01T23:06:46Z) - Mind's Eye of LLMs: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models [71.93366651585275]
Large language models (LLMs) have exhibited impressive performance in language comprehension and various reasoning tasks.
We propose Visualization-of-Thought (VoT) to elicit spatial reasoning of LLMs by visualizing their reasoning traces.
VoT significantly enhances the spatial reasoning abilities of LLMs.
arXiv Detail & Related papers (2024-04-04T17:45:08Z) - Should We Fear Large Language Models? A Structural Analysis of the Human Reasoning System for Elucidating LLM Capabilities and Risks Through the Lens of Heidegger's Philosophy [0.0]
This study investigates the capabilities and risks of Large Language Models (LLMs).
It draws parallels between the statistical patterns of word relationships within LLMs and Martin Heidegger's concepts of "ready-to-hand" and "present-at-hand".
Our findings reveal that while LLMs possess the capability for Direct Explicative Reasoning and Pseudo Rational Reasoning, they fall short in authentic rational reasoning and have no creative reasoning capabilities.
arXiv Detail & Related papers (2024-03-05T19:40:53Z) - Assessing and Understanding Creativity in Large Language Models [33.37237667182931]
This paper aims to establish an efficient framework for assessing the level of creativity in large language models (LLMs).
By adapting the Torrance Tests of Creative Thinking, the research evaluates the creative performance of various LLMs across 7 tasks.
We found that the creativity of LLMs primarily falls short in originality, while excelling in elaboration.
arXiv Detail & Related papers (2024-01-23T05:19:47Z) - Can AI Be as Creative as Humans? [84.43873277557852]
We prove in theory that AI can be as creative as humans under the condition that it can properly fit the data generated by human creators.
The debate on AI's creativity thus reduces to the question of its ability to fit a sufficient amount of data.
arXiv Detail & Related papers (2024-01-03T08:49:12Z) - CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z) - MacGyver: Are Large Language Models Creative Problem Solvers? [87.70522322728581]
We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting.
We create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems.
We present our collection to both LLMs and humans to compare and contrast their problem-solving abilities.
arXiv Detail & Related papers (2023-11-16T08:52:27Z) - Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation [19.62178304006683]
We argue that current interaction paradigms fall short, guiding users towards rapid convergence on a limited set of ideas.
We propose a framework that facilitates the structured generation of design space in which users can seamlessly explore, evaluate, and synthesize a multitude of responses.
arXiv Detail & Related papers (2023-10-19T17:53:14Z) - Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration [83.4031923134958]
Corex is a suite of novel general-purpose strategies that transform Large Language Models into autonomous agents.
Inspired by human behaviors, Corex is constituted by diverse collaboration paradigms including Debate, Review, and Retrieve modes.
We demonstrate that orchestrating multiple LLMs to work in concert yields substantially better performance compared to existing methods.
arXiv Detail & Related papers (2023-09-30T07:11:39Z) - Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision [84.31474052176343]
Recent AI-assistant agents, such as ChatGPT, rely on supervised fine-tuning (SFT) with human annotations and reinforcement learning from human feedback to align the output with human intentions.
This dependence can significantly constrain the true potential of AI-assistant agents due to the high cost of obtaining human supervision.
We propose a novel approach called SELF-ALIGN, which combines principle-driven reasoning and the generative power of LLMs for the self-alignment of AI agents with minimal human supervision.
arXiv Detail & Related papers (2023-05-04T17:59:28Z) - On the Creativity of Large Language Models [2.4555276449137042]
Large Language Models (LLMs) are revolutionizing several areas of Artificial Intelligence.
This article first analyzes the development of LLMs under the lens of creativity theories.
Then, we consider different classic perspectives, namely product, process, press, and person.
Finally, we examine the societal impact of these technologies with a particular focus on the creative industries.
arXiv Detail & Related papers (2023-03-27T18:00:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.