Related papers: Large Language Models show both individual and collective creativity comparable to humans

URL: http://arxiv.org/abs/2412.03151v1
Date: Wed, 04 Dec 2024 09:18:54 GMT
Title: Large Language Models show both individual and collective creativity comparable to humans
Authors: Luning Sun, Yuzhuo Yuan, Yuan Yao, Yanyan Li, Hao Zhang, Xing Xie, Xiting Wang, Fang Luo, David Stillwell
Abstract summary: Large Language Models (LLMs) show creativity comparable to humans. We benchmark the LLMs against individual humans, and also take a novel approach by comparing them to the collective creativity of groups of humans. When questioned 10 times, an LLM's collective creativity is equivalent to 8-10 humans.
Score: 39.90254321453145
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Artificial intelligence has, so far, largely automated routine tasks, but what does it mean for the future of work if Large Language Models (LLMs) show creativity comparable to humans? To measure the creativity of LLMs holistically, the current study uses 13 creative tasks spanning three domains. We benchmark the LLMs against individual humans, and also take a novel approach by comparing them to the collective creativity of groups of humans. We find that the best LLMs (Claude and GPT-4) rank in the 52nd percentile against humans, and overall LLMs excel in divergent thinking and problem solving but lag in creative writing. When questioned 10 times, an LLM's collective creativity is equivalent to 8-10 humans. When more responses are requested, two additional responses of LLMs equal one extra human. Ultimately, LLMs, when optimally applied, may compete with a small group of humans in the future of work.
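The two figures in the abstract (10 responses are roughly equivalent to 8-10 humans, and every two further responses add about one more human) can be turned into a rough back-of-the-envelope estimate. The sketch below is only an illustration: the function name `human_equivalent_range` and the assumption that the relationship stays linear beyond 10 responses are ours, not the paper's.

```python
# Rough estimate built ONLY from the two figures quoted in the abstract.
# Assumption (not from the paper): the "two extra responses = one extra human"
# rule is applied linearly for any number of responses above 10.

BASE_RESPONSES = 10            # abstract: "When questioned 10 times ..."
BASE_HUMANS = (8.0, 10.0)      # abstract: "... equivalent to 8-10 humans"
RESPONSES_PER_EXTRA_HUMAN = 2  # abstract: "two additional responses ... equal one extra human"


def human_equivalent_range(n_responses: int) -> tuple[float, float]:
    """Return a (low, high) estimate of the human group size matched by n_responses."""
    if n_responses < BASE_RESPONSES:
        # The abstract gives no figure below 10 responses, so we do not guess.
        raise ValueError("estimate is only defined for 10 or more responses")
    extra_humans = (n_responses - BASE_RESPONSES) / RESPONSES_PER_EXTRA_HUMAN
    return (BASE_HUMANS[0] + extra_humans, BASE_HUMANS[1] + extra_humans)


if __name__ == "__main__":
    for n in (10, 14, 20):
        low, high = human_equivalent_range(n)
        print(f"{n} responses: about {low:g}-{high:g} humans")
```

The function refuses inputs below 10 responses rather than extrapolating downward, since the abstract states nothing about that range.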

Related papers

How Deep is Love in LLMs' Hearts? Exploring Semantic Size in Human-like Cognition [75.11808682808065]
This study investigates whether large language models (LLMs) exhibit similar tendencies in understanding semantic size. Our findings reveal that multi-modal training is crucial for LLMs to achieve more human-like understanding. Lastly, we examine whether LLMs are influenced by attention-grabbing headlines with larger semantic sizes in a real-world web shopping scenario.
arXiv Detail & Related papers (2025-03-01T03:35:56Z)
We're Different, We're the Same: Creative Homogeneity Across LLMs [6.532204241949196]
Large language models (LLMs) are now available for use as writing support tools, idea generators, and beyond. Several works have shown that using an LLM as a creative partner results in a narrower set of creative outputs.
arXiv Detail & Related papers (2025-01-31T18:12:41Z)
A Causality-aware Paradigm for Evaluating Creativity of Multimodal Large Language Models [100.16387798660833]
The Oogiri game is a creativity-driven task requiring humor and associative thinking. LoTbench is an interactive, causality-aware evaluation framework. Results show that while most LLMs exhibit constrained creativity, the performance gap between LLMs and humans is not insurmountable.
arXiv Detail & Related papers (2025-01-25T09:11:15Z)
LLM+AL: Bridging Large Language Models and Action Languages for Complex Reasoning about Actions [7.575628120822444]
"LLM+AL" is a method that bridges the natural language understanding capabilities of LLMs with the symbolic reasoning strengths of action languages. We compare "LLM+AL" against state-of-the-art LLMs, including ChatGPT-4, Claude 3 Opus, Gemini Ultra 1.0, and o1-preview. Our findings indicate that, although all methods exhibit errors, LLM+AL, with relatively minimal human corrections, consistently leads to correct answers.
arXiv Detail & Related papers (2025-01-01T13:20:01Z)
Humanlike Cognitive Patterns as Emergent Phenomena in Large Language Models [2.9312156642007294]
We systematically review Large Language Models' capabilities across three important cognitive domains: decision-making biases, reasoning, and creativity. On decision-making, our synthesis reveals that while LLMs demonstrate several human-like biases, some biases observed in humans are absent. On reasoning, advanced LLMs like GPT-4 exhibit deliberative reasoning akin to human System-2 thinking, while smaller models fall short of human-level performance. A distinct dichotomy emerges in creativity: while LLMs excel in language-based creative tasks, such as storytelling, they struggle with divergent thinking tasks that require real-world context.
arXiv Detail & Related papers (2024-12-20T02:26:56Z)
Large Language Models Reflect the Ideology of their Creators [71.65505524599888]
Large language models (LLMs) are trained on vast amounts of data to generate natural language. This paper shows that the ideological stance of an LLM appears to reflect the worldview of its creators.
arXiv Detail & Related papers (2024-10-24T04:02:30Z)
WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents [55.64361927346957]
We propose a neurosymbolic approach to learn rules gradient-free through large language models (LLMs). Our embodied LLM agent "WALL-E" is built upon model-predictive control (MPC). On open-world challenges in Minecraft and ALFWorld, WALL-E achieves higher success rates than existing methods.
arXiv Detail & Related papers (2024-10-09T23:37:36Z)
Language Model Alignment in Multilingual Trolley Problems [138.5684081822807]
Building on the Moral Machine experiment, we develop a cross-lingual corpus of moral dilemma vignettes in over 100 languages called MultiTP. Our analysis explores the alignment of 19 different LLMs with human judgments, capturing preferences across six moral dimensions. We discover significant variance in alignment across languages, challenging the assumption of uniform moral reasoning in AI systems.
arXiv Detail & Related papers (2024-07-02T14:02:53Z)
Divergent Creativity in Humans and Large Language Models [37.67363469600804]
The recent surge in the capabilities of Large Language Models has led to claims that they are approaching a level of creativity akin to human capabilities. We leverage recent advances in creativity science to build a framework for in-depth analysis of divergent creativity in both state-of-the-art LLMs and a substantial dataset of 100,000 humans.
arXiv Detail & Related papers (2024-05-13T22:37:52Z)
Characterising the Creative Process in Humans and Large Language Models [6.363158395541767]
We provide an automated method to characterise how humans and LLMs explore semantic spaces on the Alternate Uses Task. We use sentence embeddings to identify response categories and compute semantic similarities, which we use to generate jump profiles. Our results corroborate earlier work in humans reporting both persistent (deep search in few semantic spaces) and flexible (broad search across multiple semantic spaces) pathways to creativity. Though LLMs as a population match human profiles, their relationship with creativity is different, where the more flexible models score higher on creativity.
arXiv Detail & Related papers (2024-05-01T23:06:46Z)
Limits of Large Language Models in Debating Humans [0.0]
Large Language Models (LLMs) have shown remarkable promise in their ability to interact proficiently with humans. This paper endeavors to test the limits of current-day LLMs with a pre-registered study integrating real people with LLM agents acting as people.
arXiv Detail & Related papers (2024-02-06T03:24:27Z)
Emotionally Numb or Empathetic? Evaluating How LLMs Feel Using EmotionBench [83.41621219298489]
We evaluate Large Language Models' (LLMs) anthropomorphic capabilities using the emotion appraisal theory from psychology. We collect a dataset containing over 400 situations that have proven effective in eliciting the eight emotions central to our study. We conduct a human evaluation involving more than 1,200 subjects worldwide.
arXiv Detail & Related papers (2023-08-07T15:18:30Z)
Can Large Language Models Transform Computational Social Science? [79.62471267510963]
Large Language Models (LLMs) are capable of performing many language processing tasks zero-shot (without training data). This work provides a road map for using LLMs as Computational Social Science tools.
arXiv Detail & Related papers (2023-04-12T17:33:28Z)
Are LLMs the Master of All Trades? : Exploring Domain-Agnostic Reasoning Skills of LLMs [0.0]
This study aims to investigate the performance of large language models (LLMs) on different reasoning tasks. My findings indicate that LLMs excel at analogical and moral reasoning, yet struggle to perform as proficiently on spatial reasoning tasks.
arXiv Detail & Related papers (2023-03-22T22:53:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.