The creative psychometric item generator: a framework for item generation and validation using large language models
- URL: http://arxiv.org/abs/2409.00202v1
- Date: Fri, 30 Aug 2024 18:31:02 GMT
- Authors: Antonio Laverghetta Jr., Simone Luchini, Averie Linell, Roni Reiter-Palmon, Roger Beaty
- Abstract summary: Large language models (LLMs) are being used to automate workplace processes requiring a high degree of creativity.
We develop a psychometrically inspired framework for creating test items for a classic free-response creativity test: the creative problem-solving (CPS) task.
We find strong empirical evidence that CPIG generates valid and reliable items and that this effect is not attributable to known biases in the evaluation process.
- Score: 1.765099515298011
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Increasingly, large language models (LLMs) are being used to automate workplace processes requiring a high degree of creativity. While much prior work has examined the creativity of LLMs, there has been little research on whether they can generate valid creativity assessments for humans despite the increasingly central role of creativity in modern economies. We develop a psychometrically inspired framework for creating test items (questions) for a classic free-response creativity test: the creative problem-solving (CPS) task. Our framework, the creative psychometric item generator (CPIG), uses a mixture of LLM-based item generators and evaluators to iteratively develop new prompts for writing CPS items, such that items from later iterations will elicit more creative responses from test takers. We find strong empirical evidence that CPIG generates valid and reliable items and that this effect is not attributable to known biases in the evaluation process. Our findings have implications for employing LLMs to automatically generate valid and reliable creativity tests for humans and AI.
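The abstract describes an iterative generate-and-evaluate loop but gives no implementation details. The following is a minimal sketch of such a loop, assuming a hypothetical `llm` client with a `.generate(prompt)` method and a trivial stand-in for the LLM-based creativity evaluator; none of these names or choices come from the paper itself.

```python
# Minimal sketch of a CPIG-style generate-evaluate-reprompt loop.
# Assumptions (not from the paper): `llm` exposes .generate(prompt) -> str,
# and score_creativity stands in for the LLM-based evaluator.

def score_creativity(item: str) -> float:
    """Stand-in evaluator: CPIG scores the creativity of responses elicited
    by an item; lexical diversity is used here purely as a placeholder."""
    words = item.split()
    return len(set(words)) / max(len(words), 1)

def cpig_loop(llm, seed_prompt: str, iterations: int = 5, pool_size: int = 10) -> list:
    prompt = seed_prompt
    best_items = []
    for _ in range(iterations):
        # Generator: draft a pool of candidate CPS items from the current prompt.
        items = [llm.generate(prompt) for _ in range(pool_size)]
        # Evaluator: rank items by the creativity they elicit.
        items.sort(key=score_creativity, reverse=True)
        best_items = items[: pool_size // 2]
        # Reprompt: feed the strongest items back as exemplars so later
        # iterations elicit more creative responses, as the abstract describes.
        prompt = seed_prompt + "\n\nExamples of strong items:\n" + "\n".join(best_items)
    return best_items
```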
Related papers
- A Framework for Collaborating a Large Language Model Tool in Brainstorming for Triggering Creative Thoughts [2.709166684084394]
This study proposes a framework called GPS, which employs goals, prompts, and strategies to guide designers to systematically work with an LLM tool for improving the creativity of ideas generated during brainstorming.
Our framework, tested through a design example and a case study, demonstrates its effectiveness in stimulating creativity and its seamless LLM tool integration into design practices.
arXiv Detail & Related papers (2024-10-10T13:39:27Z)
- Initial Development and Evaluation of the Creative Artificial Intelligence through Recurring Developments and Determinations (CAIRDD) System [0.0]
Large language models (LLMs) provide a facsimile of creativity and the appearance of sentience, while not actually being either creative or sentient.
This paper proposes a technique for enhancing LLM output creativity via an iterative process of concept injection and refinement; a schematic sketch of such a loop follows this entry.
arXiv Detail & Related papers (2024-09-03T21:04:07Z)
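One schematic reading of "concept injection and refinement" is the loop below; the seed concepts, prompt wording, and `llm.generate` interface are all illustrative assumptions, not details from the CAIRDD paper.

```python
# Illustrative concept-injection loop; all prompts and names are assumptions.
import random

SEED_CONCEPTS = ["origami", "tides", "jazz improvisation", "beehives"]

def inject_and_refine(llm, draft: str, rounds: int = 3) -> str:
    """Repeatedly inject an unrelated concept and ask the model to refine."""
    for _ in range(rounds):
        concept = random.choice(SEED_CONCEPTS)
        draft = llm.generate(
            f"Revise the following text so it creatively incorporates the "
            f"concept '{concept}' while preserving the original intent:\n\n{draft}"
        )
    return draft
```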
- Creativity and Markov Decision Processes [0.20482269513546453]
We identify formal mappings between Boden's process theory of creativity and Markov Decision Processes (MDPs).
We study three out of eleven mappings in detail to understand which types of creative processes, opportunities for aberrations, and threats to creativity (uninspiration) could be observed in an MDP; a minimal MDP definition follows this entry.
We conclude by discussing quality criteria for the selection of such mappings for future work and applications.
arXiv Detail & Related papers (2024-05-23T18:16:42Z)
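For readers unfamiliar with the formalism, an MDP is the tuple (S, A, T, R, γ); the mapping of Boden's concepts onto these components is the paper's contribution and is not reproduced here. A minimal definition for reference:

```python
# Minimal MDP container; the creativity mappings themselves live in the paper.
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class MDP:
    states: Sequence[str]                          # S: the agent's state space
    actions: Sequence[str]                         # A: available actions
    transition: Callable[[str, str, str], float]   # T(s, a, s') -> probability
    reward: Callable[[str, str], float]            # R(s, a) -> scalar reward
    gamma: float = 0.95                            # discount factor
```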
- CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation [4.664950672096393]
Large Language Models (LLMs) have proved effective and efficient in generating code.
CreativEval is a framework for evaluating the creativity of LLMs within the context of generating hardware designs.
arXiv Detail & Related papers (2024-04-12T20:41:47Z)
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is a large language model-powered research idea writing agent.
It generates problems, methods, and experiment designs while iteratively refining them based on scientific literature.
We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z)
- Assessing and Understanding Creativity in Large Language Models [33.37237667182931]
This paper aims to establish an efficient framework for assessing the level of creativity in large language models (LLMs).
By adapting the Torrance Tests of Creative Thinking, the research evaluates the creative performance of various LLMs across 7 tasks.
We found that the creativity of LLMs primarily falls short in originality, while excelling in elaboration.
arXiv Detail & Related papers (2024-01-23T05:19:47Z)
- Can AI Be as Creative as Humans? [84.43873277557852]
We prove in theory that AI can be as creative as humans under the condition that it can properly fit the data generated by human creators.
This reduces the debate on AI's creativity to the question of whether it can fit a sufficient amount of data.
arXiv Detail & Related papers (2024-01-03T08:49:12Z)
- Art or Artifice? Large Language Models and the False Promise of Creativity [53.04834589006685]
We propose the Torrance Test of Creative Writing (TTCW) to evaluate creativity as a product.
TTCW consists of 14 binary tests organized into the original dimensions of Fluency, Flexibility, Originality, and Elaboration.
Our analysis shows that LLM-generated stories pass 3-10x fewer TTCW tests than stories written by professionals; a sketch of this pass-count scoring follows this entry.
arXiv Detail & Related papers (2023-09-25T22:02:46Z)
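A hedged sketch of pass counting over binary tests grouped by dimension; the four dimensions come from the abstract, but the example tests below are trivial placeholders (the real TTCW tests are expert judgments, not string heuristics).

```python
# TTCW-style pass counting: binary tests grouped by dimension.
from typing import Callable, Dict, List

Test = Callable[[str], bool]

def ttcw_pass_counts(story: str, tests: Dict[str, List[Test]]) -> Dict[str, int]:
    """Count how many tests the story passes in each dimension."""
    return {dim: sum(t(story) for t in dim_tests) for dim, dim_tests in tests.items()}

# Placeholder tests only; real TTCW items are administered by expert judges.
tests: Dict[str, List[Test]] = {
    "Fluency": [lambda s: len(s.split()) > 100],
    "Flexibility": [lambda s: s.count("\n") > 2],
    "Originality": [lambda s: len(set(s.split())) / max(len(s.split()), 1) > 0.5],
    "Elaboration": [lambda s: any(len(w) > 8 for w in s.split())],
}
```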
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing [139.77117915309023]
CRITIC allows large language models to validate and amend their own outputs in a manner similar to human interaction with tools.
Comprehensive evaluations involving free-form question answering, mathematical program synthesis, and toxicity reduction demonstrate that CRITIC consistently enhances the performance of LLMs; a schematic verify-then-correct loop follows this entry.
arXiv Detail & Related papers (2023-05-19T15:19:44Z)
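The "validate and amend" pattern suggests a verify-then-correct loop; the sketch below assumes a hypothetical `llm.generate` method and a `tool_check` callback standing in for an external tool (search engine, code interpreter, toxicity API), none of which are the paper's actual interfaces.

```python
# Schematic verify-then-correct loop in the spirit of CRITIC; names assumed.
from typing import Callable, Optional

def critic_loop(llm, question: str,
                tool_check: Callable[[str, str], Optional[str]],
                max_rounds: int = 3) -> str:
    answer = llm.generate(question)
    for _ in range(max_rounds):
        critique = tool_check(question, answer)   # tool-grounded feedback
        if critique is None:                      # verification passed; stop
            break
        answer = llm.generate(
            f"Question: {question}\nDraft answer: {answer}\n"
            f"Tool feedback: {critique}\nRevise the answer to address the feedback."
        )
    return answer
```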
- Towards Creativity Characterization of Generative Models via Group-based Subset Scanning [64.6217849133164]
We propose group-based subset scanning to identify, quantify, and characterize creative processes.
We find that creative samples generate larger subsets of anomalies than normal or non-creative samples across datasets; a simplified scan sketch follows this entry.
arXiv Detail & Related papers (2022-03-01T15:07:14Z)
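To give a flavor of subset scanning, the sketch below scores prefixes of sorted per-sample p-values with a Berk-Jones-style statistic and keeps the highest-scoring subset; this is a textbook simplification, not the paper's group-based method over model activations.

```python
# Simplified nonparametric scan: find the prefix of sorted p-values that
# maximizes a Berk-Jones-style statistic. Illustrative only.
import math

def berk_jones(n_sig: int, k: int, alpha: float) -> float:
    """k * KL(observed significance rate || alpha) for a subset of size k."""
    obs = n_sig / k
    if obs <= alpha:
        return 0.0
    rest = 0.0 if obs >= 1.0 else (1 - obs) * math.log((1 - obs) / (1 - alpha))
    return k * (obs * math.log(obs / alpha) + rest)

def scan(pvalues: list, alpha: float = 0.05):
    """Return the best score and the indices of the most anomalous subset."""
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    best_score, best_subset = 0.0, []
    for k in range(1, len(order) + 1):
        n_sig = sum(pvalues[i] <= alpha for i in order[:k])
        score = berk_jones(n_sig, k, alpha)
        if score > best_score:
            best_score, best_subset = score, order[:k]
    return best_score, best_subset
```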
- Towards creativity characterization of generative models via group-based subset scanning [51.84144826134919]
We propose group-based subset scanning to quantify, detect, and characterize creative processes.
Creative samples generate larger subsets of anomalies than normal or non-creative samples across datasets.
arXiv Detail & Related papers (2021-04-01T14:07:49Z)