Automatic Assessment of Divergent Thinking in Chinese Language with
TransDis: A Transformer-Based Language Model Approach
- URL: http://arxiv.org/abs/2306.14790v3
- Date: Sun, 24 Dec 2023 15:08:59 GMT
- Authors: Tianchen Yang, Qifan Zhang, Zhaoyang Sun, and Yubo Hou
- Abstract summary: The TransDis system is capable of providing valid originality (quality) and flexibility (variety) scores for Alternative Uses Task (AUT) responses in Chinese.
We offer an open platform to compute originality and flexibility for AUT responses in Chinese and over 50 other languages.
- Score: 4.389212459491442
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Language models have been increasingly popular for automatic creativity
assessment, generating semantic distances to objectively measure the quality of
creative ideas. However, there is currently a lack of an automatic assessment
system for evaluating creative ideas in the Chinese language. To address this
gap, we developed TransDis, a scoring system using transformer-based language
models, capable of providing valid originality (quality) and flexibility
(variety) scores for Alternative Uses Task (AUT) responses in Chinese. Study 1
demonstrated that the latent model-rated originality factor, comprised of three
transformer-based models, strongly predicted human originality ratings, and the
model-rated flexibility strongly correlated with human flexibility ratings as
well. Criterion validity analyses indicated that model-rated originality and
flexibility positively correlated to other creativity measures, demonstrating
similar validity to human ratings. Study 2 & 3 showed that TransDis effectively
distinguished participants instructed to provide creative vs. common uses
(Study 2) and participants instructed to generate ideas in a flexible vs.
persistent way (Study 3). Our findings suggest that TransDis can be a reliable
and low-cost tool for measuring idea originality and flexibility in the Chinese
language, potentially paving the way for automatic creativity assessment in
other languages. We offer an open platform to compute originality and
flexibility for AUT responses in Chinese and over 50 other languages
(https://osf.io/59jv2/).
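The core idea behind semantic-distance scoring of AUT responses can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy vectors below stand in for embeddings that would come from a multilingual transformer model, and the greedy clustering with an arbitrary distance threshold is only one simple way to operationalize flexibility as the number of distinct semantic categories among a participant's responses.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors.
    Higher values mean the two ideas are more semantically distant."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def originality_scores(prompt_vec, response_vecs):
    """Originality of each response = semantic distance to the prompt object
    (e.g., 'brick'). In practice the vectors would be transformer embeddings."""
    return [cosine_distance(prompt_vec, r) for r in response_vecs]

def centroid(vectors):
    """Element-wise mean of a list of vectors."""
    return [sum(dims) / len(vectors) for dims in zip(*vectors)]

def flexibility_score(response_vecs, threshold=0.5):
    """Greedy semantic clustering: a response joins the first cluster whose
    centroid lies within `threshold` distance, otherwise it starts a new
    cluster. Flexibility = number of clusters. The threshold value is an
    illustrative assumption, not a calibrated parameter."""
    clusters = []
    for v in response_vecs:
        for c in clusters:
            if cosine_distance(centroid(c), v) < threshold:
                c.append(v)
                break
        else:
            clusters.append([v])
    return len(clusters)
```

With embeddings from any sentence-level encoder substituted for the toy vectors, a response nearly identical to the prompt scores close to 0 on originality, an orthogonal one close to 1, and responses falling into two semantic groups yield a flexibility score of 2.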
Related papers
- COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes [83.84578306665976]
Large language models exhibit systematic deficiencies in creative writing, particularly in non-English contexts.
We present COIG-Writer, a novel Chinese creative writing dataset that captures both diverse outputs and their underlying thought processes.
arXiv Detail & Related papers (2025-10-16T15:01:19Z)
- S-DAT: A Multilingual, GenAI-Driven Framework for Automated Divergent Thinking Assessment [23.509294903995745]
This paper introduces S-DAT (Synthetic-Divergent Association Task), a scalable, multilingual framework for the automated assessment of divergent thinking (DT).
We evaluate S-DAT across eleven diverse languages, including English, Spanish, German, Russian, Hindi, and Japanese (Kanji, Hiragana, Katakana).
Unlike prior DAT approaches, S-DAT shows convergent validity with other DT measures and appropriate discriminant validity with convergent thinking.
arXiv Detail & Related papers (2025-05-14T02:08:40Z)
- Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach [32.654673913638426]
We propose an automated evaluation method based on the Torrance Test of Creative Writing (TTCW), which evaluates creativity as a product.
Our method employs a reference-based Likert-style approach, scoring generated creative texts relative to high-quality reference texts.
arXiv Detail & Related papers (2025-04-22T10:52:23Z)
- Automatically Generating Chinese Homophone Words to Probe Machine Translation Estimation Systems [6.213698466889738]
We introduce a novel method, inspired by information theory, that generates challenging Chinese homophone words related to emotions.
Our approach generates homophones observed to cause translation errors in emotion preservation, exposing vulnerabilities in machine translation systems.
We evaluate the efficacy of our method using human evaluation of the quality of the generated homophones and compare it with an existing approach.
arXiv Detail & Related papers (2025-03-20T13:56:15Z)
- Thinking Outside the (Gray) Box: A Context-Based Score for Assessing Value and Originality in Neural Text Generation [2.4555276449137042]
Large language models used for creative tasks often lack diversity.
Common solutions, such as sampling at higher temperatures, can compromise the quality of the results.
We propose a context-based score to quantitatively evaluate value and originality.
arXiv Detail & Related papers (2025-02-18T19:00:01Z)
- Inductive Linguistic Reasoning with Large Language Models [0.0]
We investigate the ability of large language models to perform abstract multilingual reasoning through the lens of linguistic puzzles.
We employ a two-stage procedure: first generating analogical exemplars with a language model, then applying them in-context.
Our results on the modeLing dataset show that analogical prompting is effective in eliciting models' knowledge of grammatical similarities across languages.
arXiv Detail & Related papers (2024-12-09T03:37:11Z)
- Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
We focus on the trustworthiness of language models with respect to retrieval augmentation.
We argue that retrieval-augmented language models have the inherent capability to supply responses according to both contextual and parametric knowledge.
Inspired by aligning language models with human preferences, we take a first step toward aligning retrieval-augmented language models to a state in which they respond relying solely on external evidence.
arXiv Detail & Related papers (2024-10-22T09:25:21Z)
- Dual-Layer Training and Decoding of Large Language Model with Simultaneously Thinking and Speaking [8.02728252625147]
Large language models can reasonably understand and generate human expressions but may lack thorough thinking and reasoning mechanisms.
In this paper, motivated by cognitive mechanisms in the natural world, we design a novel model architecture called TaS.
We train the language model on thought-augmented data, letting the thinking layer automatically generate reasonable thoughts and ultimately output more reasonable responses.
arXiv Detail & Related papers (2024-09-18T15:32:48Z)
- Creativity Has Left the Chat: The Price of Debiasing Language Models [1.223779595809275]
We investigate the unintended consequences of Reinforcement Learning from Human Feedback on the creativity of Large Language Models (LLMs).
Our findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation.
arXiv Detail & Related papers (2024-06-08T22:14:51Z)
- CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z)
- SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation [54.66399120084227]
Language models trained on large-scale corpora can generate remarkably fluent results in open-domain dialogue.
For the persona-based dialogue generation task, consistency and coherence are great challenges for language models.
A two-stage SimOAP strategy is proposed, i.e., over-sampling and post-evaluation.
arXiv Detail & Related papers (2023-05-18T17:23:00Z)
- Democratizing Ethical Assessment of Natural Language Generation Models [0.0]
Natural language generation models are computer systems that generate coherent language when prompted with a sequence of words as context.
Despite their ubiquity and many beneficial applications, language generation models also have the potential to inflict social harms.
Ethical assessment of these models is therefore critical.
This article introduces a new tool to democratize and standardize the ethical assessment of natural language generation models.
arXiv Detail & Related papers (2022-06-30T12:20:31Z)
- Language Model Evaluation Beyond Perplexity [47.268323020210175]
We analyze whether text generated by language models exhibits the statistical tendencies present in the human-generated text on which they were trained.
We find that neural language models appear to learn only a subset of the tendencies considered, but align much more closely with empirical trends than with proposed theoretical distributions.
arXiv Detail & Related papers (2021-05-31T20:13:44Z)
- TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose a multilingual robustness evaluation platform for NLP tasks (TextFlint).
It incorporates universal text transformations, task-specific transformations, adversarial attacks, subpopulations, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address shortcomings in a model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
- Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition [80.446770909975]
Linguistic knowledge is of great benefit to scene text recognition.
How to effectively model linguistic rules in end-to-end deep networks remains a research challenge.
We propose an autonomous, bidirectional, and iterative ABINet for scene text recognition.
arXiv Detail & Related papers (2021-03-11T06:47:45Z)
- Knowledge-Grounded Dialogue Generation with Pre-trained Language Models [74.09352261943911]
We study knowledge-grounded dialogue generation with pre-trained language models.
We propose equipping response generation, defined by a pre-trained language model, with a knowledge selection module.
arXiv Detail & Related papers (2020-10-17T16:49:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.