Can Good Writing Be Generative? Expert-Level AI Writing Emerges through Fine-Tuning on High-Quality Books
- URL: http://arxiv.org/abs/2601.18353v1
- Date: Mon, 26 Jan 2026 10:59:21 GMT
- Title: Can Good Writing Be Generative? Expert-Level AI Writing Emerges through Fine-Tuning on High-Quality Books
- Authors: Tuhin Chakrabarty, Paramveer S. Dhillon,
- Abstract summary: Generative AI can emulate thousands of author styles in seconds with negligible marginal labor.<n>Experts preferred human writing in 82.7% of cases under the in-context prompting condition.<n>Lay judges, however, consistently preferred AI writing.
- Score: 7.522123492613126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Creative writing has long been considered a uniquely human endeavor, requiring voice and style that machines could not replicate. This assumption is challenged by Generative AI that can emulate thousands of author styles in seconds with negligible marginal labor. To understand this better, we conducted a behavioral experiment where 28 MFA writers (experts) competed against three LLMs in emulating 50 critically acclaimed authors. Based on blind pairwise comparisons by 28 expert judges and 131 lay judges, we find that experts preferred human writing in 82.7% of cases under the in-context prompting condition but this reversed to 62% preference for AI after fine-tuning on authors' complete works. Lay judges, however, consistently preferred AI writing. Debrief interviews with expert writers revealed that their preference for AI writing triggered an identity crisis, eroding aesthetic confidence and questioning what constitutes "good writing." These findings challenge discourse about AI's creative limitations and raise fundamental questions about the future of creative labor.
Related papers
- Can professional translators identify machine-generated text? [0.0]
This study investigates whether professional translators can reliably identify short stories generated in Italian by artificial intelligence (AI) without prior specialized training.<n>Sixty-nine translators took part in an in-person experiment, where they assessed three anonymized short stories.<n>Low burstiness and narrative contradiction emerged as the most reliable indicators of synthetic authorship.
arXiv Detail & Related papers (2026-01-22T10:25:52Z) - Readers Prefer Outputs of AI Trained on Copyrighted Books over Expert Human Writers [8.031052360107092]
It's unclear if frontier AI models can generate high quality literary text while emulating authors' styles.<n>We compare MFA-trained expert writers with three frontier AI models: ChatGPT, Claude & Gemini in writing up to 450 word excerpts emulating 50 award-winning authors' diverse styles.
arXiv Detail & Related papers (2025-10-15T17:51:58Z) - Precarity and Solidarity: Preliminary results on a study of queer and disabled fiction writers' experiences with generative AI [0.0]
We find that queer and disabled writers are markedly more pessimistic than non-queer and non-disabled writers about the impact of AI on their industry.<n>We explore ways that generative AI exacerbates existing sources of instability and precarity in the publishing industry.
arXiv Detail & Related papers (2024-12-05T19:41:30Z) - "It was 80% me, 20% AI": Seeking Authenticity in Co-Writing with Large Language Models [97.22914355737676]
We examine whether and how writers want to preserve their authentic voice when co-writing with AI tools.
Our findings illuminate conceptions of authenticity in human-AI co-creation.
Readers' responses showed less concern about human-AI co-writing.
arXiv Detail & Related papers (2024-11-20T04:42:32Z) - AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text [53.15652021126663]
We present CREATIVITY INDEX as the first step to quantify the linguistic creativity of a text.<n>To compute CREATIVITY INDEX efficiently, we introduce DJ SEARCH, a novel dynamic programming algorithm.<n>Experiments reveal that the CREATIVITY INDEX of professional human authors is on average 66.2% higher than that of LLMs.
arXiv Detail & Related papers (2024-10-05T18:55:01Z) - Human Bias in the Face of AI: Examining Human Judgment Against Text Labeled as AI Generated [48.70176791365903]
This study explores how bias shapes the perception of AI versus human generated content.<n>We investigated how human raters respond to labeled and unlabeled content.
arXiv Detail & Related papers (2024-09-29T04:31:45Z) - The Great AI Witch Hunt: Reviewers Perception and (Mis)Conception of Generative AI in Research Writing [36.188062803005515]
Generative AI (GenAI) use in research writing is growing fast.<n>It is unclear how peer reviewers recognize or misjudge AI-augmented manuscripts.<n>Our findings indicate that while AI-augmented writing improves readability, language diversity, and informativeness, it often lacks research details and reflective insights from authors.
arXiv Detail & Related papers (2024-06-27T02:38:25Z) - PaperCard for Reporting Machine Assistance in Academic Writing [48.33722012818687]
ChatGPT, a question-answering system released by OpenAI in November 2022, has demonstrated a range of capabilities that could be utilised in producing academic papers.
This raises critical questions surrounding the concept of authorship in academia.
We propose a framework we name "PaperCard", a documentation for human authors to transparently declare the use of AI in their writing process.
arXiv Detail & Related papers (2023-10-07T14:28:04Z) - The Role of AI in Drug Discovery: Challenges, Opportunities, and
Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z) - Can Machines Imitate Humans? Integrative Turing-like tests for Language and Vision Demonstrate a Narrowing Gap [56.611702960809644]
We benchmark AI's ability to imitate humans in three language tasks and three vision tasks.<n>Next, we conducted 72,191 Turing-like tests with 1,916 human judges and 10 AI judges.<n>Imitation ability showed minimal correlation with conventional AI performance metrics.
arXiv Detail & Related papers (2022-11-23T16:16:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.