Examining Linguistic Shifts in Academic Writing Before and After the Launch of ChatGPT: A Study on Preprint Papers
- URL: http://arxiv.org/abs/2505.12218v1
- Date: Sun, 18 May 2025 03:35:43 GMT
- Title: Examining Linguistic Shifts in Academic Writing Before and After the Launch of ChatGPT: A Study on Preprint Papers
- Authors: Tong Bao, Yi Zhao, Jin Mao, Chengzhi Zhang
- Abstract summary: Large Language Models (LLMs) have prompted academic concerns about their impact on academic writing. We conducted a large-scale analysis across 823,798 abstracts published over the last decade in the arXiv dataset.
- Score: 7.161475868971293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs), such as ChatGPT, have prompted academic concerns about their impact on academic writing. Existing studies have primarily examined LLM usage in academic writing through quantitative approaches, such as word frequency statistics and probability-based analyses. However, few have systematically examined the potential impact of LLMs on the linguistic characteristics of academic writing. To address this gap, we conducted a large-scale analysis across 823,798 abstracts published over the last decade in the arXiv dataset. Through the linguistic analysis of features such as the frequency of LLM-preferred words, lexical complexity, syntactic complexity, cohesion, readability, and sentiment, the results indicate a significant increase in the proportion of LLM-preferred words in abstracts, revealing the widespread influence of LLMs on academic writing. Additionally, we observed an increase in lexical complexity and sentiment in the abstracts, but a decrease in syntactic complexity, suggesting that LLMs introduce more new vocabulary and simplify sentence structure. However, the significant decrease in cohesion and readability indicates that abstracts have fewer connecting words and are becoming more difficult to read. Moreover, our analysis reveals that scholars with weaker English proficiency were more likely to use LLMs for academic writing and focused on improving the overall logic and fluency of the abstracts. Finally, at the discipline level, we found that scholars in Computer Science showed more pronounced changes in writing style, while the changes in Mathematics were minimal.
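The abstract lists the features measured per paper: LLM-preferred word frequency, lexical and syntactic complexity, cohesion, readability, and sentiment. As a minimal sketch of how two of these could be computed over a corpus of abstracts, the snippet below estimates the LLM-preferred word rate and the Flesch reading ease score. The word list, function names, and syllable heuristic are illustrative assumptions, not the authors' actual implementation.

```python
import re

# A few words prior work has flagged as overrepresented in LLM output
# (illustrative subset, not the study's actual list).
LLM_PREFERRED = {"delve", "intricate", "showcase", "pivotal", "realm"}

def tokenize(text):
    """Lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def llm_word_rate(text):
    """Fraction of tokens drawn from the LLM-preferred word list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in LLM_PREFERRED)
    return hits / len(tokens)

def count_syllables(word):
    """Crude vowel-group syllable estimate (adequate for trend analysis)."""
    return max(1, len(re.findall(r"[aeiouy]+", word)))

def flesch_reading_ease(text):
    """Flesch reading ease: higher scores mean easier-to-read text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

abstract = "We delve into intricate patterns. The results are pivotal."
print(round(llm_word_rate(abstract), 3))        # → 0.333
print(round(flesch_reading_ease(abstract), 1))  # → 23.7
```

Tracking the mean of such scores per publication month, before versus after ChatGPT's release, is the kind of longitudinal comparison the study describes.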
Related papers
- Divergent LLM Adoption and Heterogeneous Convergence Paths in Research Writing [0.8046044493355781]
Large Language Models (LLMs) are reshaping content creation and academic writing. This study investigates the impact of AI-assisted generative revisions on research manuscripts.
arXiv Detail & Related papers (2025-04-18T11:09:16Z)
- ChatGPT as Linguistic Equalizer? Quantifying LLM-Driven Lexical Shifts in Academic Writing [2.0117661599862164]
This study investigates whether ChatGPT mitigates barriers and fosters equity by analyzing lexical complexity shifts across 2.8 million articles from OpenAlex (2020-2024). We demonstrate that ChatGPT significantly enhances lexical complexity in NNES-authored abstracts, even after controlling for article-level factors, authorship patterns, and venue norms. These findings provide causal evidence that ChatGPT reduces linguistic disparities and promotes equity in global academia.
arXiv Detail & Related papers (2025-04-10T14:11:24Z)
- Linguistic Blind Spots of Large Language Models [14.755831733659699]
We study the performance of recent large language models (LLMs) on linguistic annotation tasks. We find that recent LLMs show limited efficacy in addressing linguistic queries and often struggle with linguistically complex inputs. Our results provide insights to inform future advancements in LLM design and development.
arXiv Detail & Related papers (2025-03-25T01:47:13Z)
- Human-LLM Coevolution: Evidence from Academic Writing [0.0]
We report a marked drop in the frequency of several words previously identified as overused by ChatGPT, such as "delve". The frequency of certain other words favored by ChatGPT, such as "significant", has instead kept increasing.
arXiv Detail & Related papers (2025-02-13T18:55:56Z)
- Transforming Scholarly Landscapes: Influence of Large Language Models on Academic Fields beyond Computer Science [77.31665252336157]
Large Language Models (LLMs) have ushered in a transformative era in Natural Language Processing (NLP).
This work empirically examines the influence and use of LLMs in fields beyond NLP.
arXiv Detail & Related papers (2024-09-29T01:32:35Z)
- Inclusivity in Large Language Models: Personality Traits and Gender Bias in Scientific Abstracts [49.97673761305336]
We evaluate three large language models (LLMs) for their alignment with human narrative styles and potential gender biases.
Our findings indicate that, while these models generally produce text closely resembling human-authored content, variations in stylistic features suggest significant gender biases.
arXiv Detail & Related papers (2024-06-27T19:26:11Z)
- Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism [62.571419297164645]
This paper provides a systematic overview of prior works on the logical reasoning ability of large language models for analyzing categorical syllogisms. We first investigate all the possible variations of categorical syllogisms from a purely logical perspective. We then examine the underlying configurations (i.e., mood and figure) tested by the existing datasets.
arXiv Detail & Related papers (2024-06-26T21:17:20Z)
- PhonologyBench: Evaluating Phonological Skills of Large Language Models [57.80997670335227]
Phonology, the study of speech's structure and pronunciation rules, is a critical yet often overlooked component in Large Language Model (LLM) research.
We present PhonologyBench, a novel benchmark consisting of three diagnostic tasks designed to explicitly test the phonological skills of LLMs.
We observe a significant gap of 17% and 45% on Rhyme Word Generation and Syllable Counting, respectively, when compared to humans.
arXiv Detail & Related papers (2024-04-03T04:53:14Z)
- Mapping the Increasing Use of LLMs in Scientific Papers [99.67983375899719]
We conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals.
Our findings reveal a steady increase in LLM usage, with the largest and fastest growth observed in Computer Science papers.
arXiv Detail & Related papers (2024-04-01T17:45:15Z)
- MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations [37.13707912132472]
Humans possess a remarkable ability to assign novel interpretations to linguistic expressions.
Large Language Models (LLMs) have a knowledge cutoff and are costly to finetune repeatedly.
We systematically analyse the ability of LLMs to acquire novel interpretations using in-context learning.
arXiv Detail & Related papers (2023-10-18T00:02:38Z)
- Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)