Exploring the Structure of AI-Induced Language Change in Scientific English
- URL: http://arxiv.org/abs/2506.21817v1
- Date: Thu, 26 Jun 2025 23:44:24 GMT
- Title: Exploring the Structure of AI-Induced Language Change in Scientific English
- Authors: Riley Galpin, Bryce Anderson, Tom S. Juzek
- Abstract summary: We find that entire semantic clusters often shift together, with most or all words in a group increasing in usage. This pattern suggests that changes induced by Large Language Models are primarily semantic and pragmatic rather than purely lexical. Our analysis of "collapsing" words reveals a more complex picture, which is consistent with organic language change.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Scientific English has undergone rapid and unprecedented changes in recent years, with words such as "delve," "intricate," and "crucial" showing significant spikes in frequency since around 2022. These changes are widely attributed to the growing influence of Large Language Models like ChatGPT in the discourse surrounding bias and misalignment. However, apart from changes in frequency, the exact structure of these linguistic shifts has remained unclear. The present study addresses this and investigates whether these changes involve the replacement of synonyms by suddenly 'spiking words,' for example, "crucial" replacing "essential" and "key," or whether they reflect broader semantic and pragmatic qualifications. To further investigate structural changes, we include part of speech tagging in our analysis to quantify linguistic shifts over grammatical categories and differentiate between word forms, like "potential" as a noun vs. as an adjective. We systematically analyze synonym groups for widely discussed 'spiking words' based on frequency trends in scientific abstracts from PubMed. We find that entire semantic clusters often shift together, with most or all words in a group increasing in usage. This pattern suggests that changes induced by Large Language Models are primarily semantic and pragmatic rather than purely lexical. Notably, the adjective "important" shows a significant decline, which prompted us to systematically analyze decreasing lexical items. Our analysis of "collapsing" words reveals a more complex picture, which is consistent with organic language change and contrasts with the patterns of the abrupt spikes. These insights into the structure of language change contribute to our understanding of how language technology continues to shape human language.
Related papers
- Model Misalignment and Language Change: Traces of AI-Associated Language in Unscripted Spoken English [0.0]
In recent years, written language, particularly in science and education, has undergone remarkable shifts in word usage. Divergences between model output and target audience norms can be viewed as a form of misalignment. We constructed a dataset of 22.1 million words from unscripted spoken language drawn from conversational science and technology podcasts.
arXiv Detail & Related papers (2025-08-01T00:47:33Z) - Correlation Does Not Imply Compensation: Complexity and Irregularity in the Lexicon [48.00488140516432]
We find evidence of a positive relationship between morphological irregularity and phonotactic complexity within languages.
We also find weak evidence of a negative relationship between word length and morphological irregularity.
arXiv Detail & Related papers (2024-06-07T18:09:21Z) - Survey in Characterization of Semantic Change [0.1474723404975345]
Understanding the meaning of words is vital for interpreting texts from different cultures.
Semantic changes can potentially impact the quality of the outcomes of computational linguistics algorithms.
arXiv Detail & Related papers (2024-02-29T12:13:50Z) - Syntactic Language Change in English and German: Metrics, Parsers, and Convergences [56.47832275431858]
The current paper looks at diachronic trends in syntactic language change in both English and German, using corpora of parliamentary debates from the last c. 160 years.
We base our observations on five dependency parsers, including the widely used Stanford CoreNLP as well as 4 newer alternatives.
We show that changes in syntactic measures seem to be more frequent at the tails of sentence length distributions.
arXiv Detail & Related papers (2024-02-18T11:46:16Z) - Neighboring Words Affect Human Interpretation of Saliency Explanations [65.29015910991261]
Word-level saliency explanations are often used to communicate feature-attribution in text-based models.
Recent studies found that superficial factors such as word length can distort human interpretation of the communicated saliency scores.
We investigate how the marking of a word's neighboring words affects the explainee's perception of the word's importance in the context of a saliency explanation.
arXiv Detail & Related papers (2023-05-04T09:50:25Z) - Quantifying the Roles of Visual, Linguistic, and Visual-Linguistic Complexity in Verb Acquisition [8.183763443800348]
We employ visual and linguistic representations of words sourced from pre-trained artificial neural networks.
We find that the representation of verbs is generally more variable and less discriminable within domain than the representation of nouns.
Visual variability is the strongest factor that internally drives verb learning, followed by visual-linguistic alignment and linguistic variability.
arXiv Detail & Related papers (2023-04-05T15:08:21Z) - Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change [6.7485485663645495]
We first compare the performance of grammatical profiles against that of a multilingual neural language model (XLM-R) on 10 datasets, covering 7 languages.
Our results show that ensembling grammatical profiles with XLM-R improves semantic change detection performance for most datasets and languages.
arXiv Detail & Related papers (2022-04-12T11:20:42Z) - Slangvolution: A Causal Analysis of Semantic Change and Frequency Dynamics in Slang [18.609276255676175]
We study slang, an informal language that is typically restricted to a specific group or social setting.
We analyze the semantic change and frequency shift of slang words and compare them to those of standard, nonslang words.
We show that slang words undergo less semantic change but tend to have larger frequency shifts over time.
arXiv Detail & Related papers (2022-03-09T11:34:43Z) - Disambiguatory Signals are Stronger in Word-initial Positions [48.18148856974974]
We point out the confounds in existing methods for comparing the informativeness of segments early in the word versus later in the word.
We find evidence across hundreds of languages that indeed there is a cross-linguistic tendency to front-load information in words.
arXiv Detail & Related papers (2021-02-03T18:19:16Z) - Lexical semantic change for Ancient Greek and Latin [61.69697586178796]
Associating a word's correct meaning in its historical context is a central challenge in diachronic research.
We build on a recent computational approach to semantic change based on a dynamic Bayesian mixture model.
We provide a systematic comparison of dynamic Bayesian mixture models for semantic change with state-of-the-art embedding-based models.
arXiv Detail & Related papers (2021-01-22T12:04:08Z) - Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods [51.34667808471513]
We investigate the importance of two factors, semantic sparsity and frequency growth rates of semantic neighbors, formalized in the distributional semantics paradigm.
We show that both factors are predictive of word emergence, although we find more support for the latter hypothesis.
arXiv Detail & Related papers (2020-01-21T19:09:49Z)