"You Cannot Sound Like GPT": Signs of language discrimination and resistance in computer science publishing
- URL: http://arxiv.org/abs/2505.08127v1
- Date: Mon, 12 May 2025 23:58:41 GMT
- Title: "You Cannot Sound Like GPT": Signs of language discrimination and resistance in computer science publishing
- Authors: Haley Lepp, Daniel Scott Smith
- Abstract summary: We examine how peer reviewers critique writing clarity. We find significant bias against authors associated with institutions in countries where English is less widely spoken. We see only a muted shift in the expression of this bias after the introduction of ChatGPT in late 2022.
- Score: 1.4579344926652844
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: LLMs have been celebrated for their potential to help multilingual scientists publish their research. Rather than interpret LLMs as a solution, we hypothesize their adoption can be an indicator of existing linguistic exclusion in scientific writing. Using the case study of ICLR, an influential, international computer science conference, we examine how peer reviewers critique writing clarity. Analyzing almost 80,000 peer reviews, we find significant bias against authors associated with institutions in countries where English is less widely spoken. We see only a muted shift in the expression of this bias after the introduction of ChatGPT in late 2022. To investigate this unexpectedly minor change, we conduct interviews with 14 conference participants from across five continents. Peer reviewers describe associating certain features of writing with people of certain language backgrounds, and such groups in turn with the quality of scientific work. While ChatGPT masks some signs of language background, reviewers explain that they now use ChatGPT "style" and non-linguistic features as indicators of author demographics. Authors, aware of this development, described the ongoing need to remove features which could expose their "non-native" status to reviewers. Our findings offer insight into the role of ChatGPT in the reproduction of scholarly language ideologies which conflate producers of "good English" with producers of "good science."
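To make the abstract's analysis concrete, below is a minimal sketch of how one might test whether writing-clarity critiques in peer reviews are associated with authors' institutional country. The keyword list, column names, and the `reviews.csv` file are illustrative assumptions for this sketch, not the authors' actual method, which the paper describes in full.

```python
# Hypothetical sketch: testing whether clarity critiques in peer reviews
# are associated with authors' institutional country. The keyword cues and
# CSV schema below are assumptions, not the paper's actual pipeline.
import pandas as pd
import statsmodels.formula.api as smf

CLARITY_CUES = ["clarity", "grammar", "proofread", "native", "writing quality"]

def mentions_clarity(review_text: str) -> bool:
    """Flag reviews that critique writing clarity via simple keyword matching."""
    text = review_text.lower()
    return any(cue in text for cue in CLARITY_CUES)

# Assumed schema: one row per review, with an indicator for whether the
# authors' institution is in a country where English is less widely spoken.
reviews = pd.read_csv("reviews.csv")  # columns: review_text, low_english_country, year
reviews["clarity_critique"] = reviews["review_text"].map(mentions_clarity).astype(int)

# Logistic regression: does institution-country predict clarity critiques,
# controlling for year (e.g., before vs. after ChatGPT's late-2022 release)?
model = smf.logit("clarity_critique ~ low_english_country + C(year)", data=reviews).fit()
print(model.summary())
```

A significant positive coefficient on `low_english_country` in such a model would be consistent with the bias the paper reports; the paper itself analyzes almost 80,000 ICLR reviews and supplements the quantitative analysis with interviews.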
Related papers
- Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study [59.30098850050971]
This work evaluates LLM prompting-based detection across eight non-English languages. We show that while zero-shot and few-shot prompting lag behind fine-tuned encoder models on most of the real-world evaluation sets, they achieve better generalization on functional tests for hate speech detection.
arXiv Detail & Related papers (2025-05-09T16:00:01Z) - ChatGPT as Linguistic Equalizer? Quantifying LLM-Driven Lexical Shifts in Academic Writing [2.0117661599862164]
This study investigates whether ChatGPT mitigates barriers and fosters equity by analyzing lexical complexity shifts across 2.8 million articles from OpenAlex (2020-2024). We demonstrate that ChatGPT significantly enhances lexical complexity in NNES-authored abstracts, even after adjusting for article-level characteristics, authorship patterns, and venue norms. These findings provide causal evidence that ChatGPT reduces linguistic disparities and promotes equity in global academia.
arXiv Detail & Related papers (2025-04-10T14:11:24Z) - Mind the Gap! Choice Independence in Using Multilingual LLMs for Persuasive Co-Writing Tasks in Different Languages [51.96666324242191]
We analyze whether user utilization of novel writing assistants in a charity advertisement writing task is affected by the AI's performance in a second language. We quantify the extent to which these patterns translate into the persuasiveness of generated charity advertisements.
arXiv Detail & Related papers (2025-02-13T17:49:30Z) - The BS-meter: A ChatGPT-Trained Instrument to Detect Sloppy Language-Games [41.94295877935867]
We show that a statistical model of sloppy bullshit can reliably relate the Frankfurtian artificial bullshit of ChatGPT to the political and workplace functions of bullshit as observed in natural human language.
arXiv Detail & Related papers (2024-11-22T18:55:21Z) - Quite Good, but Not Enough: Nationality Bias in Large Language Models -- A Case Study of ChatGPT [4.998396762666333]
This study investigates nationality bias in ChatGPT (GPT-3.5), a large language model (LLM) designed for text generation.
The research covers 195 countries, 4 temperature settings, and 3 distinct prompt types, generating 4,680 discourses about nationality descriptions in Chinese and English.
arXiv Detail & Related papers (2024-05-11T12:11:52Z) - Holmes: A Benchmark to Assess the Linguistic Competence of Language Models [59.627729608055006]
We introduce Holmes, a new benchmark designed to assess the linguistic competence of language models (LMs).
We use computation-based probing to examine LMs' internal representations regarding distinct linguistic phenomena.
As a result, we meet recent calls to disentangle LMs' linguistic competence from other cognitive abilities.
arXiv Detail & Related papers (2024-04-29T17:58:36Z) - White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs [58.27353205269664]
Social biases can manifest in language agency in Large Language Model (LLM)-generated content. We introduce the Language Agency Bias Evaluation (LABE) benchmark, which comprehensively evaluates biases in LLMs. Using LABE, we unveil language agency social biases in 3 recent LLMs: ChatGPT, Llama3, and Mistral.
arXiv Detail & Related papers (2024-04-16T12:27:54Z) - Emergent AI-Assisted Discourse: Case Study of a Second Language Writer Authoring with ChatGPT [5.8131604120288385]
This study investigates the role of ChatGPT in facilitating academic writing, especially among language learners.
Using a case study approach, this study examines the experiences of Kailing, a doctoral student, who integrates ChatGPT throughout their academic writing process.
arXiv Detail & Related papers (2023-10-17T00:22:10Z) - Exploring the effectiveness of ChatGPT-based feedback compared with teacher feedback and self-feedback: Evidence from Chinese to English translation [1.25097469793837]
ChatGPT, a cutting-edge AI-powered chatbot, can quickly generate responses to given commands.
This study compared the revised Chinese-to-English translation texts produced by Chinese Master of Translation and Interpretation (MTI) students under ChatGPT-based feedback, teacher feedback, and self-feedback.
arXiv Detail & Related papers (2023-09-04T14:54:39Z) - Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial.
We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments.
The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z) - Assessing the potential of LLM-assisted annotation for corpus-based pragmatics and discourse analysis: The case of apology [9.941695905504282]
This study explores the possibility of using large language models (LLMs) to automate pragma-discursive corpus annotation. We find that GPT-4 outperformed GPT-3.5, with accuracy approaching that of a human coder.
arXiv Detail & Related papers (2023-05-15T04:10:13Z) - ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning [70.57126720079971]
Large language models (LLMs) have emerged as some of the most important breakthroughs in natural language processing (NLP).
This paper evaluates ChatGPT on 7 different tasks, covering 37 diverse languages with high, medium, low, and extremely low resources.
Our extensive experimental results demonstrate that ChatGPT performs worse than previous models across different NLP tasks and languages.
arXiv Detail & Related papers (2023-04-12T05:08:52Z) - Document-Level Machine Translation with Large Language Models [91.03359121149595]
Large language models (LLMs) can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks.
This paper provides an in-depth evaluation of LLMs' ability on discourse modeling.
arXiv Detail & Related papers (2023-04-05T03:49:06Z) - Consistency Analysis of ChatGPT [65.268245109828]
This paper investigates the trustworthiness of ChatGPT and GPT-4 regarding logically consistent behaviour.
Our findings suggest that while both models appear to show an enhanced language understanding and reasoning ability, they still frequently fall short of generating logically consistent predictions.
arXiv Detail & Related papers (2023-03-11T01:19:01Z)