The Digital Divide in Generative AI: Evidence from Large Language Model Use in College Admissions Essays
- URL: http://arxiv.org/abs/2602.17791v1
- Date: Thu, 19 Feb 2026 19:47:49 GMT
- Title: The Digital Divide in Generative AI: Evidence from Large Language Model Use in College Admissions Essays
- Authors: Jinsook Lee, Conrad Borchers, AJ Alvero, Thorsten Joachims, Rene F. Kizilcec
- Abstract summary: Large language models (LLMs) have become popular writing tools among students. LLMs may expand access to high-quality feedback for students with less access to traditional writing support. This study examines how adoption of LLM-assisted writing varies across socioeconomic groups.
- Score: 12.696416066678731
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large language models (LLMs) have become popular writing tools among students and may expand access to high-quality feedback for students with less access to traditional writing support. At the same time, LLMs may standardize student voice or invite overreliance. This study examines how adoption of LLM-assisted writing varies across socioeconomic groups and how it relates to outcomes in a high-stakes context: U.S. college admissions. We analyze a de-identified longitudinal dataset of applications to a selective university from 2020 to 2024 (N = 81,663). Estimating LLM use using a distribution-based detector trained on synthetic and historical essays, we tracked how student writing changed as LLM use proliferated, how adoption differed by socioeconomic status (SES), and whether potential benefits translated equitably into admissions outcomes. Using fee-waiver status as a proxy for SES, we observe post-2023 convergence in surface-level linguistic features, with the largest changes in fee-waived and rejected applicants. Estimated LLM use rose sharply in 2024 across all groups, with disproportionately larger increases among lower SES applicants, consistent with an access hypothesis in which LLMs substitute for scarce writing support. However, increased estimated LLM use was more strongly associated with declines in predicted admission probability for lower SES applicants than for higher SES applicants, even after controlling for academic credentials and stylometric features. These findings raise concerns about equity and the validity of essay-based evaluation in an era of AI-assisted writing and provide the first large-scale longitudinal evidence linking LLM adoption, linguistic change, and evaluative outcomes in college admissions.
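The abstract describes estimating LLM use with a distribution-based detector trained on synthetic and historical essays. A minimal sketch of that idea is a log-likelihood-ratio score between two reference token distributions; the function names, toy corpora, and unigram modeling below are illustrative assumptions, not the paper's actual detector:

```python
import math
from collections import Counter

def unigram_model(corpus_tokens, vocab, alpha=1.0):
    """Laplace-smoothed unigram probabilities estimated from a token list."""
    counts = Counter(corpus_tokens)
    total = len(corpus_tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def llm_use_score(essay_tokens, p_human, p_llm):
    """Mean log-likelihood ratio; positive values lean toward the LLM distribution."""
    llr = sum(math.log(p_llm[w]) - math.log(p_human[w])
              for w in essay_tokens if w in p_llm)
    return llr / max(len(essay_tokens), 1)

# Toy reference corpora standing in for historical vs. synthetic essays.
human_ref = "i wrote about my family and my summer job at the farm".split()
llm_ref = "i delve into the transformative tapestry of my multifaceted journey".split()
vocab = set(human_ref) | set(llm_ref)
p_h = unigram_model(human_ref, vocab)
p_l = unigram_model(llm_ref, vocab)

print(llm_use_score("delve into the tapestry".split(), p_h, p_l) > 0)  # True
```

In practice such detectors use far richer features and calibrated training data; the point of the sketch is only that per-essay LLM-use estimates can be derived by comparing each essay against distributions fit to known-human and known-synthetic reference text.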
Related papers
- Counterfactual LLM-based Framework for Measuring Rhetorical Style [15.917819866091191]
We introduce a counterfactual, LLM-based framework to disentangle rhetorical style from substantive content in machine learning papers. Applying this method to 8,485 ICLR submissions sampled from 2017 to 2025, we generate more than 250,000 counterfactual writings. We find that visionary framing significantly predicts downstream attention, including citations and media attention, even after controlling for peer-review evaluations.
arXiv Detail & Related papers (2025-12-22T22:22:46Z)
- LLM-REVal: Can We Trust LLM Reviewers Yet? [70.58742663985652]
Large language models (LLMs) have inspired researchers to integrate them extensively into the academic workflow. This study focuses on how the deep integration of LLMs into both peer-review and research processes may influence scholarly fairness.
arXiv Detail & Related papers (2025-10-14T10:30:20Z)
- Does the Prompt-based Large Language Model Recognize Students' Demographics and Introduce Bias in Essay Scoring? [3.7498611358320733]
Large Language Models (LLMs) are widely used in Automated Essay Scoring (AES). This study explores the relationship between a model's ability to predict students' demographic attributes from their written work and its predictive bias in the scoring task under the prompt-based paradigm.
arXiv Detail & Related papers (2025-04-30T05:36:28Z)
- Poor Alignment and Steerability of Large Language Models: Evidence from College Admission Essays [19.405531377930977]
We investigate the use of large language models (LLMs) in high-stakes admissions contexts. We find that both types of LLM-generated essays are linguistically distinct from human-authored essays. The demographically prompted and unprompted synthetic texts were also more similar to each other than to the human text.
arXiv Detail & Related papers (2025-03-25T20:54:50Z)
- Are We There Yet? Revealing the Risks of Utilizing Large Language Models in Scholarly Peer Review [66.73247554182376]
Advances in large language models (LLMs) have led to their integration into peer review. The unchecked adoption of LLMs poses significant risks to the integrity of the peer review system. We show that manipulating 5% of the reviews could potentially cause 12% of the papers to lose their position in the top 30% rankings.
arXiv Detail & Related papers (2024-12-02T16:55:03Z)
- Embracing AI in Education: Understanding the Surge in Large Language Model Use by Secondary Students [53.20318273452059]
Large language models (LLMs) like OpenAI's ChatGPT have opened up new avenues in education. Despite school restrictions, our survey of over 300 middle and high school students revealed that a remarkable 70% of students have utilized LLMs. We propose a few ideas to address such issues, including subject-specific models, personalized learning, and AI classrooms.
arXiv Detail & Related papers (2024-11-27T19:19:34Z)
- A Large-Scale Study of Relevance Assessments with Large Language Models: An Initial Look [52.114284476700874]
This paper reports on the results of a large-scale evaluation (the TREC 2024 RAG Track) where four different relevance assessment approaches were deployed.
We find that automatically generated UMBRELA judgments can replace fully manual judgments to accurately capture run-level effectiveness.
Surprisingly, we find that LLM assistance does not appear to increase correlation with fully manual assessments, suggesting that costs associated with human-in-the-loop processes do not bring obvious tangible benefits.
arXiv Detail & Related papers (2024-11-13T01:12:35Z)
- Whose ChatGPT? Unveiling Real-World Educational Inequalities Introduced by Large Language Models [3.005864877840858]
ChatGPT and other similar tools have prompted tremendous public excitement and experimental effort about the potential of large language models (LLMs) to improve learning experience and outcomes.
However, little research has systematically examined the real-world impacts of LLM availability on educational equity.
We analyze 1,140,328 academic writing submissions from 16,791 college students across 2,391 courses between 2021 and 2024 at a public, minority-serving institution in the US.
We find that students' overall writing quality gradually increased following the availability of LLMs and that the writing-quality gap between linguistically advantaged and disadvantaged students narrowed over time.
arXiv Detail & Related papers (2024-10-29T17:35:46Z)
- United in Diversity? Contextual Biases in LLM-Based Predictions of the 2024 European Parliament Elections [42.72938925647165]
"Synthetic samples" based on large language models (LLMs) have been argued to serve as efficient alternatives to surveys of humans.<n>"Synthetic samples" might exhibit bias due to training data and fine-tuning processes being unrepresentative of diverse contexts.<n>This study investigates if and under which conditions LLM-generated synthetic samples can be used for public opinion prediction.
arXiv Detail & Related papers (2024-08-29T16:01:06Z)
- Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews [51.453135368388686]
We present an approach for estimating the fraction of text in a large corpus that is likely to have been substantially modified or produced by a large language model (LLM). Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world LLM use at the corpus level.
arXiv Detail & Related papers (2024-03-11T21:51:39Z)
- Large Language Models are Not Yet Human-Level Evaluators for Abstractive Summarization [66.08074487429477]
We investigate the stability and reliability of large language models (LLMs) as automatic evaluators for abstractive summarization.
We find that while ChatGPT and GPT-4 outperform the commonly used automatic metrics, they are not ready as human replacements.
arXiv Detail & Related papers (2023-05-22T14:58:13Z)
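The corpus-level maximum likelihood approach described in "Monitoring AI-Modified Content at Scale" above can be sketched as fitting a one-parameter mixture of human and LLM document likelihoods. The function name, grid-search optimization, and toy log-likelihoods below are illustrative assumptions, not the paper's code:

```python
import math

def estimate_llm_fraction(doc_loglikes, grid=200):
    """
    doc_loglikes: list of (log p_human(doc), log p_llm(doc)) pairs.
    Returns the mixture weight alpha in [0, 1] maximizing
    sum_i log( alpha * p_llm(doc_i) + (1 - alpha) * p_human(doc_i) ).
    """
    def loglik(alpha):
        total = 0.0
        for lh, ll in doc_loglikes:
            # Combine the two likelihoods in log space for numerical stability.
            m = max(lh, ll)
            total += m + math.log((1 - alpha) * math.exp(lh - m)
                                  + alpha * math.exp(ll - m))
        return total
    # Simple grid search over candidate mixture weights.
    alphas = [i / grid for i in range(grid + 1)]
    return max(alphas, key=loglik)

# Toy corpus: 3 documents fit the LLM reference model better, 7 fit the human model better.
docs = [(-10.0, -8.0)] * 3 + [(-10.0, -12.0)] * 7
print(estimate_llm_fraction(docs))
```

The estimated fraction ends up below the naive 30% count because the per-document evidence is soft; the key design choice is that no document is individually classified, only the population-level mixture weight is inferred.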
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy or quality of the information above and is not responsible for any consequences of its use.