ChatGPT or academic scientist? Distinguishing authorship with over 99%
accuracy using off-the-shelf machine learning tools
- URL: http://arxiv.org/abs/2303.16352v1
- Date: Tue, 28 Mar 2023 23:16:00 GMT
- Title: ChatGPT or academic scientist? Distinguishing authorship with over 99%
accuracy using off-the-shelf machine learning tools
- Authors: Heather Desaire, Aleesa E. Chua, Madeline Isom, Romana Jarosova, and
David Hua
- Abstract summary: ChatGPT has enabled access to AI-generated writing for the masses.
The need to discriminate human writing from AI is now both critical and urgent.
We developed a method for discriminating text generated by ChatGPT from (human) academic scientists.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: ChatGPT has enabled access to AI-generated writing for the masses, and within
just a few months, this product has disrupted the knowledge economy, initiating
a culture shift in the way people work, learn, and write. The need to
discriminate human writing from AI is now both critical and urgent,
particularly in domains like higher education and academic writing, where AI
had not been a significant threat or contributor to authorship. Addressing this
need, we developed a method for discriminating text generated by ChatGPT from
(human) academic scientists, relying on prevalent and accessible supervised
classification methods. We focused on how a particular group of humans,
academic scientists, write differently than ChatGPT, and this targeted approach
led to the discovery of new features for discriminating (these) humans from AI;
as examples, scientists write long paragraphs and have a penchant for equivocal
language, frequently using words like but, however, and although. With a set of
20 features, including the aforementioned ones and others, we built a model
that assigned the author, as human or AI, at well over 99% accuracy, resulting
in 20 times fewer misclassified documents compared to the field-leading
approach. This strategy for discriminating a particular set of humans writing
from AI could be further adapted and developed by others with basic skills in
supervised classification, enabling access to many highly accurate and targeted
models for detecting AI usage in academic writing and beyond.
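To illustrate the kind of "prevalent and accessible" supervised-classification workflow the abstract describes, here is a minimal sketch using scikit-learn. The feature set and classifier below are illustrative assumptions, not the authors' actual 20-feature model: only paragraph length and the rate of equivocal words ("but", "however", "although") mentioned in the abstract are used, plus one placeholder punctuation feature.

```python
# Minimal sketch (not the authors' exact pipeline): hand-crafted stylistic
# features fed to an off-the-shelf scikit-learn classifier, assuming documents
# are available as plain-text strings labeled "human" or "ai".
import re
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

EQUIVOCAL = {"but", "however", "although"}  # illustrative subset of features


def extract_features(doc: str) -> list:
    """Compute a few stylistic features from one document."""
    paragraphs = [p for p in doc.split("\n\n") if p.strip()]
    words = re.findall(r"[A-Za-z']+", doc.lower())
    n_words = max(len(words), 1)
    return [
        len(words) / max(len(paragraphs), 1),          # mean paragraph length in words
        sum(w in EQUIVOCAL for w in words) / n_words,  # rate of equivocal words
        sum(ch in ";:()" for ch in doc) / n_words,     # punctuation density (placeholder)
    ]


def train_and_evaluate(docs: list, labels: list) -> float:
    """Return mean cross-validated accuracy for human-vs-AI authorship."""
    X = np.array([extract_features(d) for d in docs])
    y = np.array([1 if lab == "ai" else 0 for lab in labels])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    return cross_val_score(clf, X, y, cv=5).mean()
```

The point of the sketch is the strategy rather than the specifics: interpretable features chosen for one target population (academic scientists) paired with any standard supervised classifier, which is what makes the approach easy for others to adapt.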
Related papers
- AI Literacy in K-12 and Higher Education in the Wake of Generative AI: An Integrative Review [3.5297361401370044]
There is little consensus among researchers and practitioners on how to discuss and design AI literacy interventions.
This paper applies an integrative review method to examine empirical and theoretical AI literacy studies published since 2020.
arXiv Detail & Related papers (2025-02-27T23:32:03Z) - Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing [55.2480439325792]
Misclassification can lead to false plagiarism accusations and misleading claims about AI prevalence in online content.
We systematically evaluate eleven state-of-the-art AI-text detectors using our AI-Polished-Text Evaluation dataset.
Our findings reveal that detectors frequently misclassify even minimally polished text as AI-generated, struggle to differentiate between degrees of AI involvement, and exhibit biases against older and smaller models.
arXiv Detail & Related papers (2025-02-21T18:45:37Z) - Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI [95.81924314159943]
We find that major gaps between human and machine text lie in concreteness, cultural nuances, and diversity.
We also find that humans do not always prefer human-written text, particularly when they cannot clearly identify its source.
arXiv Detail & Related papers (2025-02-17T09:56:46Z) - People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text [37.36534911201806]
We hire annotators to read 300 non-fiction English articles and label them as either human-written or AI-generated.
Experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text.
We release our annotated dataset and code to spur future research into both human and automated detection of AI-generated text.
arXiv Detail & Related papers (2025-01-26T19:31:34Z) - Using Machine Learning to Distinguish Human-written from Machine-generated Creative Fiction [0.0]
Training a Large Language Model on writers' output to generate "sham books" in a particular style seems to constitute a new form of plagiarism.
In this study, we trained Machine Learning classifier models to distinguish short samples of human-written from machine-generated creative fiction.
arXiv Detail & Related papers (2024-12-15T12:46:57Z) - "It was 80% me, 20% AI": Seeking Authenticity in Co-Writing with Large Language Models [97.22914355737676]
We examine whether and how writers want to preserve their authentic voice when co-writing with AI tools.
Our findings illuminate conceptions of authenticity in human-AI co-creation.
Readers' responses showed less concern about human-AI co-writing.
arXiv Detail & Related papers (2024-11-20T04:42:32Z) - Human Bias in the Face of AI: The Role of Human Judgement in AI Generated Text Evaluation [48.70176791365903]
This study explores how bias shapes the perception of AI versus human generated content.
We investigated how human raters respond to labeled and unlabeled content.
arXiv Detail & Related papers (2024-09-29T04:31:45Z) - Distributed agency in second language learning and teaching through generative AI [0.0]
ChatGPT can provide informal second language practice through chats in written or voice forms.
Instructors can use AI to build learning and assessment materials in a variety of media.
arXiv Detail & Related papers (2024-03-29T14:55:40Z) - Generative AI in Writing Research Papers: A New Type of Algorithmic Bias
and Uncertainty in Scholarly Work [0.38850145898707145]
Large language models (LLMs) and generative AI tools present challenges in identifying and addressing biases.
Generative AI tools are susceptible to goal misgeneralization, hallucinations, and adversarial attacks such as red teaming prompts.
We find that incorporating generative AI in the process of writing research manuscripts introduces a new type of context-induced algorithmic bias.
arXiv Detail & Related papers (2023-12-04T04:05:04Z) - Techniques for supercharging academic writing with generative AI [0.0]
This Perspective maps out principles and methods for using generative artificial intelligence (AI) to elevate the quality and efficiency of academic writing.
We introduce a human-AI collaborative framework that delineates the rationale (why), process (how), and nature (what) of AI engagement in writing.
arXiv Detail & Related papers (2023-10-26T04:35:00Z) - Exploration with Principles for Diverse AI Supervision [88.61687950039662]
Training large transformers using next-token prediction has given rise to groundbreaking advancements in AI.
While this generative AI approach has produced impressive results, it heavily leans on human supervision.
This strong reliance on human oversight poses a significant hurdle to the advancement of AI innovation.
We propose a novel paradigm termed Exploratory AI (EAI) aimed at autonomously generating high-quality training data.
arXiv Detail & Related papers (2023-10-13T07:03:39Z) - PaperCard for Reporting Machine Assistance in Academic Writing [48.33722012818687]
ChatGPT, a question-answering system released by OpenAI in November 2022, has demonstrated a range of capabilities that could be utilised in producing academic papers.
This raises critical questions surrounding the concept of authorship in academia.
We propose a framework we name "PaperCard", a documentation for human authors to transparently declare the use of AI in their writing process.
arXiv Detail & Related papers (2023-10-07T14:28:04Z) - Perception, performance, and detectability of conversational artificial
intelligence across 32 university courses [15.642614735026106]
We compare the performance of ChatGPT against students on 32 university-level courses.
We find that ChatGPT's performance is comparable, if not superior, to that of students in many courses.
We find an emerging consensus among students to use the tool, and among educators to treat its use as plagiarism.
arXiv Detail & Related papers (2023-05-07T10:37:51Z) - AI, write an essay for me: A large-scale comparison of human-written
versus ChatGPT-generated essays [66.36541161082856]
ChatGPT and similar generative AI models have attracted hundreds of millions of users.
This study compares human-written versus ChatGPT-generated argumentative student essays.
arXiv Detail & Related papers (2023-04-24T12:58:28Z) - Is This Abstract Generated by AI? A Research for the Gap between
AI-generated Scientific Text and Human-written Scientific Text [13.438933219811188]
We investigate the gap between scientific content generated by AI and written by humans.
We find that there exists a "writing style" gap between AI-generated scientific text and human-written scientific text.
arXiv Detail & Related papers (2023-01-24T04:23:20Z) - The Role of AI in Drug Discovery: Challenges, Opportunities, and
Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.