Decoding AI and Human Authorship: Nuances Revealed Through NLP and Statistical Analysis
- URL: http://arxiv.org/abs/2408.00769v1
- Date: Mon, 15 Jul 2024 18:09:03 GMT
- Title: Decoding AI and Human Authorship: Nuances Revealed Through NLP and Statistical Analysis
- Authors: Mayowa Akinwande, Oluwaseyi Adeliyi, Toyyibat Yussuph,
- Abstract summary: This research explores the nuanced differences in texts produced by AI and those written by humans.
The study investigates various linguistic traits, patterns of creativity, and potential biases inherent in human-written and AI- generated texts.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This research explores the nuanced differences in texts produced by AI and those written by humans, aiming to elucidate how language is expressed differently by AI and humans. Through comprehensive statistical data analysis, the study investigates various linguistic traits, patterns of creativity, and potential biases inherent in human-written and AI- generated texts. The significance of this research lies in its contribution to understanding AI's creative capabilities and its impact on literature, communication, and societal frameworks. By examining a meticulously curated dataset comprising 500K essays spanning diverse topics and genres, generated by LLMs, or written by humans, the study uncovers the deeper layers of linguistic expression and provides insights into the cognitive processes underlying both AI and human-driven textual compositions. The analysis revealed that human-authored essays tend to have a higher total word count on average than AI-generated essays but have a shorter average word length compared to AI- generated essays, and while both groups exhibit high levels of fluency, the vocabulary diversity of Human authored content is higher than AI generated content. However, AI- generated essays show a slightly higher level of novelty, suggesting the potential for generating more original content through AI systems. The paper addresses challenges in assessing the language generation capabilities of AI models and emphasizes the importance of datasets that reflect the complexities of human-AI collaborative writing. Through systematic preprocessing and rigorous statistical analysis, this study offers valuable insights into the evolving landscape of AI-generated content and informs future developments in natural language processing (NLP).
Related papers
- Human Bias in the Face of AI: The Role of Human Judgement in AI Generated Text Evaluation [48.70176791365903]
This study explores how bias shapes the perception of AI versus human generated content.
We investigated how human raters respond to labeled and unlabeled content.
arXiv Detail & Related papers (2024-09-29T04:31:45Z) - Measuring Human Contribution in AI-Assisted Content Generation [68.03658922067487]
This study raises the research question of measuring human contribution in AI-assisted content generation.
By calculating mutual information between human input and AI-assisted output relative to self-information of AI-assisted output, we quantify the proportional information contribution of humans in content generation.
arXiv Detail & Related papers (2024-08-27T05:56:04Z) - Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents [55.63497537202751]
Article explores the convergence of connectionist and symbolic artificial intelligence (AI)
Traditionally, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic.
Recent advancements in large language models (LLMs) highlight the potential of connectionist architectures in handling human language as a form of symbols.
arXiv Detail & Related papers (2024-07-11T14:00:53Z) - Differentiating between human-written and AI-generated texts using linguistic features automatically extracted from an online computational tool [0.0]
This study aims to investigate how various linguistic components are represented in both types of texts, assessing the ability of AI to emulate human writing.
Despite AI-generated texts appearing to mimic human speech, the results revealed significant differences across multiple linguistic features.
arXiv Detail & Related papers (2024-07-04T05:37:09Z) - Generative AI in Writing Research Papers: A New Type of Algorithmic Bias
and Uncertainty in Scholarly Work [0.38850145898707145]
Large language models (LLMs) and generative AI tools present challenges in identifying and addressing biases.
generative AI tools are susceptible to goal misgeneralization, hallucinations, and adversarial attacks such as red teaming prompts.
We find that incorporating generative AI in the process of writing research manuscripts introduces a new type of context-induced algorithmic bias.
arXiv Detail & Related papers (2023-12-04T04:05:04Z) - Evaluating the Efficacy of Hybrid Deep Learning Models in Distinguishing
AI-Generated Text [0.0]
My research investigates the use of cutting-edge hybrid deep learning models to accurately differentiate between AI-generated text and human writing.
I applied a robust methodology, utilising a carefully selected dataset comprising AI and human texts from various sources, each tagged with instructions.
arXiv Detail & Related papers (2023-11-27T06:26:53Z) - Exploration with Principles for Diverse AI Supervision [88.61687950039662]
Training large transformers using next-token prediction has given rise to groundbreaking advancements in AI.
While this generative AI approach has produced impressive results, it heavily leans on human supervision.
This strong reliance on human oversight poses a significant hurdle to the advancement of AI innovation.
We propose a novel paradigm termed Exploratory AI (EAI) aimed at autonomously generating high-quality training data.
arXiv Detail & Related papers (2023-10-13T07:03:39Z) - The Imitation Game: Detecting Human and AI-Generated Texts in the Era of
ChatGPT and BARD [3.2228025627337864]
We introduce a novel dataset of human-written and AI-generated texts in different genres.
We employ several machine learning models to classify the texts.
Results demonstrate the efficacy of these models in discerning between human and AI-generated text.
arXiv Detail & Related papers (2023-07-22T21:00:14Z) - A Comprehensive Survey of AI-Generated Content (AIGC): A History of
Generative AI from GAN to ChatGPT [63.58711128819828]
ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC)
The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.
arXiv Detail & Related papers (2023-03-07T20:36:13Z) - Is This Abstract Generated by AI? A Research for the Gap between
AI-generated Scientific Text and Human-written Scientific Text [13.438933219811188]
We investigate the gap between scientific content generated by AI and written by humans.
We find that there exists a writing style'' gap between AI-generated scientific text and human-written scientific text.
arXiv Detail & Related papers (2023-01-24T04:23:20Z) - The Role of AI in Drug Discovery: Challenges, Opportunities, and
Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.