Is This Abstract Generated by AI? A Research for the Gap between
AI-generated Scientific Text and Human-written Scientific Text
- URL: http://arxiv.org/abs/2301.10416v1
- Date: Tue, 24 Jan 2023 04:23:20 GMT
- Title: Is This Abstract Generated by AI? A Research for the Gap between
AI-generated Scientific Text and Human-written Scientific Text
- Authors: Yongqiang Ma, Jiawei Liu, Fan Yi
- Abstract summary: We investigate the gap between scientific content generated by AI and written by humans.
We find that there exists a "writing style" gap between AI-generated scientific text and human-written scientific text.
- Score: 13.438933219811188
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: BACKGROUND: Recent neural language models have taken a significant step
forward in producing remarkably controllable, fluent, and grammatical text.
Although some recent works have found that AI-generated text is not
distinguishable from human-authored writing by crowdsourcing workers, there
still exist errors in AI-generated text which are even subtler and harder to
spot. METHOD: In this paper, we investigate the gap between scientific content
generated by AI and written by humans. Specifically, we first adopt several
publicly available tools or models to investigate their performance in detecting
GPT-generated scientific text. Then we utilize features from writing style to
analyze the similarities and differences between the two types of content.
Furthermore, more complex and deep perspectives, such as consistency,
coherence, language redundancy, and factual errors, are also taken into
consideration for in-depth analysis. RESULT: The results suggest that while AI
has the potential to generate scientific content that is as accurate as
human-written content, there is still a gap in terms of depth and overall
quality. AI-generated scientific content is more likely to exhibit language
redundancy and factual errors. CONCLUSION: We find that there exists a
"writing style" gap between AI-generated scientific text and human-written
scientific text. Moreover, based on the analysis results, we summarize a series
of model-agnostic or distribution-agnostic features, which could be applied to
unknown or novel domain distributions and different generation methods. Future
research should focus on not only improving the capabilities of AI models to
produce high-quality content but also examining and addressing ethical and
security concerns related to the generation and the use of AI-generated
content.
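The METHOD portion of the abstract combines off-the-shelf detectors with hand-crafted writing-style features. As a rough, hedged illustration of the second part, the Python sketch below trains a toy style-feature classifier; the feature set, toy texts, and logistic-regression choice are illustrative assumptions, not the paper's actual features, data, or models.

```python
# Minimal sketch of a writing-style-based AI-text detector, in the spirit of the
# METHOD description above. Features, toy texts, and the classifier are assumptions
# chosen for illustration, not the authors' actual pipeline.
import re
import numpy as np
from sklearn.linear_model import LogisticRegression

def style_features(text: str) -> list:
    """Compute a few shallow writing-style features for one abstract."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    n_tok = max(len(tokens), 1)
    return [
        len(tokens) / max(len(sentences), 1),   # mean sentence length (in tokens)
        len(set(tokens)) / n_tok,               # type-token ratio (lexical diversity)
        sum(len(t) for t in tokens) / n_tok,    # mean word length
        text.count(",") / n_tok,                # comma density
    ]

# Tiny, obviously synthetic placeholder corpora (1 = AI-generated, 0 = human-written);
# a real experiment would use collected abstracts instead.
ai_abstracts = [
    "This paper proposes a novel framework. The framework is evaluated on benchmarks. Results show improvements.",
    "We present a method for text generation. The method achieves strong performance. Experiments confirm this.",
]
human_abstracts = [
    "Grounded in a two-year field study, we unpack how annotation teams negotiate ambiguous guidelines.",
    "Contrary to received wisdom, our longitudinal corpus suggests hedging has declined in biomedical abstracts.",
]

X = np.array([style_features(t) for t in ai_abstracts + human_abstracts])
y = np.array([1] * len(ai_abstracts) + [0] * len(human_abstracts))

clf = LogisticRegression(max_iter=1000).fit(X, y)

new_abstract = "We investigate the gap between AI-generated and human-written scientific text."
print("P(AI-generated) =", clf.predict_proba([style_features(new_abstract)])[0, 1])
```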
Related papers
- Human Bias in the Face of AI: The Role of Human Judgement in AI Generated Text Evaluation [48.70176791365903]
This study explores how bias shapes the perception of AI versus human generated content.
We investigated how human raters respond to labeled and unlabeled content.
arXiv Detail & Related papers (2024-09-29T04:31:45Z)
- Decoding AI and Human Authorship: Nuances Revealed Through NLP and Statistical Analysis [0.0]
This research explores the nuanced differences in texts produced by AI and those written by humans.
The study investigates various linguistic traits, patterns of creativity, and potential biases inherent in human-written and AI-generated texts.
arXiv Detail & Related papers (2024-07-15T18:09:03Z)
- Who Writes the Review, Human or AI? [0.36498648388765503]
This study proposes a methodology to accurately distinguish AI-generated and human-written book reviews.
Our approach utilizes transfer learning, enabling the model to identify generated text across different topics.
The experimental results demonstrate that it is feasible to detect the original source of text, achieving an accuracy rate of 96.86%.
arXiv Detail & Related papers (2024-05-30T17:38:44Z)
- Deep Learning Detection Method for Large Language Models-Generated Scientific Content [0.0]
Large Language Models generate scientific content that is indistinguishable from that written by humans.
This research paper presents a novel ChatGPT-generated scientific text detection method, AI-Catcher.
On average, AI-Catcher improved accuracy by 37.4%.
arXiv Detail & Related papers (2024-02-27T19:16:39Z)
- Generative AI in Writing Research Papers: A New Type of Algorithmic Bias and Uncertainty in Scholarly Work [0.38850145898707145]
Large language models (LLMs) and generative AI tools present challenges in identifying and addressing biases.
Generative AI tools are susceptible to goal misgeneralization, hallucinations, and adversarial attacks such as red-teaming prompts.
We find that incorporating generative AI in the process of writing research manuscripts introduces a new type of context-induced algorithmic bias.
arXiv Detail & Related papers (2023-12-04T04:05:04Z)
- Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey [97.33926242130732]
Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses.
Despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs.
To address these concerns, a consensus among the research community is to develop algorithmic solutions to detect AI-generated text.
arXiv Detail & Related papers (2023-10-23T18:11:32Z)
- The Imitation Game: Detecting Human and AI-Generated Texts in the Era of ChatGPT and BARD [3.2228025627337864]
We introduce a novel dataset of human-written and AI-generated texts in different genres.
We employ several machine learning models to classify the texts.
Results demonstrate the efficacy of these models in discerning between human and AI-generated text.
arXiv Detail & Related papers (2023-07-22T21:00:14Z)
- AI, write an essay for me: A large-scale comparison of human-written versus ChatGPT-generated essays [66.36541161082856]
ChatGPT and similar generative AI models have attracted hundreds of millions of users.
This study compares human-written versus ChatGPT-generated argumentative student essays.
arXiv Detail & Related papers (2023-04-24T12:58:28Z)
- On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approximates human-like quality, the sample size needed for detection bounds increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
- The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
- On the probability-quality paradox in language generation [76.69397802617064]
We analyze language generation through an information-theoretic lens.
We posit that human-like language should contain an amount of information close to the entropy of the distribution over natural strings (a rough formalization is sketched just after this list).
arXiv Detail & Related papers (2022-03-31T17:43:53Z)
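The entropy claim in the last entry can be stated compactly. The LaTeX fragment below is a hedged paraphrase of that hypothesis under assumed notation (p is the distribution over natural-language strings, y* a human-like string); it is not necessarily the exact formulation used in that paper.

```latex
% Hedged formalization (notation assumed, not taken verbatim from the paper):
% a human-like string y* should carry an amount of information (surprisal)
% close to the entropy of the distribution p over natural strings.
\[
  \mathrm{H}(p) \;=\; -\sum_{\mathbf{y}} p(\mathbf{y}) \log p(\mathbf{y}),
  \qquad
  \bigl|\, -\log p(\mathbf{y}^{*}) \;-\; \mathrm{H}(p) \,\bigr| \;\le\; \varepsilon
  \quad \text{for some small } \varepsilon > 0 .
\]
```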
This list is automatically generated from the titles and abstracts of the papers on this site.