Related papers: Neural Language Models are Effective Plagiarists

URL: http://arxiv.org/abs/2201.07406v1
Date: Wed, 19 Jan 2022 04:00:46 GMT
Title: Neural Language Models are Effective Plagiarists
Authors: Stella Biderman and Edward Raff
Abstract summary: We find that a student using GPT-J can complete introductory level programming assignments without triggering suspicion from MOSS. GPT-J was not trained on the problems in question and is not provided with any examples to work from. We conclude that the code written by GPT-J is diverse in structure, lacking any particular tells that future plagiarism detection techniques may use to try to identify algorithmically generated code.
Score: 38.85940137464184
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As artificial intelligence (AI) technologies become increasingly powerful and prominent in society, their misuse is a growing concern. In educational settings, AI technologies could be used by students to cheat on assignments and exams. In this paper we explore whether transformers can be used to solve introductory level programming assignments while bypassing commonly used AI tools to detect plagiarism. We find that a student using GPT-J [Wang and Komatsuzaki, 2021] can complete introductory level programming assignments without triggering suspicion from MOSS [Aiken, 2000], a widely used plagiarism detection tool. This holds despite the fact that GPT-J was not trained on the problems in question and is not provided with any examples to work from. We further find that the code written by GPT-J is diverse in structure, lacking any particular tells that future plagiarism detection techniques may use to try to identify algorithmically generated code. We conclude with a discussion of the ethical and educational implications of large language models and directions for future research.

Related papers

How Do Programming Students Use Generative AI? [7.863638253070439]
We studied how programming students actually use generative AI tools like ChatGPT. We observed two prevalent usage strategies: to seek knowledge about general concepts and to directly generate solutions. Our findings indicate that concerns about potential decrease in programmers' agency and productivity with Generative AI are justified.
arXiv Detail & Related papers (2025-01-17T10:25:41Z)
AI Content Self-Detection for Transformer-based Large Language Models [0.0]
This paper introduces the idea of direct origin detection and evaluates whether generative AI systems can recognize their output and distinguish it from human-written texts. Google's Bard model exhibits the largest capability of self-detection with an accuracy of 94%, followed by OpenAI's ChatGPT with 83%.
arXiv Detail & Related papers (2023-12-28T10:08:57Z)
PaperCard for Reporting Machine Assistance in Academic Writing [48.33722012818687]
ChatGPT, a question-answering system released by OpenAI in November 2022, has demonstrated a range of capabilities that could be utilised in producing academic papers. This raises critical questions surrounding the concept of authorship in academia. We propose a framework we name "PaperCard", a documentation for human authors to transparently declare the use of AI in their writing process.
arXiv Detail & Related papers (2023-10-07T14:28:04Z)
Identifying and Mitigating the Security Risks of Generative AI [179.2384121957896]
This paper reports the findings of a workshop held at Google on the dual-use dilemma posed by GenAI. GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks. We discuss short-term and long-term goals for the community on this topic.
arXiv Detail & Related papers (2023-08-28T18:51:09Z)
A LLM Assisted Exploitation of AI-Guardian [57.572998144258705]
We evaluate the robustness of AI-Guardian, a recent defense to adversarial examples published at IEEE S&P 2023. We write none of the code to attack this model, and instead prompt GPT-4 to implement all attack algorithms following our instructions and guidance. This process was surprisingly effective and efficient, with the language model at times producing code from ambiguous instructions faster than the author of this paper could have done.
arXiv Detail & Related papers (2023-07-20T17:33:25Z)
Perception, performance, and detectability of conversational artificial intelligence across 32 university courses [15.642614735026106]
We compare the performance of ChatGPT against students on 32 university-level courses. We find that ChatGPT's performance is comparable, if not superior, to that of students in many courses. We find an emerging consensus among students to use the tool, and among educators to treat this as plagiarism.
arXiv Detail & Related papers (2023-05-07T10:37:51Z)
Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc. Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques. In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
How Generative AI models such as ChatGPT can be (Mis)Used in SPC Practice, Education, and Research? An Exploratory Study [2.0841728192954663]
Generative Artificial Intelligence (AI) models have the potential to revolutionize Statistical Process Control (SPC) practice, learning, and research. These tools are in the early stages of development and can be easily misused or misunderstood. We explore ChatGPT's ability to provide code, explain basic concepts, and create knowledge related to SPC practice, learning, and research.
arXiv Detail & Related papers (2023-02-17T15:48:37Z)
Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code. We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z)
Will ChatGPT get you caught? Rethinking of Plagiarism Detection [0.0]
The rise of Artificial Intelligence (AI) technology and its impact on education has been a topic of growing concern in recent years. The use of chatbots, particularly ChatGPT, for generating academic essays has sparked fears among scholars. This study aims to explore the originality of contents produced by one of the most popular AI chatbots, ChatGPT.
arXiv Detail & Related papers (2023-02-08T20:59:18Z)
Mossad: Defeating Software Plagiarism Detection [0.48225981108928456]
This paper presents an entirely automatic program transformation approach, Mossad, that defeats popular software plagiarism detection tools. It comprises a framework that couples techniques inspired by genetic programming with domain-specific knowledge to effectively undermine plagiarism detectors. Mossad is both fast and effective: it can, in minutes, generate modified versions of programs that are likely to escape detection.
arXiv Detail & Related papers (2020-10-04T22:02:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.