Enhancing Robustness of LLM-Synthetic Text Detectors for Academic
Writing: A Comprehensive Analysis
- URL: http://arxiv.org/abs/2401.08046v1
- Date: Tue, 16 Jan 2024 01:58:36 GMT
- Title: Enhancing Robustness of LLM-Synthetic Text Detectors for Academic
Writing: A Comprehensive Analysis
- Authors: Zhicheng Dou, Yuchen Guo, Ching-Chun Chang, Huy H. Nguyen, Isao
Echizen
- Abstract summary: Large language models (LLMs) offer numerous advantages in terms of revolutionizing work and study methods.
They have also garnered significant attention due to their potential negative consequences.
One example is generating academic reports or papers with little to no human contribution.
- Score: 35.351782110161025
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The emergence of large language models (LLMs), such as Generative Pre-trained
Transformer 4 (GPT-4) used by ChatGPT, has profoundly impacted the academic and
broader community. While these models offer numerous advantages in terms of
revolutionizing work and study methods, they have also garnered significant
attention due to their potential negative consequences. One example is
generating academic reports or papers with little to no human contribution.
Consequently, researchers have focused on developing detectors to address the
misuse of LLMs. However, most existing methods prioritize achieving higher
accuracy on restricted datasets, neglecting the crucial aspect of
generalizability. This limitation hinders their practical application in
real-life scenarios where reliability is paramount. In this paper, we present a
comprehensive analysis of the impact of prompts on the text generated by LLMs
and highlight the potential lack of robustness in one of the current
state-of-the-art GPT detectors. To mitigate these issues concerning the misuse
of LLMs in academic writing, we propose a reference-based Siamese detector
named Synthetic-Siamese which takes a pair of texts, one as the inquiry and the
other as the reference. Our method effectively addresses the lack of robustness
of previous detectors (OpenAI detector and DetectGPT) and significantly
improves the baseline performances in realistic academic writing scenarios by
approximately 67% to 95%.
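As a rough illustration of the pairing idea, the sketch below scores an inquiry text against a reference text with a pretrained sentence encoder; the encoder choice, cosine-similarity scoring, and threshold are illustrative assumptions, not the Synthetic-Siamese architecture or training procedure described in the paper.
```python
# Minimal sketch of a reference-based pair detector (illustrative only).
# Assumes the sentence-transformers package; the backbone model, threshold,
# and similarity rule are placeholders, not the authors' configuration.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical stand-in backbone

def detect(inquiry_text: str, reference_text: str, threshold: float = 0.8) -> bool:
    """Return True if the inquiry text is judged LLM-generated.

    The reference text is a known LLM-generated sample (e.g., produced from a
    comparable prompt); a high similarity between the pair is taken as
    evidence that the inquiry was also machine-written.
    """
    inquiry_emb, reference_emb = encoder.encode([inquiry_text, reference_text])
    similarity = util.cos_sim(inquiry_emb, reference_emb).item()
    return similarity >= threshold
```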
Related papers
- DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios [38.952481877244644]
We present a new benchmark, DetectRL, highlighting that even state-of-the-art (SOTA) detection techniques still underperform in real-world scenarios.
Our development of DetectRL reveals the strengths and limitations of current SOTA detectors.
We believe DetectRL could serve as an effective benchmark for assessing detectors in real-world scenarios.
arXiv Detail & Related papers (2024-10-31T09:01:25Z)
- Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z)
- ConvNLP: Image-based AI Text Detection [1.4419517737536705]
This paper presents a novel approach for detecting AI-generated text using a visual representation of word embeddings.
We have formulated a novel Convolutional Neural Network called ZigZag ResNet, as well as a scheduler for improving generalization, named ZigZag Scheduler.
Our best model detects AI-generated text with an impressive average detection rate (over inter- and intra-domain test data) of 88.35%.
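A toy sketch of the image-based idea: word embeddings are stacked into a single-channel "image" and classified with a small CNN. This is not the ZigZag ResNet defined in the paper; every layer and dimension below is an assumption.
```python
# Rough sketch: treat stacked word embeddings as a 1-channel image and
# classify it with a small CNN. NOT the paper's ZigZag ResNet; all layers
# and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class EmbeddingImageClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Linear(32 * 4 * 4, 2)  # human vs. AI-generated

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, num_tokens, embed_dim) -> add a channel axis
        x = self.features(embeddings.unsqueeze(1))
        return self.classifier(x.flatten(1))

# logits = EmbeddingImageClassifier()(torch.randn(8, 128, 128))  # 8 texts
```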
arXiv Detail & Related papers (2024-07-09T20:44:40Z)
- SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors [64.9938658716425]
Existing evaluations of large language models' (LLMs) ability to recognize and reject unsafe user requests face three limitations.
First, existing methods often use coarse-grained taxonomies of unsafe topics and over-represent some fine-grained topics.
Second, the linguistic characteristics and formatting of prompts, such as different languages and dialects, are often overlooked and only implicitly considered in many evaluations.
Third, existing evaluations rely on large LLMs for evaluation, which can be expensive.
arXiv Detail & Related papers (2024-06-20T17:56:07Z)
- Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore [51.65730053591696]
We propose a simple but effective black-box zero-shot detection approach.
It is predicated on the observation that human-written texts typically contain more grammatical errors than LLM-generated texts.
Our method achieves an average AUROC of 98.7% and shows strong robustness against paraphrase and adversarial perturbation attacks.
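A minimal sketch of that intuition, using the language_tool_python grammar checker as a stand-in for the grammatical-error-correction model and an assumed error-rate cutoff (the paper's actual scoring differs):
```python
# Sketch of the GECScore intuition: human-written text tends to contain more
# grammatical errors than LLM-generated text. The grammar checker and the
# threshold are illustrative stand-ins, not the paper's actual setup.
import language_tool_python

tool = language_tool_python.LanguageTool("en-US")

def grammatical_error_rate(text: str) -> float:
    """Detected grammar issues per 100 words."""
    words = max(len(text.split()), 1)
    return 100.0 * len(tool.check(text)) / words

def looks_llm_generated(text: str, threshold: float = 0.5) -> bool:
    # Few grammatical errors -> more likely machine-written (assumed cutoff).
    return grammatical_error_rate(text) < threshold
```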
arXiv Detail & Related papers (2024-05-07T12:57:01Z)
- The Human Factor in Detecting Errors of Large Language Models: A Systematic Literature Review and Future Research Directions [0.0]
The launch of ChatGPT by OpenAI in November 2022 marked a pivotal moment for Artificial Intelligence.
Large Language Models (LLMs) demonstrate remarkable conversational capabilities across various domains.
These models are susceptible to errors such as "hallucinations" and omissions, generating incorrect or incomplete information.
arXiv Detail & Related papers (2024-03-13T21:39:39Z)
- OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples [44.118047780553006]
OUTFOX is a framework that improves the robustness of LLM-generated-text detectors by allowing both the detector and the attacker to consider each other's output.
Experiments show that the proposed detector improves the detection performance on the attacker-generated texts by up to +41.3 points F1-score.
The detector shows a state-of-the-art detection performance: up to 96.9 points F1-score, beating existing detectors on non-attacked texts.
arXiv Detail & Related papers (2023-07-21T17:40:47Z)
- Red Teaming Language Model Detectors with Language Models [114.36392560711022]
Large language models (LLMs) present significant safety and ethical risks if exploited by malicious users.
Recent works have proposed algorithms to detect LLM-generated text and protect LLMs.
We study two types of attack strategies: 1) replacing certain words in an LLM's output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation.
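A toy sketch of the first strategy, using context-free WordNet synonyms rather than the context-aware substitution the paper describes:
```python
# Toy sketch of a synonym-substitution attack on detector robustness.
# Picks WordNet synonyms without the context-aware selection from the paper,
# so it is a simplification, not the authors' attack.
import random
from nltk.corpus import wordnet  # requires: nltk.download("wordnet")

def substitute_synonyms(text: str, rate: float = 0.2, seed: int = 0) -> str:
    rng = random.Random(seed)
    words = text.split()
    for i, word in enumerate(words):
        if rng.random() > rate:
            continue  # only perturb a fraction of the words
        synonyms = {
            lemma.name().replace("_", " ")
            for synset in wordnet.synsets(word)
            for lemma in synset.lemmas()
            if lemma.name().lower() != word.lower()
        }
        if synonyms:
            words[i] = rng.choice(sorted(synonyms))
    return " ".join(words)

# perturbed = substitute_synonyms("The model generates fluent academic prose.")
```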
arXiv Detail & Related papers (2023-05-31T10:08:37Z)
- Large Language Models can be Guided to Evade AI-Generated Text Detection [40.7707919628752]
Large language models (LLMs) have shown remarkable performance in various tasks and have been extensively utilized by the public.
We equip LLMs with prompts, rather than relying on an external paraphraser, to evaluate the vulnerability of these detectors.
We propose a novel Substitution-based In-Context example optimization method (SICO) to automatically construct prompts for evading the detectors.
arXiv Detail & Related papers (2023-05-18T10:03:25Z)
- MGTBench: Benchmarking Machine-Generated Text Detection [54.81446366272403]
This paper proposes the first benchmark framework for MGT detection against powerful large language models (LLMs).
We show that a larger number of words generally leads to better performance, and that most detection methods can achieve similar performance with far fewer training samples.
Our findings indicate that the model-based detection methods still perform well in the text attribution task.
arXiv Detail & Related papers (2023-03-26T21:12:36Z)