Explaining Generalization of AI-Generated Text Detectors Through Linguistic Analysis
- URL: http://arxiv.org/abs/2601.07974v1
- Date: Mon, 12 Jan 2026 20:16:06 GMT
- Title: Explaining Generalization of AI-Generated Text Detectors Through Linguistic Analysis
- Authors: Yuxi Xia, Kinga Stańczak, Benjamin Roth
- Abstract summary: We present a systematic study aimed at explaining generalization behavior through linguistic analysis. We construct a benchmark that spans 6 prompting strategies, 7 large language models (LLMs), and 4 domain datasets. We fine-tune classification-based detectors on various generation settings and evaluate their cross-prompt, cross-model, and cross-dataset generalization.
- Score: 2.626100048563503
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI-text detectors achieve high accuracy on in-domain benchmarks but often struggle to generalize across generation conditions such as unseen prompts, model families, or domains. While prior work has reported these generalization gaps, insight into their underlying causes remains limited. In this work, we present a systematic study aimed at explaining generalization behavior through linguistic analysis. We construct a comprehensive benchmark that spans 6 prompting strategies, 7 large language models (LLMs), and 4 domain datasets, resulting in a diverse set of human- and AI-generated texts. Using this dataset, we fine-tune classification-based detectors on various generation settings and evaluate their cross-prompt, cross-model, and cross-dataset generalization. To explain the performance variance, we compute correlations between generalization accuracies and the shifts of 80 linguistic features between training and test conditions. Our analysis reveals that generalization performance for specific detectors and evaluation conditions is significantly associated with linguistic features such as tense usage and pronoun frequency.
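The correlation analysis described in the abstract can be sketched in a few lines. The sketch below is illustrative only: the feature values and accuracies are invented, and the paper's actual pipeline (80 linguistic features, fine-tuned detectors) is not reproduced here.

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Invented numbers: each position is one (train condition, test condition)
# pair. tense_shift is the absolute change in a feature's mean (e.g. past-
# tense rate) between the two conditions; accuracy is the detector's
# cross-condition accuracy for that pair.
tense_shift = [0.01, 0.05, 0.10, 0.18, 0.25]
accuracy    = [0.95, 0.91, 0.84, 0.77, 0.70]

r = pearson(tense_shift, accuracy)
print(round(r, 3))  # close to -1: the larger the feature shift, the lower the accuracy
```

In the paper's setting, a strongly negative correlation like this would flag the feature as a candidate explanation for the generalization gap under that evaluation condition.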
Related papers
- BAID: A Benchmark for Bias Assessment of AI Detectors [9.156813547624923]
We propose BAID, a comprehensive evaluation framework for AI detectors across various types of biases. We introduce over 200k samples spanning 7 major categories: demographics, age, educational grade level, dialect, formality, political leaning, and topic. We find consistent disparities in detection performance, particularly low recall rates for texts from underrepresented groups.
arXiv Detail & Related papers (2025-12-12T12:01:42Z) - Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection [58.419940585826744]
We introduce FairOPT, an algorithm for group-specific threshold optimization for probabilistic AI-text detectors. We partition data into subgroups based on attributes (e.g., text length and writing style) and apply FairOPT to learn a decision threshold for each group that reduces performance discrepancies. Our framework paves the way for more robust classification in AI-generated content detection via post-processing.
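The core idea of group-specific thresholding can be illustrated with a toy grid search. This is not the FairOPT algorithm itself; the groups, scores, and labels below are invented, and a real system would also balance fairness criteria across groups rather than just maximize per-group accuracy.

```python
# For each subgroup, grid-search a decision threshold on the detector's
# probability scores, instead of applying one global cutoff (e.g. 0.5)
# to every group.

def best_threshold(probs, labels, grid=None):
    """Return the grid threshold maximizing accuracy (lowest wins ties)."""
    grid = grid or [i / 100 for i in range(1, 100)]
    def acc(t):
        return sum((p >= t) == bool(y) for p, y in zip(probs, labels)) / len(labels)
    return max(grid, key=acc)

# Invented scores: the detector separates "long" texts cleanly around 0.5
# but is systematically under-confident on AI-generated "short" texts,
# so a global 0.5 cutoff would miss them (low recall for that group).
groups = {                      # (probabilities, labels: 1 = AI-generated)
    "long":  ([0.90, 0.80, 0.45, 0.40], [1, 1, 0, 0]),
    "short": ([0.35, 0.30, 0.05, 0.02], [1, 1, 0, 0]),
}

thresholds = {g: best_threshold(p, y) for g, (p, y) in groups.items()}
print(thresholds)  # the "short" group gets a much lower cutoff than "long"
```

Applying the learned per-group cutoff at inference time is a pure post-processing step, which matches the framing in the summary above.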
arXiv Detail & Related papers (2025-02-06T21:58:48Z) - Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions [3.0906699069248806]
Construction Grammar (CxG) is a psycholinguistically grounded framework for testing generalization. Our dataset consists of English phrasal constructions for which speakers are known to be able to abstract over commonplace instantiations. Our results demonstrate that state-of-the-art models, including GPT-o1, exhibit a performance drop of over 40% on our second task.
arXiv Detail & Related papers (2025-01-08T18:15:10Z) - Evaluating Structural Generalization in Neural Machine Translation [13.880151307013318]
We construct SGET, a dataset covering various types of compositional generalization with control of words and sentence structures.
We show that neural machine translation models struggle more in structural generalization than in lexical generalization.
We also find different performance trends in semantic parsing and machine translation, which indicates the importance of evaluations across various tasks.
arXiv Detail & Related papers (2024-06-19T09:09:11Z) - Compositional Generalization for Data-to-Text Generation [86.79706513098104]
We propose a novel model that addresses compositional generalization by clustering predicates into groups.
Our model generates text in a sentence-by-sentence manner, relying on one cluster of predicates at a time.
It significantly outperforms T5 baselines across all evaluation metrics.
arXiv Detail & Related papers (2023-12-05T13:23:15Z) - How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z) - SLOG: A Structural Generalization Benchmark for Semantic Parsing [68.19511282584304]
The goal of compositional generalization benchmarks is to evaluate how well models generalize to new complex linguistic expressions.
Existing benchmarks often focus on lexical generalization, i.e., the interpretation of novel lexical items in syntactic structures familiar from training, while structural generalization cases remain underrepresented.
We introduce SLOG, a semantic parsing dataset that extends COGS with 17 structural generalization cases.
arXiv Detail & Related papers (2023-10-23T15:39:09Z) - How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study [59.13867562744973]
This work systematically assesses LMs' capabilities for out-of-distribution (OOD) scenarios.
We find that the efficacy of such learning paradigms varies with the type of OOD.
Specifically, while ICL excels for domain shifts, prompt-based fine-tuning surpasses for topic shifts.
arXiv Detail & Related papers (2023-09-15T11:15:47Z) - Sentiment Analysis on Brazilian Portuguese User Reviews [0.0]
This work analyzes the predictive performance of a range of document embedding strategies, assuming the polarity as the system outcome.
This analysis includes five sentiment analysis datasets in Brazilian Portuguese, unified in a single dataset, and a reference partitioning in training, testing, and validation sets, both made publicly available through a digital repository.
arXiv Detail & Related papers (2021-12-10T11:18:26Z) - Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformer Language Models (TLMs) were tested on a benchmark for the dynamic estimation of thematic fit.
Our results show that TLMs can reach performances that are comparable to those achieved by SDM.
However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z) - Learning Universal Representations from Word to Sentence [89.82415322763475]
This work introduces and explores universal representation learning, i.e., embedding different levels of linguistic units in a uniform vector space.
We present our approach of constructing analogy datasets in terms of words, phrases and sentences.
We empirically verify that well pre-trained Transformer models, combined with appropriate training settings, can effectively yield universal representations.
arXiv Detail & Related papers (2020-09-10T03:53:18Z) - Detecting and Understanding Generalization Barriers for Neural Machine Translation [53.23463279153577]
This paper attempts to identify and understand generalization barrier words within an unseen input sentence.
We propose a principled definition of generalization barrier words and a modified version which is tractable in computation.
We then conduct extensive analyses of the detected generalization barrier words on both Zh↔En NIST benchmarks.
arXiv Detail & Related papers (2020-04-05T12:33:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.