Related papers: DAMASHA: Detecting AI in Mixed Adversarial Texts via Segmentation with Human-interpretable Attribution

URL: http://arxiv.org/abs/2512.04838v1
Date: Thu, 04 Dec 2025 14:21:42 GMT
Title: DAMASHA: Detecting AI in Mixed Adversarial Texts via Segmentation with Human-interpretable Attribution
Authors: L. D. M. S. Sai Teja, N. Siva Gopala Krishna, Ufaq Khan, Muhammad Haris Khan, Partha Pakray, Atul Mishra
Abstract summary: In the age of advanced large language models, the boundaries between human and AI-generated text are becoming increasingly blurred. We introduce a novel framework, called Info-Mask, for mixed-authorship detection that integrates stylometric cues, perplexity-driven signals, and structured boundary modeling. Our findings highlight both the promise and limitations of adversarially robust, interpretable mixed-authorship detection.
Score: 20.178134447843092
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In the age of advanced large language models (LLMs), the boundaries between human and AI-generated text are becoming increasingly blurred. We address the challenge of segmenting mixed-authorship text, that is, identifying transition points in text where authorship shifts from human to AI or vice versa, a problem with critical implications for authenticity, trust, and human oversight. We introduce a novel framework, called Info-Mask, for mixed-authorship detection that integrates stylometric cues, perplexity-driven signals, and structured boundary modeling to accurately segment collaborative human-AI content. To evaluate the robustness of our system against adversarial perturbations, we construct and release an adversarial benchmark dataset, Mixed-text Adversarial setting for Segmentation (MAS), designed to probe the limits of existing detectors. Beyond segmentation accuracy, we introduce Human-Interpretable Attribution (HIA) overlays that highlight how stylometric features inform boundary predictions, and we conduct a small-scale human study assessing their usefulness. Across multiple architectures, Info-Mask significantly improves span-level robustness under adversarial conditions, establishing new baselines while revealing remaining challenges. Our findings highlight both the promise and limitations of adversarially robust, interpretable mixed-authorship detection, with implications for trust and oversight in human-AI co-authorship.

Related papers

DependencyAI: Detecting AI Generated Text through Dependency Parsing [10.075606234222963]
We introduce DependencyAI, a simple and interpretable approach for detecting AI-generated text. Our method achieves competitive performance across monolingual, multi-generator, and multilingual settings.
arXiv Detail & Related papers (2026-02-17T11:42:28Z)
Human Texts Are Outliers: Detecting LLM-generated Texts via Out-of-distribution Detection [71.59834293521074]
We develop a framework to distinguish between human-authored and machine-generated text. Our method achieves 98.3% AUROC and AUPR with only 8.9% FPR95 on the DeepFake dataset. Code, pretrained weights, and demo will be released.
arXiv Detail & Related papers (2025-10-07T08:14:45Z)
Fine-Grained Detection of AI-Generated Text Using Sentence-Level Segmentation [3.088244520495001]
A sentence-level sequence labeling model is proposed to detect transitions between human- and AI-generated text. Our model combines state-of-the-art pre-trained Transformer models, incorporating Neural Networks (NN) and Conditional Random Fields (CRFs). The evaluation is performed on two publicly available benchmark datasets containing collaborative human and AI-generated texts.
arXiv Detail & Related papers (2025-09-22T14:22:55Z)
DNA-DetectLLM: Unveiling AI-Generated Text via a DNA-Inspired Mutation-Repair Paradigm [17.258462909671525]
We introduce DNA-DetectLLM, a zero-shot detection method for distinguishing AI-generated and human-written text. DNA-DetectLLM achieves relative improvements of 5.55% in AUROC and 2.08% in F1 score across multiple public benchmark datasets.
arXiv Detail & Related papers (2025-09-19T03:08:13Z)
CoCoNUTS: Concentrating on Content while Neglecting Uninformative Textual Styles for AI-Generated Peer Review Detection [60.52240468810558]
We introduce CoCoNUTS, a content-oriented benchmark built upon a fine-grained dataset of AI-generated peer reviews. We also develop CoCoDet, an AI review detector via a multi-task learning framework, to achieve more accurate and robust detection of AI involvement in review content.
arXiv Detail & Related papers (2025-08-28T06:03:11Z)
HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring [14.887491317701997]
This paper explores the possibility of fine-grained MGT detection under human-AI coauthoring. We suggest fine-grained detectors can pave pathways toward coauthored text detection with a numeric AI ratio. Empirical results show that metric-based methods struggle to conduct fine-grained detection with a 0.462 average F1 score.
arXiv Detail & Related papers (2025-06-03T14:52:44Z)
Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection [44.05134959039957]
We investigate how sociolinguistic attributes (gender, CEFR proficiency, academic field, and language environment) impact state-of-the-art AI text detectors. Our results reveal significant biases: CEFR proficiency and language environment consistently affected detector accuracy, while gender and academic field showed detector-dependent effects. These findings highlight the crucial need for socially aware AI text detection to avoid unfairly penalizing specific demographic groups.
arXiv Detail & Related papers (2025-02-18T07:49:31Z)
Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts. We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z)
DEMASQ: Unmasking the ChatGPT Wordsmith [63.8746084667206]
We propose an effective ChatGPT detector named DEMASQ, which accurately identifies ChatGPT-generated content. Our method addresses two critical factors: (i) the distinct biases in text composition observed in human- and machine-generated content and (ii) the alterations made by humans to evade previous detection methods.
arXiv Detail & Related papers (2023-11-08T21:13:05Z)
Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey [97.33926242130732]
Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses. Despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs. To address these concerns, a consensus among the research community is to develop algorithmic solutions to detect AI-generated text.
arXiv Detail & Related papers (2023-10-23T18:11:32Z)
On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approximates human-like quality, the sample size needed for detection bounds increases. We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors, including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.