Antislop: A Comprehensive Framework for Identifying and Eliminating Repetitive Patterns in Language Models
- URL: http://arxiv.org/abs/2510.15061v2
- Date: Tue, 21 Oct 2025 21:42:07 GMT
- Title: Antislop: A Comprehensive Framework for Identifying and Eliminating Repetitive Patterns in Language Models
- Authors: Samuel Paech, Allen Roush, Judah Goldfeder, Ravid Shwartz-Ziv,
- Abstract summary: We present Antislop, a framework providing tools to both detect and eliminate these overused patterns. The Antislop Sampler uses backtracking to suppress unwanted strings at inference time without destroying vocabulary. FTPO achieves 90% slop reduction while maintaining or improving performance in cross-domain evals including GSM8K, MMLU, and creative writing tasks.
- Score: 8.02516998509823
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Widespread LLM adoption has introduced characteristic repetitive phraseology, termed "slop," which degrades output quality and makes AI-generated text immediately recognizable. We present Antislop, a comprehensive framework providing tools to both detect and eliminate these overused patterns. Our approach combines three innovations: (1) The Antislop Sampler, which uses backtracking to suppress unwanted strings at inference time without destroying vocabulary; (2) An automated pipeline that profiles model-specific slop against human baselines and generates training data; (3) Final Token Preference Optimization (FTPO), a novel fine-tuning method that operates on individual tokens, surgically adjusting logits wherever a banned pattern has appeared in an inference trace. We demonstrate that some slop patterns appear over 1,000x more frequently in LLM output than human text. The Antislop Sampler successfully suppresses 8,000+ patterns while maintaining quality, whereas token banning becomes unusable at just 2,000. Most importantly, FTPO achieves 90% slop reduction while maintaining or improving performance in cross-domain evals including GSM8K, MMLU, and creative writing tasks. In contrast, DPO suffers significant degradation in writing quality and lexical diversity despite achieving weaker suppression. We release all code and results under MIT license: https://github.com/sam-paech/auto-antislop.
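The backtracking idea behind the Antislop Sampler can be illustrated with a toy example. The sketch below is not the authors' implementation: `toy_model`, `backtracking_sample`, the word-level tokens, and the choice to block the first token of a banned phrase at the rewind position are all assumptions made for illustration. It generates tokens one at a time, and whenever the output ends with a banned phrase it rewinds to where the phrase began, blocks the token that started it at that position only, and resamples.

```python
import random
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical toy interface: maps the tokens generated so far to a
# next-token probability table. A real LLM would supply this distribution.
NextTokenFn = Callable[[Sequence[str]], Dict[str, float]]


def backtracking_sample(
    next_token_probs: NextTokenFn,
    banned_phrases: List[List[str]],
    max_tokens: int,
    rng: random.Random,
) -> List[str]:
    """Sample tokens, rewinding whenever a banned phrase is completed."""
    output: List[str] = []
    # blocked[i] holds tokens that may not be chosen at position i
    # for the current prefix output[:i].
    blocked: Dict[int, Set[str]] = {}

    while len(output) < max_tokens:
        pos = len(output)
        probs = next_token_probs(output)
        allowed = {
            tok: p for tok, p in probs.items()
            if p > 0 and tok not in blocked.get(pos, set())
        }
        if not allowed:
            if pos == 0:
                break  # nothing can be generated at all
            # Dead end: undo the previous token and block it one step back.
            bad = output.pop()
            blocked.pop(pos, None)
            blocked.setdefault(pos - 1, set()).add(bad)
            continue

        tokens = list(allowed)
        weights = [allowed[tok] for tok in tokens]
        output.append(rng.choices(tokens, weights=weights, k=1)[0])

        for phrase in banned_phrases:
            n = len(phrase)
            if n <= len(output) and output[-n:] == phrase:
                start = len(output) - n
                # Blocks recorded past the rewind point belong to a
                # prefix that is about to be discarded.
                for i in [k for k in blocked if k > start]:
                    del blocked[i]
                del output[start:]
                # Only this position is affected; the token stays usable
                # everywhere else, so vocabulary is not removed globally.
                blocked.setdefault(start, set()).add(phrase[0])
                break
    return output


def toy_model(context: Sequence[str]) -> Dict[str, float]:
    """Stand-in for a language model that strongly prefers 'rich tapestry'."""
    last = context[-1] if context else "<s>"
    table = {
        "<s>": {"a": 1.0},
        "a": {"rich": 0.8, "woven": 0.2},
        "rich": {"tapestry": 0.9, "fabric": 0.1},
    }
    return table.get(last, {"of": 0.5, "threads": 0.5})


sample = backtracking_sample(
    toy_model, [["rich", "tapestry"]], max_tokens=6, rng=random.Random(0)
)
print(" ".join(sample))
```

In this sketch the word "rich" remains available whenever it does not lead into the banned phrase (for example "rich fabric"), which is the contrast the abstract draws with outright token banning.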
Related papers
- Patronus: Identifying and Mitigating Transferable Backdoors in Pre-trained Language Models [20.691302472834675]
Transferable backdoors pose a severe threat to the Pre-trained Language Models (PLMs) supply chain. We propose Patronus, a novel framework that uses input-side invariance of triggers against parameter shifts. Experiments demonstrate that Patronus achieves $\geq 98.7\%$ backdoor detection recall and reduces attack success rates to those of clean settings.
arXiv Detail & Related papers (2025-12-07T15:51:56Z) - Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors [77.82885394684202]
We propose Contrastive Paraphrase Attack (CoPA), a training-free method that effectively deceives text detectors. CoPA constructs an auxiliary machine-like word distribution as a contrast to the human-like distribution generated by large language models. Our theoretical analysis suggests the superiority of the proposed attack.
arXiv Detail & Related papers (2025-05-21T10:08:39Z) - The Art of Repair: Optimizing Iterative Program Repair with Instruction-Tuned Models [48.073219761367184]
We investigate an APR pipeline that balances the generation of multiple outputs and multiple rounds of iteration. We fine-tune each model on an APR dataset with three sizes (1K, 30K, 65K) and two techniques (Full Fine-Tuning and LoRA). Our results show that by using only a fraction (1%) of the fine-tuning dataset, we can achieve improvements of up to 78% in the number of plausible patches generated.
arXiv Detail & Related papers (2025-05-05T18:06:51Z) - Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling [90.86991492288487]
Evaluating the constraint on every token can be prohibitively expensive. Locally constrained decoding (LCD) can distort the global distribution over strings, sampling tokens based only on local information. We show that our approach is superior to state-of-the-art baselines.
arXiv Detail & Related papers (2025-04-07T18:30:18Z) - Obliviate: Efficient Unmemorization for Protecting Intellectual Property in Large Language Models [2.7174461714624805]
We introduce Obliviate, a lightweight method that surgically suppresses exact reproduction of specified sequences. Obliviate first identifies memorized passages and then, for each target token, minimally adjusts the model's output distribution. We evaluate Obliviate on four popular 6-8B-parameter models (LLaMA-3.1, LLaMA-3.1-Instruct, Qwen-2.5, and Yi-1.5) using synthetic benchmarks and organic copyrighted excerpts.
arXiv Detail & Related papers (2025-02-20T20:02:56Z) - Auto-Prompt Generation is Not Robust: Prompt Optimization Driven by Pseudo Gradient [50.15090865963094]
We introduce PertBench, a comprehensive benchmark dataset that includes a wide range of input perturbations. Our analysis reveals substantial vulnerabilities in existing prompt generation strategies. We propose PGO, a gradient-free prompt generation framework that leverages perturbation types as pseudo-gradient signals.
arXiv Detail & Related papers (2024-12-24T06:05:08Z) - SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications [13.948608558319307]
Speculative decoding is widely adopted to reduce latency in large language model (LLM) inference. Agentic frameworks submit repetitive inference requests, which result in long and highly predictable computation. We introduce SuffixDecoding, a novel method that utilizes efficient suffix trees to cache long token sequences.
arXiv Detail & Related papers (2024-11-07T18:49:33Z) - Autoregressive Speech Synthesis without Vector Quantization [135.4776759536272]
We present MELLE, a novel continuous-valued token based language modeling approach for text-to-speech synthesis (TTS). MELLE autoregressively generates continuous mel-spectrogram frames directly from text condition. MELLE mitigates robustness issues by avoiding the inherent flaws of sampling vector-quantized codes.
arXiv Detail & Related papers (2024-07-11T14:36:53Z) - Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation [92.42032403795879]
We show that pretrained language models (LMs) such as GPT2 still tend to generate repetitive texts.
We attribute their overestimation of token-level repetition probabilities to the learning bias.
We find that LMs use longer-range dependencies to predict repetitive tokens than non-repetitive ones, which may be the cause of sentence-level repetition loops.
arXiv Detail & Related papers (2023-07-04T07:53:55Z) - Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level [4.250876580245865]
Existing AI-generated text classifiers have limited accuracy and often produce false positives.
We propose a novel approach using natural language processing (NLP) techniques.
We generate multiple paraphrased versions of a given question and input them into the large language model to generate answers.
By using a contrastive loss function based on cosine similarity, we match generated sentences with those from the student's response.
arXiv Detail & Related papers (2023-06-13T20:34:55Z) - Towards Variable-Length Textual Adversarial Attacks [68.27995111870712]
It is non-trivial to conduct textual adversarial attacks on natural language processing tasks due to the discreteness of data.
In this paper, we propose variable-length textual adversarial attacks (VL-Attack).
Our method achieves a BLEU score of $33.18$ on IWSLT14 German-English translation, an improvement of $1.47$ over the baseline model.
arXiv Detail & Related papers (2021-04-16T14:37:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.