Say Anything but This: When Tokenizer Betrays Reasoning in LLMs
- URL: http://arxiv.org/abs/2601.14658v1
- Date: Wed, 21 Jan 2026 05:09:09 GMT
- Title: Say Anything but This: When Tokenizer Betrays Reasoning in LLMs
- Authors: Navid Ayoobi, Marcus I Armstrong, Arjun Mukherjee
- Abstract summary: Large language models (LLMs) reason over discrete token ID sequences. Modern subword tokenizers routinely produce non-unique encodings. We show that tokenization can betray LLM reasoning through one-to-many token ID mappings.
- Score: 0.7162422068114824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) reason over discrete token ID sequences, yet modern subword tokenizers routinely produce non-unique encodings: multiple token ID sequences can detokenize to identical surface strings. This representational mismatch creates an unmeasured fragility wherein reasoning processes can fail. LLMs may treat two internal representations as distinct "words" even when they are semantically identical at the text level. In this work, we show that tokenization can betray LLM reasoning through one-to-many token ID mappings. We introduce a tokenization-consistency probe that requires models to replace designated target words in context while leaving all other content unchanged. The task is intentionally simple at the surface level, enabling us to attribute failures to tokenizer-detokenizer artifacts rather than to knowledge gaps or parameter limitations. Through analysis of over 11000 replacement trials across state-of-the-art open-source LLMs, we find a non-trivial rate of outputs exhibit phantom edits: cases where models operate under the illusion of correct reasoning, a phenomenon arising from tokenizer-induced representational defects. We further analyze these cases and provide a taxonomy of eight systematic tokenizer artifacts, including whitespace-boundary shifts and intra-word resegmentation. These findings indicate that part of apparent reasoning deficiency originates in the tokenizer layer, motivating tokenizer-level remedies before incurring the cost of training ever-larger models on ever-larger corpora.
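The one-to-many mapping the abstract describes can be illustrated with a toy vocabulary. This is a minimal sketch with hypothetical token IDs, not any real tokenizer: two distinct token ID sequences detokenize to the same surface string, so a model that reasons over IDs may treat them as different "words".

```python
# Hypothetical vocabulary: token ID -> surface piece.
vocab = {0: "he", 1: "llo", 2: "hel", 3: "lo"}

def detokenize(ids):
    """Concatenate the surface pieces for a sequence of token IDs."""
    return "".join(vocab[i] for i in ids)

seq_a = [0, 1]  # "he" + "llo"
seq_b = [2, 3]  # "hel" + "lo"

# Different ID sequences, identical surface string.
assert seq_a != seq_b
assert detokenize(seq_a) == detokenize(seq_b) == "hello"
```

A model conditioned on `seq_a` versus `seq_b` sees different inputs even though a human reader sees the same text, which is the representational mismatch the paper's consistency probe targets.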
Related papers
- Step-Level Sparse Autoencoder for Reasoning Process Interpretation [48.99201531966593]
Large Language Models (LLMs) have achieved strong complex reasoning capabilities through Chain-of-Thought (CoT) reasoning. We propose the step-level sparse autoencoder (SSAE), which serves as an analytical tool to disentangle different aspects of LLMs' reasoning steps into sparse features. Experiments on multiple base models and reasoning tasks show the effectiveness of the extracted features.
arXiv Detail & Related papers (2026-03-03T14:25:02Z) - LiteToken: Removing Intermediate Merge Residues From BPE Tokenizers [76.59130257385826]
Intermediate merge residues in BPE vocabularies arise frequently during merge learning and are retained in the final vocabulary, yet they are mostly merged further and rarely emitted when the tokenizer is applied to a corpus. We present a systematic empirical characterization of this phenomenon across commonly used tokenizers and introduce LiteToken, a simple method for removing residue tokens. Experiments show that LiteToken reduces token fragmentation, reduces parameters, and improves robustness to noisy or misspelled inputs, while preserving overall performance.
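The residue phenomenon can be reproduced with a toy BPE trainer. This is a sketch for illustration only, not LiteToken's method: it learns merges on a tiny corpus, then checks which learned tokens are ever actually emitted when the corpus is tokenized.

```python
from collections import Counter

def learn_bpe(corpus, num_merges):
    """Toy BPE trainer: returns the merge list and the final segmentation."""
    vocab = Counter(tuple(word) for word in corpus.split())
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, cnt in vocab.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += cnt
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        merged = best[0] + best[1]
        new_vocab = Counter()
        for word, cnt in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += cnt
        vocab = new_vocab
    return merges, vocab

merges, final_vocab = learn_bpe("low low low low lower lower", num_merges=4)

# Tokens created during merge learning...
merge_tokens = [a + b for a, b in merges]
# ...versus tokens actually emitted when the corpus is tokenized.
emitted = {tok for word in final_vocab for tok in word}
residues = [t for t in merge_tokens if t not in emitted]
# "lo" and "lowe" are learned but always merged further: residue tokens.
```

Here the learner creates "lo", "low", "lowe", and "lower", but only "low" and "lower" ever appear in the tokenized corpus; "lo" and "lowe" occupy vocabulary slots without being used, which is the kind of token a pruning method can remove.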
arXiv Detail & Related papers (2026-02-04T16:19:05Z) - Understanding LLM Failures: A Multi-Tape Turing Machine Analysis of Systematic Errors in Language Model Reasoning [0.033842793760651545]
Large language models (LLMs) exhibit failure modes on seemingly trivial tasks. We propose a formalisation of interaction using a deterministic multi-tape Turing machine. The model enables precise localisation of failure modes to specific pipeline stages.
arXiv Detail & Related papers (2026-01-27T16:12:01Z) - TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar [8.34539885321864]
We show that semantically identical code snippets can be tokenized differently depending on superficial factors such as whitespace or identifier naming. We introduce TokDrift, a framework that applies semantic-preserving rewrite rules to create code variants differing only in tokenization. Our findings identify misaligned tokenization as a hidden obstacle to reliable code understanding and generation.
arXiv Detail & Related papers (2025-10-16T17:59:45Z) - TokenSwap: Backdoor Attack on the Compositional Understanding of Large Vision-Language Models [57.32952956674526]
We introduce TokenSwap, a more evasive and stealthy backdoor attack on large vision-language models (LVLMs). Instead of enforcing a fixed targeted content, TokenSwap subtly disrupts the understanding of object relationships in text. TokenSwap achieves high attack success rates while maintaining superior evasiveness and stealthiness.
arXiv Detail & Related papers (2025-09-29T10:19:22Z) - Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations [83.93566096400723]
We find that instruction-tuned models retain up to 93.4% of their original performance when given a randomly sampled tokenization. Character-level segmentation improves string manipulation and code understanding tasks by up to +14%. Right-aligned digit grouping enhances large-number arithmetic by +33%.
arXiv Detail & Related papers (2025-06-23T18:02:26Z) - Sampling from Your Language Model One Byte at a Time [82.71473348639489]
Tokenization can introduce distortion into the model's generations, known as the Prompt Boundary Problem (PBP). We present an inference-time method to convert any autoregressive LM with a BPE tokenizer into a character-level or byte-level LM. Our method efficiently solves the PBP and is also able to unify the vocabularies of language models with different tokenizers.
arXiv Detail & Related papers (2025-06-17T02:37:04Z) - Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning [53.57895922042783]
Large Language Models (LLMs) excel at reasoning and planning when trained on chain-of-thought (CoT) data. We propose a hybrid representation of the reasoning process, where we partially abstract away the initial reasoning steps using latent discrete tokens.
arXiv Detail & Related papers (2025-02-05T15:33:00Z) - Attribution analysis of legal language as used by LLM [0.0]
We use two publicly available legal datasets: a simpler binary classification task involving "overruling" texts, and a more elaborate multiple-choice task identifying "holding" judicial decisions. We find that while all models correctly classify some test examples from the casehold task, other examples can be identified by only one model, and attribution can be used to highlight the reasons for this.
arXiv Detail & Related papers (2025-01-28T22:48:29Z) - Tokenization Falling Short: On Subword Robustness in Large Language Models [12.193639356480851]
This study systematically investigates these challenges and their impact on language models. Our findings reveal that scaling model parameters can mitigate the issue of tokenization. Our experiments show that subword regularization such as BPE-dropout can mitigate this issue.
arXiv Detail & Related papers (2024-06-17T16:05:32Z) - Large Language Models as Carriers of Hidden Messages [0.0]
Simple fine-tuning can embed hidden text into large language models (LLMs), which is revealed only when triggered by a specific query. Our work demonstrates that embedding hidden text via fine-tuning, although seemingly secure due to the vast number of potential triggers, is vulnerable to extraction. We introduce an extraction attack called Unconditional Token Forcing (UTF), which iteratively feeds tokens from the LLM's vocabulary to reveal sequences with high token probabilities, indicating hidden text candidates.
arXiv Detail & Related papers (2024-06-04T16:49:06Z) - Revisiting subword tokenization: A case study on affixal negation in large language models [57.75279238091522]
We measure the impact of affixal negation on modern English large language models (LLMs). We conduct experiments using LLMs with different subword tokenization methods. We show that models can, on the whole, reliably recognize the meaning of affixal negation.
arXiv Detail & Related papers (2024-04-03T03:14:27Z) - Identifying and Analyzing Performance-Critical Tokens in Large Language Models [52.404072802235234]
We study how large language models learn to perform tasks from demonstrations. Our work sheds light on this process and deepens our understanding of the roles different types of tokens play in large language models.
arXiv Detail & Related papers (2024-01-20T20:55:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.