Token-Efficient Change Detection in LLM APIs
- URL: http://arxiv.org/abs/2602.11083v1
- Date: Wed, 11 Feb 2026 17:48:29 GMT
- Title: Token-Efficient Change Detection in LLM APIs
- Authors: Timothée Chauvin, Clément Lalanne, Erwan Le Merrer, Jean-Michel Loubes, François Taïani, Gilles Tredan
- Abstract summary: Existing methods are either too expensive for deployment at scale, or require initial white-box access to model weights or grey-box access to log probabilities. We aim to achieve both low cost and strict black-box operation, observing only output tokens. Our approach hinges on specific inputs we call Border Inputs, for which there exists more than one output top token.
- Score: 8.484873526700978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Remote change detection in LLMs is a difficult problem. Existing methods are either too expensive for deployment at scale, or require initial white-box access to model weights or grey-box access to log probabilities. We aim to achieve both low cost and strict black-box operation, observing only output tokens. Our approach hinges on specific inputs we call Border Inputs, for which there exists more than one output top token. From a statistical perspective, optimal change detection depends on the model's Jacobian and the Fisher information of the output distribution. Analyzing these quantities in low-temperature regimes shows that border inputs enable powerful change detection tests. Building on this insight, we propose the Black-Box Border Input Tracking (B3IT) scheme. Extensive in-vivo and in-vitro experiments show that border inputs are easily found for non-reasoning tested endpoints, and achieve performance on par with the best available grey-box approaches. B3IT reduces costs by $30\times$ compared to existing methods, while operating in a strict black-box setting.
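The abstract's claim that border inputs enable powerful tests at low temperature can be illustrated with a toy example. The sketch below is not the authors' B3IT scheme: the logit values, the temperature, the size of the logit shift, and all function names are assumptions chosen for illustration. It compares an input whose two top logits are tied (a border input in the abstract's sense) with an input that has a clear top token, and measures how much the distribution of sampled output tokens moves when the model changes slightly.

```python
import math
import random
from collections import Counter


def softmax(logits, temperature):
    """Numerically stable softmax over logits scaled by 1/temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_tokens(logits, temperature, n, rng):
    """Draw n output tokens (as indices) and return their counts."""
    probs = softmax(logits, temperature)
    return Counter(rng.choices(range(len(logits)), weights=probs, k=n))


def total_variation(counts_a, counts_b, n):
    """Empirical total variation distance between two samples of size n."""
    keys = set(counts_a) | set(counts_b)
    return 0.5 * sum(abs(counts_a[k] - counts_b[k]) for k in keys) / n


def shifted(logits, delta):
    """Hypothetical model change: the first logit moves by delta."""
    return [logits[0] + delta] + list(logits[1:])


# All values below are illustrative assumptions, not taken from the paper.
TEMPERATURE = 0.01      # low-temperature regime
DELTA = 0.02            # small change in one logit
N_SAMPLES = 200         # output tokens observed before and after the change
BORDER = [5.0, 5.0, 1.0]    # two tied top tokens
REGULAR = [5.0, 3.0, 1.0]   # one clear top token

rng = random.Random(0)

# Exact top-token probability on the border input moves from 0.5 to about 0.88,
# because the logit gap is amplified by 1/TEMPERATURE before the softmax.
p_border_before = softmax(BORDER, TEMPERATURE)[0]
p_border_after = softmax(shifted(BORDER, DELTA), TEMPERATURE)[0]

# Black-box view: only sampled tokens are observed.
tv_border = total_variation(
    sample_tokens(BORDER, TEMPERATURE, N_SAMPLES, rng),
    sample_tokens(shifted(BORDER, DELTA), TEMPERATURE, N_SAMPLES, rng),
    N_SAMPLES,
)
tv_regular = total_variation(
    sample_tokens(REGULAR, TEMPERATURE, N_SAMPLES, rng),
    sample_tokens(shifted(REGULAR, DELTA), TEMPERATURE, N_SAMPLES, rng),
    N_SAMPLES,
)

print(f"border input:  p(top) {p_border_before:.3f} -> {p_border_after:.3f}, TV = {tv_border:.3f}")
print(f"regular input: TV = {tv_regular:.3f}")
```

In this toy setting the regular input keeps emitting the same token before and after the change, so output tokens alone reveal nothing, while the border input's token frequencies shift visibly. A real detector would replace the raw distance with a proper two-sample test and would first need to find border inputs on the target endpoint, which this sketch does not attempt.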
Related papers
- IOTA: Corrective Knowledge-Guided Prompt Learning via Black-White Box Framework [57.66924056568018]
We propose a novel black-whIte bOx prompT leArning framework (IOTA) for adapting pre-trained models to downstream tasks. IOTA integrates a data-driven Black Box module with a knowledge-driven White Box module for downstream task adaptation.
arXiv Detail & Related papers (2026-01-28T12:03:48Z)
- White-Box Sensitivity Auditing with Steering Vectors [14.807513989606647]
We propose a white-box sensitivity auditing framework for large language models (LLMs). Our method conducts internal sensitivity tests by manipulating key concepts relevant to the model's intended function for the task. Our method consistently reveals substantial dependence on protected attributes in model predictions.
arXiv Detail & Related papers (2026-01-23T02:03:20Z)
- Visualizing token importance for black-box language models [48.747801442240565]
We consider the problem of auditing black-box large language models (LLMs) to ensure they behave reliably when deployed in production settings. We propose Distribution-Based Sensitivity Analysis (DBSA) to evaluate the sensitivity of the output of a language model for each input token.
arXiv Detail & Related papers (2025-12-12T14:01:43Z)
- CarBoN: Calibrated Best-of-N Sampling Improves Test-time Reasoning [62.56541355300587]
We introduce a general test-time calibration framework that adaptively modifies the model toward high-reward reasoning paths. Within this framework, we propose CarBoN, a two-phase method that first explores the solution space and then learns a calibration of the logits. Experiments on MATH-500 and AIME-2024 show that CarBoN improves efficiency, with up to $4\times$ fewer rollouts to reach the same accuracy.
arXiv Detail & Related papers (2025-10-17T14:04:37Z)
- Beyond Linear Probes: Dynamic Safety Monitoring for Language Models [67.15793594651609]
Traditional safety monitors require the same amount of compute for every query. We introduce Truncated Polynomials (TPCs), a natural extension of linear probes for dynamic activation monitoring. Our key insight is that TPCs can be trained and evaluated progressively, term-by-term.
arXiv Detail & Related papers (2025-09-30T13:32:59Z)
- Auditing Black-Box LLM APIs with a Rank-Based Uniformity Test [24.393978712663618]
API providers may discreetly serve quantized or fine-tuned variants to reduce costs or maliciously alter model behaviors. We propose a rank-based uniformity test that can verify the behavioral equality of a black-box LLM to a locally deployed authentic model. We evaluate the approach across diverse threat scenarios, including quantization, harmful fine-tuning, jailbreak prompts, and full model substitution.
arXiv Detail & Related papers (2025-06-08T03:00:31Z)
- Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [71.7892165868749]
Commercial Large Language Model (LLM) APIs create a fundamental trust problem. Users pay for specific models but have no guarantee that providers deliver them faithfully. We formalize this model substitution problem and evaluate detection methods under realistic adversarial conditions. We propose and evaluate the use of Trusted Execution Environments (TEEs) as one practical and robust solution.
arXiv Detail & Related papers (2025-04-07T03:57:41Z)
- Beyond Next Token Probabilities: Learnable, Fast Detection of Hallucinations and Data Contamination on LLM Output Distributions [60.43398881149664]
We introduce LOS-Net, a lightweight attention-based architecture trained on an efficient encoding of the LLM Output Signature. It achieves superior performance across diverse benchmarks and LLMs, while maintaining extremely low detection latency.
arXiv Detail & Related papers (2025-03-18T09:04:37Z)
- A Watermark for Black-Box Language Models [31.772364827073808]
We propose a principled watermarking scheme that requires only the ability to sample sequences from the LLM. We provide performance guarantees, demonstrate how it can be leveraged when white-box access is available, and show when it can outperform existing white-box schemes via comprehensive experiments.
arXiv Detail & Related papers (2024-10-02T23:39:19Z)
- Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling [19.780068724002888]
POGER is a proxy-guided efficient re-sampling method for black-box AIGT detection.
It outperforms all baselines in macro F1 under black-box, partial white-box, and out-of-distribution settings.
arXiv Detail & Related papers (2024-02-14T14:32:16Z)
- Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation [71.21346469382821]
We introduce collaborative black-box tuning (CBBT) for both textual prompt optimization and output feature adaptation for black-box models.
CBBT is extensively evaluated on eleven downstream benchmarks and achieves remarkable improvements compared to existing black-box VL adaptation methods.
arXiv Detail & Related papers (2023-12-26T06:31:28Z)
- SADet: Learning An Efficient and Accurate Pedestrian Detector [68.66857832440897]
This paper proposes a series of systematic optimization strategies for the detection pipeline of a one-stage detector.
It forms a single-shot anchor-based detector (SADet) for efficient and accurate pedestrian detection.
Though structurally simple, it achieves state-of-the-art results and a real-time speed of $20$ FPS for VGA-resolution images.
arXiv Detail & Related papers (2020-07-26T12:32:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.