Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs
- URL: http://arxiv.org/abs/2504.04715v1
- Date: Mon, 07 Apr 2025 03:57:41 GMT
- Title: Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs
- Authors: Will Cai, Tianneng Shi, Xuandong Zhao, Dawn Song
- Abstract summary: Large Language Models (LLMs) accessed via black-box APIs introduce a trust challenge. Users pay for services based on advertised model capabilities, but providers may covertly substitute the specified model with a cheaper, lower-quality alternative to reduce operational costs. This lack of transparency undermines fairness, erodes trust, and complicates reliable benchmarking.
- Score: 60.881609323604685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The proliferation of Large Language Models (LLMs) accessed via black-box APIs introduces a significant trust challenge: users pay for services based on advertised model capabilities (e.g., size, performance), but providers may covertly substitute the specified model with a cheaper, lower-quality alternative to reduce operational costs. This lack of transparency undermines fairness, erodes trust, and complicates reliable benchmarking. Detecting such substitutions is difficult due to the black-box nature, typically limiting interaction to input-output queries. This paper formalizes the problem of model substitution detection in LLM APIs. We systematically evaluate existing verification techniques, including output-based statistical tests, benchmark evaluations, and log probability analysis, under various realistic attack scenarios like model quantization, randomized substitution, and benchmark evasion. Our findings reveal the limitations of methods relying solely on text outputs, especially against subtle or adaptive attacks. While log probability analysis offers stronger guarantees when available, its accessibility is often limited. We conclude by discussing the potential of hardware-based solutions like Trusted Execution Environments (TEEs) as a pathway towards provable model integrity, highlighting the trade-offs between security, performance, and provider adoption. Code is available at https://github.com/sunblaze-ucb/llm-api-audit
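As a rough illustration of the log probability analysis discussed in the abstract, the sketch below compares per-token logprobs returned for a fixed greedy-decoded canary prompt against a stored fingerprint of the advertised model. The function names, tolerance value, and toy numbers are illustrative assumptions, not the paper's exact procedure.

```python
def logprob_distance(reference, observed):
    """Mean absolute difference between aligned per-token logprobs."""
    assert len(reference) == len(observed), "fingerprints must align token-wise"
    return sum(abs(r - o) for r, o in zip(reference, observed)) / len(reference)


def flag_substitution(reference, observed, tolerance=0.05):
    """Flag a possible substitution when fingerprints diverge beyond tolerance.

    The tolerance must absorb benign nondeterminism (batching, kernel choice);
    quantized or smaller substitute models typically shift logprobs far more.
    """
    return logprob_distance(reference, observed) > tolerance


if __name__ == "__main__":
    # Toy per-token logprobs for one greedy-decoded canary prompt (illustrative values).
    ref = [-0.12, -0.53, -0.04, -1.30, -0.21]          # advertised model, recorded once
    same_model = [-0.12, -0.54, -0.04, -1.29, -0.21]   # small numeric jitter only
    substitute = [-0.45, -1.10, -0.30, -2.05, -0.88]   # e.g. a quantized replacement
    print(flag_substitution(ref, same_model))  # False
    print(flag_substitution(ref, substitute))  # True
```

In practice the fingerprints would come from an API that exposes token-level logprobs, which is exactly the accessibility limitation the abstract notes; when only text outputs are available, weaker statistical tests over outputs are the fallback.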
Related papers
- LR-IAD:Mask-Free Industrial Anomaly Detection with Logical Reasoning [1.3124513975412255]
Industrial Anomaly Detection (IAD) is critical for ensuring product quality by identifying defects.
Existing vision-language models (VLMs) and Multimodal Large Language Models (MLLMs) address some limitations but rely on mask annotations.
We propose a reward function that dynamically prioritizes rare defect patterns during training to handle class imbalance.
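As a hedged illustration of the class-imbalance idea above, the sketch below upweights rewards for rarer defect classes by inverse frequency. The smoothing and weighting scheme are assumptions, not the paper's exact reward function.

```python
from collections import Counter


def class_weights(labels, smoothing=1.0):
    """Inverse-frequency weights, smoothed so very rare classes do not explode."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {c: total / (n + smoothing) for c, n in counts.items()}


def reward(pred, gold, weights):
    # Correct detection of a rare class yields a proportionally larger reward.
    return weights[gold] if pred == gold else -1.0


labels = ["scratch"] * 90 + ["crack"] * 9 + ["contamination"] * 1
w = class_weights(labels)
print(reward("contamination", "contamination", w))  # large reward for the rare class
print(reward("scratch", "scratch", w))              # modest reward for the common class
```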
arXiv Detail & Related papers (2025-04-28T06:52:35Z) - Enhancing Smart Contract Vulnerability Detection in DApps Leveraging Fine-Tuned LLM [0.7018579932647147]
Decentralized applications (DApps) face significant security risks due to vulnerabilities in smart contracts.
This paper proposes a novel approach leveraging fine-tuned Large Language Models (LLMs) to enhance smart contract vulnerability detection.
arXiv Detail & Related papers (2025-04-07T12:32:14Z) - GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics [9.549568621873386]
GateLens is an algebra-based tool for analyzing datasets in the automotive domain. It achieves higher F1 scores and handles complex and ambiguous queries with greater robustness. It reduces analysis time by over 80% while maintaining high accuracy and reliability.
arXiv Detail & Related papers (2025-03-27T17:48:32Z) - Do Large Language Model Benchmarks Test Reliability? [66.1783478365998]
We investigate how well current benchmarks quantify model reliability.
Motivated by this gap in the evaluation of reliability, we propose the concept of so-called platinum benchmarks.
We evaluate a wide range of models on these platinum benchmarks and find that, indeed, frontier LLMs still exhibit failures on simple tasks.
arXiv Detail & Related papers (2025-02-05T18:58:19Z) - SVIP: Towards Verifiable Inference of Open-source Large Language Models [33.910670775972335]
Open-source Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language understanding and generation, leading to widespread adoption across various domains.
Their increasing model sizes render local deployment impractical for individual users, pushing many to rely on computing service providers for inference through a blackbox API.
This reliance introduces a new risk: a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without consent from users, thereby delivering inferior outputs while benefiting from cost savings.
arXiv Detail & Related papers (2024-10-29T17:52:45Z) - Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method [108.56493934296687]
We introduce a divergence-based calibration method, inspired by the divergence-from-randomness concept, to calibrate token probabilities for pretraining data detection.
We have developed a Chinese-language benchmark, PatentMIA, to assess the performance of detection approaches for LLMs on Chinese text.
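A rough sketch of the divergence-based calibration idea: rather than scoring membership by raw likelihood, compare the model's token logprobs against a simple frequency ("randomness") baseline. The baseline and scoring rule below are illustrative stand-ins, not the paper's method.

```python
import math
from collections import Counter


def frequency_logprobs(tokens, corpus_counts):
    """Log-probabilities under an add-one-smoothed token-frequency baseline."""
    total = sum(corpus_counts.values())
    vocab = len(corpus_counts)
    return [math.log((corpus_counts.get(t, 0) + 1) / (total + vocab)) for t in tokens]


def calibrated_score(model_logprobs, baseline_logprobs):
    """Average gap between model and baseline logprobs.

    Large positive values mean the model assigns far more probability than the
    frequency baseline would, a possible signal the text was seen in pretraining.
    """
    return sum(m - b for m, b in zip(model_logprobs, baseline_logprobs)) / len(model_logprobs)


if __name__ == "__main__":
    corpus_counts = Counter({"the": 50, "model": 10, "loss": 5, "rare": 1})
    tokens = ["the", "rare", "loss"]
    model_lp = [-0.3, -1.2, -0.9]  # illustrative model logprobs for these tokens
    base_lp = frequency_logprobs(tokens, corpus_counts)
    print(round(calibrated_score(model_lp, base_lp), 3))
```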
arXiv Detail & Related papers (2024-09-23T07:55:35Z) - DALD: Improving Logits-based Detector without Logits from Black-box LLMs [56.234109491884126]
Large Language Models (LLMs) have revolutionized text generation, producing outputs that closely mimic human writing.
We present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection.
DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations.
arXiv Detail & Related papers (2024-06-07T19:38:05Z) - Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
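A minimal, hypothetical sketch of the idea: sample several explanation/answer pairs and score confidence by how often the explanations converge on the same final answer. Answer agreement is used here as a crude proxy for explanation stability; the paper's actual measure is more refined.

```python
from collections import Counter


def explanation_confidence(samples):
    """samples: (explanation, answer) pairs drawn at nonzero temperature.

    Confidence is the share of sampled explanations that lead to the same
    final answer.
    """
    answers = Counter(answer for _, answer in samples)
    best, count = answers.most_common(1)[0]
    return best, count / len(samples)


samples = [
    ("chain of reasoning A", "42"),
    ("different but consistent chain", "42"),
    ("shaky reasoning", "7"),
]
print(explanation_confidence(samples))  # ('42', 0.666...)
```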
arXiv Detail & Related papers (2024-06-05T16:35:30Z) - On Leveraging Large Language Models for Enhancing Entity Resolution: A Cost-efficient Approach [7.996010840316654]
We propose an uncertainty reduction framework using Large Language Models (LLMs) to improve entity resolution results.
LLMs capitalize on their advanced linguistic capabilities and a pay-as-you-go model that provides significant advantages to those without extensive data science expertise.
We show that our method is efficient and effective, offering promising applications in real-world tasks.
arXiv Detail & Related papers (2024-01-07T09:06:58Z) - Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models [37.63939774027709]
Large language models (LLMs) specializing in natural language generation (NLG) have recently started exhibiting promising capabilities.
We propose and compare several confidence/uncertainty measures, applying them to *selective NLG* where unreliable results could either be ignored or yielded for further assessment.
Results reveal that a simple measure for the semantic dispersion can be a reliable predictor of the quality of LLM responses.
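A hedged sketch of the semantic-dispersion idea for selective generation: sample several responses, estimate pairwise similarity, and abstain when dispersion is high. Token-overlap Jaccard is used below as a cheap stand-in for a real semantic similarity model; the similarity backbone and threshold are assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Token-overlap similarity as a cheap stand-in for semantic similarity."""
    sa = set(a.lower().replace(".", "").split())
    sb = set(b.lower().replace(".", "").split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


def semantic_dispersion(responses):
    """1 minus the mean pairwise similarity across sampled responses."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def answer_or_abstain(responses, threshold=0.6):
    """Abstain (return None) when sampled responses disagree too much."""
    return None if semantic_dispersion(responses) > threshold else responses[0]


samples = ["Paris is the capital of France.",
           "The capital of France is Paris.",
           "France's capital city is Paris."]
print(answer_or_abstain(samples))  # low dispersion, so the first response is returned
```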
arXiv Detail & Related papers (2023-05-30T16:31:26Z) - Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing.
LLMs are extremely computationally expensive, even at inference time.
We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)