SVIP: Towards Verifiable Inference of Open-source Large Language Models
- URL: http://arxiv.org/abs/2410.22307v1
- Date: Tue, 29 Oct 2024 17:52:45 GMT
- Title: SVIP: Towards Verifiable Inference of Open-source Large Language Models
- Authors: Yifan Sun, Yuhang Li, Yue Zhang, Yuchen Jin, Huan Zhang,
- Abstract summary: Open-source Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language understanding and generation, leading to widespread adoption across various domains.
Their increasing model sizes render local deployment impractical for individual users, pushing many to rely on computing service providers for inference through a blackbox API.
This reliance introduces a new risk: a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without consent from users, thereby delivering inferior outputs while benefiting from cost savings.
- Score: 33.910670775972335
- License:
- Abstract: Open-source Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language understanding and generation, leading to widespread adoption across various domains. However, their increasing model sizes render local deployment impractical for individual users, pushing many to rely on computing service providers for inference through a blackbox API. This reliance introduces a new risk: a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without consent from users, thereby delivering inferior outputs while benefiting from cost savings. In this paper, we formalize the problem of verifiable inference for LLMs. Existing verifiable computing solutions based on cryptographic or game-theoretic techniques are either computationally uneconomical or rest on strong assumptions. We introduce SVIP, a secret-based verifiable LLM inference protocol that leverages intermediate outputs from LLM as unique model identifiers. By training a proxy task on these outputs and requiring the computing provider to return both the generated text and the processed intermediate outputs, users can reliably verify whether the computing provider is acting honestly. In addition, the integration of a secret mechanism further enhances the security of our protocol. We thoroughly analyze our protocol under multiple strong and adaptive adversarial scenarios. Our extensive experiments demonstrate that SVIP is accurate, generalizable, computationally efficient, and resistant to various attacks. Notably, SVIP achieves false negative rates below 5% and false positive rates below 3%, while requiring less than 0.01 seconds per query for verification.
Related papers
- Not all tokens are created equal: Perplexity Attention Weighted Networks for AI generated text detection [49.15148871877941]
Next-token distribution outputs offer a theoretically appealing approach for detection of large language models (LLMs)
We propose the Perplexity Attention Weighted Network (PAWN), which uses the last hidden states of the LLM and positions to weight the sum of a series of features based on metrics from the next-token distribution across the sequence length.
PAWN shows competitive and even better performance in-distribution than the strongest baselines with a fraction of their trainable parameters.
arXiv Detail & Related papers (2025-01-07T17:00:49Z) - Uncertainty-Aware Hybrid Inference with On-Device Small and Remote Large Language Models [49.48313161005423]
A hybrid language model (HLM) architecture integrates a small language model (SLM) operating on a mobile device with a large language model (LLM) hosted at the base station (BS) of a wireless network.
The HLM token generation process follows the speculative inference principle: the SLM's vocabulary distribution is uploaded to the LLM, which either accepts or rejects it, with rejected tokens being resampled by the LLM.
We propose a novel HLM structure coined Uncertainty-aware opportunistic HLM (U-HLM), wherein the SLM locally measures its output uncertainty and skips both up
arXiv Detail & Related papers (2024-12-17T09:08:18Z) - Can adversarial attacks by large language models be attributed? [1.3812010983144802]
Attributing outputs from Large Language Models in adversarial settings presents significant challenges that are likely to grow in importance.
We investigate this attribution problem using formal language theory, specifically language identification in the limit as introduced by Gold and extended by Angluin.
Our results show that due to the non-identifiability of certain language classes it is theoretically impossible to attribute outputs to specific LLMs with certainty.
arXiv Detail & Related papers (2024-11-12T18:28:57Z) - FedDTPT: Federated Discrete and Transferable Prompt Tuning for Black-Box Large Language Models [14.719919025265224]
Fine-tuning large language models (LLMs) with data from specific scenarios poses privacy leakage risks.
We propose for the first time a federated discrete and transferable prompt tuning, namely FedDTPT, for black-box large language models.
Our approach achieves higher accuracy, reduced communication overhead, and robustness to non-iid data in a black-box setting.
arXiv Detail & Related papers (2024-11-01T19:19:23Z) - SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization [8.121663525764294]
Large language models (LLMs) play a crucial role in our daily lives due to their ability to understand and generate human-like text.
In this report, we design a collaborative inference architecture between a server and its clients to alleviate the throughput limit.
We show in the experiments that we are able to efficiently distribute the workload allowing for roughly 1/3 reduction in the server workload.
arXiv Detail & Related papers (2024-10-14T17:38:41Z) - MEGen: Generative Backdoor in Large Language Models via Model Editing [56.46183024683885]
Large language models (LLMs) have demonstrated remarkable capabilities.
Their powerful generative abilities enable flexible responses based on various queries or instructions.
This paper proposes an editing-based generative backdoor, named MEGen, aiming to create a customized backdoor for NLP tasks with the least side effects.
arXiv Detail & Related papers (2024-08-20T10:44:29Z) - Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models [79.76293901420146]
Large Language Models (LLMs) are employed across various high-stakes domains, where the reliability of their outputs is crucial.
Our research investigates the fragility of uncertainty estimation and explores potential attacks.
We demonstrate that an attacker can embed a backdoor in LLMs, which, when activated by a specific trigger in the input, manipulates the model's uncertainty without affecting the final output.
arXiv Detail & Related papers (2024-07-15T23:41:11Z) - Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models [35.77228114378362]
Large Language Models (LLMs) generate malicious outputs when inputs contain specific "triggers" set by attackers.
Traditional defense strategies are impractical for API-accessible LLMs due to limited model access, high computational costs, and data requirements.
We propose Chain-of-Scrutiny (CoS) which leverages LLMs' unique reasoning abilities to mitigate backdoor attacks.
arXiv Detail & Related papers (2024-06-10T00:53:25Z) - Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z) - Tuning-Free Accountable Intervention for LLM Deployment -- A
Metacognitive Approach [55.613461060997004]
Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks.
We propose an innovative textitmetacognitive approach, dubbed textbfCLEAR, to equip LLMs with capabilities for self-aware error identification and correction.
arXiv Detail & Related papers (2024-03-08T19:18:53Z) - Privacy-Preserving XGBoost Inference [0.6345523830122165]
A major barrier to adoption is the sensitive nature of predictive queries.
One central goal of privacy-preserving machine learning (PPML) is to enable users to submit encrypted queries to a remote ML service.
We propose a privacy-preserving XGBoost prediction algorithm, which we have implemented and evaluated empirically on AWS SageMaker.
arXiv Detail & Related papers (2020-11-09T21:46:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.