TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference
- URL: http://arxiv.org/abs/2501.16007v1
- Date: Mon, 27 Jan 2025 12:46:45 GMT
- Title: TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference
- Authors: Jack Min Ong, Matthew Di Ferrante, Aaron Pazdera, Ryan Garner, Sami Jaghouar, Manveer Basra, Johannes Hagemann
- Abstract summary: Large language models (LLMs) have proven to be very capable, but access to the best models currently relies on inference providers, which introduces trust challenges.
We propose TOPLOC, a novel method for verifiable inference that addresses this problem.
- Abstract: Large language models (LLMs) have proven to be very capable, but access to the best models currently relies on inference providers, which introduces trust challenges -- how can we be sure that the provider is using the model configuration they claim? We propose TOPLOC, a novel method for verifiable inference that addresses this problem. TOPLOC leverages a compact locality sensitive hashing mechanism for intermediate activations that can detect unauthorized modifications to models, prompts, or precision with 100% accuracy, achieving no false positives or negatives in our empirical evaluations. Our approach is robust across diverse hardware configurations, GPU types, and algebraic reorderings, which allows for validation speeds significantly faster than the original inference. By introducing a polynomial encoding scheme, TOPLOC reduces the memory overhead of the generated commits by $1000\times$, requiring only 258 bytes of storage per 32 new tokens compared to the 262KB required to store the token embeddings directly for Llama-3.1-8B-Instruct. Our method empowers users to verify LLM inference computations efficiently, fostering greater trust and transparency in open ecosystems and laying a foundation for decentralized and verifiable AI services.
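To make the commit-and-verify flow concrete, here is a minimal NumPy sketch of the general idea. It is not the paper's polynomial encoding scheme: the top-k selection, quantization-free tolerance check, and all names are illustrative assumptions.

```python
import numpy as np

def toploc_style_commit(acts: np.ndarray, k: int = 128):
    """Hypothetical sketch (not the paper's exact scheme): record the
    positions and values of the k largest-magnitude entries of a window
    of intermediate activations as a compact commitment."""
    flat = acts.astype(np.float32).ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    idx.sort()
    return idx, flat[idx]

def toploc_style_verify(acts: np.ndarray, commit, tol: float = 1e-2) -> bool:
    """Recompute the activations and check the committed entries agree
    within a tolerance, so benign GPU/kernel reorderings still validate
    while a different model, prompt, or precision does not."""
    idx, vals = commit
    flat = acts.astype(np.float32).ravel()
    return bool(np.max(np.abs(flat[idx] - vals)) < tol)

rng = np.random.default_rng(0)
acts = rng.standard_normal((32, 4096)).astype(np.float32)   # 32 new tokens
commit = toploc_style_commit(acts)
noisy = acts + 1e-5 * rng.standard_normal(acts.shape).astype(np.float32)
print(toploc_style_verify(noisy, commit))    # True: reordering-level noise passes
print(toploc_style_verify(
    rng.standard_normal((32, 4096)).astype(np.float32), commit))  # False
```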
Related papers
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs for downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
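RTD's exact construction is not reproduced here; the sketch below shows the general retrieval-adjusted decoding pattern (in the style of kNN-LM) that such training-free adaptation builds on. The datastore layout, vote weighting, and interpolation factor are all assumptions.

```python
import numpy as np

def reference_adjusted_step(logits, query_vec, ref_keys, ref_token_ids,
                            lam=0.3, k=4):
    """Hedged sketch: a datastore maps stored hidden states (ref_keys) to
    the tokens that followed them (ref_token_ids); the k nearest references
    vote on the next token, and their distribution is interpolated with
    the model's own softmax. No gradient updates anywhere."""
    d = np.linalg.norm(ref_keys - query_vec, axis=1)
    nn = np.argsort(d)[:k]                     # k nearest reference entries
    w = np.exp(-d[nn]); w /= w.sum()           # distance-based vote weights
    ref_dist = np.zeros(len(logits))
    for tok, wt in zip(ref_token_ids[nn], w):
        ref_dist[tok] += wt
    model_dist = np.exp(logits - logits.max())
    model_dist /= model_dist.sum()
    return (1 - lam) * model_dist + lam * ref_dist

rng = np.random.default_rng(0)
keys = rng.standard_normal((100, 8))           # hypothetical task datastore
toks = rng.integers(0, 50, size=100)
p = reference_adjusted_step(rng.standard_normal(50), keys[3] + 0.01, keys, toks)
print(p.argmax(), round(p.sum(), 6))           # adjusted distribution sums to 1
```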
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - Self-Supervised Inference of Agents in Trustless Environments [44.99833362998488]
We propose a novel approach where agents can form swarms to produce high-quality responses effectively.
This is accomplished by utilizing agents capable of data inference and ranking.
We show that our approach is an order of magnitude faster than other trustless inference strategies, reaching validation latencies below 125 ms.
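The paper's ranking protocol is more involved; as a toy illustration only, the following sketch aggregates per-agent rankings of candidate responses with a Borda count. The names and scoring rule are assumptions.

```python
from collections import defaultdict

def swarm_select(responses, rankings):
    """Toy sketch: each agent ranks all candidate responses (best first)
    and a Borda count picks the swarm's answer."""
    scores = defaultdict(int)
    n = len(responses)
    for ranking in rankings:              # one ranking per agent
        for place, idx in enumerate(ranking):
            scores[idx] += n - place      # higher places earn more points
    return responses[max(scores, key=scores.get)]

responses = ["answer A", "answer B", "answer C"]
rankings = [[1, 0, 2], [1, 2, 0], [0, 1, 2]]   # indices, best first
print(swarm_select(responses, rankings))        # "answer B"
```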
arXiv Detail & Related papers (2024-09-12T20:32:07Z) - Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
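A toy version of the idea, assuming the simplest possible agreement measure (the paper analyzes the stability of the explanations themselves, not just answer counts):

```python
from collections import Counter

def explanation_confidence(samples):
    """Toy sketch: sample several (explanation, answer) pairs and score
    confidence as the fraction of samples agreeing on the majority answer.
    Stable reasoning concentrates mass on one answer; overconfident noise
    spreads it out."""
    answers = [answer for _, answer in samples]
    top_answer, top_count = Counter(answers).most_common(1)[0]
    return top_answer, top_count / len(answers)

samples = [("42 / 7 = 6, an integer", "yes"),
           ("7 * 6 = 42, so it divides evenly", "yes"),
           ("the remainder of 42 / 7 is 0", "yes"),
           ("42 is even but 7 is odd", "no")]
print(explanation_confidence(samples))          # ('yes', 0.75)
```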
arXiv Detail & Related papers (2024-06-05T16:35:30Z) - Data-Driven Distributionally Robust Safety Verification Using Barrier Certificates and Conditional Mean Embeddings [0.24578723416255752]
In a pursuit of developing scalable formal verification algorithms without shifting the problem to unrealistic assumptions, we employ the concept of barrier certificates.
We show how to solve the resulting program efficiently using sum-of-squares optimization and a Gaussian process envelope.
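For reference, one standard stochastic barrier-certificate formulation looks as follows; the paper's distributionally robust variant replaces the plain expectation with a worst case over an ambiguity set estimated via conditional mean embeddings.

```latex
\begin{align*}
  &B(x) \ge 0 \quad \forall x \in X,\\
  &B(x) \le \gamma \quad \forall x \in X_0 \ \text{(initial set)},\\
  &B(x) \ge 1 \quad \forall x \in X_u \ \text{(unsafe set)},\\
  &\mathbb{E}\big[B(f(x, w)) \mid x\big] \le B(x) + c \quad \forall x \in X\\
  &\quad\Longrightarrow\quad
  \Pr\big(\exists\, k \le T : x_k \in X_u\big) \;\le\; \gamma + cT.
\end{align*}
```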
arXiv Detail & Related papers (2024-03-15T17:32:02Z) - Enhancing Security in Federated Learning through Adaptive Consensus-Based Model Update Validation [2.28438857884398]
This paper introduces an advanced approach for fortifying Federated Learning (FL) systems against label-flipping attacks.
We propose a consensus-based verification process integrated with an adaptive thresholding mechanism.
Our results indicate a significant mitigation of label-flipping attacks, bolstering the FL system's resilience.
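A minimal sketch of consensus-based update screening, assuming a median/MAD threshold rule (the paper's adaptive thresholding mechanism differs in detail):

```python
import numpy as np

def validate_updates(updates, sensitivity=4.0):
    """Hedged sketch: score each client update by cosine similarity to the
    coordinate-wise median update, then reject updates whose score falls
    far below the cohort median (an adaptive, data-driven threshold).
    Label-flipping clients push against the consensus direction and fail."""
    consensus = np.median(updates, axis=0)
    sims = np.array([u @ consensus /
                     (np.linalg.norm(u) * np.linalg.norm(consensus) + 1e-12)
                     for u in updates])
    med = np.median(sims)
    mad = np.median(np.abs(sims - med)) + 1e-12
    return sims >= med - sensitivity * mad      # True = accepted

rng = np.random.default_rng(1)
honest = rng.standard_normal((9, 100)) + 1.0    # updates share a direction
flipped = -10.0 * honest[0]                     # label-flipped client
print(validate_updates(np.vstack([honest, flipped])))  # last entry rejected
```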
arXiv Detail & Related papers (2024-03-05T20:54:56Z) - Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition [86.21889574126878]
We show how per-frame entropy values can be normalized and aggregated to obtain a confidence measure per unit and per word.
We evaluate the proposed confidence measures on LibriSpeech test sets, and show that they are up to 2 and 4 times better than confidence estimation based on the maximum per-frame probability.
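The recipe is straightforward to sketch; the paper studies several normalization and aggregation variants, so the specific choices below are assumptions:

```python
import numpy as np

def word_confidence(frame_probs, word_spans):
    """Sketch of entropy-based word confidence: per-frame Shannon entropy
    is normalized to [0, 1] (1 = one-hot, 0 = uniform) and aggregated over
    each word's frames with min, a conservative aggregation choice.

    frame_probs: (frames, vocab) per-frame output distributions.
    word_spans:  list of (start, end) frame spans, one per word.
    """
    p = np.clip(frame_probs, 1e-12, 1.0)
    entropy = -(p * np.log(p)).sum(axis=1)
    conf = 1.0 - entropy / np.log(p.shape[1])   # normalized per-frame confidence
    return [float(conf[s:e].min()) for s, e in word_spans]

probs = np.array([[0.97, 0.01, 0.01, 0.01],     # confident frame
                  [0.40, 0.30, 0.20, 0.10],     # uncertain frame
                  [0.85, 0.05, 0.05, 0.05]])
print(word_confidence(probs, [(0, 2), (2, 3)])) # first word limited by its worst frame
```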
arXiv Detail & Related papers (2022-12-16T20:27:40Z) - Efficient ML Models for Practical Secure Inference [9.536081713410812]
Cryptographic primitives allow inference without revealing users' inputs to a model provider or the model's weights to a user.
Secure inference is in principle feasible for this setting, but there are no existing techniques that make it practical at scale.
We show that the primary bottlenecks in secure inference are large linear layers which can be optimized with the choice of network backbone.
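A back-of-the-envelope way to see the backbone argument: count the secure multiplications that linear layers induce. Protocol constants are ignored, and the backbones below are hypothetical, not the paper's.

```python
def linear_layer_madds(layers):
    """Each (n_in, n_out) linear layer contributes n_in * n_out secure
    multiplications, a first-order proxy for secure-inference cost."""
    return sum(n_in * n_out for n_in, n_out in layers)

wide = [(2048, 2048)] * 4     # hypothetical wide backbone
narrow = [(512, 512)] * 16    # deeper but narrower alternative
print(linear_layer_madds(wide), linear_layer_madds(narrow))
# 16777216 vs 4194304: 4x fewer secure multiplications despite 4x more layers
```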
arXiv Detail & Related papers (2022-08-26T09:42:21Z) - CREPO: An Open Repository to Benchmark Credal Network Algorithms [78.79752265884109]
Credal networks are imprecise probabilistic graphical models based on so-called credal sets of probability mass functions.
A Java library called CREMA has recently been released to model, process, and query credal networks.
We present CREPO, an open repository of synthetic credal networks, provided together with the exact results of inference tasks on these models.
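The basic object such benchmarks exercise is easy to state: with a credal set given by the extreme points of a polytope of mass functions, lower and upper expectations are attained at vertices. A minimal sketch (the example numbers are made up):

```python
import numpy as np

def credal_bounds(vertices, payoff):
    """Lower/upper expectation over a credal set represented by the extreme
    points of its polytope; a linear functional over a polytope attains its
    extrema at vertices."""
    values = [float(np.dot(p, payoff)) for p in vertices]
    return min(values), max(values)

# Two extreme points of a credal set over a 3-state variable.
vertices = [np.array([0.5, 0.3, 0.2]), np.array([0.2, 0.3, 0.5])]
indicator = np.array([1.0, 0.0, 0.0])        # event {state 0}
print(credal_bounds(vertices, indicator))     # lower/upper probability (0.2, 0.5)
```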
arXiv Detail & Related papers (2021-05-10T07:31:59Z) - Incremental Verification of Fixed-Point Implementations of Neural Networks [0.19573380763700707]
We develop and evaluate a novel symbolic verification framework using incremental bounded model checking (BMC), satisfiability modulo theories (SMT), and invariant inference.
Our approach was able to verify and produce examples for 85.8% of 21 test cases considering different input images, and 100% of the properties related to covering methods.
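In the same spirit, though with a different tool chain and encoding than the paper's, a property of a fixed-point computation can be checked with an off-the-shelf SMT solver; the layer, format, and bound below are all made up:

```python
# Requires: pip install z3-solver
from z3 import BitVec, BitVecVal, If, Solver, sat

# 16-bit signed Q4.4 fixed point: value = bits / 16.
x = BitVec('x', 16)
w, b = 24, 8                        # hypothetical weight 1.5 and bias 0.5 in Q4.4
y = (w * x) / 16 + b                # fixed-point multiply with rescale (signed ops)
relu = If(y > 0, y, BitVecVal(0, 16))

s = Solver()
s.add(x >= -64, x <= 64)            # input range [-4.0, 4.0]
s.add(relu > 112)                   # negated property: output exceeds 7.0
if s.check() == sat:
    print("counterexample: x =", s.model()[x].as_signed_long() / 16)
else:
    print("property holds on the input range")   # this branch is taken here
```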
arXiv Detail & Related papers (2020-12-21T10:03:44Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
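FedAvg's CTA template is compact enough to sketch directly; the toy least-squares objective and hyperparameters are assumptions (FedPD itself modifies this template with primal-dual updates):

```python
import numpy as np

def fedavg_round(global_w, client_data, lr=0.1, local_steps=5):
    """One CTA round: each client computes locally (gradient steps on a toy
    least-squares loss), then the server aggregates by size-weighted
    averaging of the locally updated weights."""
    new_weights, sizes = [], []
    for X, y in client_data:
        w = global_w.copy()
        for _ in range(local_steps):                 # computation, on-device
            w -= lr * 2 * X.T @ (X @ w - y) / len(y)
        new_weights.append(w)
        sizes.append(len(y))
    return np.average(new_weights, axis=0, weights=sizes)  # aggregation

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = [(X, X @ w_true) for X in rng.standard_normal((3, 20, 2))]
w = np.zeros(2)
for _ in range(10):
    w = fedavg_round(w, clients)
print(w)   # approaches [2, -1] on this IID toy problem
```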
arXiv Detail & Related papers (2020-05-22T23:07:42Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
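The refinement step itself is simple; in the sketch below the confidence weights come from a fixed-temperature distance softmax, whereas the paper meta-learns that weighting across episodes:

```python
import numpy as np

def refine_prototypes(protos, queries, temp=1.0):
    """Confidence-weighted transductive refinement: each unlabeled query
    pulls class prototypes toward itself in proportion to its confidence.
    Here confidence is a softmax over negative squared distances with a
    fixed temperature; the paper meta-learns this weighting instead."""
    d = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    conf = np.exp(-d / temp)
    conf /= conf.sum(axis=1, keepdims=True)     # (n_query, n_class) weights
    num = protos + conf.T @ queries             # each prototype keeps weight 1
    den = 1.0 + conf.sum(axis=0)[:, None]
    return num / den

protos = np.array([[0.0, 0.0], [4.0, 4.0]])
queries = np.array([[0.5, 0.2], [3.8, 4.1], [4.2, 3.9]])
print(refine_prototypes(protos, queries))       # prototypes drift toward queries
```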
arXiv Detail & Related papers (2020-02-27T10:22:17Z)