ZKTorch: Compiling ML Inference to Zero-Knowledge Proofs via Parallel Proof Accumulation
- URL: http://arxiv.org/abs/2507.07031v2
- Date: Thu, 10 Jul 2025 23:50:39 GMT
- Title: ZKTorch: Compiling ML Inference to Zero-Knowledge Proofs via Parallel Proof Accumulation
- Authors: Bing-Jyue Chen, Lilia Tang, Daniel Kang
- Abstract summary: We propose an end-to-end proving system that compiles ML models into base cryptographic operations. ZKTorch is built on top of a novel parallel extension to the Mira accumulation scheme, enabling succinct proofs with minimal accumulation overhead. These contributions allow ZKTorch to achieve at least a $3\times$ reduction in proof size compared to specialized protocols and up to a $6\times$ speedup in proving time over a general-purpose ZKML framework.
- Score: 3.7687375904925484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As AI models become ubiquitous in our daily lives, there has been an increasing demand for transparency in ML services. However, the model owner does not want to reveal the weights, as they are considered trade secrets. To solve this problem, researchers have turned to zero-knowledge proofs of ML model inference. These proofs convince the user that the ML model output is correct, without revealing the weights of the model to the user. Past work on these provers can be placed into two categories. The first method compiles the ML model into a low-level circuit, and proves the circuit using a ZK-SNARK. The second method uses custom cryptographic protocols designed only for a specific class of models. Unfortunately, the first method is highly inefficient, making it impractical for the large models used today, and the second method does not generalize well, making it difficult to update in the rapidly changing field of machine learning. To solve this, we propose ZKTorch, an open source end-to-end proving system that compiles ML models into base cryptographic operations called basic blocks, each proved using specialized protocols. ZKTorch is built on top of a novel parallel extension to the Mira accumulation scheme, enabling succinct proofs with minimal accumulation overhead. These contributions allow ZKTorch to achieve at least a $3\times$ reduction in the proof size compared to specialized protocols and up to a $6\times$ speedup in proving time over a general-purpose ZKML framework.
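The abstract sketches a three-stage pipeline: compile the model into cryptographic basic blocks, prove each block with a specialized protocol, and fold the per-block proofs with a parallel accumulation scheme. The toy Python sketch below illustrates only the shape of that pipeline; every name (`BasicBlock`, `prove_block`, `accumulate`) is a hypothetical placeholder and the "proofs" are dummy bytes, not ZKTorch's actual API or cryptography.

```python
# Illustrative sketch only: the rough shape of the pipeline the abstract
# describes (compile a model into "basic blocks", prove each block with a
# specialized protocol, fold the per-block proofs with an accumulation step).
# All names are hypothetical and the "proofs" are placeholder bytes.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List


@dataclass
class BasicBlock:
    name: str       # e.g. "matmul", "relu", "softmax"
    payload: bytes  # stand-in for the block's wires/witness


def prove_block(block: BasicBlock) -> bytes:
    # A real system would run a block-specific argument here; we just
    # return a tagged placeholder so the data flow is visible.
    return f"proof:{block.name}".encode().ljust(32, b"\0")[:32]


def accumulate(acc: bytes, proof: bytes) -> bytes:
    # Stand-in for one folding step of an accumulation scheme: the running
    # accumulator absorbs a new proof so the verifier ultimately checks a
    # single accumulated object instead of one proof per block.
    return bytes(a ^ b for a, b in zip(acc, proof))


def prove_model(blocks: List[BasicBlock]) -> bytes:
    # Per-block proofs are independent, so they can be produced in parallel...
    with ThreadPoolExecutor() as pool:
        proofs = list(pool.map(prove_block, blocks))
    # ...and then folded (here sequentially; a tree reduction also works).
    acc = b"\0" * 32
    for proof in proofs:
        acc = accumulate(acc, proof)
    return acc


if __name__ == "__main__":
    model = [BasicBlock("matmul", b"..."), BasicBlock("relu", b"..."),
             BasicBlock("softmax", b"...")]
    print(prove_model(model).hex())
```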
Related papers
- MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on Large Language Models [53.36415620647177]
Semi-structured sparsity offers a promising solution by strategically retaining $N$ elements out of every $M$ weights. Existing (N:M)-compatible approaches typically fall into two categories: rule-based layerwise greedy search, which suffers from considerable errors, and gradient-driven learning, which incurs prohibitive training costs. We propose a novel linear-space probabilistic framework named MaskPro, which aims to learn a prior categorical distribution for every $M$ consecutive weights and subsequently leverages this distribution to generate the (N:M)-sparsity through $N$-way sampling (a toy mask-sampling sketch appears after this list).
arXiv Detail & Related papers (2025-06-15T15:02:59Z) - Generation of Optimized Solidity Code for Machine Learning Models using LLMs [5.07666452437053]
We propose a novel approach that converts the inference path of an ML model, along with its weights trained off-chain, into Solidity code using Large Language Models (LLMs). We have also developed a proof-of-concept decentralized application that uses the generated code to verify the accuracy claims of the underlying ML model.
arXiv Detail & Related papers (2025-03-08T13:12:52Z) - P2W: From Power Traces to Weights Matrix -- An Unconventional Transfer Learning Approach [1.1383507019490222]
The rapid growth in deploying machine learning (ML) models within embedded systems on a chip (SoCs) has led to transformative shifts in fields like healthcare and autonomous vehicles. One of the primary challenges in training such embedded ML models is the lack of publicly available high-quality training data. We introduce a novel, unconventional transfer learning approach that trains a new ML model by extracting and reusing weights from an existing ML model.
arXiv Detail & Related papers (2025-02-20T19:05:28Z) - Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines [74.42485647685272]
We focus on Generative Masked Language Models (GMLMs).
We train a model to fit conditional probabilities of the data distribution via masking, which are subsequently used as inputs to a Markov chain to draw samples from the model.
We adapt the T5 model for iteratively-refined parallel decoding, achieving a 2-3x speedup in machine translation with minimal sacrifice in quality (a toy sketch of this refinement loop appears after this list).
arXiv Detail & Related papers (2024-07-22T18:00:00Z) - Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment [56.44025052765861]
Large language models (LLMs) have revolutionized Natural Language Processing (NLP), but their size creates computational bottlenecks.
We introduce a novel approach to create accurate, sparse foundational versions of performant LLMs.
We show a total speedup on CPUs for sparse-quantized LLaMA models of up to 8.6x.
arXiv Detail & Related papers (2024-05-06T16:03:32Z) - Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens [15.566726645722657]
We propose a novel framework specifically designed for speculative sampling.
Within this framework, we introduce a lightweight draft model that effectively utilizes previously generated tokens to predict subsequent words.
We demonstrate impressive results, achieving an average latency speedup ratio of 2.7x compared to the vanilla auto-regressive decoding approach (a generic speculative-decoding sketch appears after this list).
arXiv Detail & Related papers (2024-02-24T08:10:39Z) - Co(ve)rtex: ML Models as storage channels and their (mis-)applications [2.792027541710663]
In machine learning systems, don't-care states and undefined behavior have been shown to be sources of significant vulnerabilities.
We consider the ML model as a storage channel with a capacity that increases with over-parameterization.
We develop optimizations to improve the capacity in this case, including a novel ML-specific substitution-based error correction protocol.
arXiv Detail & Related papers (2023-07-17T19:57:10Z) - The False Promise of Imitating Proprietary LLMs [158.65692029352584]
An emerging method to cheaply improve a weaker language model is to finetune it on outputs from a stronger model.
This approach looks to cheaply imitate the proprietary model's capabilities using a weaker open-source model.
We first finetune a series of LMs that imitate ChatGPT using varying base model sizes.
We then evaluate the models using crowd raters and canonical NLP benchmarks.
arXiv Detail & Related papers (2023-05-25T05:00:12Z) - Scaling up Trustless DNN Inference with Zero-Knowledge Proofs [47.42532753464726]
We present the first practical ImageNet-scale method to verify ML model inference non-interactively, i.e., after the inference has been done.
We provide the first ZK-SNARK proof of valid inference for a full-resolution ImageNet model, achieving 79% top-5 accuracy.
arXiv Detail & Related papers (2022-10-17T00:35:38Z) - Proof-of-Learning: Definitions and Practice [15.585184189361486]
Training machine learning (ML) models typically involves expensive iterative optimization.
There is currently no mechanism for the entity that trained the model to prove that the resulting model parameters were indeed the result of this optimization procedure.
This paper introduces the concept of proof-of-learning in ML.
arXiv Detail & Related papers (2021-03-09T18:59:54Z) - Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity [67.02490430380415]
We show that model-based MARL achieves a sample complexity of $\tilde{O}(|S||B|(1-\gamma)^{-3}\epsilon^{-2})$ for finding the Nash equilibrium (NE) value up to some $\epsilon$ error.
We also show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, where the algorithm queries state transition samples without reward knowledge.
arXiv Detail & Related papers (2020-07-15T03:25:24Z) - Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
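The MaskPro entry above describes learning a categorical distribution over every $M$ consecutive weights and drawing $N$ positions per group without replacement to form the sparsity mask. Below is a minimal, hypothetical sketch of that sampling step using Gumbel-top-k (one standard way to sample without replacement); the tensor names are made up and this is not MaskPro's actual implementation.

```python
# Minimal sketch of (N:M) mask sampling, assuming per-group logits that a
# method like MaskPro would learn. Gumbel-top-k draws N positions per group
# without replacement; illustrative only, not the paper's implementation.
import torch


def sample_nm_mask(logits: torch.Tensor, n: int) -> torch.Tensor:
    """logits: (num_groups, m) scores for each position within a group of m.
    Returns a {0,1} mask with exactly n ones per group."""
    gumbel = -torch.log(-torch.log(torch.rand_like(logits).clamp_min(1e-9)))
    keep = torch.topk(logits + gumbel, k=n, dim=-1).indices
    mask = torch.zeros_like(logits)
    mask.scatter_(-1, keep, 1.0)
    return mask


# Example: 2:4 sparsity on a small weight matrix reshaped into groups of 4.
weights = torch.randn(8, 4)          # 8 groups, M = 4 weights each
logits = torch.zeros_like(weights)   # uniform prior; learned in practice
sparse_weights = weights * sample_nm_mask(logits, n=2)
```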
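The GMLM entry above describes filling masked positions in parallel and then iteratively refining the result. The loop below is a generic toy version of that idea, assuming a `model(tokens)` callable that returns per-position logits; it is not the paper's T5 adaptation.

```python
# Toy iterative parallel decoding loop: start fully masked, fill every
# position in parallel, then re-mask the least-confident predictions and
# repeat. `model` is an assumed callable returning (batch, length, vocab)
# logits; this is a generic illustration, not the paper's T5 setup.
import torch

MASK_ID = 0


def parallel_decode(model, length: int, steps: int = 4) -> torch.Tensor:
    tokens = torch.full((1, length), MASK_ID, dtype=torch.long)
    for step in range(steps):
        probs = model(tokens).softmax(dim=-1)   # (1, length, vocab)
        conf, pred = probs.max(dim=-1)          # confidence and argmax ids
        tokens = pred                           # fill all positions at once
        num_remask = int(length * (1 - (step + 1) / steps))
        if num_remask > 0:                      # keep more tokens each step
            remask = conf[0].topk(num_remask, largest=False).indices
            tokens[0, remask] = MASK_ID
    return tokens
```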
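The Chimera entry builds on the general speculative-sampling recipe: a lightweight draft model proposes a few tokens and the target model verifies them in one pass. The sketch below shows a generic greedy-verification variant of that loop, with assumed `draft` and `target` callables; Chimera's actual draft architecture and acceptance rule differ.

```python
# Generic speculative-decoding step with greedy verification. `draft` and
# `target` are assumed callables mapping token ids (batch, seq) to logits
# (batch, seq, vocab); this illustrates the idea, not Chimera's method.
import torch


def speculative_step(target, draft, prefix: torch.Tensor, k: int = 4) -> torch.Tensor:
    # 1) The cheap draft model proposes k tokens autoregressively.
    seq = prefix
    for _ in range(k):
        nxt = draft(seq)[:, -1].argmax(dim=-1, keepdim=True)
        seq = torch.cat([seq, nxt], dim=-1)
    # 2) The target model scores the whole proposal in a single forward pass.
    target_next = target(seq)[:, :-1].argmax(dim=-1)  # prediction for position i+1
    proposed = seq[:, prefix.shape[1]:]                # the k drafted tokens
    expected = target_next[:, prefix.shape[1] - 1:]    # target's choices there
    # 3) Accept the longest agreeing prefix, plus one target token so the
    #    sequence always grows.
    agree = (proposed == expected).long().cumprod(dim=-1)
    n_accept = int(agree.sum().item())
    return torch.cat([prefix, expected[:, : n_accept + 1]], dim=-1)
```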