Verifiable Fine-Tuning for LLMs: Zero-Knowledge Training Proofs Bound to Data Provenance and Policy
- URL: http://arxiv.org/abs/2510.16830v2
- Date: Mon, 10 Nov 2025 08:02:49 GMT
- Authors: Hasan Akgul, Daniel Borg, Arta Berisha, Amina Rahimova, Andrej Novak, Mila Petrov
- Abstract summary: We present Verifiable Fine Tuning, a protocol and system that produces succinct zero knowledge proofs. We show that the system composes with probabilistic audits and bandwidth constraints. Results indicate that the system is feasible today for real parameter efficient pipelines.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models are often adapted through parameter efficient fine tuning, but current release practices provide weak assurances about what data were used and how updates were computed. We present Verifiable Fine Tuning, a protocol and system that produces succinct zero knowledge proofs that a released model was obtained from a public initialization under a declared training program and an auditable dataset commitment. The approach combines five elements. First, commitments that bind data sources, preprocessing, licenses, and per epoch quota counters to a manifest. Second, a verifiable sampler that supports public replayable and private index hiding batch selection. Third, update circuits restricted to parameter efficient fine tuning that enforce AdamW style optimizer semantics and proof friendly approximations with explicit error budgets. Fourth, recursive aggregation that folds per step proofs into per epoch and end to end certificates with millisecond verification. Fifth, provenance binding and optional trusted execution property cards that attest code identity and constants. On English and bilingual instruction mixtures, the method maintains utility within tight budgets while achieving practical proof performance. Policy quotas are enforced with zero violations, and private sampling windows show no measurable index leakage. Federated experiments demonstrate that the system composes with probabilistic audits and bandwidth constraints. These results indicate that end to end verifiable fine tuning is feasible today for real parameter efficient pipelines, closing a critical trust gap for regulated and decentralized deployments.
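As a rough illustration of the first two elements named in the abstract (a manifest commitment with per-epoch quota counters, and a publicly replayable sampler), the sketch below hashes a canonical JSON manifest and derives batch indices from that commitment plus a public seed. Everything here is an assumption made for illustration: the function names (`commit_manifest`, `replayable_batch`, `within_quota`), the manifest fields, and the quota rule are hypothetical, SHA-256 stands in for whatever commitment scheme the authors actually use, and no zero-knowledge proof is produced.

```python
from __future__ import annotations

import hashlib
import json
from collections import Counter


def commit_manifest(manifest: dict) -> str:
    """Hash a canonical JSON encoding of the manifest (stand-in commitment)."""
    encoded = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def replayable_batch(commitment: str, seed: bytes, step: int,
                     n_examples: int, batch_size: int) -> list[int]:
    """Derive batch indices deterministically so any verifier can replay them.

    Sampling is with replacement; the modulo reduction has a negligible bias
    that a real design would need to account for.
    """
    indices = []
    for counter in range(batch_size):
        material = (commitment.encode("utf-8") + seed
                    + step.to_bytes(8, "big") + counter.to_bytes(8, "big"))
        digest = hashlib.sha256(material).digest()
        indices.append(int.from_bytes(digest[:8], "big") % n_examples)
    return indices


def within_quota(batches: list[list[int]], source_of: list[str],
                 quotas: dict[str, int]) -> bool:
    """Check that no source is drawn more often than its per-epoch quota."""
    used = Counter(source_of[i] for batch in batches for i in batch)
    return all(count <= quotas.get(src, 0) for src, count in used.items())


# Hypothetical manifest: field names are illustrative, not from the paper.
manifest = {
    "sources": [
        {"name": "corpus_a", "license": "CC-BY-4.0", "quota_per_epoch": 6},
        {"name": "corpus_b", "license": "MIT", "quota_per_epoch": 6},
    ],
    "preprocessing": "lowercase+dedup",
}
commitment = commit_manifest(manifest)
batch = replayable_batch(commitment, seed=b"public-seed", step=0,
                         n_examples=8, batch_size=4)
```

Because the indices depend only on the commitment, the seed, and the step number, anyone holding the public values can recompute `batch` and compare it with what the trainer claims to have used. The private, index-hiding mode described in the abstract would need a different construction that this sketch does not attempt.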
Related papers
- IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation [49.796717294455796]
We present IMMACULATE, a practical auditing framework that detects economically motivated deviations. IMMACULATE selectively audits a small fraction of requests using verifiable computation, achieving strong detection guarantees while amortizing cryptographic overhead.
arXiv Detail & Related papers (2026-02-26T07:21:02Z)
- Secure Tool Manifest and Digital Signing Solution for Verifiable MCP and LLM Pipelines [5.979408039210097]
Large Language Models (LLMs) are increasingly adopted in sensitive domains such as healthcare and financial institutions' data analytics. Existing control mechanisms, such as the Model Context Protocol (MCP), define compliance policies for tool invocation but lack verifiable enforcement and transparent validation of model actions. We propose a novel Secure Tool Manifest and Digital Signing Framework, a structured and security-aware extension of Model Context Protocols.
arXiv Detail & Related papers (2026-01-30T16:22:21Z)
- Refinement Provenance Inference: Detecting LLM-Refined Training Prompts from Model Behavior [58.751981587234916]
This paper formalizes the Refinement Provenance Inference (RPI) audit task. We propose RePro, a logit-based framework that fuses teacher-forced likelihood features with logit-ranking signals. During training, RePro learns a transferable representation via shadow fine-tuning, and uses a lightweight linear head to infer provenance on unseen victims without training-data access.
arXiv Detail & Related papers (2026-01-05T10:16:41Z)
- Memory in Large Language Models: Mechanisms, Evaluation and Evolution [8.158439933515131]
We propose a four-part taxonomy (parametric, contextual, external, procedural/episodic) and a memory quadruple (location, persistence, write/access path, controllability). For updating and forgetting, we present DMM Gov: coordinating DAPT/TAPT, PEFT, model editing (ROME, MEND, MEMIT, SERAC), and RAG to form an auditable loop. This yields a reproducible, comparable, and governable coordinate system for research and deployment.
arXiv Detail & Related papers (2025-09-23T10:06:58Z)
- Probing Pre-trained Language Models on Code Changes: Insights from ReDef, a High-Confidence Just-in-Time Defect Prediction Dataset [0.0]
We present ReDef, a high-confidence benchmark of function-level modifications curated from 22 large-scale C/C++ projects. Defective cases are anchored by revert commits, while clean cases are validated through post-hoc history checks. This pipeline yields 3,164 defective and 10,268 clean modifications, offering substantially more reliable labels than existing resources.
arXiv Detail & Related papers (2025-09-11T07:07:11Z)
- Proof-Carrying Numbers (PCN): A Protocol for Trustworthy Numeric Answers from LLMs via Claim Verification [0.0]
We propose Proof-Carrying Numbers (PCN), a presentation-layer protocol that enforces numeric fidelity through mechanical verification. PCN is lightweight and model-agnostic, integrates seamlessly into existing applications, and can be extended with cryptographic commitments.
arXiv Detail & Related papers (2025-09-08T17:20:16Z)
- Think Before You Accept: Semantic Reflective Verification for Faster Speculative Decoding [48.52389201779425]
Speculative decoding accelerates inference by generating multiple draft tokens using a lightweight model and verifying them in parallel. Existing verification methods rely heavily on distributional consistency while overlooking semantic correctness. We propose Reflective Verification, a training-free and semantics-aware approach that achieves a better trade-off between correctness and efficiency.
arXiv Detail & Related papers (2025-05-24T10:26:27Z)
- Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [71.7892165868749]
Commercial Large Language Model (LLM) APIs create a fundamental trust problem. Users pay for specific models but have no guarantee that providers deliver them faithfully. We formalize this model substitution problem and evaluate detection methods under realistic adversarial conditions. We propose and evaluate the use of Trusted Execution Environments (TEEs) as one practical and robust solution.
arXiv Detail & Related papers (2025-04-07T03:57:41Z)
- TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference [7.103455333148043]
Large language models (LLMs) have proven to be very capable, but access to frontier models currently relies on inference providers. We propose TOPLOC, a novel method for verifiable inference that addresses this problem.
arXiv Detail & Related papers (2025-01-27T12:46:45Z)
- OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs [59.836774258359945]
OpenFactCheck is a framework for building customized automatic fact-checking systems. It allows users to easily customize an automatic fact-checker and verify the factual correctness of documents and claims. CheckerEVAL is a solution for gauging the reliability of automatic fact-checkers' verification results using human-annotated datasets.
arXiv Detail & Related papers (2024-05-09T07:15:19Z)
- Transferable and Efficient Non-Factual Content Detection via Probe Training with Offline Consistency Checking [48.68044413117397]
PINOSE trains a probing model on offline self-consistency checking results, thereby circumventing the need for human-annotated data.
It examines various aspects of internal states prior to response decoding, contributing to more effective detection of factual inaccuracies.
arXiv Detail & Related papers (2024-04-10T05:00:35Z)
- Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data [48.409306245463]
We develop models that quote verbatim statements from trusted sources in their pre-training data. The core of Quote-Tuning is a fast membership inference function that efficiently verifies text against trusted corpora. Experiments show that Quote-Tuning significantly increases verbatim quotes from high-quality documents by up to 130% relative to base models.
arXiv Detail & Related papers (2024-04-05T02:27:09Z)
- Auditing Fairness by Betting [43.515287900510934]
We provide practical, efficient, and nonparametric methods for auditing the fairness of deployed classification and regression models. Our methods are sequential and allow for the continuous monitoring of incoming data. We demonstrate the efficacy of our approach on three benchmark fairness datasets.
arXiv Detail & Related papers (2023-05-27T20:14:11Z)
- Sequential Kernelized Independence Testing [77.237958592189]
We design sequential kernelized independence tests inspired by kernelized dependence measures. We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.