Operator: A Protocol for Trustless Delegation Under Uncertainty
- URL: http://arxiv.org/abs/2507.00631v7
- Date: Mon, 04 Aug 2025 08:11:39 GMT
- Title: Operator: A Protocol for Trustless Delegation Under Uncertainty
- Authors: David Shi, Kevin Joo
- Abstract summary: We propose a protocol that enforces correctness through collateralized claims in a verification game. Tasks are published as intents, and solvers compete to fulfill them. Any challenger can challenge a result by staking against it to trigger the verification process. Incorrect agents are slashed and correct opposition is rewarded, with an escalation path that penalizes erroneous verifiers themselves.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Correctness is an emergent property of systems where exposing error is cheaper than committing it. In dynamic, low-trust environments, autonomous AI agents benefit from delegating work to sub-agents, yet correctness cannot be assured through upfront specification or centralized oversight. We propose a protocol that enforces correctness through collateralized claims in a recursive verification game. Tasks are published as intents, and solvers compete to fulfill them. Selected solvers carry out tasks under risk, with correctness checked post hoc by verifiers. Any challenger can challenge a result by staking against it to trigger the verification process. Incorrect agents are slashed and correct opposition is rewarded, with an escalation path that penalizes erroneous verifiers themselves. When incentives are aligned across solvers, challengers, and verifiers, falsification conditions make correctness the Nash equilibrium.
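The stake, slash, and escalation flow described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the paper's mechanism: the `Dispute` and `settle` names, the rule that the loser's full stake goes to the winner, and the rule that the last verdict in the escalation chain is authoritative are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Dispute:
    """Hypothetical record of one challenged result (all fields are assumptions)."""
    solver_stake: float
    challenger_stake: float
    verifier_stake: float
    # verdicts[i] is True if the level-i verifier ruled the solver's result correct.
    # Each later entry is an escalation that reviews the previous verdict.
    verdicts: list[bool] = field(default_factory=list)


def settle(d: Dispute) -> dict[str, float]:
    """Return the stake gained (+) or lost (-) by each party.

    Assumed rules: the last verdict is final, the losing side's stake is
    transferred in full to the winning side, and any earlier verifier whose
    verdict disagrees with the final one loses its own stake.
    """
    if not d.verdicts:
        # No challenge was raised, so no stake changes hands.
        return {"solver": 0.0, "challenger": 0.0}

    final = d.verdicts[-1]
    if final:
        # Result upheld: the challenger is slashed and the solver is rewarded.
        payoffs = {"solver": d.challenger_stake, "challenger": -d.challenger_stake}
    else:
        # Result rejected: the solver is slashed and the challenger is rewarded.
        payoffs = {"solver": -d.solver_stake, "challenger": d.solver_stake}

    # Escalation path: penalize earlier verifiers that were overturned.
    for level, verdict in enumerate(d.verdicts[:-1]):
        payoffs[f"verifier_{level}"] = -d.verifier_stake if verdict != final else 0.0
    return payoffs


# A first-level verifier wrongly upholds the result; escalation overturns it.
example = settle(Dispute(solver_stake=10.0, challenger_stake=4.0,
                         verifier_stake=2.0, verdicts=[True, False]))
```

Under these assumed rules, a solver that submits an incorrect result risks its whole stake as soon as one challenger is willing to post a smaller one, which is one way to read the abstract's claim that exposing error should be cheaper than committing it.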
Related papers
- Core Safety Values for Provably Corrigible Agents [2.6451153531057985]
We introduce the first implementable framework for corrigibility, with provable guarantees in multi-step, partially observed environments. Our framework replaces a single reward with five *structurally separate* utility heads. For open-ended settings where adversaries can modify the agent, we prove that deciding whether an arbitrary post-hack agent will ever violate corrigibility is undecidable.
arXiv Detail & Related papers (2025-07-28T16:19:25Z)
- Search-Based Correction of Reasoning Chains for Language Models [72.61861891295302]
Chain-of-Thought (CoT) reasoning has advanced the capabilities and transparency of language models (LMs). We introduce a new self-correction framework that augments each reasoning step in a CoT with a latent variable indicating its veracity. We also introduce Search Corrector, a discrete search algorithm over boolean-valued veracity assignments.
arXiv Detail & Related papers (2025-05-17T04:16:36Z)
- Preemptive Detection and Correction of Misaligned Actions in LLM Agents [70.54226917774933]
InferAct is a novel approach to detect misaligned actions before execution. It alerts users for timely correction, preventing adverse outcomes. InferAct achieves up to 20% improvements on Macro-F1 against baselines in misaligned action detection.
arXiv Detail & Related papers (2024-07-16T15:24:44Z)
- SERENE: A Collusion Resilient Replication-based Verification Framework [0.4297070083645048]
Collusion detection and mitigation solutions often require the use of a trusted third party server or verified tasks.
We propose SERENE, a collusion resilient replication-based verification framework that detects and mitigates colluding workers.
We implement SERENE and compare its performance to Staab et al., resulting in average accuracy improvements of 50% in detection and 60% in mitigation.
arXiv Detail & Related papers (2024-04-17T14:11:31Z)
- Lyra: Orchestrating Dual Correction in Automated Theorem Proving [63.115422781158934]
Lyra is a new framework that employs two distinct correction mechanisms: Tool Correction and Conjecture Correction.
Tool Correction contributes to mitigating hallucinations, thereby improving the overall accuracy of the proof.
Conjecture Correction refines generation with instruction but does not collect paired (generation, error & refinement) prompts.
arXiv Detail & Related papers (2023-09-27T17:29:41Z)
- Formally Verifying a Real World Smart Contract [52.30656867727018]
We search for a tool capable of formally verifying a real-world smart contract written in a recent version of Solidity.
arXiv Detail & Related papers (2023-07-05T14:30:21Z)
- Formalizing the Problem of Side Effect Regularization [81.97441214404247]
We propose a formal criterion for side effect regularization via the assistance game framework.
In these games, the agent solves a partially observable Markov decision process.
We show that this POMDP is solved by trading off the proxy reward with the agent's ability to achieve a range of future tasks.
arXiv Detail & Related papers (2022-06-23T16:36:13Z)
- Learning to Give Checkable Answers with Prover-Verifier Games [23.93694563816463]
We introduce Prover-Verifier Games (PVGs), a game-theoretic framework to encourage learning agents to solve decision problems in a verifiable manner.
We analyze variants of the framework, including simultaneous and sequential games, and narrow the space down to a subset of games which provably have the desired equilibria.
We develop instantiations of the PVG for two algorithmic tasks, and show that in practice, the verifier learns a robust decision rule that is able to receive useful and reliable information from an untrusted prover.
arXiv Detail & Related papers (2021-08-27T02:56:06Z)
- Claim Check-Worthiness Detection as Positive Unlabelled Learning [53.24606510691877]
Claim check-worthiness detection is a critical component of fact checking systems.
We illuminate a central challenge in claim check-worthiness detection underlying all of these tasks.
Our best performing method is a unified approach which automatically corrects for this using a variant of positive unlabelled learning.
arXiv Detail & Related papers (2020-03-05T16:06:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.