Inter-Agent Trust Models: A Comparative Study of Brief, Claim, Proof, Stake, Reputation and Constraint in Agentic Web Protocol Design-A2A, AP2, ERC-8004, and Beyond
- URL: http://arxiv.org/abs/2511.03434v1
- Date: Wed, 05 Nov 2025 12:50:06 GMT
- Title: Inter-Agent Trust Models: A Comparative Study of Brief, Claim, Proof, Stake, Reputation and Constraint in Agentic Web Protocol Design-A2A, AP2, ERC-8004, and Beyond
- Authors: Botao 'Amber' Hu, Helena Rong,
- Abstract summary: We study trust models in inter-agent protocol design.<n>We analyze assumptions, attack surfaces, and design trade-offs.<n>We distill actionable design guidelines for safer, interoperable, and scalable agent economies.
- Score: 1.5755923640031846
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As the "agentic web" takes shape-billions of AI agents (often LLM-powered) autonomously transacting and collaborating-trust shifts from human oversight to protocol design. In 2025, several inter-agent protocols crystallized this shift, including Google's Agent-to-Agent (A2A), Agent Payments Protocol (AP2), and Ethereum's ERC-8004 "Trustless Agents," yet their underlying trust assumptions remain under-examined. This paper presents a comparative study of trust models in inter-agent protocol design: Brief (self- or third-party verifiable claims), Claim (self-proclaimed capabilities and identity, e.g. AgentCard), Proof (cryptographic verification, including zero-knowledge proofs and trusted execution environment attestations), Stake (bonded collateral with slashing and insurance), Reputation (crowd feedback and graph-based trust signals), and Constraint (sandboxing and capability bounding). For each, we analyze assumptions, attack surfaces, and design trade-offs, with particular emphasis on LLM-specific fragilities-prompt injection, sycophancy/nudge-susceptibility, hallucination, deception, and misalignment-that render purely reputational or claim-only approaches brittle. Our findings indicate no single mechanism suffices. We argue for trustless-by-default architectures anchored in Proof and Stake to gate high-impact actions, augmented by Brief for identity and discovery and Reputation overlays for flexibility and social signals. We comparatively evaluate A2A, AP2, ERC-8004 and related historical variations in academic research under metrics spanning security, privacy, latency/cost, and social robustness (Sybil/collusion/whitewashing resistance). We conclude with hybrid trust model recommendations that mitigate reputation gaming and misinformed LLM behavior, and we distill actionable design guidelines for safer, interoperable, and scalable agent economies.
Related papers
- AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security [126.49733412191416]
Current guardrail models lack agentic risk awareness and transparency in risk diagnosis.<n>We propose a unified three-dimensional taxonomy that categorizes agentic risks by their source (where), failure mode (how), and consequence (what)<n>We introduce a new fine-grained agentic safety benchmark (ATBench) and a Diagnostic Guardrail framework for agent safety and security (AgentDoG)
arXiv Detail & Related papers (2026-01-26T13:45:41Z) - The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability [7.452202188661883]
We introduce and empirically validate the Trust-Vulnerability Paradox (TVP)<n>TVP: increasing inter-agent trust to enhance coordination simultaneously expands risks of over-exposure and over-authorization.<n>This study formalizes TVP, establishes reproducible baselines with unified metrics, and demonstrates that trust must be modeled and scheduled as a first-class security variable in multi-agent system design.
arXiv Detail & Related papers (2025-10-21T12:18:39Z) - Context Lineage Assurance for Non-Human Identities in Critical Multi-Agent Systems [0.08316523707191924]
We introduce a cryptographically grounded mechanism for lineage verification, anchored in append-only Merkle tree structures.<n>Unlike traditional A2A models that primarily secure point-to-point interactions, our approach enables both agents and external verifiers to cryptographically validate multi-hop provenance.<n>In parallel, we augment the A2A agent card to incorporate explicit identity verification primitives, enabling both peer agents and human approvers to authenticate the legitimacy of NHI representations.
arXiv Detail & Related papers (2025-09-22T20:59:51Z) - The Sum Leaks More Than Its Parts: Compositional Privacy Risks and Mitigations in Multi-Agent Collaboration [72.33801123508145]
Large language models (LLMs) are integral to multi-agent systems.<n>Privacy risks emerge that extend beyond memorization, direct inference, or single-turn evaluations.<n>In particular, seemingly innocuous responses, when composed across interactions, can cumulatively enable adversaries to recover sensitive information.
arXiv Detail & Related papers (2025-09-16T16:57:25Z) - VulAgent: Hypothesis-Validation based Multi-Agent Vulnerability Detection [55.957275374847484]
VulAgent is a multi-agent vulnerability detection framework based on hypothesis validation.<n>It implements a semantics-sensitive, multi-view detection pipeline, each aligned to a specific analysis perspective.<n>On average, VulAgent improves overall accuracy by 6.6%, increases the correct identification rate of vulnerable--fixed code pairs by up to 450%, and reduces the false positive rate by about 36%.
arXiv Detail & Related papers (2025-09-15T02:25:38Z) - BlockA2A: Towards Secure and Verifiable Agent-to-Agent Interoperability [8.539128225018489]
BlockA2A is a unified multi-agent trust framework for agent-to-agent interoperability.<n>It eliminates centralized trust bottlenecks, ensures message authenticity and execution integrity, and guarantees accountability across agent interactions.<n>It neutralizes attacks through real-time mechanisms, including Byzantine agent flagging, reactive execution halting, and instant permission revocation.
arXiv Detail & Related papers (2025-08-02T11:59:21Z) - Quantum-Safe Identity Verification using Relativistic Zero-Knowledge Proof Systems [3.8435472626703473]
Identity verification is essential in sectors like finance, healthcare, and online services to ensure security and prevent fraud.<n>Current password/PIN-based identity solutions are susceptible to phishing or skimming attacks.<n>We explore identity verification through graph coloring-based relativistic zero-knowledge proofs.
arXiv Detail & Related papers (2025-07-18T18:59:19Z) - Improving Google A2A Protocol: Protecting Sensitive Data and Mitigating Unintended Harms in Multi-Agent Systems [3.398964351541323]
Googles A2A protocol provides a secure communication framework for AI agents.<n>We identify key weaknesses of A2A: insufficient token lifetime control, lack of strong customer authentication, overbroad access scopes, and missing consent flows.<n>We propose protocol-level enhancements grounded in a structured threat model for semi-trusted multi-agent systems.
arXiv Detail & Related papers (2025-05-18T16:25:21Z) - Do LLMs trust AI regulation? Emerging behaviour of game-theoretic LLM agents [61.132523071109354]
This paper investigates the interplay between AI developers, regulators and users, modelling their strategic choices under different regulatory scenarios.<n>Our research identifies emerging behaviours of strategic AI agents, which tend to adopt more "pessimistic" stances than pure game-theoretic agents.
arXiv Detail & Related papers (2025-04-11T15:41:21Z) - Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena.<n>We find that we can successfully break latest agents that use black-box frontier LMs, including those that perform reflection and tree search.<n>We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z) - Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.