Verifiable Semantics for Agent-to-Agent Communication
- URL: http://arxiv.org/abs/2602.16424v1
- Date: Wed, 18 Feb 2026 12:55:58 GMT
- Title: Verifiable Semantics for Agent-to-Agent Communication
- Authors: Philipp Schoenegger, Matt Carlson, Chris Schneider, Chris Daly,
- Abstract summary: Multiagent AI systems require consistent communication. Natural language is interpretable but vulnerable to semantic drift. We propose a certification protocol based on the stimulus-meaning model.
- Score: 0.2866560512724962
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multiagent AI systems require consistent communication, but we lack methods to verify that agents share the same understanding of the terms used. Natural language is interpretable but vulnerable to semantic drift, while learned protocols are efficient but opaque. We propose a certification protocol based on the stimulus-meaning model, where agents are tested on shared observable events and terms are certified if empirical disagreement falls below a statistical threshold. In this protocol, agents restricting their reasoning to certified terms ("core-guarded reasoning") achieve provably bounded disagreement. We also outline mechanisms for detecting drift (recertification) and recovering shared vocabulary (renegotiation). In simulations with varying degrees of semantic divergence, core-guarding reduces disagreement by 72-96%. In a validation with fine-tuned language models, disagreement is reduced by 51%. Our framework provides a first step towards verifiable agent-to-agent communication.
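The certification step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical test the authors use, so the one-sided Hoeffding bound below is an assumption, as are the names `certify_terms` and `core_guard`, the thresholds `epsilon` and `delta`, and the toy agents. Each agent is modeled as a function that assents to or dissents from a term on a shared observable event; a term is certified only if an upper confidence bound on the disagreement rate stays below `epsilon`.

```python
import math
from typing import Callable, List, Sequence, Set

# Hypothetical agent interface: (term, event) -> does the agent assent?
Agent = Callable[[str, int], bool]


def disagreement_upper_bound(disagreements: int, n: int, delta: float) -> float:
    """One-sided Hoeffding upper confidence bound on the true disagreement rate.

    Assumption: events are sampled independently; the paper may use another test.
    """
    return disagreements / n + math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def certify_terms(
    terms: Sequence[str],
    events: Sequence[int],
    agent_a: Agent,
    agent_b: Agent,
    epsilon: float = 0.1,
    delta: float = 0.05,
) -> Set[str]:
    """Return the terms whose disagreement bound is at most epsilon."""
    core: Set[str] = set()
    for term in terms:
        disagreements = sum(agent_a(term, e) != agent_b(term, e) for e in events)
        if disagreement_upper_bound(disagreements, len(events), delta) <= epsilon:
            core.add(term)
    return core


def core_guard(message_terms: Sequence[str], core: Set[str]) -> List[str]:
    """Keep only certified terms, in their original order."""
    return [t for t in message_terms if t in core]


# Toy agents: they nearly agree on "large" but diverge on "hot".
def agent_a(term: str, event: int) -> bool:
    return event >= 500 if term == "large" else event % 2 == 0


def agent_b(term: str, event: int) -> bool:
    return event >= 505 if term == "large" else event % 3 == 0


events = list(range(1000))
core = certify_terms(["large", "hot"], events, agent_a, agent_b)
guarded = core_guard(["large", "hot"], core)
print(core, guarded)
```

In this toy setup the agents disagree on "large" for only 5 of 1000 events, so the term is certified, while "hot" draws disagreement on about half of the events and is excluded from the core. Recertification, in this reading, would amount to rerunning `certify_terms` on fresh events.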
Related papers
- Verifier-Bound Communication for LLM Agents: Certified Bounds on Covert Signaling [0.0]
Colluding language-model agents can hide coordination in messages that remain policy-compliant at the surface level. We present CLBC, a protocol where generation and admission are separated. We show how this protocol yields an upper bound on transcript leakage in terms of latent leakage plus explicit residual channels.
arXiv Detail & Related papers (2026-02-27T23:42:37Z)
- Preventing the Collapse of Peer Review Requires Verification-First AI [49.995126139461085]
We propose truth-coupling, i.e. how tightly venue scores track latent scientific truth. We formalize two forces that drive a phase transition toward proxy-sovereign evaluation.
arXiv Detail & Related papers (2026-01-23T17:17:32Z)
- Gaming the Judge: Unfaithful Chain-of-Thought Can Undermine Agent Evaluation [76.5533899503582]
Large language models (LLMs) are increasingly used as judges to evaluate agent performance. We show this paradigm implicitly assumes that the agent's chain-of-thought (CoT) reasoning faithfully reflects both its internal reasoning and the underlying environment state. We demonstrate that manipulated reasoning alone can inflate false positive rates of state-of-the-art VLM judges by up to 90% across 800 trajectories spanning diverse web tasks.
arXiv Detail & Related papers (2026-01-21T06:07:43Z)
- VulAgent: Hypothesis-Validation based Multi-Agent Vulnerability Detection [55.957275374847484]
VulAgent is a multi-agent vulnerability detection framework based on hypothesis validation. It implements a semantics-sensitive, multi-view detection pipeline, each aligned to a specific analysis perspective. On average, VulAgent improves overall accuracy by 6.6%, increases the correct identification rate of vulnerable-fixed code pairs by up to 450%, and reduces the false positive rate by about 36%.
arXiv Detail & Related papers (2025-09-15T02:25:38Z)
- Large Language Models for Validating Network Protocol Parsers [8.007994733372675]
Protocol standards are typically written in natural language, whereas implementations are in source code. We propose PARVAL, a framework built on large language models (LLMs). It transforms both protocol standards and their implementations into a unified intermediate representation, referred to as format specifications. It successfully identifies inconsistencies between the implementation and its RFC standard, achieving a low false positive rate of 5.6%.
arXiv Detail & Related papers (2025-04-18T07:09:56Z)
- VerifiAgent: a Unified Verification Agent in Language Model Reasoning [10.227089771963943]
We propose a unified verification agent that integrates two levels of verification: meta-verification and tool-based adaptive verification. VerifiAgent autonomously selects appropriate verification tools based on the reasoning type. It can be effectively applied to inference scaling, achieving better results with fewer generated samples and costs.
arXiv Detail & Related papers (2025-04-01T04:05:03Z)
- Detecting Backdoor Attacks via Similarity in Semantic Communication Systems [3.565151496245487]
This work proposes a defense mechanism that leverages semantic similarity to detect backdoor attacks. By analyzing deviations in semantic feature space and establishing a threshold-based detection framework, the proposed approach effectively identifies poisoned samples.
arXiv Detail & Related papers (2025-02-06T02:22:36Z)
- Cognitive Semantic Communication Systems Driven by Knowledge Graph: Principle, Implementation, and Performance Evaluation [74.38561925376996]
Two cognitive semantic communication frameworks are proposed for the single-user and multiple-user communication scenarios. An effective semantic correction algorithm is proposed by mining the inference rule from the knowledge graph. For the multi-user cognitive semantic communication system, a message recovery algorithm is proposed to distinguish messages of different users.
arXiv Detail & Related papers (2023-03-15T12:01:43Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals. It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation. It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- Emergence of Pragmatics from Referential Game between Theory of Mind Agents [64.25696237463397]
We propose an algorithm, using which agents can spontaneously learn the ability to "read between lines" without any explicit hand-designed rules. We integrate the theory of mind (ToM) in a cooperative multi-agent pedagogical situation and propose an adaptive reinforcement learning (RL) algorithm to develop a communication protocol.
arXiv Detail & Related papers (2020-01-21T19:37:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.