PCN-Rec: Agentic Proof-Carrying Negotiation for Reliable Governance-Constrained Recommendation
- URL: http://arxiv.org/abs/2601.09771v1
- Date: Wed, 14 Jan 2026 15:00:00 GMT
- Title: PCN-Rec: Agentic Proof-Carrying Negotiation for Reliable Governance-Constrained Recommendation
- Authors: Aradhya Dixit, Shreem Dixit
- Abstract summary: PCN-Rec is a proof-carrying negotiation pipeline that separates natural-language reasoning from deterministic enforcement. On MovieLens-100K with governance constraints, PCN-Rec achieves a 98.55% pass rate on feasible users.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern LLM-based recommenders can generate compelling ranked lists, but they struggle to reliably satisfy governance constraints such as minimum long-tail exposure or diversity requirements. We present PCN-Rec, a proof-carrying negotiation pipeline that separates natural-language reasoning from deterministic enforcement. A base recommender (MF/CF) produces a candidate window of size W, which is negotiated by two agents: a User Advocate optimizing relevance and a Policy Agent enforcing constraints. A mediator LLM synthesizes a top-N slate together with a structured certificate (JSON) describing the claimed constraint satisfaction. A deterministic verifier recomputes all constraints from the slate and accepts only verifier-checked certificates; if verification fails, a deterministic constrained-greedy repair produces a compliant slate for re-verification, yielding an auditable trace. On MovieLens-100K with governance constraints, PCN-Rec achieves a 98.55% pass rate on feasible users (n = 551, W = 80) versus a one-shot single-LLM baseline without verification/repair, while preserving utility with only a 0.021 absolute drop in NDCG@10 (0.403 vs. 0.424); differences are statistically significant (p < 0.05).
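The verify-then-repair loop described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the single minimum-long-tail constraint, the certificate field `longtail_count`, and all function names are assumptions, and PCN-Rec's actual constraint set and certificate schema may differ.

```python
def verify(slate, longtail_ids, min_longtail, certificate):
    """Deterministic verifier: recompute the constraint from the slate
    itself and accept only if it holds AND the LLM-claimed certificate
    matches the recomputed value."""
    actual = sum(1 for item in slate if item in longtail_ids)
    return actual >= min_longtail and certificate.get("longtail_count") == actual

def greedy_repair(slate, candidates, longtail_ids, min_longtail, score):
    """Deterministic constrained-greedy repair: swap the lowest-scoring
    non-long-tail items in the slate for the highest-scoring unused
    long-tail candidates until the constraint is met."""
    slate = list(slate)
    pool = sorted((c for c in candidates
                   if c in longtail_ids and c not in slate),
                  key=score, reverse=True)
    deficit = min_longtail - sum(1 for i in slate if i in longtail_ids)
    # Replace worst-scoring non-long-tail items first.
    victims = sorted((i for i in slate if i not in longtail_ids), key=score)
    for victim, replacement in zip(victims[:max(deficit, 0)], pool):
        slate[slate.index(victim)] = replacement
    return slate
```

A failed verification would trigger `greedy_repair` and then a second `verify` pass, producing the auditable accept/repair trace the abstract describes.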
Related papers
- Verifier-Bound Communication for LLM Agents: Certified Bounds on Covert Signaling [0.0]
Colluding language-model agents can hide coordination in messages that remain policy-compliant at the surface level. We present CLBC, a protocol where generation and admission are separated. We show how this protocol yields an upper bound on transcript leakage in terms of latent leakage plus explicit residual channels.
arXiv Detail & Related papers (2026-02-27T23:42:37Z) - CoRefine: Confidence-Guided Self-Refinement for Adaptive Test-Time Compute [10.548368675645403]
CoRefine is a confidence-guided self-refinement method that achieves competitive accuracy using a fraction of the tokens. The controller consumes full-trace confidence to decide whether to halt, re-examine, or try a different approach. We extend this to CoRefine-Tree, a hybrid sequential-parallel variant that adaptively balances exploration and exploitation.
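The halt/re-examine/retry decision the summary describes can be sketched as a simple threshold policy. The thresholds, labels, and function name are illustrative assumptions, not CoRefine's actual controller interface.

```python
def refine_controller(confidence, high=0.9, low=0.5):
    """Confidence-guided control sketch: halt on high full-trace
    confidence, re-examine the current trace on middling confidence,
    and restart with a different approach on low confidence."""
    if confidence >= high:
        return "halt"
    if confidence >= low:
        return "re-examine"
    return "retry-different-approach"
```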
arXiv Detail & Related papers (2026-02-09T17:44:41Z) - Replayable Financial Agents: A Determinism-Faithfulness Assurance Harness for Tool-Using LLM Agents [0.7699235580548228]
LLM agents struggle with regulatory audit replay: when asked to reproduce a flagged transaction decision with identical inputs, most deployments fail to return consistent results. This paper introduces the Determinism-Faithfulness Assurance Harness (DFAH), a framework for measuring trajectory determinism and evidence-conditioned faithfulness in tool-using agents deployed in financial services.
arXiv Detail & Related papers (2026-01-17T19:47:55Z) - Secure, Verifiable, and Scalable Multi-Client Data Sharing via Consensus-Based Privacy-Preserving Data Distribution [0.0]
CPPDD is an autonomous protocol for secure multi-client data aggregation. It enforces unanimous-release confidentiality through a dual-layer protection mechanism. It achieves 100% malicious deviation detection, exact data recovery, and three-to-four orders of magnitude lower FLOPs compared to MPC and HE baselines.
arXiv Detail & Related papers (2026-01-01T18:12:50Z) - LEC: Linear Expectation Constraints for False-Discovery Control in Selective Prediction and Routing Systems [95.35293543918762]
Large language models (LLMs) often generate unreliable answers, while uncertainty methods fail to fully distinguish correct from incorrect predictions. We address this issue through the lens of false discovery rate (FDR) control, ensuring that among all accepted predictions, the proportion of errors does not exceed a target risk level. We propose LEC, which reinterprets selective prediction as a constrained decision problem by enforcing a Linear Expectation Constraint.
arXiv Detail & Related papers (2025-12-01T11:27:09Z) - Verifiable Fine-Tuning for LLMs: Zero-Knowledge Training Proofs Bound to Data Provenance and Policy [0.0]
We present Verifiable Fine-Tuning, a protocol and system that produces succinct zero-knowledge proofs. We show that the system composes with probabilistic audits and bandwidth constraints. Results indicate that the system is feasible today for real parameter-efficient pipelines.
arXiv Detail & Related papers (2025-10-19T13:33:27Z) - Unsupervised Conformal Inference: Bootstrapping and Alignment to Control LLM Uncertainty [49.19257648205146]
We propose an unsupervised conformal inference framework for generation. Our gates achieve close-to-nominal coverage and provide tighter, more stable thresholds than split UCP. The result is a label-free, API-compatible gate for test-time filtering.
arXiv Detail & Related papers (2025-09-26T23:40:47Z) - COIN: Uncertainty-Guarding Selective Question Answering for Foundation Models with Provable Risk Guarantees [51.5976496056012]
COIN is an uncertainty-guarding selection framework that calibrates statistically valid thresholds to filter a single generated answer per question. COIN estimates the empirical error rate on a calibration set and applies confidence interval methods to establish a high-probability upper bound on the true error rate. We demonstrate COIN's robustness in risk control, strong test-time power in retaining admissible answers, and predictive efficiency under limited calibration data.
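The calibration step described above can be illustrated with a generic one-sided bound. COIN may well use a tighter interval (e.g. Clopper-Pearson), so the Hoeffding bound and all names here are stand-in assumptions, not the paper's method.

```python
import math

def hoeffding_upper_bound(errors, n, delta=0.05):
    """One-sided Hoeffding upper confidence bound on the true error
    rate, given `errors` mistakes among `n` calibration answers:
    p_hat + sqrt(ln(1/delta) / (2n)), holding with prob. >= 1 - delta."""
    p_hat = errors / n
    return min(1.0, p_hat + math.sqrt(math.log(1.0 / delta) / (2.0 * n)))

def admit(confidence, threshold):
    """Accept the single generated answer only if its confidence
    clears the calibrated threshold."""
    return confidence >= threshold
```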
arXiv Detail & Related papers (2025-06-25T07:04:49Z) - Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [71.7892165868749]
Commercial Large Language Model (LLM) APIs create a fundamental trust problem. Users pay for specific models but have no guarantee that providers deliver them faithfully. We formalize this model substitution problem and evaluate detection methods under realistic adversarial conditions. We propose and evaluate the use of Trusted Execution Environments (TEEs) as one practical and robust solution.
arXiv Detail & Related papers (2025-04-07T03:57:41Z) - Robust Conformal Prediction with a Single Binary Certificate [58.450154976190795]
Conformal prediction (CP) converts any model's output to prediction sets with a guarantee to cover the true label with (adjustable) high probability. We propose a robust conformal prediction method that produces smaller sets even with significantly lower MC samples.
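For background, the standard split-conformal construction the summary builds on can be sketched as follows. This is the generic textbook recipe, not the paper's robust single-certificate variant.

```python
import math

def conformal_quantile(cal_scores, alpha=0.1):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))/n empirical
    quantile of the calibration nonconformity scores."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(cal_scores)[min(k, n) - 1]

def prediction_set(scores_per_label, qhat):
    """Include every label whose nonconformity score is <= qhat;
    the resulting set covers the true label with prob. >= 1 - alpha."""
    return {label for label, s in scores_per_label.items() if s <= qhat}
```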
arXiv Detail & Related papers (2025-03-07T08:41:53Z) - LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond [135.8013388183257]
We propose a new protocol for inconsistency detection benchmark creation and implement it in a 10-domain benchmark called SummEdits.
Most LLMs struggle on SummEdits, with performance close to random chance.
The best-performing model, GPT-4, is still 8% below estimated human performance.
arXiv Detail & Related papers (2023-05-23T21:50:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.