DoubleAgents: Exploring Mechanisms of Building Trust with Proactive AI
- URL: http://arxiv.org/abs/2509.12626v1
- Date: Tue, 16 Sep 2025 03:43:13 GMT
- Title: DoubleAgents: Exploring Mechanisms of Building Trust with Proactive AI
- Authors: Tao Long, Xuanming Zhang, Sitong Wang, Zhou Yu, Lydia B Chilton
- Abstract summary: DoubleAgents is an agentic planning tool that embeds transparency and control through user intervention. A built-in respondent simulation generates realistic scenarios, allowing users to rehearse, refine policies, and calibrate their reliance.
- Score: 29.777890680647186
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Agentic workflows promise efficiency, but adoption hinges on whether people actually trust systems that act on their behalf. We present DoubleAgents, an agentic planning tool that embeds transparency and control through user intervention, value-reflecting policies, rich state visualizations, and uncertainty flagging for human coordination tasks. A built-in respondent simulation generates realistic scenarios, allowing users to rehearse, refine policies, and calibrate their reliance before live use. We evaluate DoubleAgents in a two-day lab study (n=10), two deployments (n=2), and a technical evaluation. Results show that participants initially hesitated to delegate but grew more reliant as they experienced transparency, control, and adaptive learning during simulated cases. Deployment results demonstrate DoubleAgents' real-world relevance and usefulness, showing that the effort required scaled appropriately with task complexity and contextual data. We contribute trust-by-design patterns and mechanisms for proactive AI -- consistency, controllability, and explainability -- along with simulation as a safe path to build and calibrate trust over time.
Related papers
- Toward Safe and Responsible AI Agents: A Three-Pillar Model for Transparency, Accountability, and Trustworthiness [0.0]
This paper presents a conceptual and operational framework for developing and operating safe and trustworthy AI agents. The framework is based on a Three-Pillar Model grounded in transparency, accountability, and trustworthiness.
arXiv Detail & Related papers (2026-01-09T07:27:43Z) - Verifiability-First Agents: Provable Observability and Lightweight Audit Agents for Controlling Autonomous LLM Systems [0.0]
We propose a Verifiability-First architecture that integrates run-time attestations of agent actions using cryptographic and symbolic methods. We also embed Audit Agents that continuously verify intent versus behavior using constrained reasoning. Our approach shifts the evaluation focus from how likely misalignment is to how quickly and reliably misalignment can be detected and remediated.
arXiv Detail & Related papers (2025-12-19T06:12:43Z) - What's the next frontier for Data-centric AI? Data Savvy Agents [71.76058707995398]
We argue that data-savvy capabilities should be a top priority in the design of agentic systems. We propose four key capabilities to realize this vision: proactive data acquisition, sophisticated data processing, interactive test data synthesis, and continual adaptation.
arXiv Detail & Related papers (2025-11-02T17:09:29Z) - How can we assess human-agent interactions? Case studies in software agent design [52.953425368394306]
We make two major steps towards the rigorous assessment of human-agent interactions. We propose PULSE, a framework for more efficient human-centric evaluation of agent designs. We deploy the framework on a large-scale web platform built around the open-source software agent OpenHands.
arXiv Detail & Related papers (2025-10-10T19:04:28Z) - Impatient Users Confuse AI Agents: High-fidelity Simulations of Human Traits for Testing Agents [58.00130492861884]
TraitBasis is a lightweight, model-agnostic method for systematically stress testing AI agents. TraitBasis learns directions in activation space corresponding to steerable user traits. We observe on average a 2%-30% performance degradation on $\tau$-Trait across frontier models.
arXiv Detail & Related papers (2025-10-06T05:03:57Z) - Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm [60.36837655498119]
We propose a Trajectory-based validated-by-Reproducing Agent-benchmark Complexity Evolution (TRACE) framework. This framework takes an original task from an existing benchmark and encourages agents to evolve it into a new task with higher difficulty. Experiments on the GAIA benchmark demonstrate that the TRACE framework consistently enhances task complexity while improving the reliability of correctness.
arXiv Detail & Related papers (2025-10-01T01:52:52Z) - Explainable AI for Maritime Autonomous Surface Ships (MASS): Adaptive Interfaces and Trustworthy Human-AI Collaboration [0.0]
This article synthesizes 100 studies on automation transparency for Maritime Autonomous Surface Ships. We identify where human unsafe control actions concentrate in handover and emergency loops. Design strategies for transparency at three layers: sensor/SA acquisition and fusion, HMI/eHMI presentation, and engineer-facing processes.
arXiv Detail & Related papers (2025-09-19T13:18:54Z) - Agent4FaceForgery: Multi-Agent LLM Framework for Realistic Face Forgery Detection [108.5042835056188]
This work introduces Agent4FaceForgery to address two fundamental problems: how to capture the diverse intents and iterative processes of human forgery creation, and how to model the complex, often adversarial, text-image interactions that accompany forgeries in social media.
arXiv Detail & Related papers (2025-09-16T01:05:01Z) - Co-Investigator AI: The Rise of Agentic AI for Smarter, Trustworthy AML Compliance Narratives [2.7295959384567356]
Co-Investigator AI is an agentic framework optimized to produce Suspicious Activity Reports (SARs) significantly faster and with greater accuracy than traditional methods. We demonstrate its ability to streamline SAR drafting, align narratives with regulatory expectations, and enable compliance teams to focus on higher-order analytical work.
arXiv Detail & Related papers (2025-09-10T08:16:04Z) - Get Experience from Practice: LLM Agents with Record & Replay [16.179801770737892]
This paper proposes a new paradigm called AgentRR (Agent Record & Replay), which introduces the classical record-and-replay mechanism into AI agent frameworks. We detail a multi-level experience abstraction method and a check function mechanism in AgentRR. In addition, we explore multiple application modes of AgentRR, including user-recorded task demonstration, large-small model collaboration and privacy-aware agent execution.
arXiv Detail & Related papers (2025-05-23T10:33:14Z) - SOPBench: Evaluating Language Agents at Following Standard Operating Procedures and Constraints [59.645885492637845]
SOPBench is an evaluation pipeline that transforms each service-specific SOP code program into a directed graph of executable functions and requires agents to call these functions based on natural language SOP descriptions. We evaluate 18 leading models, and results show the task is challenging even for top-tier models.
arXiv Detail & Related papers (2025-03-11T17:53:02Z) - Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating proprietary and open-weight models on their performance.
arXiv Detail & Related papers (2025-02-18T17:12:26Z) - SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework.
Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations.
We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z) - How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation [46.42384207122049]
We design SimulateBench to evaluate the believability of large language models (LLMs) when simulating human behaviors.
Based on SimulateBench, we evaluate the performances of 10 widely used LLMs when simulating characters.
arXiv Detail & Related papers (2023-12-28T16:51:11Z) - Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z) - On Assessing The Safety of Reinforcement Learning algorithms Using Formal Methods [6.2822673562306655]
Safety mechanisms such as adversarial training, adversarial detection, and robust learning are not always adapted to all disturbances in which the agent is deployed.
It is therefore necessary to propose new solutions adapted to the learning challenges faced by the agent.
We use reward shaping and a modified Q-learning algorithm as defense mechanisms to improve the agent's policy when facing adversarial perturbations.
arXiv Detail & Related papers (2021-11-08T23:08:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.