FuguReport

MosaicLeaks: Privacy Risks in Querying-in-the-Open for Deep Research Agents

Authors Alexander Gurung, Spandana Gella, Alexandre Drouin, Issam H. Laradji, Perouz Taslakian, Rafael Pardinas
Affiliations ServiceNow / The University of Edinburgh / Mila – Quebec AI Institute / McGill University / The University of British Columbia
Categories Evaluation / Privacy Risk Evaluation / Benchmarking privacy leakage, Application / Research Assistant / Deep research agent privacy, Security-Privacy / Data Leakage / Enterprise document leak risk
License CC BY 4.0

Abstract Overview

This paper studies privacy leakage in deep research agents that combine private enterprise documents with external web search. It introduces MosaicLeaks, a benchmark of 1,001 multi-hop tasks that force agents to alternate between local and public information sources, making external queries depend on private context. The authors evaluate leakage by giving an adversary model access only to the agent’s web queries and testing whether it can infer research intent, answer private questions, or generate verifiable claims about enterprise documents. Across multiple models, they find that leakage is common, simple privacy prompting only partially helps, and optimizing only for task performance can increase leakage.
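The query-only threat model described above can be illustrated with a small scoring harness. This is a minimal sketch, not the authors' evaluation code: the `Trace` type, the `Adversary` interface, the exact-match scoring rule, and the regex-based `toy_adversary` are all assumptions made for illustration. The paper uses an adversary model, and the summary does not describe how its outputs are scored. Only the answer-leakage level is sketched here.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Trace:
    """One agent run, split into what is observable and what is private."""
    web_queries: List[str]   # visible to the external search provider
    private_answer: str      # ground truth held in enterprise documents


# Hypothetical adversary interface: it receives the web queries and nothing else.
Adversary = Callable[[Sequence[str]], str]


def answer_leakage_rate(traces: Sequence[Trace], adversary: Adversary) -> float:
    """Fraction of runs where the adversary recovers the private answer.

    Exact string match is an illustrative simplification, not the paper's metric.
    """
    if not traces:
        return 0.0
    leaked = sum(
        adversary(t.web_queries).strip().lower() == t.private_answer.strip().lower()
        for t in traces
    )
    return leaked / len(traces)


def toy_adversary(queries: Sequence[str]) -> str:
    """Stand-in for an adversary model: guesses any 'Project <Name>' it sees."""
    for q in queries:
        match = re.search(r"Project \w+", q)
        if match:
            return match.group(0)
    return ""


# Made-up example traces; the codenames are invented for this sketch.
traces = [
    Trace(["Project Falcon competitor pricing", "ERP vendor list"], "Project Falcon"),
    Trace(["cloud ERP market share 2024"], "Project Heron"),
]
rate = answer_leakage_rate(traces, toy_adversary)
print(f"answer leakage rate: {rate:.2f}")
```

The key design point is that the adversary function never receives `private_answer`: anything it recovers must come from the queries themselves.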

Novelty

The work appears novel in framing privacy risk for deep research agents through the mosaic effect, where multiple seemingly benign external queries become revealing in aggregate. It also contributes both a benchmark that explicitly interleaves private and public multi-hop dependencies and a privacy-aware RL method that assigns dense rewards for both task success and leakage avoidance.
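To make the idea of a dense reward covering both task success and leakage avoidance concrete, here is a minimal sketch. The linear form, the `leak_weight` parameter, and the per-step `task_progress` and `leak_score` inputs are assumptions for illustration; the summary does not specify the reward the authors actually use.

```python
from typing import List, Sequence, Tuple


def step_reward(task_progress: float, leak_score: float, leak_weight: float = 1.0) -> float:
    """Hypothetical per-step reward: credit for progress, penalty for leakage.

    task_progress: score in [0, 1] for how much this step advances the task.
    leak_score:    score in [0, 1] for how revealing this step's web query is.
    leak_weight:   assumed trade-off coefficient between the two terms.
    """
    return task_progress - leak_weight * leak_score


def episode_rewards(
    steps: Sequence[Tuple[float, float]], leak_weight: float = 1.0
) -> List[float]:
    """Assign one reward per step (dense), rather than one per episode (sparse)."""
    return [step_reward(progress, leak, leak_weight) for progress, leak in steps]


# Three illustrative steps: (task_progress, leak_score). Values are made up.
rewards = episode_rewards([(0.5, 0.0), (0.5, 0.75), (1.0, 0.25)], leak_weight=2.0)
print(rewards)
```

Under this sketch, a step that makes progress but issues a revealing query can receive a negative reward, which is the kind of per-step signal a performance-only objective lacks.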

Results

Empirically, the paper shows that models across families and sizes leak private information at all three evaluated levels. For Qwen3-4B-Instruct, task-performance RL improved strict chain success from 48.7% to 59.3% but increased answer/full-information leakage from 34.0% to 51.7%. The authors' Privacy-Aware Deep Research (PA-DR) training improved accuracy to 58.7% while reducing answer/full-information leakage to 9.9%; adding a privacy prompt on top of PA-DR reached 59.3% accuracy and 7.6% leakage.
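The trade-off in these figures is easier to see as changes relative to the untrained baseline. The short script below recomputes them in percentage points from the numbers reported above. It assumes that the "accuracy" reported for PA-DR is the same metric as strict chain success, which the summary does not state explicitly; the dictionary keys are labels chosen for this sketch.

```python
# Figures as reported above for Qwen3-4B-Instruct, in percent.
# Assumption: PA-DR "accuracy" is the same metric as strict chain success.
results = {
    "baseline": {"success": 48.7, "leakage": 34.0},
    "task_rl": {"success": 59.3, "leakage": 51.7},
    "pa_dr": {"success": 58.7, "leakage": 9.9},
    "pa_dr_plus_prompt": {"success": 59.3, "leakage": 7.6},
}


def delta_vs_baseline(name: str) -> dict:
    """Change in percentage points relative to the baseline row."""
    base = results["baseline"]
    row = results[name]
    return {
        "success": round(row["success"] - base["success"], 1),
        "leakage": round(row["leakage"] - base["leakage"], 1),
    }


deltas = {name: delta_vs_baseline(name) for name in results if name != "baseline"}
for name, d in deltas.items():
    print(f"{name}: success {d['success']:+.1f} pp, leakage {d['leakage']:+.1f} pp")
```

Read this way, performance-only RL and PA-DR deliver a similar gain in task success, but they move leakage in opposite directions.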

Key Points

  1. MosaicLeaks is a 1,001-task benchmark designed so that answering requires chaining private enterprise documents with public web information, creating realistic opportunities for query-based leakage.
  2. Privacy is evaluated from external queries alone at three levels—intent leakage, answer leakage, and full-information leakage—capturing both direct and mosaic-style disclosure risks.
  3. The proposed PA-DR reinforcement learning approach improves task performance while substantially lowering severe leakage, unlike naive prompting or performance-only training.
