Agent Search Queries Leak Private Data in MosaicLeaks Study
Researchers from ServiceNow AI and the University of Edinburgh found that deep-research AI agents leak sensitive internal data through external search logs.
On June 18, 2026, researchers from ServiceNow AI and the University of Edinburgh published MosaicLeaks, a benchmark and research paper detailing how deep-research AI agents leak sensitive data through external web search queries. The study identifies a “mosaic effect” where an adversary can reconstruct private enterprise information solely by observing the aggregate logs of an agent’s search tool usage. For developers building systems that interleave internal documents with public data, this exposes a structural vulnerability in standard agentic search behavior.
The Capable-but-Leaky Paradox
The research isolates a failure mode in agents that synthesize private context with external web search. A single query to a search API might appear benign. An aggregate log of these queries allows adversaries to infer sensitive data without accessing the source documents or the agent’s internal reasoning.
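The gap between per-query and aggregate exposure can be illustrated with a toy sketch. It models a private fact as a set of identifying terms and compares what each query reveals against what the full log reveals. `PRIVATE_FACT`, the sample queries, and the `exposed_terms` helper are all hypothetical, and keyword overlap is only a crude stand-in for the inference an actual adversary could perform.

```python
import re

# Hypothetical private fact, modeled as a set of identifying terms (an assumption).
PRIVATE_FACT = {"acme", "acquisition", "q3", "nordwind"}

# Hypothetical search queries an agent might send to an external API.
queries = [
    "acme corp market share logistics",
    "nordwind gmbh annual revenue",
    "typical acquisition timeline q3 regulatory filing",
]

def exposed_terms(text: str) -> set[str]:
    """Return the private terms that appear in the given text."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return PRIVATE_FACT & tokens

# Exposure of each query on its own, then of the whole log combined.
per_query = [exposed_terms(q) for q in queries]
aggregate = set().union(*per_query)

for q, terms in zip(queries, per_query):
    print(f"{len(terms)}/{len(PRIVATE_FACT)} private terms in: {q!r}")
print(f"aggregate log exposes {len(aggregate)}/{len(PRIVATE_FACT)} private terms")
```

No single query here carries more than half of the private terms, yet the combined log carries all of them, which is why per-query filtering alone can miss this kind of leak.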
The study categorizes this vulnerability into three severity tiers. Intent leakage reveals the private research goals the agent is pursuing. Answer leakage lets an observer use the query logs to answer specific internal questions. Full-information leakage occurs when the cumulative log states verifiably true private claims.
Models optimized strictly for task accuracy exacerbate this vulnerability. As agents improve at connecting disparate documents, they tend to include more private context in their search parameters to yield better external results. The researchers found that standard zero-shot privacy prompting reduces but fails to eliminate the risk.
Benchmark Performance Evaluation
The researchers introduced a benchmark of 1,001 multi-hop deep research tasks to measure this tradeoff. They evaluated standard reinforcement learning for task performance against their proposed Privacy-Aware Deep Research (PA-DR) framework. PA-DR applies a learned privacy classifier to provide dense credit assignment for privacy violations during the training process.
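To make "dense credit assignment" concrete, the sketch below penalizes each search step by its own privacy score instead of attaching a single penalty to the whole trajectory. This is an illustration under stated assumptions, not the PA-DR implementation: `toy_privacy_score` is a keyword-based stand-in for a learned classifier, and `dense_rewards`, `penalty_weight`, and the sample trajectory are hypothetical.

```python
from typing import Callable, Sequence

def toy_privacy_score(query: str, private_terms: set[str]) -> float:
    """Stand-in for a learned classifier: fraction of private terms in the query."""
    if not private_terms:
        return 0.0
    tokens = set(query.lower().split())
    return len(tokens & private_terms) / len(private_terms)

def dense_rewards(
    queries: Sequence[str],
    task_success: bool,
    score_fn: Callable[[str], float],
    penalty_weight: float = 1.0,
) -> list[float]:
    """Per-step rewards: each search step is penalized by its own privacy score,
    and the task reward (1.0 on success) is added to the final step."""
    rewards = [0.0 - penalty_weight * score_fn(q) for q in queries]
    if rewards:
        rewards[-1] += 1.0 if task_success else 0.0
    return rewards

# Hypothetical trajectory: only the second query mentions private terms.
private_terms = {"nordwind", "acquisition"}
trajectory = ["logistics market trends", "nordwind acquisition rumors", "eu merger rules"]
rewards = dense_rewards(trajectory, True, lambda q: toy_privacy_score(q, private_terms))
print(rewards)
```

Because the penalty lands on the specific step that leaked, a training signal of this shape points at the offending query directly, whereas a single end-of-episode penalty would leave the learner to work out which step caused it.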
Testing on Qwen3-4B-Instruct, the PA-DR framework improved task success while cutting the leakage rate by more than two-thirds.
| Training Method | Strict Chain Success Rate | Leakage Rate |
|---|---|---|
| Standard RL (Baseline) | 48.7% | 34.0% |
| Privacy-Aware Deep Research | 58.7% | 9.9% |
The performance gains in PA-DR indicate that models can learn to abstract their external queries effectively if privacy constraints are embedded directly into the training loop. This aligns with broader challenges in evaluating and testing AI agents across multi-step execution environments.
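The "more than two-thirds" figure can be checked directly from the table. The short calculation below uses only the four reported percentages; the variable names are illustrative.

```python
# Reported figures from the table above, in percent.
baseline_success, padr_success = 48.7, 58.7   # strict chain success rate
baseline_leak, padr_leak = 34.0, 9.9          # leakage rate

success_gain_points = padr_success - baseline_success
leak_drop_points = baseline_leak - padr_leak
relative_leak_reduction = leak_drop_points / baseline_leak

print(f"success gain: {success_gain_points:.1f} points")
print(f"leakage drop: {leak_drop_points:.1f} points ({relative_leak_reduction:.1%} relative)")
```

This works out to a 10.0-point gain in strict chain success and a 24.1-point drop in leakage, a relative reduction of roughly 70.9%.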
Enterprise Data Security Context
The release of MosaicLeaks follows a separate security incident involving ServiceNow earlier in June 2026. A misconfigured API endpoint on the company’s Australia platform release granted unauthenticated access to customer instances. While ServiceNow confirmed the activity originated primarily from bug-bounty researchers rather than malicious actors, the event highlighted the stringent security requirements for enterprise data platforms.
If you manage enterprise retrieval-augmented generation pipelines, evaluate how your models formulate external search API calls. The study's finding that zero-shot privacy prompting reduces but does not eliminate leakage suggests that top-level guardrails and system prompts alone are insufficient to prevent data extraction through search logs. Consider applying privacy-aware credit assignment during post-training alignment so that agents learn to sanitize their external tool usage before execution.
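Evaluating how an agent formulates its search calls starts with capturing them. One possible approach, sketched below, wraps the search tool so that every outgoing query is recorded per session for later review. The `audited` decorator, the `QUERY_LOG` store, and `fake_search` are hypothetical names, and the in-memory log is a simplification; nothing here reflects a specific vendor API.

```python
import functools
from collections import defaultdict
from typing import Callable

# In-memory audit log keyed by session ID; a real deployment would persist this.
QUERY_LOG: dict[str, list[str]] = defaultdict(list)

def audited(search_fn: Callable[[str], list[str]]) -> Callable[[str, str], list[str]]:
    """Wrap a search function so every outgoing query is recorded per session."""
    @functools.wraps(search_fn)
    def wrapper(session_id: str, query: str) -> list[str]:
        QUERY_LOG[session_id].append(query)  # record before the query leaves
        return search_fn(query)
    return wrapper

@audited
def fake_search(query: str) -> list[str]:
    """Hypothetical stand-in for an external search API call."""
    return [f"result for {query}"]

fake_search("session-1", "logistics market trends")
fake_search("session-1", "eu merger rules")
print(dict(QUERY_LOG))
```

Grouping queries by session matters because the risk described in the study comes from the cumulative log, so any review needs to see a session's queries together and not one at a time.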