Related papers: The Synthetic Web: Adversarially-Curated Mini-Internets for Diagnosing Epistemic Weaknesses of Language Agents

The Synthetic Web: Adversarially-Curated Mini-Internets for Diagnosing Epistemic Weaknesses of Language Agents

URL: http://arxiv.org/abs/2603.00801v1
Date: Sat, 28 Feb 2026 20:27:44 GMT
Title: The Synthetic Web: Adversarially-Curated Mini-Internets for Diagnosing Epistemic Weaknesses of Language Agents
Authors: Shrey Shah, Levent Ozgur,
Abstract summary: Language agents increasingly act as web-enabled systems that search, browse, and synthesize information from diverse sources.<n>These sources can include unreliable or adversarial content, and the robustness of agents to adversarial ranking remains poorly understood.<n>We introduce Synthetic Web Benchmark, a procedurally generated environment comprising thousands of hyperlinked articles with ground-truth labels for credibility and factuality.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Language agents increasingly act as web-enabled systems that search, browse, and synthesize information from diverse sources. However, these sources can include unreliable or adversarial content, and the robustness of agents to adversarial ranking - where misleading information appears prominently in search results - remains poorly understood. Existing benchmarks evaluate functional navigation or static factuality but cannot causally isolate this vulnerability, and current mitigation strategies for retrieval-augmented generation remain largely untested under such conditions. We introduce Synthetic Web Benchmark, a procedurally generated environment comprising thousands of hyperlinked articles with ground-truth labels for credibility and factuality, process-level interaction traces, and contamination filtering to eliminate training-data leakage. By injecting a single high-plausibility misinformation article into a controllable search rank, we measure the causal effect of adversarial exposure in six frontier models. The results reveal catastrophic failures: accuracy collapses despite unlimited access to truthful sources, with minimal search escalation and severe miscalibration. These findings expose fundamental limitations in how current frontier models handle conflicting information, with immediate implications for deployment in high-stakes domains. Our benchmark enables systematic analysis of these failure modes and provides a controlled testbed for evaluating mitigation strategies under adversarial ranking - a gap in current research. This work establishes a reproducible baseline for developing search-robust and epistemically humble agents capable of resisting manipulation in high-stakes domains.

Related papers

Deep Learning for Contextualized NetFlow-Based Network Intrusion Detection: Methods, Data, Evaluation and Deployment [5.402853794565817]
This paper synthesizes recent research on context-aware deep learning for flow-based intrusion detection.<n>We organize existing methods into a four-dimensional taxonomy covering temporal context, graph or relational context, multimodal context, and multi-resolution context.<n>We review common failure modes that can inflate reported results, including temporal leakage, data splitting, dataset design flaws, limited dataset diversity, and weak cross-dataset generalization.
arXiv Detail & Related papers (2026-02-05T12:25:18Z)
Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics [89.1999907891494]
We present WebDetective, a benchmark of hint-free multi-hop questions paired with a controlled Wikipedia sandbox.<n>Our evaluation of 25 state-of-the-art models reveals systematic weaknesses across all architectures.<n>We develop an agentic workflow, EvidenceLoop, that explicitly targets the challenges our benchmark identifies.
arXiv Detail & Related papers (2025-10-01T07:59:03Z)
Deep Learning Models for Robust Facial Liveness Detection [56.08694048252482]
This study introduces a robust solution through novel deep learning models addressing the deficiencies in contemporary anti-spoofing techniques.<n>By innovatively integrating texture analysis and reflective properties associated with genuine human traits, our models distinguish authentic presence from replicas with remarkable precision.
arXiv Detail & Related papers (2025-08-12T17:19:20Z)
Unmasking Synthetic Realities in Generative AI: A Comprehensive Review of Adversarially Robust Deepfake Detection Systems [4.359154048799454]
Deepfake proliferation-synthetic media poses challenges to digital security, misinformation mitigation, and identity preservation.<n>This systematic review evaluates state-of-the-art deepfake detection methodologies, emphasizing reproducible implementations for transparency and validation.<n>We delineate two core paradigms: (1) detection of fully synthetic media leveraging statistical anomalies and hierarchical feature extraction, and (2) localization of manipulated regions within authentic content employing multi-modal cues such as visual artifacts and temporal inconsistencies.
arXiv Detail & Related papers (2025-07-24T22:05:52Z)
InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation [63.55258191625131]
InfoDeepSeek is a new benchmark for assessing agentic information seeking in real-world, dynamic web environments.<n>We propose a systematic methodology for constructing challenging queries satisfying the criteria of determinacy, difficulty, and diversity.<n>We develop the first evaluation framework tailored to dynamic agentic information seeking, including fine-grained metrics about the accuracy, utility, and compactness of information seeking outcomes.
arXiv Detail & Related papers (2025-05-21T14:44:40Z)
Information Retrieval Induced Safety Degradation in AI Agents [52.15553901577888]
This study investigates how expanding retrieval access affects model reliability, bias propagation, and harmful content generation.<n>Retrieval-enabled agents built on aligned LLMs often behave more unsafely than uncensored models without retrieval.<n>These findings underscore the need for robust mitigation strategies to ensure fairness and reliability in retrieval-enabled and increasingly autonomous AI systems.
arXiv Detail & Related papers (2025-05-20T11:21:40Z)
TrustLoRA: Low-Rank Adaptation for Failure Detection under Out-of-distribution Data [62.22804234013273]
We propose a simple failure detection framework to unify and facilitate classification with rejection under both covariate and semantic shifts.<n>Our key insight is that by separating and consolidating failure-specific reliability knowledge with low-rank adapters, we can enhance the failure detection ability effectively and flexibly.
arXiv Detail & Related papers (2025-04-20T09:20:55Z)
Ensuring Medical AI Safety: Interpretability-Driven Detection and Mitigation of Spurious Model Behavior and Associated Data [14.991686165405959]
We show the applicability of the framework using four medical datasets across two modalities.<n>We successfully identify and unlearn these biases in VGG16, ResNet50, and contemporary Vision Transformer models.
arXiv Detail & Related papers (2025-01-23T16:39:09Z)
A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies. Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance. Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.