FuguReport

Learning to Act under Noise: Enhancing Agent Robustness via Noisy Environments

Authors: Yuxin Chen, Xiaodong Cai, Junfeng Fang, Zhuowen Han, Yu Wang, Yaorui Shi, Yi Zhang, Qi Gu, Xunliang Cai, Xiang Wang, An Zhang, Tat-Seng Chua
Affiliations: University of Science and Technology of China / National University of Singapore / Meituan / Tsinghua University / Tianjin University
Categories: Method / Robustness / Enhancing agent robustness to noise; Application / Interactive Agents / Agents in noisy user and tool environments; Evaluation / Robustness Evaluation / Performance under noisy and dynamic conditions
License: CC BY 4.0

Abstract Overview

This paper studies the mismatch between idealized agent training and real-world deployment, arguing that current LLM agents are trained in overly clean environments and therefore degrade under stochastic, imperfect interactions. The authors propose NoisyAgent, a training framework that explicitly injects two kinds of environmental noise into agent learning: user-side noise, which introduces ambiguity, inconsistency, and redundancy in instructions, and tool-side noise, which simulates failures, incomplete outputs, misleading responses, and redundant tool feedback. To keep training stable, the method mixes clean and noisy rollouts and computes advantages separately for each group. It also progressively increases the amount and difficulty of noise based on a measured robustness gap between clean and perturbed rollouts. Experiments cover noise-robustness benchmarks as well as standard agent benchmarks to test whether noise-aware training improves both robustness and general capability.
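The separate advantage computation for clean and noisy rollouts can be illustrated with a minimal sketch. It assumes a GRPO-style normalization (reward minus group mean, divided by group standard deviation), which this summary does not confirm as the paper's exact formula. The function names and the example rewards are hypothetical.

```python
from statistics import mean, pstdev

def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within a single group: (r - mean) / (std + eps).

    Assumption: GRPO-style normalization; the paper's exact formula may differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

def hybrid_advantages(clean_rewards, noisy_rewards, eps=1e-6):
    """Compute advantages separately for the clean and the noisy group,
    so each rollout is only compared with rollouts from the same condition."""
    return (
        group_advantages(clean_rewards, eps),
        group_advantages(noisy_rewards, eps),
    )

# Hypothetical rewards for one prompt: noisy rollouts succeed less often.
clean = [1.0, 1.0, 0.0, 1.0]
noisy = [0.0, 1.0, 0.0, 0.0]
clean_adv, noisy_adv = hybrid_advantages(clean, noisy)

# For comparison: normalizing over the pooled rollouts.
pooled_adv = group_advantages(clean + noisy)
```

With pooled normalization, the lower average reward of the noisy group shifts the baseline for every rollout. With separate groups, each advantage reflects only how a rollout compares with others generated under the same condition, which is one plausible reading of why the method normalizes per group.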

Novelty

The paper’s main novelty is to treat realistic interaction noise as a first-class component of agent training rather than only an evaluation condition. It combines automatic user/tool noise injection with hybrid clean-noisy rollouts and an adaptive curriculum that increases noise once the measured gap between clean and perturbed rollouts indicates that the model has sufficiently adapted.
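A minimal sketch of such an adaptive curriculum follows. It assumes the robustness gap is the difference between clean and noisy success rates and that noise is increased once this gap falls below a threshold. The class name, the threshold, the step size, and the caps are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass
class NoiseSchedule:
    """Hypothetical curriculum state; all default values are illustrative."""
    noise_ratio: float = 0.1     # fraction of rollouts that are perturbed
    level: int = 1               # difficulty level of the injected noise
    max_ratio: float = 0.5
    max_level: int = 3
    gap_threshold: float = 0.05  # assumed criterion for "sufficiently adapted"
    ratio_step: float = 0.1

    def update(self, clean_success: float, noisy_success: float) -> float:
        """Raise noise amount and difficulty when the robustness gap is small."""
        gap = clean_success - noisy_success
        if gap <= self.gap_threshold:
            self.noise_ratio = min(self.max_ratio, self.noise_ratio + self.ratio_step)
            self.level = min(self.max_level, self.level + 1)
        return gap

schedule = NoiseSchedule()
schedule.update(clean_success=0.80, noisy_success=0.50)  # large gap: unchanged
schedule.update(clean_success=0.80, noisy_success=0.78)  # small gap: noise increases
```

In this sketch the schedule only moves upward. Whether the actual method can also reduce noise when the gap widens is not stated in this summary.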

Results

Across both Qwen3-8B and Qwen3-32B backbones, NoisyAgent achieves the best reported results on AgentNoiseBench across all listed domains and metrics, outperforming GRPO, DAPO, and GSPO. The gains also carry over to idealized benchmarks: for example, with Qwen3-32B on τ2-Bench Retail, Avg@4 improves to 60.31 versus 58.55 for GSPO, and on AgentNoiseBench-τ2 Retail it reaches 43.20 versus 37.72 for GSPO. Ablation results further show that removing controlled injection, scheduling, or noise exposure reduces performance, indicating that each component contributes to robustness gains.
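To make the size of the reported gains explicit, the short sketch below computes the absolute difference over GSPO for the two Qwen3-32B Retail figures quoted above. Only those four numbers come from this summary. The dictionary keys and the helper name are illustrative.

```python
# (NoisyAgent, GSPO) scores quoted above for Qwen3-32B on the Retail domain.
reported = {
    "tau2-Bench Retail": (60.31, 58.55),
    "AgentNoiseBench-tau2 Retail": (43.20, 37.72),
}

def absolute_gain(ours: float, baseline: float) -> float:
    """Difference in points, rounded to the precision of the reported scores."""
    return round(ours - baseline, 2)

gains = {name: absolute_gain(*scores) for name, scores in reported.items()}

for name, value in gains.items():
    print(f"{name}: {value:+.2f} points over GSPO")
# tau2-Bench Retail: +1.76 points over GSPO
# AgentNoiseBench-tau2 Retail: +5.48 points over GSPO
```

The margin is roughly three times larger on the noisy benchmark than on the clean one, which fits the claim that the method mainly targets robustness while still helping in idealized settings.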

Key Points

  1. NoisyAgent models two practical noise sources during training: noisy user interactions and noisy tool outputs.
  2. The training strategy combines clean and perturbed rollouts with separate group-wise normalization and a progressive noise schedule to stabilize learning.
  3. The method improves robustness on noisy benchmarks and also yields consistent gains on standard clean benchmarks, suggesting better generalization rather than a trade-off between robustness and clean-environment performance.
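The tool-side noise from the first key point can be sketched as a wrapper around tool outputs. The four noise types follow the categories named in the overview (failures, incomplete outputs, misleading responses, redundant feedback). The concrete perturbations, the function names, and the example strings are hypothetical and much simpler than what an automatic injection pipeline would likely use.

```python
import random

TOOL_NOISE_TYPES = ("failure", "incomplete", "misleading", "redundant")

def perturb_tool_output(output: str, noise_type: str) -> str:
    """Apply one illustrative perturbation to a tool's text output."""
    if noise_type == "failure":
        # Simulate a failed call: the real result is lost entirely.
        return "ERROR: tool call failed"
    if noise_type == "incomplete":
        # Keep only the first half of the output.
        return output[: max(1, len(output) // 2)]
    if noise_type == "misleading":
        # Append a plausible but irrelevant record (hypothetical distractor).
        return output + "\nRelated record: order #0000 status=cancelled"
    if noise_type == "redundant":
        # Repeat the same content so the agent must filter duplicates.
        return output + "\n" + output
    raise ValueError(f"unknown noise type: {noise_type}")

def maybe_perturb(output: str, noise_prob: float, rng: random.Random) -> str:
    """With probability noise_prob, replace the output by a perturbed version."""
    if rng.random() < noise_prob:
        return perturb_tool_output(output, rng.choice(TOOL_NOISE_TYPES))
    return output

rng = random.Random(0)
clean_output = "order #1234 status=shipped eta=2 days"
observed = maybe_perturb(clean_output, noise_prob=0.3, rng=rng)
```

In a training loop, `noise_prob` would correspond to the share of perturbed tool calls, and user-side noise could be injected in the same way by rewriting instructions before they reach the agent.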
