Better with Experience: Self-Evolving LLM Agents for Evidence-Grounded Health Community Notes
Abstract Overview
This paper introduces EvoNote, a self-evolving agent framework for generating evidence-grounded health Community Notes on social platforms. The central idea is to reuse experience from prior misinformation-correction episodes by converting trajectory-level feedback into phase-specific memory for claim analysis, evidence acquisition, and note writing. The authors also construct MM-HealthCN, a 1.2K-instance multimodal benchmark of user-flagged health posts paired with human-written notes and helpfulness labels. Their evaluation emphasizes hierarchical utility judgment and pairwise comparison against human-written notes and automated baselines.
Novelty
The distinctive contribution is a memory-based self-evolving design that distills feedback from past correction trajectories into actionable, phase-specific strategies rather than treating each post independently. The work also contributes a multimodal benchmark and a health-specific utility evaluation protocol tailored to Community Notes generation.
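To make the idea of phase-specific memory concrete, here is a minimal sketch of a store that keeps lessons separately for each of the three phases named above. It is an illustration only, not the authors' implementation: the class name `PhaseMemory`, the methods `evolve` and `recall`, the phase identifiers, and the recency-based recall are all assumptions introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical identifiers for the three phases described in the summary.
PHASES = ("claim_analysis", "evidence_acquisition", "note_writing")


@dataclass
class PhaseMemory:
    """Illustrative store mapping each phase to a list of distilled strategies."""

    strategies: dict = field(default_factory=lambda: defaultdict(list))

    def evolve(self, feedback: dict) -> None:
        """Add each phase's lesson from one judged trajectory, skipping duplicates."""
        for phase, lesson in feedback.items():
            if phase not in PHASES:
                raise ValueError(f"unknown phase: {phase}")
            if lesson not in self.strategies[phase]:
                self.strategies[phase].append(lesson)

    def recall(self, phase: str, k: int = 3) -> list:
        """Return up to k of the most recently added strategies for a phase."""
        # .get avoids creating an empty entry for phases with no lessons yet.
        return list(self.strategies.get(phase, [])[-k:])


# Usage: feedback from one past episode informs the same phases of a later one.
memory = PhaseMemory()
memory.evolve({
    "evidence_acquisition": "Prefer primary health-agency sources.",
    "note_writing": "State the correction before the context.",
})
print(memory.recall("evidence_acquisition"))
```

Keeping one list per phase means a later episode retrieves only the lessons relevant to the step it is currently performing, which is the contrast with treating each post independently.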
Results
On MM-HealthCN, EvoNote-generated notes were preferred over the corresponding human-written Community Notes in 89.6% of cases under the reported human-validated utility judge, and the method outperformed several web-search, Community Notes, and memory-augmented baselines. On unresolved Needs More Ratings posts, the system produced helpful notes in 82.0% of cases. The paper also reports that median candidate-correction time fell from over 13 hours in the human pipeline to under 2 minutes.
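A preference figure of this kind is typically a simple win rate over pairwise comparisons. The sketch below shows that arithmetic on toy data; the function name `preference_rate`, the labels `"generated"` and `"human"`, and the absence of ties are assumptions made for illustration, and the sample list is not data from the paper.

```python
def preference_rate(judgments):
    """Fraction of pairwise comparisons in which the generated note was preferred."""
    if not judgments:
        raise ValueError("no judgments to aggregate")
    wins = sum(1 for verdict in judgments if verdict == "generated")
    return wins / len(judgments)


# Toy example: three of four comparisons favor the generated note.
toy_judgments = ["generated", "human", "generated", "generated"]
print(f"{preference_rate(toy_judgments):.1%}")  # prints 75.0%
```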
Key Points
- EvoNote uses a Social Utility Judge and Memory Evolver to turn completed note-generation trajectories into reusable memory for later cases.
- The authors introduce MM-HealthCN, a 1.2K-instance multimodal benchmark spanning text, image, and video health misinformation posts with linked Community Notes data.
- Analyses attribute performance gains to stronger evidence use, including higher-quality and more diverse sources, and to explicit claim analysis combined with evolving memory.