Related papers: Japanese AI Agent System on Human Papillomavirus Vaccination: System Design

URL: http://arxiv.org/abs/2601.10718v1
Date: Mon, 15 Dec 2025 15:13:22 GMT
Title: Japanese AI Agent System on Human Papillomavirus Vaccination: System Design
Authors: Junyu Liu, Siwen Yang, Dexiu Ma, Qian Niu, Zequn Zhang, Momoko Nagai-Tanima, Tomoki Aoyama
Abstract summary: Human papillomavirus (HPV) vaccine hesitancy poses significant public health challenges, particularly in Japan. This study aimed to develop a dual-purpose AI agent system that provides verified HPV vaccine information through a conversational interface.
Score: 13.804421144399791
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Human papillomavirus (HPV) vaccine hesitancy poses significant public health challenges, particularly in Japan where proactive vaccination recommendations were suspended from 2013 to 2021. The resulting information gap is exacerbated by misinformation on social media, and traditional ways cannot simultaneously address individual queries while monitoring population-level discourse. This study aimed to develop a dual-purpose AI agent system that provides verified HPV vaccine information through a conversational interface while generating analytical reports for medical institutions based on user interactions and social media. We implemented a system comprising: a vector database integrating academic papers, government sources, news media, and social media; a Retrieval-Augmented Generation chatbot using ReAct agent architecture with multi-tool orchestration across five knowledge sources; and an automated report generation system with modules for news analysis, research synthesis, social media sentiment analysis, and user interaction pattern identification. Performance was assessed using a 0-5 scoring scale. For single-turn evaluation, the chatbot achieved mean scores of 4.83 for relevance, 4.89 for routing, 4.50 for reference quality, 4.90 for correctness, and 4.88 for professional identity (overall 4.80). Multi-turn evaluation yielded higher scores: context retention 4.94, topic coherence 5.00, and overall 4.98. The report generation system achieved completeness 4.00-5.00, correctness 4.00-5.00, and helpfulness 3.67-5.00, with reference validity 5.00 across all periods. This study demonstrates the feasibility of an integrated AI agent system for bidirectional HPV vaccine communication. The architecture enables verified information delivery with source attribution while providing systematic public discourse analysis, with a transferable framework for adaptation to other medical contexts.

Related papers

Generating Natural-Language Surgical Feedback: From Structured Representation to Domain-Grounded Evaluation [66.7752700084159]
High-quality feedback from a surgical trainer is pivotal for improving trainee performance and long-term skill acquisition. We present a structure-aware pipeline that learns a surgical action ontology from real trainer-to-trainee transcripts.
arXiv Detail & Related papers (2025-11-19T06:19:34Z)
DispatchMAS: Fusing taxonomy and artificial intelligence agents for emergency medical services [49.70819009392778]
Large Language Models (LLMs) and Multi-Agent Systems (MAS) offer opportunities to augment dispatchers. This study aimed to develop and evaluate a taxonomy-grounded, multi-agent system for simulating realistic scenarios.
arXiv Detail & Related papers (2025-10-24T08:01:21Z)
Grounding Large Language Models in Clinical Evidence: A Retrieval-Augmented Generation System for Querying UK NICE Clinical Guidelines [1.9615061725959186]
This paper presents the development and evaluation of a Retrieval-Augmented Generation system for querying the United Kingdom's National Institute for Health and Care Excellence (NICE) clinical guidelines using Large Language Models (LLMs). The system's retrieval architecture, composed of a hybrid embedding mechanism, was evaluated against a database of 10,195 text chunks derived from three hundred guidelines. It demonstrates high performance, with a Mean Reciprocal Rank (MRR) of 0.814, a Recall of 81% at the first chunk and of 99.1% within the top ten retrieved chunks, when evaluated on 7901 queries.
arXiv Detail & Related papers (2025-10-03T12:57:13Z)
Performance of a large language model-Artificial Intelligence based chatbot for counseling patients with sexually transmitted infections and genital diseases [4.910821423749911]
Otiz is an AI-based platform designed specifically for STI detection and counseling. Four STIs (anogenital warts, herpes, syphilis, urethritis/cervicitis) were evaluated using prompts mimicking patient language. Otiz scored highly on diagnostic accuracy (4.1-4.7), overall accuracy (4.3-4.6), correctness of information (5.0), comprehensibility (4.2-4.4), and empathy (4.5-4.3.6).
arXiv Detail & Related papers (2024-12-11T20:36:32Z)
Simulated patient systems are intelligent when powered by large language model-based AI agents [32.73072809937573]
We developed AIPatient, an intelligent simulated patient system powered by large language model-based AI agents. The system incorporates the Retrieval Augmented Generation framework, powered by six task-specific LLM-based AI agents for complex reasoning. For simulation reality, the system is also powered by the AIPatient KG (Knowledge Graph), built with de-identified real patient data.
arXiv Detail & Related papers (2024-09-27T17:17:15Z)
Two-Layer Retrieval-Augmented Generation Framework for Low-Resource Medical Question Answering Using Reddit Data: Proof-of-Concept Study [4.769236554995528]
We propose a retrieval-augmented generation architecture for medical question answering on emerging issues associated with health-related topics. Our framework generates individual summaries followed by an aggregated summary to answer medical queries from large amounts of user-generated social media data. Our framework achieves comparable median scores in terms of relevance, length, hallucination, coverage, and coherence when evaluated using GPT-4 and Nous-Hermes-2-7B-DPO.
arXiv Detail & Related papers (2024-05-29T20:56:52Z)
Hierarchical Multi-Label Classification of Online Vaccine Concerns [8.271202196208]
Vaccine concerns are an ever-evolving target, and can shift quickly as seen during the COVID-19 pandemic. We explore the task of detecting vaccine concerns in online discourse using large language models (LLMs) in a zero-shot setting without the need for expensive training datasets.
arXiv Detail & Related papers (2024-02-01T20:56:07Z)
Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection [62.23830810096617]
Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care to delay further progression. This paper presents the development of a state-of-the-art Conformer based speech recognition system built on the DementiaBank Pitt corpus for automatic AD detection.
arXiv Detail & Related papers (2022-06-23T12:50:55Z)
Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation [56.25869366777579]
In recent years, machine learning models have rapidly become better at generating clinical consultation notes. We present an extensive human evaluation study where 5 clinicians listen to 57 mock consultations, write their own notes, post-edit a number of automatically generated notes, and extract all the errors. We find that a simple, character-based Levenshtein distance metric performs on par if not better than common model-based metrics like BertScore.
arXiv Detail & Related papers (2022-04-01T14:04:16Z)
MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation [86.38736781043109]
We build and release a large-scale high-quality Medical Dialogue dataset related to 12 types of common Gastrointestinal diseases named MedDG. We propose two kinds of medical dialogue tasks based on MedDG dataset. One is the next entity prediction and the other is the doctor response generation. Experimental results show that the pre-train language models and other baselines struggle on both tasks with poor performance in our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)
AVA: an Automatic eValuation Approach to Question Answering Systems [123.36351076384479]
AVA uses Transformer-based language models to encode question, answer, and reference text. Our solutions achieve up to 74.7% in F1 score in predicting human judgement for single answers.
arXiv Detail & Related papers (2020-05-02T05:00:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.