Related papers: Leveraging Generative AI for Clinical Evidence Summarization Needs to Ensure Trustworthiness

URL: http://arxiv.org/abs/2311.11211v3
Date: Mon, 1 Apr 2024 02:04:25 GMT
Title: Leveraging Generative AI for Clinical Evidence Summarization Needs to Ensure Trustworthiness
Authors: Gongbo Zhang, Qiao Jin, Denis Jered McInerney, Yong Chen, Fei Wang, Curtis L. Cole, Qian Yang, Yanshan Wang, Bradley A. Malin, Mor Peleg, Byron C. Wallace, Zhiyong Lu, Chunhua Weng, Yifan Peng
Abstract summary: Evidence-based medicine promises to improve the quality of healthcare by empowering medical decisions and practices with the best available evidence. The rapid growth of medical evidence, which can be obtained from various sources, poses a challenge in collecting, appraising, and synthesizing the evidential information. Recent advancements in generative AI, exemplified by large language models, hold promise in facilitating the arduous task.
Score: 47.51360338851017
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Evidence-based medicine promises to improve the quality of healthcare by empowering medical decisions and practices with the best available evidence. The rapid growth of medical evidence, which can be obtained from various sources, poses a challenge in collecting, appraising, and synthesizing the evidential information. Recent advancements in generative AI, exemplified by large language models, hold promise in facilitating the arduous task. However, developing accountable, fair, and inclusive models remains a complicated undertaking. In this perspective, we discuss the trustworthiness of generative AI in the context of automated summarization of medical evidence.

Related papers

Structured Outputs Enable General-Purpose LLMs to be Medical Experts [50.02627258858336]
Large language models (LLMs) often struggle with open-ended medical questions. We propose a novel approach utilizing structured medical reasoning. Our approach achieves the highest Factuality Score of 85.8, surpassing fine-tuned models.
arXiv Detail & Related papers (2025-03-05T05:24:55Z)
Medical Hallucinations in Foundation Models and Their Impact on Healthcare [53.97060824532454]
Foundation Models that are capable of processing and generating multi-modal data have transformed AI's role in medicine. We define medical hallucination as any instance in which a model generates misleading medical content. Our results reveal that inference techniques such as Chain-of-Thought (CoT) and Search Augmented Generation can effectively reduce hallucination rates. These findings underscore the ethical and practical imperative for robust detection and mitigation strategies.
arXiv Detail & Related papers (2025-02-26T02:30:44Z)
Retrieval-augmented systems can be dangerous medical communicators [21.371504193281226]
Patients have long sought health information online, and increasingly, they are turning to generative AI to answer their health-related queries. Retrieval-augmented generation and citation grounding have been widely promoted as methods to reduce hallucinations and improve the accuracy of AI-generated responses. This paper argues that even when these methods produce literally accurate content drawn from source documents sans hallucinations, they can still be highly misleading.
arXiv Detail & Related papers (2025-02-18T01:57:02Z)
Knowledge Graph-Driven Retrieval-Augmented Generation: Integrating Deepseek-R1 with Weaviate for Advanced Chatbot Applications [45.935798913942904]
We propose an innovative framework that combines structured biomedical knowledge with large language models (LLMs). Our system develops a thorough knowledge graph by identifying and refining causal relationships and named entities from medical abstracts related to age-related macular degeneration (AMD). Using a vector-based retrieval process and a locally deployed language model, our framework produces responses that are both contextually relevant and verifiable, with direct references to clinical evidence.
arXiv Detail & Related papers (2025-02-16T12:52:28Z)
Towards Next-Generation Medical Agent: How o1 is Reshaping Decision-Making in Medical Scenarios [46.729092855387165]
We study the choice of the backbone LLM for medical AI agents, which is the foundation for the agent's overall reasoning and action generation. Our findings demonstrate o1's ability to enhance diagnostic accuracy and consistency, paving the way for smarter, more responsive AI tools.
arXiv Detail & Related papers (2024-11-16T18:19:53Z)
Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering [51.26412822853409]
We present a novel personalized federated learning (pFL) method for medical visual question answering (VQA) models. Our method introduces learnable prompts into a Transformer architecture to efficiently train it on diverse medical datasets without massive computational costs.
arXiv Detail & Related papers (2024-10-23T00:31:17Z)
Generative AI for Health Technology Assessment: Opportunities, Challenges, and Policy Considerations [12.73011921253]
This review introduces the transformative potential of generative Artificial Intelligence (AI) and foundation models, including large language models (LLMs), for health technology assessment (HTA). We explore their applications in four critical areas: evidence synthesis, evidence generation, clinical trials, and economic modeling. Despite their promise, these technologies, while rapidly improving, are still nascent, and continued careful evaluation of their applications to HTA is required.
arXiv Detail & Related papers (2024-07-09T09:25:27Z)
Retrieval-Augmented Generation for Generative Artificial Intelligence in Medicine [10.004952611099947]
Retrieval-augmented generation (RAG) enables models to generate more accurate contents by leveraging the retrieval of external knowledge. RAG can pave the way for connecting generative AI with medical applications and is expected to bring innovations in equity, reliability, and personalization to health care.
arXiv Detail & Related papers (2024-06-18T09:53:37Z)
A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions [23.36640449085249]
We trace the recent advances of Medical Large Language Models (Med-LLMs). The wide-ranging applications of Med-LLMs are investigated across various healthcare domains. We discuss the challenges associated with ensuring fairness, accountability, privacy, and robustness.
arXiv Detail & Related papers (2024-06-06T03:15:13Z)
Would You Trust an AI Doctor? Building Reliable Medical Predictions with Kernel Dropout Uncertainty [14.672477787408887]
We introduce a Bayesian Monte Carlo Dropout model with kernel modelling to enhance reliability on small medical datasets. We demonstrate significant improvements in reliability, even with limited data, offering a promising step towards building trust in AI-driven medical predictions.
arXiv Detail & Related papers (2024-04-16T11:43:26Z)
Practical Applications of Advanced Cloud Services and Generative AI Systems in Medical Image Analysis [17.4235794108467]
The article explores the transformative potential of generative AI in medical imaging, emphasizing its ability to generate synthetic ACM-2 data. By addressing limitations in dataset size and diversity, these models contribute to more accurate diagnoses and improved patient outcomes.
arXiv Detail & Related papers (2024-03-26T09:55:49Z)
FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare [73.78776682247187]
Concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare.
arXiv Detail & Related papers (2023-08-11T10:49:05Z)
Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining [121.89793208683625]
Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks. We propose a new paradigm called Medical-knOwledge-enhanced mulTimOdal pretRaining (MOTOR).
arXiv Detail & Related papers (2023-04-26T01:26:19Z)
SPeC: A Soft Prompt-Based Calibration on Performance Variability of Large Language Model in Clinical Notes Summarization [50.01382938451978]
We introduce a model-agnostic pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization. Experimental findings indicate that our method not only bolsters performance but also effectively curbs variance for various language models.
arXiv Detail & Related papers (2023-03-23T04:47:46Z)
MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation [110.31526448744096]
We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. We are building MedPerf, an open framework for benchmarking machine learning in the medical domain.
arXiv Detail & Related papers (2021-09-29T18:09:41Z)
Explainable AI meets Healthcare: A Study on Heart Disease Dataset [0.0]
The aim is to enlighten practitioners on the understandability and interpretability of explainable AI systems using a variety of techniques. Our paper contains examples based on the heart disease dataset and elucidates how explainability techniques can be used to build trustworthiness.
arXiv Detail & Related papers (2020-11-06T05:18:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.