The Need for Guardrails with Large Language Models in Medical Safety-Critical Settings: An Artificial Intelligence Application in the Pharmacovigilance Ecosystem
- URL: http://arxiv.org/abs/2407.18322v2
- Date: Wed, 4 Sep 2024 17:16:05 GMT
- Title: The Need for Guardrails with Large Language Models in Medical Safety-Critical Settings: An Artificial Intelligence Application in the Pharmacovigilance Ecosystem
- Authors: Joe B Hakim, Jeffery L Painter, Darmendra Ramcharran, Vijay Kara, Greg Powell, Paulina Sobczak, Chiho Sato, Andrew Bate, Andrew Beam
- Abstract summary: Large language models (LLMs) are useful tools with the capacity for performing specific types of knowledge work at an effective scale.
However, deployments in high-risk and safety-critical domains pose unique challenges, notably the issue of hallucinations.
This is particularly concerning in settings such as drug safety, where inaccuracies could lead to patient harm.
We have developed and demonstrated a proof-of-concept suite of guardrails specifically designed to mitigate certain types of hallucinations and errors for drug safety.
- Score: 0.6965384453064829
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are useful tools with the capacity for performing specific types of knowledge work at an effective scale. However, LLM deployments in high-risk and safety-critical domains pose unique challenges, notably the issue of "hallucination," where LLMs can generate fabricated information. This is particularly concerning in settings such as drug safety, where inaccuracies could lead to patient harm. To mitigate these risks, we have developed and demonstrated a proof-of-concept suite of guardrails specifically designed to mitigate certain types of hallucinations and errors for drug safety, and potentially applicable to other medical safety-critical contexts. These guardrails include mechanisms to detect anomalous documents to prevent the ingestion of inappropriate data, identify incorrect drug names or adverse event terms, and convey uncertainty in generated content. We integrated these guardrails with an LLM fine-tuned for a text-to-text task, which involves converting both structured and unstructured data within adverse event reports into natural language. This method was applied to translate individual case safety reports, demonstrating effective application in a pharmacovigilance processing task. Our guardrail framework offers a set of tools with broad applicability across various domains, ensuring LLMs can be safely used in high-risk situations by eliminating the occurrence of key errors, including the generation of incorrect pharmacovigilance-related terms, thus adhering to stringent regulatory and quality standards in medical safety-critical environments.
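A minimal sketch of the kind of post-generation check these guardrails imply is shown below. It is illustrative only, not the authors' implementation: it validates drug names and adverse event terms extracted from an LLM-generated narrative against controlled vocabularies and the source report, and collects flags that could be surfaced as uncertainty. All function names, data structures, and the toy vocabularies are assumptions.

```python
# Illustrative guardrail sketch (not the authors' implementation).
# Assumes controlled vocabularies (e.g., a product dictionary and a
# MedDRA-like adverse event term list) are available as lowercase sets.
from dataclasses import dataclass, field


@dataclass
class GuardrailReport:
    unknown_drugs: list = field(default_factory=list)      # not in the drug vocabulary
    unknown_events: list = field(default_factory=list)     # not in the adverse event vocabulary
    unsupported_terms: list = field(default_factory=list)  # absent from the source report

    def passed(self) -> bool:
        return not (self.unknown_drugs or self.unknown_events or self.unsupported_terms)


def check_generated_narrative(narrative_terms: dict, source_report_text: str,
                              drug_vocab: set, event_vocab: set) -> GuardrailReport:
    """Flag potentially hallucinated or incorrect drug / adverse event terms.

    narrative_terms: {"drugs": [...], "events": [...]} extracted from the LLM output.
    source_report_text: the individual case safety report the narrative must stay faithful to.
    """
    report = GuardrailReport()
    source_lower = source_report_text.lower()
    for drug in narrative_terms.get("drugs", []):
        if drug.lower() not in drug_vocab:
            report.unknown_drugs.append(drug)
        elif drug.lower() not in source_lower:
            report.unsupported_terms.append(drug)
    for event in narrative_terms.get("events", []):
        if event.lower() not in event_vocab:
            report.unknown_events.append(event)
        elif event.lower() not in source_lower:
            report.unsupported_terms.append(event)
    return report


# Usage: block or annotate any generated narrative that fails the check.
drug_vocab = {"paracetamol", "ibuprofen"}      # placeholder vocabulary
event_vocab = {"nausea", "rash", "headache"}   # placeholder vocabulary
result = check_generated_narrative(
    {"drugs": ["Paracetamol"], "events": ["rash", "dizziness"]},
    "Patient took paracetamol 500 mg and developed a rash.",
    drug_vocab, event_vocab)
if not result.passed():
    print("Guardrail flags:", result)  # "dizziness" is flagged as an unknown adverse event term
```

In a production setting such term checks would draw on standard terminologies (for example, MedDRA for adverse event terms) and sit alongside the anomalous-document detection and uncertainty-conveyance guardrails described above.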
Related papers
- Visual-Semantic Knowledge Conflicts in Operating Rooms: Synthetic Data Curation for Surgical Risk Perception in Multimodal Large Language Models [7.916129615051081]
We introduce a dataset comprising over 34,000 synthetic images generated by diffusion models. The dataset includes 214 human-annotated images that serve as a gold-standard reference for validation.
arXiv Detail & Related papers (2025-06-25T07:06:29Z) - Should LLM Safety Be More Than Refusing Harmful Instructions? [6.5137518437747]
This paper presents a systematic evaluation of Large Language Models' (LLMs) behavior on long-tail distributed (encrypted) texts. We introduce a two-dimensional framework for assessing LLM safety. We demonstrate that models that possess capabilities to decrypt ciphers may be susceptible to mismatched-generalization attacks.
arXiv Detail & Related papers (2025-06-03T05:00:12Z) - Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation [70.62656296780074]
We propose AIDSAFE: Agentic Iterative Deliberation for Safety Reasoning, a novel data generation recipe. A data refiner stage in AIDSAFE ensures high-quality outputs by eliminating repetitive, redundant, and deceptive thoughts. Our evaluations demonstrate that AIDSAFE-generated CoTs achieve superior policy adherence and reasoning quality.
arXiv Detail & Related papers (2025-05-27T21:34:40Z) - AgentSGEN: Multi-Agent LLM in the Loop for Semantic Collaboration and GENeration of Synthetic Data [3.3186271052113843]
The scarcity of data presents a major obstacle to training AI systems for safety-critical applications, such as construction safety. We propose a novel multi-agent framework that employs an iterative, in-the-loop collaboration between two agents. Powered by LLMs' reasoning and common-sense knowledge, this collaborative design produces synthetic images tailored to safety-critical scenarios.
arXiv Detail & Related papers (2025-05-07T22:43:33Z) - Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models? [83.53005932513155]
Multi-modal large language models (MLLMs) have made significant progress, yet their safety alignment remains limited.
We propose fine-tuning MLLMs on a small set of benign instruction-following data with responses replaced by simple, clear rejection sentences.
arXiv Detail & Related papers (2025-04-14T09:03:51Z) - LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts [88.96201324719205]
Safety concerns in large language models (LLMs) have gained significant attention due to their exposure to potentially harmful data during pre-training. We identify a new safety vulnerability in LLMs, where seemingly benign prompts, semantically related to harmful content, can bypass safety mechanisms. We introduce a novel attack method, ActorBreaker, which identifies actors related to toxic prompts within the pre-training distribution.
arXiv Detail & Related papers (2024-10-14T16:41:49Z) - HELIOT: LLM-Based CDSS for Adverse Drug Reaction Management [3.2627279988912194]
HELIOT is an innovative Clinical Decision Support System (CDSS) for adverse drug reaction management.
It processes free-text clinical information using Large Language Models (LLMs) integrated with a comprehensive pharmaceutical data repository.
HELIOT learns from past patient-specific medication tolerances to reduce false alerts, enabling more nuanced and contextual adverse drug event warnings.
arXiv Detail & Related papers (2024-09-24T18:55:10Z) - Compromising Embodied Agents with Contextual Backdoor Attacks [69.71630408822767]
Large language models (LLMs) have transformed the development of embodied intelligence.
This paper uncovers a significant backdoor security threat within this process.
By poisoning just a few contextual demonstrations, attackers can covertly compromise the contextual environment of a black-box LLM.
arXiv Detail & Related papers (2024-08-06T01:20:12Z) - Adversarial Attacks on Large Language Models in Medicine [34.17895005922139]
The integration of Large Language Models into healthcare applications offers promising advancements in medical diagnostics, treatment recommendations, and patient care.
The susceptibility of LLMs to adversarial attacks poses a significant threat, potentially leading to harmful outcomes in delicate medical contexts.
This study investigates the vulnerability of LLMs to two types of adversarial attacks in three medical tasks.
arXiv Detail & Related papers (2024-06-18T04:24:30Z) - Current state of LLM Risks and AI Guardrails [0.0]
Large language models (LLMs) have become increasingly sophisticated, leading to widespread deployment in sensitive applications where safety and reliability are paramount.
The risks of such deployments necessitate the development of "guardrails" to align LLMs with desired behaviors and mitigate potential harm.
This work explores the risks associated with deploying LLMs and evaluates current approaches to implementing guardrails and model alignment techniques.
arXiv Detail & Related papers (2024-06-16T22:04:10Z) - Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in the belief that openly released base LLMs pose little misuse risk.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z) - Fine-Tuning, Quantization, and LLMs: Navigating Unintended Outcomes [0.0]
Large Language Models (LLMs) have gained widespread adoption across various domains, including chatbots and auto-task completion agents.
These models are susceptible to safety vulnerabilities such as jailbreaking, prompt injection, and privacy leakage attacks.
This study investigates the impact of modifications such as fine-tuning and quantization on LLM safety, a critical consideration for building reliable and secure AI systems.
arXiv Detail & Related papers (2024-04-05T20:31:45Z) - Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations [76.19419888353586]
Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations.
We present our efforts to create and deploy a library of detectors: compact and easy-to-build classification models that provide labels for various harms.
arXiv Detail & Related papers (2024-03-09T21:07:16Z) - Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science [65.77763092833348]
Intelligent agents powered by large language models (LLMs) have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines.
While their capabilities are promising, these agents also introduce novel vulnerabilities that demand careful consideration for safety.
This paper conducts a thorough examination of vulnerabilities in LLM-based agents within scientific domains, shedding light on potential risks associated with their misuse and emphasizing the need for safety measures.
arXiv Detail & Related papers (2024-02-06T18:54:07Z) - Forcing Generative Models to Degenerate Ones: The Power of Data
Poisoning Attacks [10.732558183444985]
Malicious actors can covertly exploit vulnerabilities in large language models (LLMs) through poisoning attacks aimed at generating undesirable outputs.
This paper explores various poisoning techniques to assess their effectiveness across a range of generative tasks.
We show that it is possible to successfully poison an LLM during the fine-tuning stage using as little as 1% of the total tuning data samples.
arXiv Detail & Related papers (2023-12-07T23:26:06Z) - Multilingual Jailbreak Challenges in Large Language Models [96.74878032417054]
In this study, we reveal the presence of multilingual jailbreak challenges within large language models (LLMs).
We consider two potential risky scenarios: unintentional and intentional.
We propose a novel Self-Defense framework that automatically generates multilingual training data for safety fine-tuning.
arXiv Detail & Related papers (2023-10-10T09:44:06Z) - Foveate, Attribute, and Rationalize: Towards Physically Safe and Trustworthy AI [76.28956947107372]
Covertly unsafe text is an area of particular interest, as such text may arise from everyday scenarios and is challenging to detect as harmful.
We propose FARM, a novel framework leveraging external knowledge for trustworthy rationale generation in the context of safety.
Our experiments show that FARM obtains state-of-the-art results on the SafeText dataset, with an absolute improvement of 5.9% in safety classification accuracy.
arXiv Detail & Related papers (2022-12-19T17:51:47Z) - Validation and Transparency in AI systems for pharmacovigilance: a case study applied to the medical literature monitoring of adverse events [0.483420384410068]
We present a case study on how to operationalize existing guidance for validated AI systems in pharmacovigilance.
We describe an AI system, built in close collaboration with subject matter experts, designed to reduce effort in medical literature monitoring activities.
arXiv Detail & Related papers (2021-12-21T21:02:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.