Bridging the Gap in Drug Safety Data Analysis: Large Language Models for SQL Query Generation
- URL: http://arxiv.org/abs/2406.10690v2
- Date: Wed, 19 Jun 2024 21:41:11 GMT
- Title: Bridging the Gap in Drug Safety Data Analysis: Large Language Models for SQL Query Generation
- Authors: Jeffery L. Painter, Venkateswara Rao Chalamalasetti, Raymond Kassekert, Andrew Bate
- Abstract summary: Traditionally, accessing safety data required database expertise, limiting broader use.
This paper introduces a novel application of Large Language Models (LLMs) to democratize database access for non-technical users.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pharmacovigilance (PV) is essential for drug safety, primarily focusing on adverse event monitoring. Traditionally, accessing safety data required database expertise, limiting broader use. This paper introduces a novel application of Large Language Models (LLMs) to democratize database access for non-technical users. Utilizing OpenAI's GPT-4, we developed a chatbot that generates structured query language (SQL) queries from natural language, bridging the gap between domain knowledge and technical requirements. The proposed application aims for more inclusive and efficient data access, enhancing decision making in drug safety. By providing LLMs with plain language summaries of expert knowledge, our approach significantly improves query accuracy over methods relying solely on database schemas. The application of LLMs in this context not only optimizes PV data analysis, ensuring timely and precise drug safety reporting -- a crucial component in adverse drug reaction monitoring -- but also promotes safer pharmacological practices and informed decision making across various data-intensive fields.
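The abstract's key idea -- supplying the LLM with a plain-language expert summary alongside the raw schema before asking it to translate a question into SQL -- can be sketched as follows. This is a minimal illustration, not the paper's implementation; the table, columns, and prompt wording are hypothetical, and the actual LLM call is omitted.

```python
# Hypothetical schema and expert notes for a drug-safety database.
SCHEMA_DDL = """
CREATE TABLE adverse_events (
    event_id INTEGER PRIMARY KEY,
    drug_name TEXT,
    reaction TEXT,
    report_date DATE,
    seriousness TEXT
);
"""

EXPERT_SUMMARY = """
The adverse_events table holds one row per reported adverse drug reaction.
'seriousness' is either 'serious' or 'non-serious'; pharmacovigilance
analysts usually filter on it when counting safety signals.
"""

def build_text_to_sql_prompt(question: str) -> str:
    """Combine the schema, a plain-language expert summary, and the
    user's question into a single prompt for an LLM."""
    return (
        "You are an assistant that writes SQL for a drug-safety database.\n\n"
        f"Schema:\n{SCHEMA_DDL}\n"
        f"Domain notes:\n{EXPERT_SUMMARY}\n"
        f"Question: {question}\n"
        "Respond with a single SQL query only."
    )

prompt = build_text_to_sql_prompt(
    "How many serious reactions were reported for aspirin in 2023?"
)
# The prompt would then be sent to a chat-completion endpoint (e.g. GPT-4).
```

The prompt-assembly step is where the paper's reported accuracy gain comes from: the "Domain notes" section encodes expert knowledge that a bare schema cannot convey.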
Related papers
- A Survey on Employing Large Language Models for Text-to-SQL Tasks [7.728180183687891]
The increasing volume of data stored in relational databases has led to the need for efficient querying and utilization of this data in various sectors.
To take advantage of the recent developments in Large Language Models (LLMs), a range of new methods have emerged, with a primary focus on prompt engineering and fine-tuning.
arXiv Detail & Related papers (2024-07-21T14:48:23Z)
- Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Text anonymization is crucial for sharing sensitive data while maintaining privacy.
Existing techniques face emerging challenges from the re-identification capabilities of Large Language Models.
This paper proposes a framework composed of three LLM-based components -- a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z)
- Large Language Models: A New Approach for Privacy Policy Analysis at Scale [1.7570777893613145]
This research proposes the application of Large Language Models (LLMs) as an alternative for effectively and efficiently extracting privacy practices from privacy policies at scale.
We leverage well-known LLMs such as ChatGPT and Llama 2, and offer guidance on the optimal design of prompts, parameters, and models.
Using several renowned datasets in the domain as a benchmark, our evaluation validates its exceptional performance, achieving an F1 score exceeding 93%.
arXiv Detail & Related papers (2024-05-31T15:12:33Z)
- ChatSOS: Vector Database Augmented Generative Question Answering Assistant in Safety Engineering [0.0]
This study develops a vector database from 117 explosion accident reports in China spanning 2013 to 2023.
By utilizing the vector database, which outperforms the relational database in information retrieval quality, we provide LLMs with richer, more relevant knowledge.
arXiv Detail & Related papers (2024-05-08T07:21:26Z)
- Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application [54.984348122105516]
Large Language Models (LLMs) pretrained on massive text corpora present a promising avenue for enhancing recommender systems.
We propose an Llm-driven knowlEdge Adaptive RecommeNdation (LEARN) framework that synergizes open-world knowledge with collaborative knowledge.
arXiv Detail & Related papers (2024-05-07T04:00:30Z)
- Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement [82.56964750522161]
We introduce Persona-DB, a simple framework consisting of a hierarchical construction process to improve generalization across task contexts.
In the task of response forecasting, Persona-DB demonstrates superior efficiency in maintaining accuracy with a significantly reduced retrieval size.
Our experiments also indicate a marked improvement of over 15% under cold-start scenarios, when users have extremely sparse data.
arXiv Detail & Related papers (2024-02-16T20:20:43Z)
- Large Language Models Can Be Good Privacy Protection Learners [53.07930843882592]
We introduce Privacy Protection Language Models (PPLM), a novel paradigm for fine-tuning language models.
Our work offers a theoretical analysis for model design and delves into various techniques such as corpus curation, penalty-based unlikelihood in training loss, and instruction-based tuning.
In particular, instruction tuning with both positive and negative examples stands out as a promising method, effectively protecting private data while enhancing the model's knowledge.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- Utilizing Large Language Models for Natural Interface to Pharmacology Databases [7.32741812808506]
We introduce a Large Language Model (LLM)-based Natural Language Interface to interact with structured information stored in databases.
This framework can generalize to query a wide range of pharmaceutical data and knowledge bases.
arXiv Detail & Related papers (2023-07-26T17:50:11Z)
- Unveiling the Potential of Knowledge-Prompted ChatGPT for Enhancing Drug Trafficking Detection on Social Media [30.791563171321062]
We propose an analytical framework to compose knowledge-informed prompts, which serve as the interface through which humans can interact with LLMs to perform the detection task.
Our experimental findings demonstrate that the proposed framework outperforms other baseline language models in terms of drug trafficking detection accuracy.
The implications of our research extend to social networks, emphasizing the importance of incorporating prior knowledge and scenario-based prompts into analytical tools to improve online security and public safety.
arXiv Detail & Related papers (2023-07-07T16:15:59Z)
- Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z)
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet-lab experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.