BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for
Real-World Pharmacovigilance
- URL: http://arxiv.org/abs/2305.13395v2
- Date: Fri, 20 Oct 2023 15:51:45 GMT
- Title: BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for
Real-World Pharmacovigilance
- Authors: Karel D'Oosterlinck, François Remy, Johannes Deleu, Thomas
Demeester, Chris Develder, Klim Zaporojets, Aneiss Ghodsi, Simon Ellershaw,
Jack Collins, Christopher Potts
- Abstract summary: We introduce BioDEX, a large-scale resource for Biomedical adverse Drug Event Extraction.
In this work, we consider the task of predicting the core information of the report given its originating paper.
We estimate human performance to be 72.0% F1, whereas our best model achieves 62.3% F1.
- Score: 26.88690084623426
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Timely and accurate extraction of Adverse Drug Events (ADE) from biomedical
literature is paramount for public safety, but involves slow and costly manual
labor. We set out to improve drug safety monitoring (pharmacovigilance, PV)
through the use of Natural Language Processing (NLP). We introduce BioDEX, a
large-scale resource for Biomedical adverse Drug Event Extraction, rooted in
the historical output of drug safety reporting in the U.S. BioDEX consists of
65k abstracts and 19k full-text biomedical papers with 256k associated
document-level safety reports created by medical experts. The core features of
these reports include the reported weight, age, and biological sex of a
patient, a set of drugs taken by the patient, the drug dosages, the reactions
experienced, and whether the reaction was life threatening. In this work, we
consider the task of predicting the core information of the report given its
originating paper. We estimate human performance to be 72.0% F1, whereas our
best model achieves 62.3% F1, indicating significant headroom on this task. We
also begin to explore ways in which these models could help professional PV
reviewers. Our code and data are available: https://github.com/KarelDO/BioDEX.
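To make the task in the abstract concrete, the following is a minimal sketch of how a predicted report could be scored against an expert-written one. Everything in it is an assumption made for illustration: the `SafetyReport` class, its field names, and the flat set-overlap F1 are hypothetical and are not taken from the BioDEX codebase, whose actual schema and metric may differ. Drug dosages are left out for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set, Tuple


@dataclass(frozen=True)
class SafetyReport:
    """Hypothetical container for the core report fields named in the abstract."""
    weight_kg: Optional[float] = None
    age_years: Optional[float] = None
    sex: Optional[str] = None
    drugs: FrozenSet[str] = frozenset()
    reactions: FrozenSet[str] = frozenset()
    life_threatening: Optional[bool] = None


def report_items(report: SafetyReport) -> Set[Tuple[str, object]]:
    """Flatten a report into a set of (field, value) pairs; unset fields are skipped."""
    items: Set[Tuple[str, object]] = set()
    for name in ("weight_kg", "age_years", "sex", "life_threatening"):
        value = getattr(report, name)
        if value is not None:
            items.add((name, value))
    # Set-valued fields contribute one item per element, compared case-insensitively.
    items.update(("drug", d.lower()) for d in report.drugs)
    items.update(("reaction", r.lower()) for r in report.reactions)
    return items


def report_f1(predicted: SafetyReport, gold: SafetyReport) -> float:
    """Set-overlap F1 between predicted and gold items (an assumed, simplified metric)."""
    pred_items = report_items(predicted)
    gold_items = report_items(gold)
    if not pred_items and not gold_items:
        return 1.0  # both reports are empty, so they agree trivially
    true_positives = len(pred_items & gold_items)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_items)
    recall = true_positives / len(gold_items)
    return 2 * precision * recall / (precision + recall)
```

Under this sketch, a prediction that misses one drug and the patient's age while adding one spurious reaction is penalized on both recall and precision. A per-field or weighted variant would be an equally reasonable design choice.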
Related papers
- "Hey..! This medicine made me sick": Sentiment Analysis of User-Generated Drug Reviews using Machine Learning Techniques [2.2874754079405535]
This project proposes a drug review classification system that classifies user reviews on a particular drug into different classes, such as positive, negative, and neutral.
The collected data is labeled manually, and the labels are then verified to ensure they are correct.
arXiv Detail & Related papers (2024-04-09T08:42:34Z)
- LLaVA-Med: Training a Large Language-and-Vision Assistant for
Biomedicine in One Day [85.19963303642427]
We propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions of biomedical images.
The model first learns to align biomedical vocabulary using the figure-caption pairs as is, then learns to master open-ended conversational semantics.
This enables us to train a Large Language and Vision Assistant for BioMedicine in less than 15 hours (with eight A100s).
arXiv Detail & Related papers (2023-06-01T16:50:07Z)
- Retrieval-Augmented and Knowledge-Grounded Language Models for Faithful Clinical Medicine [68.7814360102644]
We propose the Re³Writer method with retrieval-augmented generation and knowledge-grounded reasoning.
We demonstrate the effectiveness of our method in generating patient discharge instructions.
arXiv Detail & Related papers (2022-10-23T16:34:39Z)
- PHEE: A Dataset for Pharmacovigilance Event Extraction from Text [42.365919892504415]
PHEE is a novel dataset for pharmacovigilance comprising over 5000 annotated events from medical case reports and biomedical literature.
We describe the hierarchical event schema designed to provide coarse and fine-grained information about patients' demographics, treatments and (side) effects.
arXiv Detail & Related papers (2022-10-22T21:57:42Z)
- BioGPT: Generative Pre-trained Transformer for Biomedical Text
Generation and Mining [140.61707108174247]
We propose BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature.
We get 44.98%, 38.42% and 40.76% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks respectively, and 78.2% accuracy on PubMedQA.
arXiv Detail & Related papers (2022-10-19T07:17:39Z)
- Filter Drug-induced Liver Injury Literature with Natural Language
Processing and Ensemble Learning [0.0]
Drug-induced liver injury (DILI) describes the adverse effects of drugs that damage the liver.
Life-threatening results including liver failure or death were also reported in severe DILI cases.
Data extraction from previous publications relies heavily on manual labelling.
Recent developments in artificial intelligence have enabled automatic processing of biomedical texts.
arXiv Detail & Related papers (2022-03-09T23:53:07Z)
- Deep learning for drug repurposing: methods, databases, and applications [54.08583498324774]
Repurposing existing drugs for new therapies is an attractive solution that accelerates drug development at reduced experimental costs.
In this review, we introduce guidelines on how to utilize deep learning methodologies and tools for drug repurposing.
arXiv Detail & Related papers (2022-02-08T09:42:08Z)
- R-BERT-CNN: Drug-target interactions extraction from biomedical
literature [1.8814209805277506]
We present our participation in the DrugProt task of the BioCreative VII challenge.
Drug-target interactions (DTIs) are critical for drug discovery and repurposing.
There are >32M biomedical articles on PubMed and manually extracting DTIs from such a huge knowledge base is challenging.
arXiv Detail & Related papers (2021-10-31T22:50:33Z)
- Discovering Drug-Target Interaction Knowledge from Biomedical Literature [107.98712673387031]
The Interaction between Drugs and Targets (DTI) in the human body plays a crucial role in biomedical science and applications.
As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from literature becomes an urgent demand in the industry.
We explore the first end-to-end solution for this task by using generative approaches.
We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.
arXiv Detail & Related papers (2021-09-27T17:00:14Z)
- Biomedical Information Extraction for Disease Gene Prioritization [0.34998703934432673]
We introduce a biomedical information extraction pipeline that extracts biological relationships from text.
We apply it to tens of millions of PubMed abstracts to extract protein-protein interactions (PPIs) and augment these extractions to a biomedical knowledge graph.
We show that, despite already containing PPIs from an established structured source, augmenting our own IE-based extractions to the graph allows us to predict novel disease-gene associations with a 20% relative increase in hit@30.
arXiv Detail & Related papers (2020-11-10T15:38:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.