Selection of pseudo-annotated data for adverse drug reaction
classification across drug groups
- URL: http://arxiv.org/abs/2111.12477v1
- Date: Wed, 24 Nov 2021 13:11:05 GMT
- Title: Selection of pseudo-annotated data for adverse drug reaction
classification across drug groups
- Authors: Ilseyar Alimova and Elena Tutubalina
- Abstract summary: We assess the robustness of state-of-the-art neural architectures across different drug groups.
We investigate several strategies to use pseudo-labeled data in addition to a manually annotated train set.
- Score: 12.259552039796027
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic monitoring of adverse drug events (ADEs) or reactions (ADRs) is
currently receiving significant attention from the biomedical community. In
recent years, user-generated data on social media has become a valuable
resource for this task. Neural models have achieved impressive performance on
automatic text classification for ADR detection. Yet, training and evaluation
of these methods are carried out on user-generated texts about a targeted drug.
In this paper, we assess the robustness of state-of-the-art neural
architectures across different drug groups. We investigate several strategies
to use pseudo-labeled data in addition to a manually annotated train set.
Out-of-dataset experiments diagnose the bottleneck of supervised models in
terms of breakdown performance, while additional pseudo-labeled data improves
overall results regardless of the text selection strategy.
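The abstract does not say which selection strategies were compared, so the following is only an illustrative sketch of one common approach: keeping pseudo-labeled texts whose predicted class is confident enough, then adding them to the manually annotated train set. The function name `select_pseudo_labeled`, the 0.9 threshold, the example texts, and the probabilities are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Example = Tuple[str, int]  # (text, label), where 1 = mentions an ADR, 0 = does not


def select_pseudo_labeled(
    texts: List[str],
    adr_probs: List[float],
    threshold: float = 0.9,  # hypothetical cut-off, not a value from the paper
) -> List[Example]:
    """Keep unlabeled texts whose predicted class is confident enough.

    adr_probs[i] is assumed to be a classifier's probability that
    texts[i] mentions an adverse drug reaction.
    """
    selected = []
    for text, p in zip(texts, adr_probs):
        confidence = max(p, 1.0 - p)  # confidence of the predicted class
        if confidence >= threshold:
            selected.append((text, 1 if p >= 0.5 else 0))
    return selected


# Manually annotated train set (made-up examples).
gold_train = [
    ("this drug gave me a terrible headache", 1),
    ("picked up my refill today", 0),
]

# Unlabeled texts with made-up model probabilities.
unlabeled = [
    "felt dizzy after the second dose",
    "works fine so far",
    "not sure how i feel about it",
]
probs = [0.96, 0.03, 0.55]

pseudo = select_pseudo_labeled(unlabeled, probs, threshold=0.9)
augmented_train = gold_train + pseudo  # gold data plus selected pseudo-labels
```

With these made-up numbers, the third text is dropped because neither class reaches the threshold. Other strategies (for example random sampling, or selecting by drug group) would only change the body of the selection function.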
Related papers
- Regressor-free Molecule Generation to Support Drug Response Prediction [83.25894107956735]
Conditional generation based on the target IC50 score can obtain a more effective sampling space.
Regressor-free guidance combines a diffusion model's score estimation with a regression controller model's gradient based on number labels.
arXiv Detail & Related papers (2024-05-23T13:22:17Z)
- "Hey..! This medicine made me sick": Sentiment Analysis of User-Generated Drug Reviews using Machine Learning Techniques [2.2874754079405535]
This project proposes a drug review classification system that classifies user reviews on a particular drug into different classes, such as positive, negative, and neutral.
The collected data is manually labeled and then verified to ensure that the labels are correct.
arXiv Detail & Related papers (2024-04-09T08:42:34Z)
- A large dataset curation and benchmark for drug target interaction [0.7699646945563469]
Bioactivity data plays a key role in drug discovery and repurposing.
We propose a way to standardize and represent efficiently a very large dataset curated from multiple public sources.
arXiv Detail & Related papers (2024-01-30T17:06:25Z)
- Towards a more inductive world for drug repurposing approaches [0.545520830707066]
Drug-target interaction (DTI) prediction is a challenging, albeit essential task in drug repurposing.
We show that DTI prediction methods based on transductive models lack generalization and lead to inflated performance.
We propose a novel biologically-driven strategy for negative edge subsampling and show through in vitro validation that newly discovered interactions are indeed true.
arXiv Detail & Related papers (2023-11-21T15:28:44Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss.
Our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z)
- A Deep Learning Approach to the Prediction of Drug Side-Effects on Molecular Graphs [2.4087148947930634]
We develop a methodology to predict drug side-effects using Graph Neural Networks.
We build a dataset from freely accessible and well established data sources.
The results show that our method has an improved classification capability, under many parameters and metrics.
arXiv Detail & Related papers (2022-11-30T10:12:41Z)
- ALLSH: Active Learning Guided by Local Sensitivity and Hardness [98.61023158378407]
We propose to retrieve unlabeled samples with a local sensitivity and hardness-aware acquisition function.
Our method achieves consistent gains over the commonly used active learning strategies in various classification tasks.
arXiv Detail & Related papers (2022-05-10T15:39:11Z)
- DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations [90.27736364704108]
We present DrugOOD, a systematic OOD dataset curator and benchmark for AI-aided drug discovery.
DrugOOD comes with an open-source Python package that fully automates benchmarking processes.
We focus on one of the most crucial problems in AIDD: drug target binding affinity prediction.
arXiv Detail & Related papers (2022-01-24T12:32:48Z)
- Neural Medication Extraction: A Comparison of Recent Models in Supervised and Semi-supervised Learning Settings [0.751289645756884]
Drug prescriptions are essential information that must be encoded in electronic medical records.
This is why the medication extraction task has emerged.
We present an independent and comprehensive evaluation of state-of-the-art neural architectures on the I2B2 medical prescription extraction task.
arXiv Detail & Related papers (2021-10-19T19:23:38Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.