Large-Scale Data Mining of Rapid Residue Detection Assay Data From HTML
and PDF Documents: Improving Data Access and Visualization for Veterinarians
- URL: http://arxiv.org/abs/2112.00962v1
- Date: Thu, 2 Dec 2021 03:39:25 GMT
- Title: Large-Scale Data Mining of Rapid Residue Detection Assay Data From HTML
and PDF Documents: Improving Data Access and Visualization for Veterinarians
- Authors: Majid Jaberi-Douraki, Soudabeh Taghian Dinani, Nuwan Indika Millagaha
Gedara, Xuan Xu, Emily Richards, Fiona Maunsell, Nader Zad, Lisa Ann Tell
- Abstract summary: Extra-label drug use in food animal medicine is authorized by the US Animal Medicinal Drug Use Clarification Act (AMDUCA).
Occasionally, a paucity of scientific data on which to base a withdrawal interval, or a large number of animals being treated, drives the need to test for drug residues.
Rapid assay commercial farm-side tests are essential for monitoring drug residues in animal products to protect human health.
- Score: 3.055086390437118
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Extra-label drug use in food animal medicine is authorized by the US Animal
Medicinal Drug Use Clarification Act (AMDUCA), and estimated withdrawal
intervals are based on published scientific pharmacokinetic data. Occasionally,
a paucity of scientific data on which to base a withdrawal interval, or a large
number of animals being treated, drives the need to test for drug
residues. Rapid assay commercial farm-side tests are essential for monitoring
drug residues in animal products to protect human health. Active ingredients,
sensitivity, matrices, and species that have been evaluated for commercial
rapid assay tests are typically reported on manufacturers' websites or in PDF
documents that are available to consumers but may require a special access
request. Additionally, this information is not always correlated with
FDA-approved tolerances. Furthermore, parameter changes for these tests can be
very challenging to regularly identify, especially those listed on websites or
in documents that are not publicly available. Therefore, artificial
intelligence plays a critical role in efficiently extracting the data and
keeping the information current. Extracting tables from PDF and HTML documents
has been investigated by both academia and commercial tool builders. Research in
text mining of such documents has become a widespread yet challenging area of
natural language processing. However, techniques for extracting tables are
still in their infancy and are being investigated and improved by
researchers. In this study, we developed and evaluated a data-mining method for
automatically extracting rapid assay data from electronic documents. Our
automatic electronic data extraction method includes a software package module,
a developed pattern recognition tool, and a data mining engine. Assay details
were provided by several commercial entities that produce these rapid drug
residue assay
Related papers
- DrugAgent: Explainable Drug Repurposing Agent with Large Language Model-based Reasoning [10.528489471229946]
We propose a multi-agent framework to enhance the drug repurposing process using state-of-the-art machine learning techniques and knowledge integration.
Our framework comprises several specialized agents: an AI Agent trains robust drug-target interaction (DTI) models; a Knowledge Graph Agent utilizes the drug-gene interaction database (DGIdb) to systematically extract DTIs.
By integrating outputs from these agents, our system effectively harnesses diverse data sources, including external databases, to propose viable repurposing candidates.
arXiv Detail & Related papers (2024-08-23T21:24:59Z)
- "Hey..! This medicine made me sick": Sentiment Analysis of User-Generated Drug Reviews using Machine Learning Techniques [2.2874754079405535]
This project proposes a drug review classification system that classifies user reviews on a particular drug into different classes, such as positive, negative, and neutral.
The collected data is manually labeled and verified to ensure that the labels are correct.
arXiv Detail & Related papers (2024-04-09T08:42:34Z)
- Learning to Describe for Predicting Zero-shot Drug-Drug Interactions [54.172575323610175]
Adverse drug-drug interactions can compromise the effectiveness of concurrent drug administration.
Traditional computational methods for DDI prediction may fail to capture interactions for new drugs due to the lack of knowledge.
We propose TextDDI with a language model-based DDI predictor and a reinforcement learning (RL)-based information selector.
arXiv Detail & Related papers (2024-03-13T09:42:46Z)
- ImDrug: A Benchmark for Deep Imbalanced Learning in AI-aided Drug Discovery [79.08833067391093]
Real-world pharmaceutical datasets often exhibit highly imbalanced distributions.
We introduce ImDrug, a benchmark with an open-source Python library which consists of 4 imbalance settings, 11 AI-ready datasets, 54 learning tasks and 16 baseline algorithms tailored for imbalanced learning.
It provides an accessible and customizable testbed for problems and solutions spanning a broad spectrum of the drug discovery pipeline.
arXiv Detail & Related papers (2022-09-16T13:35:57Z)
- Black-box Dataset Ownership Verification via Backdoor Watermarking [67.69308278379957]
We formulate the protection of released datasets as verifying whether they are adopted for training a (suspicious) third-party model.
We propose to embed external patterns via backdoor watermarking for the ownership verification to protect them.
Specifically, we exploit poison-only backdoor attacks (e.g., BadNets) for dataset watermarking and design a hypothesis-test-guided method for dataset verification.
arXiv Detail & Related papers (2022-08-04T05:32:20Z)
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
- DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations [90.27736364704108]
We present DrugOOD, a systematic OOD dataset curator and benchmark for AI-aided drug discovery.
DrugOOD comes with an open-source Python package that fully automates benchmarking processes.
We focus on one of the most crucial problems in AIDD: drug target binding affinity prediction.
arXiv Detail & Related papers (2022-01-24T12:32:48Z)
- Neural Medication Extraction: A Comparison of Recent Models in Supervised and Semi-supervised Learning Settings [0.751289645756884]
Drug prescriptions are essential information that must be encoded in electronic medical records.
This is why the medication extraction task has emerged.
We present an independent and comprehensive evaluation of state-of-the-art neural architectures on the I2B2 medical prescription extraction task.
arXiv Detail & Related papers (2021-10-19T19:23:38Z)
- Semi-Supervised Exaggeration Detection of Health Science Press Releases [23.930041685595775]
Recent studies have demonstrated a tendency of news media to misrepresent scientific papers by exaggerating their findings.
We present a formalization of and study into the problem of exaggeration detection in science communication.
We introduce MT-PET, a multi-task version of Pattern Exploiting Training (PET), which leverages knowledge from complementary cloze-style QA tasks to improve few-shot learning.
arXiv Detail & Related papers (2021-08-30T19:32:20Z)
- Data Mining with Big Data in Intrusion Detection Systems: A Systematic Literature Review [68.15472610671748]
Cloud computing has become a powerful and indispensable technology for complex, high performance and scalable computation.
The rapid rate and volume of data creation have begun to pose significant challenges for data management and security.
The design and deployment of intrusion detection systems (IDS) in the big data setting have, therefore, become a topic of importance.
arXiv Detail & Related papers (2020-05-23T20:57:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.