Rapid Biomedical Research Classification: The Pandemic PACT Advanced Categorisation Engine
- URL: http://arxiv.org/abs/2407.10086v2
- Date: Fri, 19 Jul 2024 14:28:26 GMT
- Title: Rapid Biomedical Research Classification: The Pandemic PACT Advanced Categorisation Engine
- Authors: Omid Rohanian, Mohammadmahdi Nouriborji, Olena Seminog, Rodrigo Furst, Thomas Mendy, Shanthi Levanita, Zaharat Kadri-Alabi, Nusrat Jabin, Daniela Toale, Georgina Humphreys, Emilia Antonio, Adrian Bucher, Alice Norton, David A. Clifton
- Abstract summary: The Pandemic PACT project aims to track and analyse research funding and clinical evidence for a wide range of diseases with outbreak potential.
This paper introduces the Pandemic PACT Advanced Categorisation Engine (PPACE) along with its associated dataset.
- Score: 10.692728349388297
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces the Pandemic PACT Advanced Categorisation Engine (PPACE) along with its associated dataset. PPACE is a fine-tuned model developed to automatically classify research abstracts from funded biomedical projects according to WHO-aligned research priorities. This task is crucial for monitoring research trends and identifying gaps in global health preparedness and response. Our approach builds on human-annotated projects, which are allocated one or more categories from a predefined list. A large language model is then used to generate 'rationales' explaining the reasoning behind these annotations. This augmented data, comprising expert annotations and rationales, is subsequently used to fine-tune a smaller, more efficient model. Developed as part of the Pandemic PACT project, which aims to track and analyse research funding and clinical evidence for a wide range of diseases with outbreak potential, PPACE supports informed decision-making by research funders, policymakers, and independent researchers. We introduce and release both the trained model and the instruction-based dataset used for its training. Our evaluation shows that PPACE significantly outperforms its baselines. The release of PPACE and its associated dataset offers valuable resources for researchers in multilabel biomedical document classification and supports advancements in aligning biomedical research with key global health priorities.
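The abstract describes combining expert-assigned categories with LLM-generated rationales into an instruction-based dataset. As a rough illustration only, the sketch below shows one way such a record might be assembled and how labels could be read back from a model's output. The category names, field names (`instruction`, `input`, `output`), output layout, and both helper functions are assumptions made for this example; they are not taken from the PPACE release.

```python
import json

# Hypothetical placeholder categories; the actual WHO-aligned list is not given here.
CATEGORIES = ["Category A", "Category B", "Category C"]


def build_record(abstract, labels, rationale):
    """Combine an abstract, expert labels, and a rationale into one instruction-style example."""
    unknown = [label for label in labels if label not in CATEGORIES]
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")
    instruction = (
        "Classify the research abstract into one or more of these categories: "
        + "; ".join(CATEGORIES)
    )
    # Assumed target layout: rationale first, then the labels as a JSON list.
    output = f"Rationale: {rationale}\nCategories: {json.dumps(sorted(labels))}"
    return {"instruction": instruction, "input": abstract, "output": output}


def parse_labels(output_text):
    """Recover the label list from text that follows the layout used in build_record."""
    prefix = "Categories: "
    for line in output_text.splitlines():
        if line.startswith(prefix):
            return json.loads(line[len(prefix):])
    return []  # no label line found


record = build_record(
    "A made-up abstract about outbreak preparedness.",
    ["Category B", "Category A"],
    "A made-up rationale explaining the annotation.",
)
print(parse_labels(record["output"]))
```

Keeping the labels on a single JSON-encoded line makes multilabel predictions straightforward to parse and score; a real pipeline would likely need more tolerant parsing for malformed model output.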
Related papers
- A Survey of Models for Cognitive Diagnosis: New Developments and Future Directions [66.40362209055023]
This paper aims to provide a survey of current models for cognitive diagnosis, with more attention on new developments using machine learning-based methods.
By comparing the model structures, parameter estimation algorithms, model evaluation methods and applications, we provide a relatively comprehensive review of the recent trends in cognitive diagnosis models.
arXiv Detail & Related papers (2024-07-07T18:02:00Z)
- Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges [2.1835659964186087]
This paper presents a systematic review of generative models used to synthesize various medical data types.
Our study encompasses a broad array of medical data modalities and explores various generative models.
arXiv Detail & Related papers (2024-06-27T14:00:11Z)
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
- Recent Advances in Predictive Modeling with Electronic Health Records [71.19967863320647]
Utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics.
Deep learning has demonstrated its superiority in various applications, including healthcare.
arXiv Detail & Related papers (2024-02-02T00:31:01Z)
- A Review of Deep Learning Methods for Photoplethysmography Data [10.27280499967643]
Photoplethysmography is a promising device due to its advantages in portability, user-friendly operation, and non-invasive capabilities.
Recent advancements in deep learning have demonstrated remarkable outcomes by leveraging PPG signals for tasks related to personal health management.
arXiv Detail & Related papers (2024-01-23T14:11:29Z)
- Discovering Mental Health Research Topics with Topic Modeling [13.651763262606782]
This study aims to identify general trends in the field and pinpoint high-impact research topics by analyzing a large dataset of mental health research papers.
Our dataset comprises 96,676 research papers pertaining to mental health, enabling us to examine the relationships between different topics using their abstracts.
To enhance our analysis, we also generated word clouds to provide a comprehensive overview of the machine learning models applied in mental health research.
arXiv Detail & Related papers (2023-08-25T05:25:05Z)
- Literature-based Discovery for Landscape Planning [1.1939762265857434]
This project demonstrates how medical corpus hypothesis generation can be used to derive new research angles for landscape and urban planners.
AGATHA was used to identify likely conceptual relationships between emerging infectious diseases (EIDs) and deforestation.
This research also serves as a partial proof-of-concept for the application of medical database hypothesis generation to medicine-adjacent hypothesis discovery.
arXiv Detail & Related papers (2023-06-05T04:32:46Z)
- EBOCA: Evidences for BiOmedical Concepts Association Ontology [55.41644538483948]
This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and associations between them, and (ii) evidences supporting these associations.
Test data coming from a subset of DISNET and automatic association extractions from texts has been transformed to create a Knowledge Graph that can be used in real scenarios.
arXiv Detail & Related papers (2022-08-01T18:47:03Z)
- Machine Learning Applications for Therapeutic Tasks with Genomics Data [49.98249191161107]
We review the literature on machine learning applications for genomics through the lens of therapeutic development.
We identify twenty-two machine learning in genomics applications across the entire therapeutics pipeline.
We pinpoint seven important challenges in this field with opportunities for expansion and impact.
arXiv Detail & Related papers (2021-05-03T21:20:20Z)
- Challenges in biomarker discovery and biorepository for Gulf-war-disease studies: a novel data platform solution [48.7576911714538]
We introduce a novel data platform, named ROSALIND, to overcome the challenges, foster healthy and vital collaborations and advance scientific inquiries.
We follow the principles etched in the platform name - ROSALIND stands for resource organisms with self-governed accessibility, linkability, integrability, neutrality, and dependability.
The deployment of ROSALIND in our GWI study in recent 12 months has accelerated the pace of data experiment and analysis, removed numerous error sources, and increased research quality and productivity.
arXiv Detail & Related papers (2021-02-04T20:38:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.