Building a Relation Extraction Baseline for Gene-Disease Associations: A
Reproducibility Study
- URL: http://arxiv.org/abs/2207.06226v1
- Date: Mon, 4 Jul 2022 08:19:43 GMT
- Title: Building a Relation Extraction Baseline for Gene-Disease Associations: A
Reproducibility Study
- Authors: Laura Menotti
- Abstract summary: We reproduce DEXTER, a system to automatically extract Gene-Disease Associations from biomedical abstracts.
The goal is to provide a benchmark for future works regarding Relation Extraction (RE).
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Reproducibility is an important task in scientific research. It is crucial
for researchers to compare newly developed systems with the state-of-the-art to
assess whether they made a breakthrough. However, previous works may not be
immediately reproducible, for example due to the lack of source code. In this
work we reproduce DEXTER, a system to automatically extract Gene-Disease
Associations (GDAs) from biomedical abstracts. The goal is to provide a
benchmark for future works regarding Relation Extraction (RE), enabling
researchers to test and compare their results.
Related papers
- Towards an AI co-scientist [48.11351101913404]
We introduce an AI co-scientist, a multi-agent system built on Gemini 2.0.
The AI co-scientist is intended to help uncover new, original knowledge and to formulate demonstrably novel research hypotheses.
The system's design incorporates a generate, debate, and evolve approach to hypothesis generation, inspired by the scientific method.
arXiv Detail & Related papers (2025-02-26T06:17:13Z)
- GeneSUM: Large Language Model-based Gene Summary Extraction [20.181381276458488]
We propose GeneSUM, a two-stage automated gene summary extractor utilizing a large language model (LLM).
Our approach retrieves and eliminates redundancy of target gene literature and then fine-tunes the LLM to refine and streamline the summarization process.
arXiv Detail & Related papers (2024-12-24T04:20:43Z)
- AIGS: Generating Science from AI-Powered Automated Falsification [17.50867181053229]
We propose Baby-AIGS as a baby-step demonstration of a full-process AIGS system, which is a multi-agent system with agents in roles representing key research process.
Experiments on three tasks preliminarily show that Baby-AIGS could produce meaningful scientific discoveries, though not on par with experienced human researchers.
arXiv Detail & Related papers (2024-11-17T13:40:35Z)
- BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions.
BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model.
It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z)
- GENEVIC: GENetic data Exploration and Visualization via Intelligent interactive Console [6.786793669890866]
GENEVIC is an AI-driven chat framework that bridges the gap between genetic data generation and biomedical knowledge discovery.
It automates the analysis, retrieval, and visualization of customized domain-specific genetic information.
It integrates functionalities to generate protein interaction networks, enrich gene sets, and search scientific literature from PubMed, Google Scholar, and arXiv.
arXiv Detail & Related papers (2024-04-04T20:53:30Z)
- ARAGOG: Advanced RAG Output Grading [44.99833362998488]
Retrieval-Augmented Generation (RAG) is essential for integrating external knowledge into Large Language Model (LLM) outputs.
This study assesses various RAG methods' impacts on retrieval precision and answer similarity.
arXiv Detail & Related papers (2024-04-01T10:43:52Z)
- Predicting Parkinson's disease trajectory using clinical and functional MRI features: a reproduction and replication study [1.621204680136386]
Parkinson's disease (PD) is a common neurodegenerative disorder with a poorly understood physiopathology and no established biomarkers for the diagnosis of early stages and for prediction of disease progression.
Several neuroimaging biomarkers have been studied recently, but these are susceptible to several sources of variability related for instance to cohort selection or image analysis.
This study is part of a larger project investigating the replicability of potential neuroimaging biomarkers of PD.
arXiv Detail & Related papers (2024-02-20T13:42:50Z)
- Retrosynthesis prediction enhanced by in-silico reaction data augmentation [66.5643280109899]
We present RetroWISE, a framework that employs a base model inferred from real paired data to perform in-silico reaction generation and augmentation.
On three benchmark datasets, RetroWISE achieves the best overall performance against state-of-the-art models.
arXiv Detail & Related papers (2024-01-31T07:40:37Z)
- BioRED: A Comprehensive Biomedical Relation Extraction Dataset [6.915371362219944]
We present BioRED, a first-of-its-kind biomedical RE corpus with multiple entity types and relation pairs.
We label each relation as describing either a novel finding or previously known background knowledge, enabling automated algorithms to differentiate between novel and background information.
Our results show that while existing approaches can reach high performance on the NER task, there is much room for improvement for the RE task.
arXiv Detail & Related papers (2022-04-08T19:23:49Z)
- RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro [46.773794687622825]
We employ a sequential model optimization search applied to a deep learning model to quickly discover highly synergistic drug combinations active against a cancer cell line.
We find that the set of combinations queried by our model is enriched for highly synergistic combinations.
Remarkably, we rediscovered a synergistic drug combination that was later confirmed to be under study within clinical trials.
arXiv Detail & Related papers (2022-02-07T02:54:29Z)
- Deep metric learning improves lab of origin prediction of genetically engineered plasmids [63.05016513788047]
Genetic engineering attribution (GEA) is the ability to make sequence-lab associations.
We propose a method, based on metric learning, that ranks the most likely labs-of-origin.
We are able to extract key signatures in plasmid sequences for particular labs, allowing for an interpretable examination of the model's outputs.
arXiv Detail & Related papers (2021-11-24T16:29:03Z)
- ACRE: Abstract Causal REasoning Beyond Covariation [90.99059920286484]
We introduce the Abstract Causal REasoning dataset for systematic evaluation of current vision systems in causal induction.
Motivated by the stream of research on causal discovery in Blicket experiments, we query a visual reasoning system with the following four types of questions in either an independent scenario or an interventional scenario.
We notice that pure neural models tend towards an associative strategy under their chance-level performance, whereas neuro-symbolic combinations struggle in backward-blocking reasoning.
arXiv Detail & Related papers (2021-03-26T02:42:38Z)
- Neural networks for Anatomical Therapeutic Chemical (ATC) [83.73971067918333]
We propose combining multiple multi-label classifiers trained on distinct sets of features, including sets extracted from a Bidirectional Long Short-Term Memory Network (BiLSTM).
Experiments demonstrate the power of this approach, which is shown to outperform the best methods reported in the literature.
arXiv Detail & Related papers (2021-01-22T19:49:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.