Dyport: Dynamic Importance-based Hypothesis Generation Benchmarking Technique
- URL: http://arxiv.org/abs/2312.03303v1
- Date: Wed, 6 Dec 2023 06:07:50 GMT
- Title: Dyport: Dynamic Importance-based Hypothesis Generation Benchmarking Technique
- Authors: Ilya Tyagin, Ilya Safro
- Abstract summary: This paper presents a novel benchmarking framework Dyport for evaluating biomedical hypothesis generation systems.
We integrate knowledge from curated databases into a dynamic graph, accompanied by a method to quantify discovery importance.
Being flexible, our benchmarking system is designed for broad application in hypothesis generation quality verification.
- Score: 2.0077755400451855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel benchmarking framework Dyport for evaluating
biomedical hypothesis generation systems. Utilizing curated datasets, our
approach tests these systems under realistic conditions, enhancing the
relevance of our evaluations. We integrate knowledge from the curated databases
into a dynamic graph, accompanied by a method to quantify discovery importance.
This assesses not only the accuracy of hypotheses but also their potential
impact in biomedical research, which significantly extends traditional link
prediction benchmarks. Applicability of our benchmarking process is demonstrated
on several link prediction systems applied to biomedical semantic knowledge
graphs. Being flexible, our benchmarking system is designed for broad
application in hypothesis generation quality verification, aiming to expand the
scope of scientific discovery within the biomedical research community.
Availability and implementation: The Dyport framework is fully open-source. All
code and datasets are available at: https://github.com/IlyaTyagin/Dyport
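
The abstract describes two ingredients: a graph whose edges appear over time, and an importance value attached to each discovery. The sketch below shows one way such an evaluation could be set up, and it is an illustration only, not the Dyport implementation. The names `split_by_time` and `weighted_recall_at_k`, the toy edges, and the importance values are all hypothetical; how Dyport actually builds its graph or computes importance is not stated here, so importance is simply taken as a given number per edge.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def split_by_time(timed_edges: Dict[Edge, int], cutoff: int) -> Tuple[Set[Edge], Set[Edge]]:
    """Split edges into those known up to `cutoff` and those appearing later."""
    known = {e for e, year in timed_edges.items() if year <= cutoff}
    future = {e for e, year in timed_edges.items() if year > cutoff}
    return known, future


def weighted_recall_at_k(
    ranked: List[Edge],
    future: Set[Edge],
    importance: Dict[Edge, float],
    k: int,
) -> float:
    """Share of total future-edge importance recovered in the top-k predictions."""
    total = sum(importance.get(e, 0.0) for e in future)
    if total == 0:
        return 0.0
    # Use a set so a duplicated prediction is not counted twice.
    hits = set(ranked[:k]) & future
    return sum(importance.get(e, 0.0) for e in hits) / total


# Toy data (hypothetical): each edge maps to the year it first appeared.
timed_edges = {
    ("diseaseX", "geneA"): 2010,
    ("drugB", "geneA"): 2012,
    ("diseaseX", "drugB"): 2018,
    ("drugC", "geneA"): 2019,
}
# Assumed importance scores for the edges that appear after the cutoff.
importance = {("diseaseX", "drugB"): 3.0, ("drugC", "geneA"): 1.0}

known, future = split_by_time(timed_edges, cutoff=2015)

# A hypothetical system's ranked predictions, made from `known` edges only.
ranked = [("diseaseX", "drugB"), ("diseaseX", "drugC"), ("drugC", "geneA")]

score_at_1 = weighted_recall_at_k(ranked, future, importance, k=1)
score_at_3 = weighted_recall_at_k(ranked, future, importance, k=3)
print(score_at_1, score_at_3)
```

With these toy values, a system that ranks the high-importance edge first recovers most of the available importance at k=1, whereas plain recall would treat both future edges as equally valuable.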
Related papers
- WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking [13.880278087741482]
While deep learning has revolutionized computer-aided drug discovery, the AI community has predominantly focused on model innovation.
We seek to establish a new gold standard for small molecule drug discovery benchmarking, WelQrate.
arXiv Detail & Related papers (2024-11-14T21:49:41Z)
- Causal Representation Learning from Multimodal Biological Observations [57.00712157758845]
We aim to develop flexible identification conditions for multimodal data.
We establish identifiability guarantees for each latent component, extending the subspace identification results from prior work.
Our key theoretical ingredient is the structural sparsity of the causal connections among distinct modalities.
arXiv Detail & Related papers (2024-11-10T16:40:27Z)
- AutoMIR: Effective Zero-Shot Medical Information Retrieval without Relevance Labels [19.90354530235266]
We introduce a novel approach called Self-Learning Hypothetical Document Embeddings (SL-HyDE) to tackle this issue.
SL-HyDE leverages large language models (LLMs) as generators to generate hypothetical documents based on a given query.
We present the Chinese Medical Information Retrieval Benchmark (CMIRB), a comprehensive evaluation framework grounded in real-world medical scenarios.
arXiv Detail & Related papers (2024-10-26T02:53:20Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- Towards Biologically Plausible and Private Gene Expression Data Generation [47.72947816788821]
Generative models trained with Differential Privacy (DP) are becoming increasingly prominent in the creation of synthetic data for downstream applications.
Existing literature, however, primarily focuses on basic benchmarking datasets and tends to report promising results only for elementary metrics and relatively simple data distributions.
We initiate a systematic analysis of how DP generative models perform in their natural application scenarios, specifically focusing on real-world gene expression data.
arXiv Detail & Related papers (2024-02-07T14:39:11Z)
- A large dataset curation and benchmark for drug target interaction [0.7699646945563469]
Bioactivity data plays a key role in drug discovery and repurposing.
We propose a way to standardize and represent efficiently a very large dataset curated from multiple public sources.
arXiv Detail & Related papers (2024-01-30T17:06:25Z)
- EBOCA: Evidences for BiOmedical Concepts Association Ontology [55.41644538483948]
This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and associations between them, and (ii) evidences supporting these associations.
Test data coming from a subset of DISNET and automatic association extractions from texts has been transformed to create a Knowledge Graph that can be used in real scenarios.
arXiv Detail & Related papers (2022-08-01T18:47:03Z)
- Benchmarking Graph Neural Networks [75.42159546060509]
Graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs.
For any successful field to become mainstream and reliable, benchmarks must be developed to quantify progress.
The GitHub repository has reached 1,800 stars and 339 forks, which demonstrates the utility of the proposed open-source framework.
arXiv Detail & Related papers (2020-03-02T15:58:46Z)
- AGATHA: Automatic Graph-mining And Transformer based Hypothesis generation Approach [1.7954335118363964]
We present a hypothesis generation system that can introduce data-driven insights earlier in the discovery process.
AGATHA prioritizes plausible term-pairs among entity sets, allowing us to recommend new research directions.
This system achieves best-in-class performance on an established benchmark.
arXiv Detail & Related papers (2020-02-13T17:06:47Z)