SAGE: Agentic Framework for Interpretable and Clinically Translatable Computational Pathology Biomarker Discovery
- URL: http://arxiv.org/abs/2602.00953v1
- Date: Sun, 01 Feb 2026 01:12:12 GMT
- Title: SAGE: Agentic Framework for Interpretable and Clinically Translatable Computational Pathology Biomarker Discovery
- Authors: Sahar Almahfouz Nasser, Juan Francisco Pesantez Borja, Jincheng Liu, Tanvir Hasan, Zenghan Wang, Suman Ghosh, Sandeep Manandhar, Shikhar Shiromani, Twisha Shah, Naoto Tokuyama, Anant Madabhushi
- Abstract summary: We introduce SAGE, an agentic AI system designed to identify interpretable, engineered pathology biomarkers by grounding them in biological evidence. SAGE integrates literature-anchored reasoning with multimodal data analysis to correlate image-derived features with molecular biomarkers, such as gene expression, and clinically relevant outcomes.
- Score: 0.8778472217028965
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Despite significant progress in computational pathology, many AI models remain black-box and difficult to interpret, posing a major barrier to clinical adoption due to limited transparency and explainability. This has motivated continued interest in engineered image-based biomarkers, which offer greater interpretability but are often proposed based on anecdotal evidence or fragmented prior literature rather than systematic biological validation. We introduce SAGE (Structured Agentic system for hypothesis Generation and Evaluation), an agentic AI system designed to identify interpretable, engineered pathology biomarkers by grounding them in biological evidence. SAGE integrates literature-anchored reasoning with multimodal data analysis to correlate image-derived features with molecular biomarkers, such as gene expression, and clinically relevant outcomes. By coordinating specialized agents for biological contextualization and empirical hypothesis validation, SAGE prioritizes transparent, biologically supported biomarkers and advances the clinical translation of computational pathology.
Related papers
- From Literature to Hypotheses: An AI Co-Scientist System for Biomarker-Guided Drug Combination Hypothesis Generation [4.281508114645598]
CoDHy is an interactive, human-in-the-loop system for biomarker-guided drug combination hypothesis generation in cancer research. It integrates structured biomedical databases and unstructured literature evidence into a task-specific knowledge graph. Users can configure the scientific context, inspect intermediate results, and iteratively refine hypotheses.
arXiv Detail & Related papers (2026-02-28T12:14:37Z)
- Unlocking Biomedical Insights: Hierarchical Attention Networks for High-Dimensional Data Interpretation [0.3821469577674901]
Hierarchical Attention-based Interpretable Network (HAIN) is a novel architecture that unifies multi-level attention mechanisms, dimensionality reduction, and explanation-driven loss functions. Comprehensive evaluation on The Cancer Genome Atlas dataset demonstrates that HAIN achieves a classification accuracy of 94.3%. HAIN effectively identifies biologically relevant cancer biomarkers, supporting its utility for clinical and research applications.
arXiv Detail & Related papers (2025-10-21T20:08:50Z)
- A Machine Learning Pipeline for Multiple Sclerosis Biomarker Discovery: Comparing explainable AI and Traditional Statistical Approaches [35.18016233072556]
We present a machine learning pipeline for biomarker discovery in Multiple Sclerosis (MS). After robust preprocessing, we trained an XGBoost classifier optimized via Bayesian search. Our comparison revealed both overlapping and unique biomarkers between SHAP and DEA, suggesting complementary strengths. This study highlights the value of combining explainable AI (xAI) with traditional statistical methods to gain deeper insights into disease mechanisms.
arXiv Detail & Related papers (2025-09-26T15:31:34Z)
- BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology [0.9603373981832565]
BioX-CPath is an explainable graph neural network architecture for whole slide image (WSI) classification. At its core, BioX-CPath introduces a novel Stain-Aware Attention Pooling (SAAP) module that generates biologically meaningful, stain-aware patient embeddings.
arXiv Detail & Related papers (2025-03-26T18:00:22Z)
- BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning [49.487327661584686]
We introduce BioMaze, a dataset with 5.1K complex pathway problems from real research. Our evaluation of methods such as CoT and graph-augmented reasoning shows that LLMs struggle with pathway reasoning. To address this, we propose PathSeeker, an LLM agent that enhances reasoning through interactive subgraph-based navigation.
arXiv Detail & Related papers (2025-02-23T17:38:10Z)
- Large Language Models for Bioinformatics [58.892165394487414]
This survey focuses on the evolution, classification, and distinguishing features of bioinformatics-specific language models (BioLMs). We explore the wide-ranging applications of BioLMs in critical areas such as disease diagnosis, drug discovery, and vaccine development. We identify key challenges and limitations inherent in BioLMs, including data privacy and security concerns, interpretability issues, biases in training data and model outputs, and domain adaptation complexities.
arXiv Detail & Related papers (2025-01-10T01:43:05Z)
- Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets. A key theoretical contribution is the structural sparsity of causal connections between modalities. Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z)
- Unified Representation of Genomic and Biomedical Concepts through Multi-Task, Multi-Source Contrastive Learning [45.6771125432388]
We introduce GENomic REpresentation with Language Model (GENEREL).
GENEREL is a framework designed to bridge genetic and biomedical knowledge bases.
Our experiments demonstrate GENEREL's ability to effectively capture the nuanced relationships between SNPs and clinical concepts.
arXiv Detail & Related papers (2024-10-14T04:19:52Z)
- Deep Learning Predicts Biomarker Status and Discovers Related Histomorphology Characteristics for Low-Grade Glioma [21.281553456323998]
Biomarker detection is an indispensable part of the diagnosis and treatment of low-grade glioma (LGG).
We propose an interpretable deep learning pipeline to predict the status of five biomarkers in LGG using only hematoxylin and eosin-stained whole slide images and slide-level biomarker status labels.
Our pipeline not only provides a novel approach for biomarker prediction, enhancing the applicability of molecular treatments for LGG patients, but also facilitates the discovery of new mechanisms in molecular functionality and LGG progression.
arXiv Detail & Related papers (2023-10-11T13:05:33Z)
- Tertiary Lymphoid Structures Generation through Graph-based Diffusion [54.37503714313661]
In this work, we leverage state-of-the-art graph-based diffusion models to generate biologically meaningful cell-graphs.
We show that the adopted graph diffusion model is able to accurately learn the distribution of cells in terms of their tertiary lymphoid structures (TLS) content.
arXiv Detail & Related papers (2023-10-10T14:37:17Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.