Automated Genomic Interpretation via Concept Bottleneck Models for Medical Robotics
- URL: http://arxiv.org/abs/2510.01618v1
- Date: Thu, 02 Oct 2025 02:51:34 GMT
- Title: Automated Genomic Interpretation via Concept Bottleneck Models for Medical Robotics
- Authors: Zijun Li, Jinchang Zhang, Ming Zhang, Guoyu Lu
- Abstract summary: We propose an automated genomic interpretation module that transforms raw DNA sequences into actionable, interpretable decisions. Our framework combines Chaos Game Representation with a Concept Bottleneck Model (CBM), enforcing predictions to flow through biologically meaningful concepts. By bridging the gap between interpretable genomic modeling and automated decision-making, this work establishes a reliable foundation for robotic and clinical automation in genomic medicine.
- Score: 26.40159044419644
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an automated genomic interpretation module that transforms raw DNA sequences into actionable, interpretable decisions suitable for integration into medical automation and robotic systems. Our framework combines Chaos Game Representation (CGR) with a Concept Bottleneck Model (CBM), enforcing predictions to flow through biologically meaningful concepts such as GC content, CpG density, and k-mer motifs. To enhance reliability, we incorporate concept fidelity supervision, prior consistency alignment, KL distribution matching, and uncertainty calibration. Beyond accurate classification of HIV subtypes across both in-house and LANL datasets, our module delivers interpretable evidence that can be directly validated against biological priors. A cost-aware recommendation layer further translates predictive outputs into decision policies that balance accuracy, calibration, and clinical utility, reducing unnecessary retests and improving efficiency. Extensive experiments demonstrate that the proposed system achieves state-of-the-art classification performance, superior concept prediction fidelity, and more favorable cost-benefit trade-offs compared to existing baselines. By bridging the gap between interpretable genomic modeling and automated decision-making, this work establishes a reliable foundation for robotic and clinical automation in genomic medicine.
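The abstract names Chaos Game Representation and three sequence-level concepts (GC content, CpG density, k-mer motifs) without giving formulas. The following is a minimal sketch of how such quantities are commonly computed, not the authors' implementation. The corner assignment in `CORNERS`, the function names, the CpG normalization by number of dinucleotide positions, and the default `k=3` are all assumptions made for illustration.

```python
from collections import Counter

# Assumed corner layout for the unit square; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return Chaos Game Representation coordinates for a DNA sequence.

    Starting at the square's center, each base moves the current point
    halfway toward that base's corner. Non-ACGT symbols are skipped.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def gc_content(seq):
    """Fraction of unambiguous bases that are G or C."""
    valid = [b for b in seq.upper() if b in "ACGT"]
    if not valid:
        return 0.0
    return sum(b in "GC" for b in valid) / len(valid)


def cpg_density(seq):
    """Fraction of adjacent base pairs that are the dinucleotide CG."""
    s = seq.upper()
    if len(s) < 2:
        return 0.0
    hits = sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "CG")
    return hits / (len(s) - 1)


def kmer_counts(seq, k=3):
    """Count overlapping k-mers made only of A, C, G, T."""
    s = seq.upper()
    kmers = (s[i:i + k] for i in range(len(s) - k + 1))
    return Counter(km for km in kmers if set(km) <= set("ACGT"))
```

In a concept bottleneck setup, values like these would serve as supervision targets for the intermediate concept layer, while the CGR points (typically binned into a 2D histogram image) would be the model input.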
Related papers
- MedAD-R1: Eliciting Consistent Reasoning in Interpretible Medical Anomaly Detection via Consistency-Reinforced Policy Optimization [46.65200216642429]
We introduce MedAD-38K, the first large-scale, multi-modal, and multi-center benchmark for MedAD featuring diagnostic Chain-of-Thought (CoT) annotations alongside structured Visual Question-Answering (VQA) pairs. Our proposed model, MedAD-R1, achieves state-of-the-art (SOTA) performance on the MedAD-38K benchmark, outperforming strong baselines by more than 10%.
arXiv Detail & Related papers (2026-02-01T07:56:10Z)
- AgentScore: Autoformulation of Deployable Clinical Scoring Systems [45.88028371034407]
We introduce AgentScore, which performs semantically guided optimization in unit-weighted clinical checklists. AgentScore outperforms existing score-generation methods and achieves AUC comparable to more flexible interpretable models. On two additional externally validated tasks, AgentScore achieves higher discrimination than established guideline-based scores.
arXiv Detail & Related papers (2026-01-29T21:11:06Z)
- Enhancing Lung Cancer Treatment Outcome Prediction through Semantic Feature Engineering Using Large Language Models [5.778370321351782]
We introduce a framework that uses Large Language Models (LLMs) as Goal-oriented Knowledge Curators (GKC). GKC converts laboratory, genomic, and medication data into high-fidelity, task-aligned features. We benchmarked GKC against expert-engineered features, direct text embeddings, and an end-to-end transformer.
arXiv Detail & Related papers (2025-12-01T23:56:45Z)
- Linearized Optimal Transport for Analysis of High-Dimensional Point-Cloud and Single-Cell Data [45.87606039212519]
Single-cell technologies generate high-dimensional point clouds of cells. Each patient is represented by an irregular point cloud rather than a simple vector. We adapt the Linear Optimal Transport framework to embed irregular point clouds into a fixed-dimensional Euclidean space.
arXiv Detail & Related papers (2025-10-24T21:33:12Z)
- Unlocking Biomedical Insights: Hierarchical Attention Networks for High-Dimensional Data Interpretation [0.3821469577674901]
Hierarchical Attention-based Interpretable Network (HAIN) is a novel architecture that unifies multi-level attention mechanisms, dimensionality reduction, and explanation-driven loss functions. Comprehensive evaluation on The Cancer Genome Atlas dataset demonstrates that HAIN achieves a classification accuracy of 94.3%. HAIN effectively identifies biologically relevant cancer biomarkers, supporting its utility for clinical and research applications.
arXiv Detail & Related papers (2025-10-21T20:08:50Z)
- Interpretable Clinical Classification with Kolgomorov-Arnold Networks [70.72819760172744]
Kolmogorov-Arnold Networks (KANs) offer intrinsic interpretability through transparent, symbolic representations. KANs support built-in patient-level insights, intuitive visualizations, and nearest-patient retrieval. These results position KANs as a promising step toward trustworthy AI that clinicians can understand, audit, and act upon.
arXiv Detail & Related papers (2025-09-20T17:21:58Z)
- Machine Learning for Medicine Must Be Interpretable, Shareable, Reproducible and Accountable by Design [0.12891210250935148]
We argue that these principles should form the foundational design criteria for machine learning algorithms in medicine. We discuss how intrinsically interpretable modeling approaches can serve as powerful alternatives to opaque deep networks. We then examine accountability in model development, calling for rigorous evaluation, fairness, and uncertainty quantification.
arXiv Detail & Related papers (2025-08-22T05:23:34Z)
- Counterfactual Probabilistic Diffusion with Expert Models [44.96279296893773]
We propose a time series diffusion-based framework that incorporates guidance from imperfect expert models. Our method, ODE-Diff, bridges mechanistic and data-driven approaches, enabling more reliable and interpretable causal inference.
arXiv Detail & Related papers (2025-08-18T20:44:32Z)
- Automated SNOMED CT Concept Annotation in Clinical Text Using Bi-GRU Neural Networks [0.31457219084519]
This study introduces a neural sequence labeling approach for SNOMED CT concept recognition using a Bidirectional GRU model. We preprocess text with domain-adapted SpaCy and SciBERT-based tokenization, segmenting sentences into overlapping 19-token chunks enriched with contextual, syntactic, and morphological features. The Bi-GRU model assigns IOB tags to identify concept spans and achieves strong performance with a 90 percent F1-score on the validation set.
arXiv Detail & Related papers (2025-08-04T16:08:49Z)
- Bayesian Hybrid Machine Learning of Gallstone Risk [0.0]
Gallstone disease is a complex, multifactorial condition with significant global health burdens. We propose a hybrid machine learning framework that integrates robust variable selection with advanced interaction detection. This proposed framework not only enhances prediction but also yields actionable insights, offering a valuable support tool for medical research and decision-making.
arXiv Detail & Related papers (2025-06-17T14:19:02Z)
- A Symbolic and Statistical Learning Framework to Discover Bioprocessing Regulatory Mechanism: Cell Culture Example [2.325005809983534]
This paper introduces a symbolic and statistical learning framework to identify key regulatory mechanisms and model uncertainty. A Metropolis-adjusted Langevin algorithm with adjoint sensitivity analysis is developed for posterior exploration. An empirical study demonstrates its ability to recover missing regulatory mechanisms and improve model fidelity under data-limited conditions.
arXiv Detail & Related papers (2025-05-06T04:39:34Z)
- Self-Healing Machine Learning: A Framework for Autonomous Adaptation in Real-World Environments [50.310636905746975]
Real-world machine learning systems often encounter model performance degradation due to distributional shifts in the underlying data generating process.
Existing approaches to addressing shifts, such as concept drift adaptation, are limited by their reason-agnostic nature.
We propose self-healing machine learning (SHML) to overcome these limitations.
arXiv Detail & Related papers (2024-10-31T20:05:51Z)
- Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
- Improving Clinical Decision Support through Interpretable Machine Learning and Error Handling in Electronic Health Records [6.594072648536156]
Trust-MAPS translates clinical domain knowledge into high-dimensional, mixed-integer programming models. Trust-scores emerge as clinically meaningful features that not only boost predictive performance for clinical decision support tasks, but also lend interpretability to ML models.
arXiv Detail & Related papers (2023-08-21T15:14:49Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.