Related papers: Large Language Model's Multi-Capability Alignment in Biomedical Domain

URL: http://arxiv.org/abs/2508.04278v1
Date: Wed, 06 Aug 2025 10:06:11 GMT
Title: Large Language Model's Multi-Capability Alignment in Biomedical Domain
Authors: Wentao Wu, Linqing Chen, Hanmeng Zhong, Weilei Wang
Abstract summary: BalancedBio is a framework for parameter-efficient biomedical reasoning. It addresses multi-capability integration in domain-specific AI alignment. It achieves state-of-the-art results in its parameter class. Real-world deployment yields 78% cost reduction, 23% improved diagnostic accuracy, and 89% clinician acceptance.
Score: 3.1427813443719868
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: BalancedBio is a theoretically grounded framework for parameter-efficient biomedical reasoning, addressing multi-capability integration in domain-specific AI alignment. It establishes the Biomedical Multi-Capability Convergence Theorem, proving orthogonal gradient spaces are essential to prevent capability interference for safe deployment. Key innovations include: (1) Medical Knowledge Grounded Synthetic Generation (MKGSG), extending Source2Synth with clinical workflow constraints and medical ontology validation for factual accuracy and safety; and (2) Capability Aware Group Relative Policy Optimization, deriving optimal hybrid reward weighting to maintain orthogonality in RL, using a reward model with rule-based and model-based scores adapted to biomedical tasks. Mathematical analysis proves Pareto-optimal convergence, preserving performance across capabilities. It achieves state-of-the-art results in its parameter class: domain expertise (80.95% BIOMED-MMLU, +15.32% over baseline), reasoning (61.94%, +7.75%), instruction following (67.95%, +6.44%), and integration (86.7%, +18.5%). Theoretical safety guarantees include bounds on capability preservation and clinical accuracy. Real-world deployment yields 78% cost reduction, 23% improved diagnostic accuracy, and 89% clinician acceptance. This work provides a principled methodology for biomedical AI alignment, enabling efficient reasoning with essential safety and reliability, with the 0.5B model version to be released.
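The abstract mentions a reward that combines rule-based and model-based scores inside a Group Relative Policy Optimization loop, but it does not give the formulas. The following is only a minimal sketch of the general idea, not the authors' implementation: it assumes a simple weighted sum for the hybrid reward and the standard group-relative normalization (reward minus group mean, divided by group standard deviation). The names `hybrid_reward`, `group_relative_advantages`, and `rule_weight`, as well as all score values, are hypothetical.

```python
from statistics import mean, pstdev


def hybrid_reward(rule_score, model_score, rule_weight=0.5):
    """Blend a rule-based and a model-based score.

    rule_weight is a hypothetical knob; the paper derives its own weighting,
    which is not reproduced here.
    """
    return rule_weight * rule_score + (1.0 - rule_weight) * model_score


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same prompt.

    Uses the population standard deviation, a choice made for this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt: (rule_score, model_score), made-up values.
scores = [(1.0, 0.8), (0.0, 0.6), (1.0, 0.4), (0.0, 0.2)]
rewards = [hybrid_reward(rule, model) for rule, model in scores]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.2f} advantage={advantage:+.3f}")
```

Responses that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value network is needed to form the baseline.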

Related papers

Addressing High Class Imbalance in Multi-Class Diabetic Retinopathy Severity Grading with Augmentation and Transfer Learning [1.5939351525664014]
This paper presents a robust deep learning framework for both binary and five-class diabetic retinopathy (DR) classification. For binary classification, our proposed model achieves a state-of-the-art accuracy of 98.9%, with a precision of 98.6%, recall of 99.3%, F1-score of 98.9%, and an AUC of 99.4%. In the more challenging five-class severity classification task, our model obtains a competitive accuracy of 84.6% and an AUC of 94.1%, outperforming several existing approaches.
arXiv Detail & Related papers (2025-07-23T01:52:27Z)
An Explainable AI-Enhanced Machine Learning Approach for Cardiovascular Disease Detection and Risk Assessment [0.0]
Heart disease remains a major global health concern. Traditional diagnostic methods fail to accurately identify and manage heart disease risks. Machine learning has the potential to significantly enhance the accuracy, efficiency, and speed of heart disease diagnosis.
arXiv Detail & Related papers (2025-07-15T10:38:38Z)
Multi-Modal Explainable Medical AI Assistant for Trustworthy Human-AI Collaboration [17.11245701879749]
Generalist Medical AI (GMAI) systems have demonstrated expert-level performance in biomedical perception tasks. Here, we present XMedGPT, a clinician-centric, multi-modal AI assistant that integrates textual and visual interpretability. We validate XMedGPT across four pillars: multi-modal interpretability, uncertainty quantification, prognostic modeling, and rigorous benchmarking.
arXiv Detail & Related papers (2025-05-11T08:32:01Z)
Toward Automated Regulatory Decision-Making: Trustworthy Medical Device Risk Classification with Multimodal Transformers and Self-Training [3.439579933384111]
A Transformer-based framework integrates textual descriptions and visual information to predict device regulatory classification. Our approach achieves up to 90.4% accuracy and 97.9% AUROC, significantly outperforming text-only (77.2%) and image-only (54.8%) baselines.
arXiv Detail & Related papers (2025-05-01T09:41:41Z)
Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization [0.06554326244334867]
This paper introduces Preferred-MedLLM-Qwen-72B, a 72B-parameter model optimized for the Japanese medical domain. We employ a two-stage fine-tuning process on the Qwen2.5-72B base model to achieve both high accuracy and stable reasoning.
arXiv Detail & Related papers (2025-04-25T05:15:31Z)
MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts [54.915060471994686]
We propose MAST-Pro, a novel framework that integrates dynamic Mixture-of-Experts (D-MoE) and knowledge-driven prompts for pan-tumor segmentation. Specifically, text and anatomical prompts provide domain-specific priors guiding tumor representation learning, while D-MoE dynamically selects experts to balance generic and tumor-specific feature learning. Experiments on multi-anatomical tumor datasets demonstrate that MAST-Pro outperforms state-of-the-art approaches, achieving up to a 5.20% average improvement while reducing trainable parameters by 91.04%, without compromising accuracy.
arXiv Detail & Related papers (2025-03-18T15:39:44Z)
Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets. The key theoretical contribution is the structural sparsity of causal connections between modalities. Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z)
Simulated patient systems are intelligent when powered by large language model-based AI agents [32.73072809937573]
We developed AIPatient, an intelligent simulated patient system powered by large language model-based AI agents. The system incorporates the Retrieval Augmented Generation framework, powered by six task-specific LLM-based AI agents for complex reasoning. For simulation reality, the system is also powered by the AIPatient KG (Knowledge Graph), built with de-identified real patient data.
arXiv Detail & Related papers (2024-09-27T17:17:15Z)
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers [48.21255861863282]
BMRetriever is a series of dense retrievers for enhancing biomedical retrieval. BMRetriever exhibits strong parameter efficiency, with the 410M variant outperforming baselines up to 11.7 times larger.
arXiv Detail & Related papers (2024-04-29T05:40:08Z)
BioFusionNet: Deep Learning-Based Survival Risk Stratification in ER+ Breast Cancer Through Multifeature and Multimodal Data Fusion [16.83901927767791]
We present BioFusionNet, a deep learning framework that fuses image-derived features with genetic and clinical data to obtain a holistic profile. Our model achieves a mean concordance index of 0.77 and a time-dependent area under the curve of 0.84, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2024-02-16T14:19:33Z)
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address limitations due to its versatility in interpreting different data types. Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty [52.03490691733464]
We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks. By leveraging subjective logic theory, we explicitly model probability and uncertainty for the problem of medical image segmentation. DEviS incorporates an uncertainty-aware filtering module, which utilizes the metric of uncertainty-calibrated error to filter reliable data.
arXiv Detail & Related papers (2023-01-01T05:02:46Z)
UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model. UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data. We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD). UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines, by up to 19% over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)