Related papers: Towards Domain Specification of Embedding Models in Medicine

Towards Domain Specification of Embedding Models in Medicine

URL: http://arxiv.org/abs/2507.19407v2
Date: Wed, 06 Aug 2025 10:08:06 GMT
Title: Towards Domain Specification of Embedding Models in Medicine
Authors: Mohammad Khodadad, Ali Shiraee Kasmaee, Mahdi Astaraki, Hamidreza Mahyar,
Abstract summary: We propose a comprehensive benchmark suite of 51 tasks spanning classification, clustering, pair classification, and retrieval modeled on the Massive Text Embedding Benchmark (MTEB)<n>Our results demonstrate that this combined approach establishes a robust evaluation framework and yields embeddings that consistently outperform state of the art alternatives in different tasks.
Score: 1.0713888959520208
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Medical text embedding models are foundational to a wide array of healthcare applications, ranging from clinical decision support and biomedical information retrieval to medical question answering, yet they remain hampered by two critical shortcomings. First, most models are trained on a narrow slice of medical and biological data, beside not being up to date in terms of methodology, making them ill suited to capture the diversity of terminology and semantics encountered in practice. Second, existing evaluations are often inadequate: even widely used benchmarks fail to generalize across the full spectrum of real world medical tasks. To address these gaps, we leverage MEDTE, a GTE model extensively fine-tuned on diverse medical corpora through self-supervised contrastive learning across multiple data sources, to deliver robust medical text embeddings. Alongside this model, we propose a comprehensive benchmark suite of 51 tasks spanning classification, clustering, pair classification, and retrieval modeled on the Massive Text Embedding Benchmark (MTEB) but tailored to the nuances of medical text. Our results demonstrate that this combined approach not only establishes a robust evaluation framework but also yields embeddings that consistently outperform state of the art alternatives in different tasks.

Related papers

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning [57.873833577058]
We build a multimodal dataset enriched with extensive medical knowledge.<n>We then introduce our medical-specialized MLLM: Lingshu.<n>Lingshu undergoes multi-stage training to embed medical expertise and enhance its task-solving capabilities.
arXiv Detail & Related papers (2025-06-08T08:47:30Z)
Repurposing Foundation Model for Generalizable Medical Time Series Classification [16.21546283978257]
FORMED is a framework for repurposing a backbone foundation model to enable highly generalizable MedTS classification on unseen datasets.<n>We evaluate FORMED on 5 diverse MedTS datasets, benchmarking against 11 Task-Specific Models (TSM) and 4 Task-Specific Adaptation (TSA) methods.<n>Our results demonstrate FORMED's dominant performance, achieving up to 35% absolute improvement in F1-score (on ADFTD dataset) over specialized baselines.
arXiv Detail & Related papers (2024-10-03T23:50:04Z)
MedViLaM: A multimodal large language model with advanced generalizability and explainability for medical data understanding and generation [40.9095393430871]
We introduce MedViLaM, a unified vision-language model towards a generalist model for medical data. MedViLaM can flexibly encode and interpret various forms of medical data, including clinical language and imaging. We present instances of zero-shot generalization to new medical concepts and tasks, effective transfer learning across different tasks, and the emergence of zero-shot medical reasoning.
arXiv Detail & Related papers (2024-09-29T12:23:10Z)
MOSMOS: Multi-organ segmentation facilitated by medical report supervision [10.396987980136602]
We propose a novel pre-training & fine-tuning framework for Multi-Organ Supervision (MOS) Specifically, we first introduce global contrastive learning to align medical image-report pairs in the pre-training stage. To remedy the discrepancy, we further leverage multi-label recognition to implicitly learn the semantic correspondence between image pixels and organ tags.
arXiv Detail & Related papers (2024-09-04T03:46:17Z)
Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges [2.1835659964186087]
This paper presents a systematic review of generative models used to synthesize various medical data types. Our study encompasses a broad array of medical data modalities and explores various generative models.
arXiv Detail & Related papers (2024-06-27T14:00:11Z)
Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed. In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset. We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
Training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology. For training, we assemble a large dataset of over 697 thousand radiology image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation. The inference of LlaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
Diagonal Hierarchical Consistency Learning for Semi-supervised Medical Image Segmentation [0.0]
We propose a novel framework for robust semi-supervised medical image segmentation using diagonal hierarchical consistency learning (DiHC-Net) It is composed of multiple sub-models with identical multi-scale architecture but with distinct sub-layers, such as up-sampling and normalisation layers. A series of experiments verifies the efficacy of our simple framework, outperforming all previous approaches on public benchmark dataset covering organ and tumour.
arXiv Detail & Related papers (2023-11-10T12:38:16Z)
C^2M-DoT: Cross-modal consistent multi-view medical report generation with domain transfer network [67.97926983664676]
We propose a cross-modal consistent multi-view medical report generation with a domain transfer network (C2M-DoT) C2M-DoT substantially outperforms state-of-the-art baselines in all metrics.
arXiv Detail & Related papers (2023-10-09T02:31:36Z)
Towards Generalist Biomedical AI [28.68106423175678]
We introduce Med-PaLM Multimodal (Med-PaLM M), our proof of concept for a generalist biomedical AI system. Med-PaLM M is a large multimodal generative model that flexibly encodes and interprets biomedical data. We conduct a radiologist evaluation of model-generated (and human) chest X-ray reports and observe encouraging performance across model scales.
arXiv Detail & Related papers (2023-07-26T17:52:22Z)
Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities. This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time. We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z)
Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification. It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations. Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.