CT-Flow: Orchestrating CT Interpretation Workflow with Model Context Protocol Servers
- URL: http://arxiv.org/abs/2603.00123v1
- Date: Mon, 23 Feb 2026 07:19:30 GMT
- Title: CT-Flow: Orchestrating CT Interpretation Workflow with Model Context Protocol Servers
- Authors: Yannian Gu, Xizhuo Zhang, Linjie Mu, Yongrui Yu, Zhongzhen Huang, Shaoting Zhang, Xiaofan Zhang,
- Abstract summary: Most existing approaches for 3D CT analysis largely rely on static, single-pass inference. By leveraging the Model Context Protocol (MCP), CT-Flow shifts from closed-box inference to an open, tool-aware paradigm. Built upon this, CT-Flow functions as a clinical orchestrator capable of decomposing complex natural language queries into automated tool-use sequences.
- Score: 14.499713300688555
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in Large Vision-Language Models (LVLMs) have shown strong potential for multi-modal radiological reasoning, particularly in tasks like diagnostic visual question answering (VQA) and radiology report generation. However, most existing approaches for 3D CT analysis largely rely on static, single-pass inference. In practice, clinical interpretation is a dynamic, tool-mediated workflow where radiologists iteratively review slices and use measurement, radiomics, and segmentation tools to refine findings. To bridge this gap, we propose CT-Flow, an agentic framework designed for interoperable volumetric interpretation. By leveraging the Model Context Protocol (MCP), CT-Flow shifts from closed-box inference to an open, tool-aware paradigm. We curate CT-FlowBench, the first large-scale instruction-tuning benchmark tailored for 3D CT tool-use and multi-step reasoning. Built upon this, CT-Flow functions as a clinical orchestrator capable of decomposing complex natural language queries into automated tool-use sequences. Experimental evaluations on CT-FlowBench and standard 3D VQA datasets demonstrate that CT-Flow achieves state-of-the-art performance, surpassing baseline models by 41% in diagnostic accuracy and achieving a 95% success rate in autonomous tool invocation. This work provides a scalable foundation for integrating autonomous, agentic intelligence into real-world clinical radiology.
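The abstract describes CT-Flow's core mechanism: an orchestrator that decomposes a natural-language query into an ordered sequence of tool invocations, with tools exposed via MCP. A minimal sketch of that pattern is shown below; the tool names, the stub tool bodies, and the keyword-based planner are illustrative assumptions (in CT-Flow the decomposition is performed by the LVLM, not a keyword match).

```python
# Hypothetical sketch of the tool-orchestration pattern the abstract describes:
# a registry of callable tools (as an MCP server might expose) and an
# orchestrator that maps a query to an ordered tool-call sequence.
from typing import Callable, Dict, List

TOOLS: Dict[str, Callable[[dict], dict]] = {}

def tool(name: str):
    """Register a function under a tool name, MCP-style."""
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@tool("segment_lesion")
def segment_lesion(state: dict) -> dict:
    state["mask"] = "lesion_mask"  # stand-in for a real segmentation volume
    return state

@tool("measure_volume")
def measure_volume(state: dict) -> dict:
    state["volume_mm3"] = 420.0  # stand-in for a measurement over the mask
    return state

def plan(query: str) -> List[str]:
    """Toy planner; CT-Flow would delegate this decomposition to the LVLM."""
    steps = []
    if "lesion" in query:
        steps.append("segment_lesion")
    if "size" in query or "volume" in query:
        steps.append("measure_volume")
    return steps

def run(query: str) -> dict:
    """Execute the planned tool sequence, threading state between calls."""
    state: dict = {"query": query}
    for name in plan(query):
        state = TOOLS[name](state)
    return state

result = run("What is the volume of the liver lesion?")
```

The key design point the paper argues for is that the tool set is open: because tools are discovered through MCP rather than baked into the model, new measurement or segmentation servers can be added without retraining.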
Related papers
- 3DMedAgent: Unified Perception-to-Understanding for 3D Medical Analysis [42.29123264398027]
3DMedAgent is a unified agent that enables 2D MLLMs to perform general 3D CT analysis without 3D-specific fine-tuning. Experiments across over 40 tasks demonstrate that 3DMedAgent consistently outperforms general, medical, and 3D-specific MLLMs.
arXiv Detail & Related papers (2026-02-20T08:31:26Z) - ProtoFlow: Interpretable and Robust Surgical Workflow Modeling with Learned Dynamic Scene Graph Prototypes [42.15644075070622]
ProtoFlow is a novel framework that learns dynamic graph prototypes to model complex surgical events. We evaluate our approach on the fine-grained CAT-SG dataset.
arXiv Detail & Related papers (2025-12-16T04:59:58Z) - CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis [7.57931364659531]
We introduce CTFlow, a latent flow matching transformer model conditioned on clinical reports. We use the A-VAE from FLUX to define our latent space, and rely on the CT-Clip text encoder to encode the clinical reports. We evaluate our results against state-of-the-art generative CT models, and demonstrate the superiority of our approach in terms of temporal coherence, image diversity, and text-image alignment.
arXiv Detail & Related papers (2025-08-18T12:58:21Z) - Imitating Radiological Scrolling: A Global-Local Attention Model for 3D Chest CT Volumes Multi-Label Anomaly Classification [0.0]
Multi-label classification of 3D CT scans is a challenging task due to the volumetric nature of the data and the variety of anomalies to be detected. Existing deep learning methods based on Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies effectively. We present CT-Scroll, a novel global-local attention model specifically designed to emulate the scrolling behavior of radiologists during the analysis of 3D CT scans.
arXiv Detail & Related papers (2025-03-26T15:47:50Z) - Interpretable Auto Window Setting for Deep-Learning-Based CT Analysis [0.9285295512807729]
The window setting in Computed Tomography (CT) has always been an indispensable part of the CT analysis process. We propose a plug-and-play module derived from the Tanh activation function, which is compatible with mainstream deep learning architectures. We confirm the effectiveness of the proposed method on multiple open-source datasets, yielding 10% to 200% Dice improvements on hard segmentation targets.
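The idea behind a Tanh-based window module is that a smooth saturating function can replace the hard min/max clipping of conventional CT windowing, keeping the mapping differentiable so the window center and width can be learned. The sketch below is a minimal illustration of that idea under assumed conventions (output in [0, 1], width interpreted as the ramp scale); it is not the paper's exact module.

```python
import numpy as np

def tanh_window(hu: np.ndarray, center: float, width: float) -> np.ndarray:
    """Map Hounsfield units to [0, 1] with a smooth tanh ramp around the
    window center. Unlike hard clipping, this stays differentiable
    everywhere, so center/width could be optimized end to end."""
    return 0.5 * (np.tanh((hu - center) / (width / 2.0)) + 1.0)

# Air, soft tissue, and dense bone under a soft-tissue-like window.
hu = np.array([-1000.0, 40.0, 1000.0])
out = tanh_window(hu, center=40.0, width=400.0)
```

Values at the window center map to 0.5, while intensities far outside the window saturate toward 0 or 1, mimicking conventional display windowing.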
arXiv Detail & Related papers (2025-01-07T08:15:03Z) - MvKeTR: Chest CT Report Generation with Multi-View Perception and Knowledge Enhancement [1.6355783973385114]
Multi-view perception knowledge-enhanced TansfoRmer (MvKeTR). MVPA with view-aware attention is proposed to synthesize diagnostic information from multiple anatomical views effectively. Cross-Modal Knowledge Enhancer (CMKE) is devised to retrieve the most similar reports based on the query volume.
arXiv Detail & Related papers (2024-11-27T12:58:23Z) - 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z) - Enhancing Weakly Supervised 3D Medical Image Segmentation through Probabilistic-aware Learning [47.700298779672366]
3D medical image segmentation is a challenging task with crucial implications for disease diagnosis and treatment planning. Recent advances in deep learning have significantly enhanced fully supervised medical image segmentation. We propose a novel probabilistic-aware weakly supervised learning pipeline, specifically designed for 3D medical imaging.
arXiv Detail & Related papers (2024-03-05T00:46:53Z) - AutoCT: Automated CT registration, segmentation, and quantification [0.5461938536945721]
We provide a comprehensive pipeline that integrates an end-to-end automatic preprocessing, registration, segmentation, and quantitative analysis of 3D CT scans.
The engineered pipeline enables atlas-based CT segmentation and quantification.
On a lightweight and portable software platform, AutoCT provides a new toolkit for the CT imaging community to underpin the deployment of artificial intelligence-driven applications.
arXiv Detail & Related papers (2023-10-26T21:09:47Z) - Hepatic vessel segmentation based on 3Dswin-transformer with inductive biased multi-head self-attention [46.46365941681487]
We propose a robust end-to-end vessel segmentation network called Indu BIased Multi-Head Attention Vessel Net.
We introduce the voxel-wise embedding rather than patch-wise embedding to locate precise liver vessel voxels.
On the other hand, we propose inductive biased multi-head self-attention which learns inductive biased relative positional embedding from absolute position embedding.
arXiv Detail & Related papers (2021-11-05T10:17:08Z) - Fed-Sim: Federated Simulation for Medical Imaging [131.56325440976207]
We introduce a physics-driven generative approach that consists of two learnable neural modules.
We show that our data synthesis framework improves the downstream segmentation performance on several datasets.
arXiv Detail & Related papers (2020-09-01T19:17:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.