CT-Flow: Orchestrating CT Interpretation Workflow with Model Context Protocol Servers
- URL: http://arxiv.org/abs/2603.00123v1
- Date: Mon, 23 Feb 2026 07:19:30 GMT
- Title: CT-Flow: Orchestrating CT Interpretation Workflow with Model Context Protocol Servers
- Authors: Yannian Gu, Xizhuo Zhang, Linjie Mu, Yongrui Yu, Zhongzhen Huang, Shaoting Zhang, Xiaofan Zhang,
- Abstract summary: Most existing approaches for 3D CT analysis largely rely on static, single-pass inference. By leveraging the Model Context Protocol (MCP), CT-Flow shifts from closed-box inference to an open, tool-aware paradigm. Built upon this, CT-Flow functions as a clinical orchestrator capable of decomposing complex natural language queries into automated tool-use sequences.
- Score: 14.499713300688555
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in Large Vision-Language Models (LVLMs) have shown strong potential for multi-modal radiological reasoning, particularly in tasks like diagnostic visual question answering (VQA) and radiology report generation. However, most existing approaches for 3D CT analysis largely rely on static, single-pass inference. In practice, clinical interpretation is a dynamic, tool-mediated workflow where radiologists iteratively review slices and use measurement, radiomics, and segmentation tools to refine findings. To bridge this gap, we propose CT-Flow, an agentic framework designed for interoperable volumetric interpretation. By leveraging the Model Context Protocol (MCP), CT-Flow shifts from closed-box inference to an open, tool-aware paradigm. We curate CT-FlowBench, the first large-scale instruction-tuning benchmark tailored for 3D CT tool-use and multi-step reasoning. Built upon this, CT-Flow functions as a clinical orchestrator capable of decomposing complex natural language queries into automated tool-use sequences. Experimental evaluations on CT-FlowBench and standard 3D VQA datasets demonstrate that CT-Flow achieves state-of-the-art performance, surpassing baseline models by 41% in diagnostic accuracy and achieving a 95% success rate in autonomous tool invocation. This work provides a scalable foundation for integrating autonomous, agentic intelligence into real-world clinical radiology.
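The abstract describes CT-Flow's core mechanism: an orchestrator that decomposes a natural-language query into an ordered sequence of tool invocations, with tools exposed via MCP. A minimal sketch of that pattern is shown below; the tool names, the stub tool bodies, and the keyword-based planner are illustrative assumptions (in CT-Flow the decomposition is performed by the LVLM, not a keyword match).

```python
# Hypothetical sketch of the tool-orchestration pattern the abstract describes:
# a registry of callable tools (as an MCP server might expose) and an
# orchestrator that maps a query to an ordered tool-call sequence.
from typing import Callable, Dict, List

TOOLS: Dict[str, Callable[[dict], dict]] = {}

def tool(name: str):
    """Register a function under a tool name, MCP-style."""
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@tool("segment_lesion")
def segment_lesion(state: dict) -> dict:
    state["mask"] = "lesion_mask"  # stand-in for a real segmentation volume
    return state

@tool("measure_volume")
def measure_volume(state: dict) -> dict:
    state["volume_mm3"] = 420.0  # stand-in for a measurement over the mask
    return state

def plan(query: str) -> List[str]:
    """Toy planner; CT-Flow would delegate this decomposition to the LVLM."""
    steps = []
    if "lesion" in query:
        steps.append("segment_lesion")
    if "size" in query or "volume" in query:
        steps.append("measure_volume")
    return steps

def run(query: str) -> dict:
    """Execute the planned tool sequence, threading state between calls."""
    state: dict = {"query": query}
    for name in plan(query):
        state = TOOLS[name](state)
    return state

result = run("What is the volume of the liver lesion?")
```

The key design point the paper argues for is that the tool set is open: because tools are discovered through MCP rather than baked into the model, new measurement or segmentation servers can be added without retraining.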
Related papers
- 3DMedAgent: Unified Perception-to-Understanding for 3D Medical Analysis [42.29123264398027]
3DMedAgent is a unified agent that enables 2D MLLMs to perform general 3D CT analysis without 3D-specific fine-tuning. Experiments across over 40 tasks demonstrate that 3DMedAgent consistently outperforms general, medical, and 3D-specific MLLMs.
arXiv Detail & Related papers (2026-02-20T08:31:26Z) - ProtoFlow: Interpretable and Robust Surgical Workflow Modeling with Learned Dynamic Scene Graph Prototypes [42.15644075070622]
ProtoFlow is a novel framework that learns dynamic graph prototypes to model complex surgical events. We evaluate our approach on the fine-grained CAT-SG dataset.
arXiv Detail & Related papers (2025-12-16T04:59:58Z) - CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis [7.57931364659531]
We introduce CTFlow, a latent flow matching transformer model conditioned on clinical reports. We use the A-VAE from FLUX to define our latent space, and rely on the CT-Clip text encoder to encode the clinical reports. We evaluate our results against state-of-the-art generative CT models, and demonstrate the superiority of our approach in terms of temporal coherence, image diversity, and text-image alignment.
arXiv Detail & Related papers (2025-08-18T12:58:21Z) - Imitating Radiological Scrolling: A Global-Local Attention Model for 3D Chest CT Volumes Multi-Label Anomaly Classification [0.0]
Multi-label classification of 3D CT scans is a challenging task due to the volumetric nature of the data and the variety of anomalies to be detected. Existing deep learning methods based on Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies effectively. We present CT-Scroll, a novel global-local attention model specifically designed to emulate the scrolling behavior of radiologists during the analysis of 3D CT scans.
arXiv Detail & Related papers (2025-03-26T15:47:50Z) - Interpretable Auto Window Setting for Deep-Learning-Based CT Analysis [0.9285295512807729]
The window setting in Computed Tomography (CT) has always been an indispensable part of the CT analysis process. We propose a plug-and-play module derived from the Tanh activation function, which is compatible with mainstream deep learning architectures. We confirm the effectiveness of the proposed method on multiple open-source datasets, yielding 10% to 200% Dice improvements on hard segmentation targets.
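The idea behind a Tanh-based window module is that a smooth saturating function can replace the hard min/max clipping of conventional CT windowing, keeping the mapping differentiable so the window center and width can be learned. The sketch below is a minimal illustration of that idea under assumed conventions (output in [0, 1], width interpreted as the ramp scale); it is not the paper's exact module.

```python
import numpy as np

def tanh_window(hu: np.ndarray, center: float, width: float) -> np.ndarray:
    """Map Hounsfield units to [0, 1] with a smooth tanh ramp around the
    window center. Unlike hard clipping, this stays differentiable
    everywhere, so center/width could be optimized end to end."""
    return 0.5 * (np.tanh((hu - center) / (width / 2.0)) + 1.0)

# Air, soft tissue, and dense bone under a soft-tissue-like window.
hu = np.array([-1000.0, 40.0, 1000.0])
out = tanh_window(hu, center=40.0, width=400.0)
```

Values at the window center map to 0.5, while intensities far outside the window saturate toward 0 or 1, mimicking conventional display windowing.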
arXiv Detail & Related papers (2025-01-07T08:15:03Z) - MvKeTR: Chest CT Report Generation with Multi-View Perception and Knowledge Enhancement [1.6355783973385114]
Multi-view perception knowledge-enhanced TansfoRmer (MvKeTR). MVPA with view-aware attention is proposed to synthesize diagnostic information from multiple anatomical views effectively. Cross-Modal Knowledge Enhancer (CMKE) is devised to retrieve the most similar reports based on the query volume.
arXiv Detail & Related papers (2024-11-27T12:58:23Z) - 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z) - Enhancing Weakly Supervised 3D Medical Image Segmentation through Probabilistic-aware Learning [47.700298779672366]
3D medical image segmentation is a challenging task with crucial implications for disease diagnosis and treatment planning. Recent advances in deep learning have significantly enhanced fully supervised medical image segmentation. We propose a novel probabilistic-aware weakly supervised learning pipeline, specifically designed for 3D medical imaging.
arXiv Detail & Related papers (2024-03-05T00:46:53Z) - AutoCT: Automated CT registration, segmentation, and quantification [0.5461938536945721]
We provide a comprehensive pipeline that integrates an end-to-end automatic preprocessing, registration, segmentation, and quantitative analysis of 3D CT scans.
The engineered pipeline enables atlas-based CT segmentation and quantification.
On a lightweight and portable software platform, AutoCT provides a new toolkit for the CT imaging community to underpin the deployment of artificial intelligence-driven applications.
arXiv Detail & Related papers (2023-10-26T21:09:47Z) - Hepatic vessel segmentation based on 3Dswin-transformer with inductive biased multi-head self-attention [46.46365941681487]
We propose a robust end-to-end vessel segmentation network called Indu BIased Multi-Head Attention Vessel Net.
We introduce the voxel-wise embedding rather than patch-wise embedding to locate precise liver vessel voxels.
On the other hand, we propose inductive biased multi-head self-attention which learns inductive biased relative positional embedding from absolute position embedding.
arXiv Detail & Related papers (2021-11-05T10:17:08Z) - Fed-Sim: Federated Simulation for Medical Imaging [131.56325440976207]
We introduce a physics-driven generative approach that consists of two learnable neural modules.
We show that our data synthesis framework improves the downstream segmentation performance on several datasets.
arXiv Detail & Related papers (2020-09-01T19:17:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.