MediRound: Multi-Round Entity-Level Reasoning Segmentation in Medical Images
- URL: http://arxiv.org/abs/2511.12110v1
- Date: Sat, 15 Nov 2025 08:59:21 GMT
- Title: MediRound: Multi-Round Entity-Level Reasoning Segmentation in Medical Images
- Authors: Qinyue Tong, Ziqian Lu, Jun Liu, Rui Zuo, Zheming Lu
- Abstract summary: We introduce Multi-Round Entity-Level Medical Reasoning Segmentation (MEMR-Seg). MEMR-Seg is a new task that requires generating segmentation masks through multi-round queries with entity-level reasoning. We construct MR-MedSeg, a large-scale dataset of 177K multi-round medical segmentation dialogues, featuring entity-based reasoning across rounds.
- Score: 10.168003371332746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the progress in medical image segmentation, most existing methods remain task-specific and lack interactivity. Although recent text-prompt-based segmentation approaches enhance user-driven and reasoning-based segmentation, they remain confined to single-round dialogues and fail to perform multi-round reasoning. In this work, we introduce Multi-Round Entity-Level Medical Reasoning Segmentation (MEMR-Seg), a new task that requires generating segmentation masks through multi-round queries with entity-level reasoning. To support this task, we construct MR-MedSeg, a large-scale dataset of 177K multi-round medical segmentation dialogues, featuring entity-based reasoning across rounds. Furthermore, we propose MediRound, an effective baseline model designed for multi-round medical reasoning segmentation. To mitigate the inherent error propagation in the chain-like pipeline of multi-round segmentation, we introduce a lightweight yet effective Judgment & Correction Mechanism during model inference. Experimental results demonstrate that our method effectively addresses the MEMR-Seg task and outperforms conventional medical referring segmentation methods.
Related papers
- Leveraging Causal Reasoning Method for Explaining Medical Image Segmentation Models [15.976622378615714]
Medical image segmentation plays a vital role in clinical decision-making, enabling precise localization of lesions and guiding interventions. Current explanation techniques have primarily focused on classification tasks, leaving the segmentation domain relatively underexplored. We introduce an explanation model for segmentation tasks that employs the causal inference framework and backpropagates the average treatment effect (ATE) into a metric to determine the influence of input regions, as well as network components, on target segmentation areas.
arXiv Detail & Related papers (2026-02-24T03:26:27Z)
- TGC-Net: A Structure-Aware and Semantically-Aligned Framework for Text-Guided Medical Image Segmentation [56.09179939570486]
We propose TGC-Net, a CLIP-based framework focusing on parameter-efficient, task-specific adaptations. TGC-Net achieves state-of-the-art performance with substantially fewer trainable parameters, including notable Dice gains on challenging benchmarks.
arXiv Detail & Related papers (2025-12-24T12:06:26Z)
- Sim4Seg: Boosting Multimodal Multi-disease Medical Diagnosis Segmentation with Region-Aware Vision-Language Similarity Masks [54.00822479127598]
We introduce a medical vision-language task named Medical Diagnosis Segmentation (MDS). MDS aims to understand clinical queries for medical images and generate the corresponding segmentation masks as well as diagnostic results. We propose Sim4Seg, a novel framework that improves the performance of diagnosis segmentation.
arXiv Detail & Related papers (2025-11-10T03:22:42Z)
- Integrating Background Knowledge in Medical Semantic Segmentation with Logic Tensor Networks [1.2795501345884845]
We introduce Logic Tensor Networks (LTNs) to encode medical background knowledge using first-order logic (FOL) rules. We evaluate our method on the task of segmenting the hippocampus in brain MRI scans.
arXiv Detail & Related papers (2025-09-26T14:26:26Z)
- CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation [32.48945636401865]
We introduce a novel model named CRISP-SAM2 with CRoss-modal Interaction and Semantic Prompting based on SAM2. This model represents a promising approach to multi-organ medical segmentation guided by textual descriptions of organs. Our method begins by converting visual and textual inputs into cross-modal contextualized semantics.
arXiv Detail & Related papers (2025-06-29T07:05:27Z)
- MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models [48.24824129683951]
We introduce medical image reasoning segmentation, a novel task that aims to generate segmentation masks based on complex and implicit medical instructions. To address this, we propose MedSeg-R, an end-to-end framework that leverages the reasoning abilities of MLLMs to interpret clinical questions. It is built on two core components: 1) a global context understanding module that interprets images and comprehends complex medical instructions to generate multi-modal intermediate tokens, and 2) a pixel-level grounding module that decodes these tokens to produce precise segmentation masks.
arXiv Detail & Related papers (2025-06-12T08:13:38Z)
- MAMBO-NET: Multi-Causal Aware Modeling Backdoor-Intervention Optimization for Medical Image Segmentation Network [51.68708264694361]
Confusion factors can affect medical images, such as complex anatomical variations and imaging modality limitations. We propose a multi-causal aware modeling backdoor-intervention optimization network for medical image segmentation. Our method significantly reduces the influence of confusion factors, leading to enhanced segmentation accuracy.
arXiv Detail & Related papers (2025-05-28T01:40:10Z)
- MediSee: Reasoning-based Pixel-level Perception in Medical Images [6.405810587061276]
We introduce a novel medical vision task: Medical Reasoning Segmentation and Detection (MedSD). MedSD aims to comprehend implicit queries about medical images and generate the corresponding segmentation mask and bounding box for the target object. We propose MediSee, an effective baseline model designed for medical reasoning segmentation and detection.
arXiv Detail & Related papers (2025-04-15T09:28:53Z)
- Dynamically evolving segment anything model with continuous learning for medical image segmentation [50.92344083895528]
We introduce EvoSAM, a dynamically evolving medical image segmentation model. EvoSAM continuously accumulates new knowledge from an ever-expanding array of scenarios and tasks. Experiments conducted by surgical clinicians on blood vessel segmentation confirm that EvoSAM enhances segmentation efficiency based on user prompts.
arXiv Detail & Related papers (2025-03-08T14:37:52Z)
- Generalized Organ Segmentation by Imitating One-shot Reasoning using Anatomical Correlation [55.1248480381153]
We propose OrganNet, which learns a generalized organ concept from a set of annotated organ classes and then transfers this concept to unseen classes.
We show that OrganNet can effectively resist the wide variations in organ morphology and produce state-of-the-art results in the one-shot segmentation task.
arXiv Detail & Related papers (2021-03-30T13:41:12Z)
- Robust Medical Instrument Segmentation Challenge 2019 [56.148440125599905]
Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions.
Our challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures.
The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap.
arXiv Detail & Related papers (2020-03-23T14:35:08Z)