Graph Neural Networks for Surgical Scene Segmentation
- URL: http://arxiv.org/abs/2511.16430v1
- Date: Thu, 20 Nov 2025 14:58:29 GMT
- Title: Graph Neural Networks for Surgical Scene Segmentation
- Authors: Yihan Li, Nikhil Churamani, Maria Robu, Imanol Luengo, Danail Stoyanov
- Abstract summary: We propose two segmentation models integrating Vision Transformer (ViT) feature encoders with Graph Neural Networks (GNNs) to explicitly model spatial relationships between anatomical regions. The proposed approaches achieve up to 7-8% improvement in Mean Intersection over Union (mIoU) and 6% improvement in Mean Dice (mDice) scores over state-of-the-art baselines.
- Score: 10.617051271345018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Purpose: Accurate identification of hepatocystic anatomy is critical to preventing surgical complications during laparoscopic cholecystectomy. Deep learning models often struggle with occlusions, long-range dependencies, and capturing the fine-scale geometry of rare structures. This work addresses these challenges by introducing graph-based segmentation approaches that enhance spatial and semantic understanding in surgical scene analyses. Methods: We propose two segmentation models integrating Vision Transformer (ViT) feature encoders with Graph Neural Networks (GNNs) to explicitly model spatial relationships between anatomical regions. (1) A static k-Nearest Neighbours (k-NN) graph with a Graph Convolutional Network with Initial Residual and Identity Mapping (GCNII) enables stable long-range information propagation. (2) A dynamic Differentiable Graph Generator (DGG) with a Graph Attention Network (GAT) supports adaptive topology learning. Both models are evaluated on the Endoscapes-Seg50 and CholecSeg8k benchmarks. Results: The proposed approaches achieve up to 7-8% improvement in Mean Intersection over Union (mIoU) and 6% improvement in Mean Dice (mDice) scores over state-of-the-art baselines. They produce anatomically coherent predictions, particularly on thin, rare and safety-critical structures. Conclusion: The proposed graph-based segmentation methods enhance both performance and anatomical consistency in surgical scene segmentation. By combining ViT-based global context with graph-based relational reasoning, the models improve interpretability and reliability, paving the way for safer laparoscopic and robot-assisted surgery through precise identification of critical anatomical features.
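The first model in the abstract pairs a static k-NN graph with GCNII layers. The sketch below illustrates that combination in NumPy and is not the authors' implementation. It assumes graph nodes are ViT patch embeddings (random stand-ins here), that neighbours are chosen by Euclidean distance in feature space, and that the layer follows the standard GCNII update H' = ReLU(((1 - alpha) P H + alpha H0)((1 - beta) I + beta W)). All function names and hyperparameter values (`k`, `alpha`, `lam`, layer count) are hypothetical.

```python
import numpy as np


def knn_adjacency(feats, k):
    """Symmetric k-NN adjacency from pairwise squared Euclidean distances."""
    n = feats.shape[0]
    sq = np.sum(feats ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T
    np.fill_diagonal(d2, np.inf)  # a node is never its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # make the graph undirected


def normalize_with_self_loops(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the propagation matrix P."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcnii_layer(h, h0, p, w, alpha, beta):
    """One GCNII layer: initial residual (alpha) plus identity mapping (beta)."""
    support = (1.0 - alpha) * (p @ h) + alpha * h0
    mapped = support @ ((1.0 - beta) * np.eye(w.shape[0]) + beta * w)
    return np.maximum(mapped, 0.0)  # ReLU


# Hypothetical sizes: 16 patch nodes with 8-dimensional embeddings.
rng = np.random.default_rng(0)
n_nodes, dim, k = 16, 8, 4
alpha, lam = 0.1, 0.5  # illustrative values, not taken from the paper

h0 = rng.normal(size=(n_nodes, dim))  # stand-in for ViT patch features
adj = knn_adjacency(h0, k)
p = normalize_with_self_loops(adj)

h = h0
for layer in range(1, 5):
    w = rng.normal(scale=0.1, size=(dim, dim))  # untrained random weights
    beta = np.log(lam / layer + 1.0)  # identity mapping strengthens with depth
    h = gcnii_layer(h, h0, p, w, alpha, beta)

print(h.shape)
```

The initial residual term keeps a fraction `alpha` of the input features at every layer, which is the usual argument for why GCNII can be stacked deeply without over-smoothing; that is one plausible reading of the "stable long-range information propagation" mentioned in the abstract.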
Related papers
- Prior-AttUNet: Retinal OCT Fluid Segmentation Based on Normal Anatomical Priors and Attention Gating [6.013762133627291]
This study introduces Prior-AttUNet, a segmentation model augmented with generative anatomical priors. The framework adopts a hybrid dual-path architecture that integrates a generative prior pathway with a segmentation network. The model maintains a low computational cost of 0.37 TFLOPs, striking an effective balance between segmentation precision and inference efficiency.
arXiv Detail & Related papers (2025-12-25T14:37:04Z)
- Morphology-Aware KOA Classification: Integrating Graph Priors with Vision Models [13.437469558862084]
We propose a novel framework that combines anatomical structure with radiographic features. Our approach enforces alignment between geometry-informed graph embeddings and radiographic features. Experiments on the Osteoarthritis Initiative dataset demonstrate that our approach surpasses single-modality baselines by up to 10% in accuracy.
arXiv Detail & Related papers (2025-10-20T17:20:19Z)
- Bridging Graph and State-Space Modeling for Intensive Care Unit Length of Stay Prediction [14.897714298219569]
We propose S$^2$G-Net, a novel neural architecture that unifies state-space sequence modeling with multi-view Graph Neural Networks (GNNs) for ICU LOS prediction. Experiments on the large-scale MIMIC-IV cohort dataset show that S$^2$G-Net consistently outperforms sequence models.
arXiv Detail & Related papers (2025-08-24T23:32:25Z)
- Neovascularization Segmentation via a Multilateral Interaction-Enhanced Graph Convolutional Network [48.788798029027085]
This paper proposes a novel multilateral graph convolutional interaction-enhanced CNV segmentation network (MTG-Net). MTG-Net consists of a multi-task framework and two graph-based cross-task modules: Multilateral Interaction Graph Reasoning (MIGR) and Multilateral Reinforcement Graph Reasoning (MRGR). Experimental results demonstrate that MTG-Net outperforms existing methods, achieving a Dice score of 87.21% for region segmentation and 88.12% for vessel segmentation.
arXiv Detail & Related papers (2025-08-05T08:10:19Z)
- A Graph-Based Framework for Interpretable Whole Slide Image Analysis [86.37618055724441]
We develop a framework that transforms whole-slide images into biologically-informed graph representations. Our approach builds graph nodes from tissue regions that respect natural structures, not arbitrary grids. We demonstrate strong performance on challenging cancer staging and survival prediction tasks.
arXiv Detail & Related papers (2025-03-14T20:15:04Z)
- Towards Robust Cardiac Segmentation using Graph Convolutional Networks [0.9507020058422264]
We propose a graph architecture that uses two convolutional rings based on cardiac anatomy and show that this eliminates anatomically incorrect multi-structure segmentations.
We show this predictor can detect out-of-distribution and unsuitable input images in real-time.
arXiv Detail & Related papers (2023-10-02T13:55:06Z)
- Compact & Capable: Harnessing Graph Neural Networks and Edge Convolution for Medical Image Classification [0.0]
We introduce a novel model that combines GNNs and edge convolution, leveraging the interconnectedness of RGB channel feature values to strongly represent connections between crucial graph nodes.
Our proposed model performs on par with state-of-the-art Deep Neural Networks (DNNs) but does so with 1000 times fewer parameters, resulting in reduced training time and data requirements.
arXiv Detail & Related papers (2023-07-24T13:39:21Z)
- Mixed Graph Contrastive Network for Semi-Supervised Node Classification [63.924129159538076]
We propose a novel graph contrastive learning method, termed Mixed Graph Contrastive Network (MGCN). In our method, we improve the discriminative capability of the latent embeddings by an unperturbed augmentation strategy and a correlation reduction mechanism. By combining the two settings, we extract rich supervision information from both the abundant nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
- Heterogeneous Graph Neural Networks using Self-supervised Reciprocally Contrastive Learning [102.9138736545956]
Heterogeneous graph neural network (HGNN) is a very popular technique for the modeling and analysis of heterogeneous graphs.
We develop for the first time a novel and robust heterogeneous graph contrastive learning approach, namely HGCL, which introduces two views on respective guidance of node attributes and graph topologies.
In this new approach, we adopt distinct but most suitable attribute and topology fusion mechanisms in the two views, which are conducive to mining relevant information in attributes and topologies separately.
arXiv Detail & Related papers (2022-04-30T12:57:02Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation [53.425900196763756]
We propose a segmentation refinement method based on uncertainty analysis and graph convolutional networks.
We employ the uncertainty levels of the convolutional network in a particular input volume to formulate a semi-supervised graph learning problem.
We show that our method outperforms the state-of-the-art CRF refinement method by improving the Dice score by 1% for the pancreas and 2% for the spleen.
arXiv Detail & Related papers (2020-12-06T18:55:07Z)
- Structured Landmark Detection via Topology-Adapting Deep Graph Learning [75.20602712947016]
We present a new topology-adapting deep graph learning approach for accurate anatomical facial and medical landmark detection.
The proposed method constructs graph signals leveraging both local image features and global shape features.
Experiments are conducted on three public facial image datasets (WFLW, 300W, and COFW-68) as well as three real-world X-ray medical datasets (Cephalometric (public), Hand and Pelvis).
arXiv Detail & Related papers (2020-04-17T11:55:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.