Related papers: GDKVM: Echocardiography Video Segmentation via Spatiotemporal Key-Value Memory with Gated Delta Rule

URL: http://arxiv.org/abs/2512.10252v1
Date: Thu, 11 Dec 2025 03:19:50 GMT
Title: GDKVM: Echocardiography Video Segmentation via Spatiotemporal Key-Value Memory with Gated Delta Rule
Authors: Rui Wang, Yimu Sun, Jingxing Guo, Huisi Wu, Jing Qin
Abstract summary: We introduce GDKVM, a novel architecture for echocardiography video segmentation. The model employs Linear Key-Value Association (LKVA) to effectively model inter-frame correlations, and introduces a Gated Delta Rule (GDR) to efficiently store intermediate memory states. We validated GDKVM on two mainstream echocardiography video datasets (CAMUS and EchoNet-Dynamic) and compared it with various state-of-the-art methods. Experimental results show that GDKVM outperforms existing approaches in terms of segmentation accuracy and robustness, while ensuring real-time performance.
Score: 28.526034344479935
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate segmentation of cardiac chambers in echocardiography sequences is crucial for the quantitative analysis of cardiac function, aiding in clinical diagnosis and treatment. Imaging noise, artifacts, and the deformation and motion of the heart pose challenges to segmentation algorithms. While existing methods based on convolutional neural networks, Transformers, and space-time memory networks have improved segmentation accuracy, they often struggle with the trade-off between capturing long-range spatiotemporal dependencies and maintaining computational efficiency with fine-grained feature representation. In this paper, we introduce GDKVM, a novel architecture for echocardiography video segmentation. The model employs Linear Key-Value Association (LKVA) to effectively model inter-frame correlations, and introduces a Gated Delta Rule (GDR) to efficiently store intermediate memory states. A Key-Pixel Feature Fusion (KPFF) module is designed to integrate local and global features at multiple scales, enhancing robustness against boundary blurring and noise interference. We validated GDKVM on two mainstream echocardiography video datasets (CAMUS and EchoNet-Dynamic) and compared it with various state-of-the-art methods. Experimental results show that GDKVM outperforms existing approaches in terms of segmentation accuracy and robustness, while ensuring real-time performance. Code is available at https://github.com/wangrui2025/GDKVM.
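
The abstract names a gated delta rule for storing memory states but does not give its equations. As a rough illustration only, the sketch below implements the generic gated delta rule as it commonly appears in the linear-attention literature, S_t = alpha_t * S_{t-1} (I - beta_t k_t k_t^T) + beta_t v_t k_t^T, using NumPy. The function names `gated_delta_step` and `read`, the scalar gates, and the key normalisation are assumptions made for this example; GDKVM's actual formulation may differ, and the linked repository is the authoritative source.

```python
import numpy as np

def gated_delta_step(S, k, v, alpha, beta):
    """One generic gated delta rule update of a key-value memory.

    S     : memory matrix of shape (d_v, d_k), mapping keys to values
    k     : key vector of shape (d_k,), L2-normalised inside the function
    v     : value vector of shape (d_v,)
    alpha : decay gate in [0, 1]; 0 erases the memory, 1 keeps all of it
    beta  : write strength in [0, 1]
    (Hypothetical helper for illustration, not GDKVM's code.)
    """
    k = k / (np.linalg.norm(k) + 1e-8)
    S = alpha * S                 # gate: decay the old associations
    v_old = S @ k                 # value the memory currently returns for k
    # Delta rule: move the stored value for k towards v by a fraction beta.
    return S + beta * np.outer(v - v_old, k)

def read(S, q):
    """Retrieve the value associated with query q (normalised like the keys)."""
    q = q / (np.linalg.norm(q) + 1e-8)
    return S @ q

# Usage: write one association, then overwrite it with a new value.
rng = np.random.default_rng(0)
d_k, d_v = 8, 4
S = np.zeros((d_v, d_k))
k = rng.normal(size=d_k)
v1, v2 = rng.normal(size=d_v), rng.normal(size=d_v)

S = gated_delta_step(S, k, v1, alpha=1.0, beta=1.0)
first_read = read(S, k)           # close to v1
S = gated_delta_step(S, k, v2, alpha=1.0, beta=1.0)
second_read = read(S, k)          # close to v2: the old value was replaced
```

The memory stays a fixed-size matrix regardless of how many frames have been written, which is the usual motivation for this family of updates over storing every past frame.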

Related papers

Point Tracking as a Temporal Cue for Robust Myocardial Segmentation in Echocardiography Videos [2.7509305461575875]
Myocardium segmentation in echocardiography videos is a challenging task due to low contrast, noise, and anatomical variability. Traditional deep learning models either process frames independently, ignoring temporal information, or rely on memory-based feature propagation. We propose Point-Seg, a transformer-based segmentation framework that integrates point tracking as a temporal cue.
arXiv Detail & Related papers (2026-01-14T06:23:36Z)
A DyL-Unet framework based on dynamic learning for Temporally Consistent Echocardiographic Segmentation [0.328418927821443]
We propose DyL-UNet, a dynamic learning-based temporal consistency U-Net segmentation architecture. The framework constructs an Echo-Dynamics Graph (EDG) through dynamic learning to extract dynamic information from videos. Experiments on the CAMUS and EchoNet-Dynamic datasets demonstrate that DyL-UNet maintains segmentation accuracy comparable to existing methods.
arXiv Detail & Related papers (2025-09-23T14:17:01Z)
CardiacCLIP: Video-based CLIP Adaptation for LVEF Prediction in a Few-shot Manner [14.429336783145644]
Left ventricular ejection fraction (LVEF) serves as a key indicator of heart function. Existing LVEF estimation methods depend on large-scale annotated video datasets. We propose CardiacCLIP, a video-based framework that enhances LVEF prediction through attention-based frame aggregation and multi-resolution input scaling.
arXiv Detail & Related papers (2025-09-21T12:52:08Z)
Echo-DND: A dual noise diffusion model for robust and precise left ventricle segmentation in echocardiography [0.6749750044497732]
This paper introduces Echo-DND, a novel dual-noise diffusion model for echocardiogram segmentation. The model's performance was rigorously validated on the CAMUS and EchoNet-Dynamic datasets. It achieves high Dice scores of 0.962 and 0.939 on these datasets, respectively.
arXiv Detail & Related papers (2025-06-18T06:27:08Z)
Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis. We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
Semantic-aware Temporal Channel-wise Attention for Cardiac Function Assessment [69.02116920364311]
Existing video-based methods do not pay much attention to the left ventricular region, or to the left ventricular changes caused by motion. We propose a semi-supervised auxiliary learning paradigm with a left ventricular segmentation task, which contributes to the representation learning for the left ventricular region. Our approach achieves state-of-the-art performance on the Stanford dataset with an improvement of 0.22 MAE, 0.26 RMSE, and 1.9% $R^2$.
arXiv Detail & Related papers (2023-10-09T05:57:01Z)
ARHNet: Adaptive Region Harmonization for Lesion-aware Augmentation to Improve Segmentation Performance [61.04246102067351]
We propose a foreground harmonization framework (ARHNet) to tackle intensity disparities and make synthetic images look more realistic. We demonstrate the efficacy of our method in improving the segmentation performance using real and synthetic images.
arXiv Detail & Related papers (2023-07-02T10:39:29Z)
Echocardiography Segmentation Using Neural ODE-based Diffeomorphic Registration Field [0.0]
We present a novel method for diffeomorphic image registration using neural ordinary differential equations (Neural ODE). The proposed method, Echo-ODE, introduces several key improvements compared to the previous state-of-the-art. The results show that our method surpasses the previous state-of-the-art in multiple aspects.
arXiv Detail & Related papers (2023-06-16T08:37:27Z)
Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network. We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module. Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z)
Light-weight spatio-temporal graphs for segmentation and ejection fraction prediction in cardiac ultrasound [5.597394612661975]
We propose an automated method called EchoGraphs for predicting ejection fraction and segmenting the left ventricle. Models for direct coordinate regression based on Graph Convolutional Networks (GCNs) are used to detect the keypoints. Compared to semantic segmentation, GCNs show accurate segmentation and improvements in robustness and inference runtime.
arXiv Detail & Related papers (2022-07-06T10:03:44Z)
Efficient Global-Local Memory for Real-time Instrument Segmentation of Robotic Surgical Video [53.14186293442669]
We identify two important clues for surgical instrument perception, including local temporal dependency from adjacent frames and global semantic correlation in long-range duration. We propose a novel dual-memory network (DMNet) to relate both global and local-temporal knowledge. Our method largely outperforms the state-of-the-art works on segmentation accuracy while maintaining a real-time speed.
arXiv Detail & Related papers (2021-09-28T10:10:14Z)
Weakly-supervised Learning For Catheter Segmentation in 3D Frustum Ultrasound [74.22397862400177]
We propose a novel Frustum ultrasound based catheter segmentation method. The proposed method achieved state-of-the-art performance with an efficiency of 0.25 seconds per volume.
arXiv Detail & Related papers (2020-10-19T13:56:22Z)
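
The Echo-DND entry above reports results as Dice scores. For readers unfamiliar with the metric, here is a minimal sketch of the standard Dice coefficient for binary masks, 2|A ∩ B| / (|A| + |B|), written with NumPy. The function name `dice_score` and the choice to return 1.0 when both masks are empty are assumptions for this example; the listed papers may average over frames, classes, or patients in ways not shown here.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape.

    Returns a value in [0, 1]; 1 means the masks match exactly.
    Two empty masks are treated as a perfect match (a convention
    assumed here, not taken from any of the listed papers).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)

# Usage: the prediction covers two pixels, the ground truth one of them.
pred = np.array([[1, 1],
                 [0, 0]])
target = np.array([[1, 0],
                   [0, 0]])
score = dice_score(pred, target)  # 2 * 1 / (2 + 1)
```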

This list is automatically generated from the titles and abstracts of the papers on this site.