Simultaneous Alignment and Surface Regression Using Hybrid 2D-3D
Networks for 3D Coherent Layer Segmentation of Retinal OCT Images with Full
and Sparse Annotations
- URL: http://arxiv.org/abs/2312.01726v1
- Date: Mon, 4 Dec 2023 08:32:31 GMT
- Title: Simultaneous Alignment and Surface Regression Using Hybrid 2D-3D
Networks for 3D Coherent Layer Segmentation of Retinal OCT Images with Full
and Sparse Annotations
- Authors: Hong Liu, Dong Wei, Donghuan Lu, Xiaoying Tang, Liansheng Wang, Yefeng
Zheng
- Abstract summary: This work presents a novel framework based on hybrid 2D-3D convolutional neural networks (CNNs) to obtain continuous 3D retinal layer surfaces from OCT volumes.
Experiments on a synthetic dataset and three public clinical datasets show that our framework can effectively align the B-scans for potential motion correction.
- Score: 32.69359482975795
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Layer segmentation is important to quantitative analysis of retinal optical
coherence tomography (OCT). Recently, deep learning based methods have been
developed to automate this task and yield remarkable performance. However, due
to the large spatial gap and potential mismatch between the B-scans of an OCT
volume, all of them were based on 2D segmentation of individual B-scans, which
may lose the continuity and diagnostic information of the retinal layers in 3D
space. Besides, most of these methods required dense annotation of the OCT
volumes, which is labor-intensive and expertise-demanding. This work presents a
novel framework based on hybrid 2D-3D convolutional neural networks (CNNs) to
obtain continuous 3D retinal layer surfaces from OCT volumes, which works well
with both full and sparse annotations. The 2D features of individual B-scans
are extracted by an encoder consisting of 2D convolutions. These 2D features
are then used to produce the alignment displacement vectors and layer
segmentation by two 3D decoders coupled via a spatial transformer module. Two
losses are proposed to utilize the retinal layers' natural property of being
smooth for B-scan alignment and layer segmentation, respectively, and are the
key to the semi-supervised learning with sparse annotation. The entire
framework is trained end-to-end. To the best of our knowledge, this is the
first work that attempts 3D retinal layer segmentation in volumetric OCT images
based on CNNs. Experiments on a synthetic dataset and three public clinical
datasets show that our framework can effectively align the B-scans for
potential motion correction, and achieves superior performance to
state-of-the-art 2D deep learning methods in terms of both layer segmentation
accuracy and cross-B-scan 3D continuity in both fully and semi-supervised
settings, thus offering more clinical value than previous work.
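The two ideas the abstract leans on, aligning B-scans and exploiting the smoothness of retinal layer surfaces, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the paper predicts displacement vectors with a 3D decoder and applies them through a spatial transformer module, whereas the hypothetical `align_bscans` below only applies given integer axial shifts, and the hypothetical `surface_smoothness` is a generic squared finite-difference penalty rather than either of the paper's two losses. The array layouts `(n_bscans, depth, width)` and `(n_surfaces, n_bscans, width)` are assumptions made for the illustration.

```python
import numpy as np


def align_bscans(volume, shifts):
    """Shift each B-scan along the axial (depth) axis by an integer offset.

    volume: array of shape (n_bscans, depth, width)  (assumed layout)
    shifts: one integer per B-scan; positive moves content deeper.
    Rows shifted in from outside the image are zero-filled.
    """
    aligned = np.zeros_like(volume)
    depth = volume.shape[1]
    for i, s in enumerate(shifts):
        # Clamp so the slices below always have matching lengths.
        s = int(np.clip(s, -depth, depth))
        if s >= 0:
            aligned[i, s:, :] = volume[i, :depth - s, :]
        else:
            aligned[i, :depth + s, :] = volume[i, -s:, :]
    return aligned


def surface_smoothness(surfaces):
    """Generic smoothness penalty on surface positions (illustrative only).

    surfaces: array of shape (n_surfaces, n_bscans, width) holding the
    depth position of each layer surface at every A-scan.
    Returns the mean squared finite difference across B-scans plus the
    mean squared finite difference across A-scans within a B-scan.
    """
    d_bscan = np.diff(surfaces, axis=1)  # jumps between neighbouring B-scans
    d_ascan = np.diff(surfaces, axis=2)  # jumps between neighbouring A-scans
    return float(np.mean(d_bscan ** 2) + np.mean(d_ascan ** 2))


# Toy volume: one flat bright "surface" at depth 3 in every B-scan.
volume = np.zeros((3, 8, 4))
volume[:, 3, :] = 1.0

# Simulate inter-B-scan motion, then undo it with the opposite shifts.
misaligned = align_bscans(volume, [0, 2, -1])
restored = align_bscans(misaligned, [0, -2, 1])

# Read the surface position back as the brightest row of each A-scan.
surf_misaligned = misaligned.argmax(axis=1)[None].astype(float)
surf_restored = restored.argmax(axis=1)[None].astype(float)

loss_misaligned = surface_smoothness(surf_misaligned)
loss_restored = surface_smoothness(surf_restored)
print(loss_misaligned, loss_restored)
```

In this toy case the penalty is positive for the misaligned volume and drops to zero once the shifts are undone, which is the intuition behind using layer smoothness as a training signal when only some B-scans carry annotations.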
Related papers
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
An SPCV re-organizes a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where the pixel values correspond to the 3D coordinates of points.
arXiv Detail & Related papers (2024-03-02T08:18:57Z) - Spatiotemporal Modeling Encounters 3D Medical Image Analysis:
Slice-Shift UNet with Multi-View Fusion [0.0]
We propose a new 2D-based model dubbed Slice SHift UNet which encodes three-dimensional features at 2D CNN's complexity.
More precisely, multi-view features are collaboratively learned by performing 2D convolutions along the three planes of a volume.
The effectiveness of our approach is validated on the Multi-Modality Abdominal Multi-Organ Segmentation (AMOS) and Multi-Atlas Labeling Beyond the Cranial Vault (BTCV) datasets.
arXiv Detail & Related papers (2023-07-24T14:53:23Z) - Self-supervised learning via inter-modal reconstruction and feature
projection networks for label-efficient 3D-to-2D segmentation [4.5206601127476445]
We propose a novel convolutional neural network (CNN) and self-supervised learning (SSL) method for label-efficient 3D-to-2D segmentation.
Results on different datasets demonstrate that the proposed CNN significantly improves the state of the art in scenarios with limited labeled data by up to 8% in Dice score.
arXiv Detail & Related papers (2023-07-06T14:16:25Z) - Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction [53.93674177236367]
Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging.
Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image.
This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses.
We introduce a novel geometry-aware encoder-decoder framework to solve this problem.
arXiv Detail & Related papers (2023-03-26T14:38:42Z) - Joint Self-Supervised Image-Volume Representation Learning with
Intra-Inter Contrastive Clustering [31.52291149830299]
Self-supervised learning can overcome the lack of labeled training samples by learning feature representations from unlabeled data.
Most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes.
We propose a novel framework for unsupervised joint learning on 2D and 3D data modalities.
arXiv Detail & Related papers (2022-12-04T18:57:44Z) - Simultaneous Alignment and Surface Regression Using Hybrid 2D-3D
Networks for 3D Coherent Layer Segmentation of Retina OCT Images [33.99874168018807]
In this study, a novel framework based on hybrid 2D-3D convolutional neural networks (CNNs) is proposed to obtain continuous 3D retinal layer surfaces from OCT.
Our framework achieves superior results to state-of-the-art 2D methods in terms of both layer segmentation accuracy and cross-B-scan 3D continuity.
arXiv Detail & Related papers (2022-03-04T15:55:09Z) - Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based
Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z) - TSGCNet: Discriminative Geometric Feature Learning with Two-Stream
Graph Convolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes.
We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
arXiv Detail & Related papers (2020-12-26T08:02:56Z) - Learning Joint 2D-3D Representations for Depth Completion [90.62843376586216]
We design a simple yet effective neural network block that learns to extract joint 2D and 3D features.
Specifically, the block consists of two domain-specific sub-networks that apply 2D convolution on image pixels and continuous convolution on 3D points.
arXiv Detail & Related papers (2020-12-22T22:58:29Z) - Revisiting 3D Context Modeling with Supervised Pre-training for
Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.