2.75D: Boosting learning by representing 3D Medical imaging to 2D
features for small data
- URL: http://arxiv.org/abs/2002.04251v3
- Date: Mon, 22 Jan 2024 20:05:23 GMT
- Title: 2.75D: Boosting learning by representing 3D Medical imaging to 2D
features for small data
- Authors: Xin Wang, Ruisheng Su, Weiyi Xie, Wenjin Wang, Yi Xu, Ritse Mann,
Jungong Han, Tao Tan
- Abstract summary: 3D convolutional neural networks (CNNs) have started to show superior performance to 2D CNNs in numerous deep learning tasks.
Applying transfer learning to 3D CNNs is challenging due to a lack of publicly available pre-trained 3D models.
In this work, we propose a novel strategic 2D representation of volumetric data, namely 2.75D.
As a result, 2D CNNs can also be used to learn volumetric information.
- Score: 54.223614679807994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In medical-data-driven learning, 3D convolutional neural networks
(CNNs) have started to show superior performance to 2D CNNs in numerous deep
learning tasks, proving the added value of 3D spatial information in feature
representation. However, the need for more training samples to reach
convergence, greater computational resources, and longer execution times makes
this approach less widely adopted. Also, applying transfer learning to 3D CNNs
is challenging due to the lack of publicly available pre-trained 3D models. To
tackle these issues, we propose a novel strategic 2D representation of
volumetric data, namely 2.75D. In this work, the spatial information of 3D
images is captured in a single 2D view by a spiral-spinning technique. As a
result, 2D CNNs can also be used to learn volumetric information. Moreover, we
can fully leverage pre-trained 2D CNNs for downstream vision problems. We also
explore a multi-view 2.75D strategy, 2.75D with 3 channels (2.75Dx3), to boost
the advantage of 2.75D. We evaluated the proposed methods
on three public datasets with different modalities or organs (Lung CT, Breast
MRI, and Prostate MRI), against their 2D, 2.5D, and 3D counterparts in
classification tasks. Results show that the proposed methods significantly
outperform their counterparts when all methods were trained from scratch on the
lung dataset. The performance gain is more pronounced with transfer learning or
in the case of limited training data. Our methods also achieved comparable
performance on the other datasets. In addition, our methods achieved a
substantial reduction in training and inference time compared with the 2.5D and
3D methods.
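To make the spiral-spinning idea concrete, below is a minimal sketch of one
plausible reading of the transform: intensity profiles are sampled along rays
cast from the volume center, with ray directions sweeping a spherical spiral,
and the profiles are stacked as columns of a single 2D image. The function name
spiral_2d_view, the number of spiral turns, and the sampling densities are
illustrative assumptions, not the paper's exact specification.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def spiral_2d_view(volume, n_rays=128, n_samples=64, turns=8):
    """Map a 3D volume to a single 2D view (hypothetical 2.75D sketch).

    Rays are cast from the volume center along directions that follow a
    spherical spiral; each ray's intensity profile becomes one column of
    the output image. `n_rays`, `n_samples`, and `turns` are assumptions.
    """
    center = (np.array(volume.shape) - 1) / 2.0
    radius = (min(volume.shape) - 1) / 2.0

    # Spherical spiral: the polar angle sweeps 0..pi while the azimuth
    # winds around `turns` times.
    t = np.linspace(0.0, 1.0, n_rays)
    theta = np.pi * t
    phi = 2.0 * np.pi * turns * t
    dirs = np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=1)              # (n_rays, 3)

    # Sample n_samples points from the center outward along each ray.
    radii = np.linspace(0.0, radius, n_samples)           # (n_samples,)
    coords = center + radii[None, :, None] * dirs[:, None, :]

    # Trilinear interpolation at the sampled coordinates, then stack the
    # per-ray profiles into a (n_samples, n_rays) image.
    values = map_coordinates(volume, coords.reshape(-1, 3).T, order=1)
    return values.reshape(n_rays, n_samples).T

# Usage: a toy CT-like volume becomes a 64x128 image that any 2D CNN
# (optionally ImageNet-pretrained) can consume.
volume = np.random.rand(64, 64, 64).astype(np.float32)
view = spiral_2d_view(volume)
print(view.shape)  # (64, 128)
```

Under this reading, the 2.75Dx3 variant would simply repeat the sampling with
three differently oriented spirals and stack the resulting views as the three
input channels of a standard RGB-pretrained 2D CNN.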
Related papers
- EmbodiedSAM: Online Segment Any 3D Thing in Real Time [61.2321497708998]
Embodied tasks require the agent to fully understand 3D scenes as it explores them.
An online, real-time, fine-grained and highly generalized 3D perception model is urgently needed.
arXiv Detail & Related papers (2024-08-21T17:57:06Z)
- PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm [114.47216525866435]
We introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representations.
PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks for the first time, demonstrating its effectiveness.
arXiv Detail & Related papers (2023-10-12T17:59:57Z)
- ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding [96.95120198412395]
We introduce a tri-modal pre-training framework that automatically generates holistic language descriptions for 3D shapes.
It only needs 3D data as input, eliminating the need for any manual 3D annotations, and is therefore scalable to large datasets.
We conduct experiments on two large-scale 3D datasets, Objaverse and ShapeNet, and augment them with tri-modal data of 3D point clouds, images, and language for training.
Experiments show that ULIP-2 demonstrates substantial benefits in three downstream tasks: zero-shot 3D classification, standard 3D classification with fine-tuning, and 3D captioning (3D-to-language generation).
arXiv Detail & Related papers (2023-05-14T23:14:09Z)
- Video Pretraining Advances 3D Deep Learning on Chest CT Tasks [63.879848037679224]
Pretraining on large natural image classification datasets has aided model development on data-scarce 2D medical tasks.
These 2D models have been surpassed by 3D models on 3D computer vision benchmarks.
We show video pretraining for 3D models can enable higher performance on smaller datasets for 3D medical tasks.
arXiv Detail & Related papers (2023-04-02T14:46:58Z)
- Super Images -- A New 2D Perspective on 3D Medical Imaging Analysis [0.0]
We present a simple yet effective 2D method to handle 3D data while efficiently embedding the 3D knowledge during training.
Our method generates a super image by stitching the slices of the 3D volume side by side (a minimal sketch of this tiling appears after this list).
It attains results equal, if not superior, to those of 3D networks while using only their 2D counterparts, reducing model complexity by around threefold.
arXiv Detail & Related papers (2022-05-05T09:59:03Z)
- Data Efficient 3D Learner via Knowledge Transferred from 2D Model [30.077342050473515]
We deal with the data scarcity challenge of 3D tasks by transferring knowledge from strong 2D models via RGB-D images.
We utilize a strong, well-trained semantic segmentation model for 2D images to augment RGB-D images with pseudo-labels.
Our method already outperforms existing state-of-the-art approaches tailored for 3D label efficiency.
arXiv Detail & Related papers (2022-03-16T09:14:44Z)
- Learning from 2D: Pixel-to-Point Knowledge Transfer for 3D Pretraining [21.878815180924832]
We present a novel 3D pretraining method by leveraging 2D networks learned from rich 2D datasets.
Our experiments show that 3D models pretrained with 2D knowledge improve performance across various real-world 3D downstream tasks.
arXiv Detail & Related papers (2021-04-10T05:40:42Z)
- Semantic Segmentation of Neuronal Bodies in Fluorescence Microscopy Using a 2D+3D CNN Training Strategy with Sparsely Annotated Data [0.0]
Two-dimensional CNNs yield good results in neuron localization but lead to inaccurate surface reconstruction.
3D CNNs would require manually annotated data on a large scale and hence considerable human effort.
We propose a two-phase strategy for training native 3D CNN models on sparse 2D annotations.
arXiv Detail & Related papers (2020-08-31T18:01:02Z)
- 3D Self-Supervised Methods for Medical Imaging [7.65168530693281]
We propose 3D versions of five different self-supervised methods, in the form of proxy tasks.
Our methods facilitate neural network feature learning from unlabeled 3D images, aiming to reduce the required cost for expert annotation.
The developed algorithms are 3D Contrastive Predictive Coding, 3D Rotation prediction, 3D Jigsaw puzzles, Relative 3D patch location, and 3D Exemplar networks.
arXiv Detail & Related papers (2020-06-06T09:56:58Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
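As a companion sketch, the slice-stitching idea behind the Super Images entry
above is easy to reproduce: the D axial slices of a (D, H, W) volume are tiled
into one large 2D grid that a 2D network can process. The function name
make_super_image and the grid width are illustrative assumptions, not the
paper's API.

```python
import numpy as np

def make_super_image(volume, cols=8):
    """Tile the axial slices of a (D, H, W) volume into one 2D grid image.

    `cols` (slices per row) is an illustrative choice, not from the paper.
    """
    d, h, w = volume.shape
    rows = int(np.ceil(d / cols))
    # Pad with empty slices so the grid is completely filled.
    padded = np.zeros((rows * cols, h, w), dtype=volume.dtype)
    padded[:d] = volume
    # (rows, cols, H, W) -> (rows*H, cols*W)
    grid = padded.reshape(rows, cols, h, w)
    return grid.transpose(0, 2, 1, 3).reshape(rows * h, cols * w)

# Usage: 64 slices of 128x128 become a single 1024x1024 "super image".
volume = np.random.rand(64, 128, 128).astype(np.float32)
print(make_super_image(volume).shape)  # (1024, 1024)
```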
This list is automatically generated from the titles and abstracts of the papers on this site.