Combining Image- and Geometric-based Deep Learning for Shape Regression:
A Comparison to Pixel-level Methods for Segmentation in Chest X-Ray
- URL: http://arxiv.org/abs/2401.07542v1
- Date: Mon, 15 Jan 2024 09:03:50 GMT
- Title: Combining Image- and Geometric-based Deep Learning for Shape Regression:
A Comparison to Pixel-level Methods for Segmentation in Chest X-Ray
- Authors: Ron Keuth, Mattias Heinrich
- Abstract summary: We propose a novel hybrid method that combines a lightweight CNN backbone with a geometric neural network (Point Transformer) for shape regression.
We include the nnU-Net as an upper baseline, which has $3.7\times$ more trainable parameters than our proposed method.
- Score: 0.07143413923310668
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: When solving a segmentation task, shape-based methods can be beneficial
compared to pixelwise classification due to geometric understanding of the
target object as shape, preventing the generation of anatomically implausible
predictions in particular for corrupted data. In this work, we propose a novel
hybrid method that combines a lightweight CNN backbone with a geometric neural
network (Point Transformer) for shape regression. Using the same CNN encoder,
the Point Transformer reaches segmentation quality on par with current
state-of-the-art convolutional decoders ($4\pm1.9$ vs $3.9\pm2.9$ error in mm
and $85\pm13$ vs $88\pm10$ Dice), but crucially, is more stable w.r.t. image
distortion, starting to outperform them at a corruption level of 30%.
Furthermore, we include the nnU-Net as an upper baseline, which has $3.7\times$
more trainable parameters than our proposed method.
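The two metrics quoted in the abstract (Dice overlap and a point error in mm) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the nested-list mask representation, and the `pixel_spacing_mm` parameter are assumptions made for illustration, and the paper's exact protocol is not given in this summary.

```python
import math


def dice_score(pred, target):
    """Dice overlap between two binary masks given as nested lists of 0/1."""
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    intersection = sum(p * t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    # Convention (assumed here): two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_point_error_mm(pred_points, target_points, pixel_spacing_mm):
    """Mean Euclidean distance between corresponding points, scaled to mm."""
    distances = [
        math.dist(p, t) * pixel_spacing_mm
        for p, t in zip(pred_points, target_points)
    ]
    return sum(distances) / len(distances)


# Toy data, invented for this example only.
pred_mask = [[0, 1, 1], [0, 1, 0]]
target_mask = [[0, 1, 0], [0, 1, 1]]
dice = dice_score(pred_mask, target_mask)

pred_pts = [(0.0, 0.0), (3.0, 4.0)]
target_pts = [(0.0, 0.0), (0.0, 0.0)]
error_mm = mean_point_error_mm(pred_pts, target_pts, pixel_spacing_mm=0.5)
```

The abstract appears to report Dice on a 0 to 100 scale, so a value from this sketch would be multiplied by 100 before comparing it with figures such as $85\pm13$.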
Related papers
- Equirectangular image construction method for standard CNNs for Semantic
Segmentation [5.5856758231015915]
We propose a methodology for converting a perspective image into an equirectangular image.
The inverse transformation of the spherical center projection and the equidistant cylindrical projection are employed.
Experiments demonstrate that an optimal value of $\phi$ for effective semantic segmentation of equirectangular images is $6\pi/16$ for standard CNNs.
arXiv Detail & Related papers (2023-10-13T14:11:33Z) - GeoTransformer: Fast and Robust Point Cloud Registration with Geometric
Transformer [63.85771838683657]
We study the problem of extracting accurate correspondences for point cloud registration.
Recent keypoint-free methods have shown great potential through bypassing the detection of repeatable keypoints.
We propose Geometric Transformer, or GeoTransformer for short, to learn geometric features for robust superpoint matching.
arXiv Detail & Related papers (2023-07-25T02:36:04Z) - Self-Supervised Learning from Non-Object Centric Images with a Geometric
Transformation Sensitive Architecture [7.825153552141346]
We propose a Geometric Transformation Sensitive Architecture to be sensitive to geometric transformations.
Our method encourages the student to be sensitive by predicting rotation and using targets that vary with those transformations.
Our approach demonstrates improved performance when using non-object-centric images as pretraining data.
arXiv Detail & Related papers (2023-04-17T06:32:37Z) - Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z) - VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for
Analysis-by-Synthesis [62.47221232706105]
We propose VoGE, which utilizes the Gaussian reconstruction kernels as volumetric primitives.
To efficiently render via VoGE, we propose an approximate closed-form solution for the volume density aggregation and a coarse-to-fine rendering strategy.
VoGE outperforms SoTA when applied to various vision tasks, e.g., object pose estimation, shape/texture fitting, and reasoning.
arXiv Detail & Related papers (2022-05-30T19:52:11Z) - Convolution-Free Medical Image Segmentation using Transformers [8.130670465411239]
We show that a different method, based entirely on self-attention between neighboring image patches, can achieve competitive or better results.
We show that the proposed model can achieve segmentation accuracies that are better than state-of-the-art CNNs on three datasets.
arXiv Detail & Related papers (2021-02-26T18:49:13Z) - A scaling hypothesis for projected entangled-pair states [0.0]
We introduce a new paradigm for scaling simulations with projected entangled-pair states (PEPS) for critical strongly-correlated systems.
We use the effective correlation length $\xi$ for inducing a collapse of data points, $f(D,\chi)=f(\xi(D,\chi))$, for arbitrary values of $D$ and the environment bond dimension $\chi$.
We test our hypothesis on the critical 3-D dimer model, the 3-D classical Ising model, and the 2-D quantum Heisenberg model.
arXiv Detail & Related papers (2021-02-05T12:48:01Z) - Masked Contrastive Representation Learning for Reinforcement Learning [202.8261654227565]
CURL, which uses contrastive learning to extract high-level features from raw pixels of individual video frames, is an efficient algorithm.
We propose a new algorithm, masked contrastive representation learning for RL, that takes the correlation among consecutive inputs into consideration.
Our method achieves consistent improvements over CURL on $14$ out of $16$ environments from DMControl suite and $21$ out of $26$ environments from Atari 2600 Games.
arXiv Detail & Related papers (2020-10-15T02:00:10Z) - Improving Network Slimming with Nonconvex Regularization [8.017631543721684]
Convolutional neural networks (CNNs) have developed to become powerful models for various computer vision tasks.
Most of the state-of-the-art CNNs cannot be deployed directly.
A straightforward approach to compressing CNNs is proposed.
arXiv Detail & Related papers (2020-10-03T01:04:02Z) - Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-view
Human Reconstruction [97.3274868990133]
Geo-PIFu is a method to recover a 3D mesh from a monocular color image of a clothed person.
We show that, by both encoding query points and constraining global shape using latent voxel features, the reconstruction we obtain for clothed human meshes exhibits less shape distortion and improved surface details compared to competing methods.
arXiv Detail & Related papers (2020-06-15T01:11:48Z) - PUGeo-Net: A Geometry-centric Network for 3D Point Cloud Upsampling [103.09504572409449]
We propose a novel deep neural network based method, called PUGeo-Net, to generate uniform dense point clouds.
Thanks to its geometry-centric nature, PUGeo-Net works well for both CAD models with sharp features and scanned models with rich geometric details.
arXiv Detail & Related papers (2020-02-24T14:13:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.