Subpixel object segmentation using wavelets and multi resolution
analysis
- URL: http://arxiv.org/abs/2110.15233v1
- Date: Thu, 28 Oct 2021 15:43:21 GMT
- Title: Subpixel object segmentation using wavelets and multi resolution
analysis
- Authors: Ray Sheombarsing, Nikita Moriakov, Jan-Jakob Sonke, Jonas Teuwen
- Abstract summary: We propose a novel deep learning framework for fast prediction of boundaries of two-dimensional simply connected domains.
The boundaries are modelled as (piecewise) smooth closed curves using wavelets and the so-called Pyramid Algorithm.
Our model demonstrates up to 5x faster inference speed compared to the U-Net, while maintaining similar performance in terms of Dice score and Hausdorff distance.
- Score: 4.970364068620608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel deep learning framework for fast prediction of boundaries
of two-dimensional simply connected domains using wavelets and Multi Resolution
Analysis (MRA). The boundaries are modelled as (piecewise) smooth closed curves
using wavelets and the so-called Pyramid Algorithm. Our network architecture is
a hybrid analog of the U-Net, where the down-sampling path is a two-dimensional
encoder with learnable filters, and the upsampling path is a one-dimensional
decoder, which builds curves up from low to high resolution levels. Any wavelet
basis induced by an MRA can be used. This flexibility allows for incorporation
of priors on the smoothness of curves. The effectiveness of the proposed method
is demonstrated by delineating boundaries of simply connected domains (organs)
in medical images using Daubechies wavelets and comparing performance with a
U-Net baseline. Our model demonstrates up to 5x faster inference speed compared
to the U-Net, while maintaining similar performance in terms of Dice score and
Hausdorff distance.
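The coarse-to-fine curve construction mentioned in the abstract can be illustrated with the classical pyramid (fast wavelet) reconstruction step. The block below is a minimal sketch, not the authors' implementation: it assumes a periodic signal (one coordinate of a closed curve), uses the standard 4-tap Daubechies filter, and the function names `analyze_level`, `synthesize_level`, `decompose` and `reconstruct` are hypothetical. In the paper the coefficients are predicted by a network; here they are obtained by decomposing a sampled circle so that the round trip can be checked.

```python
import math

# Standard 4-tap Daubechies (db2) orthonormal low-pass filter.
# The high-pass filter is its alternating flip. These are textbook values,
# chosen here for illustration; the paper allows any MRA-induced basis.
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
HIGH = [(-1) ** k * LOW[len(LOW) - 1 - k] for k in range(len(LOW))]


def analyze_level(signal):
    """One periodic analysis step: return (approximation, detail), each half as long."""
    n = len(signal)
    if n % 2 != 0:
        raise ValueError("signal length must be even")
    approx, detail = [], []
    for i in range(n // 2):
        window = [signal[(2 * i + k) % n] for k in range(len(LOW))]
        approx.append(sum(h * v for h, v in zip(LOW, window)))
        detail.append(sum(g * v for g, v in zip(HIGH, window)))
    return approx, detail


def synthesize_level(approx, detail):
    """One periodic synthesis step: double the resolution of the signal."""
    n = 2 * len(approx)
    out = [0.0] * n
    for i, (a, d) in enumerate(zip(approx, detail)):
        for k in range(len(LOW)):
            out[(2 * i + k) % n] += LOW[k] * a + HIGH[k] * d
    return out


def decompose(signal, levels):
    """Repeated analysis; details are returned ordered from coarsest to finest."""
    details = []
    for _ in range(levels):
        signal, d = analyze_level(signal)
        details.insert(0, d)
    return signal, details


def reconstruct(coarse, details):
    """Build the signal up from low to high resolution (details coarsest first)."""
    signal = list(coarse)
    for d in details:
        signal = synthesize_level(signal, d)
    return signal


if __name__ == "__main__":
    # x-coordinate of a circle sampled at 16 points, treated as a periodic signal.
    xs = [math.cos(2 * math.pi * t / 16) for t in range(16)]
    coarse, details = decompose(xs, levels=2)
    rebuilt = reconstruct(coarse, details)
    print("max reconstruction error:", max(abs(a - b) for a, b in zip(xs, rebuilt)))
```

Applying `reconstruct` separately to the x- and y-coordinates yields a closed curve. Setting the finest detail coefficients to zero gives a smoother approximation of the same curve, which is one way to picture the smoothness priors the abstract refers to.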
Related papers
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- Dynamic Frame Interpolation in Wavelet Domain [57.25341639095404]
Video frame interpolation is an important low-level computer vision task, which can increase the frame rate for a more fluent visual experience.
Existing methods have achieved great success by employing advanced motion models and synthesis networks.
WaveletVFI can reduce computation by up to 40% while maintaining similar accuracy, making it more efficient than other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-07T06:41:15Z)
- Multiscale Representation for Real-Time Anti-Aliasing Neural Rendering [84.37776381343662]
Mip-NeRF proposes a multiscale representation as a conical frustum to encode scale information.
We propose mip voxel grids (Mip-VoG), an explicit multiscale representation for real-time anti-aliasing rendering.
Our approach is the first to offer multiscale training and real-time anti-aliasing rendering simultaneously.
arXiv Detail & Related papers (2023-04-20T04:05:22Z)
- Position-Aware Relation Learning for RGB-Thermal Salient Object Detection [3.115635707192086]
We propose a position-aware relation learning network (PRLNet) for RGB-T SOD based on swin transformer.
PRLNet explores the distance and direction relationships between pixels to strengthen intra-class compactness and inter-class separation.
In addition, we constitute a pure transformer encoder-decoder network to enhance multispectral feature representation for RGB-T SOD.
arXiv Detail & Related papers (2022-09-21T07:34:30Z)
- A Point-Cloud Deep Learning Framework for Prediction of Fluid Flow Fields on Irregular Geometries [62.28265459308354]
The network learns an end-to-end mapping between spatial positions and CFD quantities.
Incompressible laminar steady flow past a cylinder with various cross-sectional shapes is considered.
The network predicts the flow fields hundreds of times faster than conventional CFD.
arXiv Detail & Related papers (2020-10-15T12:15:02Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
- Sliced Iterative Normalizing Flows [7.6146285961466]
We develop an iterative (greedy) deep learning (DL) algorithm which is able to transform an arbitrary probability distribution function (PDF) into the target PDF.
As special cases of this algorithm, we introduce two sliced iterative Normalizing Flow (SINF) models, which map from the data to the latent space (GIS) and vice versa.
arXiv Detail & Related papers (2020-07-01T18:00:04Z)
- Joint Multi-Dimension Pruning via Numerical Gradient Update [120.59697866489668]
We present joint multi-dimension pruning (abbreviated as JointPruning), an effective method of pruning a network on three crucial aspects: spatial, depth and channel simultaneously.
We show that our method is optimized collaboratively across the three dimensions in a single end-to-end training and it is more efficient than the previous exhaustive methods.
arXiv Detail & Related papers (2020-05-18T17:57:09Z)
- Scan-based Semantic Segmentation of LiDAR Point Clouds: An Experimental Study [2.6205925938720833]
State-of-the-art methods use deep neural networks to predict semantic classes for each point in a LiDAR scan.
A powerful and efficient way to process LiDAR measurements is to use two-dimensional, image-like projections.
We demonstrate various techniques to boost the performance and to improve runtime as well as memory constraints.
arXiv Detail & Related papers (2020-04-06T11:08:12Z)
- A deep learning approach for the computation of curvature in the level-set method [0.0]
We propose a strategy to estimate the mean curvature of two-dimensional implicit interfaces in the level-set method.
Our approach is based on fitting feed-forward neural networks to synthetic data sets constructed from circular interfaces immersed in uniform grids of various resolutions.
arXiv Detail & Related papers (2020-02-04T00:49:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.