Parallel 3DPIFCM Algorithm for Noisy Brain MRI Images
- URL: http://arxiv.org/abs/2002.01981v1
- Date: Wed, 5 Feb 2020 20:30:29 GMT
- Title: Parallel 3DPIFCM Algorithm for Noisy Brain MRI Images
- Authors: Arie Agranonik, Maya Herman, Mark Last
- Abstract summary: In this paper we implement the algorithm we developed in [1], called 3DPIFCM, in a parallel environment by using CUDA on a GPU.
Our results show that the parallel version of the algorithm performs up to 27x faster than the original sequential version and 68x faster than the GAIFCM algorithm.
- Score: 3.3946853660795884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we implemented the algorithm we developed in [1] called 3DPIFCM
in a parallel environment by using CUDA on a GPU. In our previous work we
introduced 3DPIFCM which performs segmentation of images in noisy conditions
and uses particle swarm optimization for finding the optimal algorithm
parameters to account for noise. This algorithm achieved state-of-the-art
segmentation accuracy when compared to FCM (Fuzzy C-Means), IFCMPSO (Improved
Fuzzy C-Means with Particle Swarm Optimization), and GAIFCM (Genetic Algorithm
Improved Fuzzy C-Means) on noisy MRI images of an adult brain.
When using a genetic algorithm or PSO (Particle Swarm Optimization) on a
single machine for optimization, we observed execution times too long for practical
clinical use. Therefore, in the current paper our goal was to speed up the
execution of 3DPIFCM by taking out parts of the algorithm and executing them as
kernels on a GPU. The algorithm was implemented using the CUDA [13] framework
from NVIDIA, and experiments were performed on a server containing 64GB RAM, 8
cores and a TITAN X GPU with 3072 SP cores and 12GB of GPU memory.
Our results show that the parallel version of the algorithm performs up to
27x faster than the original sequential version and 68x faster than the GAIFCM
algorithm. We show that the speedup of the parallel version increases as we
increase the size of the image due to better utilization of cores in the GPU.
Also, we show a speedup of up to 5x in our Brainweb experiment compared to
other generic variants such as IFCMPSO and GAIFCM.
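As a rough illustration of why this family of algorithms maps well to a GPU, the sketch below shows the membership and center updates of standard FCM on 1D pixel intensities. It is not 3DPIFCM itself (which adds a proximity-aware fitness function and PSO, as introduced in [1]), and it is written in plain Python rather than CUDA. The function names `fcm_memberships` and `fcm_centers`, the fuzzifier `m=2.0`, and the sample intensities are assumptions chosen for illustration. Each pixel's membership row depends only on that pixel and the current centers, so the outer loop is the kind of work one would expect to assign to one GPU thread per pixel.

```python
def fcm_memberships(pixels, centers, m=2.0):
    """Standard FCM membership update: one row of memberships per pixel."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    # Every iteration is independent of the others; in a GPU version this
    # loop body would plausibly become the per-pixel kernel (assumption).
    for x in pixels:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # Pixel coincides with one or more centers: split full membership.
            hits = dists.count(0.0)
            row = [1.0 / hits if d == 0.0 else 0.0 for d in dists]
        else:
            row = [1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
                   for d_k in dists]
        memberships.append(row)
    return memberships


def fcm_centers(pixels, memberships, n_clusters, m=2.0):
    """Standard FCM center update: membership-weighted mean of intensities."""
    centers = []
    for k in range(n_clusters):
        weights = [row[k] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, pixels)) / sum(weights))
    return centers


# Hypothetical sample: two dark-ish and bright-ish intensity groups.
pixels = [10.0, 12.0, 11.0, 200.0, 205.0, 198.0]
centers = [0.0, 255.0]
for _ in range(10):
    u = fcm_memberships(pixels, centers)
    centers = fcm_centers(pixels, u, n_clusters=2)
print(centers)
```

The center update is a reduction over all pixels, so a parallel version would typically need a separate reduction step after the per-pixel memberships are computed.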
Related papers
- 3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt [65.25603275491544]
We present 3DGS-LM, a new method that accelerates the reconstruction of 3D Gaussian Splatting (3DGS).
Our method is 30% faster than the original 3DGS while obtaining the same reconstruction quality.
arXiv Detail & Related papers (2024-09-19T16:31:44Z) - INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order
Gradient Computations in Implicit Neural Representation Processing [66.00729477511219]
Given a function represented as a computation graph, traditional architectures face challenges in efficiently computing its nth-order gradient.
We introduce INR-Arch, a framework that transforms the computation graph of an nth-order gradient into a hardware-optimized dataflow architecture.
We present results that demonstrate 1.8-4.8x and 1.5-3.6x speedup compared to CPU and GPU baselines respectively.
arXiv Detail & Related papers (2023-08-11T04:24:39Z) - Batch-efficient EigenDecomposition for Small and Medium Matrices [65.67315418971688]
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications.
We propose a QR-based ED method dedicated to the application scenarios of computer vision.
arXiv Detail & Related papers (2022-07-09T09:14:12Z) - Fast and High-Quality Image Denoising via Malleable Convolutions [72.18723834537494]
We present Malleable Convolution (MalleConv), as an efficient variant of dynamic convolution.
Unlike previous works, MalleConv generates a much smaller set of spatially-varying kernels from input.
We also build an efficient denoising network using MalleConv, coined as MalleNet.
arXiv Detail & Related papers (2022-01-02T18:35:20Z) - GPU-accelerated Faster Mean Shift with euclidean distance metrics [1.3507758562554621]
The mean-shift algorithm is widely used to solve clustering problems.
In previous research, we proposed a novel GPU-accelerated Faster Mean-shift algorithm.
In this study, we extend and improve the previous algorithm to handle Euclidean distance metrics.
arXiv Detail & Related papers (2021-12-27T20:18:24Z) - GPU optimization of the 3D Scale-invariant Feature Transform Algorithm
and a Novel BRIEF-inspired 3D Fast Descriptor [5.1537294207900715]
This work details a highly efficient implementation of the 3D scale-invariant feature transform (SIFT) algorithm, for the purpose of machine learning from large sets of medical image data.
The primary operations of the 3D SIFT code are implemented on a graphics processing unit (GPU), including convolution, sub-sampling, and 4D peak detection from scale-space pyramids.
The performance improvements are quantified in keypoint detection and image-to-image matching experiments, using 3D MRI human brain volumes of different people.
arXiv Detail & Related papers (2021-12-19T20:56:40Z) - RAMA: A Rapid Multicut Algorithm on GPU [23.281726932718232]
We propose a highly parallel primal-dual algorithm for the multicut (a.k.a. magnitude correlation clustering) problem.
Our algorithm produces primal solutions and dual lower bounds that estimate the distance to optimum.
We can solve very large scale benchmark problems with up to $\mathcal{O}(10^8)$ variables in a few seconds with small primal-dual gaps.
arXiv Detail & Related papers (2021-09-04T10:33:59Z) - Efficient and Generic 1D Dilated Convolution Layer for Deep Learning [52.899995651639436]
We introduce our efficient implementation of a generic 1D convolution layer covering a wide range of parameters.
It is optimized for x86 CPU architectures, in particular, for architectures containing Intel AVX-512 and AVX-512 BFloat16 instructions.
We demonstrate the performance of our optimized 1D convolution layer by utilizing it in the end-to-end neural network training with real genomics datasets.
arXiv Detail & Related papers (2021-04-16T09:54:30Z) - Increased performance in DDM analysis by calculating structure functions
through Fourier transform in time [0.0]
We present an algorithm to efficiently process a set of images according to the Differential Dynamic Microscopy analysis scheme.
The new implementation computes the DDM analysis faster, thanks to an additional Fourier transform in time instead of performing differences of signals.
Without GPU hardware acceleration and for the same set of images, we found that the new algorithm is 300 faster than the old one both running only on the CPU.
arXiv Detail & Related papers (2020-12-02T21:12:45Z) - Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z) - 3DPIFCM Novel Algorithm for Segmentation of Noisy Brain MRI Images [3.3946853660795884]
3DPIFCM is an extension of the well-known IFCM (Improved Fuzzy C-Means) algorithm.
It performs fuzzy segmentation and introduces a fitness function that is affected by proximity of the voxels.
The 3DPIFCM algorithm uses PSO (Particle Swarm Optimization) in order to optimize the fitness function.
arXiv Detail & Related papers (2020-02-05T20:48:51Z)