Semantic Video Segmentation for Intracytoplasmic Sperm Injection
Procedures
- URL: http://arxiv.org/abs/2101.01207v1
- Date: Mon, 4 Jan 2021 19:33:12 GMT
- Title: Semantic Video Segmentation for Intracytoplasmic Sperm Injection
Procedures
- Authors: Peter He, Raksha Jain, Jérôme Chambost, Céline Jacques, Cristina
Hickman
- Abstract summary: We present the first deep learning model for the analysis of intracytoplasmic sperm injection (ICSI) procedures.
We train a deep neural network to segment key objects in the videos achieving a mean IoU of 0.962, and to localize the needle tip achieving a mean pixel error of 3.793 pixels at 14 FPS on a single GPU.
- Score: 7.813460653362095
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present the first deep learning model for the analysis of intracytoplasmic
sperm injection (ICSI) procedures. Using a dataset of ICSI procedure videos, we
train a deep neural network to segment key objects in the videos achieving a
mean IoU of 0.962, and to localize the needle tip achieving a mean pixel error
of 3.793 pixels at 14 FPS on a single GPU. We further analyze the variation
between the dataset's human annotators and find the model's performance to be
comparable to human experts.
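The two metrics quoted in the abstract, mean IoU and mean pixel error, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the choice to average IoU over classes, and the choice to skip classes absent from both masks are all assumptions made for illustration.

```python
from math import hypot


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for two label masks.

    pred and target are equally sized 2D lists of integer class labels.
    Assumption: classes absent from both masks are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                inter += (p == c and t == c)
                union += (p == c or t == c)
        if union:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


def mean_pixel_error(pred_points, true_points):
    """Mean Euclidean distance, in pixels, between predicted and annotated points."""
    dists = [
        hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred_points, true_points)
    ]
    return sum(dists) / len(dists)


# Toy 2x3 masks with two classes (hypothetical data, not from the paper).
pred = [[0, 0, 1], [0, 1, 1]]
target = [[0, 0, 1], [0, 0, 1]]
print(mean_iou(pred, target, num_classes=2))

# Toy needle-tip predictions against annotations, as (x, y) pixel coordinates.
print(mean_pixel_error([(3, 4), (0, 0)], [(0, 0), (0, 0)]))
```

In the toy masks, class 0 has an IoU of 3/4 and class 1 has an IoU of 2/3, so the mean is their average. The paper may aggregate differently, for example per frame or per video.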
Related papers
- A novel open-source ultrasound dataset with deep learning benchmarks for
spinal cord injury localization and anatomical segmentation [1.02101998415327]
We present an ultrasound dataset of 10,223 brightness-mode (B-mode) images consisting of sagittal slices of porcine spinal cords.
We benchmark the performance metrics of several state-of-the-art object detection algorithms to localize the site of injury.
We evaluate the zero-shot generalization capabilities of the segmentation models on human ultrasound spinal cord images.
arXiv Detail & Related papers (2024-09-24T20:22:59Z)
- StackFLOW: Monocular Human-Object Reconstruction by Stacked Normalizing Flow with Offset [56.71580976007712]
We propose to use the Human-Object Offset between anchors which are densely sampled from the surface of human mesh and object mesh to represent human-object spatial relation.
Based on this representation, we propose Stacked Normalizing Flow (StackFLOW) to infer the posterior distribution of human-object spatial relations from the image.
During the optimization stage, we finetune the human body pose and object 6D pose by maximizing the likelihood of samples.
arXiv Detail & Related papers (2024-07-30T04:57:21Z)
- Neural Fields for 3D Tracking of Anatomy and Surgical Instruments in Monocular Laparoscopic Video Clips [1.339950379203994]
We propose a method for joint tracking of all structures simultaneously on a single 2D monocular video clip.
Due to the small size of instruments, they generally cover a small part of the image only, resulting in decreased tracking accuracy.
We evaluate tracking on video clips of laparoscopic cholecystectomies, where we find mean tracking accuracies of 92.4% for anatomical structures and 87.4% for instruments.
arXiv Detail & Related papers (2024-03-28T09:44:20Z)
- AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation [55.179287851188036]
We introduce a novel all-in-one-stage framework, AiOS, for expressive human pose and shape recovery without an additional human detection step.
We first employ a human token to probe a human location in the image and encode global features for each instance.
Then, we introduce a joint-related token to probe the human joint in the image and encode a fine-grained local feature.
arXiv Detail & Related papers (2024-03-26T17:59:23Z)
- WATUNet: A Deep Neural Network for Segmentation of Volumetric Sweep
Imaging Ultrasound [1.2903292694072621]
Volume sweep imaging (VSI) is an innovative approach that enables untrained operators to capture quality ultrasound images.
We present a novel segmentation model known as Wavelet_Attention_UNet (WATUNet).
In this model, we incorporate wavelet gates (WGs) and attention gates (AGs) between the encoder and decoder instead of a simple connection to overcome the limitations mentioned.
arXiv Detail & Related papers (2023-11-17T20:32:37Z)
- Comparative analysis of deep learning approaches for AgNOR-stained
cytology samples interpretation [52.77024349608834]
This paper provides a way to analyze argyrophilic nucleolar organizer region (AgNOR) stained slides using deep learning approaches.
Our results show that semantic segmentation using U-Net with a ResNet-18 or ResNet-34 backbone yields similar results.
The best model shows an IoU for nucleus, cluster, and satellites of 0.83, 0.92, and 0.99 respectively.
arXiv Detail & Related papers (2022-10-19T15:15:32Z)
- Differentiable Frequency-based Disentanglement for Aerial Video Action
Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z)
- Evaluation of Deep Learning Topcoders Method for Neuron
Individualization in Histological Macaque Brain Section [0.0]
We propose an ensemble Deep Learning algorithm to perform cell individualization on neurological data.
Results suggest that the proposed method successfully segments neuronal cells at both the object level and the pixel level, with an average detection accuracy of 0.93.
arXiv Detail & Related papers (2021-11-10T16:38:35Z)
- Predicting Semen Motility using three-dimensional Convolutional Neural
Networks [0.0]
We propose an improved deep learning based approach using three-dimensional convolutional neural networks to predict sperm motility from microscopic videos of the semen sample.
Our models indicate that deep learning based automatic semen analysis may become a valuable and effective tool in fertility and IVF labs.
arXiv Detail & Related papers (2021-01-08T07:38:52Z)
- TSGCNet: Discriminative Geometric Feature Learning with Two-Stream
GraphConvolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes.
We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
arXiv Detail & Related papers (2020-12-26T08:02:56Z)
- Appearance Learning for Image-based Motion Estimation in Tomography [60.980769164955454]
In tomographic imaging, anatomical structures are reconstructed by applying a pseudo-inverse forward model to acquired signals.
Patient motion corrupts the geometry alignment in the reconstruction process resulting in motion artifacts.
We propose an appearance learning approach recognizing the structures of rigid motion independently from the scanned object.
arXiv Detail & Related papers (2020-06-18T09:49:11Z)