Learning Vector Quantized Shape Code for Amodal Blastomere Instance
Segmentation
- URL: http://arxiv.org/abs/2012.00985v1
- Date: Wed, 2 Dec 2020 06:17:28 GMT
- Title: Learning Vector Quantized Shape Code for Amodal Blastomere Instance
Segmentation
- Authors: Won-Dong Jang, Donglai Wei, Xingxuan Zhang, Brian Leahy, Helen Yang,
James Tompkin, Dalit Ben-Yosef, Daniel Needleman, and Hanspeter Pfister
- Abstract summary: Amodal instance segmentation aims to recover the complete silhouette of an object even when the object is not fully visible.
We propose to classify input features into intermediate shape codes and recover complete object shapes from them.
Our method would enable accurate measurement of blastomeres in in vitro fertilization (IVF) clinics.
- Score: 33.558545104711186
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Blastomere instance segmentation is important for analyzing embryo
abnormalities. To measure the accurate shapes and sizes of blastomeres, their
amodal segmentation is necessary. Amodal instance segmentation aims to recover
the complete silhouette of an object even when the object is not fully visible.
For each detected object, previous methods directly regress the target mask
from input features. However, images of an object under different amounts of
occlusion should have the same amodal mask output, which makes it harder to
train the regression model. To alleviate the problem, we propose to classify
input features into intermediate shape codes and recover complete object shapes
from them. First, we pre-train the Vector Quantized Variational Autoencoder
(VQ-VAE) model to learn these discrete shape codes from ground truth amodal
masks. Then, we incorporate the VQ-VAE model into the amodal instance
segmentation pipeline with an additional refinement module. We also detect an
occlusion map to integrate occlusion information with a backbone feature. As
such, our network faithfully detects bounding boxes of amodal objects. On an
internal embryo cell image benchmark, the proposed method outperforms previous
state-of-the-art methods. To demonstrate generalizability, we show segmentation
results on the public KINS natural image benchmark. To examine the learned
shape codes and model design choices, we perform ablation studies on a
synthetic dataset of simple overlaid shapes. Our method would enable accurate
measurement of blastomeres in in vitro fertilization (IVF) clinics, which
potentially can increase IVF success rate.
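
The shape-code idea above hinges on a vector-quantization step: an encoder feature is snapped to its nearest entry in a learned codebook, and the decoder reconstructs the amodal mask from that discrete code. Below is a minimal NumPy sketch of that nearest-code lookup; the codebook size, dimensionality, and values are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry.

    features: (N, D) array of encoder outputs.
    codebook: (K, D) array of learned shape-code embeddings.
    Returns (indices, quantized), where quantized[i] = codebook[indices[i]].
    """
    # Squared Euclidean distance between every feature and every code.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d2.argmin(axis=1)    # discrete shape codes
    quantized = codebook[indices]  # embeddings passed to the mask decoder
    return indices, quantized

# Toy example: two features quantized against a codebook of three 2-D codes.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
features = np.array([[0.9, 1.1], [-0.1, 0.2]])
idx, q = quantize(features, codebook)  # idx -> [1, 0]
```

In a full VQ-VAE the codebook is learned jointly with the encoder and decoder; it is fixed here only to show the nearest-neighbor assignment that turns continuous features into discrete shape codes.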
Related papers
- Amodal Instance Segmentation with Diffusion Shape Prior Estimation [10.064183379778388]
Amodal Instance Segmentation (AIS) presents an intriguing challenge, including the segmentation prediction of both visible and occluded parts of objects within images.
Previous methods have often relied on shape prior information gleaned from training data to enhance amodal segmentation.
Recent advancements highlight the potential of conditioned diffusion models, pretrained on extensive datasets, to generate images from latent space.
arXiv Detail & Related papers (2024-09-26T19:59:12Z)
- Sequential Amodal Segmentation via Cumulative Occlusion Learning [15.729212571002906]
A visual system must be able to segment both the visible and occluded regions of objects, while discerning their occlusion order.
We introduce a diffusion model with cumulative occlusion learning designed for sequential amodal segmentation of objects with uncertain categories.
This model iteratively refines the prediction using the cumulative mask strategy during diffusion, effectively capturing the uncertainty of invisible regions.
It is akin to the human capability for amodal perception, i.e., to decipher the spatial ordering among objects and accurately predict complete contours for occluded objects in densely layered visual scenes.
arXiv Detail & Related papers (2024-05-09T14:17:26Z)
- Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping [14.958823096408175]
Foundation models are a strong trend in deep learning and computer vision.
Here, we focus on training such an object identification model.
The key to training such a model is the centroid triplet loss (CTL), which aggregates image features to their centroids.
arXiv Detail & Related papers (2024-04-09T13:01:26Z)
- PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus [26.366299016589256]
We present a real-time method for robust estimation of multiple instances of geometric models from noisy data.
A neural network segments the input data into clusters representing potential model instances.
We demonstrate state-of-the-art performance on these as well as multiple established datasets, with inference times as small as five milliseconds per image.
arXiv Detail & Related papers (2024-01-26T14:54:56Z)
- MAP: Domain Generalization via Meta-Learning on Anatomy-Consistent Pseudo-Modalities [12.194439938007672]
We propose Meta learning on Anatomy-consistent Pseudo-modalities (MAP).
MAP improves model generalizability by learning structural features.
We evaluate our model on seven public datasets of various retinal imaging modalities.
arXiv Detail & Related papers (2023-09-03T22:56:22Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Self-Supervised Predictive Convolutional Attentive Block for Anomaly
Detection [97.93062818228015]
We propose to integrate the reconstruction-based functionality into a novel self-supervised predictive architectural building block.
Our block is equipped with a loss that minimizes the reconstruction error with respect to the masked area in the receptive field.
We demonstrate the generality of our block by integrating it into several state-of-the-art frameworks for anomaly detection on image and video.
arXiv Detail & Related papers (2021-11-17T13:30:31Z) - Inverting brain grey matter models with likelihood-free inference: a
tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z) - DAAIN: Detection of Anomalous and Adversarial Input using Normalizing
Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z) - Category Level Object Pose Estimation via Neural Analysis-by-Synthesis [64.14028598360741]
In this paper we combine a gradient-based fitting procedure with a parametric neural image synthesis module.
The image synthesis network is designed to efficiently span the pose configuration space.
We experimentally show that the method can recover orientation of objects with high accuracy from 2D images alone.
arXiv Detail & Related papers (2020-08-18T20:30:47Z) - Monocular Human Pose and Shape Reconstruction using Part Differentiable
Rendering [53.16864661460889]
Recent works have succeeded with regression-based methods that estimate parametric models directly through a deep neural network supervised by 3D ground truth.
In this paper, we introduce body segmentation as critical supervision.
To improve the reconstruction with part segmentation, we propose a part-level differentiable renderer that enables part-based models to be supervised by part segmentation.
arXiv Detail & Related papers (2020-03-24T14:25:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.