Amodal Segmentation through Out-of-Task and Out-of-Distribution Generalization with a Bayesian Model
- URL: http://arxiv.org/abs/2010.13175v4
- Date: Sat, 9 Jul 2022 04:42:39 GMT
- Title: Amodal Segmentation through Out-of-Task and Out-of-Distribution Generalization with a Bayesian Model
- Authors: Yihong Sun, Adam Kortylewski, Alan Yuille
- Abstract summary: Amodal completion is a visual task that humans perform easily but which is difficult for computer vision algorithms.
We formulate amodal segmentation as an out-of-task and out-of-distribution generalization problem.
Our algorithm outperforms alternative methods that use the same supervision by a large margin.
- Score: 19.235173141731885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Amodal completion is a visual task that humans perform easily but which is
difficult for computer vision algorithms. The aim is to segment those object
boundaries which are occluded and hence invisible. This task is particularly
challenging for deep neural networks because data is difficult to obtain and
annotate. Therefore, we formulate amodal segmentation as an out-of-task and
out-of-distribution generalization problem. Specifically, we replace the fully
connected classifier in neural networks with a Bayesian generative model of the
neural network features. The model is trained from non-occluded images using
bounding box annotations and class labels only, but is applied to generalize
out-of-task to object segmentation and to generalize out-of-distribution to
segment occluded objects. We demonstrate how such Bayesian models can naturally
generalize beyond the training task labels when they learn a prior that models
the object's background context and shape. Moreover, by leveraging an outlier
process, Bayesian models can further generalize out-of-distribution to segment
partially occluded objects and to predict their amodal object boundaries. Our
algorithm outperforms alternative methods that use the same supervision by a
large margin, and even outperforms methods where annotated amodal segmentations
are used during training, when the amount of occlusion is large. Code is
publicly available at https://github.com/YihongSun/Bayesian-Amodal.
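The abstract's key mechanism is to score CNN features under a class-conditional generative model and to explain poorly fitting locations with a constant-density outlier process, so that occluded regions emerge as outliers. The following is a minimal illustrative sketch of that idea, not the paper's implementation: the function name, the diagonal-Gaussian likelihood, and all parameter values are assumptions.

```python
import numpy as np

def occlusion_posterior(features, mu, sigma, outlier_logp=-50.0, prior_occluded=0.3):
    """Per-location posterior probability of occlusion.

    features: (H, W, D) feature map from a CNN backbone
    mu, sigma: (D,) mean and std of a class-conditional diagonal Gaussian
    outlier_logp: constant log-density of the outlier (occluder) process
    prior_occluded: prior probability that a location is occluded
    """
    # log N(f | mu, diag(sigma^2)) at each spatial location
    log_in = -0.5 * np.sum(
        ((features - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2),
        axis=-1,
    )
    # outlier process: a flat density that "catches" features the
    # object model cannot explain (e.g. those of an occluder)
    log_out = np.full(log_in.shape, outlier_logp)
    # Bayes' rule in log space for numerical stability
    log_num = log_out + np.log(prior_occluded)
    log_den = np.logaddexp(log_num, log_in + np.log(1.0 - prior_occluded))
    return np.exp(log_num - log_den)  # shape (H, W), values in [0, 1]
```

Locations whose features are well explained by the object model get a low occlusion posterior, while occluder features fall back to the outlier process, yielding a per-pixel occlusion map without any mask supervision.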
Related papers
- LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion [79.22197702626542]
This paper introduces a framework that explores amodal segmentation for robotic grasping in cluttered scenes.
We propose a Linear-fusion Attention-guided Convolutional Network (LAC-Net).
The results on different datasets show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-08-06T14:50:48Z)
- Towards a Generalist and Blind RGB-X Tracker [91.36268768952755]
We develop a single model tracker that can remain blind to any modality X during inference time.
Our training process is extremely simple, integrating multi-label classification loss with a routing function.
Our generalist and blind tracker can achieve competitive performance compared to well-established modal-specific models.
arXiv Detail & Related papers (2024-05-28T03:00:58Z)
- Sequential Amodal Segmentation via Cumulative Occlusion Learning [15.729212571002906]
A visual system must be able to segment both the visible and occluded regions of objects, while discerning their occlusion order.
We introduce a diffusion model with cumulative occlusion learning designed for sequential amodal segmentation of objects with uncertain categories.
This model iteratively refines the prediction using the cumulative mask strategy during diffusion, effectively capturing the uncertainty of invisible regions.
It is akin to the human capability for amodal perception, i.e., to decipher the spatial ordering among objects and accurately predict complete contours for occluded objects in densely layered visual scenes.
arXiv Detail & Related papers (2024-05-09T14:17:26Z)
- BLADE: Box-Level Supervised Amodal Segmentation through Directed Expansion [10.57956193654977]
Box-level supervised amodal segmentation addresses this challenge by relying solely on ground truth bounding boxes and instance classes as supervision.
We present a novel solution by introducing a directed expansion approach from visible masks to corresponding amodal masks.
Our approach involves a hybrid end-to-end network based on the overlapping region - the area where different instances intersect.
arXiv Detail & Related papers (2024-01-03T09:37:03Z)
- Amodal Ground Truth and Completion in the Wild [84.54972153436466]
We use 3D data to establish an automatic pipeline to determine authentic ground truth amodal masks for partially occluded objects in real images.
This pipeline is used to construct an amodal completion evaluation benchmark, MP3D-Amodal, consisting of a variety of object categories and labels.
arXiv Detail & Related papers (2023-12-28T18:59:41Z)
- Coarse-to-Fine Amodal Segmentation with Shape Prior [52.38348188589834]
Amodal object segmentation is a challenging task that involves segmenting both visible and occluded parts of an object.
We propose a novel approach called Coarse-to-Fine: C2F-Seg, that addresses this problem by progressively modeling the amodal segmentation.
arXiv Detail & Related papers (2023-08-31T15:56:29Z)
- Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models [6.408114351192012]
We present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions.
We show results on the task of segmenting four different objects (humans, dogs, cars, birds) and a use case scenario in medical image analysis.
arXiv Detail & Related papers (2022-12-29T13:51:54Z)
- Self-supervised Amodal Video Object Segmentation [57.929357732733926]
Amodal perception requires inferring the full shape of an object that is partially occluded.
This paper develops a new framework for self-supervised amodal video object segmentation (SaVos).
arXiv Detail & Related papers (2022-10-23T14:09:35Z)
- A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation [35.103437828235826]
This paper addresses weakly supervised amodal instance segmentation.
The goal is to segment both visible and occluded (amodal) object parts, while training provides only ground-truth visible (modal) segmentations.
arXiv Detail & Related papers (2021-08-23T02:27:29Z)
- Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
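The squeeze step described in the last entry, collapsing a spatial feature map into a channel-wise global vector before reasoning, can be sketched as a global average pool. This is an illustrative sketch only; the paper's full block also includes reasoning and expansion steps not shown here, and the function name is an assumption.

```python
import numpy as np

def squeeze(feature_map):
    """Collapse a (C, H, W) spatial feature map into a channel-wise
    global vector by averaging over all H*W spatial positions, so that
    subsequent reasoning operates on a single (C,) vector instead of
    propagating information across the spatial map."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).mean(axis=1)  # shape (C,)
```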
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.