Amodal Segmentation through Out-of-Task and Out-of-Distribution Generalization with a Bayesian Model
- URL: http://arxiv.org/abs/2010.13175v4
- Date: Sat, 9 Jul 2022 04:42:39 GMT
- Title: Amodal Segmentation through Out-of-Task and Out-of-Distribution Generalization with a Bayesian Model
- Authors: Yihong Sun, Adam Kortylewski, Alan Yuille
- Abstract summary: Amodal completion is a visual task that humans perform easily but which is difficult for computer vision algorithms.
We formulate amodal segmentation as an out-of-task and out-of-distribution generalization problem.
Our algorithm outperforms alternative methods that use the same supervision by a large margin.
- Score: 19.235173141731885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Amodal completion is a visual task that humans perform easily but which is
difficult for computer vision algorithms. The aim is to segment those object
boundaries which are occluded and hence invisible. This task is particularly
challenging for deep neural networks because data is difficult to obtain and
annotate. Therefore, we formulate amodal segmentation as an out-of-task and
out-of-distribution generalization problem. Specifically, we replace the fully
connected classifier in neural networks with a Bayesian generative model of the
neural network features. The model is trained from non-occluded images using
bounding box annotations and class labels only, but is applied to generalize
out-of-task to object segmentation and to generalize out-of-distribution to
segment occluded objects. We demonstrate how such Bayesian models can naturally
generalize beyond the training task labels when they learn a prior that models
the object's background context and shape. Moreover, by leveraging an outlier
process, Bayesian models can further generalize out-of-distribution to segment
partially occluded objects and to predict their amodal object boundaries. Our
algorithm outperforms alternative methods that use the same supervision by a
large margin, and even outperforms methods where annotated amodal segmentations
are used during training, when the amount of occlusion is large. Code is
publicly available at https://github.com/YihongSun/Bayesian-Amodal.
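The abstract's key mechanism is to score CNN features under a class-conditional generative model and to explain poorly fitting locations with a constant-density outlier process, so that occluded regions emerge as outliers. The following is a minimal illustrative sketch of that idea, not the paper's implementation: the function name, the diagonal-Gaussian likelihood, and all parameter values are assumptions.

```python
import numpy as np

def occlusion_posterior(features, mu, sigma, outlier_logp=-50.0, prior_occluded=0.3):
    """Per-location posterior probability of occlusion.

    features: (H, W, D) feature map from a CNN backbone
    mu, sigma: (D,) mean and std of a class-conditional diagonal Gaussian
    outlier_logp: constant log-density of the outlier (occluder) process
    prior_occluded: prior probability that a location is occluded
    """
    # log N(f | mu, diag(sigma^2)) at each spatial location
    log_in = -0.5 * np.sum(
        ((features - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2),
        axis=-1,
    )
    # outlier process: a flat density that "catches" features the
    # object model cannot explain (e.g. those of an occluder)
    log_out = np.full(log_in.shape, outlier_logp)
    # Bayes' rule in log space for numerical stability
    log_num = log_out + np.log(prior_occluded)
    log_den = np.logaddexp(log_num, log_in + np.log(1.0 - prior_occluded))
    return np.exp(log_num - log_den)  # shape (H, W), values in [0, 1]
```

Locations whose features are well explained by the object model get a low occlusion posterior, while occluder features fall back to the outlier process, yielding a per-pixel occlusion map without any mask supervision.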
Related papers
- LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion [79.22197702626542]
This paper introduces a framework that explores amodal segmentation for robotic grasping in cluttered scenes.
We propose a Linear-fusion Attention-guided Convolutional Network (LAC-Net).
The results on different datasets show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-08-06T14:50:48Z)
- Towards a Generalist and Blind RGB-X Tracker [91.36268768952755]
We develop a single model tracker that can remain blind to any modality X during inference time.
Our training process is extremely simple, integrating multi-label classification loss with a routing function.
Our generalist and blind tracker can achieve competitive performance compared to well-established modal-specific models.
arXiv Detail & Related papers (2024-05-28T03:00:58Z)
- Sequential Amodal Segmentation via Cumulative Occlusion Learning [15.729212571002906]
A visual system must be able to segment both the visible and occluded regions of objects, while discerning their occlusion order.
We introduce a diffusion model with cumulative occlusion learning designed for sequential amodal segmentation of objects with uncertain categories.
This model iteratively refines the prediction using the cumulative mask strategy during diffusion, effectively capturing the uncertainty of invisible regions.
It is akin to the human capability for amodal perception, i.e., to decipher the spatial ordering among objects and accurately predict complete contours for occluded objects in densely layered visual scenes.
arXiv Detail & Related papers (2024-05-09T14:17:26Z)
- BLADE: Box-Level Supervised Amodal Segmentation through Directed Expansion [10.57956193654977]
Box-level supervised amodal segmentation addresses this challenge by relying solely on ground truth bounding boxes and instance classes as supervision.
We present a novel solution by introducing a directed expansion approach from visible masks to corresponding amodal masks.
Our approach involves a hybrid end-to-end network based on the overlapping region - the area where different instances intersect.
arXiv Detail & Related papers (2024-01-03T09:37:03Z)
- Amodal Ground Truth and Completion in the Wild [84.54972153436466]
We use 3D data to establish an automatic pipeline to determine authentic ground truth amodal masks for partially occluded objects in real images.
This pipeline is used to construct an amodal completion evaluation benchmark, MP3D-Amodal, consisting of a variety of object categories and labels.
arXiv Detail & Related papers (2023-12-28T18:59:41Z)
- Coarse-to-Fine Amodal Segmentation with Shape Prior [52.38348188589834]
Amodal object segmentation is a challenging task that involves segmenting both visible and occluded parts of an object.
We propose a novel approach called Coarse-to-Fine: C2F-Seg, that addresses this problem by progressively modeling the amodal segmentation.
arXiv Detail & Related papers (2023-08-31T15:56:29Z)
- Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models [6.408114351192012]
We present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions.
We show results on the task of segmenting four different objects (humans, dogs, cars, birds) and a use case scenario in medical image analysis.
arXiv Detail & Related papers (2022-12-29T13:51:54Z)
- Self-supervised Amodal Video Object Segmentation [57.929357732733926]
Amodal perception requires inferring the full shape of an object that is partially occluded.
This paper develops a new framework for self-supervised amodal video object segmentation (SaVos).
arXiv Detail & Related papers (2022-10-23T14:09:35Z)
- A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation [35.103437828235826]
This paper addresses weakly supervised amodal instance segmentation.
The goal is to segment both visible and occluded (amodal) object parts, while training provides only ground-truth visible (modal) segmentations.
arXiv Detail & Related papers (2021-08-23T02:27:29Z)
- Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
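The squeeze step described in the last entry, collapsing a spatial feature map into a channel-wise global vector before reasoning, can be sketched as a global average pool. This is an illustrative sketch only; the paper's full block also includes reasoning and expansion steps not shown here, and the function name is an assumption.

```python
import numpy as np

def squeeze(feature_map):
    """Collapse a (C, H, W) spatial feature map into a channel-wise
    global vector by averaging over all H*W spatial positions, so that
    subsequent reasoning operates on a single (C,) vector instead of
    propagating information across the spatial map."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).mean(axis=1)  # shape (C,)
```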
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.