Bayesian Deep Learning for Affordance Segmentation in images
- URL: http://arxiv.org/abs/2303.00871v1
- Date: Thu, 2 Mar 2023 00:01:13 GMT
- Title: Bayesian Deep Learning for Affordance Segmentation in images
- Authors: Lorenzo Mur-Labadia, Ruben Martinez-Cantin and Jose J. Guerrero
- Abstract summary: We present a novel Bayesian deep network to detect affordances in images.
We quantify the distribution of the aleatoric and epistemic variance at the spatial level.
Our results outperform state-of-the-art deterministic networks.
- Score: 3.15834651147911
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Affordances are a fundamental concept in robotics since they relate the
actions available to an agent to its sensory-motor capabilities and the
environment. We present a novel Bayesian deep network that detects affordances in
images while quantifying the spatial distribution of the aleatoric and
epistemic variance. We adapt the Mask-RCNN architecture to
learn a probabilistic representation using Monte Carlo dropout. Our results
outperform state-of-the-art deterministic networks. We attribute this
improvement to a better probabilistic feature-space representation in the
encoder and to the Bayesian variability induced during mask generation, which
adapts better to the object contours. We also introduce the new
Probability-based Mask Quality measure, which reveals the semantic and spatial
differences of a probabilistic instance segmentation model. We modify the
existing Probabilistic Detection Quality metric by comparing binary masks
rather than predicted bounding boxes, achieving a finer-grained evaluation
of the probabilistic segmentation. We find aleatoric variance at object
contours due to camera noise, while epistemic variance appears in
visually challenging pixels.
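As a rough illustration of the Monte Carlo dropout scheme described in the abstract, the sketch below keeps dropout layers stochastic at test time, draws several forward passes through a generic per-pixel segmentation head, and decomposes the per-pixel predictive variance into epistemic and aleatoric parts. This is a minimal sketch under stated assumptions, not the paper's implementation: the `model`, its `(1, C, H, W)` logit output, and the particular variance decomposition (variance of the sampled probabilities vs. mean of the per-sample Bernoulli variances) are assumptions for illustration; the actual Mask-RCNN-based network returns per-instance masks rather than a single dense probability map.

```python
import torch
import torch.nn.functional as F

def mc_dropout_uncertainty(model, image, n_samples=20):
    """Sketch of test-time Monte Carlo dropout for a dense segmentation head.

    Assumes `model(image)` returns class logits of shape (1, C, H, W);
    a real Mask-RCNN head would instead return per-instance masks.
    """
    model.eval()
    # Re-enable only the dropout layers so every forward pass stays stochastic.
    for m in model.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()

    samples = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(image)                 # (1, C, H, W), assumed
            samples.append(F.softmax(logits, dim=1))
    probs = torch.stack(samples, dim=0)           # (T, 1, C, H, W)

    mean_prob = probs.mean(dim=0)                 # predictive mean per pixel
    # Epistemic part: disagreement between dropout samples.
    epistemic = probs.var(dim=0)
    # Aleatoric part: expected per-sample Bernoulli variance p * (1 - p).
    aleatoric = (probs * (1.0 - probs)).mean(dim=0)
    return mean_prob, epistemic, aleatoric
```

In practice the `epistemic` and `aleatoric` maps would be visualized as per-pixel heat maps to inspect where each kind of variance concentrates, e.g. object contours versus visually ambiguous regions.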
Related papers
- FlowSDF: Flow Matching for Medical Image Segmentation Using Distance Transforms [60.195642571004804]
We propose FlowSDF, an image-guided conditional flow matching framework for representing the signed distance function (SDF).
By learning a vector field that is directly related to the probability path of a conditional distribution of SDFs, we can accurately sample from the distribution of segmentation masks.
arXiv Detail & Related papers (2024-05-28T11:47:12Z)
- Gaussian Mixture Models for Affordance Learning using Bayesian Networks [50.18477618198277]
Affordances are fundamental descriptors of relationships between actions, objects and effects.
This paper approaches the problem of an embodied agent exploring the world and learning these affordances autonomously from its sensory experiences.
arXiv Detail & Related papers (2024-02-08T22:05:45Z)
- Pre-training with Random Orthogonal Projection Image Modeling [32.667183132025094]
Masked Image Modeling (MIM) is a powerful self-supervised strategy for visual pre-training without the use of labels.
We propose an image modeling framework based on Random Orthogonal Projection Image Modeling (ROPIM).
ROPIM reduces spatially-wise token information under a guaranteed bound on the noise variance and can be seen as masking entire spatial image regions with locally varying masking degrees.
arXiv Detail & Related papers (2023-10-28T15:42:07Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Adaptive Face Recognition Using Adversarial Information Network [57.29464116557734]
Face recognition models often degrade when training data differ from testing data.
We propose a novel adversarial information network (AIN) to address this problem.
arXiv Detail & Related papers (2023-05-23T02:14:11Z)
- Analysis of convolutional neural network image classifiers in a rotationally symmetric model [4.56877715768796]
The rate of convergence of the misclassification risk of the estimates towards the optimal misclassification risk is analyzed.
It is shown that least squares plug-in classifiers based on convolutional neural networks are able to circumvent the curse of dimensionality in binary image classification.
arXiv Detail & Related papers (2022-05-11T13:43:13Z)
- A new perspective on probabilistic image modeling [92.89846887298852]
We present Deep Convolutional Gaussian Mixture Models (DCGMMs), a new probabilistic approach for image modeling capable of density estimation, sampling and tractable inference.
DCGMMs can be trained end-to-end by SGD from random initial conditions, much like CNNs.
We show that DCGMMs compare favorably to several recent PC and SPN models in terms of inference, classification and sampling.
arXiv Detail & Related papers (2022-03-21T14:53:57Z)
- CQ-VAE: Coordinate Quantized VAE for Uncertainty Estimation with Application to Disk Shape Analysis from Lumbar Spine MRI Images [1.5841288368322592]
We propose a powerful generative model to learn a representation of ambiguity and to generate probabilistic outputs.
Our model, named Coordinate Quantization Variational Autoencoder (CQ-VAE), employs a discrete latent space with an internal discrete probability distribution.
A matching algorithm is used to establish the correspondence between model-generated samples and "ground-truth" samples.
arXiv Detail & Related papers (2020-10-17T04:25:32Z)
- Stochastic Segmentation Networks: Modelling Spatially Correlated Aleatoric Uncertainty [32.33791302617957]
We introduce stochastic segmentation networks (SSNs), an efficient probabilistic method for modelling aleatoric uncertainty with any image segmentation network architecture (a minimal sketch of the idea appears after this list).
SSNs can generate multiple spatially coherent hypotheses for a single image.
We tested our method on the segmentation of real-world medical data, including lung nodules in 2D CT and brain tumours in 3D multimodal MRI scans.
arXiv Detail & Related papers (2020-06-10T18:06:41Z)
- Multi-Granularity Canonical Appearance Pooling for Remote Sensing Scene Classification [0.34376560669160383]
We propose a novel Multi-Granularity Canonical Appearance Pooling (MG-CAP) to automatically capture the latent ontological structure of remote sensing datasets.
For each specific granularity, we discover the canonical appearance from a set of pre-defined transformations and learn the corresponding CNN features through a maxout-based Siamese style architecture.
We provide a stable solution for training the eigenvalue-decomposition function (EIG) on a GPU and demonstrate the corresponding back-propagation using matrix calculus.
arXiv Detail & Related papers (2020-04-09T11:24:00Z)
- Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation [72.40827239394565]
We propose to compute features only at sparsely sampled locations.
We then densely reconstruct the feature map with an efficient procedure.
The presented network is experimentally shown to save substantial computation while maintaining accuracy over a variety of computer vision tasks.
arXiv Detail & Related papers (2020-03-19T15:36:31Z)
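As referenced in the Stochastic Segmentation Networks entry above, the sketch below illustrates the underlying idea of spatially correlated aleatoric uncertainty: model the logit map as a low-rank multivariate Gaussian so that samples are correlated across pixels, then decode each sample into a coherent segmentation hypothesis. The shapes, toy parameters, and the use of `torch.distributions.LowRankMultivariateNormal` are assumptions chosen for illustration, not the authors' released code.

```python
import torch

def sample_segmentation_hypotheses(mu, cov_factor, cov_diag, C, H, W, n_samples=8):
    """Sketch of spatially correlated logit sampling (SSN-style).

    mu:         (C*H*W,)   mean logits from any segmentation backbone
    cov_factor: (C*H*W, R) low-rank factor P of the covariance P P^T + diag(D)
    cov_diag:   (C*H*W,)   positive diagonal D
    """
    dist = torch.distributions.LowRankMultivariateNormal(mu, cov_factor, cov_diag)
    logits = dist.rsample((n_samples,)).view(n_samples, C, H, W)
    # Each sample is a full, spatially coherent label map.
    return logits.argmax(dim=1)

# Hypothetical toy usage with random covariance parameters.
C, H, W, R = 2, 32, 32, 10
mu = torch.zeros(C * H * W)
P = 0.1 * torch.randn(C * H * W, R)
D = torch.full((C * H * W,), 1e-3)
hypotheses = sample_segmentation_hypotheses(mu, P, D, C, H, W)   # (8, 32, 32)
```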
This list is automatically generated from the titles and abstracts of the papers on this site.