Compositional Convolutional Neural Networks: A Robust and Interpretable
Model for Object Recognition under Occlusion
- URL: http://arxiv.org/abs/2006.15538v1
- Date: Sun, 28 Jun 2020 08:18:19 GMT
- Title: Compositional Convolutional Neural Networks: A Robust and Interpretable
Model for Object Recognition under Occlusion
- Authors: Adam Kortylewski and Qing Liu and Angtian Wang and Yihong Sun and Alan
Yuille
- Abstract summary: We show that black-box deep convolutional neural networks (DCNNs) have only limited robustness to partial occlusion.
We overcome these limitations by unifying DCNNs with part-based models into Compositional Convolutional Neural Networks (CompositionalNets).
Our experiments show that CompositionalNets improve by a large margin over their non-compositional counterparts at classifying and detecting partially occluded objects.
- Score: 21.737411464598797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computer vision systems in real-world applications need to be robust to
partial occlusion while also being explainable. In this work, we show that
black-box deep convolutional neural networks (DCNNs) have only limited
robustness to partial occlusion. We overcome these limitations by unifying
DCNNs with part-based models into Compositional Convolutional Neural Networks
(CompositionalNets) - an interpretable deep architecture with innate robustness
to partial occlusion. Specifically, we propose to replace the fully connected
classification head of DCNNs with a differentiable compositional model that can
be trained end-to-end. The structure of the compositional model enables
CompositionalNets to decompose images into objects and context, as well as to
further decompose object representations in terms of individual parts and the
objects' pose. The generative nature of our compositional model enables it to
localize occluders and to recognize objects based on their non-occluded parts.
We conduct extensive image classification and object detection experiments on
images of artificially occluded objects from the PASCAL3D+ and ImageNet
datasets, and on real images of partially occluded vehicles from the
MS-COCO dataset. Our experiments show that CompositionalNets built from several
popular DCNN backbones (VGG-16, ResNet50, ResNeXt) improve by a large margin
over their non-compositional counterparts at classifying and detecting
partially occluded objects. Furthermore, they can localize occluders accurately
despite being trained with class-level supervision only. Finally, we
demonstrate that CompositionalNets provide human interpretable predictions as
their individual components can be understood as detecting parts and estimating
an object's viewpoint.
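The following is a minimal, hypothetical PyTorch sketch of the core architectural idea described above: a DCNN's fully connected classification head is replaced with a differentiable compositional head built from normalized part kernels and per-class spatial mixture weights. Class names, hyperparameters, and the pooling scheme are illustrative assumptions, not the authors' reference implementation.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class CompositionalHead(nn.Module):
    """Scores classes by how well learned part kernels explain each position of
    the backbone feature map (vMF-like likelihoods), pooled with per-class,
    per-position mixture weights over parts. Illustrative sketch only."""

    def __init__(self, in_channels=512, num_parts=64, num_classes=12, fmap_size=7):
        super().__init__()
        # Part kernels act like cluster centers for normalized feature vectors.
        self.parts = nn.Parameter(torch.randn(num_parts, in_channels))
        # One spatial "template" of part weights per class.
        self.mixture = nn.Parameter(torch.randn(num_classes, num_parts, fmap_size, fmap_size))
        self.kappa = 20.0  # assumed vMF-style concentration

    def forward(self, fmap):                        # fmap: (B, C, H, W)
        f = F.normalize(fmap, dim=1)                # unit-length feature vectors
        p = F.normalize(self.parts, dim=1)          # unit-length part kernels
        sim = torch.einsum('bchw,kc->bkhw', f, p)   # cosine similarity per part
        part_ll = self.kappa * sim                  # vMF-like log-likelihood (up to a constant)
        mix = F.log_softmax(self.mixture, dim=1)    # normalize part weights per class/position
        joint = part_ll.unsqueeze(1) + mix.unsqueeze(0)   # (B, classes, parts, H, W)
        pos_ll = torch.logsumexp(joint, dim=2)            # marginalize over parts
        return pos_ll.mean(dim=(2, 3))                    # pool positions into class scores


backbone = models.vgg16().features                    # convolutional layers only
head = CompositionalHead()                            # replaces vgg16's fully connected classifier
logits = head(backbone(torch.randn(2, 3, 224, 224)))  # shape (2, 12)
```
Because the head is just another differentiable module on top of the backbone, the whole model can be trained end-to-end with a standard cross-entropy loss, matching the end-to-end training claim in the abstract.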
Related papers
- Are Deep Learning Models Robust to Partial Object Occlusion in Visual Recognition Tasks? [4.9260675787714]
Image classification models, including convolutional neural networks (CNNs), perform well on a variety of classification tasks but struggle under partial occlusion.
We contribute the Image Recognition Under Occlusion (IRUO) dataset, based on the recently developed Occluded Video Instance Segmentation (OVIS) dataset (arXiv:2102.01558).
We find that modern CNN-based models show improved recognition accuracy on occluded images compared to earlier CNN-based models, and ViT-based models are more accurate than CNN-based models on occluded images.
arXiv Detail & Related papers (2024-09-16T23:21:22Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- Unsupervised Image Decomposition with Phase-Correlation Networks [28.502280038100167]
Phase-Correlation Decomposition Network (PCDNet) is a novel model that decomposes a scene into its object components.
In our experiments, we show how PCDNet outperforms state-of-the-art methods for unsupervised object discovery and segmentation.
arXiv Detail & Related papers (2021-10-07T13:57:33Z)
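As an aside on the mechanism named in the entry above, the sketch below shows textbook phase correlation, the FFT-based operation that phase-correlation decomposition models build on to localize object prototypes: the cross-power spectrum of an image and a prototype is phase-normalized, and its inverse FFT peaks at the prototype's translation. This is a generic illustration, not PCDNet's implementation.
```python
# Classic phase correlation: estimates the translation that best aligns a
# prototype with an image. Generic illustration, not the PCDNet code.
import numpy as np

def phase_correlation(image, template):
    """Return the (row, col) shift of `template` within `image` (same shape)."""
    F_img = np.fft.fft2(image)
    F_tpl = np.fft.fft2(template)
    cross_power = F_img * np.conj(F_tpl)
    cross_power /= np.abs(cross_power) + 1e-8        # keep phase only
    corr = np.fft.ifft2(cross_power).real            # peaks at the relative shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dy, dx

image = np.zeros((64, 64)); image[20:28, 30:38] = 1.0    # object at (20, 30)
template = np.zeros((64, 64)); template[0:8, 0:8] = 1.0  # prototype at the origin
print(phase_correlation(image, template))                # (20, 30)
```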
- Compositional Sketch Search [91.84489055347585]
We present an algorithm for searching image collections using free-hand sketches.
We exploit drawings as a concise and intuitive representation for specifying entire scene compositions.
arXiv Detail & Related papers (2021-06-15T09:38:09Z)
- Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks [118.20778308823779]
We present a novel 3D primitive representation that defines primitives using an Invertible Neural Network (INN).
Our model learns to parse 3D objects into semantically consistent part arrangements without any part-level supervision.
arXiv Detail & Related papers (2021-03-18T17:59:31Z)
- Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [9.536947328412198]
We propose a deep network for multi-object instance segmentation that is robust to occlusion.
Our work builds on Compositional Networks, which learn a generative model of neural feature activations to locate occluders.
In particular, we obtain feed-forward predictions of the object classes and their instance and occluder segmentations.
arXiv Detail & Related papers (2020-12-03T17:41:55Z)
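The occluder-localization idea shared by the entry above and the main paper can be illustrated with a small, hypothetical sketch: each spatial position of the feature map is assigned to whichever model explains it better, the object's part kernels or a generic occluder model. The constant occluder log-likelihood and the vMF-style scoring below are assumptions for illustration, not the published formulation.
```python
# Hypothetical occluder localization from a generative model of feature
# activations: positions better explained by a generic occluder model than by
# the best-matching part kernel are flagged as occluded.
import torch
import torch.nn.functional as F

def occlusion_map(fmap, part_kernels, kappa=20.0, occluder_loglik=0.0):
    """fmap: (C, H, W) backbone features; part_kernels: (K, C) learned centers.
    Returns a boolean (H, W) map that is True where the occluder model wins."""
    f = F.normalize(fmap, dim=0)                     # unit feature vectors
    p = F.normalize(part_kernels, dim=1)             # unit part kernels
    sim = torch.einsum('chw,kc->khw', f, p)          # cosine similarity per part
    object_loglik = kappa * sim.max(dim=0).values    # best-matching part per position
    return object_loglik < occluder_loglik           # True => likely occluded

fmap = torch.randn(512, 7, 7)
kernels = torch.randn(64, 512)
print(occlusion_map(fmap, kernels).float().mean())   # fraction flagged as occluded
```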
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
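A common way such unit-level analyses match a hidden unit to a visual concept is an IoU-style agreement between the unit's thresholded activation map and a concept segmentation mask; the sketch below illustrates that idea with made-up data and an assumed thresholding rule, not the paper's exact procedure.
```python
# Score how well one convolutional unit's strongest responses overlap a
# concept segmentation mask. Threshold rule and data are illustrative.
import numpy as np

def unit_concept_iou(activation, concept_mask, quantile=0.99):
    """activation: (H, W) unit response upsampled to image size;
    concept_mask: (H, W) boolean mask of a visual concept."""
    thresh = np.quantile(activation, quantile)       # keep the top responses
    unit_mask = activation > thresh
    inter = np.logical_and(unit_mask, concept_mask).sum()
    union = np.logical_or(unit_mask, concept_mask).sum()
    return inter / max(union, 1)

act = np.random.rand(224, 224)
mask = np.zeros((224, 224), dtype=bool); mask[50:120, 60:140] = True
print(unit_concept_iou(act, mask))   # higher values suggest the unit detects the concept
```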
- Category Level Object Pose Estimation via Neural Analysis-by-Synthesis [64.14028598360741]
In this paper we combine a gradient-based fitting procedure with a parametric neural image synthesis module.
The image synthesis network is designed to efficiently span the pose configuration space.
We experimentally show that the method can recover orientation of objects with high accuracy from 2D images alone.
arXiv Detail & Related papers (2020-08-18T20:30:47Z)
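The gradient-based analysis-by-synthesis loop described in the entry above can be sketched generically: a differentiable synthesis module maps a pose estimate to features, and the pose is refined by minimizing the distance to the observed features. The tiny MLP below is a stand-in for the paper's parametric synthesis network, so the example only demonstrates the optimization pattern.
```python
# Render-and-compare pose refinement with a placeholder synthesis network.
import torch
import torch.nn as nn

synth = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 128))  # pose -> features
with torch.no_grad():
    target = synth(torch.tensor([[1.2, 0.3]]))        # features of the (unknown) true pose

pose = torch.zeros(1, 2, requires_grad=True)          # initial azimuth/elevation guess
opt = torch.optim.Adam([pose], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = (synth(pose) - target).pow(2).mean()       # render-and-compare objective
    loss.backward()
    opt.step()
print(pose.detach(), float(loss))  # refined pose; with a real synthesis module this approaches the true pose
```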
- Robust Object Detection under Occlusion with Context-Aware CompositionalNets [21.303976151518125]
Compositional convolutional neural networks (CompositionalNets) have been shown to be robust at classifying occluded objects.
We propose to overcome two limitations of CompositionalNets which will enable them to detect partially occluded objects.
arXiv Detail & Related papers (2020-05-24T02:57:34Z)
- Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image [102.44347847154867]
We propose a novel formulation that allows us to jointly recover the geometry of a 3D object as a set of primitives.
Our model recovers the higher level structural decomposition of various objects in the form of a binary tree of primitives.
Our experiments on the ShapeNet and D-FAUST datasets demonstrate that considering the organization of parts indeed facilitates reasoning about 3D geometry.
arXiv Detail & Related papers (2020-04-02T17:58:05Z)
- Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion [18.276428975330813]
Recent findings show that deep convolutional neural networks (DCNNs) do not generalize well under partial occlusion.
Inspired by the success of compositional models at classifying partially occluded objects, we propose to integrate compositional models and DCNNs into a unified deep model.
We conduct classification experiments on artificially occluded images as well as real images of partially occluded objects from the MS-COCO dataset.
Our proposed model outperforms standard DCNNs by a large margin at classifying partially occluded objects, even when it has not been exposed to occluded objects during training.
arXiv Detail & Related papers (2020-03-10T01:45:38Z)
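For readers who want to reproduce the flavor of the artificial-occlusion evaluations mentioned in these abstracts, the sketch below pastes white or noise patches over a controllable fraction of an image. Patch sizes, fill types, and placement are assumptions, not the exact benchmark protocol.
```python
# Generate an artificially occluded copy of an image by pasting random patches
# until roughly `area_fraction` of the pixels are covered. Illustrative only.
import numpy as np

def occlude(image, area_fraction=0.4, patch=32, rng=np.random.default_rng(0)):
    """image: (H, W, 3) uint8 array. Returns an occluded copy."""
    out = image.copy()
    h, w, _ = out.shape
    covered = np.zeros((h, w), dtype=bool)
    while covered.mean() < area_fraction:
        y = rng.integers(0, h - patch)
        x = rng.integers(0, w - patch)
        fill = rng.choice(["white", "noise"])
        block = (np.full((patch, patch, 3), 255, dtype=np.uint8) if fill == "white"
                 else rng.integers(0, 256, (patch, patch, 3), dtype=np.uint8))
        out[y:y + patch, x:x + patch] = block
        covered[y:y + patch, x:x + patch] = True
    return out

img = np.zeros((224, 224, 3), dtype=np.uint8)
print(occlude(img).mean())   # occluded copy; the original `img` is left untouched
```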