Compositional Convolutional Neural Networks: A Deep Architecture with
Innate Robustness to Partial Occlusion
- URL: http://arxiv.org/abs/2003.04490v3
- Date: Fri, 17 Apr 2020 07:23:05 GMT
- Title: Compositional Convolutional Neural Networks: A Deep Architecture with
Innate Robustness to Partial Occlusion
- Authors: Adam Kortylewski, Ju He, Qing Liu, Alan Yuille
- Abstract summary: Recent findings show that deep convolutional neural networks (DCNNs) do not generalize well under partial occlusion.
Inspired by the success of compositional models at classifying partially occluded objects, we propose to integrate compositional models and DCNNs into a unified deep model.
We conduct classification experiments on artificially occluded images as well as real images of partially occluded objects from the MS-COCO dataset.
Our proposed model outperforms standard DCNNs by a large margin at classifying partially occluded objects, even when it has not been exposed to occluded objects during training.
- Score: 18.276428975330813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent findings show that deep convolutional neural networks (DCNNs) do not
generalize well under partial occlusion. Inspired by the success of
compositional models at classifying partially occluded objects, we propose to
integrate compositional models and DCNNs into a unified deep model with innate
robustness to partial occlusion. We term this architecture Compositional
Convolutional Neural Network. In particular, we propose to replace the fully
connected classification head of a DCNN with a differentiable compositional
model. The generative nature of the compositional model enables it to localize
occluders and subsequently focus on the non-occluded parts of the object. We
conduct classification experiments on artificially occluded images as well as
real images of partially occluded objects from the MS-COCO dataset. The results
show that DCNNs do not classify occluded objects robustly, even when trained
with data that is strongly augmented with partial occlusions. Our proposed
model outperforms standard DCNNs by a large margin at classifying partially
occluded objects, even when it has not been exposed to occluded objects during
training. Additional experiments demonstrate that CompositionalNets can also
localize the occluders accurately, despite being trained with class labels
only. The code used in this work is publicly available.
Related papers
- Are Deep Learning Models Robust to Partial Object Occlusion in Visual Recognition Tasks? [4.9260675787714]
Image classification models, including convolutional neural networks (CNNs), perform well on a variety of classification tasks but struggle under partial occlusion.
We contribute the Image Recognition Under Occlusion (IRUO) dataset, based on the recently developed Occluded Video Instance Segmentation (OVIS) dataset (arXiv:2102.01558).
We find that modern CNN-based models show improved recognition accuracy on occluded images compared to earlier CNN-based models, and ViT-based models are more accurate than CNN-based models on occluded images.
arXiv Detail & Related papers (2024-09-16T23:21:22Z)
- DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects [48.65846477275723]
This study proposes novel dual-current neural networks (DCNN) to improve the accuracy of fine-grained image classification.
The main novel design features for constructing a weakly supervised learning backbone model DCNN include (a) extracting heterogeneous data, (b) keeping the feature map resolution unchanged, (c) expanding the receptive field, and (d) fusing global representations and local features.
arXiv Detail & Related papers (2024-05-07T07:51:28Z)
- Occlusion-Aware Instance Segmentation via BiLayer Network Architectures [73.45922226843435]
We propose Bilayer Convolutional Network (BCNet), where the top layer detects occluding objects (occluders) and the bottom layer infers partially occluded instances (occludees).
We investigate the efficacy of the bilayer structure using two popular convolutional network designs, namely, the Fully Convolutional Network (FCN) and the Graph Convolutional Network (GCN).
arXiv Detail & Related papers (2022-08-08T21:39:26Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of training a neural network for classification with the classifier randomly initialized as a simplex equiangular tight frame (ETF) and kept fixed during training.
Our experimental results show that this method achieves performance similar to that of a learnable classifier on image classification with balanced datasets.
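For readers unfamiliar with the term, the simplex equiangular tight frame (ETF) used as a fixed classifier has a simple closed form. A minimal NumPy sketch of the standard construction follows; the ambient dimension and zero-padding here are arbitrary illustrative choices, and in practice an orthogonal rotation is usually applied as well.

```python
import numpy as np

def simplex_etf(num_classes: int, dim: int) -> np.ndarray:
    """Return a (dim, K) matrix whose columns are unit-norm class prototypes
    with identical pairwise cosine -1/(K-1): a simplex ETF."""
    K = num_classes
    assert dim >= K, "this simple construction embeds the simplex in >= K dims"
    P = np.eye(K) - np.ones((K, K)) / K            # centering projector
    M = np.sqrt(K / (K - 1)) * P                   # columns now have unit norm
    return np.vstack([M, np.zeros((dim - K, K))])  # embed the columns in R^dim

W = simplex_etf(num_classes=4, dim=10)
G = W.T @ W   # Gram matrix: 1 on the diagonal, -1/3 everywhere else
```

Every pair of class prototypes is separated by the same maximal angle, which is why the frame can serve as a classifier without being learned.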
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [72.38919601150175]
We propose Bilayer Convolutional Network (BCNet) to segment highly-overlapping objects.
The top GCN layer of BCNet detects the occluding objects (occluders), and the bottom GCN layer infers the partially occluded instances (occludees).
arXiv Detail & Related papers (2021-03-23T06:25:42Z)
- Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [9.536947328412198]
We propose a deep network for multi-object instance segmentation that is robust to occlusion.
Our work builds on Compositional Networks, which learn a generative model of neural feature activations to locate occluders.
In particular, we obtain feed-forward predictions of the object classes and their instance and occluder segmentations.
arXiv Detail & Related papers (2020-12-03T17:41:55Z)
- Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition under Occlusion [21.737411464598797]
We show that black-box deep convolutional neural networks (DCNNs) have only limited robustness to partial occlusion.
We overcome these limitations by unifying DCNNs with part-based models into Compositional Convolutional Neural Networks (CompositionalNets).
Our experiments show that CompositionalNets improve by a large margin over their non-compositional counterparts at classifying and detecting partially occluded objects.
arXiv Detail & Related papers (2020-06-28T08:18:19Z)
- Robust Object Detection under Occlusion with Context-Aware CompositionalNets [21.303976151518125]
Compositional convolutional neural networks (CompositionalNets) have been shown to be robust at classifying occluded objects.
We propose to overcome two limitations of CompositionalNets which will enable them to detect partially occluded objects.
arXiv Detail & Related papers (2020-05-24T02:57:34Z)
- Linguistically Driven Graph Capsule Network for Visual Question Reasoning [153.76012414126643]
We propose a hierarchical compositional reasoning model called the "Linguistically driven Graph Capsule Network".
The compositional process is guided by the linguistic parse tree. Specifically, we bind each capsule in the lowest layer to bridge the linguistic embedding of a single word in the original question with visual evidence.
Experiments on the CLEVR dataset, CLEVR compositional generation test, and FigureQA dataset demonstrate the effectiveness and composition generalization ability of our end-to-end model.
arXiv Detail & Related papers (2020-03-23T03:34:25Z)
- Ellipse R-CNN: Learning to Infer Elliptical Object from Clustering and Occlusion [31.237782332036552]
We introduce the first CNN-based ellipse detector, called Ellipse R-CNN, to represent and infer occluded objects as ellipses.
We first propose a robust and compact ellipse regression based on the Mask R-CNN architecture for elliptical object detection.
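As a rough illustration of representing a detected object as an ellipse (a classical moments-based fit in NumPy, not the paper's learned regression branch; the synthetic mask and tolerances below are illustrative assumptions):

```python
import numpy as np

def fit_ellipse(mask: np.ndarray):
    """Fit an ellipse (cx, cy, a, b, theta) to a binary mask using image
    moments: a and b are the semi-axes, theta the major-axis angle in radians.
    For a filled ellipse, the variance along an axis equals (semi-axis)^2 / 4."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    cov = np.cov(np.stack([xs - cx, ys - cy]))
    evals, evecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    b, a = 2.0 * np.sqrt(evals)          # semi-axes from the variances
    theta = np.arctan2(evecs[1, 1], evecs[0, 1])
    return cx, cy, a, b, theta

# Axis-aligned test ellipse: center (50, 50), semi-axes 30 (x) and 10 (y).
yy, xx = np.mgrid[0:100, 0:100]
mask = ((xx - 50) / 30.0) ** 2 + ((yy - 50) / 10.0) ** 2 <= 1.0
print(fit_ellipse(mask))
```

The five recovered parameters form the compact target that an ellipse-regression head would predict directly from image features.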
arXiv Detail & Related papers (2020-01-30T22:04:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented here (including all linked content) and is not responsible for any consequences arising from its use.