Ensemble learning in CNN augmented with fully connected subnetworks
- URL: http://arxiv.org/abs/2003.08562v3
- Date: Tue, 24 Mar 2020 07:03:45 GMT
- Title: Ensemble learning in CNN augmented with fully connected subnetworks
- Authors: Daiki Hirata, Norikazu Takahashi
- Abstract summary: We propose a new model called EnsNet, which is composed of one base CNN and multiple Fully Connected SubNetworks (FCSNs).
An EnsNet achieves a state-of-the-art error rate of 0.16% on MNIST.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Networks (CNNs) have shown remarkable performance in
general object recognition tasks. In this paper, we propose a new model called
EnsNet which is composed of one base CNN and multiple Fully Connected
SubNetworks (FCSNs). In this model, the set of feature-maps generated by the
last convolutional layer in the base CNN is divided along channels into
disjoint subsets, and these subsets are assigned to the FCSNs. Each of the
FCSNs is trained independently of the others so that it can predict the class label
from the subset of the feature-maps assigned to it. The output of the overall
model is determined by majority vote of the base CNN and the FCSNs.
Experimental results using the MNIST, Fashion-MNIST and CIFAR-10 datasets show
that the proposed approach further improves the performance of CNNs. In
particular, an EnsNet achieves a state-of-the-art error rate of 0.16% on MNIST.
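To make the architecture concrete, here is a minimal PyTorch sketch of the EnsNet idea, assuming a small two-block base CNN, 28x28 single-channel input, and four FCSNs; the layer sizes are illustrative assumptions, not the paper's exact configuration. During training, each head would receive its own classification loss, so the FCSNs learn independently of one another.

```python
import torch
import torch.nn as nn

class FCSN(nn.Module):
    """Fully connected subnetwork classifying one disjoint channel subset."""
    def __init__(self, in_features: int, num_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.net(x)

class EnsNet(nn.Module):
    def __init__(self, num_subnets: int = 4, num_classes: int = 10):
        super().__init__()
        self.num_subnets = num_subnets
        # Base CNN; its last conv layer emits 64 feature maps (assumed size).
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.base_head = nn.Sequential(nn.Flatten(),
                                       nn.Linear(64 * 7 * 7, num_classes))
        # The 64 channels are split into disjoint subsets, one per FCSN.
        chunk = 64 // num_subnets
        self.fcsns = nn.ModuleList(
            [FCSN(chunk * 7 * 7, num_classes) for _ in range(num_subnets)])

    def forward(self, x):
        """Return one logit tensor per ensemble member (base CNN first)."""
        fmap = self.features(x)                   # (B, 64, 7, 7) on 28x28 input
        chunk = fmap.shape[1] // self.num_subnets
        outs = [self.base_head(fmap)]
        for i, sub in enumerate(self.fcsns):
            # Each FCSN sees only its own disjoint slice of the channels.
            outs.append(sub(fmap[:, i * chunk:(i + 1) * chunk]))
        return outs

    @torch.no_grad()
    def predict(self, x):
        # The overall output is the majority vote of the base CNN and FCSNs.
        votes = torch.stack([o.argmax(dim=1) for o in self(x)])  # (members, B)
        return votes.mode(dim=0).values
```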
Related papers
- Model Parallel Training and Transfer Learning for Convolutional Neural Networks by Domain Decomposition [0.0]
Deep convolutional neural networks (CNNs) have been shown to be very successful in a wide range of image processing applications.
Due to their increasing number of model parameters and the increasing availability of large amounts of training data, parallelization strategies for efficiently training complex CNNs are necessary.
arXiv Detail & Related papers (2024-08-26T17:35:01Z)
- CNN2GNN: How to Bridge CNN with GNN [59.42117676779735]
We propose a novel CNN2GNN framework to unify CNN and GNN together via distillation.
The performance of the distilled "boosted" two-layer GNN on Mini-ImageNet is much higher than that of CNNs with dozens of layers, such as ResNet152.
arXiv Detail & Related papers (2024-04-23T08:19:08Z)
- PICNN: A Pathway towards Interpretable Convolutional Neural Networks [12.31424771480963]
We introduce a novel pathway to alleviate the entanglement between filters and image classes.
We use Bernoulli sampling to generate the filter-cluster assignment matrix from a learnable filter-class correspondence matrix (a sketch follows this list).
We evaluate the effectiveness of our method on ten widely used network architectures.
arXiv Detail & Related papers (2023-12-19T11:36:03Z)
- Exploiting Hybrid Models of Tensor-Train Networks for Spoken Command Recognition [9.262289183808035]
This work aims to design a low-complexity spoken command recognition (SCR) system.
We exploit a deep hybrid architecture of a tensor-train (TT) network to build an end-to-end SCR pipeline (a TT linear layer is sketched after this list).
Our proposed CNN+(TT-DNN) model attains a competitive accuracy of 96.31% with 4 times fewer model parameters than the CNN model.
arXiv Detail & Related papers (2022-01-11T05:57:38Z)
- Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z)
- A Structurally Regularized Convolutional Neural Network for Image Classification using Wavelet-based SubBand Decomposition [2.127049691404299]
We propose a convolutional neural network (CNN) architecture for image classification based on subband decomposition of the image using wavelets.
The proposed architecture decomposes the input image spectrum into multiple critically sampled subbands, extracts features using a single CNN per subband, and finally performs classification by combining the extracted features with a fully connected layer (a Haar-subband sketch follows this list).
We show that the proposed architecture is more robust than a regular full-band CNN to noise caused by weight-and-bias quantization and input quantization.
arXiv Detail & Related papers (2021-03-02T16:01:22Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- MGIC: Multigrid-in-Channels Neural Network Architectures [8.459177309094688]
We present a multigrid-in-channels approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs); the arithmetic behind this growth is sketched after this list.
Our approach addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs.
arXiv Detail & Related papers (2020-11-17T11:29:10Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN (an atom-coefficient decomposition is sketched after this list).
We show that such CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Graph Neural Networks: Architectures, Stability and Transferability [176.3960927323358]
Graph Neural Networks (GNNs) are information processing architectures for signals supported on graphs.
They are generalizations of convolutional neural networks (CNNs) in which individual layers contain banks of graph convolutional filters (a polynomial graph filter is sketched after this list).
arXiv Detail & Related papers (2020-08-04T18:57:36Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
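For the PICNN entry, a hedged sketch of the Bernoulli sampling step named in the summary; the matrix shapes and the sigmoid parameterization are assumptions, not the paper's exact formulation.

```python
import torch

num_filters, num_classes = 64, 10
# Learnable filter-class correspondence matrix (real-valued logits, assumed).
correspondence = torch.randn(num_filters, num_classes, requires_grad=True)
# Elementwise Bernoulli sampling yields a binary filter-cluster assignment.
assignment = torch.bernoulli(torch.sigmoid(correspondence))  # {0,1} matrix
# Note: the sampling itself is not differentiable, so training the
# correspondence matrix would need a straight-through or relaxation trick.
```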
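For the tensor-train entry, a minimal two-core TT linear layer shows where the parameter savings come from; the mode sizes and rank below are illustrative assumptions, and the paper's hybrid CNN+(TT-DNN) model is more elaborate.

```python
import torch
import torch.nn as nn

class TTLinear(nn.Module):
    """A (m1*m2) -> (n1*n2) dense layer factorized into two TT cores."""
    def __init__(self, m1, m2, n1, n2, rank=4):
        super().__init__()
        self.m1, self.m2 = m1, m2
        self.core1 = nn.Parameter(torch.randn(m1, n1, rank) * 0.1)
        self.core2 = nn.Parameter(torch.randn(rank, m2, n2) * 0.1)

    def forward(self, x):                 # x: (batch, m1*m2)
        b = x.shape[0]
        x = x.view(b, self.m1, self.m2)
        # W[(i1,i2),(j1,j2)] = sum_r core1[i1,j1,r] * core2[r,i2,j2]
        y = torch.einsum('bik,ijr,rkl->bjl', x, self.core1, self.core2)
        return y.reshape(b, -1)           # (batch, n1*n2)

# 784 -> 256 with 2 * 28*16*4 = 3,584 weights instead of 784*256 = 200,704.
layer = TTLinear(28, 28, 16, 16, rank=4)
out = layer(torch.randn(5, 784))          # (5, 256)
```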
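For the wavelet-based subband paper, a sketch of the pipeline: a one-level Haar transform yields four critically sampled subbands, each processed by its own small CNN, with features fused by a fully connected classifier. The branch sizes and single-channel input are assumptions.

```python
import torch
import torch.nn as nn

def haar_dwt2(x):
    """One-level 2D Haar transform: four critically sampled subbands."""
    a = x[:, :, 0::2, 0::2]; b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]; d = x[:, :, 1::2, 1::2]
    approx = (a + b + c + d) / 2          # approximation subband
    det1 = (a - b + c - d) / 2            # three detail subbands
    det2 = (a + b - c - d) / 2
    det3 = (a - b - c + d) / 2
    return [approx, det1, det2, det3]

class SubbandCNN(nn.Module):
    """One small CNN per subband; features fused by a fully connected layer."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten())
            for _ in range(4)])
        self.classifier = nn.Linear(4 * 16, num_classes)

    def forward(self, x):                 # x: (B, 1, H, W), single channel assumed
        feats = [br(band) for br, band in zip(self.branches, haar_dwt2(x))]
        return self.classifier(torch.cat(feats, dim=1))
```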
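The quadratic-growth claim in the MGIC entry is simple arithmetic: a standard convolution with c input and c output channels stores c * c * k * k weights, so doubling the channels quadruples the parameters; grouped convolutions divide this by the group count, which is the kind of redundancy such approaches exploit. The group count here is illustrative.

```python
k, groups = 3, 8                          # kernel size; illustrative group count
for c in (64, 128, 256):
    standard = c * c * k * k              # quadratic in the channel count c
    grouped = standard // groups          # grouped conv divides by the groups
    print(f"c={c:4d}  standard={standard:9,}  grouped={grouped:9,}")
```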
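For the ACDC entry, a hedged sketch of weight sharing via atom-coefficient decomposition: each kernel is a linear combination of a small shared dictionary of atoms, so parameters scale with the dictionary size rather than with out*in*k*k. The dictionary size is an assumption for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecomposedConv2d(nn.Module):
    """Each kernel is a linear combination of a shared dictionary of atoms."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 3, num_atoms: int = 6):
        super().__init__()
        self.k = k
        self.atoms = nn.Parameter(torch.randn(num_atoms, k, k) * 0.1)
        self.coeffs = nn.Parameter(torch.randn(out_ch, in_ch, num_atoms) * 0.1)

    def forward(self, x):
        # weight[o, i] = sum_a coeffs[o, i, a] * atoms[a]
        weight = torch.einsum('oia,akl->oikl', self.coeffs, self.atoms)
        return F.conv2d(x, weight, padding=self.k // 2)

# Parameters: out*in*num_atoms + num_atoms*k*k, versus out*in*k*k for a
# standard convolution; all kernels share weights through the atom dictionary.
```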
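The graph convolutional filter banks named in the GNN entry are, in their simplest form, polynomials of a graph shift operator S applied to a graph signal x: y = sum_k h_k S^k x. A minimal sketch, with a random symmetric S standing in for a real graph.

```python
import torch

n, K = 5, 3                               # graph nodes; number of filter taps
S = torch.rand(n, n)
S = (S + S.T) / 2                         # symmetric shift operator (stand-in)
h = torch.randn(K)                        # filter taps (learnable in a layer)
x = torch.randn(n)                        # graph signal, one value per node
y = torch.zeros(n)
Sk_x = x.clone()                          # S^0 x
for k in range(K):
    y = y + h[k] * Sk_x                   # accumulate h_k * S^k x
    Sk_x = S @ Sk_x
```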
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.