An Ensemble of Simple Convolutional Neural Network Models for MNIST Digit Recognition
- URL: http://arxiv.org/abs/2008.10400v2
- Date: Mon, 5 Oct 2020 03:49:48 GMT
- Title: An Ensemble of Simple Convolutional Neural Network Models for MNIST Digit Recognition
- Authors: Sanghyeon An, Minjun Lee, Sanglee Park, Heerin Yang, Jungmin So
- Abstract summary: A very high accuracy on the MNIST test set can be achieved by using simple convolutional neural network (CNN) models.
A two-layer ensemble, a heterogeneous ensemble of three homogeneous ensemble networks, can achieve up to 99.91% test accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We report that very high accuracy on the MNIST test set can be
achieved by using simple convolutional neural network (CNN) models. We use
three different models with 3x3, 5x5, and 7x7 kernel sizes in the convolution
layers. Each model consists of a set of convolution layers followed by a
single fully connected layer. Every convolution layer uses batch normalization
and ReLU activation, and no pooling is used. Rotation and translation are used
to augment the training data, a common practice in image classification tasks.
Majority voting over the three models, each trained independently on the
training set, achieves up to 99.87% accuracy on the test set, which is among
the state-of-the-art results. A two-layer ensemble, a heterogeneous ensemble
of three homogeneous ensemble networks, can achieve up to 99.91% test
accuracy.
The results can be reproduced by using the code at:
https://github.com/ansh941/MnistSimpleCNN
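The two voting schemes in the abstract can be sketched in a few lines. This is a minimal illustration of plain majority voting and of the two-layer variant (average softmax outputs within each homogeneous group, then vote across the three groups); the prediction values below are hypothetical, and the full training pipeline lives in the linked repository.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most models.

    In CPython, ties are broken in favor of the label seen first.
    """
    return Counter(predictions).most_common(1)[0][0]

def two_layer_ensemble(per_group_softmax):
    """Two-layer ensemble sketch.

    per_group_softmax maps a kernel size (e.g. 3, 5, 7) to a list of
    per-model class-probability vectors. Probabilities are averaged
    within each homogeneous group, each group emits its argmax label,
    and the three group labels are combined by majority vote.
    """
    group_labels = []
    for size, outputs in per_group_softmax.items():
        n_classes = len(outputs[0])
        avg = [sum(o[c] for o in outputs) / len(outputs)
               for c in range(n_classes)]
        group_labels.append(max(range(n_classes), key=avg.__getitem__))
    return majority_vote(group_labels)

# Hypothetical labels from three independently trained models:
print(majority_vote([7, 7, 1]))  # -> 7
```

Averaging probabilities before the argmax (rather than voting on hard labels inside each group) lets confident models within a homogeneous group outweigh uncertain ones.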
Related papers
- HyperKAN: Kolmogorov-Arnold Networks make Hyperspectral Image Classificators Smarter
We propose the replacement of linear and convolutional layers of traditional networks with KAN-based counterparts.
These modifications allowed us to significantly increase the per-pixel classification accuracy for hyperspectral remote-sensing images.
The greatest effect was achieved for convolutional networks working exclusively on spectral data.
arXiv Detail & Related papers (2024-07-07T06:36:09Z)
- GMConv: Modulating Effective Receptive Fields for Convolutional Kernels
In convolutional neural networks, the convolutions are performed using a square kernel with a fixed N×N receptive field (RF).
Inspired by the property that ERFs typically exhibit a Gaussian distribution, we propose a Gaussian Mask convolutional kernel (GMConv) in this work.
Our GMConv can directly replace the standard convolutions in existing CNNs and can be easily trained end-to-end by standard back-propagation.
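The Gaussian-mask idea above can be illustrated with a small sketch: build an N×N Gaussian mask and multiply it elementwise into a kernel, softly shrinking the kernel's effective receptive field. This is only an illustration under an assumed fixed sigma; the actual GMConv learns its mask parameters end-to-end, which is omitted here.

```python
import math

def gaussian_mask(n, sigma=1.0):
    """n x n mask exp(-(dx^2 + dy^2) / (2*sigma^2)), peaked at the center."""
    c = (n - 1) / 2.0
    return [[math.exp(-((i - c) ** 2 + (j - c) ** 2) / (2 * sigma ** 2))
             for j in range(n)] for i in range(n)]

def apply_mask(kernel, mask):
    """Elementwise product: weights far from the center are attenuated,
    concentrating the kernel's effective receptive field."""
    return [[k * m for k, m in zip(kr, mr)]
            for kr, mr in zip(kernel, mask)]

mask = gaussian_mask(3, sigma=1.0)  # center weight kept, corners damped
```

A smaller sigma concentrates the mask near the center (approaching a 1×1 kernel), while a large sigma recovers the plain square kernel.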
arXiv Detail & Related papers (2023-02-09T10:17:17Z)
- Focal Sparse Convolutional Networks for 3D Object Detection
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant of focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- Deep ensembles in bioimage segmentation
In this work, we propose an ensemble of convolutional neural networks (CNNs).
In ensemble methods, many different models are trained and then used for classification; the ensemble aggregates the outputs of the individual classifiers.
The proposed ensemble is implemented by combining different backbone networks using the DeepLabV3+ and HarDNet environment.
arXiv Detail & Related papers (2021-12-24T05:54:21Z)
- CBIR using Pre-Trained Neural Networks
We use a pretrained Inception V3 model and extract the activations of its last fully connected layer, which form a low-dimensional representation of the image.
This feature matrix is then divided into branches, and separate feature extraction is performed for each branch to obtain multiple features flattened into a vector.
We achieved a training accuracy of 99.46% and a validation accuracy of 84.56%.
arXiv Detail & Related papers (2021-10-27T14:19:48Z)
- Adaptive Convolution Kernel for Artificial Neural Networks
This paper describes a method for training the size of convolutional kernels to provide varying size kernels in a single layer.
Experiments compared the proposed adaptive layers to ordinary convolution layers in a simple two-layer network.
A segmentation experiment on the Oxford-Pets dataset demonstrated that replacing a single ordinary convolution layer in a U-shaped network with a single 7×7 adaptive layer can improve its learning performance and ability to generalize.
arXiv Detail & Related papers (2020-06-22T17:59:07Z)
- Locally Masked Convolution for Autoregressive Models
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
- Binarizing MobileNet via Evolution-based Searching
We propose the use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-04-21T08:55:55Z)
- MixNet: Multi-modality Mix Network for Brain Segmentation
MixNet is a 2D semantic-wise deep convolutional neural network that segments brain structures in MRI images.
MixNetv2 was submitted to the MRBrainS challenge at MICCAI 2018 and won the 3rd place in the 3-label task.
arXiv Detail & Related papers (2020-01-03T00:16:46Z)
- Question Type Classification Methods Comparison
The paper presents a comparative study of state-of-the-art approaches for the question classification task: Logistic Regression, Convolutional Neural Networks (CNN), Long Short-Term Memory Networks (LSTM), and Quasi-Recurrent Neural Networks (QRNN).
All models use pre-trained GloVe word embeddings and are trained on human-labeled data.
The best accuracy is achieved using CNN model with five convolutional layers and various kernel sizes stacked in parallel, followed by one fully connected layer.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
- Model Fusion via Optimal Transport
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.