A contextual analysis of multi-layer perceptron models in classifying
hand-written digits and letters: limited resources
- URL: http://arxiv.org/abs/2107.01782v1
- Date: Mon, 5 Jul 2021 04:30:37 GMT
- Title: A contextual analysis of multi-layer perceptron models in classifying
hand-written digits and letters: limited resources
- Authors: Tidor-Vlad Pricope
- Abstract summary: We extensively test an end-to-end vanilla neural network (MLP) approach in pure numpy without any pre-processing or feature extraction done beforehand.
We show that basic data mining operations can significantly improve the performance of the models in terms of computational time.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classifying hand-written digits and letters has taken a big leap with the
introduction of ConvNets. However, on very constrained hardware the time
necessary to train such models would be high. Our main contribution is twofold.
First, we extensively test an end-to-end vanilla neural network (MLP) approach
in pure numpy without any pre-processing or feature extraction done beforehand.
Second, we show that basic data mining operations can significantly improve the
performance of the models in terms of computational time, without sacrificing
much accuracy. We illustrate our claims on a simpler variant of the Extended
MNIST dataset, called the Balanced EMNIST dataset. Our experiments show that,
without any data mining, we get increased generalization performance when using
more hidden layers and regularization techniques, the best model achieving
84.83% accuracy on a test dataset. Using dimensionality reduction done by PCA
we were able to increase that figure to 85.08% with only 10% of the original
feature space, reducing the memory size needed by 64%. Finally, adding methods
to remove possibly harmful training samples like deviation from the mean helped
us to still achieve over 84% test accuracy but with only 32.8% of the original
memory size for the training set. This compares favorably to the majority of
literature results obtained with similar architectures. Although this approach
is outperformed by state-of-the-art models, it remains comparable to some of
them (AlexNet, VGGNet) when those are trained on 50% of the same dataset.
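To make the setup concrete, below is a minimal sketch of the kind of end-to-end pure-numpy MLP the abstract describes: one hidden ReLU layer and a softmax output trained with plain gradient descent. The layer sizes, learning rate, iteration count, and synthetic stand-in data are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    # He initialization, a common default for ReLU layers
    return rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out)), np.zeros(n_out)

W1, b1 = init_layer(784, 128)   # 28x28 EMNIST images flattened to 784 pixels
W2, b2 = init_layer(128, 47)    # Balanced EMNIST has 47 classes

def forward(X):
    h = np.maximum(X @ W1 + b1, 0.0)             # ReLU hidden layer
    logits = h @ W2 + b2
    logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
    p = np.exp(logits)
    return h, p / p.sum(axis=1, keepdims=True)   # softmax probabilities

# Synthetic stand-in for EMNIST: random "images" and labels
X = rng.random((256, 784))
y = rng.integers(0, 47, 256)

lr = 0.1
for step in range(200):
    h, probs = forward(X)
    # Gradient of mean cross-entropy at the logits: probs - one_hot(y)
    g_logits = probs.copy()
    g_logits[np.arange(len(y)), y] -= 1.0
    g_logits /= len(y)
    # Backpropagate through the two layers
    gW2, gb2 = h.T @ g_logits, g_logits.sum(axis=0)
    g_h = (g_logits @ W2.T) * (h > 0)
    gW1, gb1 = X.T @ g_h, g_h.sum(axis=0)
    # Plain gradient-descent update
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, probs = forward(X)
print("train accuracy:", (probs.argmax(axis=1) == y).mean())
```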
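The PCA step reported in the abstract (85.08% accuracy with only 10% of the original feature space) can likewise be sketched in plain numpy via an SVD of the centered training matrix. The component count of 78 is simply 10% of the 784 pixels; the data and everything else here are illustrative.

```python
import numpy as np

def pca_fit(X, n_components):
    # Principal axes are the top right singular vectors of the centered data
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
X_train = rng.random((1000, 784))        # stand-in for flattened EMNIST images

mean, components = pca_fit(X_train, 78)  # 78 components ~ 10% of 784 pixels
X_reduced = pca_transform(X_train, mean, components)
print(X_reduced.shape)                   # (1000, 78): ~10x fewer features
```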
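Finally, the "deviation from the mean" filtering can be read as discarding training samples that lie far from their class mean. The sketch below assumes a per-class Euclidean-distance criterion with an illustrative threshold; the abstract does not spell out the exact rule, so this is one plausible reading rather than the paper's method.

```python
import numpy as np

def filter_by_class_mean(X, y, max_std=2.0):
    # Keep a sample only if its distance to its class mean is within
    # max_std standard deviations of the class's mean distance.
    keep = np.ones(len(y), dtype=bool)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        dists = np.linalg.norm(X[idx] - X[idx].mean(axis=0), axis=1)
        keep[idx] = dists <= dists.mean() + max_std * dists.std()
    return X[keep], y[keep]

rng = np.random.default_rng(0)
X = rng.random((1000, 784))              # stand-in for the training images
y = rng.integers(0, 47, 1000)
X_small, y_small = filter_by_class_mean(X, y)
print("fraction retained:", len(X_small) / len(X))
```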
Related papers
- More precise edge detections [0.0]
Edge detection (ED) is a fundamental task in computer vision.
Current models still suffer from unsatisfactory precision rates.
Model architectures yielding more precise predictions still need investigation.
arXiv Detail & Related papers (2024-07-29T13:24:55Z)
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find the solutions reachable by our training procedure, including its gradient-based optimizer and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- Gradient-Free Structured Pruning with Unlabeled Data [57.999191898036706]
We propose a gradient-free structured pruning framework that uses only unlabeled data.
The original FLOP count can be reduced by up to 40% with less than a 4% accuracy loss across all tasks considered.
arXiv Detail & Related papers (2023-03-07T19:12:31Z)
- A Meta-Learning Approach to Predicting Performance and Data Requirements [163.4412093478316]
We propose an approach to estimate the number of samples required for a model to reach a target performance.
We find that the power law, the de facto principle to estimate model performance, leads to large error when using a small dataset.
We introduce a novel piecewise power law (PPL) that handles the two data regimes differently.
arXiv Detail & Related papers (2023-03-02T21:48:22Z)
- CoV-TI-Net: Transferred Initialization with Modified End Layer for COVID-19 Diagnosis [5.546855806629448]
Transfer learning is a relatively new learning method that has been employed in many sectors to achieve good performance with fewer computations.
In this research, PyTorch pre-trained models (VGG19_bn and WideResNet-101) are applied to the MNIST dataset.
The proposed model is developed and verified in a Kaggle notebook, and it reaches an accuracy of 99.77% without requiring excessive computational time.
arXiv Detail & Related papers (2022-09-20T08:52:52Z)
- Complementary Ensemble Learning [1.90365714903665]
We derive a technique to improve performance of state-of-the-art deep learning models.
Specifically, we train auxiliary models that are able to complement the uncertainty of state-of-the-art models.
arXiv Detail & Related papers (2021-11-09T03:23:05Z)
- SSSE: Efficiently Erasing Samples from Trained Machine Learning Models [103.43466657962242]
We propose an efficient and effective algorithm, SSSE, for samples erasure.
In certain cases SSSE can erase samples almost as well as the optimal, yet impractical, gold standard of training a new model from scratch with only the permitted data.
arXiv Detail & Related papers (2021-07-08T14:17:24Z)
- Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
arXiv Detail & Related papers (2021-06-18T01:03:13Z)
- Machine learning for complete intersection Calabi-Yau manifolds: a methodological study [0.0]
We revisit the question of predicting the Hodge numbers $h^{1,1}$ and $h^{2,1}$ of complete intersection Calabi-Yau manifolds using machine learning (ML).
We obtain 97% (resp. 99%) accuracy for $h^{1,1}$ using a neural network inspired by the Inception model for the old dataset, using only 30% (resp. 70%) of the data for training.
For the new one, a simple linear regression leads to almost 100% accuracy with 30% of the data for training.
arXiv Detail & Related papers (2020-07-30T19:43:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.