Convolution Neural Network Hyperparameter Optimization Using Simplified
Swarm Optimization
- URL: http://arxiv.org/abs/2103.03995v1
- Date: Sat, 6 Mar 2021 00:23:27 GMT
- Title: Convolution Neural Network Hyperparameter Optimization Using Simplified
Swarm Optimization
- Authors: Wei-Chang Yeh, Yi-Ping Lin, Yun-Chia Liang, Chyh-Ming Lai
- Abstract summary: Convolutional Neural Network (CNN) is widely used in computer vision.
It is not easy to find a network architecture with better performance.
- Score: 2.322689362836168
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Among the machine learning approaches applied in computer vision,
Convolutional Neural Network (CNN) is widely used in the field of image
recognition. However, although existing CNN models have been proven to be
efficient, it is not easy to find a network architecture with better
performance. Some studies choose to optimize the network architecture, while
others choose to optimize the hyperparameters, such as the number and size of
convolutional kernels, convolutional strides, pooling size, etc. Most of them
are designed manually, which requires relevant expertise and takes a lot of
time. Therefore, this study proposes the idea of applying Simplified Swarm
Optimization (SSO) on the hyperparameter optimization of LeNet models while
using MNIST, Fashion MNIST, and Cifar10 as validation. The experimental results
show that the proposed algorithm has higher accuracy than the original LeNet
model, and it only takes a very short time to find a better hyperparameter
configuration after training. In addition, we also analyze the output shape of
the feature map after each layer and, surprisingly, the resulting shapes are mostly
rectangular. The contribution of this study is to provide users with a simpler
way to obtain better results with an existing model, and the approach can also be
applied to other CNN architectures.
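
To make the approach concrete, here is a minimal, hypothetical sketch of how an SSO-style search could drive discrete hyperparameter selection for a LeNet-like network on MNIST using Keras. The search space, the SSO thresholds (Cg, Cp, Cw), the particle count, and the one-epoch fitness evaluation are illustrative assumptions, not the settings reported in the paper.

```python
# Hypothetical sketch of SSO-driven hyperparameter search for a LeNet-like CNN.
# Search space, thresholds, and training budget are illustrative assumptions.
import random
import tensorflow as tf
from tensorflow.keras import layers

# Discrete search space: one option list per hyperparameter dimension.
SPACE = {
    "conv1_filters": [4, 6, 8, 16],
    "conv1_kernel":  [3, 5, 7],
    "conv2_filters": [8, 16, 32],
    "conv2_kernel":  [3, 5, 7],
    "pool_size":     [2, 3],
}
DIMS = list(SPACE.keys())
CG, CP, CW = 0.4, 0.7, 0.9   # SSO thresholds: copy gbest / copy pbest / keep current


def random_solution():
    return {d: random.choice(SPACE[d]) for d in DIMS}


def fitness(sol, x_train, y_train):
    """Validation accuracy of a LeNet-like model after a short training run."""
    model = tf.keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(sol["conv1_filters"], sol["conv1_kernel"],
                      padding="same", activation="relu"),
        layers.MaxPooling2D(sol["pool_size"]),
        layers.Conv2D(sol["conv2_filters"], sol["conv2_kernel"],
                      padding="same", activation="relu"),
        layers.MaxPooling2D(sol["pool_size"]),
        layers.Flatten(),
        layers.Dense(120, activation="relu"),
        layers.Dense(84, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    hist = model.fit(x_train, y_train, epochs=1, batch_size=128,
                     validation_split=0.1, verbose=0)
    return hist.history["val_accuracy"][-1]


def sso_update(sol, pbest, gbest):
    """SSO step: each dimension is copied from the global best, copied from the
    personal best, kept as-is, or resampled, depending on a random draw."""
    new = {}
    for d in DIMS:
        r = random.random()
        if r < CG:
            new[d] = gbest[d]
        elif r < CP:
            new[d] = pbest[d]
        elif r < CW:
            new[d] = sol[d]
        else:
            new[d] = random.choice(SPACE[d])
    return new


def search(n_particles=4, n_iters=3):
    (x, y), _ = tf.keras.datasets.mnist.load_data()
    x = (x[:10000, ..., None] / 255.0).astype("float32")  # small subset for speed
    y = y[:10000]

    swarm = [random_solution() for _ in range(n_particles)]
    pbest = list(swarm)
    pbest_fit = [fitness(s, x, y) for s in swarm]
    gbest = max(zip(pbest, pbest_fit), key=lambda t: t[1])[0]

    for _ in range(n_iters):
        for i, sol in enumerate(swarm):
            swarm[i] = sso_update(sol, pbest[i], gbest)
            f = fitness(swarm[i], x, y)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = swarm[i], f
        gbest = pbest[max(range(n_particles), key=lambda i: pbest_fit[i])]
    return gbest


if __name__ == "__main__":
    print("best hyperparameters found:", search())
```

In this sketch, each dimension of a candidate is either inherited from the global best, inherited from the particle's personal best, kept, or resampled at random, which is the characteristic SSO update rule; the paper applies the same idea to a richer set of LeNet hyperparameters and to MNIST, Fashion MNIST, and Cifar10.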
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses these challenges by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Neural Architecture Search using Particle Swarm and Ant Colony Optimization [0.0]
This paper focuses on training and optimizing CNNs using the Swarm Intelligence (SI) components of OpenNAS.
A system integrating open-source tools for Neural Architecture Search (OpenNAS) has been developed for image classification.
arXiv Detail & Related papers (2024-03-06T15:23:26Z)
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can easily change when the networks are trained better.
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
- Explicit Foundation Model Optimization with Self-Attentive Feed-Forward Neural Units [4.807347156077897]
Iterative approximation methods using backpropagation enable the optimization of neural networks, but they remain computationally expensive when used at scale.
This paper presents an efficient alternative for optimizing neural networks that reduces the costs of scaling neural networks and provides high-efficiency optimizations for low-resource applications.
arXiv Detail & Related papers (2023-11-13T17:55:07Z)
- Receptive Field Refinement for Convolutional Neural Networks Reliably Improves Predictive Performance [1.52292571922932]
We present a new approach to receptive field analysis that can yield these types of theoretical and empirical performance gains.
Our approach is able to improve ImageNet1K performance across a wide range of well-known, state-of-the-art (SOTA) model classes.
arXiv Detail & Related papers (2022-11-26T05:27:44Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint.
Generalized from the sample-wise analysis into the real batch setting, NIO is able to automatically look for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z)
- Greedy Network Enlarging [53.319011626986004]
We propose a greedy network enlarging method based on the reallocation of computations.
By modifying the computations at different stages step by step, the enlarged network is equipped with an optimal allocation and utilization of MACs.
With application of our method on GhostNet, we achieve state-of-the-art 80.9% and 84.3% ImageNet top-1 accuracies.
arXiv Detail & Related papers (2021-07-31T08:36:30Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 makes up a family of state-of-the-art compact neural networks that outperform both automatically and manually-designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose a use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we leverage the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
- Inferring Convolutional Neural Networks' accuracies from their architectural characterizations [0.0]
We study the relationships between a CNN's architecture and its performance.
We show that the attributes can be predictive of the networks' performance in two specific computer vision-based physics problems.
We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training.
arXiv Detail & Related papers (2020-01-07T16:41:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.