Optimal Hyperparameters and Structure Setting of Multi-Objective Robust
CNN Systems via Generalized Taguchi Method and Objective Vector Norm
- URL: http://arxiv.org/abs/2202.04567v2
- Date: Thu, 10 Feb 2022 18:24:53 GMT
- Authors: Sheng-Guo Wang and Shanshan Jiang (The University of North Carolina at
Charlotte)
- Abstract summary: Machine Learning, Artificial Intelligence, and Convolutional Neural Network (CNN) have made huge progress with broad applications.
These systems may have multi-objective ML and AI performance needs.
There is a key requirement to find the optimal hyperparameters and structures for multi-objective robust optimal CNN systems.
- Score: 0.587414205988452
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, Machine Learning (ML), Artificial Intelligence (AI), and
Convolutional Neural Network (CNN) have made huge progress with broad
applications, where their systems have deep learning structures and a large
number of hyperparameters that determine the quality and performance of the
CNNs and AI systems. These systems may have multi-objective ML and AI
performance needs. There is a key requirement to find the optimal
hyperparameters and structures for multi-objective robust optimal CNN systems.
This paper proposes a generalized Taguchi approach to effectively determine the
optimal hyperparameters and structure for the multi-objective robust optimal
CNN systems via their objective performance vector norm. The proposed approach
and methods are applied to a CNN classification system with the original ResNet
on the CIFAR-10 dataset as a demonstration and validation, showing that the
proposed methods are highly effective in achieving an optimal accuracy rate for
the original ResNet on CIFAR-10.
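The selection rule described in the abstract (pick the hyperparameter setting whose objective performance vector has the smallest norm) can be sketched as follows. The candidate settings, objective values, and the choice of the Euclidean norm are illustrative assumptions, not figures from the paper:

```python
import math

# Each candidate hyperparameter setting maps to an objective performance
# vector. Objectives are framed so that smaller is better, e.g.
# (classification error, validation loss, robustness penalty).
# All values below are fabricated for demonstration.
candidates = {
    ("lr=0.1", "depth=20"): (0.085, 0.31, 0.12),
    ("lr=0.01", "depth=32"): (0.072, 0.28, 0.15),
    ("lr=0.1", "depth=56"): (0.069, 0.27, 0.10),
}

def objective_norm(vec):
    """Euclidean (L2) norm of an objective performance vector."""
    return math.sqrt(sum(v * v for v in vec))

# The selected setting minimizes the norm of its objective vector,
# collapsing the multi-objective comparison into a single scalar.
best = min(candidates, key=lambda k: objective_norm(candidates[k]))
print(best)
```

In a Taguchi-style study, the candidate set would come from an orthogonal array over hyperparameter levels rather than being enumerated by hand, keeping the number of trained networks small.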
Related papers
- QADM-Net: Multi-Level Quality-Adaptive Dynamic Network for Reliable Multimodal Classification [57.08108545219043]
Current multimodal classification methods lack dynamic networks that adapt depth and parameters to each sample, which limits reliable inference.
We propose the Multi-Level Quality-Adaptive Dynamic Multimodal Network (QADM-Net).
Experiments conducted on four datasets demonstrate that QADM-Net significantly outperforms state-of-the-art methods in classification performance and reliability.
arXiv Detail & Related papers (2024-12-19T03:26:51Z)
- Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing [53.77822620185878]
We propose a synergistic methodology to concurrently optimize perovskite memristor fabrication and develop robust analog DNNs.
We develop "BayesMulti", a training strategy utilizing BO-guided noise injection to improve the resistance of analog DNNs to memristor imperfections.
Our integrated approach enables use of analog computing in much deeper and wider networks, achieving up to 100-fold improvements.
arXiv Detail & Related papers (2024-12-03T19:20:08Z)
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can be easily changed simply by training the networks better.
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
- Towards Hyperparameter-Agnostic DNN Training via Dynamical System Insights [4.513581513983453]
We present a first-order optimization method specialized for deep neural networks (DNNs), ECCO-DNN.
This method models the optimization variable trajectory as a dynamical system and develops a discretization algorithm that adaptively selects step sizes based on the trajectory's shape.
arXiv Detail & Related papers (2023-10-21T03:45:13Z)
- Bayesian Hyperparameter Optimization for Deep Neural Network-Based Network Intrusion Detection [2.304713283039168]
Deep neural networks (DNN) have been successfully applied for intrusion detection problems.
This paper proposes a novel Bayesian optimization-based framework for the automatic optimization of hyperparameters.
We show that the proposed framework demonstrates significantly higher intrusion detection performance than the random search optimization-based approach.
arXiv Detail & Related papers (2022-07-07T20:08:38Z)
- Towards Enabling Dynamic Convolution Neural Network Inference for Edge Intelligence [0.0]
Recent advances in edge intelligence require CNN inference on edge networks to increase throughput and reduce latency.
To provide flexibility, dynamic parameter allocation to different mobile devices is required to implement either a predefined CNN architecture or one defined on the fly.
We propose a library-based approach to design scalable and dynamic distributed CNN inference on the fly.
arXiv Detail & Related papers (2022-02-18T22:33:42Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Convolution Neural Network Hyperparameter Optimization Using Simplified Swarm Optimization [2.322689362836168]
Convolutional Neural Network (CNN) is widely used in computer vision.
It is not easy to find a network architecture with better performance.
arXiv Detail & Related papers (2021-03-06T00:23:27Z)
- A Meta-Learning Approach to the Optimal Power Flow Problem Under Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
- Inferring Convolutional Neural Networks' accuracies from their architectural characterizations [0.0]
We study the relationships between a CNN's architecture and its performance.
We show that the attributes can be predictive of the networks' performance in two specific computer vision-based physics problems.
We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training.
arXiv Detail & Related papers (2020-01-07T16:41:58Z)
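Several of the papers above share a common pattern: predicting a network's performance from cheap-to-compute attributes before committing to full training. A minimal stdlib-only sketch of that idea follows; the features, labels, and the 1-nearest-neighbor rule are illustrative assumptions, not the method of any listed paper:

```python
import math

# Toy dataset: architecture attributes (depth, width, params in millions)
# paired with whether the trained network exceeded a threshold accuracy.
# All values are fabricated for illustration.
train = [
    ((20, 64, 0.27), True),
    ((8, 16, 0.05), False),
    ((56, 64, 0.85), True),
    ((4, 8, 0.01), False),
]

def predict(features):
    """Predict the label of the nearest training example (1-NN,
    Euclidean distance in attribute space)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = min(train, key=lambda ex: dist(ex[0], features))
    return nearest[1]

# Query an unseen architecture before spending any training compute.
print(predict((32, 64, 0.46)))
```

A real study would use richer features and a properly validated model, but the workflow is the same: score candidate architectures first, then train only the promising ones.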
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.