Dual-Individual Genetic Algorithm: A Dual-Individual Approach for Efficient Training of Multi-Layer Neural Networks
- URL: http://arxiv.org/abs/2504.17346v1
- Date: Thu, 24 Apr 2025 08:04:08 GMT
- Title: Dual-Individual Genetic Algorithm: A Dual-Individual Approach for Efficient Training of Multi-Layer Neural Networks
- Authors: Tran Thuy Nga Truong, Jooyong Kim
- Abstract summary: This paper introduces an enhanced Genetic Algorithm technique to optimize neural networks for binary image classification tasks. The Dual-Individual Genetic Algorithm employs only two individuals for crossover, represented by two parameter sets: Leader and Follower. Experimental results show that the Dual-Individual GA achieves 99.04% training accuracy and 80% testing accuracy (cost = 0.034) on a three-layer network with architecture [12288, 17, 4, 1].
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces an enhanced Genetic Algorithm technique called Dual-Individual Genetic Algorithm (Dual-Individual GA), which optimizes neural networks for binary image classification tasks, such as cat vs. non-cat classification. The proposed method employs only two individuals for crossover, represented by two parameter sets: Leader and Follower. The Leader focuses on exploitation, representing the primary optimal solution at even-indexed positions (0, 2, 4, ...), while the Follower promotes exploration by preserving diversity and avoiding premature convergence, operating at odd-indexed positions (1, 3, 5, ...). Leader and Follower are modeled as two phases or roles. The key contributions of this work are threefold: (1) a self-adaptive layer-dimension mechanism that eliminates the need for manual tuning of layer architectures; (2) the generation of two parameter sets, Leader and Follower, with 10 layer-architecture configurations (5 per set), ranked post-optimization by Pareto dominance and cost; and (3) superior performance compared to traditional gradient-based methods. Experimental results show that the Dual-Individual GA achieves 99.04% training accuracy and 80% testing accuracy (cost = 0.034) on a three-layer network with architecture [12288, 17, 4, 1], outperforming a gradient-based approach that achieves 98% training accuracy and 80% testing accuracy (cost = 0.092) on a four-layer network with architecture [12288, 20, 7, 5, 1]. These findings highlight the efficiency and effectiveness of the proposed method in optimizing neural networks.
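The abstract fully specifies the even/odd gene split between Leader and Follower, so the core loop can be sketched directly. Below is a minimal Python sketch of that dual-individual crossover; everything beyond the even/odd split (the Gaussian mutation, the replace-the-weaker survivor rule, and the names `dual_crossover`, `mutate`, `evolve`) is an illustrative assumption, not the paper's implementation, and `cost_fn` stands in for the network's training cost.

```python
import numpy as np

rng = np.random.default_rng(0)

def dual_crossover(leader: np.ndarray, follower: np.ndarray) -> np.ndarray:
    """Offspring inherits the Leader's genes at even-indexed positions
    (exploitation) and the Follower's at odd-indexed positions
    (exploration), as described in the abstract."""
    child = np.empty_like(leader)
    child[0::2] = leader[0::2]    # even indices: 0, 2, 4, ...
    child[1::2] = follower[1::2]  # odd indices: 1, 3, 5, ...
    return child

def mutate(genes: np.ndarray, rate: float = 0.05, scale: float = 0.1) -> np.ndarray:
    """Gaussian perturbation of a random subset of genes (an assumed
    operator; the abstract does not specify the mutation scheme)."""
    mask = rng.random(genes.size) < rate
    return genes + mask * rng.normal(0.0, scale, genes.size)

def evolve(cost_fn, dim: int, generations: int = 500) -> np.ndarray:
    """Two-individual loop: the lower-cost vector acts as Leader, the other
    as Follower; the offspring replaces the Follower when it is better
    (an assumed survivor-selection rule)."""
    pop = [rng.standard_normal(dim) * 0.01, rng.standard_normal(dim) * 0.01]
    for _ in range(generations):
        pop.sort(key=cost_fn)  # pop[0] = Leader, pop[1] = Follower
        child = mutate(dual_crossover(pop[0], pop[1]))
        if cost_fn(child) < cost_fn(pop[1]):
            pop[1] = child
    return min(pop, key=cost_fn)

# Toy check on a quadratic cost (a stand-in for the network's cross-entropy cost):
best = evolve(lambda w: float(np.sum(w ** 2)), dim=10)
```

In the paper's setting the gene vector would be a flattened set of network weights and `cost_fn` the classification cost; with only two individuals per generation, the cost of a generation is a small constant number of forward passes, which is the efficiency argument the abstract makes.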
Related papers
- AHDMIL: Asymmetric Hierarchical Distillation Multi-Instance Learning for Fast and Accurate Whole-Slide Image Classification [51.525891360380285]
AHDMIL is an Asymmetric Hierarchical Distillation Multi-Instance Learning framework. It eliminates irrelevant patches through a two-step training process. It consistently outperforms previous state-of-the-art methods in both classification performance and inference speed.
arXiv Detail & Related papers (2025-08-07T07:47:16Z) - DNAD: Differentiable Neural Architecture Distillation [6.026956571669411]
The differentiable neural architecture distillation (DNAD) algorithm is developed around two core ideas: search by deleting and search by imitating.
DNAD achieves a top-1 error rate of 23.7% on ImageNet classification with a model of 6.0M parameters and 598M FLOPs.
The super-network progressive shrinking (SNPS) algorithm is developed based on the framework of differentiable architecture search (DARTS).
arXiv Detail & Related papers (2025-04-25T08:49:31Z) - Boosting the Efficiency of Parametric Detection with Hierarchical Neural Networks [4.1410005218338695]
We propose the Hierarchical Detection Network (HDN), a novel approach to efficient detection.
The network is trained using a novel loss function, which encodes simultaneously the goals of statistical accuracy and efficiency.
We show how training a three-layer HDN using a two-layer model can further boost both accuracy and efficiency.
arXiv Detail & Related papers (2022-07-23T19:23:00Z) - On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms are restricted to the single-machine setting and are incapable of handling distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradient estimators.
arXiv Detail & Related papers (2022-06-30T05:29:52Z) - Binarizing by Classification: Is soft function really necessary? [4.329951775163721]
We propose to tackle network binarization as a binary classification problem.
We also take binarization as a lightweighting approach for pose estimation models.
The proposed method enables binary networks to achieve an mAP of up to 60.6 for the first time.
arXiv Detail & Related papers (2022-05-16T02:47:41Z) - Optimization of Residual Convolutional Neural Network for Electrocardiogram Classification [0.9281671380673306]
We propose to optimize the residual one-dimensional convolutional neural network model (R-1D-CNN) at two levels.
At the first level, a residual convolutional layer and one-dimensional convolutional neural layers are trained to learn patient-specific ECG features.
The second level is automatic and based on a proposed algorithm that uses Bayesian optimization (BO).
arXiv Detail & Related papers (2021-12-11T16:52:23Z) - Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z) - Unsupervised Domain-adaptive Hash for Networks [81.49184987430333]
Domain-adaptive hash learning has enjoyed considerable success in the computer vision community.
We develop an unsupervised domain-adaptive hash learning method for networks, dubbed UDAH.
arXiv Detail & Related papers (2021-08-20T12:09:38Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - A block coordinate descent optimizer for classification problems exploiting convexity [0.0]
We introduce a coordinate descent method for deep linear networks on classification tasks that exploits the convexity of the cross-entropy loss in the weights of the hidden layer.
By alternating between a second-order method that finds globally optimal parameters for the linear layer and gradient descent on the hidden layers, we ensure an optimal fit of the adaptive basis to the data throughout training.
arXiv Detail & Related papers (2020-06-17T19:49:06Z) - DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z) - MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method learns more representative voxel-level features than conventional methods trained with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z) - Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose the use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z) - Training Binary Neural Networks with Real-to-Binary Convolutions [52.91164959767517]
We show how to train binary networks to within a few percentage points of their full-precision counterparts.
We show how to build a strong baseline, which already achieves state-of-the-art accuracy.
We show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet.
arXiv Detail & Related papers (2020-03-25T17:54:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.