GeneCAI: Genetic Evolution for Acquiring Compact AI
- URL: http://arxiv.org/abs/2004.04249v2
- Date: Tue, 14 Apr 2020 04:35:42 GMT
- Title: GeneCAI: Genetic Evolution for Acquiring Compact AI
- Authors: Mojan Javaheripi, Mohammad Samragh, Tara Javidi, Farinaz Koushanfar
- Abstract summary: Deep Neural Networks (DNNs) are evolving towards more complex architectures to achieve higher inference accuracy.
Model compression techniques can be leveraged to efficiently deploy such compute-intensive architectures on resource-limited mobile devices.
This paper introduces GeneCAI, a novel optimization method that automatically learns how to tune per-layer compression hyper-parameters.
- Score: 36.04715576228068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the contemporary big data realm, Deep Neural Networks (DNNs) are evolving
towards more complex architectures to achieve higher inference accuracy. Model
compression techniques can be leveraged to efficiently deploy such
compute-intensive architectures on resource-limited mobile devices. Such
methods comprise various hyper-parameters that require per-layer customization
to ensure high accuracy. Choosing such hyper-parameters is cumbersome as the
pertinent search space grows exponentially with model layers. This paper
introduces GeneCAI, a novel optimization method that automatically learns how
to tune per-layer compression hyper-parameters. We devise a bijective
translation scheme that encodes compressed DNNs to the genotype space. The
optimality of each genotype is measured using a multi-objective score based on
accuracy and number of floating point operations. We develop customized genetic
operations to iteratively evolve the non-dominated solutions towards the
optimal Pareto front, thus, capturing the optimal trade-off between model
accuracy and complexity. GeneCAI optimization method is highly scalable and can
achieve a near-linear performance boost on distributed multi-GPU platforms. Our
extensive evaluations demonstrate that GeneCAI outperforms existing rule-based
and reinforcement learning methods in DNN compression by finding models that
lie on a better accuracy-complexity Pareto curve.
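
The paper itself does not include code; the following minimal Python sketch shows the shape of the loop the abstract describes. Per-layer pruning rates serve as the genotype, a toy (accuracy, FLOPs) proxy stands in for real model evaluation, and non-dominated filtering drives the population toward a Pareto front. The names, the fitness proxy, and the simple one-point crossover and Gaussian mutation are illustrative assumptions, not taken from the paper.

```python
import random

NUM_LAYERS = 20      # genotype length: one pruning rate per DNN layer
POP_SIZE = 40
GENERATIONS = 30

def random_genotype():
    # The "bijective translation": a compressed model <-> its vector of
    # per-layer pruning rates in [0, 0.9].
    return [random.uniform(0.0, 0.9) for _ in range(NUM_LAYERS)]

def evaluate(genotype):
    """Toy proxy for compressing the DNN and measuring (accuracy, FLOPs).
    A real evaluation would apply each per-layer rate, briefly fine-tune,
    and run validation; this proxy merely rewards moderate, even pruning."""
    mean = sum(genotype) / NUM_LAYERS
    var = sum((r - mean) ** 2 for r in genotype) / NUM_LAYERS
    accuracy = 0.95 - 0.25 * mean ** 2 - 0.1 * var
    flops = 1.0 - mean          # relative FLOPs remaining after pruning
    return accuracy, flops

def dominates(a, b):
    # a dominates b if it is at least as good in both objectives
    # (higher accuracy, lower FLOPs) and strictly better in one.
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])

def pareto_front(scored):
    return [(g, s) for g, s in scored
            if not any(dominates(t, s) for _, t in scored if t is not s)]

def crossover(p1, p2):
    cut = random.randrange(1, NUM_LAYERS)   # one-point crossover
    return p1[:cut] + p2[cut:]

def mutate(g, sigma=0.05):
    return [min(0.9, max(0.0, r + random.gauss(0.0, sigma))) for r in g]

population = [random_genotype() for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    scored = [(g, evaluate(g)) for g in population]
    parents = [g for g, _ in pareto_front(scored)][:POP_SIZE // 2]
    while len(parents) < 2:
        parents.append(random_genotype())
    children = [mutate(crossover(*random.sample(parents, 2)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

final = sorted(pareto_front([(g, evaluate(g)) for g in population]),
               key=lambda gs: gs[1][1])
for _, (acc, fl) in final:
    print(f"accuracy={acc:.3f}  relative FLOPs={fl:.3f}")
```

Because each genotype is evaluated independently, the evaluation step parallelizes trivially across workers, which is the intuition behind the near-linear multi-GPU scaling claimed above.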
Related papers
- Automatically Learning Hybrid Digital Twins of Dynamical Systems [56.69628749813084]
Digital Twins (DTs) simulate the states and temporal dynamics of real-world systems.
DTs often struggle to generalize to unseen conditions in data-scarce settings.
In this paper, we propose an evolutionary algorithm ($\textbf{HDTwinGen}$) to autonomously propose, evaluate, and optimize hybrid digital twins (HDTwins).
arXiv Detail & Related papers (2024-10-31T07:28:22Z)
- Discovering Physics-Informed Neural Networks Model for Solving Partial Differential Equations through Evolutionary Computation [5.8407437499182935]
This article proposes an evolutionary computation method aimed at discovering physics-informed neural network (PINN) models with higher approximation accuracy and a faster convergence rate.
In experiments, the performance of models found via Bayesian optimization, random search, and evolution is compared on the Klein-Gordon, Burgers, and Lamé equations.
arXiv Detail & Related papers (2024-05-18T07:32:02Z)
- Fast Genetic Algorithm for feature selection -- A qualitative approximation approach [5.279268784803583]
We propose a two-stage surrogate-assisted evolutionary approach to address the computational cost of using a Genetic Algorithm (GA) for feature selection.
We show that our approach, CHCQX, converges faster to feature-subset solutions of significantly higher accuracy, particularly for large datasets with over 100K instances.
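
As a rough illustration of the two-stage idea (a generic surrogate-assisted GA, not the authors' CHCQX algorithm), the sketch below ranks candidates with a cheap, noisy fitness proxy and spends full evaluations only on the top candidates. The feature weights and noise model are invented for illustration.

```python
import random

WEIGHTS = [random.uniform(-0.2, 1.0) for _ in range(30)]   # toy feature utilities

def full_fitness(mask):
    # Stand-in for cross-validating a model on the full dataset.
    return sum(w for w, m in zip(WEIGHTS, mask) if m) - 0.05 * sum(mask)

def surrogate_fitness(mask):
    # Cheap, noisy approximation (e.g., evaluated on a small data sample).
    return full_fitness(mask) + random.gauss(0.0, 0.3)

pop = [[random.randint(0, 1) for _ in WEIGHTS] for _ in range(40)]
for _ in range(20):
    # Stage 1: rank everyone with the cheap surrogate ...
    pop.sort(key=surrogate_fitness, reverse=True)
    # Stage 2: ... and spend full evaluations only on the top candidates.
    elites = sorted(pop[:10], key=full_fitness, reverse=True)[:5]
    children = []
    for _ in range(len(pop) - len(elites)):
        a, b = random.sample(elites, 2)
        child = [random.choice(pair) for pair in zip(a, b)]  # uniform crossover
        i = random.randrange(len(child))
        child[i] ^= 1                                        # bit-flip mutation
        children.append(child)
    pop = elites + children

best = max(pop, key=full_fitness)
print("selected features:", [i for i, m in enumerate(best) if m])
```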
arXiv Detail & Related papers (2024-04-05T10:15:24Z)
- Hybrid Genetic Algorithm and Hill Climbing Optimization for the Neural Network [0.0]
We propose a hybrid model that combines a genetic algorithm with hill climbing to optimize Convolutional Neural Networks (CNNs) on the CIFAR-100 dataset.
The proposed hybrid model achieves better accuracy in fewer generations than the standard algorithms alone.
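
A minimal sketch of such a hybrid follows (not the authors' code; the toy fitness function stands in for CNN validation accuracy): each GA offspring is refined by a short hill-climbing pass that keeps a random perturbation only when it improves fitness.

```python
import random

def fitness(x):
    # Toy stand-in for CNN validation accuracy as a function of a
    # real-valued hyper-parameter vector x (maximized at x_i = 0.5).
    return -sum((xi - 0.5) ** 2 for xi in x)

def hill_climb(x, steps=10, step_size=0.05):
    # Local refinement: accept a random perturbation only if it improves fitness.
    best, best_f = x, fitness(x)
    for _ in range(steps):
        cand = [xi + random.uniform(-step_size, step_size) for xi in best]
        f = fitness(cand)
        if f > best_f:
            best, best_f = cand, f
    return best

pop = [[random.random() for _ in range(4)] for _ in range(20)]
for _ in range(15):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]
    children = []
    for _ in range(10):
        a, b = random.sample(parents, 2)
        child = [random.choice(pair) for pair in zip(a, b)]  # uniform crossover
        children.append(hill_climb(child))                   # GA step + local search
    pop = parents + children

print(f"best fitness: {fitness(max(pop, key=fitness)):.4f}")
```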
arXiv Detail & Related papers (2023-08-24T22:03:18Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly-detection scheme for hierarchical edge computing (HEC).
We first construct multiple anomaly-detection DNN models of increasing complexity and associate each with a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
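
A simplified illustration of that formulation (an epsilon-greedy tabular bandit in place of the paper's reinforcement-learning policy network; the contexts, rewards, and complexity cost are invented): the learner routes each input to one of K detectors and updates a running value estimate per (context, model) pair.

```python
import random
from collections import defaultdict

K = 3      # anomaly detectors of increasing complexity, one per HEC layer
EPS = 0.1  # exploration rate

def draw_context():
    # Toy context: a coarse "how hard does this input look" bucket.
    return random.randrange(3)

def reward(context, arm):
    # Toy reward: detection quality minus a cost that grows with model
    # complexity; harder inputs (larger context) need bigger models.
    quality = 1.0 if arm >= context else 0.3
    return quality - 0.15 * arm + random.gauss(0.0, 0.05)

value = defaultdict(float)   # running mean reward per (context, arm)
count = defaultdict(int)

for _ in range(5000):
    ctx = draw_context()
    if random.random() < EPS:
        arm = random.randrange(K)                            # explore
    else:
        arm = max(range(K), key=lambda a: value[(ctx, a)])   # exploit
    r = reward(ctx, arm)
    count[(ctx, arm)] += 1
    value[(ctx, arm)] += (r - value[(ctx, arm)]) / count[(ctx, arm)]

for ctx in range(3):
    best = max(range(K), key=lambda a: value[(ctx, a)])
    print(f"context {ctx}: route to model {best}")
```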
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
- Understanding Overparameterization in Generative Adversarial Networks [56.57403335510056]
Training Generative Adversarial Networks (GANs) requires solving non-concave min-max optimization problems.
Prior theory has highlighted the role of gradient descent-ascent (GDA) in reaching globally optimal solutions.
We show that in an overparameterized GAN with a $1$-layer neural network generator and a linear discriminator, GDA converges to a global saddle point of the underlying non-concave min-max problem.
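
For reference, the simultaneous GDA updates on a min-max objective $f(\theta, \phi)$ take the textbook form below (standard notation, not this paper's):

```latex
\theta_{t+1} = \theta_t - \eta\,\nabla_{\theta} f(\theta_t, \phi_t), \qquad
\phi_{t+1}  = \phi_t + \eta\,\nabla_{\phi} f(\theta_t, \phi_t)
```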
arXiv Detail & Related papers (2021-04-12T16:23:37Z)
- Genetic-algorithm-optimized neural networks for gravitational wave classification [0.0]
We propose a new method for hyperparameter optimization based on genetic algorithms (GAs).
We show that the GA can discover high-quality architectures even when the initial hyperparameter seed values are far from a good solution.
Using genetic algorithm optimization to refine an existing network should be especially useful if the problem context changes.
arXiv Detail & Related papers (2020-10-09T03:14:20Z)
- A Study of Genetic Algorithms for Hyperparameter Optimization of Neural Networks in Machine Translation [0.0]
We propose an automatic tuning method, modeled after Darwinian survival of the fittest, via a Genetic Algorithm.
Results show that the proposed GA outperforms random selection of hyperparameters.
arXiv Detail & Related papers (2020-09-15T02:24:16Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds while retaining theoretical convergence guarantees.
Experiments on several benchmark datasets demonstrate its effectiveness and corroborate the theory.
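
For context, AUC maximization targets the objective below: the probability that a random positive example outranks a random negative one under the scoring function $f$ (a textbook definition, not this paper's notation):

```latex
\mathrm{AUC}(f) = \Pr\!\big(f(x^{+}) > f(x^{-})\big),
\qquad x^{+} \sim \mathcal{D}^{+},\; x^{-} \sim \mathcal{D}^{-}
```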
arXiv Detail & Related papers (2020-05-05T18:08:23Z)