Related papers: Towards Hyperparameter-Agnostic DNN Training via Dynamical System Insights

URL: http://arxiv.org/abs/2310.13901v1
Date: Sat, 21 Oct 2023 03:45:13 GMT
Title: Towards Hyperparameter-Agnostic DNN Training via Dynamical System Insights
Authors: Carmel Fiscko, Aayushya Agarwal, Yihan Ruan, Soummya Kar, Larry Pileggi, and Bruno Sinopoli
Abstract summary: We present a first-order optimization method specialized for deep neural networks (DNNs), ECCO-DNN. This method models the optimization variable trajectory as a dynamical system and develops a discretization algorithm that adaptively selects step sizes based on the trajectory's shape.
Score: 4.513581513983453
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present a stochastic first-order optimization method specialized for deep neural networks (DNNs), ECCO-DNN. This method models the optimization variable trajectory as a dynamical system and develops a discretization algorithm that adaptively selects step sizes based on the trajectory's shape. This provides two key insights: designing the dynamical system for fast continuous-time convergence and developing a time-stepping algorithm to adaptively select step sizes based on principles of numerical integration and neural network structure. The result is an optimizer with performance that is insensitive to hyperparameter variations and that achieves comparable performance to state-of-the-art optimizers including ADAM, SGD, RMSProp, and AdaGrad. We demonstrate this in training DNN models and datasets, including CIFAR-10 and CIFAR-100 using ECCO-DNN and find that ECCO-DNN's single hyperparameter can be changed by three orders of magnitude without affecting the trained models' accuracies. ECCO-DNN's insensitivity reduces the data and computation needed for hyperparameter tuning, making it advantageous for rapid prototyping and for applications with new datasets. To validate the efficacy of our proposed optimizer, we train an LSTM architecture on a household power consumption dataset with ECCO-DNN and achieve an optimal mean-square-error without tuning hyperparameters.
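The abstract's idea of treating optimization as a dynamical system and choosing step sizes by numerical-integration principles can be illustrated with a minimal sketch. This is not the ECCO-DNN update rule, which the abstract does not specify. As a stand-in, the sketch integrates the gradient flow dx/dt = -grad f(x) with Forward Euler and picks each step size from a step-doubling estimate of the local error. The toy quadratic objective, the names `adaptive_gradient_flow`, `tol`, `grad_tol`, and `h_min`, and the halve/double schedule are all assumptions made for illustration.

```python
import math


def loss(x, curvatures):
    """Toy objective f(x) = 0.5 * sum(c_i * x_i**2); an assumption, not a DNN loss."""
    return 0.5 * sum(c * xi * xi for c, xi in zip(curvatures, x))


def grad(x, curvatures):
    """Gradient of the toy objective."""
    return [c * xi for c, xi in zip(curvatures, x)]


def euler_step(x, h, curvatures):
    """One Forward Euler step of the gradient flow dx/dt = -grad f(x)."""
    return [xi - h * gi for xi, gi in zip(x, grad(x, curvatures))]


def adaptive_gradient_flow(x0, curvatures, h0=1.0, tol=1e-3,
                           max_steps=500, grad_tol=1e-8, h_min=1e-12):
    """Integrate the gradient flow with an error-controlled step size.

    A step of size h is compared with two steps of size h/2. The step is
    accepted only if the two results agree within `tol` and the loss
    decreases; otherwise h is halved. After a very accurate step, h is doubled.
    """
    x, h = list(x0), h0
    steps, losses = [], [loss(x, curvatures)]
    for _ in range(max_steps):
        grad_norm = math.sqrt(sum(g * g for g in grad(x, curvatures)))
        if grad_norm < grad_tol:
            break
        while True:
            full = euler_step(x, h, curvatures)
            half = euler_step(euler_step(x, h / 2, curvatures), h / 2, curvatures)
            # Step-doubling estimate of the local discretization error.
            err = max(abs(a - b) for a, b in zip(full, half))
            if err <= tol and loss(half, curvatures) < losses[-1]:
                break
            h *= 0.5
            if h < h_min:
                # Step size collapsed; stop rather than loop forever.
                return x, steps, losses
        x = half
        steps.append(h)
        losses.append(loss(x, curvatures))
        if err < 0.25 * tol:
            h *= 2.0  # trajectory is locally smooth, so try a larger step
    return x, steps, losses


if __name__ == "__main__":
    x, steps, losses = adaptive_gradient_flow([1.0, 1.0], [1.0, 10.0])
    print("accepted steps:", len(steps))
    print("smallest and largest step:", min(steps), max(steps))
    print("final loss:", losses[-1])
```

In this sketch the initial step size `h0` mostly affects how many halvings occur on the first iteration, since the error test rescales it immediately. That gives a rough intuition for why an integration-based step rule can be less sensitive to its starting hyperparameter, though it says nothing about the sensitivity results reported for ECCO-DNN itself.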

Related papers

Efficient Fault Detection in WSN Based on PCA-Optimized Deep Neural Network Slicing Trained with GOA [0.6827423171182154]
Traditional fault detection methods often struggle with optimizing deep neural networks (DNNs) for efficient performance. This study proposes a novel hybrid method combining Principal Component Analysis (PCA) with a DNN optimized by the Grasshopper Optimization Algorithm (GOA) to address these limitations. Our approach achieves a remarkable 99.72% classification accuracy, with exceptional precision and recall, outperforming conventional methods.
arXiv Detail & Related papers (2025-05-11T15:51:56Z)
An Attempt to Devise a Pairwise Ising-Type Maximum Entropy Model Integrated Cost Function for Optimizing SNN Deployment [0.0]
Spiking Neural Networks (SNNs) emulate the spiking behavior of biological neurons and are typically deployed on distributed-memory neuromorphic hardware. We model SNN dynamics using an Ising-type pairwise interaction framework, bridging microscopic neuron interactions with macroscopic network behavior. We evaluate our approach on two SNNs deployed on the sPyNNaker neuromorphic platform.
arXiv Detail & Related papers (2024-07-09T16:33:43Z)
RLEEGNet: Integrating Brain-Computer Interfaces with Adaptive AI for Intuitive Responsiveness and High-Accuracy Motor Imagery Classification [0.0]
We introduce a framework that leverages Reinforcement Learning with Deep Q-Networks (DQN) for classification tasks. We present a preprocessing technique for multiclass motor imagery (MI) classification in a One-Versus-The-Rest (OVR) manner. The integration of DQN with a 1D-CNN-LSTM architecture optimizes the decision-making process in real time.
arXiv Detail & Related papers (2024-02-09T02:03:13Z)
Dynamically configured physics-informed neural network in topology optimization applications [4.403140515138818]
The physics-informed neural network (PINN) can avoid generating enormous amounts of data when solving forward problems. A dynamically configured PINN-based topology optimization (DCPINN-TO) method is proposed. The accuracy of the displacement prediction and optimization results indicate that the DCPINN-TO method is effective and efficient.
arXiv Detail & Related papers (2023-12-12T05:35:30Z)
Towards A Flexible Accuracy-Oriented Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms [0.49157446832511503]
This paper presents a deep learning module inference latency prediction framework. It hosts a set of customizable input parameters to train multiple different RMs per DNN module. It automatically selects a set of trained RMs leading to the highest possible overall prediction accuracy.
arXiv Detail & Related papers (2023-12-11T15:15:48Z)
A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs). MEMTL outperforms benchmark methods in both inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC). We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them with a corresponding HEC layer. Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
A novel Deep Neural Network architecture for non-linear system identification [78.69776924618505]
We present a novel Deep Neural Network (DNN) architecture for non-linear system identification. Inspired by fading memory systems, we introduce inductive bias (on the architecture) and regularization (on the loss function). This architecture allows for automatic complexity selection based solely on available data.
arXiv Detail & Related papers (2021-06-06T10:06:07Z)
Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment. We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes. Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs). It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously. This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
arXiv Detail & Related papers (2021-02-15T19:36:18Z)
Fusion-Catalyzed Pruning for Optimizing Deep Learning on Intelligent Edge Devices [9.313154178072049]
We present a novel fusion-parametric pruning approach, called FuPruner, for accelerating neural networks. We introduce an aggressive fusion method to equivalently transform a model, which extends the optimization space of pruning. FuPruner provides optimization options for controlling fusion and pruning, allowing much more flexible performance-accuracy trade-offs to be made.
arXiv Detail & Related papers (2020-10-30T10:10:08Z)
Automatic Remaining Useful Life Estimation Framework with Embedded Convolutional LSTM as the Backbone [5.927250637620123]
We propose a new LSTM variant called embedded convolutional LSTM (ECLSTM). In ECLSTM, a group of different 1D convolutions is embedded into the LSTM structure. Through this, the temporal information is preserved between and within windows. We show the superiority of our proposed ECLSTM approach over the state-of-the-art approaches on several widely used benchmark data sets for RUL estimation.
arXiv Detail & Related papers (2020-08-10T08:34:20Z)
Self-Directed Online Machine Learning for Topology Optimization [58.920693413667216]
Self-directed Online Learning Optimization integrates Deep Neural Network (DNN) with Finite Element Method (FEM) calculations. Our algorithm was tested on four types of problems, including compliance minimization, fluid-structure optimization, heat transfer enhancement and truss optimization. It reduced the computational time by 2 to 5 orders of magnitude compared with directly using methods, and outperformed all state-of-the-art algorithms tested in our experiments.
arXiv Detail & Related papers (2020-02-04T20:00:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.