Training Artificial Neural Networks by Coordinate Search Algorithm
- URL: http://arxiv.org/abs/2402.12646v1
- Date: Tue, 20 Feb 2024 01:47:25 GMT
- Title: Training Artificial Neural Networks by Coordinate Search Algorithm
- Authors: Ehsan Rokhsatyazdi, Shahryar Rahnamayan, Sevil Zanjani Miyandoab, Azam
Asilian Bidgoli, H.R. Tizhoosh
- Abstract summary: We propose an efficient version of the gradient-free Coordinate Search (CS) algorithm for training neural networks.
The proposed algorithm can be used with non-differentiable activation functions and tailored to multi-objective/multi-loss problems.
Finding the optimal values for weights of ANNs is a large-scale optimization problem.
- Score: 0.20971479389679332
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training Artificial Neural Networks poses a challenging and critical problem
in machine learning. Despite the effectiveness of gradient-based learning
methods, such as Stochastic Gradient Descent (SGD), in training neural
networks, they do have several limitations. For instance, they require
differentiable activation functions and cannot optimize a model based on
several independent non-differentiable loss functions simultaneously; for
example, the F1-score, which is normally only computed during testing, can
serve directly as a training objective when a gradient-free optimization
algorithm is utilized. Furthermore, in practice a DNN may have to be trained
with only a small training dataset. To address these concerns, we propose an
efficient version of the
gradient-free Coordinate Search (CS) algorithm, an instance of General Pattern
Search methods, for training neural networks. The proposed algorithm can be
used with non-differentiable activation functions and tailored to
multi-objective/multi-loss problems. Finding the optimal values for weights of
ANNs is a large-scale optimization problem. Therefore, instead of finding the
optimal value for each variable, which is the common technique in classical CS,
we accelerate optimization and convergence by bundling the weights. In fact,
this strategy is a form of dimension reduction for optimization problems. Based
on the experimental results, the proposed method, in some cases, outperforms
the gradient-based approach, particularly in situations with insufficient
labeled training data. The performance plots demonstrate a high convergence
rate, highlighting the capability of our suggested method to find a reasonable
solution with fewer function calls. At present, gradient-based algorithms such
as SGD or Adam are the only practical and efficient way of training ANNs with
hundreds of thousands of weights. In this paper, we introduce an alternative
method for training ANNs.
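To illustrate the idea concretely, the sketch below shows a generic gradient-free pattern search that perturbs bundles of weights together (the dimension-reduction strategy described in the abstract) and scores candidates with the non-differentiable F1-score. It is a minimal illustration under assumed settings: the tiny two-layer network, the bundle size, the step-size schedule, and helper names such as `coordinate_search` and `fitness` are hypothetical and not taken from the paper.

```python
# Minimal sketch of gradient-free Coordinate Search with weight bundling.
# All sizes, schedules, and names here are illustrative assumptions,
# not the authors' exact configuration.
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

def forward(w, X, n_hidden=16):
    """Tiny two-layer net whose weights live in one flat vector w."""
    n_in = X.shape[1]
    W1 = w[:n_in * n_hidden].reshape(n_in, n_hidden)
    b1 = w[n_in * n_hidden:n_in * n_hidden + n_hidden]
    W2 = w[n_in * n_hidden + n_hidden:-1].reshape(n_hidden, 1)
    b2 = w[-1]
    h = np.maximum(0.0, X @ W1 + b1)        # ReLU hidden layer
    return (h @ W2 + b2).ravel()            # raw output score

def fitness(w, X, y):
    """Non-differentiable objective: negative F1 of thresholded outputs."""
    preds = (forward(w, X) > 0.0).astype(int)
    return -f1_score(y, preds, zero_division=0)

def coordinate_search(X, y, n_weights, bundle_size=8, step=0.5,
                      shrink=0.5, min_step=1e-3, max_evals=5000):
    """Pattern search over bundles of weights instead of single coordinates."""
    w = rng.normal(scale=0.1, size=n_weights)
    best = fitness(w, X, y)
    evals = 1
    # Bundle the coordinates: each bundle is moved as one search direction.
    bundles = np.array_split(rng.permutation(n_weights),
                             max(1, n_weights // bundle_size))
    while step > min_step and evals < max_evals:
        improved = False
        for idx in bundles:
            for sign in (+1.0, -1.0):
                trial = w.copy()
                trial[idx] += sign * step   # move the whole bundle together
                f = fitness(trial, X, y)
                evals += 1
                if f < best:                # greedy: keep any improving move
                    w, best, improved = trial, f, True
                    break
        if not improved:
            step *= shrink                  # classic pattern-search contraction
    return w, -best

# Hypothetical usage on synthetic data.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
n_weights = 10 * 16 + 16 + 16 * 1 + 1      # matches forward() with n_hidden=16
w_opt, train_f1 = coordinate_search(X, y, n_weights)
print(f"best training F1: {train_f1:.3f}")
```

Because the search only needs objective evaluations, swapping the F1-score for any other non-differentiable loss, or for a scalarized combination of several losses, changes only the fitness function; that is the sense in which a gradient-free loop of this kind can be tailored to multi-objective/multi-loss training.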
Related papers
- Multi-Objective Optimization for Sparse Deep Multi-Task Learning [0.0]
We present a Multi-Objective Optimization algorithm using a modified Weighted Chebyshev scalarization for training Deep Neural Networks (DNNs).
Our work aims to address the (economical and also ecological) sustainability issue of DNN models, with particular focus on Deep Multi-Task models.
arXiv Detail & Related papers (2023-08-23T16:42:27Z) - Implicit Stochastic Gradient Descent for Training Physics-informed
Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs to improve the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - AskewSGD : An Annealed interval-constrained Optimisation method to train
Quantized Neural Networks [12.229154524476405]
We develop a new algorithm, Annealed Skewed SGD - AskewSGD - for training deep neural networks (DNNs) with quantized weights.
Unlike algorithms with active sets and feasible directions, AskewSGD avoids projections or optimization under the entire feasible set.
Experimental results show that the AskewSGD algorithm performs better than or on par with state-of-the-art methods on classical benchmarks.
arXiv Detail & Related papers (2022-11-07T18:13:44Z) - Training Neural Networks using SAT solvers [1.0152838128195465]
We propose an algorithm that uses SAT solvers as a global optimisation method for training a neural network.
In the experiments, we demonstrate the effectiveness of our algorithm against the ADAM optimiser in certain tasks like parity learning.
arXiv Detail & Related papers (2022-06-10T01:31:12Z) - Simple Stochastic and Online Gradient DescentAlgorithms for Pairwise
Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
arXiv Detail & Related papers (2021-11-23T18:10:48Z) - A Dynamical View on Optimization Algorithms of Overparameterized Neural
Networks [23.038631072178735]
We consider a broad class of optimization algorithms that are commonly used in practice.
As a consequence, we can characterize the convergence behavior of overparameterized neural networks under these algorithms.
We believe our approach can also be extended to other optimization algorithms and network architectures.
arXiv Detail & Related papers (2020-10-25T17:10:22Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires a much smaller number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our method and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Physarum Powered Differentiable Linear Programming Layers and
Applications [48.77235931652611]
We propose an efficient and differentiable solver for general linear programming problems.
We show the use of our solver in a video segmentation task and meta-learning for few-shot learning.
arXiv Detail & Related papers (2020-04-30T01:50:37Z) - Self-Directed Online Machine Learning for Topology Optimization [58.920693413667216]
Self-directed Online Learning Optimization integrates a Deep Neural Network (DNN) with Finite Element Method (FEM) calculations.
Our algorithm was tested by four types of problems including compliance minimization, fluid-structure optimization, heat transfer enhancement and truss optimization.
It reduced the computational time by 2 to 5 orders of magnitude compared with directly using heuristic methods, and outperformed all state-of-the-art algorithms tested in our experiments.
arXiv Detail & Related papers (2020-02-04T20:00:28Z) - Optimizing Wireless Systems Using Unsupervised and
Reinforced-Unsupervised Deep Learning [96.01176486957226]
Resource allocation and transceivers in wireless networks are usually designed by solving optimization problems.
In this article, we introduce unsupervised and reinforced-unsupervised learning frameworks for solving both variable and functional optimization problems.
arXiv Detail & Related papers (2020-01-03T11:01:52Z)