Improving Gradient Methods via Coordinate Transformations: Applications to Quantum Machine Learning
- URL: http://arxiv.org/abs/2304.06768v2
- Date: Thu, 25 Apr 2024 16:42:59 GMT
- Title: Improving Gradient Methods via Coordinate Transformations: Applications to Quantum Machine Learning
- Authors: Pablo Bermejo, Borja Aizpurua, Roman Orus
- Abstract summary: Machine learning algorithms rely heavily on gradient-based optimization algorithms, such as gradient descent and the like.
The overall performance depends on the appearance of local minima and barren plateaus, which slow down calculations and lead to non-optimal solutions.
In this paper we introduce a generic strategy to accelerate and improve the overall performance of such methods, alleviating the effect of barren plateaus and local minima.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning algorithms, both in their classical and quantum versions, rely heavily on gradient-based optimization algorithms, such as gradient descent and the like. The overall performance depends on the appearance of local minima and barren plateaus, which slow down calculations and lead to non-optimal solutions. In practice, this results in dramatic computational and energy costs for AI applications. In this paper we introduce a generic strategy to accelerate and improve the overall performance of such methods, alleviating the effect of barren plateaus and local minima. Our method is based on coordinate transformations, somewhat similar to variational rotations, which add extra directions in parameter space that depend on the cost function itself and allow the configuration landscape to be explored more efficiently. The validity of our method is benchmarked by boosting a number of quantum machine learning algorithms, yielding a very significant improvement in their performance.
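The abstract does not spell out the transformation explicitly, so the following is only a minimal numerical sketch of the general idea: plain gradient descent in which the update direction is rotated by an angle that depends on the current cost value, so the step gains an extra, cost-dependent direction in parameter space. The toy cost, the specific choice of rotation angle, and all function names are illustrative assumptions, not the authors' construction.

```python
import numpy as np

def cost(theta):
    """Toy non-convex cost with several local minima (stand-in for a QML loss)."""
    return np.sin(3 * theta[0]) * np.cos(2 * theta[1]) + 0.1 * np.sum(theta ** 2)

def numerical_grad(f, theta, eps=1e-6):
    """Central finite-difference gradient (placeholder for analytic gradients)."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return g

def transformed_gradient_descent(f, theta0, lr=0.1, steps=200):
    """Gradient descent in rotated coordinates: the rotation angle is a
    heuristic function of the current cost value, so each update contains an
    extra, cost-dependent direction in parameter space (illustrative only)."""
    theta = theta0.astype(float).copy()
    for _ in range(steps):
        g = numerical_grad(f, theta)
        phi = np.tanh(f(theta))            # bounded, cost-dependent rotation angle (assumed)
        c, s = np.cos(phi), np.sin(phi)
        R = np.array([[c, -s], [s, c]])    # 2D rotation acting on parameter space
        theta -= lr * (R @ g)              # step along the rotated gradient
    return theta

theta_star = transformed_gradient_descent(cost, np.array([1.0, -1.0]))
print(theta_star, cost(theta_star))
```

In this toy the rotation merely injects a cost-dependent component into the update; whether it actually helps depends entirely on the (assumed) choice of transformation, which is the design question the paper addresses for quantum machine learning workloads.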
Related papers
- Information Geometry and Beta Link for Optimizing Sparse Variational Student-t Processes [6.37512592611305]
Student-t Processes have been proposed to enhance computational efficiency and flexibility for real-world datasets using gradient descent.
Traditional gradient descent methods like Adam may not fully exploit the parameter space geometry, potentially leading to slower convergence and suboptimal performance.
We adopt natural gradient methods from information geometry for variational parameter optimization of Student-t Processes.
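For context, a generic natural-gradient update preconditions the Euclidean gradient with the inverse Fisher information matrix. The sketch below shows only that generic update with made-up numbers; it is not specific to sparse variational Student-t processes, and the Fisher estimate, damping value, and names are illustrative assumptions.

```python
import numpy as np

def natural_gradient_step(params, grad, fisher, lr=0.05, damping=1e-6):
    """One natural-gradient update: precondition the Euclidean gradient with
    the (damped) inverse Fisher information matrix."""
    F = fisher + damping * np.eye(fisher.shape[0])
    return params - lr * np.linalg.solve(F, grad)

# Illustrative usage with a made-up gradient and Fisher estimate.
params = np.array([0.5, -1.2])
grad = np.array([0.3, -0.1])
fisher = np.array([[2.0, 0.2], [0.2, 0.5]])   # assumed positive-definite Fisher estimate
print(natural_gradient_step(params, grad, fisher))
```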
arXiv Detail & Related papers (2024-08-13T07:53:39Z) - Qudit inspired optimization for graph coloring [0.0]
We introduce a quantum-inspired algorithm for Graph Coloring Problems (GCPs).
We use qudits in a product state, with each qudit representing a node in the graph and parameterized by d-dimensional spherical coordinates.
We benchmark two optimization strategies; one of them, qudit gradient descent (QdGD), initiates the qudits in random states and employs gradient descent to minimize a cost function.
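As a rough illustration of this parameterization (under my own assumptions about the cost and readout, which may differ from the paper's), each node below carries d-1 spherical angles defining a d-dimensional unit amplitude vector, and plain gradient descent minimizes the probability that adjacent nodes are read out in the same color.

```python
import numpy as np

# Toy sketch: each graph node is a d-level "qudit" in a product state,
# parameterized by d-1 spherical angles; colors are read out from the
# probabilities amplitude**2. Graph, cost, and update rule are illustrative.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]    # small example graph
n_nodes, d = 4, 3                           # d = number of colors

def amplitudes(angles):
    """Map d-1 spherical angles to a real unit vector of d amplitudes."""
    amp = np.ones(d)
    for k, a in enumerate(angles):
        amp[k] *= np.cos(a)
        amp[k + 1:] *= np.sin(a)
    return amp

def cost(theta):
    """Penalty: probability that two adjacent nodes share a color."""
    probs = [amplitudes(theta[v]) ** 2 for v in range(n_nodes)]
    return sum(probs[u] @ probs[v] for u, v in edges)

theta = np.random.uniform(0, np.pi, size=(n_nodes, d - 1))  # random initial qudits
lr, eps = 0.2, 1e-5
for _ in range(300):                        # plain gradient descent via finite differences
    grad = np.zeros_like(theta)
    for idx in np.ndindex(theta.shape):
        shift = np.zeros_like(theta)
        shift[idx] = eps
        grad[idx] = (cost(theta + shift) - cost(theta - shift)) / (2 * eps)
    theta -= lr * grad

coloring = [int(np.argmax(amplitudes(theta[v]) ** 2)) for v in range(n_nodes)]
print("coloring:", coloring, "cost:", cost(theta))
```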
arXiv Detail & Related papers (2024-06-02T16:19:55Z) - Gradient-free neural topology optimization [0.0]
Gradient-free algorithms require many more iterations to converge than gradient-based algorithms.
This has made them unviable for topology optimization due to the high computational cost per iteration and high dimensionality of these problems.
We propose a pre-trained neural reparameterization strategy that leads to at least one order of magnitude decrease in iteration count when optimizing the designs in latent space.
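A minimal sketch of the reparameterization idea, with a stand-in "pre-trained decoder" and a placeholder objective (both entirely assumed here): the gradient-free search runs in a small latent space rather than the high-dimensional design space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "pre-trained decoder": maps a low-dimensional latent vector to a
# high-dimensional design (here just a fixed random linear map plus sigmoid).
W = rng.normal(size=(64 * 64, 8))
def decode(z):
    return 1.0 / (1.0 + np.exp(-(W @ z).reshape(64, 64)))   # densities in [0, 1]

def design_cost(design):
    """Placeholder objective standing in for an expensive physics evaluation."""
    return np.mean((design - 0.5) ** 2) + 0.1 * abs(design.mean() - 0.3)

# Gradient-free search, but in the 8-dimensional latent space instead of the
# 4096-dimensional design space: that is the point of the reparameterization.
z_best = rng.normal(size=8)
f_best = design_cost(decode(z_best))
for _ in range(200):
    z_try = z_best + 0.3 * rng.normal(size=8)   # simple (1+1)-style random search
    f_try = design_cost(decode(z_try))
    if f_try < f_best:
        z_best, f_best = z_try, f_try
print("best cost:", f_best)
```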
arXiv Detail & Related papers (2024-03-07T23:00:49Z) - ELRA: Exponential learning rate adaption gradient descent optimization
method [83.88591755871734]
We present a novel, fast (exponential rate), ab initio (hyperparameter-free) gradient-based adaptation method.
The main idea of the method is to adapt the learning rate $\alpha$ based on situational awareness.
It can be applied to problems of any dimension $n$ and scales only linearly.
arXiv Detail & Related papers (2023-09-12T14:36:13Z) - Recommender System Expedited Quantum Control Optimization [0.0]
Quantum control optimization algorithms are routinely used to generate optimal quantum gates or efficient quantum state transfers.
There are two main challenges in designing efficient optimization algorithms, namely overcoming the sensitivity to local optima and improving the computational speed.
Here, we propose and demonstrate the use of a machine learning method, specifically the recommender system (RS), to deal with the latter challenge.
arXiv Detail & Related papers (2022-01-29T10:25:41Z) - Efficient Differentiable Simulation of Articulated Bodies [89.64118042429287]
We present a method for efficient differentiable simulation of articulated bodies.
This enables integration of articulated body dynamics into deep learning frameworks.
We show that reinforcement learning with articulated systems can be accelerated using gradients provided by our method.
arXiv Detail & Related papers (2021-09-16T04:48:13Z) - Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate
in Gradient Descent [20.47598828422897]
We propose Meta-Regularization, a novel approach for the adaptive choice of the learning rate in first-order descent methods.
Our approach modifies the objective function by adding a regularization term, and casts the joint updating process of the parameters and the learning rate as a single optimization problem.
arXiv Detail & Related papers (2021-04-12T13:13:34Z) - Channel-Directed Gradients for Optimization of Convolutional Neural
Networks [50.34913837546743]
We introduce optimization methods for convolutional neural networks that can be used to improve existing gradient-based optimization in terms of generalization error.
We show that defining the gradients along the output channel direction leads to a performance boost, while other directions can be detrimental.
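One crude way to bias updates along the output-channel direction is to rescale the gradient per output channel before the step; the snippet below is only a stand-in to make the idea concrete, not the paper's actual construction of channel-directed gradients.

```python
import numpy as np

def channelwise_normalized_grad(grad, eps=1e-12):
    """Rescale a conv-weight gradient per output channel (axis 0 is assumed to
    be the output-channel axis), so every output channel contributes a
    unit-norm direction to the update."""
    flat = grad.reshape(grad.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True) + eps
    return (flat / norms).reshape(grad.shape)

# Illustrative usage on a fake (out_ch, in_ch, kh, kw) gradient tensor.
g = np.random.randn(16, 8, 3, 3)
w = np.random.randn(16, 8, 3, 3)
w -= 0.01 * channelwise_normalized_grad(g)   # SGD step with channel-wise scaling
print(w.shape)
```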
arXiv Detail & Related papers (2020-08-25T00:44:09Z) - Efficient Learning of Generative Models via Finite-Difference Score
Matching [111.55998083406134]
We present a generic strategy to efficiently approximate any-order directional derivatives with finite differences.
Our approximation only involves function evaluations, which can be executed in parallel, and no gradient computations.
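A minimal sketch of a directional derivative approximated purely from function evaluations with central finite differences; the stencils and names below are my own choices, shown for first and second order only.

```python
import numpy as np

def directional_derivative_fd(f, x, v, order=1, eps=1e-3):
    """Approximate the order-th directional derivative of f at x along v using
    central finite differences: only function evaluations, no gradients."""
    if order == 1:
        return (f(x + eps * v) - f(x - eps * v)) / (2 * eps)
    if order == 2:
        return (f(x + eps * v) - 2 * f(x) + f(x - eps * v)) / eps ** 2
    raise ValueError("only order 1 and 2 are implemented in this sketch")

f = lambda x: np.sin(x[0]) * x[1] ** 2
x = np.array([0.3, 2.0])
v = np.array([1.0, 0.0])
print(directional_derivative_fd(f, x, v))           # approx. cos(0.3) * 4
print(directional_derivative_fd(f, x, v, order=2))  # approx. -sin(0.3) * 4
```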
arXiv Detail & Related papers (2020-07-07T10:05:01Z) - A Primer on Zeroth-Order Optimization in Signal Processing and Machine
Learning [95.85269649177336]
ZO optimization iteratively performs three major steps: gradient estimation, descent direction computation, and solution update.
We demonstrate promising applications of ZO optimization, such as evaluating and generating explanations from black-box deep learning models, and efficient online sensor management.
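The three steps can be sketched with a common two-point Gaussian-smoothing gradient estimator; the estimator, step size, and objective below are illustrative choices, not necessarily those of the primer.

```python
import numpy as np

def zo_gradient_estimate(f, x, rng, n_samples=20, mu=1e-2):
    """Step 1 -- gradient estimation: average of two-point finite differences
    along random Gaussian directions (one common ZO estimator)."""
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.normal(size=x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_samples

def zo_descent(f, x0, lr=0.1, steps=100, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for _ in range(steps):
        g = zo_gradient_estimate(f, x, rng)   # Step 1: gradient estimation
        d = -g                                # Step 2: descent direction
        x = x + lr * d                        # Step 3: solution update
    return x

f = lambda x: np.sum((x - 1.0) ** 2)          # black-box objective (illustrative)
print(zo_descent(f, np.zeros(5)))             # should approach the all-ones vector
```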
arXiv Detail & Related papers (2020-06-11T06:50:35Z) - Variance Reduction with Sparse Gradients [82.41780420431205]
Variance reduction methods such as SVRG and SpiderBoost use a mixture of large and small batch gradients.
We introduce a new sparsity operator: the random-top-k operator.
Our algorithm consistently outperforms SpiderBoost on various tasks including image classification, natural language processing, and sparse matrix factorization.
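A hedged sketch of one plausible reading of such a random-top-k operator: keep part of the coordinates by magnitude and the rest uniformly at random, zeroing everything else. The exact definition used in the paper may differ.

```python
import numpy as np

def random_top_k(g, k, top_fraction=0.5, rng=None):
    """Sparsify a gradient by keeping k coordinates: a top_fraction of them by
    magnitude, the remainder chosen uniformly at random from the rest.
    (One plausible reading of a random-top-k operator; the paper's exact
    definition may differ.)"""
    rng = rng or np.random.default_rng(0)
    n_top = int(k * top_fraction)
    order = np.argsort(-np.abs(g))
    top_idx = order[:n_top]
    rand_idx = rng.choice(order[n_top:], size=k - n_top, replace=False)
    keep = np.concatenate([top_idx, rand_idx])
    sparse = np.zeros_like(g)
    sparse[keep] = g[keep]
    return sparse

g = np.random.randn(100)
print(np.count_nonzero(random_top_k(g, k=10)))   # 10 surviving coordinates
```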
arXiv Detail & Related papers (2020-01-27T08:23:58Z)