Related papers: A Linearly Convergent Algorithm for Computing the Petz-Augustin Mean

URL: http://arxiv.org/abs/2502.06399v2
Date: Wed, 16 Jul 2025 08:03:27 GMT
Title: A Linearly Convergent Algorithm for Computing the Petz-Augustin Mean
Authors: Chun-Neng Chu, Wei-Fu Tseng, Yen-Huan Li
Abstract summary: We study the computation of the Petz-Augustin mean of order $\alpha \in (0,1) \cup (1,\infty)$. We propose the first algorithm with a non-asymptotic convergence guarantee for solving this optimization problem. We establish that the Petz-Augustin mean of order $\alpha$, when all quantum states commute, is equivalent to the equilibrium prices in Fisher markets.
Score: 2.059931105362387
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the computation of the Petz-Augustin mean of order $\alpha \in (0,1) \cup (1,\infty)$, defined as the minimizer of a weighted sum of $n$ Petz-R\'enyi divergences of order $\alpha$ over the set of $d$-by-$d$ quantum states, where the Petz-R\'enyi divergence is a quantum generalization of the classical R\'enyi divergence. We propose the first algorithm with a non-asymptotic convergence guarantee for solving this optimization problem. The iterates are guaranteed to converge to the Petz-Augustin mean at a linear rate of $ O\left( \lvert 1 - 1/\alpha \rvert^T \right) $ with respect to the Thompson metric for $\alpha\in(1/2,1)\cup(1,\infty)$, where $ T $ denotes the number of iterations. The algorithm has an initialization time complexity of $O\left(nd^3\right)$ and a per-iteration time complexity of $O\left(nd^2 + d^3\right)$. Two applications follow. First, we propose the first iterative method with a non-asymptotic convergence guarantee for computing the Petz capacity of order $\alpha\in(1/2,1)$, which generalizes the quantum channel capacity and characterizes the optimal error exponent in classical-quantum channel coding. Second, we establish that the Petz-Augustin mean of order $\alpha$, when all quantum states commute, is equivalent to the equilibrium prices in Fisher markets with constant elasticity of substitution (CES) utilities of common elasticity $\rho=1-1/\alpha$, and our proposed algorithm can be interpreted as a t\^{a}tonnement dynamic. We then extend the proposed algorithm to inhomogeneous Fisher markets, where buyers have different elasticities, and prove that it achieves a faster convergence rate compared to existing t\^{a}tonnement-type algorithms.
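Worked example: the abstract's linear rate $O\left( \lvert 1 - 1/\alpha \rvert^T \right)$ can be turned into a rough iteration count. The sketch below is illustrative and not taken from the paper. It assumes the constant hidden in the $O(\cdot)$ is ignored, so it only reports how many iterations shrink the bound by a chosen factor `eps`. The function names `contraction_factor` and `iterations_for_reduction` are hypothetical.

```python
import math


def contraction_factor(alpha: float) -> float:
    """Return |1 - 1/alpha|, the per-iteration factor in the stated rate.

    The abstract states the rate for alpha in (1/2, 1) or (1, infinity);
    other values are rejected here.
    """
    valid = (0.5 < alpha < 1.0) or (alpha > 1.0 and math.isfinite(alpha))
    if not valid:
        raise ValueError("alpha must lie in (1/2, 1) or (1, infinity)")
    return abs(1.0 - 1.0 / alpha)


def iterations_for_reduction(alpha: float, eps: float) -> int:
    """Smallest T with |1 - 1/alpha|**T <= eps (hidden constant ignored)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    c = contraction_factor(alpha)
    # c**T <= eps  <=>  T >= log(eps) / log(c), since log(c) < 0.
    return math.ceil(math.log(eps) / math.log(c))


if __name__ == "__main__":
    for alpha in (0.75, 2.0, 10.0):
        T = iterations_for_reduction(alpha, 1e-3)
        print(f"alpha={alpha}: factor={contraction_factor(alpha):.4f}, T={T}")
```

Under this simplification the count grows as $\alpha$ approaches $1/2$ or $\infty$, where $\lvert 1 - 1/\alpha \rvert$ approaches $1$, and shrinks as $\alpha$ approaches $1$.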

Related papers

Finite-Time Bounds for Two-Time-Scale Stochastic Approximation with Arbitrary Norm Contractions and Markovian Noise [7.770605097524015]
Two-time-scale stochastic approximation (SA) is an iterative algorithm with applications in reinforcement learning and optimization. Motivated by applications in reinforcement learning, we give the first mean square bound on nonlinear two-time-scale SA.
arXiv Detail & Related papers (2025-03-24T07:03:23Z)
Convergence Rate Analysis of LION [54.28350823319057]
LION converges iterations of $cal(sqrtdK-)$ measured by gradient Karush-Kuhn-T (sqrtdK-)$. We show that LION can achieve lower loss and higher performance compared to standard SGD.
arXiv Detail & Related papers (2024-11-12T11:30:53Z)
Quantum spectral method for gradient and Hessian estimation [4.193480001271463]
Gradient descent is one of the most basic algorithms for solving continuous optimization problems. We propose a quantum algorithm that returns an $\varepsilon$-approximation of its gradient with query complexity $\widetilde{O}(1/\varepsilon)$. We also propose two quantum algorithms for Hessian estimation, aiming to improve quantum analogs of Newton's method.
arXiv Detail & Related papers (2024-07-04T11:03:48Z)
Towards Optimal Circuit Size for Sparse Quantum State Preparation [10.386753939552872]
We consider the preparation for $n$-qubit sparse quantum states with $s$ non-zero amplitudes and propose two algorithms. The first algorithm uses $O(ns/\log n + n)$ gates, improving upon previous methods by $O(\log n)$. The second algorithm is tailored for binary strings that exhibit a short Hamiltonian path.
arXiv Detail & Related papers (2024-04-08T02:13:40Z)
Fast Minimization of Expected Logarithmic Loss via Stochastic Dual Averaging [8.990961435218544]
We propose a first-order algorithm named $B$-sample dual averaging with the logarithmic barrier. For the Poisson inverse problem, our algorithm attains an $\varepsilon$-optimal solution in $\tilde{O}(d^3/\varepsilon^2)$ time. When computing the maximum-likelihood estimate for quantum state tomography, our algorithm yields an $\varepsilon$-optimal solution in $\tilde{O}(d^3/\varepsilon^2)$ time.
arXiv Detail & Related papers (2023-11-05T03:33:44Z)
Sketching Algorithms for Sparse Dictionary Learning: PTAS and Turnstile Streaming [48.18845814885398]
We develop new techniques to extend the applicability of sketching-based approaches to sparse dictionary learning and the Euclidean $k$-means clustering problems. On the fast algorithms front, we obtain a new approach for designing PTAS's for the $k$-means clustering problem. On the streaming algorithms front, we obtain new upper bounds and lower bounds for dictionary learning and $k$-means clustering.
arXiv Detail & Related papers (2023-10-29T16:46:26Z)
Do you know what q-means? [50.045011844765185]
Clustering is one of the most important tools for analysis of large datasets. We present an improved version of the "$q$-means" algorithm for clustering. We also present a "dequantized" algorithm for $\varepsilon$-$k$-means which runs in $O\big(\frac{k^2}{\varepsilon^2}(\sqrt{k}d + \log(Nd))\big)$ time.
arXiv Detail & Related papers (2023-08-18T17:52:12Z)
Near-Optimal Quantum Coreset Construction Algorithms for Clustering [15.513270929560088]
We give quantum algorithms that find coresets for $k$-clustering in $\mathbb{R}^d$ with $\tilde{O}(\sqrt{nk}d^{3/2})$ query complexity. Our coreset reduces the input size from $n$ to $\mathrm{poly}(k\epsilon^{-1}d)$, so that existing $\alpha$-approximation algorithms for clustering can run on top of it.
arXiv Detail & Related papers (2023-06-05T12:22:46Z)
Private estimation algorithms for stochastic block models and mixture models [63.07482515700984]
General tools for designing efficient private estimation algorithms. First efficient $(\epsilon, \delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z)
A Quantum Algorithm Framework for Discrete Probability Distributions with Applications to Rényi Entropy Estimation [13.810917492304565]
We propose a unified quantum algorithm framework for estimating properties of discrete probability distributions. Our framework estimates $\alpha$-Rényi entropy $H_\alpha(p)$ to within additive error $\epsilon$ with probability at least $2/3$.
arXiv Detail & Related papers (2022-12-03T08:01:55Z)
Mind the gap: Achieving a super-Grover quantum speedup by jumping to the end [114.3957763744719]
We present a quantum algorithm that has rigorous runtime guarantees for several families of binary optimization problems. We show that the algorithm finds the optimal solution in time $O^*(2^{(0.5-c)n})$ for an $n$-independent constant $c$. We also show that for a large fraction of random instances from the $k$-spin model and for any fully satisfiable or slightly frustrated $k$-CSP formula, statement (a) is the case.
arXiv Detail & Related papers (2022-12-03T02:45:23Z)
A (simple) classical algorithm for estimating Betti numbers [1.8749305679160366]
We describe a simple algorithm for estimating the $k$-th normalized Betti number of a simplicial complex over $n$ elements using the path integral Monte Carlo method. For a general simplicial complex, the running time of our algorithm is $n^{O\left(\frac{1}{\sqrt{\gamma}}\log\frac{1}{\varepsilon}\right)}$ with $\gamma$ measuring the spectral gap of the Laplacian and $\varepsilon \in (0,1)$ the additive precision.
arXiv Detail & Related papers (2022-11-17T16:10:47Z)
Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning [77.22019100456595]
We show a training algorithm for distributed computation workers with varying communication frequency. In this work, we obtain a tighter convergence rate of $mathcalO!!!(sigma2-2_avg!! . We also show that the heterogeneity term in rate is affected by the average delay within each worker.
arXiv Detail & Related papers (2022-06-16T17:10:57Z)
The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization [88.91190483500932]
We study the fundamental open question of finding the optimal high-order algorithm for solving smooth convex minimization problems. The reason for this is that these algorithms require performing a complex binary procedure, which makes them neither optimal nor practical. We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\epsilon^{-2/(p+1)}\right)$ $p$th-order oracle complexity.
arXiv Detail & Related papers (2022-05-19T16:04:40Z)
Random quantum circuits transform local noise into global white noise [118.18170052022323]
We study the distribution over measurement outcomes of noisy random quantum circuits in the low-fidelity regime. For local noise that is sufficiently weak and unital, correlations (measured by the linear cross-entropy benchmark) between the output distribution $p_{\text{noisy}}$ of a generic noisy circuit instance shrink exponentially. If the noise is incoherent, the output distribution approaches the uniform distribution $p_{\text{unif}}$ at precisely the same rate.
arXiv Detail & Related papers (2021-11-29T19:26:28Z)
Online Sub-Sampling for Reinforcement Learning with General Function Approximation [111.01990889581243]
In this paper, we establish an efficient online sub-sampling framework that measures the information gain of data points collected by an RL algorithm. For a value-based method with complexity-bounded function class, we show that the policy only needs to be updated for $\propto \operatorname{polylog}(K)$ times. In contrast to existing approaches that update the policy for at least $\Omega(K)$ times, our approach drastically reduces the number of optimization calls in solving for a policy.
arXiv Detail & Related papers (2021-06-14T07:36:25Z)
Higher-order Derivatives of Weighted Finite-state Machines [68.43084108204741]
This work examines the computation of higher-order derivatives with respect to the normalization constant for weighted finite-state machines. We provide a general algorithm for evaluating derivatives of all orders, which has not been previously described in the literature. Our algorithm is significantly faster than prior algorithms.
arXiv Detail & Related papers (2021-06-01T19:51:55Z)
Accelerating Optimization and Reinforcement Learning with Quasi-Stochastic Approximation [2.294014185517203]
This paper sets out to extend convergence theory to quasi-stochastic approximations. It is illustrated with applications to gradient-free optimization and policy gradient algorithms for reinforcement learning.
arXiv Detail & Related papers (2020-09-30T04:44:45Z)
Finite-Time Analysis for Double Q-learning [50.50058000948908]
We provide the first non-asymptotic, finite-time analysis for double Q-learning. We show that both synchronous and asynchronous double Q-learning are guaranteed to converge to an $\epsilon$-accurate neighborhood of the global optimum.
arXiv Detail & Related papers (2020-09-29T18:48:21Z)
Convergence of Langevin Monte Carlo in Chi-Squared and Renyi Divergence [8.873449722727026]
For convex and first-order smooth potentials, we show that the LMC algorithm achieves the rate estimate $\widetilde{\mathcal{O}}(d\epsilon^{-1})$, which improves the previously known rates in both of these metrics.
arXiv Detail & Related papers (2020-07-22T18:18:28Z)
Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model. We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
Continuous Submodular Maximization: Beyond DR-Submodularity [48.04323002262095]
We first prove a simple variant of the vanilla coordinate ascent, called Coordinate-Ascent+. We then propose Coordinate-Ascent++, that achieves tight $(1-1/e-\varepsilon)$-approximation guarantee while performing the same number of iterations. The computation of each round of Coordinate-Ascent++ can be easily parallelized so that the computational cost per machine scales as $O(n/\sqrt{\varepsilon}+n\log n)$.
arXiv Detail & Related papers (2020-06-21T06:57:59Z)
Improved quantum algorithm for A-optimal projection [4.248054546466641]
This paper corrects the time complexity of Duan et al.'s algorithm to $\left(\frac{\kappa^{4s}\sqrt{k^s}}{\epsilon^s}\mathrm{polylog}\right)$. Our algorithm achieves at least a speedup compared to Duan et al.'s algorithm.
arXiv Detail & Related papers (2020-06-10T09:31:53Z)
A Simple Convergence Proof of Adam and Adagrad [74.24716715922759]
We show a proof of convergence between the Adam Adagrad and $O(d(N)/st)$ algorithms. Adam converges with the same convergence $O(d(N)/st)$ when used with the default parameters.
arXiv Detail & Related papers (2020-03-05T01:56:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.