Fugu-MT Paper Translation (Abstract): Distributed Variational Quantum Linear Solver

Paper overview: Distributed Variational Quantum Linear Solver

arxiv url: http://arxiv.org/abs/2604.14435v1
Date: Wed, 15 Apr 2026 21:27:16 GMT
Status: Translation complete
System update date: 2026-04-17 21:29:31.620645
Title: Distributed Variational Quantum Linear Solver
Title (reference translation): Distributed Variational Quantum Linear Solver
Authors: Chao Lu, Pooja Rao, Muralikrishnan Gopalakrishnan Meena, Kalyana Chakaravarthi Gottiparthi
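Abstract summary: A distributed VQLS framework built on NVIDIA CUDA-Q enables scalable distribution of the O(L^2) cost-function evaluations. A fast Walsh-Hadamard transform (FWHT)-based Pauli decomposition curbs the exponential growth of LCU terms, reducing L from O(2^n) to O(1) for n > 6 qubits. For a 10-qubit tridiagonal Toeplitz system, this yields a 256x reduction while preserving over 99.99% solution fidelity.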
Reference score (independently computed attention score): 2.7835589988032887
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The Variational Quantum Linear Solver (VQLS), a hybrid quantum-classical algorithm for solving linear systems, faces a practical scalability bottleneck: the Linear Combination of Unitaries (LCU) decomposition requires O(L^2) circuit evaluations per optimizer iteration, where L can grow as 4^n for n-qubit systems in the worst case. We address this computational bottleneck through two complementary strategies. First, we present a distributed VQLS (D-VQLS) framework, built on NVIDIA CUDA-Q, that enables asynchronous, scalable distribution of the O(L^2) cost-function evaluations. Second, a fast Walsh-Hadamard transform (FWHT)-based Pauli decomposition with 1% coefficient thresholding curbs the exponential growth of LCU terms, reducing L from O(2^n) to O(1) for n > 6 qubits and compressing the per-iteration circuit complexity from O(n * 4^n) to O(n) for sparse, structured matrices. For a 10-qubit tridiagonal Toeplitz system, this yields a 256x reduction, from 23 million to 90,112 circuits per iteration, while preserving over 99.99% solution fidelity. Additionally, to inform feasibility on early fault-tolerant QPUs, the paper provides resource estimates (gate counts, qubit requirements, and circuit evaluations per iteration) for VQLS applied to arbitrary matrices. The D-VQLS framework is validated on the NERSC Perlmutter supercomputer using multi-node, multi-GPU ideal state-vector simulations, achieving over 99.99% fidelity against classical solutions on tridiagonal Toeplitz and Hele-Shaw flow benchmarks, with near-ideal strong scaling up to 24 GPUs and 95.3% weak scaling efficiency at 96 GPUs processing 360,448 circuits per iteration for a 10-qubit system. Systematic profiling identifies the optimal resource allocation for distributed quantum circuit workloads, yielding a 2.52x speedup for the configurations studied.
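Abstract (reference translation): The Variational Quantum Linear Solver (VQLS), a hybrid quantum-classical algorithm for solving linear systems, faces a practical scalability bottleneck: the Linear Combination of Unitaries (LCU) decomposition requires O(L^2) circuit evaluations per optimizer iteration. We address this computational bottleneck through two complementary strategies. First, we present a distributed VQLS (D-VQLS) framework built on NVIDIA CUDA-Q. Second, a fast Walsh-Hadamard transform (FWHT)-based Pauli decomposition with 1% coefficient thresholding curbs the exponential growth of LCU terms, reducing L from O(2^n) to O(1) for n > 6 qubits and compressing the per-iteration circuit complexity from O(n * 4^n) to O(n) for sparse, structured matrices. For a 10-qubit tridiagonal Toeplitz system, this yields a 256x reduction, from 23 million to 90,112 circuits per iteration, while preserving over 99.99% solution fidelity. Additionally, to inform feasibility on early fault-tolerant QPUs, resource estimates (gate counts, qubit requirements, and circuit evaluations) are provided for VQLS applied to arbitrary matrices. The D-VQLS framework is validated on the NERSC Perlmutter supercomputer using multi-node, multi-GPU ideal state-vector simulations, achieving over 99.99% fidelity on tridiagonal Toeplitz and Hele-Shaw flow benchmarks. Systematic profiling identifies the optimal resource allocation for distributed quantum circuit workloads, yielding a 2.52x speedup for the configurations studied.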
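The FWHT-based Pauli decomposition with coefficient thresholding mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's CUDA-Q implementation. It relies on the standard identity that, for each X-mask x, the Pauli coefficients over all Z-masks are a Walsh-Hadamard transform of the "XOR-diagonal" A[k XOR x, k]. All function names here are hypothetical, the test matrix (2 on the diagonal, -1 on the off-diagonals) is an assumed example, and "1% thresholding" is interpreted as dropping terms whose magnitude is at most 1% of the largest coefficient, which the abstract does not specify.

```python
import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}
# Exact values of (-i)**q for q mod 4; one factor per Y in the Pauli string.
PHASES = (1, -1j, -1, 1j)


def fwht(vec):
    """Unnormalized Walsh-Hadamard transform: out[z] = sum_k (-1)^popcount(z & k) * vec[k]."""
    out = np.array(vec, dtype=complex)
    h = 1
    while h < out.size:
        for start in range(0, out.size, 2 * h):
            top = out[start:start + h].copy()
            bottom = out[start + h:start + 2 * h].copy()
            out[start:start + h] = top + bottom
            out[start + h:start + 2 * h] = top - bottom
        h *= 2
    return out


def pauli_label(x, z, n):
    """Pauli string for X-mask x and Z-mask z; the leftmost character is the most significant bit."""
    return "".join("IXZY"[((x >> j) & 1) + 2 * ((z >> j) & 1)] for j in reversed(range(n)))


def pauli_matrix(label):
    """Dense matrix of a Pauli string (used only to check the decomposition)."""
    out = np.eye(1, dtype=complex)
    for ch in label:
        out = np.kron(out, PAULIS[ch])
    return out


def pauli_decompose(matrix, rel_threshold=0.0):
    """Return {pauli_string: coefficient}, dropping terms at or below
    rel_threshold * max|coefficient| (assumed reading of the thresholding rule)."""
    matrix = np.asarray(matrix, dtype=complex)
    dim = matrix.shape[0]
    n = dim.bit_length() - 1
    if dim < 1 or matrix.shape != (dim, dim) or dim != 1 << n:
        raise ValueError("matrix must be square with power-of-two dimension")
    k = np.arange(dim)
    coeffs = {}
    for x in range(dim):
        # XOR-diagonal d_x[k] = A[k ^ x, k]; its transform gives all Z-masks at once.
        spectrum = fwht(matrix[k ^ x, k]) / dim
        for z in range(dim):
            phase = PHASES[bin(x & z).count("1") % 4]
            coeffs[pauli_label(x, z, n)] = phase * spectrum[z]
    largest = max(abs(c) for c in coeffs.values())
    cutoff = max(rel_threshold * largest, 1e-12)  # 1e-12 filters numerical zeros
    return {p: c for p, c in coeffs.items() if abs(c) > cutoff}


def tridiagonal_toeplitz(n, diag=2.0, off=-1.0):
    """Assumed example matrix of size 2^n; the paper's entries are not given in the abstract."""
    dim = 1 << n
    return diag * np.eye(dim) + off * (np.eye(dim, k=1) + np.eye(dim, k=-1))


if __name__ == "__main__":
    for n in range(4, 9):
        a = tridiagonal_toeplitz(n)
        print(n, len(pauli_decompose(a)), len(pauli_decompose(a, rel_threshold=0.01)))
```

For this assumed matrix, the unthresholded term count doubles with each added qubit, while the thresholded count stops growing once the newly added coefficients fall below the cutoff. The exact saturation point depends on the matrix entries and on how the threshold is defined, so the numbers from this sketch should not be read as the paper's results.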

Related papers