Variational quantum algorithms are the leading candidate for near-term
advantage on noisy quantum hardware. When training a parametrized quantum
circuit to solve a specific task, the choice of ansatz is one of the most
important factors that determines the trainability and performance of the
algorithm. Problem-tailored ansatzes have become the standard for tasks in
optimization or quantum chemistry, and yield more efficient algorithms with
better performance than unstructured approaches. In quantum machine learning
(QML), however, the literature on ansatzes that are motivated by the training
data structure is scarce. Considering that it is widely known that unstructured
ansatzes can become untrainable with increasing system size and circuit depth,
it is of key importance to also study problem-tailored circuit architectures in
a QML context. In this work, we introduce an ansatz for learning tasks on
weighted graphs that respects an important graph symmetry, namely equivariance
under node permutations. We evaluate the performance of this ansatz on a
complex learning task on weighted graphs, where a ML model is used to implement
a heuristic for a combinatorial optimization problem. We analytically study the
expressivity of our ansatz at depth one, and numerically compare the
performance of our model on instances with up to 20 qubits to ansatzes where
the equivariance property is gradually broken. We show that our ansatz
outperforms all others even in the small-instance regime. Our results
strengthen the notion that symmetry-preserving ansatzes are a key to success in
QML and should be an active area of research in order to enable near-term
advantages in this field.
These types of problems have also been studied in computer science and mathematics for decades.
Many interesting combinatorial optimization problems that are relevant in industry today are NP-hard, so that no general efficient solution is expected to exist.
For this reason, heuristics have gained much popularity, as they often provide high-quality solutions to real-world instances of many NP-hard problems.
One line of research in this area investigates using neural networks (NNs) to learn algorithms for solving combinatorial optimization problems [1, 2], which is known as neural combinatorial optimization (NCO).
Here, NNs learn to solve combinatorial optimization problems based on data, and can then be used to find approximate solutions to arbitrary instances of the same problem.
A downside of the supervised approach is that it requires access to a large amount of training data in the form of solved instances of the given problem, which requires solving many NP-hard instances of the problem to completion.
At large problem sizes, this is a serious impediment for the practicability of this method.
For this reason, reinforcement learning (RL) was introduced as a technique to train these heuristics.
In RL, an agent does not learn based on a given data set, but by interacting with an environment and gathering information in a trial-and-error fashion.
These RL-based approaches have been shown to successfully solve even instances of significant size in problems with a geometric structure like the convex hull problem [4], chip placement [5] or the vehicle routing problem [6].
In most combinatorial optimization problems the instances are presented in a structured form, for instance as a graph, so the structure of the ML models underlying the above techniques can be chosen in an informed manner.
In recent years the field that focuses on the study of exploiting this known structure, called geometric deep learning, has garnered a lot of interest [7].
This field studies the properties of common NN architectures, like convolutional NNs or graph NNs, through the lens of group theory and geometry and provides an understanding of why these structured types of models are the main drivers of recent advances in deep learning.
The success of these models can largely be attributed to the fact that they preserve certain symmetries that are present in the training data, e.g., translation invariance in images in the case of convolutional NNs, or permutation invariance or equivariance in the case of graph NNs, as depicted in fig. 1. When it comes to quantum computing, much work has been dedicated to understanding the structure of variational quantum algorithms [8] that address problems in optimization [9, 10] or chemistry [11, 12].
It is known that unstructured ansatzes like the hardware-efficient ansatz (HWEA) scale badly as the width and depth of the circuit grow, most prominently because of the barren plateau phenomenon [15–
These types of NNs suffer from similar trainability issues as unstructured quantum circuits [18].
The true break-throughs in deep learning were in part possible because more efficient architectures have been developed, like the convolutional NN for image recognition or the recurrent NN for time series.
In this work, we introduce a permutation equivariant quantum circuit architecture for weighted graphs and demonstrate that it significantly outperforms unstructured ansatzes.
To evaluate these ansatzes on a complex learning task that is relevant for solving industry-related problems once scaled up in instance size, we study their performance in the context of neural combinatorial optimization to find approximate solutions to the Traveling Salesperson Problem (TSP).
• We introduce a permutation equivariant quantum circuit ansatz for learning tasks on weighted graphs.
• We evaluate the performance of this ansatz on the task of solving TSP instances and show that it performs well on instances with up to 20 cities (20 qubits).
• We numerically compare our ansatz to three non-equivariant ansatzes, and show that the more the equivariance property of the ansatz is broken, the worse performance becomes and that a simple hardware-efficient ansatz completely fails on this learning task.
• We analytically study the expressivity of our model at depth one, and show under which conditions there exists a parameter setting for any given TSP instance of arbitrary size for our ansatz that produces the optimal tour with the RL scheme that is applied in this work.
Our work illustrates the merit of using symmetry-preserving ansatzes for QML on the example of graph-based learning, and underlines the notion that the unstructured ansatzes popular in current QML research will be of limited use as problem sizes grow, if variational quantum algorithms are to be applied successfully to ML tasks in the future.
This work motivates further study of “geometric quantum learning” in the vein of the classical field of geometric deep learning, to establish more effective ansatzes for QML, as these are a prerequisite to efficiently apply quantum models on any practically relevant learning task in the near-term.
FIG. 1: Depiction of two functions that respect important symmetries of graphs:
a) The permutation equivariant function will yield the same output values for each graph permutation, but reordered according to the reordering of nodes.
The above example shows a simple function that outputs node features in the same order as they are fed into the function.
b) An invariant function will yield the same output, regardless of the permutation.
The above example shows a simple function that outputs node features in ascending order.
Which of the two is preferable depends on the task at hand.
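For concreteness, the following minimal NumPy sketch (not taken from the paper) mirrors the two toy functions in fig. 1: an equivariant map that simply returns the node features in input order, and an invariant map that sorts them; the assertions check the two symmetry properties for an arbitrary node permutation.

```python
import numpy as np

def f_equivariant(node_features: np.ndarray) -> np.ndarray:
    # Outputs node features in the same order as they are fed in (fig. 1a).
    return node_features.copy()

def f_invariant(node_features: np.ndarray) -> np.ndarray:
    # Outputs node features in ascending order, regardless of input order (fig. 1b).
    return np.sort(node_features)

x = np.array([3.0, 1.0, 2.0])
perm = np.array([2, 0, 1])  # an arbitrary node permutation

# Equivariance: permuting the input reorders the output in the same way.
assert np.allclose(f_equivariant(x[perm]), f_equivariant(x)[perm])
# Invariance: permuting the input leaves the output unchanged.
assert np.allclose(f_invariant(x[perm]), f_invariant(x))
```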
II. RELATED WORK

A. Geometric learning - quantum and classical
Learning approaches that utilize geometric properties of a given problem have led to major successes in the field of ML, such as AlphaFold for the complex task of protein folding [20, 21], and have become an increasingly popular research field over the past few years.
Arguably the prime example of a successful geometric model is the convolutional NN (CNN), which has been developed at the end of the 20th century in an effort to enable efficient training of image recognition models [22].
Since then, it has been shown that one of the main reasons that CNNs are so effective is that they are translation invariant: if an object in a given input image is shifted by some amount, the model will still “recognize” it as the same object and thus effectively requires fewer training data [7].
While CNNs are the standard architecture used for images, symmetry-preserving architectures have also been developed for time-series data in the form of recurrent NNs [23], and for graph data with GNNs [24].
GNNs have seen a surge of interest in the classical machine learning community in the past decade [24, 25].
They are designed to process data that is presented in graph form, like social networks [24], molecules [26], images [27] or instances of combinatorial optimization problems [1].
The first attempt to implement a geometric learning model in the quantum realm was made with the quantum convolutional NN in [28], where the authors introduce a translation invariant architecture motivated by classical convolutional NNs.
Approaches to translate the GNN formalism to QNNs were taken in [29], where input graphs are represented in terms of a parametrized Hamiltonian, which is then used to construct the quantum circuit.
The authors of [30] introduce the so-called quantum evolution kernel, where they devise a graph-based kernel for a quantum kernel method for graph classification.
Again, their ansatz is based on alternating layers of Hamiltonians, where one Hamiltonian in each layer encodes the problem graph, while a second parametrized Hamiltonian is trained to solve a given problem.
A proposal for a quantum graph convolutional NN was made in [31], and the authors of [32] propose directly encoding the adjacency matrix of a graph into a unitary to build a quantum circuit for graph convolutions.
While all of the above works introduce forms of structured QML models, none of them study their properties explicitly from a geometric learning perspective or relate their performance to unstructured ansatzes.
The authors of [33] take the step to introduce an equivariant model family for graph data and generalize the QGNN picture to so-called equivariant quantum graph circuits (EQGCs).
EQGCs are a very broad class of ansatzes that respect the connectivity of a given input graph.
The authors of [33] also introduce a subclass of EQGCs called equivariant quantum Hamiltonian graph circuits (EH-QGCs), that includes the QGNNs by [29] as a special case.
EH-QGCs are implemented in terms of a Hamiltonian that is constructed based on the input graph structure, and they are explicitly equivariant under permutation of vertices in the input graph.
The framework that the authors of [33] propose can be seen as a generalization of the above proposals.
Different from the above proposals, EQGCs use a post-measurement classical layer that performs the functionality of an aggregation function as those found in classical GNNs.
In classical GNNs, the aggregation function in each layer is responsible for aggregating node and edge information in an equivariant or invariant manner.
Popular aggregation functions are sums or products, as they trivially fulfill the equivariance property.
In the case of EQGCs, there is no aggregation in the quantum circuit, and this step is offloaded to a classical layer that takes as input the measurements of the PQC.
Additionally, the EQGC family is defined over unweighted graphs and only considers the adjacency matrix of the underlying input graph to determine the connectivity of the qubits.
The authors of [33] also show that their EQGC outperforms a standard message passing neural network on a graph classification task, and thereby demonstrate a first separation of quantum and classical models on a graph-based learning task.
Second, the initial state of our model is always the uniform superposition, which allows each layer in the ansatz to perform graph feature aggregation via sums and products of node and edge features, as we will discuss in section III.
Additionally, in its simplest form as used in this work, the number of qubits in our model scales linearly with the number of nodes in the input graph, while the depth of each layer depends on the graph’s connectivity, and therefore it provides one answer to the question of a NISQ-friendly equivariant quantum model posed by [34].
When combined with RL, this data manifests in the form of states of an environment, while the objective is defined in terms of a reward function, as we will now explain in more detail. In the RL paradigm, the model, referred to as an agent, interacts with a so-called environment.
The environment is defined in terms of its state space S and action space A, which can both be either discrete or continuous.
The agent alters the state of the environment by performing an action a ∈ A, whereafter it receives feedback from the environment in the form of the following state s′ ∈ S, and a reward r that depends on the quality of the chosen action, given the initial state s.
Actions are chosen based on a policy π(a|s), which is a probability distribution over actions a given a state s.
The definition of the state and action spaces and the reward function depends on the given environment.
In general, the goal of the agent is to learn a policy that maximizes the expected return G,
G_t = Σ_{k=0}^{∞} γ^k r_{t+k+1},    (1)

where γ ∈ [0, 1] is the discount factor that determines the importance of future rewards in the agent's decision.
The above definition of the expected return is for the so-called infinite horizon, where the interaction with the environment can theoretically go on to infinity.
In practice, we usually work in environments with a finite horizon, where the above sum runs only over a predefined number of indices.
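As a small illustration (assuming a finite list of rewards r_{t+1}, …, r_T and a hypothetical discount factor), the finite-horizon version of eq. (1) can be computed as follows.

```python
def discounted_return(rewards, gamma=0.9):
    # G_t = r_{t+1} + gamma * G_{t+1}, evaluated backwards from the final reward.
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Example: rewards [1, 0, 2] with gamma = 0.5 give 1 + 0.5*0 + 0.25*2 = 1.5
print(discounted_return([1.0, 0.0, 2.0], gamma=0.5))
```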
In this work, we focus on so-called Q-learning, where the expected return is maximized in terms of Q-values.
The values Q(s, a) for each (s, a) pair, also called state-action values, quantify the expected return of taking action a in state s and following the policy afterwards.
FIG. 2: An illustration of one episode in the TSP environment.
The agent receives a graph instance as input, where the first node is already added to the proposed tour (marked red), which can always be done without loss of generality.
Based on the agent’s output, the label for a given (s, a) pair from the memory is computed as follows,
q = r_{t+1} + γ · max_a Q̂_θ(s_{t+1}, a),    (3)

and this label is then used to compute parameter updates.
The update is not computed with the output of the function approximator Q, but by a copy Q̂ called the target network, which is updated with the current parameters of the function approximator at fixed intervals. In our case, the function approximator and target network are implemented as PQCs, while the parameter optimization is performed via the classical DQN algorithm.
For more detail on implementing the DQN algorithm with a PQC as the function approximator, we refer the reader to [37–39].
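The following sketch illustrates the label of eq. (3) and a hard target-network update; it is schematic rather than the paper's implementation, and `target_q` stands for any callable returning the target network's Q-values for a state (in the paper this is a PQC). The update interval is an arbitrary placeholder.

```python
import numpy as np

def dqn_label(reward, next_state, target_q, gamma=0.9, terminal=False):
    # q = r_{t+1} + gamma * max_a Q_hat_theta(s_{t+1}, a); no bootstrap term for terminal states.
    return reward if terminal else reward + gamma * float(np.max(target_q(next_state)))

def update_target(q_params: np.ndarray, target_params: np.ndarray,
                  step: int, interval: int = 50) -> np.ndarray:
    # Hard update: copy the current parameters into the target network at fixed intervals.
    return q_params.copy() if step % interval == 0 else target_params
```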
1. Solving the Traveling Salesperson Problem with reinforcement learning

To evaluate the performance gains of an ansatz that respects certain symmetries relevant to the problem at hand, we apply our model to a practically motivated learning task on graphs.
The TSP is a low-level abstraction of a problem commonly encountered in mobility and logistics: given a list of locations, find the shortest route that connects all of these locations without visiting any of them twice.
Formally, given a graph G(V,E) with vertices V and weighted edges E, the goal is to find a permutation of the vertices such that the resulting tour length is minimal, where a tour is a cycle that visits each vertex exactly once.
A special case of the TSP is the 2D Euclidean TSP, where each node is defined in terms of its x and y coordinates in Euclidean space, and the edge weights are given by the Euclidean distance between these points.
In this work, we deal with the symmetric Euclidean TSP on a complete graph, where the edges in the graph are undirected.
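A small sketch of how such an instance can be represented numerically (illustrative only, not the data set of [3]): n random city coordinates in the unit square and the weighted adjacency matrix of pairwise Euclidean distances, together with the length of a closed tour.

```python
import numpy as np

def random_tsp_instance(n: int, seed: int = 0):
    rng = np.random.default_rng(seed)
    coords = rng.random((n, 2))                       # city coordinates in [0, 1]^2
    diff = coords[:, None, :] - coords[None, :, :]
    eps = np.linalg.norm(diff, axis=-1)               # eps[i, j] = d(v_i, v_j), symmetric
    return coords, eps

def tour_length(eps: np.ndarray, tour: list) -> float:
    # Sum of edge weights along the cycle, including the edge back to the start.
    return sum(eps[tour[k], tour[(k + 1) % len(tour)]] for k in range(len(tour)))

coords, eps = random_tsp_instance(5)
print(tour_length(eps, [0, 1, 2, 3, 4]))
```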
This reduces the number of possible tours from n! to (n−1)!. However, even in this reduced case the number of possible tours is already larger than 100k for instances with a modest number of ten cities, and the TSP is a well-known NP-hard problem.
To solve this problem with a RL approach, we follow the strategy introduced in [4].
In that work, a classical GNN is used to solve a number of combinatorial optimization problems on graphs.
The authors show that this approach can outperform dedicated approximation algorithms defined for the TSP, like the Christofides algorithm, on instances of up to 300 cities.
One episode of this learning algorithm for the TSP can be seen in fig. 2, and a detailed description of the learning task as implemented in our work is given in section IV A.
III. EQUIVARIANT QUANTUM CIRCUIT
In this section, we formally introduce the structure of our equivariant quantum circuit (EQC) for learning tasks on weighted graphs that we use in this work.
Other types of data that are represented in graph form are, for example, social networks or molecules.
In general, when learning based on graph data, there are two sets of features: node features and edge features.
Depending on the specific learning task, it might be enough to use only one set of these features as input data, and the specific implementation of the circuit will change accordingly.
As mentioned above, an example of an ansatz for cases where encoding node features suffices is the EQGC introduced in [33].
In our case, we use both node and edge features to solve TSP instances.
In case of the nodes, we encode whether a node is already present in the partial tour at time step t to inform the node selection process described later in definition 2.
We discuss the details of this in appendix A. We now proceed to define the ansatz in terms of encoding node information in the form of α (see definition 1) and edge information in terms of the weighted graph edges ε_ij ∈ E.
A. Ansatz structure and equivariance
Given a graph G(V,E) with node features α and weighted edges E, and trainable parameters β, γ ∈ R^p, our ansatz at depth p is of the following form,

|E, α, β, γ⟩_p = U_N(α, β_p) U_G(E, γ_p) · · · U_N(α, β_1) U_G(E, γ_1) |s⟩,    (4)

where |s⟩ is the uniform superposition of bitstrings of length n,

|s⟩ = (1/√(2^n)) Σ_{x∈{0,1}^n} |x⟩.    (5)

FIG. 3: EQC used in this work. Each layer consists of two parts: the first part U_G encodes edge features, while the second part U_N encodes node features. Each of the two parts is parametrized by one parameter β_l, γ_l, respectively.

U_N(α, β_j), with R_x(θ) = e^{−iθX/2}, is defined as

U_N(α, β_j) = ⊗_{l=1}^{n} R_x(α_l · β_j),    (6)

and U_G(E, γ_j) is

U_G(E, γ_j) = exp(−iγ_j H_G)  with  H_G = Σ_{(i,j)∈E} ε_ij σ_z^(i) σ_z^(j),    (7)

where E are the edges of graph G weighted by ε_ij. A 5-qubit example of this ansatz can be seen in fig. 3.

For p = 1, we have

|E, α, β, γ⟩_1 = U_N(α, β) U_G(E, γ) |s⟩
             = (1/√(2^n)) Σ_{x∈{0,1}^n} [ ⊗_{l=1}^{n} ( cos(α_l β/2) I − i sin(α_l β/2) X_l ) ] · exp( −iγ Σ_{(i,j)∈E} ε_ij diag(Z_i Z_j)_{|x⟩} ) |x⟩,    (8)

where the first (bracketed) factor collects what we denote the weighted bitflip terms, the exponential factor collects the edge weights, and diag(Z_i Z_j)_{|x⟩} = ±1 is the entry in the matrix corresponding to each Z_i Z_j term, e.g., I_1 ⊗ ··· ⊗ Z_i ⊗ I_k ⊗ ··· ⊗ Z_j ⊗ ··· ⊗ I_n, corresponding to the basis state |x⟩. (E.g., the first term on the diagonal corresponds to the all-zero state, and so on.)

We see that the first group of terms, denoted weighted bitflip terms, is a sum over products of terms that encode the node features. In other words, in the one-qubit case we get a sum over sine and cosine terms, in the two-qubit case we get a sum over products of pairs of sine and cosine terms, and so on. The second part of the equation, denoted edge weights, is the exponential of a sum over edge weight terms. As we start in the uniform superposition, each basis state's amplitude depends on all node and edge features, but with different signs and therefore different terms interfering constructively and destructively for every basis state. This can be regarded as a quantum version of the classical aggregation functions in GNNs as discussed in section II. Similarly to the classical case, where the k-th layer of a NN aggregates information over the k-local neighborhood of the graph, the terms in eq. (8) become more complex with each additional layer in the PQC.
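To make the construction of eqs. (4)–(7) concrete, here is a minimal state-vector sketch in NumPy (only feasible for small n, and not the implementation used for the experiments in this work): the uniform superposition is prepared, and each layer first applies the diagonal phase U_G(E, γ) generated by H_G and then the product of single-qubit rotations U_N(α, β).

```python
import numpy as np

def eqc_state(eps: np.ndarray, alpha: np.ndarray, betas, gammas) -> np.ndarray:
    """Depth-p EQC state |E, alpha, beta, gamma>_p of eq. (4) as a 2^n state vector.

    eps            : (n, n) symmetric weighted adjacency matrix (edge features)
    alpha          : (n,) node features
    betas, gammas  : length-p lists with one beta and one gamma per layer
    """
    n = eps.shape[0]
    dim = 2 ** n
    state = np.full(dim, 1 / np.sqrt(dim), dtype=complex)          # |s>, eq. (5)

    # Diagonal of H_G = sum_{(i,j) in E} eps_ij Z_i Z_j over the computational basis.
    bits = (np.arange(dim)[:, None] >> np.arange(n)[::-1]) & 1     # qubit 0 = leftmost bit
    z = 1 - 2 * bits                                               # Z eigenvalues +-1
    hg_diag = np.zeros(dim)
    for i in range(n):
        for j in range(i + 1, n):
            hg_diag += eps[i, j] * z[:, i] * z[:, j]

    for beta, gamma in zip(betas, gammas):
        state = np.exp(-1j * gamma * hg_diag) * state              # U_G(E, gamma), eq. (7)
        psi = state.reshape((2,) * n)
        for l in range(n):                                         # U_N(alpha, beta), eq. (6)
            t = alpha[l] * beta / 2
            rx = np.array([[np.cos(t), -1j * np.sin(t)],
                           [-1j * np.sin(t), np.cos(t)]])
            psi = np.moveaxis(np.tensordot(rx, psi, axes=([1], [l])), 0, l)
        state = psi.reshape(dim)
    return state
```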
The reader may already have observed that this ansatz is closely related to an ansatz that is well-known in quantum optimization: that of the quantum approximate optimization algorithm (QAOA) [9]. Indeed, our ansatz can be seen as a special case of the QAOA, where instead of using a cost Hamiltonian to encode the problem, we directly encode instances of graphs and apply the “mixer terms” in eq. (6).
This correspondence will later let us use known results for QAOA-type ansatzes at depth one [40] to derive exact analytical forms of the expectation values of our ansatz, and use these to study its expressivity.
As our focus is on implementing an ansatz that respects a symmetry that is useful in graph learning tasks, namely an equivariance under permutation of vertices of the input graph, we now show that each part of our ansatz respects this symmetry.
Theorem 1 (Permutation equivariance of the ansatz).
Let the ansatz be of the type defined in eq. (4), with parameters β, γ ∈ R^p, that represents an instance of a graph G with weighted edges ε_ij ∈ E. Let P ∈ B^{n×n} be a permutation acting on the weighted adjacency matrix A of G, and P̃ ∈ B^{2^n×2^n} the matrix that maps the tensor product |v_1⟩ ⊗ |v_2⟩ ⊗ ··· ⊗ |v_n⟩, with |v_i⟩ ∈ C^2, to |v_{p(1)}⟩ ⊗ |v_{p(2)}⟩ ⊗ ··· ⊗ |v_{p(n)}⟩.
We call an ansatz that satisfies the following property equivariant under permutations of vertices in G,
|E, α, β, γ⟩_p = P̃ |E(P^T A P), α, β, γ⟩_p,    (9)

with E(P^T A P) the graph edges after action of the permutation matrix on the adjacency matrix A of the given graph.

As mentioned before, our ansatz is closely related to the EH-QGCs in [33], and the authors of that work prove the equivariance of this type of circuit for unweighted adjacency matrices.
In order to prove equivariance of our circuit, we have to generalize their result to the case where a weighted graph is encoded in the form of a Hamiltonian and parametrized by a set of free parameters on these weights, as described in eq. (7). To guarantee this, we choose to assign one node and one edge parameter per layer, respectively, which leads us to the QAOA-type parametrization shown in eq. (4). For didactic reasons we provide a formal proof of our circuit's equivariance in appendix A. Note that we formulate the theorem in terms of the state vector |E, α, β, γ⟩_p instead of the unitary

U_N(α, β_p) U_G(E, γ_p) · · · U_N(α, β_1) U_G(E, γ_1),

as our initial state is always the uniform superposition.
The equivariance of this unitary holds regardless of the initial state, however, when choosing an initial state that is different from the uniform superposition, the equivariance of the state vector only holds if that initial state is permutation invariant.
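As a small self-contained numerical check of the key ingredient of this argument (written independently of the paper's appendix), one can verify that relabeling the vertices of the weighted adjacency matrix corresponds exactly to relabeling the qubits in the diagonal of H_G; the equivariance of U_G = exp(−iγH_G), and hence of the full layer, follows from this, since U_N treats all qubits identically up to the permuted node features.

```python
import numpy as np

def hg_diag(eps: np.ndarray) -> np.ndarray:
    # Diagonal of H_G = sum_{i<j} eps_ij Z_i Z_j in the computational basis.
    n = eps.shape[0]
    bits = (np.arange(2 ** n)[:, None] >> np.arange(n)[::-1]) & 1
    z = 1 - 2 * bits
    return sum(eps[i, j] * z[:, i] * z[:, j]
               for i in range(n) for j in range(i + 1, n))

rng = np.random.default_rng(1)
n = 4
a = rng.random((n, n)); a = (a + a.T) / 2; np.fill_diagonal(a, 0)   # weighted adjacency matrix
perm = rng.permutation(n)
p = np.eye(n)[perm]                                                 # permutation matrix P

h_orig = hg_diag(a)
h_perm = hg_diag(p.T @ a @ p)

for x in range(2 ** n):
    bits_x = [(x >> (n - 1 - q)) & 1 for q in range(n)]
    y_bits = [bits_x[perm[q]] for q in range(n)]                    # relabel the qubits by perm
    y = sum(b << (n - 1 - q) for q, b in enumerate(y_bits))
    assert np.isclose(h_perm[x], h_orig[y])
```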
IV. NEURAL COMBINATORIAL OPTIMIZATION WITH THE EQC

In this section, we formally define the learning task that we address in this work, and the specific setup of the EQC and its observables.
We show that each component of the QNCO scheme – Q-values and resulting tour – is equivariant under permutation of the vertices, and then analytically study the expressivity of our ansatz at depth one.
The heuristic takes as input an instance of the TSP in the form of a weighted 2D Euclidean graph G(V,E) with n = |V| nodes representing the cities and edge weights ε_ij = d(v_i, v_j), where d(v_i, v_j) is the Euclidean distance between nodes v_i and v_j.
Specifically, we are dealing with the symmetric TSP, where the edges in the graph are undirected.
Given G, the algorithm constructs a tour in n − 2 steps.
Starting from a given (fixed) node in the proposed tour Tt=1, in each step t of the tour selection process the algorithm proposes the next node (city) in the tour.
In order to refer to versions of the input graph at different time steps where the nodes that are already present in the tour are marked, we now define the annotated graph.
In each time step of an episode in the algorithm, the model is given an annotated graph as input.
Based on the annotated graph, the model should select the next node to add to the partial tour T_t at step t.
The annotation can be used to partition the nodes V into the set of available nodes V_a = {v_i | α_i^(t) = π} and the set of unavailable nodes V_u = {v_i | α_i^(t) = 0}.
The node selection process can now be defined as follows.
Definition 2 (Node selection).
Given an annotated graph G(V,E, α^(t)), the node selection process consists of selecting nodes in a tour in a step-wise fashion. The cost c(T) of a tour T is the sum of edge weights (distances) for all edges between the nodes in the tour, with E_T ⊂ E. We measure the quality of the generated tour in the form of the approximation ratio

c(T_n) / c(T*).    (11)

The rewards in this environment are defined by the difference in overall length of the partial tour T_t at time step t, and upon addition of a given node v_l at time step t + 1,

r_t = −(c(T_{t+1}) − c(T_t)).    (12)

Note that we use the negative of the cost as a reward, as a Q-learning agent will always select the action that leads to the maximum expected reward.
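Under this reading of eq. (12) (the reward as the negative increase in partial tour length, which is an interpretation of the description above rather than a verbatim reproduction), the reward can be sketched as follows.

```python
import numpy as np

def partial_tour_cost(eps: np.ndarray, tour: list) -> float:
    # Sum of edge weights along the (open) partial tour.
    return sum(eps[tour[k], tour[k + 1]] for k in range(len(tour) - 1))

def reward(eps: np.ndarray, tour_t: list, v_l: int) -> float:
    # r_t = -(c(T_{t+1}) - c(T_t)): the shorter the appended edge, the larger the reward.
    return -(partial_tour_cost(eps, tour_t + [v_l]) - partial_tour_cost(eps, tour_t))
```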
The learning process is defined in terms of a DQN algorithm, where the Q-function approximator is implemented in the form of a PQC (which is described in detail in section III).
Here, we define the TSP in terms of an RL environment, where the set of states S = {Gi(V,E, α(t)) for i = 1, . . . ,|X| and t = 1, . . . , n − 1} consists of all possible annotated graphs (i.e., all possible configurations of values of α(t)) for each instance i in the training set X .
This means that the number of states in this environment is |S| = 2^{n−1}|X|.
The action that the agent is required to perform is selecting the next node in each step of the node selection process described in definition 2, so the action space A consists of a set of indices for all but the first node in each instance (as we always start from the first node in terms of the list of nodes we are presented with for each graph, so α_1^(t) = 0, ∀ t), and |A| = n − 1.
The Q-function approximator gets as input an annotated graph, and returns as output the index of the node that should next be added to the tour.
All unavailable nodes v_l ∈ V_u are not included in the node selection process, so we manually set their Q-values to a large negative number to exclude them, e.g.,

Q(G_i(V,E, α^(t)), v_l) = −10000 ∀ v_l ∈ V_u.
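A sketch of this masking step combined with greedy action selection is shown below; the Q-values are placeholders here (in the paper they are obtained from measurements of the EQC), and the constant −10000 mirrors the text.

```python
import numpy as np

def select_next_node(q_values: np.ndarray, unavailable: list) -> int:
    masked = q_values.copy()
    masked[unavailable] = -10000.0     # exclude nodes that are already in the partial tour
    return int(np.argmax(masked))      # greedy choice among the remaining (available) nodes

q = np.array([0.3, -0.1, 0.7, 0.2, 0.5])
print(select_next_node(q, unavailable=[0, 2]))   # -> 4
```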
We also define a stopping criterion for our algorithm, which corresponds to the agent solving the TSP environment for a given instance size.
As we aim at comparing the results of our algorithm to optimal solutions in this work, we have access to a labeled set of instances and define our stopping criterion based on these.
However, note that the optimal solutions are not required for training, as a stopping criterion can also be defined in terms of number of episodes or other figures of merit that are not related to the optimal solution.
In this work, the environment is considered as solved and training is stopped when the average approximation ratio of the past 100 iterations is < 1.05, where an approximation ratio of 1 means that the agent returns the optimal solution for the instances it was presented with in the past 100 episodes.
In our numerical results shown in section V, however, most agents do not reach the stopping criterion of having an average approximation ratio below 1.05, and run for the predefined number of episodes instead.
We showed in section III A that our ansatz of arbitrary depth is permutation equivariant.
Now we proceed to show that the Q-values that are generated from measurements of this PQC, and the tour generation process as described in section IV A are equivariant as well.
While the equivariance of all components of an algorithm is not a prerequisite to harness the advantage gained by an equivariant model, knowing which parts of our learning strategy fulfill this property provides additional insight for studying the performance of our model later.
As we show that the whole node selection process is equivariant, we know that the algorithm will always generate the same tour for every possible permutation of the input graph for a fixed setting of parameters, given that the model underlying the tour generation process is equivariant.
This is not necessarily true for a non-equivariant model, and simply by virtue of giving a permuted graph as input, the algorithm can potentially return a different tour.
Theorem 2. Let p_n be a permutation of n = |V| elements, where the l-th element corresponds to the l-th vertex v_l, and let p_n^Q be a permutation that reorders the Q-values Q(G_i^(t)) = {Q(G_i^(t), v_1), . . . , Q(G_i^(t), v_n)} in correspondence to the reordering of the vertices by p_n. Then Q(G_i^(t), v_l) is equivariant under permutation of the vertices v_l,

Q(G_i^(t), v_l) = Q(G_i^(t,p_n), v_{p_n^Q(l)}),    (15)

where G_i^(t,p_n) is the permuted graph at step t, and v_{p_n^Q(l)} is the vertex corresponding to v_l after the permutation p_n.

Proof. We know from Theorem 1 that the ansatz that we use, and therefore the expectation values ⟨O_{v_l}⟩, are permutation equivariant. Beyond these expectation values, the Q-values (eq. (14)) only depend on the edge weights of the graph G. The edge weights are computed according to the graph's adjacency matrix, and are therefore re-ordered under a permutation of the vertices p_n and assigned to their corresponding permuted expectation values.

As a second step to show that all components of our algorithm are permutation equivariant, it remains to show that the tours that our model produces as described in section IV A are also equivariant under permutations.
Theorem 3. Let T_t(G_i, β, γ, v_0) be a tour generated by a permutation equivariant agent, implemented with a PQC as defined in Theorem 1 and Q-values as defined in Theorem 2, for a fixed set of parameters β, γ and a given start node v_0, where a tour is a cycle over all vertices v_l ∈ V that contains each vertex exactly once. Let p_n be a permutation of the vertices V. The output tour is equivariant under permutation of the vertices,

T_t(G_i, β, γ, v_0) = T_t(G_{i,p_n}, β, γ, v_{p_n(0)}).    (16)

Proof. We have shown in Theorem 2 that the Q-values of our model are permutation equivariant, meaning that a permutation of the vertices results in a reordering of the Q-values.
As we have shown in Theorem 1, equivariance of the model holds for arbitrary input graphs, so in particular it holds for each G(t) in the action selection process, and the output tour under the permuted graph is equal to the output tour under the original graph up to a renaming of the vertices.
C. Analysis of expressivity

In this section, we analyze under which conditions there exists a setting of β, γ for a given graph instance G_i for our ansatz at depth one that can produce the optimal tour for this instance.
Note that this does not show anything about constructing the optimal tour for a number of instances simultaneously with this set of parameters, or how easy it is to find any of these sets of parameters.
Those questions are beyond the scope of this work.
The capability to produce optimal tours at any depth for individual instances is of interest because first, we do not expect that the model can find a set of parameters that is close-to-optimal for a large number of instances if it is not expressive enough to contain a parameter setting that is optimal for individual instances.
Second, the goal of a ML model is always to find similarities within the training data that can be used to generalize well on the given learning task, so the ability to find optimal solutions on individual instances is beneficial for the goal of generalizing on a larger set of instances.
Additionally, how well the model generalizes also depends on the specific instances and the parameter optimization routine, and therefore it is hard to make formal statements about the general case where we find one universal set of parameters that produces the optimal solution for arbitrary sets of instances.
(17)

where v_{t−1} is the last node in the partial tour and v_l is the candidate node. In order to generate an arbitrary tour of our choice, in particular also the optimal tour, it suffices to guarantee that for a suitable choice of (fixed) γ, at each step in the node selection process the edge we want to add next to the partial tour has a positive expectation value in eq. (17), such that only the expectation values corresponding to edges that we want to select are positive, and all others are negative.
To understand whether this is possible, we can leverage known results about the expressivity of the sine function.
First, it is known that the function class f(x) = sign(sin(xω)), parametrized by a single real-valued parameter ω, has infinite Vapnik-Chervonenkis dimension (VC-dimension), also called shattering dimension.
Let X = {x1, . . . , xn} be a set of n data points labeled with binary labels in Y = {y1, . . . , yn}.
The set of points X is said to be shattered by a function class F if for all y ∈ Y^n there exists a function f ∈ F s.t. f(x_i) = y_i, i = 1, . . . , n.
The VC-dimension of a function class F is the size of the largest set of points that can be shattered by F.
This result states that there exists at least one data set of infinite size that can be shattered by the function class f (x) = sign(sin(xω)), and a typical example of such a set is {2−m : m ∈ N} [41].
It does not tell us, however, if an arbitrary set of distinct points with labels in ±1 can be shattered by this function class.
In fact, it is known that this is not the case, and one can easily show that the simple set of evenly spaced points {x, 2x, 3x, 4x} with x ∈ R can not be shattered by f (x) [41].
To understand whether we can use the sine function to produce the labeling of edge weights of our choice, we can turn to another result, namely that for any rationally independent set {x_1, ..., x_n} with labels y_i ∈ (−1, 1), the sine function can approximate these labels to arbitrary precision, as shown in [42], i.e.,

∀ ε > 0 ∃ ω ∈ R : |sin(ω x_i) − y_i| < ε, i = 1, . . . , n.    (18)

In general, the edge weights of graphs that represent TSP instances are not rationally independent.¹
However, in principle they can easily be made rationally independent by adding a finite perturbation ε′_i to each edge weight. The results in [42] imply that almost any set of points x_1, . . . , x_n with 0 < x_i < 1 is rationally independent, so we can choose ε′_i to be drawn uniformly at random from (0, ε′_max]. As long as these perturbations are applied to the edge weights in a way that does not change the optimal tour, as could be done by ensuring that ε′_max is small enough so that the proportions between edge weights are preserved, we can use this perturbed version of the graph to infer the optimal tour.
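A minimal sketch of this perturbation step (illustrative; `eps_max` is a hypothetical bound that has to be chosen small enough to preserve the optimal tour, as discussed above, and a uniform draw from [0, eps_max) differs from (0, eps_max] only on a set of measure zero):

```python
import numpy as np

def perturb_edge_weights(eps: np.ndarray, eps_max: float = 1e-9, seed: int = 0) -> np.ndarray:
    # Add a small random perturbation to every edge weight so that the weights are
    # almost surely rationally independent, while keeping the graph symmetric.
    rng = np.random.default_rng(seed)
    n = eps.shape[0]
    noise = np.triu(rng.uniform(0.0, eps_max, size=(n, n)), k=1)
    return eps + noise + noise.T
```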
1 The real numbers x1, . . . , xn are said to be rationally independent if no integers k1, . . . , kn exist such that x1k1 +· · · + xnkn = 0, besides the trivial solution ki = 0 ∀ k.
Rational independence also implies the points are not rational numbers, so they are also not numbers normally represented by a computer.
This means that our ansatz at depth one can produce arbitrary labelings of our edges, which in turn let us produce expectation values such that only the ones that correspond to edges in the tour of our choice will have positive values. For completeness, we provide a proof for this case in section IV C. We do not, however, go deeper into this discussion, since in fact we do not want to rely on this proof of optimality as a guiding explanation of how the algorithm works.
This is similar in vein to universality results for QAOA-type circuits, where it can be shown that for very specific types of Hamiltonians, alternating applications of the cost and mixer Hamiltonian leads to quantum computationally universal dynamics, i.e. it can reach all unitaries to arbitrary precision [43, 44], but these Hamiltonians are not related to any of the combinatorial optimization problems that were studied in the context of the QAOA.
While these results provide valuable insight into the expressivity of the models, in our case they do not inform us about the possibility of a quantum advantage on the learning problem that we study in this work.
In particular, we do not know from these results whether the EQC utilizes the information provided by the graph features in a way in which the algorithm benefits from the quantumness of the model, at depth one or otherwise.
As it is known that the QAOA applied to ground state finding benefits from interference effects, investigating whether similar results hold for our algorithm is an interesting question that we leave for future work.
Additionally, we note that high expressivity alone does not necessarily lead to a good model, and may even lead to issues in training, such as the well-studied phenomenon of barren plateaus [15], or a susceptibility to overfitting on the training data.
In practice, the best models are those that strike a balance between being expressive enough, and also restricting the search space of the model in a way that suits the given training data.
Studying and designing models that have this balance is exactly the goal of geometric learning, and the equivariance we have proven for our model is a helpful geometric prior for learning tasks on graph inputs.
V. NUMERICAL RESULTS

A. Comparison of the EQC and non-equivariant ansatzes on the TSP

FIG. 4: One layer of each of the circuits studied in this work. a) The EQC with two trainable parameters β, γ per layer. b) The same set of gates as in the EQC, but we break equivariance by introducing one individual free parameter per gate (denoted NEQC). c) Similar to the NEQC, but we start from the all-zero state and add a final layer of trainable one-qubit gates and a ladder of CZ-gates (denoted hardware-efficient with trainable encoding, HWETE). d) Same as the HWETE, but only the single-qubit Y-rotation parameters are trained (denoted HWE).

After proving that our model is equivariant under node permutations and analytically studying the expressivity of our ansatz, we now numerically study the training and validation performance of this model on TSP instances of varying size in a NCO context.
The training data set that we use is taken from [3], where the authors propose a novel classical attention approach and evaluate it on a number of geometric learning tasks.2
As described in section IV A, the environment is considered as solved by an agent when the running average of the approximation ratio over the past 100 episodes is less than 1.05.
In a realistic scenario where one does not have access to optimal solutions, the algorithm would simply run for a fixed number of episodes or until another convergence criterion is met.
When evaluating the final average approximation ratios, we always use the parameter setting that was stored in the final episode, regardless of the final training error.
2 We note that we have re-computed the optimal tours for all instances that we use, as the data set uploaded by the authors of [3] erroneously contains sub-optimal solutions.
As we are interested in the performance benefits that we gain by using an ansatz that respects an important graph symmetry, we compare our model to versions of the same ansatz where we gradually break the equivariance property.
We start with the simplest case, where the circuit structure is still the same as for the EQC, but instead of having one β_l, γ_l in each layer, every X- and ZZ-gate is individually parametrized. As these parameters are now tied directly to certain one- and two-qubit gates, e.g., an edge between qubits one and two, they will not change location upon a graph permutation and therefore break equivariance.
We call this the non-equivariant quantum circuit (NEQC).
To go one step further, we take the NEQC and add a variational part to each layer that is completely unrelated to the graph structure: namely a hardware-efficient layer that consists of parametrized Y-rotations and a ladder of CZ-gates.
In this ansatz, we have a division between a data encoding part and a variational part, as is often done in QML.
To be closer to standard types of ansatzes often used in QML, we also omit the initial layer of H-gates here and start from the all-zero state (which requires us to switch the order of X- and ZZ-gates).3

FIG. 5: Comparison between the EQC and its non-equivariant version (NEQC) where each gate is parametrized separately as described in section V. Results are shown on TSP instances with 5, 10 and 20 cities (TSP5, TSP10 and TSP20, respectively), averaged over ten agents each. To provide a classical baseline, we also show results for the nearest-neighbor heuristic (NN). a) and b) show the training and validation performance for both ansatzes with one layer, while c) and d) show the same for four layers. The dashed, grey line on the left-hand side figures denotes optimal performance. The dotted, black line on the right-hand side figures denotes the upper bound of the Christofides algorithm, a popular classical approximation algorithm that is guaranteed to find a solution that is at most 1.5 times as long as the optimal tour. Figures a) and c) show the running average over the last ten episodes.
We denote this the hardware-efficient with trainable embedding (HWETE) ansatz.
Finally, we study a third ansatz, where we take the HWETE and now only train the Y-rotation gates, and the graph-embedding part of the circuit only serves as a data encoding step.
The nearest-neighbor algorithm finds a solution quickly also for instances with increasing size, but there is no guarantee that this tour is close to the optimal one.
3 However, in practice it did not make a difference whether we started from the all-zero or uniform superposition state in the learning task that we study.
4 For reference, the authors of [3], who generated the training instances that we use, stop comparing to optimal solutions at n = 20 as it becomes extremely costly to find optimal tours from thereon out.
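For reference, a compact sketch of the nearest-neighbor heuristic used as the classical baseline (NN) above: starting from a given city, repeatedly move to the closest not-yet-visited city.

```python
import numpy as np

def nearest_neighbor_tour(eps: np.ndarray, start: int = 0) -> list:
    n = eps.shape[0]
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda v: eps[last, v])   # closest remaining city
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour                                            # the cycle closes back to `start`
```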
We show results for TSP instances with five and ten cities (TSP5, TSP10 respectively) for models trained on 10 and 100 instances, and with one and four layers.
The hardware-efficient ansatz with trainable embedding (HWETE) consists of trainable graph encoding layers as those in the EQC, with an additional variational part in each layer that consists of parametrized single-qubit Y-gates and a ladder of CZ-gates.
Additionally, we add the upper bound given by one of the most widely used approximation algorithms for the TSP (as implemented, e.g., in Google OR-Tools): the Christofides algorithm.
This algorithm is guaranteed to find a tour that is at most 1.5 times as long as the optimal tour [46].
In the case where any of our models produces validation results that are on average above this upper bound of the Christofides algorithm, we consider it failed, as it is more efficient to use a polynomial approximation algorithm for these instances.
The results of this comparison are shown in fig. 5 and fig. 6. Geometric learning models are expected to be more data-efficient than their unstructured counterparts, as they respect certain symmetries in the training data.
This means that when a number of symmetric instances are present in the training or validation data, the effective size of these data sets is decreased.
In our comparison of the EQC and the NEQC, we fix the number of training samples and compare the different models in terms of circuit depth and number of parameters to achieve a certain validation error and expect that the EQC will need fewer layers to achieve the same validation performance as the NEQC.
In fig. 5 a) and b), we show the training and validation performance of both ansatzes at depth one. For instances with five cities, the difference between the two ansatzes is still small. As the instance size increases, the gap between EQC and NEQC becomes bigger.
We see that even though the two ansatzes are structurally identical, the specific type of parametrizations we choose and the properties of both ansatzes that result from this make a noticeable difference in performance.
While the EQC at depth one has only two parameters per layer regardless of instance size, the NEQC’s number of parameters per layer depends on the number of nodes and edges in the graph.
Increasing the depth of the circuits also does not change this.
In fig. 5 c) and d) we see that at a depth of four, the EQC still beats the NEQC.
The latter's validation performance even slightly decreases with more layers, which is likely due to the increased complexity of the optimization task, as the number of trainable parameters per layer is (n−1)n/2 + n, which for the 20-city instances means 840 trainable parameters at depth four (compared to 8 parameters in case of the EQC).
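For concreteness, the parameter counts quoted above follow directly from these expressions:

```python
# Parameter counting for the 20-city instances at depth four:
n, layers = 20, 4
neqc_per_layer = (n - 1) * n // 2 + n        # 210 parameters per NEQC layer
print(neqc_per_layer * layers, 2 * layers)   # 840 (NEQC) vs. 8 (EQC)
```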
This shows that at a fraction of the number of trainable parameters, the EQC is competitive with its non-equivariant counterpart even though the underlying structure of both circuits is identical.
Compared to the classical nearest-neighbor heuristic, both ansatzes perform well and beat it at all instance sizes, and both ansatzes are also below the approximation ratio upper bound given by the Christofides algorithm on all validation instances.
Next, we compare the EQC to ansatzes in which we introduce additional variational components that are completely unrelated to the training data structure, as described above.
We show results for the HWETE and the HWE ansatz in fig.
6. To our own surprise, we did not manage to get satisfactory results with either of those two ansatzes, especially at larger instances, despite an intensive hyperparameter search.
Even the HWETE, which is basically identical to the NEQC with additional trainable parameters in each layer, failed to show any significant performance.
Additionally, we show how the validation performance changes when the models are trained with a training data set consisting of either 10 or 100 instances, in the hopes of seeing improved performance as the size of the training set increases.
This example shows that in a complex learning scenario, where the number of permutations of each input instance grows combinatorially with instance size and the number of states in the RL environment grows exponentially with the number of nodes, ansatzes with components that are completely unrelated to the data structure are very hard to train.
While increasing the size of the training set and/or the number of layers in the circuit seems to provide small advantages in some cases, it also leads to a decrease in performance in others.
On the other hand, the EQC is mostly agnostic to changes in the number of layers or the training data size.
Overall, we see that the closer the ansatz is to an equivariant configuration, the better it performs, and picking ansatzes that respect symmetries inherent to the problem at hand is the key to success in this graph-based learning task.
In section V A, we compared the performance of the EQC to a number of non-equivariant ansatzes and the classical nearest-neighbor heuristic.
We have also seen in section III that our ansatz is closely related to that of the QAOA.
As the QAOA is arguably the most explored variational quantum optimization algorithm at the time of writing, and due to the structural similarity between the EQC and the QAOA’s ansatz, we also compare these two approaches on TSP instances with five cities.
In general, for AQC, we consider a starting Hamiltonian H0, for which both the formulation and the ground state are well known, and a final Hamiltonian HP that encodes the combinatorial optimization problem to be solved.
The system is prepared in the ground state of the Hamiltonian H0 and then it is evolved according to the time-dependent Hamiltonian

H(t) = (1 − s(t)) H0 + s(t) HP,

where s(t) is a real function called the annealing schedule that satisfies the boundary conditions s(0) = 0 and s(T) = 1, with T the duration of the evolution.
We define a parameter p (an integer known as the depth, or level) of the QAOA, which has the same role as r in eq. (19). Increasing the depth p adds additional layers to the QAOA circuit, and thus more closely approximates the evolution under H(t) [9].
In QAOA, all qubits are initialized to |+⟩⊗n, which is the ground state of H0 = Σi σx(i). Alternating layers of HP and H0 are added to the circuit (p times), parameterized by γ and β as defined in eq. (20). The values of γ and β are found by minimizing the expectation value of HP, and thus approximate the optimal solution to the original combinatorial optimization problem.
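For reference, the resulting depth-p state can be written in the standard QAOA form sketched below (consistent with the description above; the exact parametrization of eq. (20) is not reproduced here):

\[
|\boldsymbol{\gamma}, \boldsymbol{\beta}\rangle = \prod_{j=1}^{p} e^{-i\beta_j H_0}\, e^{-i\gamma_j H_P}\, |+\rangle^{\otimes n},
\qquad
(\boldsymbol{\gamma}^{*}, \boldsymbol{\beta}^{*}) = \arg\min_{\boldsymbol{\gamma}, \boldsymbol{\beta}} \langle \boldsymbol{\gamma}, \boldsymbol{\beta} | H_P | \boldsymbol{\gamma}, \boldsymbol{\beta} \rangle .
\]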
on the results of the previous step.
In fig. 7 we show our results for five-city instances of the TSP.
The approximation ratio shown is derived by dividing the tour length of the best feasible solution, measured as the output of the trained QAOA circuit, by the optimal tour length of the respective instance.
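As a minimal sketch of how this metric is computed (assuming the instance is given as a symmetric distance matrix and tours are permutations of city indices; the function and variable names below are illustrative, not taken from the released code):

import numpy as np

def tour_length(tour, dist):
    """Length of the closed tour that visits the cities in the given order."""
    return sum(dist[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def approximation_ratio(best_feasible_tour, optimal_tour, dist):
    """Ratio of the best feasible tour length (e.g. decoded from the trained QAOA
    output) to the optimal tour length of the same instance."""
    return tour_length(best_feasible_tour, dist) / tour_length(optimal_tour, dist)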
In addition, we compute results for two different p = 3 QAOA circuits: the first is trained in the procedure described above (where the parameters are trained for each instance).
The second uses the parameters of the best QAOA circuit out of those for all instances evaluated at p = 3, following a concentration of parameters argument as presented in [49].
The second method is closer to what is done in a ML context, where one set of parameters is used to evaluate the performance on all validation samples.
We note that the performance of QAOA improves with higher p; however, QAOA performance is still far from matching the approximation ratios obtained by the EQC even for p = 3, as can be seen in fig. 7.
Furthermore, we note that significant computational effort is required to obtain these results: iterative classical optimizers like COBYLA require us to evaluate the circuit many times until either convergence or the maximum number of iterations is reached.
We also note that due to the heuristic optimization of the QAOA parameters themselves, we are not guaranteed that the configuration of parameters is optimal, which may result in either insufficient iterations to converge or premature convergence to sub-optimal parameter values.
In an attempt to mitigate this, we tested several optimizers (Adam, SPSA, BFGS and COBYLA) and used the best results, which were those found by COBYLA.
We see that already on these small instances, the QAOA requires significantly deep circuits to achieve good results that may be out of reach in a NISQ setting.
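The classical outer loop we refer to can be sketched as follows (a minimal illustration, not the exact setup used for our experiments; qaoa_expectation is a hypothetical helper that would build and simulate the depth-p QAOA circuit for a given instance and return the expectation value of HP):

import numpy as np
from scipy.optimize import minimize

def qaoa_expectation(params, graph, p):
    """Placeholder: run the depth-p QAOA circuit for `graph` with parameters
    [gamma_1..gamma_p, beta_1..beta_p] and return <H_P>. A real implementation
    would build and simulate the circuit with a quantum SDK."""
    raise NotImplementedError

def train_qaoa(graph, p, n_restarts=10, maxiter=500, rng=None):
    """Optimize (gamma, beta) with COBYLA from several random starts, keep the best result."""
    rng = np.random.default_rng() if rng is None else rng
    best_params, best_value = None, np.inf
    for _ in range(n_restarts):
        x0 = rng.uniform(0, 2 * np.pi, size=2 * p)
        res = minimize(lambda x: qaoa_expectation(x, graph, p), x0,
                       method="COBYLA", options={"maxiter": maxiter})
        if res.fun < best_value:
            best_params, best_value = res.x, res.fun
    return best_params, best_value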
FIG. 7. Approximation ratio of the QAOA at depths p = 1, 2, 3, and for the concentrated parameters at p = 3. The last box shows the results for the best parameters found for one instance at p = 3 applied to all training instances, following a parameter concentration argument. The dotted, black line denotes the upper bound of the Christofides algorithm.
VI. DISCUSSION

After providing analytic insight on the expressivity of our ansatz, we have numerically investigated the performance of our EQC model on TSP instances with 5, 10, and 20 cities (corresponding to 5, 10, and 20 qubits, respectively), and compared them to other types of ansatzes that do not respect any graph symmetries.
To get a fair comparison, we designed PQCs that gradually break the equivariance property of the EQC and assessed their performance.
We find that ansatzes that contain structures that are completely unrelated to the input data structure are extremely hard to train for this learning task where the size of the state space scales exponentially in the number of input nodes of the graph.
The EQC, on the other hand, works almost out of the box, and achieves good generalization performance with minimal hyperparameter tuning and relatively few trainable parameters.
We have also compared using the EQC in a neural combinatorial optimization scheme with the QAOA, and find that even on TSP instances with only five cities the NCO approach significantly outperforms the QAOA.
In addition to training the QAOA parameters for every instance individually, we have also investigated the performance in light of known parameter concentration results that state that in some cases, parameters found on one instance perform well on average for other instances of the same problem.
Comparing our algorithm to the QAOA is also interesting from a different perspective.
In section III we have seen that our ansatz can be regarded as a special case of a QAOA-type ansatz, where instead of encoding a problem Hamiltonian we encode a graph instance directly, and include mixing terms only for a problem-dependent subset of qubits.
For the QAOA, it is known that in the limit of infinite depth, it can find the ground state of the problem Hamiltonian and therefore the optimal solution to a given combinatorial optimization problem [9].
Additionally, it has been shown that even at low depth, the probability distributions generated by QAOA-type circuits are hard to sample from classically [51].
In our case, the expectation values used for node selection at depth one only contain:
i) a term corresponding to the edge between the last added node and the candidate node, and
ii) all outgoing edges from the candidate node.
So our model considers the one-step neighborhood of each candidate node at depth one.
In the case of the TSP it is not clear whether this can provide a quantum advantage for the learning task as specified in section IV A. In terms of QAOA, it was shown that in order to find optimal solutions, the algorithm has to “see the whole graph” [52], meaning that all edges in the graph contribute to the expectation values used to minimize the energy.
To alleviate this strong requirement on depth, a recursive version of the QAOA (RQAOA) was introduced in [53].
It works by iteratively merging edges in the problem graph based on their correlation, and thereby gradually reducing the problem to a smaller instance that can be solved efficiently by a classical algorithm, e.g. by brute-force search.
The authors of [53] show that the depth-one RQAOA outperforms QAOA with constant depth p, and that RQAOA achieves an approximation ratio of one for a family of Ising Hamiltonians.
While the RQAOA at depth one can be considered a classical algorithm due to its efficient simulability, a subsequent work compared higher depth versions of RQAOA to the best known generic classical algorithm for graph coloring and showed that these deeper versions of RQAOA outperform the classical approach [54].
This suggests that there is a potential for advantage in this setting as well.
The node selection process performed by our algorithm with the EQC used as the ansatz is similar to RQAOA, where instead of merging edges, the mixer terms for nodes that have already been selected are turned off, therefore effectively setting the expectation values of edges corresponding to unavailable nodes to zero.
It is an interesting question whether this type of model can lead to a quantum advantage on the TSP problem that we study here, and we leave this for future work.
Inspired by classical geometric deep learning and in an effort to illustrate the importance of problem-tailored ansatzes for QML, we have proposed an ansatz for learning on weighted graphs in this work.
A model that does not inherently respect this symmetry will treat each permutation of a training instance as a separate data point and thus has to learn good parameter settings for each individual permutation.
Our equivariant model, on the other hand, “recognizes” a permutation of a graph instance and will therefore not need separate training data points to learn parameters for each possible permutation, of which there are combinatorially many.
Our results illustrate the need for QML ansatzes that are motivated by the training data structure.
We introduced such an ansatz for learning tasks on weighted graphs; however, more research is required to find suitable ansatzes for other types of data as well, in order to make a near-term advantage in QML possible in the future.
VD also acknowledges support from the project NEASQC, funded by the European Union’s Horizon 2020 research and innovation programme (grant agreement No 951821).
[1] Quentin Cappart, Didier Chételat, Elias Khalil, Andrea Lodi, Christopher Morris, and Petar Veličković. Combinatorial optimization and reasoning with graph neural networks. arXiv preprint arXiv:2102.09544, 2021.
[2] Yoshua Bengio, Andrea Lodi, and Antoine Prouvost. Machine learning for combinatorial optimization: a methodological tour d’horizon. European Journal of Operational Research, 290(2):405–421, 2021.
[3] Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. Pointer networks. arXiv preprint arXiv:1506.03134, 2015.
[4] Hanjun Dai, Elias B Khalil, Yuyu Zhang, Bistra Dilkina, and Le Song. Learning combinatorial optimization algorithms over graphs. arXiv preprint arXiv:1704.01665, 2017.
[5] Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Sungmin Bae, et al. Chip placement with deep reinforcement learning. arXiv preprint arXiv:2004.10746, 2020.
[6] Mohammadreza Nazari, Afshin Oroojlooy, Lawrence V Snyder, and Martin Takáč. Reinforcement learning for solving the vehicle routing problem. arXiv preprint arXiv:1802.04240, 2018.
[7] Michael M Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković. Geometric deep learning: Grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478, 2021.
[8] Marco Cerezo, Andrew Arrasmith, Ryan Babbush, Simon C Benjamin, Suguru Endo, Keisuke Fujii, Jarrod R McClean, Kosuke Mitarai, Xiao Yuan, Lukasz Cincio, et al. Variational quantum algorithms. Nature Reviews Physics, 3(9):625–644, 2021.
[9] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028, 2014.
[10] Tavis Bennett, Edric Matwiejew, Sam Marsh, and Jingbo B Wang. Quantum walk-based vehicle routing optimisation. Frontiers in Physics, page 692, 2021.
[11] Harper R Grimsley, Sophia E Economou, Edwin Barnes, and Nicholas J Mayhall. Adapt-VQE: An exact variational algorithm for fermionic simulations on a quantum computer.
[12] Alberto Peruzzo, Jarrod McClean, Peter Shadbolt, Man-Hong Yung, Xiao-Qi Zhou, Peter J Love, Alán Aspuru-Guzik, and Jeremy L O’Brien. A variational eigenvalue solver on a photonic quantum processor. Nature Communications, 5(1):1–7, 2014.
[13] Abhinav Kandala, Antonio Mezzacapo, Kristan Temme, Maika Takita, Markus Brink, Jerry M Chow, and Jay M Gambetta. Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature, 549(7671):242–246, 2017.
[14] Marcello Benedetti, Erika Lloyd, Stefan Sack, and Mattia Fiorentini. Parameterized quantum circuits as machine learning models. Quantum Science and Technology, 4(4):043001, 2019.
[15] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature Communications, 9(1):1–6, 2018.
[16] Andrew Arrasmith, M Cerezo, Piotr Czarnik, Lukasz Cincio, and Patrick J Coles. Effect of barren plateaus on gradient-free optimization. Quantum, 5:558, 2021.
[17] Carlos Ortiz Marrero, Mária Kieferová, and Nathan Wiebe. Entanglement-induced barren plateaus. PRX Quantum, 2:040316, 2021.
[18] In International Conference on Artificial Neural Networks, pages 159–164. Springer, 1998.
[19] Code for NCO used in this work: https://github.com/askolik/eqc.
[20] John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russ Bates, Augustin Žídek, Anna Potapenko, et al. Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873):583–589, 2021.
[21] Kathryn Tunyasuvunakool, Jonas Adler, Zachary Wu, Tim Green, Michal Zielinski, Augustin Žídek, Alex Bridgland, Andrew Cowie, Clemens Meyer, Agata Laydon, et al. Highly accurate protein structure prediction for the human proteome. Nature, 596(7873):590–596, 2021.
[34] Martín Larocca, Frédéric Sauvage, Faris M. Sbahi, Guillaume Verdon, Patrick J. Coles, and M. Cerezo. Group-invariant quantum machine learning. arXiv preprint arXiv:2205.02261, 2022.
[35] Richard S Sutton and Andrew G Barto. Reinforcement learning: An introduction. MIT Press, 2018.
[36] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
[37] Samuel Yen-Chi Chen, Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, and Hsi-Sheng Goan. Variational quantum circuits for deep reinforcement learning. IEEE Access, 2020.
[49] Fernando G. S. L. Brandao, Michael Broughton, Edward Farhi, Sam Gutmann, and Hartmut Neven. For fixed control parameters the quantum approximate optimization algorithm’s objective function value concentrates for typical instances. arXiv preprint arXiv:1812.04170, 2018.
Song Mei, Theodor Misiakiewicz, and Andrea Montanari. Learning with invariances in random features and kernel models. In Conference on Learning Theory, pages 3351–3418. PMLR, 2021.
Appendix A: Proof of equivariance of ansatz
Proof of theorem 1. First, we show that one layer of the ansatz is equivariant under permutation of graph vertices.
Our ansatz is initialized in the uniform superposition, which is permutation invariant by definition, so we only have to show that the unitaries in each layer of the ansatz are equivariant.
Each layer consists of two terms: UN (α, βj) and UG(E, γj).
We need to show that both of these fulfill the equivariance property.
The first term contains only single-qubit operations, where each operation is defined w.r.t. vertices in G. For an arbitrary quantum state |ψ⟩ defined on a register of n qubits, a permutation of qubits corresponds to applying a series of SWAP-gates on those qubits.
The SWAP operation leaves the individual quantum states unchanged, and merely reassigns them to a new position in the register, therefore fulfilling the above equivariance property.
ZZ-operators are diagonal and therefore commute, so the order in which they are executed does not matter.
In addition, the assignment of edge weights to the given ZiZj is determined by the adjacency matrix A of the graph, and therefore a permutation on the adjacency matrix directly translates to a re-assignment of edge weights to their corresponding ZiZj according to the permuted adjacency matrix, which makes the second term equivariant as well.
Additionally, the permutation invariance of the trainable parameters is guaranteed by using one parameter shared among all ZZ-terms and one shared among all X-rotations in each layer.
Note that parameters not being tied to specific edge weights is necessary, because otherwise trainable parameters would be assigned to different edge weights depending on the specific graph permutation, which would break equivariance.
Now that we have shown that a single layer in our ansatz is equivariant under permutation of vertices, it remains to show that the composition of multiple layers is equivariant as well.
This easily follows from the above, as we have already shown that the permutation of the single-qubit operators corresponding to vertices is akin to a SWAP operation on a quantum state |ψ⟩, which itself is permutation equivariant.
The above proof holds for the case of one qubit per node, and whether an ansatz with multiple qubits per node will still be equivariant depends on the chosen mapping of node and edge features.
A trivial example where the above holds is when multiple node features are encoded via single-qubit rotations on each additional qubit, and edge features are still encoded in the form of commuting two-qubit gates as above.
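The equivariance argument above can also be verified numerically. The following is a minimal Python sketch (not the code released with this work) for the single-qubit-per-node case, with one β shared by all X-rotations and one γ shared by all edge-weighted ZZ-terms in a layer; it checks that P U(A) P† = U(P A Pᵀ) for a random weighted graph and a random node permutation:

import itertools
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def kron_list(ops):
    out = np.array([[1.0 + 0j]])
    for op in ops:
        out = np.kron(out, op)
    return out

def layer_unitary(adj, beta, gamma):
    """One layer: exp(-i*gamma*sum_{i<j} A_ij Z_i Z_j) followed by exp(-i*beta*sum_i X_i)."""
    n = adj.shape[0]
    h_zz = sum(adj[i, j] * kron_list([Z if k in (i, j) else I2 for k in range(n)])
               for i, j in itertools.combinations(range(n), 2))
    u_g = np.diag(np.exp(-1j * gamma * np.diag(h_zz)))  # the ZZ part is diagonal
    h_x = sum(kron_list([X if k == i else I2 for k in range(n)]) for i in range(n))
    w, v = np.linalg.eigh(h_x)
    u_n = v @ np.diag(np.exp(-1j * beta * w)) @ v.conj().T  # shared-beta X-rotation layer
    return u_n @ u_g

def qubit_permutation_matrix(perm):
    """Unitary that moves qubit i of a basis state to position perm[i]."""
    n = len(perm)
    P = np.zeros((2**n, 2**n), dtype=complex)
    for idx in range(2**n):
        bits = [(idx >> (n - 1 - i)) & 1 for i in range(n)]
        new_bits = [0] * n
        for i, b in enumerate(bits):
            new_bits[perm[i]] = b
        P[sum(b << (n - 1 - i) for i, b in enumerate(new_bits)), idx] = 1.0
    return P

rng = np.random.default_rng(0)
n = 3
A = rng.random((n, n))
A = (A + A.T) / 2
np.fill_diagonal(A, 0)        # random symmetric weighted graph, no self-loops
perm = [2, 0, 1]              # node permutation pi(0)=2, pi(1)=0, pi(2)=1
A_perm = np.zeros_like(A)
for i in range(n):
    for j in range(n):
        A_perm[perm[i], perm[j]] = A[i, j]
P = qubit_permutation_matrix(perm)
U = layer_unitary(A, beta=0.37, gamma=0.81)
U_perm = layer_unitary(A_perm, beta=0.37, gamma=0.81)
assert np.allclose(P @ U @ P.conj().T, U_perm)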
Appendix B: Optimal tours for rationally independent edge weights

In this section, we show how the fact that the sine function can approximate an arbitrary set of rationally independent values {x1, . . . , xn} with labels yi ∈ [−1, 1] can be used to prove that our ansatz at depth one can construct the optimal tour for a graph with edge weights that are rationally independent.
Theorem 4 (Ansatz can generate optimal tours for rationally independent edge weights).
There exists a setting (β, γ)∗ for each graph instance of the symmetric TSP such that the ansatz at depth one described in section III will produce the optimal tour T∗ with the node selection process described in definition 2, given that the edge weights εij of the graph are rationally independent and εijγ ≠ π/4 mod π, so that sin(εijγ) ≠ cos(εijγ).
Proof. As known from [42], we can find a parameter ω such that we can approximate an arbitrary labeling in [−1, 1] for our rationally independent edge weights with the sine function.
This means that this term can merely flip the sign of all ⟨Ovl⟩, and from now on w.l.o.g. we assume that β is such that the term is positive.
Now we can again formulate the tour generation task in terms of a binary classification problem, where we want to find a configuration of labels for our remaining sine terms in eq. (B2), s.t. the product will have the highest expectation value in each node selection step for the edge that produces the ordering we have chosen for T.
This means that we have to find an assignment of the edges εij to the classes f± that at each step of the node selection process will lead to the node being picked that we specify in T .
As all edges can occur in the above products multiple times during the node selection process, this is a non-trivial task.
However, if we can guarantee that each ⟨Ovl⟩t at node selection step t contains at least one unique term that is only present in this specific expectation value, we can use this term to control the value of this expectation value independently of all others.
Each εij occurs either in the leading term sin(εijγ) (corresponding to the candidate edge to be potentially added in the next step) or in the product term as sin(π/2 − εijγ) (corresponding to an outgoing edge from the current candidate).
We can easily see that the leading term only appears in the case when we ask for this specific εij to be the next edge in the tour, and from definition 2 we know that this only happens once in the node selection process.
In all other expectations, εij appears only with the “offset” of π/2.
This means that this leading term is the unique term that we are looking for, as long as

sin(εijγ) ≠ sin(π/2 − εijγ), i.e., sin(εijγ) ≠ cos(εijγ).   (B3)

In particular, this means that we can construct the optimal tour in this way.
Additionally, due to equivariance of our model under node permutations, it is sufficient to find just one of these settings for any tour T and use this to construct the optimal tour.
We can do this because once we have found a labeling of the edges for the classes f± that produces a given tour sequence, we can simply reshuffle our nodes s.t. this sequence of the “physical nodes” (indices corresponding to qubits) will produce the optimal tour for our “logical nodes” ((x, y) coordinates of graph nodes).