Fugu-MT: Translations of arXiv Papers

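This site provides Japanese translations of arXiv papers that are 30 pages or shorter and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not under a CC license, and for papers that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.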

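The operator of this site does not guarantee the quality of the site (including all information and translations) and accepts no responsibility for any consequences arising from the use of the site (including all information and translations).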

Papers with a publication date of 2021-01-12.

Title	Authors	Abstract	Publication date / translation date
# Predicting Attributes of Nodes Using Network Structure ( http://arxiv.org/abs/1912.12264v3 ) License: see the linked page	Sarwan Ali, Muhammad Haroon Shakeel, Imdadullah Khan, Safiullah Faizullah, Muhammad Asad Khan	In many graphs such as social networks, nodes have associated attributes representing their behavior. Predicting node attributes in such graphs is an important problem with applications in many domains like recommendation systems, privacy preservation, and targeted advertisement. Attribute values can be predicted by analyzing patterns and correlations among attributes and employing classification/regression algorithms. However, these approaches do not utilize readily available network topology information. In this regard, interconnections between different attributes of nodes can be exploited to improve the prediction accuracy. In this paper, we propose an approach to represent a node by a feature map with respect to an attribute $a_i$ (which is used as input for machine learning algorithms) using all attributes of neighbors to predict attribute values for $a_i$. We perform extensive experimentation on ten real-world datasets and show that the proposed feature map significantly improves the prediction accuracy as compared to baseline approaches on these datasets.	Translated: 2023-06-09 23:17:03 Published: 2021-01-12
# Sum rules in multiphoton coincidence rates ( http://arxiv.org/abs/2004.11504v2 ) License: see the linked page	David Amaro Alcalá and Dylan Spivak and Hubert de Guise	We show that sums of carefully chosen coincidence rates in a multiphoton interferometry experiment can be simplified by replacing the original unitary scattering matrix with a coset matrix containing $0$s. The number and placement of these $0$s reduce the complexity of each term in the sum without affecting the original sum of rates. In particular, the evaluation of sums of modulus squared of permanents is shown to turn in some cases into a sum of modulus squared of determinants. The sums of rates are shown to be equivalent to the removal of some optical elements in the interferometer.	Translated: 2023-05-22 06:25:48 Published: 2021-01-12
# Benchmarking Hamiltonian Noise in the D-Wave Quantum Annealer ( http://arxiv.org/abs/2006.16421v3 ) License: see the linked page	Tristan Zaborniak, Rogério de Sousa	Various sources of noise limit the performance of quantum computers by altering qubit states in an uncontrolled manner throughout computations and reducing their coherence time. In quantum annealers, this noise introduces additional fluctuations to the parameters defining the original problem Hamiltonian, such that they find the ground states of problems perturbed from those originally programmed. Here we describe a method to benchmark the amount of noise affecting the programmed Hamiltonian of a quantum annealer. We show that a sequence of degenerate runs with the coefficients of the programmed Hamiltonian set to zero leads to an estimate of the noise spectral density affecting Hamiltonian parameters "in situ" during the quantum annealing protocol. The method is demonstrated in D-Wave's lower-noise 2000-qubit device (DW_2000Q_6) and in its recently released 5000-qubit device (Advantage_system1.1). Our benchmarking of DW_2000Q_6 shows Hamiltonian noise dominated by the $1/f^{0.7}$ frequency dependence characteristic of flux noise intrinsic to the materials forming flux qubits. In contrast, Advantage_system1.1 is found to be affected by additional noise sources for low annealing times, with underlying intrinsic flux noise amplitudes $2-3$ times higher than in DW_2000Q_6 for all annealing times.	Translated: 2023-05-12 03:19:49 Published: 2021-01-12
# Constructing a ball of separable and absolutely separable states for $2\otimes d$ quantum system ( http://arxiv.org/abs/2007.00891v2 ) License: see the linked page	Satyabrata Adhikari	Absolutely separable states are a kind of separable state that remains separable under the action of any global unitary transformation. These states may or may not have quantum correlations, and these correlations can be measured by quantum discord. We find that absolutely separable states are useful in quantum computation even if they contain infinitesimal quantum correlation. Thus, to search for the class of two-qubit absolutely separable states with zero discord, we have derived an upper bound for $Tr(\varrho^{2})$, where $\varrho$ ranges over all zero discord states. In general, the upper bound depends on the state under consideration, but if the state belongs to some particular class of zero discord states then we find that the upper bound is state independent. Later, it is shown that among these particular classes of zero discord states, there exist sub-classes which are absolutely separable. Furthermore, we have derived necessary conditions for the separability of a given qubit-qudit state. Then we used the derived conditions to construct a ball for the $2\otimes d$ quantum system described by $Tr(\rho^{2})\leq Tr(X^{2})+2Tr(XZ)+Tr(Z^{2})$, where the $2\otimes d$ quantum system is described by the density operator $\rho$, which can be expressed in terms of block matrices $X,Y$ and $Z$ with $X,Z\geq 0$. In particular, for the qubit-qubit system, we show that the newly constructed ball contains a larger class of absolutely separable states compared to the ball described by $Tr(\rho^{2})\leq \frac{1}{3}$. Lastly, we have derived the necessary condition in terms of purity for the absolute separability of a qubit-qudit system under investigation.	Translated: 2023-05-11 20:56:30 Published: 2021-01-12
# Detecting Topological Quantum Phase Transitions via the c-Function ( http://arxiv.org/abs/2007.07273v2 ) License: see the linked page	Matteo Baggioli, Dimitrios Giataganas	We propose the c-function as a new and accurate probe to detect the location of topological quantum critical points. As a direct application, we consider a holographic model which exhibits a topological quantum phase transition between a topologically trivial insulating phase and a gapless Weyl semimetal. The quantum critical point displays a strong Lifshitz-like anisotropy in the spatial directions and the quantum phase transition does not follow the standard Landau paradigm. The c-function robustly shows a global feature at the quantum criticality and distinguishes with great accuracy the two separate zero temperature phases. Taking into account the relation of the c-function with the entanglement entropy, we conjecture that our proposal is a general feature of quantum phase transitions and that it is applicable beyond the holographic framework.	Translated: 2023-05-10 01:58:26 Published: 2021-01-12
# Ultra-Compact accurate wave functions for He-like and Li-like iso-electronic sequences and variational calculus. I. Ground state ( http://arxiv.org/abs/2007.11745v4 ) License: see the linked page	A.V. Turbiner, J.C. Lopez Vieyra, J.C. del Valle, D.J. Nader	Several ultra-compact accurate wave functions in the form of generalized Hylleraas-Kinoshita functions and Guevara-Harris-Turbiner functions, which describe the domain of applicability of the Quantum Mechanics of Coulomb Charges (QMCC), or, equivalently, the Non-Relativistic QED (NRQED), for the ground state energies (4-5 significant digits (s.d.)) of He-like and Li-like iso-electronic sequences in the static approximation with point-like, infinitely heavy nuclei are constructed. It is shown that for both sequences the obtained parameters can be fitted in $Z$ by simple smooth functions: in general, these parameters differ from the ones emerging in variational calculations. For the He-like two-electron sequence the approximate expression for the ground state function, which provides absolute accuracy for the energy $\sim 10^{-3}$\,a.u. and the same relative accuracies $\sim 10^{-2}-10^{-3}$ for both the cusp parameters and the six expectation values, is found. For the Li-like three-electron sequence the most accurate ultra-compact function taken as the variational trial function provides absolute accuracy for energy $\sim 10^{-3}$\,a.u., 2-3 s.d. for the electron-nuclear cusp parameter for $Z \leq 20$ and 3 s.d. for the two expectation values for $Z=3$.	Translated: 2023-05-08 11:07:54 Published: 2021-01-12
# Label-free phase change detection of lipid bilayers using nanoscale diamond magnetometry ( http://arxiv.org/abs/2007.13085v5 ) License: see the linked page	Hitoshi Ishiwata, Hiroshi C. Watanabe, Shinya Hanashima, Takayuki Iwasaki and Mutsuko Hatano	The NV center in diamond is a quantum sensor with exceptional quality for highly sensitive nanoscale analysis of NMR spectra and thermometry. In this study, we investigate nanoscale phase change detection of lipid bilayers utilizing ensemble-averaged nuclear spin detection from a small volume of ~(6 nm)$^{3}$, which was determined by the depth of the NV center. Analysis of the nanoscale NMR signal confirms the thickness of the lipid bilayer to be 6.2 nm $\pm$ 3.4 nm with a proton density of 65 protons/nm$^{3}$, verifying formation of a lipid bilayer on top of the diamond sample. Correlation spectroscopy from the nanoscale volume reveals quantum oscillation at 3.06 MHz corresponding to the Larmor frequency of the proton at an applied magnetic field of 71.8 mT. The result of the correlation spectroscopy was compared with the 2D molecular diffusion model constructed by Monte Carlo simulation combined with results from molecular dynamics simulation. There is a change in diffusion constant from 1.5 $\pm$ 0.25 nm$^{2}$/${\mu}$s to 3.0 $\pm$ 0.5 nm$^{2}$/${\mu}$s when the temperature changes from 26.5 $^\circ$C to 36.0 $^\circ$C. Our results demonstrate that simultaneous observation of changes in translational diffusion and temperature is possible in label-free measurements using nanoscale diamond magnetometry. Our method paves the way for label-free imaging of cell membranes for understanding their phase composition and dynamics.	Translated: 2023-05-08 04:47:15 Published: 2021-01-12
# Approaching the Tsirelson bound with a Sagnac source of polarization-entangled photons ( http://arxiv.org/abs/2008.01575v2 ) License: see the linked page	Sandra Meraner, Robert J. Chapman, Stefan Frick, Robert Keil, Maximilian Prilmüller and Gregor Weihs	High-fidelity polarization-entangled photons are a powerful resource for quantum communication, distributing entanglement and quantum teleportation. The Bell-CHSH inequality $S\leq2$ is violated by bipartite entanglement and only maximally entangled states can achieve $S=2\sqrt{2}$, the Tsirelson bound. Spontaneous parametric down-conversion sources can produce entangled photons with correlations close to the Tsirelson bound. Sagnac configurations offer intrinsic stability, compact footprint and high collection efficiency; however, there is often a trade-off between source brightness and entanglement visibility. Here, we present a Sagnac polarization-entangled source with $2\sqrt{2}-S=(5.65\pm0.57)\times10^{-3}$, on par with the highest values recorded, while generating and detecting $(4660\pm70)$ pairs/s/mW, which is a substantially higher brightness than previously reported for Sagnac sources and around two orders of magnitude brighter than for traditional cone sources with the highest $S$ parameter. Our source records $0.9953\pm0.0003$ concurrence and $0.99743\pm0.00014$ fidelity to an ideal Bell state. By studying systematic errors in Sagnac sources, we identify that the precision of the collection focal point inside the crystal plays the largest role in reducing the $S$ parameter in our experiment. We provide a pathway that could enable the highest $S$ parameter recorded with a Sagnac source to date while maintaining very high brightness.	Translated: 2023-05-07 04:33:44 Published: 2021-01-12
# Non-unitary dynamics of Sachdev-Ye-Kitaev chain ( http://arxiv.org/abs/2008.11955v2 ) License: see the linked page	Chunxiao Liu, Pengfei Zhang, Xiao Chen	We construct a series of one-dimensional non-unitary dynamics consisting of both unitary and imaginary evolutions based on the Sachdev-Ye-Kitaev model. Starting from a short-range entangled state, we analyze the entanglement dynamics using the path integral formalism in the large $N$ limit. Among all the results that we obtain, two of them are particularly interesting: (1) By varying the strength of the imaginary evolution, the interacting model exhibits a first order phase transition from the highly entangled volume law phase to an area law phase; (2) The one-dimensional free fermion model displays an extensive critical regime with emergent two-dimensional conformal symmetry.	Translated: 2023-05-04 19:46:15 Published: 2021-01-12
# Quantum dynamics under simultaneous and continuous measurement of noncommutative observables ( http://arxiv.org/abs/2008.12908v2 ) License: see the linked page	Chao Jiang, Gentaro Watanabe	We consider simultaneous and continuous measurement of two noncommutative observables of the system whose commutator is not necessarily a $c$-number. We revisit the Arthurs-Kelly model and generalize it to describe the simultaneous measurement of two observables of the system. Using this generalized model, we continuously measure the system by following the scheme proposed by Scott and Milburn [Scott and Milburn, Phys. Rev. A 63, 042101 (2001)]. We find that the unconditioned master equation reduces to the Lindblad form in the continuous limit. In addition, we find that the master equation does not contain a cross term of these two measurements. Finally, we propose a scheme to prepare the state of a two-level system in an external field by feedback control based on the simultaneous, continuous measurement of the two observables.	Translated: 2023-05-04 09:13:50 Published: 2021-01-12
# Quantum mechanics of round magnetic electron lenses with Glaser and power law models of $B(z)$ ( http://arxiv.org/abs/2009.13943v2 ) License: see the linked page	Sameen Ahmed Khan, Ramaswamy Jagannathan	Scalar theory of quantum electron beam optics, at the single-particle level, derived from the Dirac equation using a Foldy-Wouthuysen-like transformation technique is considered. Round magnetic electron lenses with Glaser and power law models for the axial magnetic field $B(z)$ are studied. Paraxial quantum propagator for the Glaser model lens is obtained in terms of the well known fundamental solutions of its paraxial equation of motion. In the case of lenses with the power law model for $B(z)$ the well known fundamental solutions of the paraxial equations, obtained by solving the differential equation, are constructed using the Peano-Baker series also. Quantum mechanics of aberrations is discussed briefly. Role of quantum uncertainties in aberrations, and in the nonlinear part of the equations of motion for a nonparaxial beam, is pointed out. The main purpose of this article is to understand the quantum mechanics of electron beam optics though the influence of quantum effects on the optics of present-day electron beam devices might be negligible.	Translated: 2023-04-30 16:31:55 Published: 2021-01-12
# Exploring helical phases of matter in bosonic ladders ( http://arxiv.org/abs/2010.02740v2 ) License: see the linked page	Andreas Haller, Apollonas S. Matsoukas-Roubeas, Yueting Pan, Matteo Rizzi and Michele Burrello	Ladder models of ultracold atoms offer a versatile platform for the experimental and theoretical study of different phenomena and phases of matter linked to the interplay between artificial gauge fields and interactions. Strongly correlated helical states are known to appear for specific ratios of the particle and magnetic flux densities and they can often be interpreted as a one-dimensional limit of fractional quantum Hall states, thus being called pretopological. Their signatures, however, are typically hard to observe due to the small gaps characterizing these states. Here we investigate bosonic ladder models at filling factor 1. Based on bosonization, renormalization group and matrix product state simulations we pinpoint two strongly correlated helical phases appearing at this resonance. We show that one of them can be accessed in systems with two-species hardcore bosons and on-site repulsions only, making it amenable to optical lattice experiments. Its signatures are sizable and stable over a broad range of parameters for realistic system sizes.	Translated: 2023-04-29 20:25:07 Published: 2021-01-12
# Nonadiabatic phenomena in molecular vibrational polaritons ( http://arxiv.org/abs/2011.07480v2 ) License: see the linked page	Tamás Szidarovszky, Péter Badankó, Gábor J. Halász, Ágnes Vibók	Nonadiabatic phenomena are investigated in the rovibrational motion of molecules confined in an infrared cavity. Conical intersections (CIs) between vibrational polaritons, similar to CIs between electronic polaritonic surfaces, are found. The spectral, topological, and dynamic properties of the vibrational polaritons show clear fingerprints of nonadiabatic couplings between molecular vibration, rotation and the cavity photonic mode. Furthermore, it is found that for the investigated system, composed of two rovibrating HCl molecules and the cavity mode, breaking the molecular permutational symmetry, by changing $^{35}$Cl to $^{37}$Cl in one of the HCl molecules, the polaritonic surfaces, nonadiabatic couplings, and related spectral, topological, and dynamic properties can deviate substantially. This implies that the natural occurrence of different molecular isotopologues needs to be considered when modeling realistic polaritonic systems.	Translated: 2023-04-24 01:46:56 Published: 2021-01-12
# Preparation and Verification of Two-Mode Mechanical Entanglement Through Pulsed Optomechanical Measurements ( http://arxiv.org/abs/2011.10289v2 ) License: see the linked page	Pascal Neveu, Jack Clarke, Michael R. Vanner, Ewold Verhagen	We propose a protocol to generate and verify bipartite Gaussian entanglement between two mechanical modes coupled to a single optical cavity, by means of short optical pulses and measurement. Our protocol requires neither the resolved sideband regime, nor low thermal phonon occupancy, and allows the generation and verification of quantum entanglement in less than a mechanical period of motion. Entanglement is generated via effective two-mode mechanical squeezing through conditioning position measurements. We study the robustness of entanglement to experimental deviations in mechanical frequencies and optomechanical coupling rates.	Translated: 2023-04-23 15:03:10 Published: 2021-01-12
# Single-particle steering and nonlocality: The consecutive Stern-Gerlach Experiments ( http://arxiv.org/abs/2011.11797v2 ) License: see the linked page	E Benitez Rodriguez and E Piceno Martinez, and L M Arevalo Aguilar	Quantum nonlocality and quantum steering are fundamental correlations of quantum systems which cannot be created using classical resources only. Nonlocality describes the ability to influence the possible results of measurements carried out in distant systems; in quantum steering, Alice remotely steers Bob's state. Research in nonlocality and steering is of fundamental interest for the development of quantum information and for many applications requiring nonlocal resources, like quantum key distribution. On the other hand, the Stern-Gerlach experiment holds an important place in the history, development and teaching of quantum mechanics and quantum information. In particular, the thought experiment of consecutive Stern-Gerlach experiments is commonly used to exemplify the concept of non-commutativity between quantum operators. However, to the best of our knowledge, the consecutive Stern-Gerlach experiments have not been treated in a fully quantum manner yet, and it is a widely accepted idea that atoms crossing consecutive Stern-Gerlach experiments follow classical paths. Here we demonstrate that two consecutive Stern-Gerlach experiments generate nonlocality and steering; these nonlocal effects strongly modify our usual understanding of this experiment. Also, we discuss the implications of this result and its relation with entanglement. This suggests the use of quantum correlations of particles possessing mass to generate nonlocal tasks using this venerable experiment.	Translated: 2023-04-23 08:41:55 Published: 2021-01-12
# Transport efficiency of continuous-time quantum walks on graphs ( http://arxiv.org/abs/2011.13794v2 ) License: see the linked page	Luca Razzoli, Matteo G. A. Paris, Paolo Bordone	Continuous-time quantum walk describes the propagation of a quantum particle (or an excitation) evolving continuously in time on a graph. As such, it provides a natural framework for modeling transport processes, e.g., in light-harvesting systems. In particular, the transport properties strongly depend on the initial state and on the specific features of the graph under investigation. In this paper, we address the role of graph topology, and investigate the transport properties of graphs with different regularity, symmetry, and connectivity. We neglect disorder and decoherence, and assume a single trap vertex accountable for the loss processes. In particular, for each graph, we analytically determine the subspace of states having maximum transport efficiency. Our results provide a set of benchmarks for environment-assisted quantum transport, and suggest that connectivity is a poor indicator for transport efficiency. Indeed, we observe some specific correlations between transport efficiency and connectivity for certain graphs, but in general they are uncorrelated.	Translated: 2023-04-22 20:28:19 Published: 2021-01-12
# A Critique of the Google Apple Exposure Notification (GAEN) Framework ( http://arxiv.org/abs/2012.05097v2 ) License: see the linked page	Jaap-Henk Hoepman	As a response to the COVID-19 pandemic, digital contact tracing has been proposed as a tool to support the health authorities in their quest to determine who has been in close and sustained contact with a person infected by the coronavirus. In April 2020 Google and Apple released the Google Apple Exposure Notification (GAEN) framework, as a decentralised and more privacy-friendly platform for contact tracing. The GAEN framework implements exposure notification mostly at the operating system layer, instead of fully at the app(lication) layer. In this paper we study the consequences of this approach. We argue that this creates a dormant functionality for mass surveillance at the operating system layer. We show how it does not technically prevent the health authorities from implementing a purely centralised form of contact tracing (even though that is the stated aim). We highlight that GAEN allows Google and Apple to dictate how contact tracing is (or rather isn't) implemented in practice by health authorities, and how it introduces the risk of function creep.	Translated: 2023-04-21 08:07:03 Published: 2021-01-12
# Noise-robust classification of single-shot electron spin readouts using a deep neural network ( http://arxiv.org/abs/2012.10841v2 ) License: see the linked page	Yuta Matsumoto, Takafumi Fujita, Arne Ludwig, Andreas D. Wieck, Kazunori Komatani, Akira Oiwa	Single-shot readout of charge and spin states by charge sensors such as quantum point contacts and quantum dots is an essential technology for the operation of semiconductor spin qubits. The fidelity of the single-shot readout depends both on experimental conditions such as signal-to-noise ratio and system temperature, and on numerical parameters such as threshold values. Accurate charge sensing schemes that are robust under noisy environments are indispensable for developing a scalable fault-tolerant quantum computation architecture. In this study, we present a novel single-shot readout classification method that is robust to noise using a deep neural network (DNN). Importantly, the DNN classifier is automatically configured for spin-up and spin-down signals in any noise environment by tuning the trainable parameters using the datasets of charge transition signals experimentally obtained at a charging line. Moreover, we verify that our DNN classification is robust under noisy environments in comparison to the two conventional classification methods used for charge and spin state measurements in various quantum dot experiments.	Translated: 2023-04-20 02:31:01 Published: 2021-01-12
# 標準平衡状態のワイルウィグナー表現 Weyl-Wigner Representation of Canonical Equilibrium States ( http://arxiv.org/abs/2012.11674v2 ) ライセンス: Link先を確認	F. Nicacio	(参考訳) 標準熱平衡量子状態のワイルヴィグナー表現は、ハイゼンベルク作用素とメタプレクティック作用素のワイルヴィグナー記号のウィック回転を通じて二次ハミルトニアンのクラス全体に対して得られる。これらのユニタリと本質的に関連づけられた古典的構造の挙動はウィック写像の下で記述され、熱平衡状態は複素シンプレクティック行列によって完全に決定され、熱力学的性質をすべて設定する。ハミルトン力学の4つのカテゴリ(放物型、楕円型、双曲型、ロキソドロミック)を分析した。半古典的および高温の近似は、古典的および/または二次的挙動と比較される。 The Weyl-Wigner representations for canonical thermal equilibrium quantum states are obtained for the whole class of quadratic Hamiltonians through a Wick rotation of the Weyl-Wigner symbols of Heisenberg and metaplectic operators. The behavior of classical structures inherently associated to these unitaries is described under the Wick mapping, unveiling that a thermal equilibrium state is fully determined by a complex symplectic matrix, which sets all of its thermodynamical properties. The four categories of Hamiltonian dynamics (Parabolic, Elliptic, Hyperbolic, and Loxodromic) are analyzed. Semiclassical and high temperature approximations are derived and compared to the classical and/or quadratic behavior.	翻訳日:2023-04-20 00:08:53 公開日:2021-01-12
# 量子ダルブーの定理 The Quantum Darboux Theorem ( http://arxiv.org/abs/2012.15260v2 ) ライセンス: Link先を確認	Olindo Corradini, Emanuele Latini and Andrew Waldron	(参考訳) 量子力学的プロパゲータの計算問題は、波動関数のベクトル束に作用する平坦な接続による並列輸送のためのウィルソン線演算子の計算として再キャストすることができる。この図では、基底多様体は奇数次元シンプレクティック幾何学、あるいは非常に総称的に「位相時空」と見なすことのできる接触多様体であり、ファイバーはヒルベルト空間である。このアプローチは、局所古典力学を直線に変換する接触多様体上のダルブーの定理と平行な「量子ダルブーの定理」を享受する。量子ダルブーの定理が非調和量子ポテンシャルに対してどのように機能するかを詳述する。特に,局所的に複雑な量子力学を自明にするゲージ変換の漸近性を計算するための新しい図式論的手法を開発した。 The problem of computing quantum mechanical propagators can be recast as a computation of a Wilson line operator for parallel transport by a flat connection acting on a vector bundle of wavefunctions. In this picture the base manifold is an odd dimensional symplectic geometry, or quite generically a contact manifold that can be viewed as a "phase-spacetime", while the fibers are Hilbert spaces. This approach enjoys a "quantum Darboux theorem" that parallels the Darboux theorem on contact manifolds which turns local classical dynamics into straight lines. We detail how the quantum Darboux theorem works for anharmonic quantum potentials. In particular, we develop a novel diagrammatic approach for computing the asymptotics of a gauge transformation that locally makes complicated quantum dynamics trivial.	翻訳日:2023-04-18 07:49:36 公開日:2021-01-12
# アイゼンハートリフトの定量化 Quantizing the Eisenhart Lift ( http://arxiv.org/abs/2012.15288v2 ) ライセンス: Link先を確認	Kieran Finn, Sotirios Karamitsos and Apostolos Pilaftsis	(参考訳) 古典的なアイゼンハートリフト(英: classical eisenhart lift)とは、高次元の曲面多様体(リフト多様体)で進化する自由系を用いて、ポテンシャルに従属する古典系の力学を再現する手法である。我々は、アイゼンハートリフトの定式化を量子系に拡張し、リフト多様体はポテンシャルの古典的効果だけでなく量子力学的効果も再現することを示した。特に、昇降系のシュロディンガー方程式の解は、新しい自由度を射出した後に元の系の解に還元されることが分かる。この文脈では、古典系の持ち上げ運動量に対応する保存量子数を特定する。さらに、アイゼンハートリフトを量子場理論(QFT)に適用する。昇降場空間多様体はスカラー場ポテンシャルの古典的効果と量子的効果の両方を再現できることを示す。 QFTの場合、持ち上げられた運動量の類似は、時間だけでなく空間でも保存される量子電荷である。この電荷の異なる可能な値は、すべて互いに交わらないフォック空間のアンサンブルをラベル付けする。これらの拡張フォック空間の宇宙定数とゲージ階層問題との関連性を考慮する。 The classical Eisenhart lift is a method by which the dynamics of a classical system subject to a potential can be recreated by means of a free system evolving in a higher-dimensional curved manifold, known as the lifted manifold. We extend the formulation of the Eisenhart lift to quantum systems, and show that the lifted manifold recreates not only the classical effects of the potential, but also its quantum mechanical effects. In particular, we find that the solutions of the Schrodinger equations of the lifted system reduce to those of the original system after projecting out the new degrees of freedom. In this context, we identify a conserved quantum number, which corresponds to the lifted momentum of the classical system. We further apply the Eisenhart lift to Quantum Field Theory (QFT). We show that a lifted field space manifold is able to recreate both the classical and quantum effects of a scalar field potential. We find that, in the case of QFT, the analogue of the lifted momentum is a quantum charge that is conserved not only in time, but also in space. The different possible values for this charge label an ensemble of Fock spaces that are all disjoint from one another. The relevance of these extended Fock spaces to the cosmological constant and gauge hierarchy problems is considered.	翻訳日:2023-04-18 07:38:16 公開日:2021-01-12
# 光子損失と量子会議のキーアグリーメント Quantum Conference Key Agreement with Photon Loss ( http://arxiv.org/abs/2101.01483v2 ) ライセンス: Link先を確認	Phattharaporn Singkanipa and Pieter Kok	(参考訳) 会議鍵合意 (CKA) は、2つ以上の関係者が共通の秘密鍵を共有したいという情報処理課題である。本稿では、冗長符号化と誤り訂正に基づくCKAの損失耐性プロトコルを提案する。我々のプロトコルは、既存の損失の少ないCKAプロトコルよりも転送速度が向上する。しかし、符号化と誤り訂正には追加費用がかかる。我々は、生成確率p > 0.3の光子源を用いて、プロトコルの秘密鍵レートが既存のプロトコルを克服できることを示す。したがって、損失耐性プロトコルの現実的な実装には高い確率で絡み合う光子源が必要である。 Conference key agreement (CKA) is an information processing task where more than two parties want to share a common secret key. Here, we present a loss-resilient protocol for CKA, based on redundant encoding and error correction. Our protocol provides a speed-up in transmission rate over the existing lossy CKA protocol. However, encoding and error correction come with extra cost. We show that, using photon sources with creation probability p > 0.3, our protocol's secret key rate can overcome the existing protocol's. Hence, high probability entangled photon sources are required for realistic implementation of our loss-resilient protocol.	翻訳日:2023-04-17 19:58:08 公開日:2021-01-12
# Mobius構造における単一光子の制御可能な非相互伝達 Controllable non-reciprocal transmission of single photon in Mobius structure ( http://arxiv.org/abs/2101.04282v1 ) ライセンス: Link先を確認	Hai-Yuan Zhu, Xin-Yuan Hu, Jun-Jie Lin, Jia-Yi Wu, Shuo Li, Yan-Xiang Wang, Fu-Guo Deng, Na-Na Zhang	(参考訳) 制御可能な非相互伝送モデルを提案する。このモデルはモビウス環(mobius ring)から成り、この環は2つの1次元半無限鎖に連結され、モビウス環の空洞の内側に2段階の原子が配置されている。グリーン関数の手法を用いて、モデルによる単一光子の透過率の研究を行う。その結果、このモデルでは非相互伝達が達成でき、2レベル原子は単一光子の非相互輸送の量子スイッチとして振る舞うことができた。この制御可能な非相互伝送モデルは、新しい量子非相互デバイスを刺激することができる。 We propose a controllable non-reciprocal transmission model. The model consists of a Mobius ring, which is connected with two one-dimensional semi-infinite chains, and with a two-level atom located inside one of the cavities of the Mobius ring. We use the method of Green function to study the transmittance of a single photon through the model. The results show that the non-reciprocal transmission can be achieved in this model and the two-level atom can behave as a quantum switch for the non-reciprocal transport of the single photon. This controllable non-reciprocal transmission model may inspire new quantum non-reciprocal devices.	翻訳日:2023-04-17 00:51:03 公開日:2021-01-12
# Wick回転想像時間進化を用いた量子オプション価格設定 Quantum option pricing using Wick rotated imaginary time evolution ( http://arxiv.org/abs/2101.04280v1 ) ライセンス: Link先を確認	Santosh Kumar Radha	(参考訳) 本稿では,量子環境における価格設定の問題を再検討する。提案するアルゴリズムは,初期状態を作成し,オプション価格を表現し,既存の仮想時間シミュレーションアルゴリズムを用いて進化させる。この価格設定の方法は、初期オプション価格を量子状態にマッピングし、ウィックの想像上の時間空間における時間依存をシミュレートするものである。我々は、ヨーロッパオプションのアルゴリズムを概念実証として、特定の想像上の時間発展アルゴリズムを用いて数値的に検証し、アジアオプションのような経路依存オプションにどのように拡張できるかを示す。提案手法はハイブリッド変分アルゴリズムを用いており, 近距離量子コンピュータに関係していると考えられる。 In this paper we reformulate the problem of pricing options in a quantum setting. Our proposed algorithm involves preparing an initial state, representing the option price, and then evolving it using existing imaginary time simulation algorithms. This way of pricing options boils down to mapping an initial option price to a quantum state and then simulating the time dependence in Wick's imaginary time space. We numerically verify our algorithm for European options using a particular imaginary time evolution algorithm as proof of concept and show how it can be extended to path dependent options like Asian options. As the proposed method uses a hybrid variational algorithm, it is bound to be relevant for near-term quantum computers.	翻訳日:2023-04-17 00:50:51 公開日:2021-01-12
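The abstract above rests on the fact that a Wick rotation t → -iτ turns a Schrödinger-type equation into a diffusion-type equation ∂V/∂τ = -HV, which is the form the Black-Scholes equation takes in log-price coordinates. The following is a minimal classical sketch of that idea only: it replaces the paper's hybrid variational quantum algorithm with an explicit finite-difference imaginary-time evolution on a grid, and checks it against the closed-form Black-Scholes price of a European call. The function names, grid sizes, and market parameters are illustrative assumptions and do not come from the paper.

```python
import math
import numpy as np


def bs_call(S, K, r, sigma, T):
    """Closed-form Black-Scholes price of a European call (reference value)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return S * cdf(d1) - K * math.exp(-r * T) * cdf(d2)


def imaginary_time_call(S0, K, r, sigma, T, n_x=601, half_width=3.0, n_t=4000):
    """Evolve the payoff in imaginary time tau = T - t on a log-price grid.

    Solves dV/dtau = 0.5*sigma^2 V_xx + (r - 0.5*sigma^2) V_x - r V
    with an explicit Euler step (hypothetical classical stand-in for the
    quantum imaginary-time evolution described in the abstract).
    """
    x = np.linspace(math.log(K) - half_width, math.log(K) + half_width, n_x)
    dx = x[1] - x[0]
    dt = T / n_t  # must satisfy dt <= dx^2 / sigma^2 for stability
    V = np.maximum(np.exp(x) - K, 0.0)  # initial state: payoff at expiry
    for step in range(1, n_t + 1):
        V_xx = (V[2:] - 2.0 * V[1:-1] + V[:-2]) / dx**2
        V_x = (V[2:] - V[:-2]) / (2.0 * dx)
        V_new = V.copy()
        V_new[1:-1] = V[1:-1] + dt * (
            0.5 * sigma**2 * V_xx + (r - 0.5 * sigma**2) * V_x - r * V[1:-1]
        )
        tau = step * dt
        V_new[0] = 0.0  # deep out of the money
        V_new[-1] = math.exp(x[-1]) - K * math.exp(-r * tau)  # deep in the money
        V = V_new
    return float(np.interp(math.log(S0), x, V))


reference = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
evolved = imaginary_time_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

The two numbers should agree closely for these parameters; a quantum implementation would encode `V` in a state vector and replace the explicit Euler loop with an imaginary-time evolution routine.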
# 量子ニューラルネットワークの表現性 Expressivity of Quantum Neural Networks ( http://arxiv.org/abs/2101.04273v1 ) ライセンス: Link先を確認	Yadong Wu, Juan Yao, Pengfei Zhang, and Hui Zhai	(参考訳) 本研究では、十分に深い量子ニューラルネットワークが目標関数をできるだけ正確に近似できるかどうかという問題に対処する。対象関数が物理的観測可能である単純だが典型的な物理的状況から始めて、学習対象が直接物理的観測可能ではなく、LoschmidtエコーやRenyiエントロピーのような複数のレプリカを持つ拡大ヒルベルト空間における物理的観測可能として表現できる状況まで議論を拡大する。主な発見は、データセット内の入力波関数が量子回路が作用するヒルベルト空間全体を満たさない場合にのみ正確な近似が可能であり、より正確には、前者のヒルベルト空間次元は後者のヒルベルト空間次元の半分以下でなければならないということである。場合によっては、例えば、入力波関数が異なるレプリカ間で対称でなければならない場合、データセットの固有の特性のために、この要件を自動で満たすことができる。そして、この要求がデータセットで満たされない場合、入力時に常にウェーブ関数が固定された1つの補助キュービットを追加することで、表現能力が回復できることを示す。本研究は,古典的ニューラルネットワークの表現性の基礎となる普遍近似定理の量子ニューラルネットワークアナロジーの確立に向けたものである。 In this work, we address the question whether a sufficiently deep quantum neural network can approximate a target function as accurately as possible. We start with simple but typical physical situations that the target functions are physical observables, and then we extend our discussion to situations that the learning targets are not directly physical observables, but can be expressed as physical observables in an enlarged Hilbert space with multiple replicas, such as the Loschmidt echo and the Renyi entropy. The main finding is that an accurate approximation is possible only when the input wave functions in the dataset do not exhaust the entire Hilbert space that the quantum circuit acts on, and more precisely, the Hilbert space dimension of the former has to be less than half of the Hilbert space dimension of the latter. In some cases, this requirement can be satisfied automatically because of the intrinsic properties of the dataset, for instance, when the input wave function has to be symmetric between different replicas. And if this requirement cannot be satisfied by the dataset, we show that the expressivity capabilities can be restored by adding one ancillary qubit where the wave function is always fixed at input. Our studies point toward establishing a quantum neural network analogy of the universal approximation theorem that lays the foundation for expressivity of classical neural networks.	翻訳日:2023-04-17 00:50:39 公開日:2021-01-12
# オープンおよび周期駆動システムにおける量子制御 Quantum Control in Open and Periodically Driven Systems ( http://arxiv.org/abs/2101.04267v1 ) ライセンス: Link先を確認	Si-Yuan Bai, Chong Chen, Hong Wu, Jun-Hong An	(参考訳) 量子技術は、技術革新を実現するために量子資源の効率的な利用を頼りにしている。これらのシステムは、それぞれの状態が異なる量子プロトコルを実現するために望ましい方法に従うように制御される。しかし、システムと環境の相互作用によって引き起こされるデコヒーレンスは、望ましい方法から逸脱する状態を引き起こす。アクティブコントロールとパッシブデコヒーレンスの共存の下で量子資源を保護する方法は重要である。近年の研究では、デコヒーレンスが系環境エネルギースペクトルの特徴によって決定されることが明らかになっている:エネルギースペクトルにおける境界状態の形成に伴うデコヒーレンスを抑制することができる。デコヒーレンスを制御するためのガイドラインを提供する。このようなアイデアは周期駆動系のシステムに一般化することができる。準エネルギースペクトルにおけるフロッケ境界状態の操作により、フロッケ工学と呼ばれる周期運転によるコヒーレント制御は、デコヒーレンスを制御するだけでなく、人工的にエキゾチックな位相を合成する際にも多用途ツールとなっている。オープンおよび周期駆動システムにおける量子制御の進展について概観する。境界国家が果たす卓越した役割と、デコヒーレンスを抑え、新しいトポロジカルフェーズを創出する周期運転による制御性に特に注目される。 Quantum technology resorts to efficient utilization of quantum resources to realize technique innovation. The systems are controlled such that their states follow the desired manners to realize different quantum protocols. However, the decoherence caused by the system-environment interactions causes the states deviating from the desired manners. How to protect quantum resources under the coexistence of active control and passive decoherence is of significance. Recent studies have revealed that the decoherence is determined by the feature of the system-environment energy spectrum: Accompanying the formation of bound states in the energy spectrum, the decoherence can be suppressed. It supplies a guideline to control decoherence. Such idea can be generalized to systems under periodic driving. By virtue of manipulating Floquet bound states in the quasienergy spectrum, coherent control via periodic driving dubbed as Floquet engineering has become a versatile tool not only in controlling decoherence, but also in artificially synthesizing exotic topological phases. We will review the progress on quantum control in open and periodically driven systems. 
Special attention will be paid to the distinguished role played by the bound states and their controllability via periodic driving in suppressing decoherence and generating novel topological phases.	翻訳日:2023-04-17 00:50:17 公開日:2021-01-12
# 新しいパラメータ化された絡み合いモノトーン A new parameterized entanglement monotone ( http://arxiv.org/abs/2101.04256v1 ) ライセンス: Link先を確認	Xue Yang, Ming-Xing Luo, Yan-Han Yang, Shao-Ming Fei	(参考訳) 絡み合い共起は量子実験における絡み合いを特徴付けるために広く使われている。絡み合いモノトンとして、特定の量子Tsallisエントロピーに関係している。本論文の目的は,Tsallisエントロピーにインスパイアされた$q$-concurrenceと命名された,新しいパラメータ化二部絡みモノトンを提案することである。我々は、正の部分転置(PPT)基準と再配列(realignment)基準を用いて、任意の二成分量子絡み合い状態の$q$-concurrenceに対する解析的下限を導出し、強い分離可能性基準と興味深い関係を示す。新しい絡み合いモノトンは二部体等方性状態の特徴付けに用いられる。最後に、2つの二成分純粋状態を重ね合わせることにより、任意の絡み合いに対する$q$-concurrenceを推定する計算方法を提案する。重ね合わせ演算は、重ね合わせられている2つの状態が両側直交または片側直交である場合において、最大で1つのebitを$q$-concurrenceで増加させることができる。これらの結果は、量子通信や量子情報処理で興味深い、絡み合いに関する一連の新しい現象を明らかにする。 Entanglement concurrence has been widely used for featuring entanglement in quantum experiments. As an entanglement monotone it is related to specific quantum Tsallis entropy. Our goal in this paper is to propose a new parameterized bipartite entanglement monotone which is named as $q$-concurrence inspired by general Tsallis entropy. We derive an analytical lower bound for the $q$-concurrence of any bipartite quantum entanglement state by employing positive partial transposition criterion and realignment criterion, which shows an interesting relationship to the strong separability criteria. The new entanglement monotone is used to characterize bipartite isotropic states. Finally, we provide a computational method to estimate the $q$-concurrence for any entanglement by superposing two bipartite pure states. It shows that the superposition operations can at most increase one ebit for the $q$-concurrence in the case that the two states being superposed are bi-orthogonal or one-sided orthogonal. These results reveal a series of new phenomena about the entanglement, which may be interesting in quantum communication and quantum information processing.	翻訳日:2023-04-17 00:49:57 公開日:2021-01-12
# 大規模量子コンピュータとシミュレータのプラットフォームとしての記憶リングにおける冷イオンビーム--研究・開発への挑戦と方向性 Cold ion beam in a storage ring as a platform for large-scale quantum computers and simulators: challenges and directions for research and development ( http://arxiv.org/abs/2101.04247v1 ) ライセンス: Link先を確認	Timur Shaftan, Boris B. Blinov	(参考訳) 本研究の目的は,スケーラブル量子コンピューティング(qc)と量子シミュレーション(qs)のためのプラットフォームとして,多数のイオンを蓄積・冷却・制御可能な大規模記憶リング型イオントラップシステムを構築する可能性を評価することである。このようなトラップでは、イオンは冷却レーザの周波数と強度によって一定速度で円路に沿って移動する結晶ビームを形成する。本稿では,現在最先端の線形イオントラップ装置で利用可能な100個未満から,ストレージリング装置における$10^5$個程度の結晶化イオンまで,量子ビット数の面で大きな飛躍的な進歩を考察する。この新しいトラップ設計は、荷電粒子の貯蔵リングとQCと質量分析に使用される線形イオントラップの2つの異なる概念を統一する。本稿では粒子加速器の言語を用いてイオン状態と動力学について論じる。上記の概念の違いを概説し、回転するイオンビームで大きな環の課題を分析し、現在の1000倍の量子ビットを持つ将来の量子コンピュータを実現するために必要な研究と開発のための目標を提案する。このような大規模量子システムを作成する上で、量子ビットのコヒーレンスと量子論理演算の高忠実さを維持しながら課題となる。アナログ量子シミュレーションを実行することは、そのようなデバイスに対する達成可能な初期目標である。複雑な量子システムの量子シミュレーションは、基礎科学と応用研究の両方を前進させる。原子核と素粒子物理学、多体量子システム、格子ゲージ理論、原子核構造計算は、大規模な量子シミュレーションシステムが自然の理解を進める上で非常に強力なツールになる数少ない例である。 The purpose of this paper is to evaluate the possibility of constructing a large-scale storage-ring-type ion-trap system capable of storing, cooling, and controlling a large number of ions as a platform for scalable quantum computing (QC) and quantum simulations (QS). In such a trap, the ions form a crystalline beam moving along a circular path with a constant velocity determined by the frequency and intensity of the cooling lasers. In this paper we consider a large leap forward in terms of the number of qubits, from fewer than 100 available in state-of-the-art linear ion-trap devices today to an order of $10^5$ crystallized ions in the storage-ring setup. This new trap design unifies two different concepts: the storage rings of charged particles and the linear ion traps used for QC and mass spectrometry. In this paper we use the language of particle accelerators to discuss the ion state and dynamics. We outline the differences between the above concepts, analyze challenges of the large ring with a revolving beam of ions, and propose goals for the research and development required to enable future quantum computers with 1000 times more qubits than available today. The challenge of creating such a large-scale quantum system while maintaining the necessary coherence of the qubits and the high fidelity of quantum logic operations is significant. Performing analog quantum simulations may be an achievable initial goal for such a device. Quantum simulations of complex quantum systems will move forward both the fundamental science and the applied research. Nuclear and particle physics, many-body quantum systems, lattice gauge theories, and nuclear structure calculations are just a few examples in which a large-scale quantum simulation system would be a very powerful tool to move forward our understanding of nature.	翻訳日:2023-04-17 00:49:35 公開日:2021-01-12
# 大規模非制約最適化のための正規化限定メモリBFGS法とその効率的な実装 A Regularized Limited Memory BFGS method for Large-Scale Unconstrained Optimization and its Efficient Implementations ( http://arxiv.org/abs/2101.04413v1 ) ライセンス: Link先を確認	Hardik Tankaria, Shinji Sugimoto and Nobuo Yamashita	(参考訳) リミテッドメモリBFGS(L-BFGS)法は、大規模非制約最適化の一般的な方法の1つである。標準L-BFGS法はラインサーチを用いてグローバル収束を保証するため、時には多数の関数評価を必要とする。難易度を克服するために,一定の正規化技術を備えた新しいl-bfgsを提案する。通常の仮定の下で、そのグローバル収束を示す。本手法をより堅牢かつ効率的にするために,ノンモノトーン法やWolfe線探索の同時利用など,いくつかの手法で拡張する。最後に, CUTEstにおけるテスト問題に対する数値計算結果について述べる。 The limited memory BFGS (L-BFGS) method is one of the popular methods for solving large-scale unconstrained optimization. Since the standard L-BFGS method uses a line search to guarantee its global convergence, it sometimes requires a large number of function evaluations. To overcome the difficulty, we propose a new L-BFGS with a certain regularization technique. We show its global convergence under the usual assumptions. In order to make the method more robust and efficient, we also extend it with several techniques such as nonmonotone technique and simultaneous use of the Wolfe line search. Finally, we present some numerical results for test problems in CUTEst, which show that the proposed method is robust in terms of solving number of problems.	翻訳日:2023-04-17 00:42:57 公開日:2021-01-12
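The abstract above replaces the line search of standard L-BFGS with a regularization technique. One plausible way to sketch the idea, offered here as an assumption and not as the authors' algorithm, is to approximate the regularized step d = -(B + μI)⁻¹g by running the usual two-loop recursion on shifted pairs (s, y + μs), and to adapt μ with a trust-region-style ratio of actual to predicted decrease. All names, constants, and the update rule for `mu` below are hypothetical.

```python
import numpy as np


def two_loop(grad, pairs, mu):
    """Approximate (B + mu*I)^{-1} grad with the L-BFGS two-loop recursion.

    Each stored pair (s, y) is shifted to (s, y + mu*s), which corresponds to
    adding mu*I to the Hessian approximation (an assumption of this sketch).
    """
    q = np.array(grad, dtype=float)
    shifted = [(s, y + mu * s) for s, y in pairs]
    if not shifted:
        return q / (1.0 + mu)  # B_0 = I, so (I + mu*I)^{-1} grad
    alphas = []
    for s, yh in reversed(shifted):  # newest pair first
        rho = 1.0 / (yh @ s)
        a = rho * (s @ q)
        alphas.append((a, rho))
        q = q - a * yh
    s, yh = shifted[-1]
    q = q * ((s @ yh) / (yh @ yh))  # standard initial scaling
    for (s, yh), (a, rho) in zip(shifted, reversed(alphas)):  # oldest first
        b = rho * (yh @ q)
        q = q + (a - b) * s
    return q


def regularized_lbfgs(f, grad_f, x0, memory=5, mu=1.0, eta=1e-4,
                      tol=1e-6, max_iter=2000):
    """Line-search-free L-BFGS sketch: reject a step and raise mu if it fails."""
    x = np.array(x0, dtype=float)
    g = grad_f(x)
    pairs = []
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = -two_loop(g, pairs, mu)
        predicted = -0.5 * (g @ d)  # decrease of the regularized quadratic model
        x_trial = x + d
        ratio = (f(x) - f(x_trial)) / predicted
        if ratio >= eta:  # accept the step
            g_trial = grad_f(x_trial)
            s, y = x_trial - x, g_trial - g
            if s @ y > 1e-12 * (s @ s):  # keep only pairs with positive curvature
                pairs = (pairs + [(s, y)])[-memory:]
            x, g = x_trial, g_trial
            if ratio > 0.75:
                mu = max(0.5 * mu, 1e-8)
        else:  # reject the step and regularize more strongly
            mu = min(4.0 * mu, 1e12)
    return x


# Usage on a small ill-conditioned convex quadratic (illustrative only).
A = np.array([1.0, 10.0, 100.0])
x_min = regularized_lbfgs(lambda x: 0.5 * x @ (A * x) - x.sum(),
                          lambda x: A * x - 1.0,
                          np.zeros(3))
```

Each rejected step costs one function evaluation and no gradient evaluation, which is the kind of saving over a line search that the abstract motivates; the nonmonotone and Wolfe-line-search extensions it mentions are not sketched here.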
# 金属電極によるシリコン中のドナースピンの空間分解脱コヒーレンス Spatially-resolved decoherence of donor spins in silicon strained by a metallic electrode ( http://arxiv.org/abs/2101.04391v1 ) ライセンス: Link先を確認	V. Ranjan, B. Albanese, E. Albertinale, E. Billaud, D. Flanigan, J. J. Pla, T. Schenkel, D. Vion, D. Esteve, E. Flurin, J. J. L. Morton, Y. M. Niquet, P. Bertet	(参考訳) 電子スピンは既知の最もコヒーレントな固体系の一つであるが、量子センシングや情報処理のデバイスで使用されるためには、通常は界面の近くに置かれなければならない。電子スピンのコヒーレンスとスペクトル特性に対するそのような界面の影響を理解し緩和することは、そのような応用を実現する上で非常に重要であるが、単一スピンの研究からそのようなデータを推測するには、意味のある結果を得るために多くの測定が必要である。ここでは, ミリケルビン温度における28シリコン中の表面近傍ビスマスドナースピンのコヒーレンスを包括的に研究する。特に、金属電極によるひずみ誘起周波数シフトを用いて、スピンコヒーレンスの空間地図を電極に対する深さと位置の関数として用いる。磁場非感受性クロック遷移の測定により、表面スピンによる磁気ノイズと電荷ノイズを分離する。本研究は, ひずみ分離スピン共鳴スペクトルの定量的モデルとシリコン表面における常磁性不純物濃度の抽出を含む。このような表面近傍の電子スピンに対するこれらのデコヒーレンス機構の相互作用は、量子技術におけるそれらの応用において重要であるが、歪分割とクロック遷移の組み合わせはコヒーレンス寿命を最大2桁まで延長し、平均深さ100nmで最大300msに達する。ここで紹介する近接面アンサンブルにおけるコヒーレンスを空間的にマッピングする手法は、ダイヤモンド、炭化ケイ素、および光学結晶中の希土類イオンの欠陥など、他の活発な興味を持つスピン系に直接適用できる。 Electron spins are amongst the most coherent solid-state systems known, however, to be used in devices for quantum sensing and information processing applications, they must be typically placed near interfaces. Understanding and mitigating the impacts of such interfaces on the coherence and spectral properties of electron spins is critical to realize such applications, but is also challenging: inferring such data from single-spin studies requires many measurements to obtain meaningful results, while ensemble measurements typically give averaged results that hide critical information. Here, we report a comprehensive study of the coherence of near-surface bismuth donor spins in 28-silicon at millikelvin temperatures. In particular, we use strain-induced frequency shifts caused by a metallic electrode to make spatial maps of spin coherence as a function of depth and position relative to the electrode. 
By measuring magnetic-field-insensitive clock transitions we separate magnetic noise caused by surface spins from charge noise. Our results include quantitative models of the strain-split spin resonance spectra and extraction of paramagnetic impurity concentrations at the silicon surface. The interplay of these decoherence mechanisms for such near-surface electron spins is critical for their application in quantum technologies, while the combination of the strain splitting and clock transition extends the coherence lifetimes by up to two orders of magnitude, reaching up to 300 ms at a mean depth of only 100nm. The technique we introduce here to spatially map coherence in near-surface ensembles is directly applicable to other spin systems of active interest, such as defects in diamond, silicon carbide, and rare earth ions in optical crystals.	翻訳日:2023-04-17 00:42:33 公開日:2021-01-12
# 不完全なデバイスを用いたソース独立量子乱数発生器のセキュリティ解析と改善 Security Analysis and Improvement of Source Independent Quantum Random Number Generators with Imperfect Devices ( http://arxiv.org/abs/2101.04327v1 ) ライセンス: Link先を確認	Xing Lin, Shuang Wang, Zhen-Qiang Yin, Guan-Jie Fan-Yuan, Rong Wang, Wei Chen, De-Yong He, Zheng Zhou, Guang-Can Guo, and Zheng-Fu Han	(参考訳) 量子乱数生成器(QRNG)は、数値シミュレーションや暗号など多くのアプリケーションにおいて、真の乱数発生源として不可欠である。近年,信頼できない情報源でセキュアな乱数を生成できるSI-QRNG(source-independent quantum random number generator)が実現されている。しかし、SI-QRNGで使用される信頼されているが不完全なデバイスの測定の抜け穴はまだ十分に調べられていないため、特に高速システムではセキュリティ上の問題が発生する。本稿では,SI-QRNGにおける実用的不完全な測定装置のセキュリティ欠陥を指摘する。また,条件付き最小エントロピーの再計算とモニタの追加により,これらの情報漏洩を防止するための対応策を提案する。さらに, 有限サイズ効果を考慮に入れることで, アフターパルスの影響が, 多数のサンプルラウンドを持つ有限サイズ効果よりも大きいことを示す。プロトコルは単純かつ効果的であり,SI-QRNGのセキュリティや高速測定装置との互換性が向上し,超高速でセキュリティ認定された商用SI-QRNGシステムの構築方法が確立されている。 A quantum random number generator (QRNG) as a genuine source of randomness is essential in many applications, such as number simulation and cryptography. Recently, a source-independent quantum random number generator (SI-QRNG), which can generate secure random numbers with untrusted sources, has been realized. However, the measurement loopholes of the trusted but imperfect devices used in SI-QRNGs have not yet been fully explored, which will cause security problems, especially in high-speed systems. Here, we point out and evaluate the security loopholes of practical imperfect measurement devices in SI-QRNGs. We also provide corresponding countermeasures to prevent these information leakages by recalculating the conditional minimum entropy and adding a monitor. Furthermore, by taking into account the finite-size effect, we show that the influence of the afterpulse can exceed that of the finite-size effect with the large number of sampled rounds. Our protocol is simple and effective, and it promotes the security of SI-QRNG in practice as well as the compatibility with high-speed measurement devices, thus paving the way for constructing ultrafast and security-certified commercial SI-QRNG systems.	翻訳日:2023-04-17 00:42:06 公開日:2021-01-12
# ブロックチェーンによるパンデミック予防接種証明書の共有:ケーススタディとパフォーマンス評価 Sharing pandemic vaccination certificates through blockchain: Case study and performance evaluation ( http://arxiv.org/abs/2101.04575v1 ) ライセンス: Link先を確認	José Luis Hernández-Ramos, Georgios Karopoulos, Dimitris Geneiatakis, Tania Martin, Georgios Kambourakis, and Igor Nai Fovino	(参考訳) この研究は、COVID-19やその他の疾病の予防接種証明書のセキュアな共有のための、スケーラブルなブロックチェーンベースのプラットフォームを提案している。例示的なユースケースとして,欧州連合の国を考慮し,大規模展開をシミュレートする。提案するプラットフォームは,計算資源使用量,ネットワーク応答時間,帯域幅といった幅広いシミュレーションによって評価される。その結果,提案手法はすべての評価基準において満足できる性能を示し,実実装のペースを設定できることが示唆された。関連研究と比較して、提案されたプラットフォームは、特に大規模で本格的な実装とその評価のプリズムによって、斬新である。 This work proposes a scalable, blockchain-based platform for the secure sharing of COVID-19 or other disease vaccination certificates. As an indicative use case, we simulate a large-scale deployment by considering the countries of the European Union. The proposed platform is evaluated through extensive simulations in terms of computing resource usage, network response time and bandwidth. Based on the results, the proposed scheme shows satisfactory performance across all major evaluation criteria, suggesting that it can set the pace for real implementations. Vis-à-vis the related work, the proposed platform is novel, especially through the prism of a large-scale, full-fledged implementation and its assessment.	翻訳日:2023-04-17 00:33:30 公開日:2021-01-12
# ダイヤモンド中の窒素空孔中心を用いたマイスナースクリーニングの全光学・マイクロ波フリー検出 All-Optical and Microwave-Free Detection of Meissner Screening using Nitrogen-Vacancy Centers in Diamond ( http://arxiv.org/abs/2101.04571v1 ) ライセンス: Link先を確認	D. Paone, D. Pinto, G. Kim, L. Feng, M-J. Kim, R. Stöhr, A. Singha, S. Kaiser, G. Logvenov, B. Keimer, J. Wrachtrup and K. Kern	(参考訳) 薄膜超伝導体の微視的研究は、非平衡相転移の検出とナノスケールでのダイナミクスの解明に重要な役割を果たしている。しかし、ナノスケールの空間分解能とピコ秒時間分解能を持つ磁気センサはこれらの探索に不可欠である。本稿では,非侵襲量子センサとしてダイヤモンド中の負帯電窒素空孔(nv)中心を利用し,超伝導薄膜中のマイスナー状態の空間的検出を可能にする全光マイクロ波フリー方式を提案する。超伝導LSCO薄膜上にNV注入ダイヤモンド膜を配置する。 NVフォトルミネッセンス (PL) の強いB場依存性により, 外部磁場下でのLSCOのマイスナースクリーニングを非共鳴的に検討することができる。 Microscopic studies on thin film superconductors play an important role for probing non-equilibrium phase transitions and revealing dynamics at the nanoscale. However, magnetic sensors with nanometer scale spatial and picosecond temporal resolution are essential for exploring these. Here, we present an all-optical, microwave-free method, that utilizes the negatively charged nitrogen-vacancy (NV) center in diamond as a non-invasive quantum sensor and enables the spatial detection of the Meissner state in a superconducting thin film. We place an NV implanted diamond membrane on a superconducting LSCO thin film. The strong B-field dependence of the NV photoluminescence (PL) allows us to investigate the Meissner screening in LSCO under an externally applied magnetic field in a non-resonant manner.	翻訳日:2023-04-17 00:33:20 公開日:2021-01-12
# 光子と圧縮コヒーレント光子の特性関数と準確率分布 Characteristic function and quasi-probability distribution of photons and of squeezed coherent photons ( http://arxiv.org/abs/2101.04549v1 ) ライセンス: Link先を確認	Moorad Alexanian	(参考訳) 我々は、光子の熱状態における圧縮コヒーレント光子のp次特性関数とその準分布関数であるフーリエ変換を考察し、圧縮コヒーレント光子の平均数と数分散を計算する。前の特性関数から計算された全ての特性は、パラメトリック変換によって圧縮されたコヒーレント光子の温度状態における光子の当該性質を計算するために使うことができる。特に、パラメトリック変換によって平均数と値のばらつきを得ることができる。 We consider the p-ordered characteristic function and its Fourier transform, the quasidistribution function, of squeezed coherent photons in a thermal state of photons and calculate the mean number and number variance of squeezed coherent photons. All the properties calculated from the previous characteristic function can be used to calculate said properties for photons in a thermal state of squeezed coherent photons via a parametric transformation. In particular, one can obtain the mean number and number variance via the parametric transformation.	翻訳日:2023-04-17 00:33:07 公開日:2021-01-12
# GSM-GPRSを用いたスマート街灯 GSM-GPRS Based Smart Street Light ( http://arxiv.org/abs/2101.06102v1 ) ライセンス: Link先を確認	Imran Kabir, Shihab Uddin Ahamad, Mohammad Naim Uddin, Shah Mohazzem Hossain, Faija Farjana, Partha Protim Datta, Md. Raduanul Alam Riad, Mohammed Hossam-E-Haider	(参考訳) 街路灯はバングラデシュの街路を照らす伝統的なマニュアルシステムであり、ゾーンの街路灯を制御するためだけに専用人が掲示され、街路灯を照らして1日に2回点灯し、日中や日中は街路灯の展示が行われる。これにより予算が削減される。これに加えて、欠陥のあるライトは、技術的な欠点に繋がる長い間、関係する当局の邪魔にならないかもしれない。本稿では,手動制御,半自動制御,完全自動制御を備えたSIM900 GSM-GPRS Shieldを用いたバングラデシュのような国の街灯制御のプロセスを示す。 Street lighting system has always been the traditional manual system of illuminating the streets in Bangladesh, where a dedicated person is posted only to control the street lights of a zone, who roams around the zonal area to switch on and switch off the lights two times a day, which brings about the exhibition of bright lights in street even after sunrise and in some cases maybe the whole day. This results in insertion to the budget. In addition to this, faulty lights may not come to the heed of the concerned authority for a long time which leads to the technical downside. This paper demonstrates a process of controlling the street lights in country like Bangladesh employing SIM900 GSM-GPRS Shield which comes up with the provision of manual control, semi-automated control as well as full-automated control.	翻訳日:2023-04-17 00:25:42 公開日:2021-01-12
# 重力波検出のためのボース・アインシュタイン凝縮体の2モードフォノンスクイーズ Two-mode Phonon Squeezing in Bose-Einstein Condensates for Gravitational Wave Detection ( http://arxiv.org/abs/2101.05051v1 ) ライセンス: Link先を確認	Paul Juschitz	(参考訳) 圧縮された非古典的状態は、測定装置の感度を古典的状態の限界を超えて押し上げる能力があるため、量子メトロロジーの不可欠な道具である。光におけるそれらの生成は標準的な技術となっているが、超低温原子の気体中の集合励起の圧縮状態の生成は、ボース=アインシュタイン凝縮体(BEC)のフォノンは、比較的最近の問題である。このタスクは、becベースの量子メロジカルデバイスに対する多くの提案や、重力波の検出にそれらを適用する可能性と、継続的に関連づけられている。本論文の目的は, 均一なBECに対して振動する外部電位が最近説明された効果を利用して, 現代の技術によって2モード圧縮されたフォノン状態を生成することである。この問題は、一般相対性理論やエフィモフ物理学のような、冷たい原子以外の様々な分野の要素をまとめる。これに答えるために、初期熱フォノニック状態における振動電位による完全な変換を考慮し、摂動の大きさの上限を見つけるとともに、メトロロジーにおける使用に関して最終状態の品質を定量化することができる。これらの知見を既存の実験に応用して, スクイーズ方式の有効性を判断し, 有効性に適していないことを示す一方で, 効率的な実装が可能で, 実験範囲内と思われる設定を提案する。最適化のための広大なパラメータ空間を出発する余地を考えると、検討されたメカニズムは、もともとこの研究を動機付けた重力波検出器だけでなく、より一般的には超低温原子に基づく量子力学分野にも応用できる。 Squeezed, nonclassical states are an integral tool of quantum metrology due to their ability to push the sensitivity of a measurement apparatus beyond the limits of classical states. While their creation in light has become a standard technique, the production of squeezed states of the collective excitations in gases of ultracold atoms, the phonons of a Bose-Einstein condensate (BEC), is a comparably recent problem. This task is continuously gaining relevance with a growing number of proposals for BEC-based quantum metrological devices and the possibility to apply them in the detection of gravitational waves. The objective of this thesis is to find whether the recently described effect of an oscillating external potential on a uniform BEC can be exploited to generate two-mode squeezed phonon states, given present day technology. This question brings together elements of a range of fields beyond cold atoms, such as general relativity and Efimov physics. 
To answer it, the full transformation caused by the oscillating potential on an initially thermal phononic state is considered, allowing to find an upper bound for the magnitude of this perturbation as well as to quantify the quality of the final state with respect to its use in metrology. These findings are then applied to existing experiments to judge the feasibility of the squeezing scheme and while the results indicate that they are not well suited for it, a setup is proposed that allows for its efficient implementation and seems within experimental reach. In view of the vast parameter space leaving room for optimization, the considered mechanism could find applications not only in the gravitational wave detector that originally motivated this work, but more generally in the field of quantum metrology based on ultracold atoms.	翻訳日:2023-04-17 00:25:26 公開日:2021-01-12
# 非エルミート粒子アンサンブルの情報と熱力学特性 Information and thermodynamic properties of a non-Hermitian particle ensemble ( http://arxiv.org/abs/2101.04803v1 ) ライセンス: Link先を確認	F. C. E. Lima, A. R. P. Moreira, and C. A. S. Almeida	(参考訳) 非相対論的量子力学の文脈では、非エルミート系のシャノンのエントロピーを調べ、この量がサイクロトロン周波数でどのように変化するかを理解する。その後、均一磁場の存在下でこれらのスピンレス粒子のアンサンブルを構築することに注意を向ける。次に,モデルの熱力学特性について検討する。最後に、シャノンのエントロピーと熱力学的性質が磁場の作用によってどのように変化するかを示す。 In the context of non-relativistic quantum mechanics, we investigated Shannon's entropy of a non-Hermitian system to understand how this quantity is modified with the cyclotron frequency. Subsequently, we turn our attention to the construction of an ensemble of these spinless particles in the presence of a uniform magnetic field. Then, we study the thermodynamic properties of the model. Finally, we show how Shannon's entropy and thermodynamic properties are modified with the action of the magnetic field.	翻訳日:2023-04-17 00:24:55 公開日:2021-01-12
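The abstract above studies how Shannon's entropy changes with the cyclotron frequency. As a rough illustration of the quantity itself, the sketch below evaluates the position-space Shannon entropy S = -∫ρ ln ρ dx numerically for a Gaussian ground-state density of width set by a frequency ω. This Hermitian oscillator density (in units ħ = m = 1) is a stand-in chosen because its entropy is known in closed form, S = ½(1 + ln(π/ω)); it is an assumption of this example and not the non-Hermitian model of the paper.

```python
import numpy as np


def shannon_entropy(rho, dx):
    """Riemann-sum estimate of -integral(rho * ln(rho)) on a uniform grid."""
    rho = np.asarray(rho, dtype=float)
    mask = rho > 0.0  # skip zero-density points, where rho*ln(rho) -> 0
    return float(-np.sum(rho[mask] * np.log(rho[mask])) * dx)


def gaussian_density(x, omega):
    """|psi_0(x)|^2 for a harmonic oscillator of frequency omega (hbar = m = 1)."""
    return np.sqrt(omega / np.pi) * np.exp(-omega * x**2)


x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

# Entropy for two illustrative frequencies; a larger omega localizes the
# density and therefore lowers the position-space entropy.
entropy_low = shannon_entropy(gaussian_density(x, 1.0), dx)
entropy_high = shannon_entropy(gaussian_density(x, 2.0), dx)
```

For any other model, only `gaussian_density` needs to be replaced by the density of interest; the entropy routine is independent of where the density comes from.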
# 個々の窒素空孔核スピンの室温制御と電気的読み出し Room-temperature control and electrical readout of individual nitrogen-vacancy nuclear spins ( http://arxiv.org/abs/2101.04769v1 ) ライセンス: Link先を確認	Michal Gulka, Daniel Wirtitsch, Viktor Ivády, Jelle Vodnik, Jaroslav Hruby, Goele Magchiels, Emilie Bourgeois, Adam Gali, Michael Trupke, Milos Nesladek	(参考訳) 半導体の核スピンは量子計算、通信、センシングなど量子技術の主要な候補である。ダイヤモンドの核スピンは、非常に長いコヒーレンス寿命のため特に魅力的である。窒素空孔(NV)中心では、これらの核量子ビットは、フォトニックリンクを介する絡み合いを可能にする補助電子量子ビットの恩恵を受ける。電子自体による量子情報の転送は、隣り合う中心への制御された転送や双極子相互作用によって、より高速でより小さなプロセッサが実現されるが、そのようなノードの配列の光学的読み出しは、必要な準回折サイト間距離のために困難を伴う。ここでは、NV電子に結合した1つの14N核スピンである、そのような系の基本単位の電気的読み出しを示す。本研究の結果は、ナノスケール電極構造と両立する形で、量子ゲート動作と核量子ビットレジスタの電気的読み出しを行うための重要な要素を提供する。このデモンストレーションは、半導体スケーラビリティを備えた大規模ダイヤモンド量子デバイスへのマイルストーンである。 Nuclear spins in semiconductors are leading candidates for quantum technologies, including quantum computation, communication, and sensing. Nuclear spins in diamond are particularly attractive due to their extremely long coherence lifetime. With the nitrogen-vacancy (NV) centre, such nuclear qubits benefit from an auxiliary electronic qubit, which has enabled entanglement mediated by photonic links. The transport of quantum information by the electron itself, via controlled transfer to an adjacent centre or via the dipolar interaction, would enable even faster and smaller processors, but optical readout of arrays of such nodes presents daunting challenges due to the required sub-diffraction inter-site distances. Here, we demonstrate the electrical readout of a basic unit of such systems - a single 14N nuclear spin coupled to the NV electron. Our results provide the key ingredients for quantum gate operations and electrical readout of nuclear qubit registers, in a manner compatible with nanoscale electrode structures. This demonstration is therefore a milestone towards large-scale diamond quantum devices with semiconductor scalability.	翻訳日:2023-04-17 00:24:47 公開日:2021-01-12
# 量子演算回路の現実的最悪のケース解析について On the realistic worst case analysis of quantum arithmetic circuits ( http://arxiv.org/abs/2101.04764v1 ) ライセンス: Link先を確認	Alexandru Paler, Oumarou Oumarou, Robert Basmadjian	(参考訳) 量子回路の設計における直観は誤解を招く可能性があることを示す。特に次のことを示す。 a) T数の削減は,総深度を増加させうる。 b) NISQ回路ではCNOTを測定に置き換えることが有益かもしれない。 c) 相対位相トフォリのアンシラの測定に基づく非計算は,回路の深さの最大30%を占めうる。 d) 面積及び体積コストの指標は、資源分析を誤って報告しうる。私たちの発見は、量子ビットが今後も極めて希少なリソースであり続けると仮定している。結果は、NISQとQECC保護回路の両方に適用できる。提案手法はトフォリゲートをクリフォード+Tゲートに分解する複数の方法を用いる。リップルキャリーを用いた加算回路と乗算回路について述べる。副産物として,実用的に重要な回路幅の範囲では,リップルキャリー加算回路はキャリールックアヘッド加算回路よりも資源効率が高いことを体系的に示す。これらの手法と回路はオープンソースの QUANTIFY ソフトウェアで実装された。 We provide evidence that commonly held intuitions when designing quantum circuits can be misleading. In particular we show that: a) reducing the T-count can increase the total depth; b) it may be beneficial to trade CNOTs for measurements in NISQ circuits; c) measurement-based uncomputation of relative phase Toffoli ancillae can make up to 30% of a circuit's depth; d) area and volume cost metrics can misreport the resource analysis. Our findings assume that qubits are and will remain a very scarce resource. The results are applicable for both NISQ and QECC protected circuits. Our method uses multiple ways of decomposing Toffoli gates into Clifford+T gates. We illustrate our method on addition and multiplication circuits using ripple-carry. As a byproduct, we show systematically that for a practically significant range of circuit widths, ripple-carry addition circuits are more resource efficient than the carry-lookahead addition ones. The methods and circuits were implemented in the open-source QUANTIFY software.	翻訳日:2023-04-17 00:24:28 公開日:2021-01-12
# 離接制約付きナップサック問題に対するしきい値探索に基づくメメティックアルゴリズム A threshold search based memetic algorithm for the disjunctively constrained knapsack problem ( http://arxiv.org/abs/2101.04753v1 ) ライセンス: Link先を確認	Zequn Wei and Jin-Kao Hao	(参考訳) 離接制約付きナップサック問題(DCKP)は、ナップサック容量を満たしながら選択されたアイテムの総利益を最大化するように、互いに両立するアイテムの部分集合を容量制約付きナップサックに詰める問題である。 DCKPは多くの応用があるが、計算は困難である(NP-hard)。本研究では, DCKP を解くためのしきい値探索に基づくメメティックアルゴリズムを提案し, メメティックフレームワークとしきい値探索を組み合わせて高品質な解を求める。文献にある2セット計6340のベンチマークインスタンスに関する広範な計算的評価結果から,提案手法は最先端手法と比較して高い競争力を示す。特に,Set I (100 インスタンス) と Set II (6240 インスタンス) について,それぞれ 24 件と 354 件で既知の最良結果が改善された(新しい下限)ことを報告する。我々は,アルゴリズムの重要成分を分析し,アルゴリズムの性能におけるその役割を明らかにする。我々のアルゴリズムのコードは公開される予定だ。 The disjunctively constrained knapsack problem (DCKP) consists in packing a subset of pairwise compatible items in a capacity-constrained knapsack such that the total profit of the selected items is maximized while satisfying the knapsack capacity. DCKP has numerous applications but is computationally challenging (NP-hard). In this work, we present a threshold search based memetic algorithm for solving the DCKP that combines the memetic framework with threshold search to find high quality solutions. Extensive computational assessments on two sets of 6340 benchmark instances in the literature demonstrate that the proposed algorithm is highly competitive compared to the state-of-the-art methods. In particular, we report 24 and 354 improved best-known results (new lower bounds) for Set I (100 instances) and for Set II (6240 instances), respectively. We analyze the key algorithmic components and shed light on their roles in the performance of the algorithm. The code of our algorithm will be made publicly available.	翻訳日:2023-04-17 00:24:17 公開日:2021-01-12
# 移動振動子の励起 Excitation of a moving oscillator ( http://arxiv.org/abs/2101.04721v1 ) ライセンス: Link先を確認	Viktor V. Dodonov	(参考訳) 任意の運動法則に従って中心が移動する量子調和振動子について、コヒーレント状態とフォック状態の間の遷移振幅と確率を計算する。これらの量は移動中心の加速度のフーリエ変換によって決定される。定加速度の場合、その確率は振動子周波数で振動し、各周期の後には励起が生じない。調和トラップ中心の振動運動や回転運動の例も考察する。見積もりによれば、調和トラップ中心の運動による振動状態の励起効果は、利用可能な原子トラップで観測できる。 We calculate transition amplitudes and probabilities between the coherent and Fock states of a quantum harmonic oscillator with a moving center for an arbitrary law of motion. These quantities are determined by the Fourier transform of the moving center acceleration. In the case of a constant acceleration, the probabilities oscillate with the oscillator frequency, so that no excitation occurs after every period. Examples of oscillating and rotating motion of the harmonic trap center are considered too. Estimations show that the effect of excitation of vibration states due to the motion of the harmonic trap center can be observed in available atomic traps.	翻訳日:2023-04-17 00:23:58 公開日:2021-01-12
# Pontryagin Differentiable Programming: エンドツーエンドの学習と制御フレームワーク Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework ( http://arxiv.org/abs/1912.12970v5 ) ライセンス: Link先を確認	Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou	(参考訳) 本稿では,PDP (Pontryagin Differentiable Programming) 手法を開発し,学習と制御タスクの幅広いクラスを解決するための統一フレームワークを構築した。 PDPは2つの新しい手法によって既存の手法と区別される。第一に、ポントリャーギンの最大原理を通して微分することで、最適制御系における調整可能なパラメータに関する軌道の解析的微分が得られ、ダイナミクス、ポリシー、制御目的関数のエンドツーエンド学習が可能になる。第二に、PDPフレームワークの後方パスにおいて補助制御系を提案する。この補助制御系の出力は元の系の軌道のパラメータに関する解析的微分であり、標準的な制御ツールを用いて反復的に解くことができる。逆強化学習,システム同定,制御・計画の3つの学習モードについて検討した。多リンクロボットアーム,6自由度クアッドロータの機動,6自由度ロケットの動力着陸など,さまざまな高次元システムにおいて学習モード毎のPDPの能力を示す。 This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework to solve a broad class of learning and control tasks. The PDP is distinguished from existing methods by two novel techniques: first, we differentiate through Pontryagin's Maximum Principle, and this allows us to obtain the analytical derivative of a trajectory with respect to tunable parameters within an optimal control system, enabling end-to-end learning of dynamics, policies, and/or control objective functions; and second, we propose an auxiliary control system in the backward pass of the PDP framework, and the output of this auxiliary control system is the analytical derivative of the original system's trajectory with respect to the parameters, which can be iteratively solved using standard control tools. 
We investigate three learning modes of the PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of the PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and a 6-DoF rocket powered landing.	翻訳日:2023-01-17 03:12:21 公開日:2021-01-12
# 競争型生成分類のための情報ボトルネックを用いた正規化流れの訓練 Training Normalizing Flows with the Information Bottleneck for Competitive Generative Classification ( http://arxiv.org/abs/2001.06448v5 ) ライセンス: Link先を確認	Lynton Ardizzone, Radek Mackowiak, Carsten Rother, Ullrich Köthe	(参考訳) Information Bottleneck (IB) 目的関数は、情報理論を用いてタスクパフォーマンス対堅牢性のトレードオフを定式化する。標準的な判別分類の設定においてうまく適用されている。 IBが正規化フローなどの生成的尤度モデルのトレーニングにも利用できるかどうかという疑問を提起する。正規化フローは可逆ネットワークアーキテクチャ(INN)を使用するため、構成上、情報を保存する。これはボトルネックの概念と矛盾しているように見える。本稿では,まず,IB目的関数を用いてINNをトレーニングする条件付き正規化フローのクラスであるIB-INNの理論と方法論を開発する。少量の制御された情報損失を導入することで,INNの生成能力を無傷に保ちながら,IBの漸近的に正確な定式化が可能になる。次に,これらのモデルの性質を,特に生成的分類器として用いた場合について実験的に検討する。このモデルクラスは、不確実性定量化の改善や分布外検出などの利点を提供するが、従来の生成型分類法は分類精度がかなり低い。 IBのトレードオフパラメータは、生成能力と標準分類器に近い精度との混合を制御していることがわかった。経験的に、この混合状態における我々の不確実性推定は、従来の生成的および識別的分類器と比べて良好である。 The Information Bottleneck (IB) objective uses information theory to formulate a task-performance versus robustness trade-off. It has been successfully applied in the standard discriminative classification setting. We pose the question whether the IB can also be used to train generative likelihood models such as normalizing flows. Since normalizing flows use invertible network architectures (INNs), they are information-preserving by construction. This seems contradictory to the idea of a bottleneck. In this work, firstly, we develop the theory and methodology of IB-INNs, a class of conditional normalizing flows where INNs are trained using the IB objective: Introducing a small amount of controlled information loss allows for an asymptotically exact formulation of the IB, while keeping the INN's generative capabilities intact. Secondly, we investigate the properties of these models experimentally, specifically used as generative classifiers. This model class offers advantages such as improved uncertainty quantification and out-of-distribution detection, but traditional generative classifier solutions suffer considerably in classification accuracy. 
We find the trade-off parameter in the IB controls a mix of generative capabilities and accuracy close to standard classifiers. Empirically, our uncertainty estimates in this mixed regime compare favourably to conventional generative and discriminative classifiers.	翻訳日:2023-01-10 09:59:32 公開日:2021-01-12
# AI-GAN: 攻撃に触発された敵の例の生成 AI-GAN: Attack-Inspired Generation of Adversarial Examples ( http://arxiv.org/abs/2002.02196v2 ) ライセンス: Link先を確認	Tao Bai, Jun Zhao, Jinlin Zhu, Shoudong Han, Jiefeng Chen, Bo Li, Alex Kot	(参考訳) ディープニューラルネットワーク(DNN)は、入力に知覚できない摂動を加えることで構築される敵の例に対して脆弱である。近年、様々な攻撃や戦略が提案されているが、知覚的に現実的な敵の例をより効率的に生成する方法は未解決のままである。本稿では、ジェネレータ、識別器、攻撃者が共同で訓練されるAI-GAN(Attack-Inspired GAN)と呼ばれる新しいフレームワークを提案する。トレーニングが完了すると、入力画像とターゲットクラスが与えられれば、敵対的摂動を効率よく生成できる。MNISTやCIFAR-10などの一般的なデータセットに関する広範な実験を通じて、AI-GAN は高い攻撃成功率を達成し、さまざまな設定で生成時間を大幅に短縮する。さらに、AI-GANは初めて、すべてのクラスで約90%の成功率で、CIFAR-100などの複雑なデータセットにスケールすることに成功した。 Deep neural networks (DNNs) are vulnerable to adversarial examples, which are crafted by adding imperceptible perturbations to inputs. Recently different attacks and strategies have been proposed, but how to generate adversarial examples perceptually realistic and more efficiently remains unsolved. This paper proposes a novel framework called Attack-Inspired GAN (AI-GAN), where a generator, a discriminator, and an attacker are trained jointly. Once trained, it can generate adversarial perturbations efficiently given input images and target classes. Through extensive experiments on several popular datasets (e.g., MNIST and CIFAR-10), AI-GAN achieves high attack success rates and reduces generation time significantly in various settings. Moreover, for the first time, AI-GAN successfully scales to complicated datasets (e.g., CIFAR-100) with around 90% success rates among all classes.	翻訳日:2023-01-03 10:12:23 公開日:2021-01-12
# 摂動に基づく正則化によるアーキテクチャ探索の安定化 Stabilizing Differentiable Architecture Search via Perturbation-based Regularization ( http://arxiv.org/abs/2002.05283v3 ) ライセンス: Link先を確認	Xiangning Chen, Cho-Jui Hsieh	(参考訳) 微分可能なアーキテクチャ探索(DARTS)は、アーキテクチャを識別するための一般的なNASソリューションである。アーキテクチャ空間の継続的な緩和に基づいて、DARTSは異なるアーキテクチャウェイトを学び、探索コストを大幅に削減する。しかし、その安定性は、探索が進むにつれて劣化するアーキテクチャをもたらすことで問題視されている。最終アーキテクチャを蒸留する際の劇的な性能低下につながる急激なバリデーション損失の状況は、不安定を引き起こす重要な要因であることがわかった。そこで本研究では,DARTS法における損失景観の平滑化と一般化性の向上を目的として,摂動型正規化SmoothDARTS(SDARTS)を提案する。特に,我々の新しい定式化はDARTS法をランダムな平滑化または逆攻撃によって安定化させる。 nas-bench-1shot1の探索軌跡は,提案手法の有効性を示し,安定性の向上により,4つのデータセットの様々な検索空間における性能向上を実現する。さらに,SDARTSは,スムーズな損失景観と性能向上を考慮に入れた検証損失のヘッセン的規範を暗黙的に正則化していることを示す。 Differentiable architecture search (DARTS) is a prevailing NAS solution to identify architectures. Based on the continuous relaxation of the architecture space, DARTS learns a differentiable architecture weight and largely reduces the search cost. However, its stability has been challenged for yielding deteriorating architectures as the search proceeds. We find that the precipitous validation loss landscape, which leads to a dramatic performance drop when distilling the final architecture, is an essential factor that causes instability. Based on this observation, we propose a perturbation-based regularization - SmoothDARTS (SDARTS), to smooth the loss landscape and improve the generalizability of DARTS-based methods. In particular, our new formulations stabilize DARTS-based methods by either random smoothing or adversarial attack. The search trajectory on NAS-Bench-1Shot1 demonstrates the effectiveness of our approach and due to the improved stability, we achieve performance gain across various search spaces on 4 datasets. Furthermore, we mathematically show that SDARTS implicitly regularizes the Hessian norm of the validation loss, which accounts for a smoother loss landscape and improved performance.	翻訳日:2023-01-01 18:52:58 公開日:2021-01-12
# HEMlets PoSh: 3次元ポーズと形状推定のための部分中心ヒートマップ三重項学習 HEMlets PoSh: Learning Part-Centric Heatmap Triplets for 3D Human Pose and Shape Estimation ( http://arxiv.org/abs/2003.04894v3 ) ライセンス: Link先を確認	Kun Zhou, Xiaoguang Han, Nianjuan Jiang, Kui Jia, Jiangbo Lu	(参考訳) 一つの画像から3Dのポーズを推定するのは難しい作業だ。本研究は,2次元観察と3次元解釈のギャップを短くする中間状態である部分中心ヒートマップ三重項(HEMlets)を導入することで,検出された2次元関節を3次元空間に持ち上げる際の不確実性に対処しようとするものである。 HEMletsは3つのジョイントヒートマップを使用して、各骨格体部に対するエンドジョイントの相対的な深さ情報を表す。提案手法では,まず入力画像からHEMletsを予測するために畳み込みネットワーク(ConvNet)を訓練し,その後にボリュームジョイントヒートマップの回帰を行う。積分演算を利用してボリュームヒートマップから関節の位置を抽出し、エンドツーエンドの学習を保証する。ネットワーク設計の単純さにも拘わらず、定量的比較は最高の方法(例えばHuman3.6Mで$20\%$)よりも大幅に性能が向上したことを示している。提案手法は,骨格関節の弱いアノテーションによる相対深度情報しか得られない "in-the-wild" 画像による訓練を自然に支援する。これにより,屋外画像の質的比較で確認されたとおり,モデルの一般化能力がさらに向上する。 HEMletsの姿勢推定の強みを活かし,さらに浅く効果的なネットワークモジュールを設計・付加し,身体の姿勢や形状のSMPLパラメータを回帰する。 HEMletsベースのヒューマンポーズと形状回復パイプライン全体をHEMlets PoShと呼ぶ。 HEMlets PoShアプローチで得られた最先端の成果を,既存の人体回復ベンチマークでの広範な定量的および定性的実験により裏付けた。 Estimating 3D human pose from a single image is a challenging task. This work attempts to address the uncertainty of lifting the detected 2D joints to the 3D space by introducing an intermediate state, Part-Centric Heatmap Triplets (HEMlets), which shortens the gap between the 2D observation and the 3D interpretation. The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part. In our approach, a Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image, followed by a volumetric joint-heatmap regression. We leverage the integral operation to extract the joint locations from the volumetric heatmaps, guaranteeing end-to-end learning. Despite the simplicity of the network design, the quantitative comparisons show a significant performance improvement over the best-of-grade methods (e.g. $20\%$ on Human3.6M). 
The proposed method naturally supports training with "in-the-wild" images, where only weakly-annotated relative depth information of skeletal joints is available. This further improves the generalization ability of our model, as validated by qualitative comparisons on outdoor images. Leveraging the strength of the HEMlets pose estimation, we further design and append a shallow yet effective network module to regress the SMPL parameters of the body pose and shape. We term the entire HEMlets-based human pose and shape recovery pipeline HEMlets PoSh. Extensive quantitative and qualitative experiments on the existing human body recovery benchmarks justify the state-of-the-art results obtained with our HEMlets PoSh approach.	翻訳日:2022-12-24 21:19:48 公開日:2021-01-12
# 自動ベイズ最適化に向けて : 獲得関数に関わる第一歩 Towards Automatic Bayesian Optimization: A first step involving acquisition functions ( http://arxiv.org/abs/2003.09643v2 ) ライセンス: Link先を確認	Eduardo C. Garrido Merchán, Luis C. Jariego Pérez	(参考訳) ベイズ最適化 (Bayesian Optimization) は、ブラックボックス、すなわち解析的表現や勾配にアクセスできず、評価が高価で、評価にノイズを伴う関数の最適化における最先端技術である。ベイズ最適化の最も一般的な応用は機械学習アルゴリズムの自動ハイパーパラメータチューニングであり、これらのアルゴリズムの一般化誤差の推定値を最適化して機械学習アルゴリズムの最適構成を得る。ベイズ最適化手法は成功裏に適用されているものの、確率的サロゲートモデルや使用する獲得関数など、設定が必要なハイパーパラメータも備えている。これらのハイパーパラメータの構成に関する悪い決定は、質の悪い結果を得ることを意味する。通常、これらのハイパーパラメータは、評価したい目的関数について仮定を置くことで調整されるが、目的関数に関する事前情報を持っていないシナリオもある。本稿では,ベイズ最適化の獲得関数を自動的にチューニングする複数のヒューリスティックを探索し,自動ベイズ最適化に向けた最初の試みを提案する。一連のベンチマーク問題と機械学習アルゴリズムのハイパーパラメータチューニング問題において,これらのヒューリスティックの有効性を示す。 Bayesian Optimization is the state of the art technique for the optimization of black boxes, i.e., functions whose analytical expression and gradients are not accessible, that are expensive to evaluate, and whose evaluations are noisy. The most popular application of bayesian optimization is the automatic hyperparameter tuning of machine learning algorithms, where we obtain the best configuration of machine learning algorithms by optimizing the estimation of the generalization error of these algorithms. Despite being applied with success, bayesian optimization methodologies also have hyperparameters that need to be configured such as the probabilistic surrogate model or the acquisition function used. A bad decision over the configuration of these hyperparameters implies obtaining bad quality results. Typically, these hyperparameters are tuned by making assumptions of the objective function that we want to evaluate but there are scenarios where we do not have any prior information about the objective function. In this paper, we propose a first attempt over automatic bayesian optimization by exploring several heuristics that automatically tune the acquisition function of bayesian optimization. 
We illustrate the effectiveness of these heuristics on a set of benchmark problems and a hyperparameter tuning problem of a machine learning algorithm.	翻訳日:2022-12-21 10:16:24 公開日:2021-01-12
# ベイズ的ODEソルバー: 最大事後確率推定 Bayesian ODE Solvers: The Maximum A Posteriori Estimate ( http://arxiv.org/abs/2004.00623v2 ) ライセンス: Link先を確認	Filip Tronarp, Simo Sarkka, Philipp Hennig	(参考訳) 近年,常微分方程式の数値解法を非線形ベイズ推定問題として定式化できること,およびガウス・マルコフ事前分布を用いる場合にはガウスフィルタリングと平滑化によって近似的に解けることが確立されている。本稿では、$\nu$回微分可能な線形時不変ガウス・マルコフ事前分布のクラスを考察する。ガウス推定量の分類を確立し、その階層の最上位には、反復拡張カルマンスムーザで計算できる最大事後確率(MAP)推定を置く。残りの3つのクラスは陽的、半陰的、陰的と呼ばれ、これらは古典的な概念に類似しており、フィルタ更新が局所的なMAP推定を生成するためのベクトル場に対する条件に対応する。MAP推定は、事前分布に付随する再生ヒルベルト空間における最適補間に対応し、この空間は今回の場合、滑らかさ$\nu+1$のソボレフ空間と等価である。したがって, 散乱データ近似とソボレフ空間における非線形解析の手法を用いて, ベクトル場に対する緩やかな条件の下で, MAP推定が充填距離(最大ステップサイズ)に関する多項式速度で真の解に収束することを示す。開発された手法は、古典的な収束解析法よりも、これらの推定量の収束を研究するための新しい、より自然なアプローチを提供する。これらの方法と理論的結果は数値例で示される。 It has recently been established that the numerical solution of ordinary differential equations can be posed as a nonlinear Bayesian inference problem, which can be approximately solved via Gaussian filtering and smoothing, whenever a Gauss--Markov prior is used. In this paper the class of $\nu$ times differentiable linear time invariant Gauss--Markov priors is considered. A taxonomy of Gaussian estimators is established, with the maximum a posteriori estimate at the top of the hierarchy, which can be computed with the iterated extended Kalman smoother. The remaining three classes are termed explicit, semi-implicit, and implicit, which are in similarity with the classical notions corresponding to conditions on the vector field, under which the filter update produces a local maximum a posteriori estimate. The maximum a posteriori estimate corresponds to an optimal interpolant in the reproducing Hilbert space associated with the prior, which in the present case is equivalent to a Sobolev space of smoothness $\nu+1$. 
Consequently, using methods from scattered data approximation and nonlinear analysis in Sobolev spaces, it is shown that the maximum a posteriori estimate converges to the true solution at a polynomial rate in the fill-distance (maximum step size) subject to mild conditions on the vector field. The methodology developed provides a novel and more natural approach to study the convergence of these estimators than classical methods of convergence analysis. The methods and theoretical results are demonstrated in numerical examples.	翻訳日:2022-12-17 19:33:24 公開日:2021-01-12
# YOLOv3によるドローン検出におけるアノテーションエラーの影響 Effect of Annotation Errors on Drone Detection with YOLOv3 ( http://arxiv.org/abs/2004.01059v4 ) ライセンス: Link先を確認	Aybora Koksal, Kutalmis Gokalp Ince, A. Aydin Alatan	(参考訳) 近年の深層ネットワークの進歩により、ディープラーニングバックボーンを用いた物体検出と追跡アルゴリズムが大幅に改善されているが、この急速な発展により大量の注釈付きラベルが必要となった。たとえこのような半自動的なアノテーションプロセスの詳細が、特にビデオアノテーションに関して、正確には分かっていないとしても、自動化されたラベリングプロセスが使われることが多い。残念ながら、このようなアプローチは誤ったアノテーションをもたらす可能性がある。本研究では,物体検出問題に対する異なる種類のアノテーションエラーをシミュレートし,トレーニングおよび試験段階における誤アノテーションを用いた最先端オブジェクト検出器YOLOv3の性能について検討する。さらに、CVPR-2020 Anti-UAV Challengeデータセットにおける避けられないアノテーションエラーについても、この有用なデータセットのアノテーションエラーを修正するソリューションを提案しながら、この方法で検討している。 Following the recent advances in deep networks, object detection and tracking algorithms with deep learning backbones have been improved significantly; however, this rapid development resulted in the necessity of large amounts of annotated labels. Even if the details of such semi-automatic annotation processes for most of these datasets are not known precisely, especially for the video annotations, some automated labeling processes are usually employed. Unfortunately, such approaches might result in erroneous annotations. In this work, different types of annotation errors for the object detection problem are simulated and the performance of a popular state-of-the-art object detector, YOLOv3, with erroneous annotations during training and testing stages is examined. Moreover, some inevitable annotation errors in the CVPR-2020 Anti-UAV Challenge dataset are also examined in this manner, while proposing a solution to correct such annotation errors of this valuable data set.	翻訳日:2022-12-17 12:45:46 公開日:2021-01-12
# 低次元関数のロバストテスト Robust testing of low-dimensional functions ( http://arxiv.org/abs/2004.11642v2 ) ライセンス: Link先を確認	Anindya De, Elchanan Mossel and Joe Neeman	(参考訳) 高次元推論における自然な問題は、分類器 $f:\mathbb{R}^n \rightarrow \{-1,1\}$ がその入力データの少数の線形方向に依存するかどうかを決定することである。関数 $g: \mathbb{R}^n \rightarrow \{-1,1\}$ が入力空間のある$k$次元部分空間によって完全に決定される場合、これを線形 $k$-junta と呼ぶ。著者たちの最近の研究は、線形$k$-juntaがテスト可能であることを示した。すなわち、次の2つを区別するアルゴリズムが存在する。 1. $f: \mathbb{R}^n \rightarrow \{-1,1\}$ は表面積が$s$の線形$k$-juntaである。 2. $f$ は、表面積が$(1+\epsilon)s$の任意の線形$k$-juntaから$\epsilon$-遠い。ここでアルゴリズムのクエリ複雑性は周囲の次元$n$とは独立である。耐雑音性のある性質検査への関心の高まりを受けて, 本論文では, この結果の耐雑音性(あるいは頑健性)のある版を証明する。すなわち、任意の$c>0$, $\epsilon>0$が与えられたとき、次の2つを区別するアルゴリズムを与える。 1. $f: \mathbb{R}^n \rightarrow \{-1,1\}$ は、表面積が$s$のある線形$k$-juntaと少なくとも$c$の相関を持つ。 2. $f$ は、表面積が最大 $s$ の任意の線形 $k$-junta と最大 $c-\epsilon$ の相関しか持たない。このテスターのクエリ複雑性は$k^{\mathsf{poly}(s/\epsilon)}$である。この手法を用いることで、表面積が$s$以下の線形$k$-juntaからなる任意のクラス$\mathcal{C}$に対して、同じクエリ複雑性を持つ完全耐雑音テスターも得られる。その結果、ガウス空間上の$k$個の半空間(定数$k$)の交叉のクラスに対して、クエリ複雑性が$k^{O(\mathsf{poly}(\log k/\epsilon))}$である完全耐雑音テスターが得られる。このクエリ複雑性は、周囲の次元$n$とは独立である。これまでは、単一の半空間に対してさえ、非自明な耐雑音テスターは知られていなかった。 A natural problem in high-dimensional inference is to decide if a classifier $f:\mathbb{R}^n \rightarrow \{-1,1\}$ depends on a small number of linear directions of its input data. Call a function $g: \mathbb{R}^n \rightarrow \{-1,1\}$, a linear $k$-junta if it is completely determined by some $k$-dimensional subspace of the input space. A recent work of the authors showed that linear $k$-juntas are testable. Thus there exists an algorithm to distinguish between: 1. $f: \mathbb{R}^n \rightarrow \{-1,1\}$ which is a linear $k$-junta with surface area $s$, 2. $f$ is $\epsilon$-far from any linear $k$-junta with surface area $(1+\epsilon)s$, where the query complexity of the algorithm is independent of the ambient dimension $n$. 
Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm which given any $c>0$, $\epsilon>0$, distinguishes between 1. $f: \mathbb{R}^n \rightarrow \{-1,1\}$ has correlation at least $c$ with some linear $k$-junta with surface area $s$. 2. $f$ has correlation at most $c-\epsilon$ with any linear $k$-junta with surface area at most $s$. The query complexity of our tester is $k^{\mathsf{poly}(s/\epsilon)}$. Using our techniques, we also obtain a fully noise tolerant tester with the same query complexity for any class $\mathcal{C}$ of linear $k$-juntas with surface area bounded by $s$. As a consequence, we obtain a fully noise tolerant tester with query complexity $k^{O(\mathsf{poly}(\log k/\epsilon))}$ for the class of intersection of $k$-halfspaces (for constant $k$) over the Gaussian space. Our query complexity is independent of the ambient dimension $n$. Previously, no non-trivial noise tolerant testers were known even for a single halfspace.	翻訳日:2022-12-10 04:26:39 公開日:2021-01-12
# DeepRx: 完全な畳み込み型ディープラーニングレシーバー DeepRx: Fully Convolutional Deep Learning Receiver ( http://arxiv.org/abs/2005.01494v2 ) ライセンス: Link先を確認	Mikko Honkala, Dani Korpi, Janne M.J. Huttunen	(参考訳) ディープラーニングは、ヒューリスティックなアルゴリズムに届かない多くの問題を解決した。現在の無線システムはよく理解されており、多くのタスクに最適なアルゴリズムが存在するにもかかわらず、無線通信にもうまく適用されている。受信機の個々の部分を学習することで得られる利得もあるが、受信機全体を共同で学習する方がよい。しかし、これはしばしば、最適解の実装が不可能な、難解な非線形問題をもたらす。そこで本研究では,周波数領域信号ストリームから符号化されていないビットへのレシーバパイプライン全体を実行する,完全畳み込みニューラルネットワークDeepRxを提案する。本研究では,畳み込みニューラルネットワークの入力を,データとパイロットシンボルの両方を用いて非常に特異的に構成することにより,正確なチャネル推定を容易にする。また、DeepRxは5Gシステムで使用されるチャネル符号化と互換性のあるソフトビットを出力する。 3GPP定義チャネルモデルを用いて、DeepRxが従来の手法より優れていることを示す。また,検出精度を向上させるために,未知のデータシンボルの既知の星座点と局所的なシンボル分布を利用するために,DeepRx学習による高い性能が期待できることを示す。 Deep learning has solved many problems that are out of reach of heuristic algorithms. It has also been successfully applied in wireless communications, even though the current radio systems are well-understood and optimal algorithms exist for many tasks. While some gains have been obtained by learning individual parts of a receiver, a better approach is to jointly learn the whole receiver. This, however, often results in a challenging nonlinear problem, for which the optimal solution is infeasible to implement. To this end, we propose a deep fully convolutional neural network, DeepRx, which executes the whole receiver pipeline from frequency domain signal stream to uncoded bits in a 5G-compliant fashion. We facilitate accurate channel estimation by constructing the input of the convolutional neural network in a very specific manner using both the data and pilot symbols. Also, DeepRx outputs soft bits that are compatible with the channel coding used in 5G systems. Using 3GPP-defined channel models, we demonstrate that DeepRx outperforms traditional methods. 
We also show that the high performance can likely be attributed to DeepRx learning to utilize the known constellation points of the unknown data symbols, together with the local symbol distribution, for improved detection accuracy.	翻訳日:2022-12-07 01:40:06 公開日:2021-01-12
# BERTと従来の機械学習テキスト分類の比較 Comparing BERT against traditional machine learning text classification ( http://arxiv.org/abs/2005.13012v2 ) ライセンス: Link先を確認	Santiago González-Carvajal and Eduardo C. Garrido-Merchán	(参考訳) 近年、BERTモデルは、人間の監督なしに教師付きテキスト分類などの複数のNLPタスクに対処できる、最先端の機械学習モデルとして人気を博している。あらゆるタイプのコーパスに対処して優れた成果をもたらす柔軟性が、このアプローチを学術界だけでなく産業界でも非常に人気のあるものにしている。しかし、長年にわたって成功裏に使われてきた異なるアプローチも数多く存在する。本研究では,まずBERTを紹介し,古典的NLPアプローチについて簡単に概説する。そして,さまざまなシナリオを扱う一連の実験により,従来のTF-IDF語彙を機械学習アルゴリズムに入力する手法に対するBERTの振る舞いを経験的に検証した。本研究の目的は,NLPタスクのデフォルトとしてBERTの使用を支持する,あるいは棄却する経験的証拠を追加することである。実験は、BERTの優位性と、テキストの言語といったNLP問題の特徴に対する独立性を示し、NLP問題で使用されるデフォルト技術としてBERTを使用することを支持する経験的証拠を加える。 The BERT model has arisen as a popular state-of-the-art machine learning model in the recent years that is able to cope with multiple NLP tasks such as supervised text classification without human supervision. Its flexibility to cope with any type of corpus delivering great results has made this approach very popular not only in academia but also in the industry. However, many different approaches have been used successfully over the years. In this work, we first present BERT and include a little review on classical NLP approaches. Then, we empirically test with a suite of experiments dealing with different scenarios the behaviour of BERT against the traditional TF-IDF vocabulary fed to machine learning algorithms. Our purpose of this work is to add empirical evidence to support or reject the use of BERT as a default on NLP tasks. Experiments show the superiority of BERT and its independence of features of the NLP problem such as the language of the text adding empirical evidence to use BERT as a default technique to be used in NLP problems.	翻訳日:2022-11-28 23:22:17 公開日:2021-01-12
# gan(generative adversarial network)を用いた胸部x線画像におけるganによる肺炎およびcovid-19の検出 Data Augmentation using Generative Adversarial Networks (GANs) for GAN-based Detection of Pneumonia and COVID-19 in Chest X-ray Images ( http://arxiv.org/abs/2006.03622v2 ) ライセンス: Link先を確認	Saman Motamed and Patrik Rogalla and Farzad Khalvati	(参考訳) 畳み込みニューラルネットワーク(CNN)のトレーニングの成功には、かなりの量のデータが必要である。小さなデータセットでは、ネットワークは一般的でない。データ拡張技術は、既存のトレーニングデータをより効果的に利用することにより、ニューラルネットワークの一般化性を向上させる。しかし、標準的なデータ拡張手法は、限定可能な代替データを生成する。 GAN(Generative Adversarial Networks)は,新たなデータ生成とCNNの性能向上に利用されている。それでも、ganのトレーニングのためのデータ拡張技術は、cnnと比較して未検討である。そこで本研究では, 造血モデルを用いて, 肺炎およびcovid-19の半監督検出のための胸部x線増倍用ganアーキテクチャを提案する。提案したGANは, 肺炎, COVID-19の胸部X線検査において, データを効果的に増強し, 疾患の分類精度を向上させるのに有用である。我々は、GANモデルとDeep Convolutional GANと、2つの異なるX線データセット上の従来の拡張方法(回転、ズーム等)を比較し、GANに基づく拡張手法が、X線画像の異常を検出するための他の拡張方法を上回ることを示す。 Successful training of convolutional neural networks (CNNs) requires a substantial amount of data. With small datasets networks generalize poorly. Data Augmentation techniques improve the generalizability of neural networks by using existing training data more effectively. Standard data augmentation methods, however, produce limited plausible alternative data. Generative Adversarial Networks (GANs) have been utilized to generate new data and improve the performance of CNNs. Nevertheless, data augmentation techniques for training GANs are under-explored compared to CNNs. In this work, we propose a new GAN architecture for augmentation of chest X-rays for semi-supervised detection of pneumonia and COVID-19 using generative models. We show that the proposed GAN can be used to effectively augment data and improve classification accuracy of disease in chest X-rays for pneumonia and COVID-19. 
We compare our augmentation GAN model with Deep Convolutional GAN and traditional augmentation methods (rotate, zoom, etc) on two different X-ray datasets and show our GAN-based augmentation method surpasses other augmentation methods for training a GAN in detecting anomalies in X-ray images.	翻訳日:2022-11-25 03:43:46 公開日:2021-01-12
# デバイス社会における分類課題のための連合学習と連続学習 Federated and continual learning for classification tasks in a society of devices ( http://arxiv.org/abs/2006.07129v2 ) ライセンス: Link先を確認	Fernando E. Casado, Dylan Lema, Roberto Iglesias, Carlos V. Regueiro, Senén Barro	(参考訳) 現在、デバイスはますます相互接続され、センサ化され、ほぼユビキタスな状況にあります。近年、ディープラーニングは、これらのデバイスが収集できる膨大なデータから知識を抽出する一般的な方法となっている。それにもかかわらず、集中型最先端学習手法は、利用可能な情報が通常プライベートで部分的、偏りがあり、時間とともに進化する、真の分散問題に直面したときに多くの欠点がある。フェデレートラーニング(Federated Learning)は、複数の分散デバイスがリモートで、協調的にモデルをトレーニングし、データのプライバシを保存するための人気のフレームワークである。しかし、現在のフェデレートラーニングにおける提案は、スマートフォンなどの非専用デバイスで実装できないようなディープアーキテクチャに重点を置いている。また、時間とともにデータ分布が予期せぬ方法で変化し、概念ドリフト(concept drift)と呼ばれる現象を引き起こすシナリオについてはほとんど研究されていない。したがって、本研究では、軽量で伝統的な学習者を用いた新しいフェデレーションおよび継続型アーキテクチャである、ライトフェデレーションおよび継続型コンセンサス(LFedCon2)を提示したい。当社の手法では,スマートフォンやロボットなどの無力デバイスが,ローカル,継続的,自律的,あるいはユーザからリアルタイムに学習することができると同時に,クラウド上でのモデルの改善も実現している。提案手法をスマートフォン利用者の異種コミュニティに適用し,歩行認識の課題を解決した。この結果は、LFedCon2が他の最先端メソッドに対してもたらす利点を示している。 Today we live in a context in which devices are increasingly interconnected and sensorized and are almost ubiquitous. In recent years, deep learning has become a popular way to extract knowledge from the huge amount of data that these devices are able to collect. Nevertheless, centralized state-of-the-art learning methods have a number of drawbacks when facing real distributed problems, in which the available information is usually private, partial, biased and evolving over time. Federated learning is a popular framework that allows multiple distributed devices to train models remotely and collaboratively while preserving data privacy. However, the current proposals in federated learning focus on deep architectures that in many cases are not feasible to implement in non-dedicated devices such as smartphones. Also, little research has been done regarding the scenario where data distribution changes over time in unforeseen ways, causing what is known as concept drift. 
Therefore, in this work we present Light Federated and Continual Consensus (LFedCon2), a new federated and continual architecture that uses light, traditional learners. Our method allows low-power devices (such as smartphones or robots) to learn in real time, locally, continuously, autonomously and from users, while also improving models globally, in the cloud, by combining what is learned locally on the devices. In order to test our proposal, we have applied it in a heterogeneous community of smartphone users to solve the problem of walking recognition. The results show the advantages that LFedCon2 provides with respect to other state-of-the-art methods.	翻訳日:2022-11-22 04:26:48 公開日:2021-01-12
# ニューラル非剛体トラッキング Neural Non-Rigid Tracking ( http://arxiv.org/abs/2006.13240v2 ) ライセンス: Link先を確認	Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Angela Dai, Justus Thies, Matthias Nießner	(参考訳) そこで我々は, 最先端の非剛性復元を, 頑健な最適化によって実現できる新しい, エンドツーエンドで学習可能な, 微分可能な非剛性トラッカーを提案する。非剛体移動物体の2つの入力RGB-Dフレームを考慮し、畳み込みニューラルネットワークを用いて、密度の高い対応とその信頼性を予測する。これらの対応は、ARAP(as-rigid-as-possible)最適化問題における制約として用いられる。重み付き非線形最小二乗解法によって勾配バックプロパゲーションを可能にすることにより,非剛性追跡のタスクに最適であるように,エンドツーエンドで対応と信頼度を学習することができる。この定式化の下では、自己教師あり学習(self-supervision)を通じて対応信頼度を学習し、学習された堅牢な最適化を通知する。最新の手法と比較して,提案アルゴリズムは再構築性能を向上し,同時に同等の深層学習ベースの手法よりも85倍高速な対応予測を実現する。コードを利用可能にします。 We introduce a novel, end-to-end learnable, differentiable non-rigid tracker that enables state-of-the-art non-rigid reconstruction by a learned robust optimization. Given two input RGB-D frames of a non-rigidly moving object, we employ a convolutional neural network to predict dense correspondences and their confidences. These correspondences are used as constraints in an as-rigid-as-possible (ARAP) optimization problem. By enabling gradient back-propagation through the weighted non-linear least squares solver, we are able to learn correspondences and confidences in an end-to-end manner such that they are optimal for the task of non-rigid tracking. Under this formulation, correspondence confidences can be learned via self-supervision, informing a learned robust optimization, where outliers and wrong correspondences are automatically down-weighted to enable effective tracking. Compared to state-of-the-art approaches, our algorithm shows improved reconstruction performance, while simultaneously achieving 85 times faster correspondence prediction than comparable deep-learning based methods. We make our code available.	翻訳日:2022-11-17 22:52:49 公開日:2021-01-12
# 機能拡張リワード学習 : 人間の入力を再考する Feature Expansive Reward Learning: Rethinking Human Input ( http://arxiv.org/abs/2006.13208v2 ) ライセンス: Link先を確認	Andreea Bobu, Marius Wiggert, Claire Tomlin, Anca D. Dragan	(参考訳) ロボットがタスクを実行する方法に満足していない場合、それを修正するために介入することができる。逆学習法により、ロボットは人間の入力に基づいて報酬関数をオンラインで適応できるが、手作りの機能に依存している。これらの特徴で補正が説明できない場合、deep inverse reinforcement learning (irl)の最近の研究は、ロボットがタスクのデモンストレーションを要求し、生の状態空間上で定義された報酬を回収できることを示唆している。私たちの洞察では、デモから欠けている機能について暗黙的に学ぶのではなく、ロボットは、何が欠けているかを明示的に教えるデータを求めるべきである。そこで我々は,ロボットが教えている特徴が表現されていない状態からロボットを誘導する新しいタイプの人間入力を紹介した。本稿では,生の状態空間から特徴を学習し,それを報酬関数に統合するアルゴリズムを提案する。人間の入力を欠落した特徴に焦点を合わせることで、サンプルの複雑さを減らし、上記の深いIRLベースラインに対する学習報酬の一般化を改善する。本研究は,7dofロボットマニピュレータを用いた実験や,シミュレーション環境でのユーザ実験で紹介する。 When a person is not satisfied with how a robot performs a task, they can intervene to correct it. Reward learning methods enable the robot to adapt its reward function online based on such human input, but they rely on handcrafted features. When the correction cannot be explained by these features, recent work in deep Inverse Reinforcement Learning (IRL) suggests that the robot could ask for task demonstrations and recover a reward defined over the raw state space. Our insight is that rather than implicitly learning about the missing feature(s) from demonstrations, the robot should instead ask for data that explicitly teaches it about what it is missing. We introduce a new type of human input in which the person guides the robot from states where the feature being taught is highly expressed to states where it is not. We propose an algorithm for learning the feature from the raw state space and integrating it into the reward function. By focusing the human input on the missing feature, our method decreases sample complexity and improves generalization of the learned reward over the above deep IRL baseline. We show this in experiments with a physical 7DOF robot manipulator, as well as in a user study conducted in a simulated environment.	
翻訳日:2022-11-17 21:41:11 公開日:2021-01-12
# ProVe -- ニューラルネットワークモデルに基づく自動製品置換とコールドスタートのためのセルフ教師付きパイプライン ProVe -- Self-supervised pipeline for automated product replacement and cold-starting based on neural language models ( http://arxiv.org/abs/2006.14994v2 ) ライセンス: Link先を確認	Andrei Ionut Damian, Laurentiu Piciu, Cosmin Mihai Marinescu	(参考訳) 小売業の垂直産業では、企業は新しい購買行動への迅速な理解と適応の人間の限界に対処している。さらに、小売業は、製品・ブランド・カテゴリの大量選択を適切に管理する人的制限を克服する必要がある。これらの制限は、商業的(セールスの損失、顧客満足度の低下など)と運用的視点(外産、過剰生産など)の両方から欠陥をもたらす。本稿では,自然言語理解に基づくパイプラインアプローチを提案し,アウトオブストックの製品に最適な代替品を推奨する。さらに,取引履歴がほとんどない小売業者のポートフォリオに新たに導入された製品を管理するためのソリューションを提案する。このソリューションはビジネスに役立つ – 新製品を適切なカテゴリに自動的に割り当てる; 1日目からクロス販売を補完する製品を推奨する; 取引履歴がほとんどなくても販売予測を行う。最後に,本論文で提示したパイプラインを適用したベクトル空間モデルを,ディープラーニングに基づく需要予測ソリューションのセマンティック情報として直接利用し,より正確な予測を行う。研究と実験のプロセスは、実際のプライベートなトランザクションデータを使用して行われたが、ソースコードはhttps://github.com/lummetry/proveで入手できる。 In retail vertical industries, businesses are dealing with human limitation of quickly understanding and adapting to new purchasing behaviors. Moreover, retail businesses need to overcome the human limitation of properly managing a massive selection of products/brands/categories. These limitations lead to deficiencies from both commercial (e.g. loss of sales, decrease in customer satisfaction) and operational perspective (e.g. out-of-stock, over-stock). In this paper, we propose a pipeline approach based on Natural Language Understanding, for recommending the most suitable replacements for products that are out-of-stock. Moreover, we will propose a solution for managing products that were newly introduced in a retailer's portfolio with almost no transactional history. This solution will help businesses: automatically assign the new products to the right category; recommend complementary products for cross-sell from day 1; perform sales predictions even with almost no transactional history. 
Finally, the vector space model resulting from applying the pipeline presented in this paper is directly used as semantic information in deep learning-based demand forecasting solutions, leading to more accurate predictions. The whole research and experimentation process has been done using real-life private transactional data; however, the source code is available at https://github.com/Lummetry/ProVe	翻訳日:2022-11-16 20:57:17 公開日:2021-01-12
# 全体はパーツを上回っていますか? AI説明が相補的チームパフォーマンスに及ぼす影響 Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance ( http://arxiv.org/abs/2006.14779v3 ) ライセンス: Link先を確認	Gagan Bansal, Tongshuang Wu, Joyce Zhou, Raymond Fok, Besmira Nushi, Ece Kamar, Marco Tulio Ribeiro, Daniel S. Weld	(参考訳) 多くの研究者は、AIが推奨を説明すると、意思決定タスクにおける人間とAIチームのパフォーマンスが改善することを示す研究で説明可能なAIを動機付けている。しかし、以前の研究では、AIが人間と最高のチームの両方を上回った場合にのみ、説明による改善が観察された。チームの正確さが人間かAIの作業ソロよりも高い場合、説明は相補的なパフォーマンスにつながるか? 我々は3つのデータセットで混合手法のユーザスタディを行い、人間に匹敵する精度のAIは、参加者がタスク(ある条件下で自身を説明する)を解決するのに役立ちます。 ai拡張による補足的な改善が見られたが、説明では改善されなかった。むしろ説明は、人間がその正確性に関わらず、AIの推奨を受け入れる可能性を高めた。我々の結果は、人間中心のAIに新しい課題をもたらす。AIへの適切な信頼を促す説明的アプローチを開発することは可能か? Many researchers motivate explainable AI with studies showing that human-AI team performance on decision-making tasks improves when the AI explains its recommendations. However, prior studies observed improvements from explanations only when the AI, alone, outperformed both the human and the best team. Can explanations help lead to complementary performance, where team accuracy is higher than either the human or the AI working solo? We conduct mixed-method user studies on three datasets, where an AI with accuracy comparable to humans helps participants solve a task (explaining itself in some conditions). While we observed complementary improvements from AI augmentation, they were not increased by explanations. Rather, explanations increased the chance that humans will accept the AI's recommendation, regardless of its correctness. Our result poses new challenges for human-centered AI: Can we develop explanatory approaches that encourage appropriate trust in AI, and therefore help generate (or improve) complementary performance?	翻訳日:2022-11-16 20:37:15 公開日:2021-01-12
# 距離測度空間におけるルベーグ点の最も近い近傍特性 A Nearest Neighbor Characterization of Lebesgue Points in Metric Measure Spaces ( http://arxiv.org/abs/2007.03937v4 ) ライセンス: Link先を確認	Tommaso Cesari (TSE), Roberto Colomboni (IIT)	(参考訳) ほぼ全ての点がルベーグ点である性質は、近接近傍に基づくいくつかの分類アルゴリズムの一貫性に不可欠であることが証明されている。我々は,1-Nearest Neighbor回帰アルゴリズムを用いて点推定を行い,対応する収束問題におけるタイブレーキング規則で果たす役割を具体化する。次に、結果を応用して、ほぼすべての点がルベーグ点である一般距離空間において、大きな1-Nearest Neighbor分類アルゴリズムのリスクの収束を証明した。 The property of almost every point being a Lebesgue point has proven to be crucial for the consistency of several classification algorithms based on nearest neighbors. We characterize Lebesgue points in terms of a 1-Nearest Neighbor regression algorithm for pointwise estimation, fleshing out the role played by tie-breaking rules in the corresponding convergence problem. We then give an application of our results, proving the convergence of the risk of a large class of 1-Nearest Neighbor classification algorithms in general metric spaces where almost every point is a Lebesgue point.	翻訳日:2022-11-12 12:47:32 公開日:2021-01-12
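The abstract above characterizes Lebesgue points through a 1-Nearest Neighbor regression estimate and stresses the role of tie-breaking rules. As a minimal sketch of that kind of estimator (not the authors' construction), the pure-Python function below predicts the response of the closest training point. The function name `one_nn_regress` and the two tie-breaking options (`"first"` and `"mean"`) are assumptions made for illustration, and the metric is plain Euclidean distance.

```python
import math


def one_nn_regress(train, query, tie_break="first"):
    """Predict the response at `query` from its nearest training point.

    train: list of (x, y) pairs, where x is a tuple of floats and y a float.
    tie_break: "first" keeps the earliest equidistant sample,
               "mean" averages the responses of all equidistant samples.
    Both rules are illustrative choices, not taken from the paper.
    """
    if not train:
        raise ValueError("training set is empty")
    dists = [math.dist(x, query) for x, _ in train]
    d_min = min(dists)
    # Collect the responses of every sample at the minimum distance (exact ties).
    nearest = [y for (_, y), d in zip(train, dists) if d == d_min]
    if tie_break == "first":
        return nearest[0]
    if tie_break == "mean":
        return sum(nearest) / len(nearest)
    raise ValueError(f"unknown tie_break rule: {tie_break!r}")


# The query 1.0 is equidistant from the samples at 0.0 and 2.0,
# so the two tie-breaking rules give different predictions.
samples = [((0.0,), 1.0), ((2.0,), 3.0), ((5.0,), 10.0)]
pred_first = one_nn_regress(samples, (1.0,), tie_break="first")
pred_mean = one_nn_regress(samples, (1.0,), tie_break="mean")
```

The toy query is chosen so that a tie actually occurs; on data drawn from a continuous distribution exact ties are rare, which is one reason the choice of rule matters mainly in general metric spaces.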
# 学習モデルにおける境界厚さと堅牢性 Boundary thickness and robustness in learning models ( http://arxiv.org/abs/2007.05086v2 ) ライセンス: Link先を確認	Yaoqing Yang, Rajiv Khanna, Yaodong Yu, Amir Gholami, Kurt Keutzer, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney	(参考訳) さまざまな敵対的および非敵対的腐敗に対する機械学習モデルの堅牢性は、依然として注目されている。本稿では,分類器の境界厚さの概念を導入し,その関係とモデルロバスト性の有用性について述べる。厚い決定境界は性能を向上し、薄い決定境界は過剰適合(例えば、トレーニングとテストの間の堅牢な一般化ギャップによって測定される)と低い堅牢性をもたらす。より厚い境界は、敵の例に対する堅牢性(例えば、敵の訓練の堅牢性テスト精度の向上)といわゆるアウト・オブ・ディストリビューション(OOD)変換の改善に役立つことを示す。理論的には、トレーニング中の境界の厚さを最大化することは、いわゆるミックスアップトレーニングに似ている。これらの結果から,混合訓練における雑音増強は境界厚さをさらに増加させ,様々な形態の敵攻撃やOOD変換に対する脆弱性に対処することを示した。また、最近の作業のいくつかの行のパフォーマンス改善は、より厚い境界と共に起こることも示せる。 Robustness of machine learning models to various adversarial and non-adversarial corruptions continues to be of interest. In this paper, we introduce the notion of the boundary thickness of a classifier, and we describe its connection with and usefulness for model robustness. Thick decision boundaries lead to improved performance, while thin decision boundaries lead to overfitting (e.g., measured by the robust generalization gap between training and testing) and lower robustness. We show that a thicker boundary helps improve robustness against adversarial examples (e.g., improving the robust test accuracy of adversarial training) as well as so-called out-of-distribution (OOD) transforms, and we show that many commonly-used regularization and data augmentation procedures can increase boundary thickness. On the theoretical side, we establish that maximizing boundary thickness during training is akin to the so-called mixup training. Using these observations, we show that noise-augmentation on mixup training further increases boundary thickness, thereby combating vulnerability to various forms of adversarial attacks and OOD transforms. We can also show that the performance improvement in several lines of recent work happens in conjunction with a thicker boundary.	翻訳日:2022-11-12 03:40:10 公開日:2021-01-12
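The abstract above relates maximizing boundary thickness to mixup training. For readers unfamiliar with mixup, the sketch below shows the standard form of the technique: a training pair is replaced by a convex combination of two samples and of their one-hot labels. This is plain mixup only, not the noise-augmented variant the paper proposes; the helper name `mixup_pair` and the default `alpha=1.0` are assumptions for illustration.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Return a convex combination of two samples and their one-hot labels.

    The mixing weight lam is drawn from Beta(alpha, alpha), as in standard
    mixup; alpha=1.0 is an illustrative default, not a value from the paper.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two toy 2-D points from different classes, mixed with a seeded generator.
rng = random.Random(0)
x_mix, y_mix, lam = mixup_pair([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], rng=rng)
```

The mixed label `y_mix` is a soft label whose entries sum to one, so a classifier trained on such pairs is pushed toward intermediate predictions between the two original points.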
# 深い相互作用で再重み付けを学ぶ Learning to Reweight with Deep Interactions ( http://arxiv.org/abs/2007.04649v2 ) ライセンス: Link先を確認	Yang Fan, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Jiang Bian, Tao Qin, Xiang-Yang Li	(参考訳) 近年,教師モデルを用いて,データ選択や損失関数設計などを通じて,学生モデル(実際のタスクで使用される)のトレーニングを指導する機械学習の概念が導入されている。教師モデルを用いてトレーニングデータを重み付けする特定の種類の授業であるリウェイトへの学習は、その単純さと有効性から多くの注目を集める。教師モデルでは,学習繰り返し数や学習モデルの損失/正確性などの浅面情報のみを学習/評価セットから活用するが,学習モデルの内部状態を無視し,学習結果の再重み付けの可能性を制限する。本研究では,教師モデルが教師モデルに内部状態を提供する改良データ重み付けアルゴリズムを提案し,教師モデルが学習サンプルの適応重み付けを返し,生徒モデルのトレーニングを強化する。教師モデルは、検証セットから伝播するメタ勾配を用いて、生徒モデルと共同で訓練される。クリーン/ノイズラベルとニューラルマシン翻訳を用いた画像分類実験により,従来の手法に比べて大きな改善が得られた。 Recently, the concept of teaching has been introduced into machine learning, in which a teacher model is used to guide the training of a student model (which will be used in real tasks) through data selection, loss function design, etc. Learning to reweight, which is a specific kind of teaching that reweights training data using a teacher model, receives much attention due to its simplicity and effectiveness. In existing learning to reweight works, the teacher model only utilizes shallow/surface information such as training iteration number and loss/accuracy of the student model from training/validation sets, but ignores the internal states of the student model, which limits the potential of learning to reweight. In this work, we propose an improved data reweighting algorithm, in which the student model provides its internal states to the teacher model, and the teacher model returns adaptive weights of training samples to enhance the training of the student model. The teacher model is jointly trained with the student model using meta gradients propagated from a validation set. Experiments on image classification with clean/noisy labels and neural machine translation empirically demonstrate that our algorithm makes significant improvement over previous methods.	翻訳日:2022-11-12 03:21:29 公開日:2021-01-12
# 非負行列分解のための逆アニーリング Reverse Annealing for Nonnegative/Binary Matrix Factorization ( http://arxiv.org/abs/2007.05565v2 ) ライセンス: Link先を確認	John Golden, Daniel O'Malley	(参考訳) 近年、量子アニールはある種の行列分解アルゴリズムにおいて有効で高速なサブルーチンとして用いられることが示されている。量子アニーリングアルゴリズムは、迅速で近似的な答えに最適だったが、性能は急速に高まった。本稿では,非負行列分解問題に対する量子アニーリングサブルーチンの前方アニーリングの代わりに逆アニーリングを利用する。フォワードアニーリングによる最初のグローバルサーチの後、リバースアニーリングは、既存のソリューションを洗練する一連のローカルサーチを実行する。前と逆の焼鈍の組み合わせは、最も短い実行時間を除いて、前と逆の焼鈍だけでは性能が著しく向上する。 It was recently shown that quantum annealing can be used as an effective, fast subroutine in certain types of matrix factorization algorithms. The quantum annealing algorithm performed best for quick, approximate answers, but performance rapidly plateaued. In this paper, we utilize reverse annealing instead of forward annealing in the quantum annealing subroutine for nonnegative/binary matrix factorization problems. After an initial global search with forward annealing, reverse annealing performs a series of local searches that refine existing solutions. The combination of forward and reverse annealing significantly improves performance compared to forward annealing alone for all but the shortest run times.	翻訳日:2022-11-11 21:41:14 公開日:2021-01-12
# 正しい理由: 画像分類を堅牢にすること Right for the Right Reason: Making Image Classification Robust ( http://arxiv.org/abs/2007.11924v2 ) ライセンス: Link先を確認	Anna Nguyen, Adrian Oberföll, Michael Färber	(参考訳) 画像データの分類における畳み込みニューラルネットワーク(CNN)の有効性を徹底的に実証した。人間への分類を説明するため,近年,分類証拠を可視化する手法が開発されている。これらの説明は、時々画像は正しく分類されるが、間違った理由、すなわち偶然の証拠に基づいて、正しく分類される。もちろん、画像が正しい理由、すなわち実際の証拠に基づいて正しく分類されることは望ましい。そこで本稿では,画像分類におけるオブジェクト整列説明量を測定するための新しい説明品質指標を提案する。オブジェクト検出手法、説明手法、ObAlExを用いて、実際の証拠に対するCNNの焦点を定量化する。さらに,CNNのさらなるトレーニングにより,CNNの精度を低下させることなく,CNNの焦点を向上できることを示す。 The effectiveness of Convolutional Neural Networks (CNNs) in classifying image data has been thoroughly demonstrated. In order to explain the classification to humans, methods for visualizing classification evidence have been developed in recent years. These explanations reveal that sometimes images are classified correctly, but for the wrong reasons, i.e., based on incidental evidence. Of course, it is desirable that images are classified correctly for the right reasons, i.e., based on the actual evidence. To this end, we propose a new explanation quality metric to measure object-aligned explanation in image classification, which we refer to as the ObAlEx metric. Using object detection approaches, explanation approaches, and ObAlEx, we quantify the focus of CNNs on the actual evidence. Moreover, we show that additional training of the CNNs can improve the focus of CNNs without decreasing their accuracy.	翻訳日:2022-11-07 12:03:31 公開日:2021-01-12
# 文文脈が単語意味に及ぼす影響を特徴づける:脳を行動にマッピングする Characterizing the Effect of Sentence Context on Word Meanings: Mapping Brain to Behavior ( http://arxiv.org/abs/2007.13840v3 ) ライセンス: Link先を確認	N. Aguirre-Celis and R. Miikkulainen	(参考訳) 意味的特徴モデルはfMRIデータの予測と解釈に人気がある。特に、先行研究により、文読解におけるfMRIパターンの違いは、単語の意味的特徴表現における文脈依存的な変化によって説明できることが示されている。しかし、これらの変化を認識し、それに同意するかどうかは、明らかに疑問視されている。本論文は,人間-対象研究を通じてこの問題に答えることを目的とする。対象者は、特定の文で単語が使われたとき、単語が属する意味からどのように変化するか判断するよう求められた。判断は、偶然よりもはるかに高いモデル予測と一致した。その結果,単語の意味は文の文脈によって体系的に変化するという仮説を支持した。 Semantic feature models have become a popular tool for prediction and interpretation of fMRI data. In particular, prior work has shown that differences in the fMRI patterns in sentence reading can be explained by context-dependent changes in the semantic feature representations of the words. However, whether the subjects are aware of such changes and agree with them has been an open question. This paper aims to answer this question through a human-subject study. Subjects were asked to judge how the words change from their generic meaning when the words were used in specific sentences. The judgements were consistent with the model predictions well above chance. Thus, the results support the hypothesis that word meanings change systematically depending on sentence context.	翻訳日:2022-11-06 07:33:57 公開日:2021-01-12
# マルチスケール特徴抽出による高効率膵分節化 Efficient, high-performance pancreatic segmentation using multi-scale feature extraction ( http://arxiv.org/abs/2009.00872v2 ) ライセンス: Link先を確認	Moritz Knolle (1 and 2), Georgios Kaissis (1 and 2 and 3 and 4), Friederike Jungmann (1), Sebastian Ziegelmayer (1), Daniel Sasse (1), Marcus Makowski (1), Daniel Rueckert (2 and 4), Rickmer Braren (1) ((1) Department of diagnostic and interventional Radiology, Technical University of Munich, Munich, Germany, (2) Institute for Artificial Intelligence and Data Science in Medicine and Healthcare, Technical University of Munich, Munich, Germany, (3) OpenMined Research, (4) Department of Computing, Imperial College London, London, United Kingdom)	(参考訳) 人工知能を用いた画像解析手法を臨床応用するには, 高性能アルゴリズムの開発が不可欠である。例えば、自然画像に基づく既存のセグメンテーションアルゴリズムは、パラメータの使用において効率的でなく、医用画像に最適化されていない。本稿では,高効率なマルチスケール画像特徴量利用による高性能化に焦点をあてた,高度に最適化されたニューラルネットワークに基づく膵分画アルゴリズムであるmonetを提案する。 For artificial intelligence-based image analysis methods to reach clinical applicability, the development of high-performance algorithms is crucial. For example, existent segmentation algorithms based on natural images are neither efficient in their parameter use nor optimized for medical imaging. Here we present MoNet, a highly optimized neural-network-based pancreatic segmentation algorithm focused on achieving high performance by efficient multi-scale image feature utilization.	翻訳日:2022-10-22 19:02:44 公開日:2021-01-12
# 大規模マルチタスク言語理解の測定 Measuring Massive Multitask Language Understanding ( http://arxiv.org/abs/2009.03300v3 ) ライセンス: Link先を確認	Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt	(参考訳) テキストモデルのマルチタスク精度を測定するための新しいテストを提案する。このテストは、初等数学、アメリカ史、コンピュータ科学、法律など、57のタスクをカバーする。このテストで高い精度を達成するためには、モデルは広範な世界知識と問題解決能力を持つ必要がある。近年のモデルではほぼランダム率の精度が高いが、最大のgpt-3モデルは平均で20ポイント近い確率でランダム確率を改善できることがわかった。しかし、57タスクのすべてにおいて、最高のモデルには、専門家レベルの精度に到達する前に、かなりの改善が必要である。モデルは性能も劣悪であり、いつ間違っているか分からないことが多い。さらに悪いことに、道徳や法のような社会的に重要な主題について、いまだにほぼランダムな正確さを持っている。モデルの学術的および専門的な理解の幅と深さを包括的に評価することにより、我々のテストは、多くのタスクにわたるモデルを分析し、重要な欠点を特定するのに使用できる。 We propose a new test to measure a text model's multitask accuracy. The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more. To attain high accuracy on this test, models must possess extensive world knowledge and problem solving ability. We find that while most recent models have near random-chance accuracy, the very largest GPT-3 model improves over random chance by almost 20 percentage points on average. However, on every one of the 57 tasks, the best models still need substantial improvements before they can reach expert-level accuracy. Models also have lopsided performance and frequently do not know when they are wrong. Worse, they still have near-random accuracy on some socially important subjects such as morality and law. By comprehensively evaluating the breadth and depth of a model's academic and professional understanding, our test can be used to analyze models across many tasks and to identify important shortcomings.	翻訳日:2022-10-21 02:04:21 公開日:2021-01-12
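The abstract above reports results as accuracy relative to random chance, averaged over tasks. The sketch below shows one way such a margin could be computed. It assumes four answer options per question (so chance is 25%) and a per-task (macro) average; neither detail is stated in the abstract, and the correctness flags are invented toy values, not results from the paper.

```python
def macro_accuracy(per_task_results):
    """Average the per-task accuracies so that every task counts equally."""
    accs = [sum(flags) / len(flags) for flags in per_task_results.values()]
    return sum(accs) / len(accs)


NUM_CHOICES = 4  # assumption: four answer options per question
chance = 1.0 / NUM_CHOICES

# Toy correctness flags (1 = correct), invented for illustration only.
toy_results = {
    "task_a": [1, 0, 1, 1],
    "task_b": [0, 0, 1, 0],
}
score = macro_accuracy(toy_results)
# Margin over random chance, expressed in percentage points.
margin_points = 100.0 * (score - chance)
```

A per-question (micro) average would weight large tasks more heavily; which of the two the benchmark uses should be checked against the paper before comparing numbers.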
# censored, quantized, and generalized group admmを用いたコミュニケーション効率の高い分散学習 Communication Efficient Distributed Learning with Censored, Quantized, and Generalized Group ADMM ( http://arxiv.org/abs/2009.06459v2 ) ライセンス: Link先を確認	Chaouki Ben Issaid, Anis Elgabli, Jihong Park, Mehdi Bennis, Mérouane Debbah	(参考訳) 本稿では,相互接続作業者のネットワーク上で定義されたコンセンサス最適化問題を解決する,コミュニケーション効率のよい分散機械学習フレームワークを提案する。提案するアルゴリズムであるCensored and Quantized Generalized GADMM(CQ-GGADMM)は,グループ交代方向乗算器(GADMM)の作業者グループ化と分散学習の考え方を活用し,その適用性を一般化ネットワークトポロジに拡張し,量子化後の無視可能な更新のためのリンク検閲を取り入れた通信効率のフロンティアを推し進める。我々はCQ-GGADMMが局所目的関数がいくつかの軽度仮定の下で強く凸であるときに線形収束率を達成することを理論的に証明する。数値シミュレーションにより、cq-ggadmmは、検閲された分散admmとgadmmのワーカーグループ化法と比較して、通信ラウンド数で通信効率が高く、精度と収束速度を損なうことなく、エネルギー消費を伝達する。 In this paper, we propose a communication-efficient decentralized machine learning framework that solves a consensus optimization problem defined over a network of inter-connected workers. The proposed algorithm, Censored and Quantized Generalized GADMM (CQ-GGADMM), leverages the worker grouping and decentralized learning ideas of Group Alternating Direction Method of Multipliers (GADMM), and pushes the frontier in communication efficiency by extending its applicability to generalized network topologies, while incorporating link censoring for negligible updates after quantization. We theoretically prove that CQ-GGADMM achieves a linear convergence rate when the local objective functions are strongly convex under some mild assumptions. Numerical simulations corroborate that CQ-GGADMM exhibits higher communication efficiency in terms of the number of communication rounds and transmit energy consumption without compromising the accuracy and convergence speed, compared to the censored decentralized ADMM, and the worker grouping method of GADMM.	翻訳日:2022-10-18 12:06:57 公開日:2021-01-12
# ディープラーニングを用いたソースコードの自動生成と自動補完:現在の言語モデルに基づくアプローチの比較と検討 Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches ( http://arxiv.org/abs/2009.07740v4 ) ライセンス: Link先を確認	Juan Cruz-Benito, Sanjay Vishwakarma, Francisco Martin-Fernandez, Ismael Faro	(参考訳) 近年,言語モデルにおけるディープラーニングの利用が注目されている。いくつかの研究プロジェクトは、人間が書くと解釈できるテキストを生成することができ、多くのアプリケーション領域で新しい可能性を可能にすると主張している。言語処理に関連するさまざまな分野のうち、この種のモデリングを適用する際に最も注目すべきはプログラミング言語である。機械学習コミュニティは長年にわたり、このソフトウェアエンジニアリング領域を調査し、人間がプログラムするコードの自動補完、生成、修正、評価など、さまざまなアプローチの適用といった目標を追求してきた。 Deep-Learning対応の言語モデルアプローチの人気が高まっていることを踏まえ、異なるディープラーニングアーキテクチャを比較して、プログラミングコードに基づいて言語モデルを作成し、使用する経験的な論文が不足していることを発見した。本稿では,awd-lstms,awd-qrnns,transformerなどのニューラルネットワークアーキテクチャを比較し,トランスファーラーニングとトークン化を用いて,pythonデータセットを用いたコード生成とマスクタスクの補完を行う。結果を踏まえて,それぞれのアプローチの強みと弱み,言語モデルの評価や実際のプログラミングコンテキストに適用する上でのギャップについて検討する。 In recent years, the use of deep learning in language models gained much attention. Some research projects claim that they can generate text that can be interpreted as human-writing, enabling new possibilities in many application areas. Among the different areas related to language processing, one of the most notable in applying this type of modeling is programming languages. For years, the Machine Learning community has been researching this software engineering area, pursuing goals like applying different approaches to auto-complete, generate, fix, or evaluate code programmed by humans. Considering the increasing popularity of the Deep-Learning-enabled language models approach, we detected a lack of empirical papers that compare different deep learning architectures to create and use language models based on programming code. 
This paper compares different neural network architectures like AWD-LSTMs, AWD-QRNNs, and Transformer while using transfer learning and different tokenizations to see how they behave in building language models using a Python dataset for code generation and filling mask tasks. Considering the results, we discuss each approach's different strengths and weaknesses and what gaps we find to evaluate the language models or apply them in a real programming context.	翻訳日:2022-10-17 23:28:29 公開日:2021-01-12
# ニューロシンボリック神経変性疾患の確率論的プログラムによるモデル化 Neuro-symbolic Neurodegenerative Disease Modeling as Probabilistic Programmed Deep Kernels ( http://arxiv.org/abs/2009.07738v3 ) ライセンス: Link先を確認	Alexander Lavin	(参考訳) 神経変性疾患のパーソナライズされた予測モデリングのための,確率的プログラムによる深層学習手法を提案する。分析は、予測性能と、解釈可能性、不確実性推論、データ効率、ドメイン知識の活用といった重要な医療ai特性を評価する、ニューラルネットワークとシンボリック機械学習のアプローチのスペクトルを考察する。我々のベイズ的アプローチは、ガウス的プロセスの柔軟性とニューラルネットワークの構造的パワーを組み合わせてバイオマーカーの進行をモデル化する。アルツハイマー病予測の問題点について評価を行い、神経変性予測の正確性と時系列性の両方においてディープラーニングを上回り、ベイズ型非パラメトリックスと確率的プログラミングの実用的利点を生かした。 We present a probabilistic programmed deep kernel learning approach to personalized, predictive modeling of neurodegenerative diseases. Our analysis considers a spectrum of neural and symbolic machine learning approaches, which we assess for predictive performance and important medical AI properties such as interpretability, uncertainty reasoning, data-efficiency, and leveraging domain knowledge. Our Bayesian approach combines the flexibility of Gaussian processes with the structural power of neural networks to model biomarker progressions, without needing clinical labels for training. We run evaluations on the problem of Alzheimer's disease prediction, yielding results that surpass deep learning in both accuracy and timeliness of predicting neurodegeneration, and with the practical advantages of Bayesian nonparametrics and probabilistic programming.	翻訳日:2022-10-17 23:20:44 公開日:2021-01-12
# 経験的利得最大化による学習の枠組み A Framework of Learning Through Empirical Gain Maximization ( http://arxiv.org/abs/2009.14250v2 ) ライセンス: Link先を確認	Yunlong Feng and Qiang Wu	(参考訳) 本稿では,重み付き雑音や外れ値が応答変数に現れるような頑健な回帰問題に対処するために,経験的ゲイン最大化(EGM)の枠組みを開発する。 EGMの考え方は、通常のように真理関数を直接近似するのではなく、雑音分布の密度関数を近似することである。全ての観測を同等に重要視し、異常観測の存在下で問題となる古典的な最大度推定とは異なり、egmスキームは最小距離推定の観点から解釈でき、これらの観測を無知にすることができる。さらに,いくつかのロバストな非凸回帰パラダイム(例えば,タキー回帰や断続最小二乗回帰)を新しいフレームワークに再構成できることが示されている。そこで我々は,これら十分に確立されているが,完全には理解されていない回帰アプローチに対して,統一的な解析を行うことにより,EMGの学習理論を開発する。新しい枠組みから、既存の有界非凸損失関数の新たな解釈を結論付けることができる。この新しい枠組みでは、ロバスト回帰のための有名なテューキーの双重損失と非パラメトリックスムージングのための三重重項の2つの用語が密接に関連している。より正確には、タキーの双重損失は三重項カーネルから導出できることが示されている。同様に、切り詰められた正方形損失、ゲマン・マククリール損失、指数的正方形損失といった機械学習における有界な非凸損失関数は、統計学においてある滑らかなカーネルから再構成することもできる。さらに,新しいフレームワークにより,ロバスト学習のための境界非凸損失関数の考案が可能となった。 We develop in this paper a framework of empirical gain maximization (EGM) to address the robust regression problem where heavy-tailed noise or outliers may be present in the response variable. The idea of EGM is to approximate the density function of the noise distribution instead of approximating the truth function directly as usual. Unlike the classical maximum likelihood estimation that encourages equal importance of all observations and could be problematic in the presence of abnormal observations, EGM schemes can be interpreted from a minimum distance estimation viewpoint and allow the ignorance of those observations. Furthermore, it is shown that several well-known robust nonconvex regression paradigms, such as Tukey regression and truncated least squares regression, can be reformulated into this new framework. We then develop a learning theory for EGM, by means of which a unified analysis can be conducted for these well-established but not fully-understood regression approaches. Resulting from the new framework, a novel interpretation of existing bounded nonconvex loss functions can be concluded. 
Within this new framework, the two seemingly irrelevant terminologies, the well-known Tukey's biweight loss for robust regression and the triweight kernel for nonparametric smoothing, are closely related. More precisely, it is shown that the Tukey's biweight loss can be derived from the triweight kernel. Similarly, other frequently employed bounded nonconvex loss functions in machine learning such as the truncated square loss, the Geman-McClure loss, and the exponential squared loss can also be reformulated from certain smoothing kernels in statistics. In addition, the new framework enables us to devise new bounded nonconvex loss functions for robust learning.	翻訳日:2022-10-13 05:34:17 公開日:2021-01-12
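The abstract above states that Tukey's biweight loss can be derived from the triweight kernel. The sketch below checks one elementary form of that link numerically, using the textbook definitions of both functions: the loss equals a constant minus a rescaled triweight kernel. This is a simple algebraic identity, offered as an illustration; it is not claimed to be the derivation used in the paper, and the function names and the scale value `c=2.0` are assumptions.

```python
def triweight_kernel(u):
    """Triweight smoothing kernel: (35/32) * (1 - u^2)^3 on |u| <= 1, else 0."""
    return (35.0 / 32.0) * (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 0.0


def tukey_biweight_loss(t, c=1.0):
    """Tukey's biweight loss: (c^2/6) * (1 - (1 - (t/c)^2)^3) on |t| <= c, else c^2/6."""
    if abs(t) > c:
        return c * c / 6.0
    return (c * c / 6.0) * (1.0 - (1.0 - (t / c) ** 2) ** 3)


def loss_from_kernel(t, c=1.0):
    """The same loss written as a constant minus a rescaled triweight kernel."""
    return (c * c / 6.0) * (1.0 - (32.0 / 35.0) * triweight_kernel(t / c))


# Compare the two expressions on residuals inside and outside the cutoff c.
residuals = [-3.0, -1.0, -0.5, 0.0, 0.25, 1.0, 2.0]
max_gap = max(
    abs(tukey_biweight_loss(t, 2.0) - loss_from_kernel(t, 2.0)) for t in residuals
)
```

Because the kernel vanishes outside its support, the loss is capped at `c * c / 6.0` for large residuals, which is the boundedness that makes it robust to outliers.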
# クロスコネクテッド$\Psi$-Netによる磁化率強調画像の位相からの定量的磁化率マップの再構成 Reconstruction of Quantitative Susceptibility Maps from Phase of Susceptibility Weighted Imaging with Cross-Connected $\Psi$-Net ( http://arxiv.org/abs/2010.05395v3 ) ライセンス: Link先を確認	Zhiyang Lu, Jun Li, Zheng Li, Hongjian He, Jun Shi	(参考訳) 定量的磁化率マッピング(QSM)は、磁化率を定量化する新しい位相ベースの手法である。既存のQSM再構成法は、通常、高品質の位相データに対する複雑な前処理を必要とする。本研究では,磁化率強調画像(SWI)で生成されるハイパスフィルタ処理済み位相データの新たな価値を探り,追加の前処理なしにSWIのこれらの位相データから直接QSMを再構成するエンドツーエンドのクロスコネクテッド$\Psi$-Net(C$\Psi$-Net)を開発する。 C$\Psi$-Net は古典的な U-Net に中間ブランチを加えて$\Psi$型の構造を形成する。特別に設計された拡張(dilated)相互作用ブロックは、このブランチの各レベルに埋め込まれ、位相画像のより広い空間範囲からより多くの磁化率情報を取得するために受容野を拡大する。さらに、ブランチ間にクロスコネクションを設けてマルチ解像度の特徴融合スキームを実装し、C$\Psi$-Netが正確な再構成のための豊富な文脈情報を捉えられるようにしている。ヒトのデータセットにおける実験結果は、C$\Psi$-Netが本タスクにおいて他のQSM再構成アルゴリズムよりも優れた性能を達成することを示している。 Quantitative Susceptibility Mapping (QSM) is a new phase-based technique for quantifying magnetic susceptibility. The existing QSM reconstruction methods generally require complicated pre-processing on high-quality phase data. In this work, we propose to explore a new value of the high-pass filtered phase data generated in susceptibility weighted imaging (SWI), and develop an end-to-end Cross-connected $\Psi$-Net (C$\Psi$-Net) to reconstruct QSM directly from these phase data in SWI without additional pre-processing. C$\Psi$-Net adds an intermediate branch in the classical U-Net to form a $\Psi$-like structure. The specially designed dilated interaction block is embedded in each level of this branch to enlarge the receptive fields for capturing more susceptibility information from a wider spatial range of phase images. Moreover, the crossed connections are utilized between branches to implement a multi-resolution feature fusion scheme, which helps C$\Psi$-Net capture rich contextual information for accurate reconstruction. The experimental results on a human dataset show that C$\Psi$-Net achieves superior performance in our task over other QSM reconstruction algorithms.	翻訳日:2022-10-08 08:10:50 公開日:2021-01-12
# k-simplex2vec: node2vecの単純拡張 k-simplex2vec: a simplicial extension of node2vec ( http://arxiv.org/abs/2010.05636v2 ) ライセンス: Link先を確認	Celia Hacker	(参考訳) 本稿では, ユークリッド特徴を単純な複合体に関連付ける新しい手法を提案し, 統計的および機械学習ツールの入力として利用する方法を提案する。この方法は、ノード2vecアルゴリズムを高次元の単純化に拡張し、単純複素体の構造やグラフ内の高次相互作用に関する洞察を与える。 We present a novel method of associating Euclidean features to simplicial complexes, providing a way to use them as input to statistical and machine learning tools. This method extends the node2vec algorithm to simplices of higher dimensions, providing insight into the structure of a simplicial complex, or into the higher-order interactions in a graph.	翻訳日:2022-10-08 08:02:01 公開日:2021-01-12
# 機械学習力場 Machine Learning Force Fields ( http://arxiv.org/abs/2010.07067v2 ) ライセンス: Link先を確認	Oliver T. Unke, Stefan Chmiela, Huziel E. Sauceda, Michael Gastegger, Igor Poltavsky, Kristof T. Sch\"utt, Alexandre Tkatchenko, Klaus-Robert M\"uller	(参考訳) 近年,計算化学における機械学習 (ML) の利用は,従来の電子構造手法の計算複雑性により,これまでも多くの進歩を遂げてきた。最も有望な応用の1つは、ab initio法の精度と古典的なFFの効率のギャップを狭めることを目的としたMLベースの力場(FF)の構築である。鍵となるアイデアは、化学結合や関連する相互作用に関する知識の先入観に頼らずに、化学構造とポテンシャルエネルギーの間の統計的関係を学ぶことである。このような普遍的なML近似は、原則として訓練に使用される参照データの品質と量によってのみ制限される。本稿ではML-FFの応用の概要とそれらから得られる化学的知見を紹介する。 ML-FFの基礎となる概念は詳細に説明され、それらをスクラッチから構築およびテストするためのステップバイステップガイドが提供される。このテキストは、次世代のML-FFによって克服されるであろう課題に関する議論で締めくくられている。 In recent years, the use of Machine Learning (ML) in computational chemistry has enabled numerous advances previously out of reach due to the computational complexity of traditional electronic-structure methods. One of the most promising applications is the construction of ML-based force fields (FFs), with the aim to narrow the gap between the accuracy of ab initio methods and the efficiency of classical FFs. The key idea is to learn the statistical relation between chemical structure and potential energy without relying on a preconceived notion of fixed chemical bonds or knowledge about the relevant interactions. Such universal ML approximations are in principle only limited by the quality and quantity of the reference data used to train them. This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them. The core concepts underlying ML-FFs are described in detail and a step-by-step guide for constructing and testing them from scratch is given. The text concludes with a discussion of the challenges that remain to be overcome by the next generation of ML-FFs.	翻訳日:2022-10-07 14:03:01 公開日:2021-01-12
# 衛星周波数計画設計における深部強化学習の適用性と課題 Applicability and Challenges of Deep Reinforcement Learning for Satellite Frequency Plan Design ( http://arxiv.org/abs/2010.08015v2 ) ライセンス: Link先を確認	Juan Jose Garau Luis, Edward Crawley and Bruce Cameron	(参考訳) 深層強化学習(DRL)モデルの研究とベンチマークは、航空宇宙工学や通信を含む多くの産業でトレンドとなっている。これらの分野での最近の研究は、古典的アプローチが時間要件を満たしていない、あるいは最適解を得ることができない、複雑なリアルタイム意思決定問題に対処するこの種のモデルを提案する。 DRLモデルの優れた性能は特定のユースケースやシナリオに対して証明されているが、ほとんどの研究は実際の運用においてそのようなモデルの妥協や一般化可能性について論じていない。本稿では,DRLモデルの異なる要素のトレードオフと,それらが最終性能に与える影響について検討する。そこで我々は、マルチビーム衛星コンステレーションをユースケースとして、周波数計画設計(FPD)問題を選択し、それに対処するためのDRLモデルを提案する。ポリシ,ポリシオプティマイザ,状態,アクション,報酬表現,トレーニング環境という,パフォーマンスに大きな影響を与える6つのコア要素を特定した。これらの要素ごとに異なる選択肢を分析し、その効果を特徴づける。また、異なるシナリオを考慮に入れたり、環境を非定常にしたりするために、複数の環境も利用しています。以上の結果から,DRLは実業務におけるFPD問題,特に意思決定の高速化に対処する潜在的手法である可能性が示唆された。しかし、すべてのシナリオでDRLモデルが他のモデルよりも優れており、6つのコア要素のそれぞれに最適なアプローチは、運用環境の特徴に依存している。航空産業における将来的な複雑な問題を解決するためのDRLの可能性について合意する一方で、適切なモデルや訓練手順を設計することの重要性、それらのモデルの適用性を理解し、主な性能トレードオフを報告することについても考察する。 The study and benchmarking of Deep Reinforcement Learning (DRL) models has become a trend in many industries, including aerospace engineering and communications. Recent studies in these fields propose these kinds of models to address certain complex real-time decision-making problems in which classic approaches do not meet time requirements or fail to obtain optimal solutions. While the good performance of DRL models has been proved for specific use cases or scenarios, most studies do not discuss the compromises and generalizability of such models during real operations. In this paper we explore the tradeoffs of different elements of DRL models and how they might impact the final performance. To that end, we choose the Frequency Plan Design (FPD) problem in the context of multibeam satellite constellations as our use case and propose a DRL model to address it. 
We identify 6 different core elements that have a major effect in its performance: the policy, the policy optimizer, the state, action, and reward representations, and the training environment. We analyze different alternatives for each of these elements and characterize their effect. We also use multiple environments to account for different scenarios in which we vary the dimensionality or make the environment nonstationary. Our findings show that DRL is a potential method to address the FPD problem in real operations, especially because of its speed in decision-making. However, no single DRL model is able to outperform the rest in all scenarios, and the best approach for each of the 6 core elements depends on the features of the operation environment. While we agree on the potential of DRL to solve future complex problems in the aerospace industry, we also reflect on the importance of designing appropriate models and training procedures, understanding the applicability of such models, and reporting the main performance tradeoffs.	翻訳日:2022-10-07 04:44:07 公開日:2021-01-12
# 球面知識蒸留による教師・生徒間ギャップの低減 Reducing the Teacher-Student Gap via Spherical Knowledge Disitllation ( http://arxiv.org/abs/2010.07485v5 ) ライセンス: Link先を確認	Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai	(参考訳) 知識蒸留は、はるかに大きなモデルからマッピング関数を学習することで、コンパクトで効果的なモデルを得ることを目的としている。生徒モデルの容量が限られているため、生徒は教師に対してアンダーフィットする。そのため、過大な教師から蒸留すると生徒の性能が予想外に低下する。これは容量ギャップ問題と呼ばれる。本研究では,教師と生徒の確信度(confidence)のギャップを調べることでこの問題を検討する。確信度の大きさは知識蒸留に必要ではなく,生徒に確信度の学習を強いると生徒の性能を損なう可能性があることを見出した。我々は,このギャップを明示的に解消しアンダーフィットの問題を緩和する球面知識蒸留(Spherical Knowledge Distillation)を提案する。この新しい知識表現は、はるかに大きな教師を用いてコンパクトなモデルを改善でき、温度に対して頑健である。 CIFAR100とImageNetの両方で実験を行い,大幅な改善を達成した。具体的には、ResNet18を精度73.0まで訓練した。これは従来のSOTAを大きく上回り、生徒の約2倍のサイズであるResNet34と同等である。実装はhttps://github.com/forjiuzhou/Spherical-Knowledge-Distillationで共有されている。 Knowledge distillation aims at obtaining a compact and effective model by learning the mapping function from a much larger one. Due to the limited capacity of the student, the student would underfit the teacher. Therefore, student performance would unexpectedly drop when distilling from an oversized teacher, termed the capacity gap problem. We investigate this problem by studying the gap of confidence between teacher and student. We find that the magnitude of confidence is not necessary for knowledge distillation and could harm the student performance if the student is forced to learn confidence. We propose Spherical Knowledge Distillation to eliminate this gap explicitly, which eases the underfitting problem. We find this novel knowledge representation can improve compact models with much larger teachers and is robust to temperature. We conducted experiments on both CIFAR100 and ImageNet, and achieved significant improvement. Specifically, we train ResNet18 to 73.0 accuracy, which is a substantial improvement over previous SOTA and is on par with ResNet34, which is almost twice the student size. The implementation has been shared at https://github.com/forjiuzhou/Spherical-Knowledge-Distillation.	翻訳日:2022-10-07 03:24:59 公開日:2021-01-12
# 自然言語が情報と処理手順の両方を符号化するという考えに基づく自然言語理解の新しいアプローチ New Approaches for Natural Language Understanding based on the Idea that Natural Language encodes both Information and its Processing Procedures ( http://arxiv.org/abs/2010.12789v3 ) ライセンス: Link先を確認	Limin Zhang	(参考訳) 自然言語は情報エンコーディングの方法であり、情報だけでなく、情報がどのように処理されるかの手順もエンコードする必要がある。自然言語を理解するには、コンピュータ言語を想像し設計するのと同じで、最初のステップは情報(またはデータ)と情報(またはデータ)の処理手順を分離することである。自然言語では、データの処理手順を構造チャンクとポインタチャンクとして直接符号化する(この論文では、語彙チャンクをデータチャンク、構造チャンク、ポインタチャンクに再分類する)。データ部分については属性情報の分類符号化システムと情報編成アーキテクチャ(情報集合の構成構造と情報集合の階層構造を含む)について論じた。第2節では、第2節で詳述した理論的な部分が実例で検証され、本論文の研究では、機械が対話で伝達される情報を理解することを目標としている。第4節では、"Understanding"の基本条件を要約し、"Understanding"とは何か、どのように進むべきかを再考する。本研究では,NLUの実用的,理論的基礎および研究手法について述べる。また、人工知能(AI)領域における大規模かつ多種類の情報処理にも適用することができる。 We must recognize that natural language is a way of information encoding, and it encodes not only the information but also the procedures for how information is processed. To understand natural language, the same as we conceive and design computer languages, the first step is to separate information (or data) and the processing procedures of information (or data). In natural language, some processing procedures of data are encoded directly as the structure chunk and the pointer chunk (this paper has reclassified lexical chunks as the data chunk, structure chunk, and the pointer chunk); some processing procedures of data imply in sentences structures; some requests of processing procedures are expressed by information senders and processed by information receivers. For the data parts, the classification encoding system of attribute information and the information organization architecture (including constitutional structures of information sets and the hierarchy between the information sets) were discussed. 
In section 2, the theoretical part elaborated in section 2 is verified with examples, which show that the studies in this paper achieve the goal of enabling machines to understand the information conveyed in a dialogue. In section 4, the author summarizes the basic conditions of "Understanding" and rethinks what "Understanding" is and how to proceed. The study in this paper provides a practical and theoretical basis and research methods for NLU. It can also be applied to large-scale and multi-type information processing in the artificial intelligence (AI) area.	翻訳日:2022-10-03 12:52:20 公開日:2021-01-12
# 深層学習ニューラルネットワークを用いた放射線治療線量予測における性能と不確実性推定に関するモンテカルロドロップアウトとブートストラップ集約の比較 A comparison of Monte Carlo dropout and bootstrap aggregation on the performance and uncertainty estimation in radiation therapy dose prediction with deep learning neural networks ( http://arxiv.org/abs/2011.00388v2 ) ライセンス: Link先を確認	Dan Nguyen, Azar Sadeghnejad Barkousaraie, Gyanendra Bohara, Anjali Balagopal, Rafe McBeth, Mu-Han Lin, Steve Jiang	(参考訳) 近年,人工知能の技術とアルゴリズムは放射線治療の治療計画の進歩における主要な焦点となっている。これらが臨床ワークフローに取り入れられ始めるにつれ、臨床医の大きな懸念は、モデルが正確かどうかではなく、答えが正しいかどうか分からないときにモデルがそれを人間のオペレータに伝えられるかどうかである。深層学習モデルにモンテカルロドロップアウト(MCDO)とブートストラップ集約(バギング)を用いて、放射線治療の線量予測に対する不確実性推定を行うことを提案する。両モデルとも妥当な不確実性マップを生成でき、提案するスケーリング手法により、予測および関連する指標に対する解釈可能な不確実性と境界を生成できることを示す。性能面では,バギングは本研究で調べた指標の大半で損失値と誤差を統計的に有意に低減する。バギングの追加により,ベースラインフレームワークと比較して、平均でDmeanでは0.34%,Dmaxでは0.19%さらに誤差を低減できた。全体として、バギングフレームワークのMAEは2.62で、ベースラインフレームワークのMAE 2.87よりも有意に低かった。性能の観点のみから見たバギングの有用性は、問題と許容できる予測誤差に大きく依存し、その使用が有利かどうかを判断する際には、訓練時の高い事前計算コストを考慮すべきである。不確実性推定を有効にしたデプロイでは、どちらのフレームワークも実行時間は約12秒で同等である。アンサンブルベースのメタヒューリスティックであるバギングは、既存の機械学習アーキテクチャと併用して安定性と性能を向上させることができ、MCDOはアーキテクチャの一部としてドロップアウトを持つ任意の深層学習モデルに適用できる。 Recently, artificial intelligence technologies and algorithms have become a major focus for advancements in treatment planning for radiation therapy. As these are starting to become incorporated into the clinical workflow, a major concern from clinicians is not whether the model is accurate, but whether the model can express to a human operator when it does not know if its answer is correct. We propose to use Monte Carlo dropout (MCDO) and the bootstrap aggregation (bagging) technique on deep learning models to produce uncertainty estimations for radiation therapy dose prediction. We show that both models are capable of generating a reasonable uncertainty map, and, with our proposed scaling technique, creating interpretable uncertainties and bounds on the prediction and any relevant metrics. 
Performance-wise, bagging provides statistically significant reduced loss value and errors in most of the metrics investigated in this study. The addition of bagging was able to further reduce errors by another 0.34% for Dmean and 0.19% for Dmax, on average, when compared to the baseline framework. Overall, the bagging framework provided significantly lower MAE of 2.62, as opposed to the baseline framework's MAE of 2.87. The usefulness of bagging, from solely a performance standpoint, does highly depend on the problem and the acceptable predictive error, and its high upfront computational cost during training should be factored in to deciding whether it is advantageous to use it. In terms of deployment with uncertainty estimations turned on, both frameworks offer the same performance time of about 12 seconds. As an ensemble-based metaheuristic, bagging can be used with existing machine learning architectures to improve stability and performance, and MCDO can be applied to any deep learning models that have dropout as part of their architecture.	翻訳日:2022-09-30 23:12:33 公開日:2021-01-12
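To make the bagging side of this comparison concrete, the following is a minimal sketch of how bootstrap aggregation yields both a prediction and an uncertainty estimate. It is a toy example under stated assumptions: a one-dimensional least-squares line stands in for the dose-prediction network, and all names (`fit_line`, `bagged_prediction`, `n_models`) are hypothetical and not from the paper. Monte Carlo dropout would instead keep dropout active at inference time and repeat forward passes through a single trained network.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bagged_prediction(xs, ys, x_new, n_models=50, seed=0):
    """Train n_models on bootstrap resamples; return (mean, std) at x_new.

    The mean is the bagged prediction; the standard deviation across the
    ensemble members serves as a simple uncertainty estimate.
    """
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_models):
        # Bootstrap: sample n training points with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        boot_x = [xs[i] for i in idx]
        boot_y = [ys[i] for i in idx]
        if len(set(boot_x)) < 2:
            continue  # degenerate resample, a line cannot be fitted
        slope, intercept = fit_line(boot_x, boot_y)
        predictions.append(slope * x_new + intercept)
    return statistics.fmean(predictions), statistics.stdev(predictions)


def make_toy_data(n=40, seed=1):
    """Noisy samples of y = 2x + 1 on x in [0, 3.9] (synthetic data)."""
    rng = random.Random(seed)
    xs = [i / 10.0 for i in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys
```

With `xs, ys = make_toy_data()`, calling `bagged_prediction(xs, ys, 2.0)` gives an ensemble mean and spread inside the training range, while `bagged_prediction(xs, ys, 10.0)` shows a wider spread when extrapolating. The training cost scales with `n_models`, which mirrors the upfront cost the abstract mentions.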
# 仮説的介入による予測を可能にする因果法の検討 A scoping review of causal methods enabling predictions under hypothetical interventions ( http://arxiv.org/abs/2011.09815v2 ) ライセンス: Link先を確認	Lijing Lin, Matthew Sperrin, David A. Jenkins, Glen P. Martin, Niels Peek	(参考訳) 背景と目的: 予測モデルが通常開発される手法は、パラメータや予測を因果的に解釈するべきではないことを意味する。しかしながら、意思決定を支援するために予測モデルが使用される場合、仮定的な介入の下で結果を予測する必要性がしばしばある。本研究の目的は,仮説的介入による結果のリスク推定を可能にする予測モデルの開発・検証,因果推論の活用,主要な方法論的アプローチ,基礎となる仮定,目標推定,本手法を用いた潜在的な落とし穴や課題,未解決の方法論的課題の検証である。方法:2019年12月までに刊行された文献を体系的にレビューし,仮説的介入による予測モデルの使用を可能にするために因果的考察を用いた健康領域の論文を考察した。結果: データベース検索により4919件の論文を同定し, さらに115件の論文を手作業で検索し, その内13件を統計的および機械学習の文献から抽出した。観測データから因果推定を行う手法の多くは,境界構造モデルとg推定に基づいていた。結論: 臨床予測モデルへの仮説的介入下での予測を可能にする方法は2つある。 1)臨床試験及びメタアナリシスから推定される因果効果による観察研究に由来する予測モデルの改善 2)観測データから直接予測モデルと因果効果を推定する。これらの方法は、ダイナミックな治療体制への拡張と、臨床決定支援システムを運用するための複数の介入を考慮する必要がある。因果予測モデル」を検証する技術はまだ初期段階にある。 Background and Aims: The methods with which prediction models are usually developed mean that neither the parameters nor the predictions should be interpreted causally. However, when prediction models are used to support decision making, there is often a need for predicting outcomes under hypothetical interventions. We aimed to identify published methods for developing and validating prediction models that enable risk estimation of outcomes under hypothetical interventions, utilizing causal inference: their main methodological approaches, underlying assumptions, targeted estimands, and potential pitfalls and challenges with using the method, and unresolved methodological challenges. Methods: We systematically reviewed literature published by December 2019, considering papers in the health domain that used causal considerations to enable prediction models to be used for predictions under hypothetical interventions. 
Results: We identified 4919 papers through database searches and a further 115 papers through manual searches, of which 13 were selected for inclusion, from both the statistical and the machine learning literature. Most of the identified methods for causal inference from observational data were based on marginal structural models and g-estimation. Conclusions: There exist two broad methodological approaches for allowing prediction under hypothetical intervention into clinical prediction models: 1) enriching prediction models derived from observational studies with estimated causal effects from clinical trials and meta-analyses; and 2) estimating prediction models and causal effects directly from observational data. These methods require extending to dynamic treatment regimes, and consideration of multiple interventions to operationalise a clinical decision support system. Techniques for validating 'causal prediction models' are still in their infancy.	翻訳日:2022-09-23 21:44:02 公開日:2021-01-12
# ロボットにおける「能動的自己」のための感覚運動表現学習--モデル調査 Sensorimotor representation learning for an "active self" in robots: A model survey ( http://arxiv.org/abs/2011.12860v2 ) ライセンス: Link先を確認	Phuong D.H. Nguyen, Yasmin Kim Georgie, Ezgi Kayhan, Manfred Eppe, Verena Vanessa Hafner, and Stefan Wermter	(参考訳) 安全な人間とロボットの相互作用のためには、ロボットは厳格な操作ルールを与えられるのではなく、人々がいる空間で適切に振る舞う方法を学習し、動的で非構造的な環境がもたらす課題に対処できなければならない。人間では、これらの能力は私たちの身体を空間内で知覚し、運動中の手足の位置を感知し、他の物体やエージェントを認識し、身体の一部を制御してそれらと意図的に相互作用する能力と関係していると考えられている。バイオインスパイアされた能力を持つ次世代ロボットに向けて,本稿ではまず,身体スキーマの感覚表現,身体近傍空間(ペリパーソナルスペース),人間の能動的自己など,これらの能力の根底にあるメカニズムの発達過程を概観する。第二に、これらの感覚表現のロボットモデルと自己のロボットモデルについての調査を行い、これらのモデルを人間の場合と比較する。最後に,これらのロボットモデルに欠けているものを分析し,自己探索を通じて感覚表現を発達させることにより,人工エージェントにおける自己感覚の出現を可能にすることを目指す理論的計算枠組みを提案する。 Safe human-robot interactions require robots to be able to learn how to behave appropriately in spaces populated by people and thus to cope with the challenges posed by our dynamic and unstructured environment, rather than being provided a rigid set of rules for operations. In humans, these capabilities are thought to be related to our ability to perceive our body in space, sensing the location of our limbs during movement, being aware of other objects and agents, and controlling our body parts to interact with them intentionally. Toward the next generation of robots with bio-inspired capacities, in this paper, we first review the developmental processes of underlying mechanisms of these abilities: The sensory representations of body schema, peripersonal space, and the active self in humans. Second, we provide a survey of robotics models of these sensory representations and robotics models of the self; and we compare these models with the human counterparts. Finally, we analyse what is missing from these robotics models and propose a theoretical computational framework, which aims to allow the emergence of the sense of self in artificial agents by developing sensory representations through self-exploration.	翻訳日:2022-09-21 03:21:12 公開日:2021-01-12
# (参考訳) GANにおけるWasserstein距離の一般化された実装に向けて Towards Generalized Implementation of Wasserstein Distance in GANs ( http://arxiv.org/abs/2012.03420v2 ) ライセンス: CC BY 4.0	Minkai Xu, Zhiming Zhou, Guansong Lu, Jian Tang, Weinan Zhang, Yong Yu	(参考訳) ワッサーシュタイン距離のカントロヴィチ・ルビンシュタイン(KR)双対性に基づいて構築されたワッサーシュタイン GAN (WGAN) は、理論上最も健全なGANモデルの一つである。しかし実際には、GANの他の変種よりも常に優れているわけではない。これは主にKR双対性によって要求されるリプシッツ条件の不完全な実装のためである。リプシッツ制約のさまざまな実装についてコミュニティで多くの研究が行われてきたが、実際にはその制約を完全に満たすのは依然として難しい。本稿では,強いリプシッツ制約が最適化に不要である可能性を論じる。その代わり、一歩後退して、リプシッツ制約を緩和しようとする。理論的には、まず、ワッサーシュタイン距離のより一般的な双対形式であるソボレフ双対性を示す。これはリプシッツ制約を緩和しつつ、ワッサーシュタイン距離の好ましい勾配特性を維持する。さらに、KR双対性は実際にはソボレフ双対性の特別な場合であることを示す。さらに, この緩和された双対性に基づき, Sobolev Wasserstein GAN (SWGAN) という一般化した WGAN トレーニングスキームを提案し, 既存の手法に対する SWGAN の改善を広範囲な実験で実証する。 Wasserstein GANs (WGANs), built upon the Kantorovich-Rubinstein (KR) duality of Wasserstein distance, are among the most theoretically sound GAN models. However, in practice they do not always outperform other variants of GANs. This is mostly due to the imperfect implementation of the Lipschitz condition required by the KR duality. Extensive work has been done in the community on different implementations of the Lipschitz constraint, but the restriction remains hard to satisfy perfectly in practice. In this paper, we argue that the strong Lipschitz constraint might be unnecessary for optimization. Instead, we take a step back and try to relax the Lipschitz constraint. Theoretically, we first demonstrate a more general dual form of the Wasserstein distance called the Sobolev duality, which relaxes the Lipschitz constraint but still maintains the favorable gradient property of the Wasserstein distance. Moreover, we show that the KR duality is actually a special case of the Sobolev duality. Based on the relaxed duality, we further propose a generalized WGAN training scheme named Sobolev Wasserstein GAN (SWGAN), and empirically demonstrate the improvement of SWGAN over existing methods with extensive experiments.	翻訳日:2021-05-21 09:41:39 公開日:2021-01-12
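For readers unfamiliar with the Lipschitz condition discussed in the entry above, the standard Kantorovich-Rubinstein dual form of the Wasserstein-1 distance is reproduced here as background. The symbols $P_r$, $P_g$, $f$, $D$, $\hat{x}$, and $\lambda$ are notation chosen for this note. The second expression is the widely used gradient-penalty surrogate, one example of an imperfect implementation of the constraint; it is not the Sobolev duality proposed in the paper, whose exact form is not given in the abstract.

```latex
% Kantorovich-Rubinstein duality: the supremum runs over 1-Lipschitz functions f.
W_1(P_r, P_g) \;=\; \sup_{\|f\|_{L} \le 1}
  \; \mathbb{E}_{x \sim P_r}\big[f(x)\big] \;-\; \mathbb{E}_{x \sim P_g}\big[f(x)\big]

% Gradient-penalty surrogate for the constraint (critic D, penalty weight \lambda,
% \hat{x} sampled on lines between real and generated points):
\mathcal{L}_{D} \;=\; \mathbb{E}_{x \sim P_g}\big[D(x)\big]
  \;-\; \mathbb{E}_{x \sim P_r}\big[D(x)\big]
  \;+\; \lambda \, \mathbb{E}_{\hat{x}}\Big[\big(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1\big)^2\Big]
```

The penalty only encourages the gradient norm to stay near 1 at sampled points, so the Lipschitz condition holds approximately at best, which is the gap the abstract refers to.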
# (参考訳) フラグメントに基づく生成モデルによる分子最適化 Molecule Optimization via Fragment-based Generative Models ( http://arxiv.org/abs/2012.04231v2 ) ライセンス: CC BY 4.0	Ziqi Chen, Martin Renqiang Min, Srinivasan Parthasarathy, Xia Ning	(参考訳) 創薬において、分子最適化は、望ましい薬物特性の観点から薬候補をより良いものにするための重要なステップである。近年の人工知能の進歩により、従来のin vitroプロセスはシリコアプローチによってますます促進されている。本稿では,計算量最適化分子に対する革新的シリコアプローチを提案し,深層生成モデルを用いて最適化された分子グラフを生成する問題を定式化する。我々の生成モデルはフラグメントベースの薬物設計の重要なアイデアに従い、小さなフラグメントを変更することで分子を最適化します。我々のモデルは、最適化されたフラグメントの特定方法と、良い性質と悪い性質を持つ分子の違いから、これらのフラグメントの修正方法を学ぶ。新しい分子を最適化するために、我々のモデルは、予測されたフラグメントの位置で最適化されたフラグメントをデコードするために学習信号を適用します。また、パイプライン内の各モデルが1つのフラグメントを最適化できるように、パイプライン内に複数のモデルを構築します。提案手法は, 分子類似性制約下で80%以上の特性改善, 高分子類似性制約下で10%以上の特性改善により, 他者よりも顕著に優れていることを示す。 In drug discovery, molecule optimization is an important step in order to modify drug candidates into better ones in terms of desired drug properties. With the recent advance of Artificial Intelligence, this traditionally in vitro process has been increasingly facilitated by in silico approaches. We present an innovative in silico approach to computationally optimizing molecules and formulate the problem as to generate optimized molecular graphs via deep generative models. Our generative models follow the key idea of fragment-based drug design, and optimize molecules by modifying their small fragments. Our models learn how to identify the to-be-optimized fragments and how to modify such fragments by learning from the difference of molecules that have good and bad properties. In optimizing a new molecule, our models apply the learned signals to decode optimized fragments at the predicted location of the fragments. We also construct multiple such models into a pipeline such that each of the models in the pipeline is able to optimize one fragment, and thus the entire pipeline is able to modify multiple fragments of molecule if needed. 
We compare our models with other state-of-the-art methods on benchmark datasets and demonstrate that our methods significantly outperform others with more than 80% property improvement under moderate molecular similarity constraints, and more than 10% property improvement under high molecular similarity constraints.	翻訳日:2021-05-17 09:32:38 公開日:2021-01-12
# 降雨ナウキャスティングに適用した深層学習モデルにおける降雨レーダ画像と風予報の融合 Fusion of rain radar images and wind forecasts in a deep learning model applied to rain nowcasting ( http://arxiv.org/abs/2012.05015v2 ) ライセンス: Link先を確認	Vincent Bouget and Dominique B\'er\'eziat and Julien Brajard and Anastase Charantonis and Arthur Filoche	(参考訳) 短期または中期の降雨予測は、農業管理や洪水リスクモニタリングといったいくつかの環境応用において主要な課題である。既存のデータ駆動アプローチ、特にディープラーニングモデルは、降雨レーダ画像のみを入力として、このタスクにおいて高い予測能力を示してきた。風などの他の気象パラメータが予測を改善するかどうかを判断するために,降雨レーダ画像と気象予報モデルが生成した風速を融合したデータでディープラーニングモデルを訓練した。このネットワークを、レーダーデータのみで訓練した類似アーキテクチャ、基本的な持続(persistence)モデル、オプティカルフローに基づくアプローチと比較した。30分先の予測において,中程度以上の降雨イベントに対するF1スコアで,我々のネットワークはオプティカルフローを8%上回った。さらに、降雨レーダ画像のみを使用して訓練した同じアーキテクチャを7%上回っている。降雨データと風データを組み合わせることで訓練プロセスが安定し,特に予測の難しい強い降水で大幅な改善が得られることも示された。 Short- or mid-term rainfall forecasting is a major task with several environmental applications such as agricultural management or flood risk monitoring. Existing data-driven approaches, especially deep learning models, have shown significant skill at this task, using only rainfall radar images as inputs. In order to determine whether using other meteorological parameters such as wind would improve forecasts, we trained a deep learning model on a fusion of rainfall radar images and wind velocity produced by a weather forecast model. The network was compared to a similar architecture trained only on radar data, to a basic persistence model and to an approach based on optical flow. Our network outperforms by 8% the F1-score calculated for the optical flow on moderate and higher rain events for forecasts at a horizon time of 30 min. Furthermore, it outperforms by 7% the same architecture trained using only rainfall radar images. Merging rain and wind data has also proven to stabilize the training process and enabled significant improvement especially on the difficult-to-predict high precipitation rainfalls.	翻訳日:2021-05-16 01:47:19 公開日:2021-01-12
# マルチリード心電図信号からの27の異常の同定:サインロス機能を有するSe-ResNetフレームワーク Identification of 27 abnormalities from multi-lead ECG signals: An ensembled Se-ResNet framework with Sign Loss function ( http://arxiv.org/abs/2101.03895v2 ) ライセンス: Link先を確認	Zhaowei Zhu, Xiang Lan, Tingting Zhao, Yangming Guo, Pipin Kojodjojo, Zhuoyang Xu, Zhuo Liu, Siqi Liu, Han Wang, Xingzhi Sun, Mengling Feng	(参考訳) 心臓血管疾患は健康にとって大きな脅威であり、世界中の死因の1つである。 12誘導心電図は、心臓の異常を識別するための安価で一般的なツールである。早期かつ正確な診断は、早期の治療と介入により、心血管疾患の重篤な合併症を予防する。本研究の目的は,12誘導心電図記録から27個の心電図異常を自動的に識別するアルゴリズムを開発することである。 Cardiovascular disease is a major threat to health and one of the primary causes of death globally. The 12-lead ECG is a cheap and commonly accessible tool to identify cardiac abnormalities. Early and accurate diagnosis will allow early treatment and intervention to prevent severe complications of cardiovascular disease. In the PhysioNet/Computing in Cardiology Challenge 2020, our objective is to develop an algorithm that automatically identifies 27 ECG abnormalities from 12-lead ECG recordings.	翻訳日:2021-05-10 05:10:35 公開日:2021-01-12
# データ拡張ポリシとネットワークアーキテクチャの統合検索 Joint Search of Data Augmentation Policies and Network Architectures ( http://arxiv.org/abs/2012.09407v2 ) ライセンス: Link先を確認	Taiga Kashima, Yoshihiro Yamada, Shunta Saito	(参考訳) ディープニューラルネットワークをトレーニングする一般的なパイプラインは、データ拡張やネットワークアーキテクチャの選択など、いくつかのビルディングブロックで構成される。 automlは、これらのパーツを自動的に設計することを目的とした研究分野だが、ほとんどのメソッドは、各パーツを独立して探索する。本稿では,トレーニングパイプラインの設計にさらなる自動化を実現するために,データ拡張ポリシーとネットワークアーキテクチャを統合的に最適化する手法を提案する。私たちのアプローチの核となる考え方は、部分全体を差別化可能にすることです。提案手法は,拡張ポリシー探索法とネットワークアーキテクチャ探索法を組み合わせることで,エンドツーエンドでそれらを協調的に最適化する。実験の結果, 本手法は独立的に検索した結果に対して, 競争性, 優れた性能が得られることがわかった。 The common pipeline of training deep neural networks consists of several building blocks such as data augmentation and network architecture selection. AutoML is a research field that aims at automatically designing those parts, but most methods explore each part independently because it is more challenging to simultaneously search all the parts. In this paper, we propose a joint optimization method for data augmentation policies and network architectures to bring more automation to the design of training pipeline. The core idea of our approach is to make the whole part differentiable. The proposed method combines differentiable methods for augmentation policy search and network architecture search to jointly optimize them in the end-to-end manner. The experimental results show our method achieves competitive or superior performance to the independently searched results.	翻訳日:2021-05-02 07:36:44 公開日:2021-01-12
# ロバスト話者照合のための周波数選択付きマルチストリーム畳み込みニューラルネットワーク Multi-stream Convolutional Neural Network with Frequency Selection for Robust Speaker Verification ( http://arxiv.org/abs/2012.11159v2 ) ライセンス: Link先を確認	Wei Yao, Shen Chen, Jiamin Cui, Yaolin Lou	(参考訳) 話者検証は、入力音声がクレーム話者に対応するかどうかを検証することを目的としており、従来は、特徴抽出器が全周波数範囲で動作する単一ストリームシナリオに基づいて、この種のシステムが展開されている。本稿では,完全周波数範囲ではなく部分周波数範囲を聴きながら分類タスクを行うのに十分な知識を機械が学べる,いわゆる周波数選択手法を仮定し,この手法を話者照合タスクに適用したマルチストリーム畳み込みニューラルネットワーク(cnn)の新たな枠組みを提案する。提案フレームワークは,複数のストリームから発生する多様な時間的埋め込みに対応し,音響モデリングの堅牢性を高める。時間的埋め込みの多様性については,周波数の完全帯域を複数のサブバンドに手作業で分割し,各ストリームの特徴抽出器が対象周波数領域として使用するサブバンドを選択することで,周波数選択による特徴拡張を検討する。従来の単一ストリームソリューションとは異なり、各発話は一度だけ処理されるが、このフレームワークでは複数のストリームが並列に処理される。各ストリームの入力発話は、所定の周波数範囲内の周波数セレクタによって前処理され、平均正規化により後処理される。各ストリームの正規化された時間埋め込みはプール層に流れ込み、融合した埋め込みを生成する。本稿では,voxcelebデータセットの広範な実験を行い,マルチストリームcnnが最小決定コスト関数 (mindcf) の相対的改善率20.53パーセントで,シングルストリームベースラインを有意に上回っていることを示す。 Speaker verification aims to verify whether an input speech corresponds to the claimed speaker, and conventionally, this kind of system is deployed based on single-stream scenario, wherein the feature extractor operates in full frequency range. In this paper, we hypothesize that machine can learn enough knowledge to do classification task when listening to partial frequency range instead of full frequency range, which is so called frequency selection technique, and further propose a novel framework of multi-stream Convolutional Neural Network (CNN) with this technique for speaker verification tasks. The proposed framework accommodates diverse temporal embeddings generated from multiple streams to enhance the robustness of acoustic modeling. For the diversity of temporal embeddings, we consider feature augmentation with frequency selection, which is to manually segment the full-band of frequency into several sub-bands, and the feature extractor of each stream can select which sub-bands to use as target frequency domain. 
Different from conventional single-stream solution wherein each utterance would only be processed for one time, in this framework, there are multiple streams processing it in parallel. The input utterance for each stream is pre-processed by a frequency selector within specified frequency range, and post-processed by mean normalization. The normalized temporal embeddings of each stream will flow into a pooling layer to generate fused embeddings. We conduct extensive experiments on VoxCeleb dataset, and the experimental results demonstrate that multi-stream CNN significantly outperforms single-stream baseline with 20.53 % of relative improvement in minimum Decision Cost Function (minDCF).	翻訳日:2021-04-27 06:19:04 公開日:2021-01-12
# (参考訳) MOOCにおける学習ニーズ改善のための教育コンテンツリンク Educational Content Linking for Enhancing Learning Need Remediation in MOOCs ( http://arxiv.org/abs/2012.15826v2 ) ライセンス: CC BY 4.0	Shang-Wen Li	(参考訳) 2011年に導入されて以来、web上のさまざまなテーマに4000以上のmoocがあり、3500万人以上の学習者が参加している。 MOOCは、知識の普及を民主化し、世界最高の教育を学習者にもたらす能力を示した。しかし, 参加者間の距離, 学習者の人数, 学習者の背景の不均一性は, 学習経験に悪影響を及ぼすタイムリーな方法で学習者との対話を極めて困難にしている。課題に対処するため,本論文では,教育コンテンツリンクという枠組みを提案する。様々なコース教材に散在する学習コンテンツの断片を、容易にアクセス可能な構造にリンクし、整理することにより、このフレームワークが学習者の指導とコンテンツナビゲーションを改善することができると仮定する。 MOOCにおけるほとんどの指導と知識獲得は、学習者がコース資料を調査する際に行われるので、より良いコンテンツナビゲーションは、学習者が自分の混乱を解消し、学習結果と経験を改善するのに役立つ。予想を裏付けるために,1)手動でリンクを生成すれば学習が改善できるか,という2つの研究の枠組みについて,エンドツーエンドの研究を提示する。 2)機械学習による学習コンテンツの生成は可能か? 最初の質問を学習するために,学習教材を提示し,それらを同時に視覚化するインタフェースを構築した。このインターフェースにより,希望する教材をより効率的に検索し,より多くの概念をより容易に維持できることがわかった。第2の質問に対して,条件付き確率場に基づく自動コンテンツリンクアルゴリズムを提案する。リンクのないインターフェースに対する改善の規模は小さいものの、自動生成リンクは依然として学習の改善につながることを実証する。 Since its introduction in 2011, there have been over 4000 MOOCs on various subjects on the Web, serving over 35 million learners. MOOCs have shown the ability to democratize knowledge dissemination and bring the best education in the world to every learner. However, the disparate distances between participants, the size of the learner population, and the heterogeneity of the learners' backgrounds make it extremely difficult for instructors to interact with the learners in a timely manner, which adversely affects learning experience. To address the challenges, in this thesis, we propose a framework: educational content linking. By linking and organizing pieces of learning content scattered in various course materials into an easily accessible structure, we hypothesize that this framework can provide learners guidance and improve content navigation. 
Since most instruction and knowledge acquisition in MOOCs takes place when learners are surveying course materials, better content navigation may help learners find supporting information to resolve their confusion and thus improve learning outcome and experience. To support our conjecture, we present end-to-end studies to investigate our framework around two research questions: 1) can manually generated linking improve learning? 2) can learning content be generated with machine learning methods? For studying the first question, we built an interface that present learning materials and visualize the linking among them simultaneously. We found the interface enables users to search for desired course materials more efficiently, and retain more concepts more readily. For the second question, we propose an automatic content linking algorithm based on conditional random fields. We demonstrate that automatically generated linking can still lead to better learning, although the magnitude of the improvement over the unlinked interface is smaller.	翻訳日:2021-04-17 20:34:51 公開日:2021-01-12
# (参考訳) SUMOを用いた意味モデリング Semantic Modeling with SUMO ( http://arxiv.org/abs/2012.15835v3 ) ライセンス: CC BY-SA 4.0	Robert B. Allen	(参考訳) 我々は,Suggested Upper Merged Ontology (SUMO) を用いてセマンティック・シミュレーションを開発する。汎用プログラミング言語を用いて,シミュレーションガソリンエンジンの遷移をモデル化した2つの概念実証実験を行う。計算負荷の高い手法に注目するのではなく、慣れ親しんだソフトウェア工学のテスト手順に関連する、計算負荷の低いアプローチを探求する。さらに,レキシコグラフィーの言語的アプローチに基づく用語の構造化表現を提案する。 We explore using the Suggested Upper Merged Ontology (SUMO) to develop a semantic simulation. We provide two proof-of-concept demonstrations modeling transitions in a simulated gasoline engine using a general-purpose programming language. Rather than focusing on computationally highly intensive techniques, we explore a less computationally intensive approach related to familiar software engineering testing procedures. In addition, we propose structured representations of terms based on linguistic approaches to lexicography.	翻訳日:2021-04-17 20:32:36 公開日:2021-01-12
# 中国農村部における"Brilliant AI Doctor" : AIによるCDSS展開の緊張と課題 "Brilliant AI Doctor" in Rural China: Tensions and Challenges in AI-Powered CDSS Deployment ( http://arxiv.org/abs/2101.01524v2 ) ライセンス: Link先を確認	Dakuo Wang and Liuping Wang and Zhan Zhang and Ding Wang and Haiyi Zhu and Yvonne Gao and Xiangmin Fan and Feng Tian	(参考訳) 人工知能(AI)技術は、先進的な臨床決定支援システム(CDSS)の実装にますます利用されている。臨床意思決定シナリオにおけるAI-CDSS(AI-CDSS)の有用性について検討した。しかし、特に発展途上国では、広告後のユーザー知覚と経験は未熟である。中国の6つの農村クリニックの22人の臨床医の観察とインタビューを通じて、AI-CDSSシステム(Brilliant Doctor)の設計と、現地のコンテキストやワークフローとの相違、技術的制限とユーザビリティ障壁、およびAI-CDSSの透明性と信頼性に関する問題など、農村の臨床的コンテキストとのさまざまな緊張関係を報告する。これらの緊張にもかかわらず、すべての参加者はAI-CDSSの将来に対する肯定的な態度を示し、特に臨床環境でのヒト-AIコラボレーションの未来を実現するために「医師のAIアシスタント」として機能した。最後に、発展途上国の農村臨床状況におけるAI-CDSS介入設計の意義について考察する。 Artificial intelligence (AI) technology has been increasingly used in the implementation of advanced Clinical Decision Support Systems (CDSS). Research demonstrated the potential usefulness of AI-powered CDSS (AI-CDSS) in clinical decision making scenarios. However, post-adoption user perception and experience remain understudied, especially in developing countries. Through observations and interviews with 22 clinicians from 6 rural clinics in China, this paper reports the various tensions between the design of an AI-CDSS system ("Brilliant Doctor") and the rural clinical context, such as the misalignment with local context and workflow, the technical limitations and usability barriers, as well as issues related to transparency and trustworthiness of AI-CDSS. Despite these tensions, all participants expressed positive attitudes toward the future of AI-CDSS, especially acting as "a doctor's AI assistant" to realize a Human-AI Collaboration future in clinical settings. Finally we draw on our findings to discuss implications for designing AI-CDSS interventions for rural clinical contexts in developing countries.	翻訳日:2021-04-11 22:59:54 公開日:2021-01-12
# (参考訳) 分類におけるバイアスと分散分析の統一的アプローチ A unifying approach on bias and variance analysis for classification ( http://arxiv.org/abs/2101.01765v2 ) ライセンス: CC BY 4.0	Cemre Zor and Terry Windeatt	(参考訳) 標準バイアスと分散(B&V)の用語は、もともと回帰設定のために定義され、分類への拡張によって、文献においていくつかの異なるモデル/定義が導かれた。本稿では,Tumer & Ghosh (T&G) の一般的なフレームワークと James との関係について述べる。 2つのアプローチを統一することにより、0/1の損失に対して定義されたB&Vと、二乗誤差損失に対して与えられる境界分布の標準B&Vを関連付ける。クローズドフォームの関係は分類性能をより深く理解し、2つのケーススタディでその使用が実証されている。 Standard bias and variance (B&V) terminologies were originally defined for the regression setting and their extensions to classification have led to several different models / definitions in the literature. In this paper, we aim to provide the link between the commonly used frameworks of Tumer & Ghosh (T&G) and James. By unifying the two approaches, we relate the B&V defined for the 0/1 loss to the standard B&V of the boundary distributions given for the squared error loss. The closed form relationships provide a deeper understanding of classification performance, and their use is demonstrated in two case studies.	翻訳日:2021-04-11 12:57:36 公開日:2021-01-12
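For orientation, the "standard B&V for the squared error loss" mentioned in the abstract above usually refers to the textbook regression decomposition. The form below is a reminder of that general result, not a formula taken from the paper. It assumes a target $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and a learned predictor $\hat f$ whose expectation is taken over training sets; all symbols are illustrative.

$$\mathbb{E}\big[(y - \hat f(x))^2\big] = \big(\mathbb{E}[\hat f(x)] - f(x)\big)^2 + \mathbb{E}\big[(\hat f(x) - \mathbb{E}[\hat f(x)])^2\big] + \sigma^2$$

The three terms are the squared bias, the variance, and the irreducible noise. The paper's closed-form link to the 0/1 loss is not reproduced here.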
# (参考訳) 連携・協調・自動化産業システムにおけるフェデレーション学習の可能性 Opportunities of Federated Learning in Connected, Cooperative and Automated Industrial Systems ( http://arxiv.org/abs/2101.03367v2 ) ライセンス: CC BY 4.0	Stefano Savazzi, Monica Nicoli, Mehdi Bennis, Sanaz Kianoush, Luca Barbieri	(参考訳) 次世代の自律・ネットワーク産業システム(ロボット、車両、ドローン)は、超信頼性、低遅延通信(URLLC)およびコンピューティングの進歩を推進してきた。これらのネットワーク化されたマルチエージェントシステムは、ミッションクリティカルコントロール機能を提供するために、高速で通信効率のよい分散機械学習(ML)を必要とする。フェデレートラーニング(FL)を含む分散ML技術は、センシング、コミュニケーション、学習に精通する多分野の研究領域である。集中型サーバで生データサンプルを使用するのではなく、urllcを介して接続されたネットワークエージェントが、ローカルにトレーニングされたモデルのパラメータを定期的に交換する分散学習者として機能する、協調的な融合アプローチを活用する。本稿では,次世代ネットワーク産業システムにおけるFLの新たな可能性について考察する。スマートマニュファクチャリングにおけるコラボレーティブな自動車両とコラボレーティブなロボティクスにおける協調運転に焦点を当てたオープンな問題について議論する。 Next-generation autonomous and networked industrial systems (i.e., robots, vehicles, drones) have driven advances in ultra-reliable, low latency communications (URLLC) and computing. These networked multi-agent systems require fast, communication-efficient and distributed machine learning (ML) to provide mission critical control functionalities. Distributed ML techniques, including federated learning (FL), represent a mushrooming multidisciplinary research area weaving in sensing, communication and learning. FL enables continual model training in distributed wireless systems: rather than fusing raw data samples at a centralized server, FL leverages a cooperative fusion approach where networked agents, connected via URLLC, act as distributed learners that periodically exchange their locally trained model parameters. This article explores emerging opportunities of FL for the next-generation networked industrial systems. Open problems are discussed, focusing on cooperative driving in connected automated vehicles and collaborative robotics in smart manufacturing.	翻訳日:2021-04-09 09:35:19 公開日:2021-01-12
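The abstract above describes agents that periodically exchange locally trained model parameters instead of raw data. A minimal sketch of one such fusion step is given below, assuming simple (optionally weighted) parameter averaging. The function name `average_parameters` and the flat-list parameter layout are hypothetical choices for illustration, and the article's actual fusion rule may differ.

```python
from typing import List, Optional, Sequence


def average_parameters(
    local_params: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Fuse per-agent parameter vectors by (weighted) averaging.

    Assumption: every agent holds a flat parameter vector of equal length.
    The weights could be, for example, local dataset sizes.
    """
    if weights is None:
        weights = [1.0] * len(local_params)
    total = sum(weights)
    dim = len(local_params[0])
    # Average each coordinate across agents; no raw data is exchanged.
    return [
        sum(w * p[i] for w, p in zip(weights, local_params)) / total
        for i in range(dim)
    ]
```

In a round-based scheme, each agent would train locally, share its vector, and continue from the fused result.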
# hypoSVI: スタイン変動推論と物理インフォームドニューラルネットワークを用いた低中心インバージョン HypoSVI: Hypocenter inversion with Stein variational inference and Physics Informed Neural Networks ( http://arxiv.org/abs/2101.03271v2 ) ライセンス: Link先を確認	Jonathan D. Smith, Zachary E. Ross, Kamyar Azizzadenesheli, Jack B. Muir	(参考訳) ステイン変分推論を用いた確率的中心反転のスキームを提案する。我々のアプローチは、アイコン方程式の解法を訓練する物理インフォームドニューラルネットワークの形で、微分可能フォワードモデルを用いている。これにより、核化されたスタインの差分に対して粒子の集まりを反復的に最適化することで、後部を迅速に近似することができる。本手法は,低中央分散逆問題に共通する非凸後部分布を扱うのに最適であることを示す。様々なハイパーパラメータの影響を調べるために一連の実験が行われた。一度トレーニングすれば、旅行時間表を構築する必要なしに、学習領域内の任意のネットワーク幾何に対して有効である。本研究では,分散音響センシングのような大規模N型センシング技術に最適であることを示す。 We introduce a scheme for probabilistic hypocenter inversion with Stein variational inference. Our approach uses a differentiable forward model in the form of a physics-informed neural network, which we train to solve the Eikonal equation. This allows for rapid approximation of the posterior by iteratively optimizing a collection of particles against a kernelized Stein discrepancy. We show that the method is well-equipped to handle highly non-convex posterior distributions, which are common in hypocentral inverse problems. A suite of experiments is performed to examine the influence of the various hyperparameters. Once trained, the method is valid for any network geometry within the study area without the need to build travel time tables. We show that the computational demands scale efficiently with the number of differential times, making it ideal for large-N sensing technologies like Distributed Acoustic Sensing.	翻訳日:2021-04-09 07:20:55 公開日:2021-01-12
# 平均回帰戦略における関数特性を持つ深層強化学習 Deep Reinforcement Learning with Function Properties in Mean Reversion Strategies ( http://arxiv.org/abs/2101.03418v2 ) ライセンス: Link先を確認	Sophia Gu	(参考訳) ゲーム産業におけるDeep Reinforcement Learningの最近の進歩により、我々は、同じ技術が一般的な量的財政問題にも有効かどうか疑問視している。本稿では,OpenAIによって開発された既製のライブラリが,逆転戦略に容易に適応できるかどうかを考察する。さらに、エージェントが検索する必要のある関数空間を狭めることで、よりよいパフォーマンスが得られるかどうかを確認し、テストします。報酬関数を慎重に選択したペナルティ項によって増強することで、これを実現する。 With the recent advancement in Deep Reinforcement Learning in the gaming industry, we are curious if the same technology would work as well for common quantitative financial problems. In this paper, we will investigate if an off-the-shelf library developed by OpenAI can be easily adapted to mean reversion strategy. Moreover, we will design and test to see if we can get better performance by narrowing the function space that the agent needs to search for. We achieve this through augmenting the reward function by a carefully picked penalty term.	翻訳日:2021-04-09 07:19:56 公開日:2021-01-12
# (参考訳) at-bert:adversarial training bert for acronym identification winning solution for sdu@aaai-21 AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21 ( http://arxiv.org/abs/2101.03700v2 ) ライセンス: CC BY 4.0	Danqing Zhu, Wangli Lin, Yang Zhang, Qiwei Zhong, Guanxiong Zeng, Weilin Wu, Jiayu Tang	(参考訳) 頭字語識別は、省略された頭字語と句を見つけることに焦点を当てており、これは科学文書理解タスクに不可欠である。しかし、手動でアノテートされたデータセットの限られたサイズは、問題のさらなる改善を妨げる。大規模コーパス上で事前学習された言語モデルの最近のブレークスルーは、教師なし事前学習が下流タスクの性能を大幅に改善できることを示している。本稿では,AAAI 2021 の学術文書理解 (SDU) チャレンジにおいて,AT-BERT と名づけられた逆トレーニング BERT 手法を提案する。具体的には、事前訓練されたBERTが、より良いセマンティック表現をキャプチャするために採用されている。次に、FGMの対向訓練戦略をBERTの微調整に取り入れ、モデルをより堅牢で一般化する。さらに、複数のBERT変種から学んだ表現を包含するアンサンブル機構が考案された。これらすべてのコンポーネントを組み立てることにより,sciaiデータセットの実験結果から,提案手法が他手法よりも優れていることが示された。 Acronym identification focuses on finding the acronyms and the phrases that have been abbreviated, which is crucial for scientific document understanding tasks. However, the limited size of manually annotated datasets hinders further improvement for the problem. Recent breakthroughs of language models pre-trained on large corpora clearly show that unsupervised pre-training can vastly improve the performance of downstream tasks. In this paper, we present an Adversarial Training BERT method named AT-BERT, our winning solution to acronym identification task for Scientific Document Understanding (SDU) Challenge of AAAI 2021. Specifically, the pre-trained BERT is adopted to capture better semantic representation. Then we incorporate the FGM adversarial training strategy into the fine-tuning of BERT, which makes the model more robust and generalized. Furthermore, an ensemble mechanism is devised to involve the representations learned from multiple BERT variants. Assembling all these components together, the experimental results on the SciAI dataset show that our proposed approach outperforms all other competitive state-of-the-art methods.	翻訳日:2021-04-04 21:35:35 公開日:2021-01-12
# 階層的微分可能なアーキテクチャ探索による検索空間のアンチェーン Unchain the Search Space with Hierarchical Differentiable Architecture Search ( http://arxiv.org/abs/2101.04028v2 ) ライセンス: Link先を確認	Guanting Liu, Yujie Zhong, Sheng Guo, Matthew R. Scott, Weilin Huang	(参考訳) 微分可能なアーキテクチャサーチ (DAS) は計算コストを削減した高性能アーキテクチャの探索に大きく進歩している。しかし、DASベースの手法は主に繰り返し可能なセル構造を探索することに集中しており、複数のステージに順次積み重ねてネットワークを形成する。この構成は検索空間を大幅に減らし、細胞間の接続の重要性を無視する。本稿では,この制限を克服するために,セルレベルとステージレベルの両方でアーキテクチャ検索を行う階層的微分可能アーキテクチャ探索(h-das)を提案する。具体的には、ネットワークがステージ固有の細胞構造を学習できるように、細胞レベルの検索空間を緩和する。ステージレベルの探索では,各ステージ内の細胞数やセル間の接続など,ステージのアーキテクチャを体系的に研究する。洞察に富んだ観察に基づいて,いくつかの探索ルールと損失をデザインし,より優れたステージレベルのアーキテクチャを探索する。このような階層的検索空間は、高価な検索コストを伴わずにネットワークの性能を大幅に向上させる。 CIFAR10とImageNetの大規模な実験により,提案したH-DASの有効性が示された。さらに、探索されたステージレベルのアーキテクチャは、既存のDAS法で探索されたセル構造と組み合わせることで、パフォーマンスをさらに向上させることができる。コードは、https://github.com/MalongTech/research-HDASで入手できる。 Differentiable architecture search (DAS) has made great progress in searching for high-performance architectures with reduced computational cost. However, DAS-based methods mainly focus on searching for a repeatable cell structure, which is then stacked sequentially in multiple stages to form the networks. This configuration significantly reduces the search space, and ignores the importance of connections between the cells. To overcome this limitation, in this paper, we propose a Hierarchical Differentiable Architecture Search (H-DAS) that performs architecture search both at the cell level and at the stage level. Specifically, the cell-level search space is relaxed so that the networks can learn stage-specific cell structures. For the stage-level search, we systematically study the architectures of stages, including the number of cells in each stage and the connections between the cells. Based on insightful observations, we design several search rules and losses, and manage to search for better stage-level architectures. 
Such a hierarchical search space greatly improves the performance of the networks without introducing expensive search cost. Extensive experiments on CIFAR10 and ImageNet demonstrate the effectiveness of the proposed H-DAS. Moreover, the searched stage-level architectures can be combined with the cell structures searched by existing DAS methods to further boost the performance. Code is available at: https://github.com/MalongTech/research-HDAS	翻訳日:2021-04-04 14:39:33 公開日:2021-01-12
# (参考訳) 意味表現からのマルチコンディション生成の変換 Transforming Multi-Conditioned Generation from Meaning Representation ( http://arxiv.org/abs/2101.04257v1 ) ライセンス: CC BY 4.0	Joosung Lee	(参考訳) タスク指向会話システムでは,会話の流れに関連する特定の情報を生成する自然言語生成システムが有用である。本研究では,発話の意味を表す様々な情報を生成条件として考慮し,言語生成に焦点を当てた。意味表現からのNLG(文の意味の条件)は、通常、文計画と表面実現の2段階を経る。しかし、MR(Meaning Representation)から直接発話を生成するための単純なワンステージフレームワークを提案する。我々のモデルはGPT2に基づいており、文の構造を決定する必要がないスロットと値対の平らな条件の発話を生成する。 E2Eデータセット内の複数のシステムと6つの自動メトリクスを評価した。私たちのシステムは単純な手法ですが、従来のシステムと同等のパフォーマンスを自動測定で示しています。さらに,他の手法を使わずにデータセットの10%しか使用せず,同等の性能を実現し,ゼロショット生成や他のデータセットへの拡張の可能性を示す。 In task-oriented conversation systems, natural language generation systems that generate sentences with specific information related to conversation flow are useful. Our study focuses on language generation by considering various information representing the meaning of utterances as multiple conditions of generation. NLG from meaning representations, the conditions for sentence meaning, generally goes through two steps: sentence planning and surface realization. However, we propose a simple one-stage framework to generate utterances directly from MR (Meaning Representation). Our model is based on GPT2 and generates utterances with flat conditions on slot and value pairs, which does not need to determine the structure of the sentence. We evaluate several systems in the E2E dataset with 6 automatic metrics. Our system is a simple method, but it demonstrates comparable performance to previous systems in automated metrics. In addition, using only 10\% of the data set without any other techniques, our model achieves comparable performance, and shows the possibility of performing zero-shot generation and expanding to other datasets.	翻訳日:2021-04-04 13:11:30 公開日:2021-01-12
# (参考訳) clutter slicesアプローチによる室内空間の同定 Clutter Slices Approach for Identification-on-the-fly of Indoor Spaces ( http://arxiv.org/abs/2101.04262v1 ) ライセンス: CC BY 4.0	Upinder Kaur, Praveen Abbaraju, Harrison McCarty, and Richard M. Voyles	(参考訳) 建設空間は絶えず進化しており、継続的な測量、検査、評価を必要とする動的環境である。このような空間の伝統的な手動検査は、困難で時間を要する活動であることが証明されている。ロボットエージェントによる自動化は効果的なソリューションである。知覚能力を持つロボットは、屋内建設空間を自律的に分類し、調査することができる。本稿では,クラッタの一意なシグネチャを用いた室内空間の粗さ分類のための新しい識別・オン・ザ・フライ手法を提案する。乱雑に付与された文脈を用いて,廊下,階段,共用空間,トイレなどの一般的な屋内空間を認識する。提案したクラッタスライスパイプラインは,提案したクラッタスライスデータセットにおいて最大精度93.6%を達成する。このセンサ独立アプローチは、知的自律エージェントを環境をよりよく知覚するために様々な領域に一般化することができる。 Construction spaces are constantly evolving, dynamic environments in need of continuous surveying, inspection, and assessment. Traditional manual inspection of such spaces proves to be an arduous and time-consuming activity. Automation using robotic agents can be an effective solution. Robots, with perception capabilities can autonomously classify and survey indoor construction spaces. In this paper, we present a novel identification-on-the-fly approach for coarse classification of indoor spaces using the unique signature of clutter. Using the context granted by clutter, we recognize common indoor spaces such as corridors, staircases, shared spaces, and restrooms. The proposed clutter slices pipeline achieves a maximum accuracy of 93.6% on the presented clutter slices dataset. This sensor independent approach can be generalized to various domains to equip intelligent autonomous agents in better perceiving their environment.	翻訳日:2021-04-04 12:58:55 公開日:2021-01-12
# (参考訳) 手術映像における一時ガイド付き手指球追跡 Temporally Guided Articulated Hand Pose Tracking in Surgical Videos ( http://arxiv.org/abs/2101.04281v1 ) ライセンス: CC BY 4.0	Nathan Louis, Luowei Zhou, Steven J. Yule, Roger D. Dias, Milisa Manojlovich, Francis D. Pagani, Donald S. Likosky, Jason J. Corso	(参考訳) 手のポーズ追跡は未熟な問題であり、特に医療領域において、広範囲のアプリケーションで使用される可能性を持っている。生体内手術ビデオのロバストで正確な追跡システムにより、手の動きのダイナミクスや動きのパターンを捉えることができ、スキルアセスメント、手術従事者の訓練、時間的行動認識などのリッチなタスクに役立てることができる。本研究では,ポーズ予測に手ポーズを組み込むことでトラッキング精度を向上させる新しい手ポーズ推定モデルRes152-CondPoseを提案する。我々は,過去の予測を効果的に活用する時間的ガイド付きアプローチに従えば,フレーム単位の独立な予測を提供する最先端手法の改善を示す。さらに,マルチスタンスによる手ポーズアノテーションを提供する最初のデータセットであるオペレーショナルハンドを収集した。我々のデータセットには、28の公開手術ビデオから76の動画クリップと8.1k以上の注釈付き手ポーズインスタンスが含まれています。境界ボックス,手指ポーズアノテーション,トラッキングidを提供し,マルチインスタンス領域ベースおよび関節追跡を可能にした。手術手による評価では,平均平均精度(map),ポーズ推定精度,複数物体追跡精度(mota)を用いて,姿勢追跡性能を評価する手法が最先端手法よりも優れていることを示す。 Articulated hand pose tracking is an underexplored problem that carries the potential for use in an extensive number of applications, especially in the medical domain. With a robust and accurate tracking system on in-vivo surgical videos, the motion dynamics and movement patterns of the hands can be captured and analyzed for rich tasks including skills assessment, training surgical residents, and temporal action recognition. In this work, we propose a novel hand pose estimation model, Res152-CondPose, which improves tracking accuracy by incorporating a hand pose prior into its pose prediction. We show improvements over state-of-the-art methods which provide frame-wise independent predictions, by following a temporally guided approach that effectively leverages past predictions. Additionally, we collect the first dataset, Surgical Hands, that provides multi-instance articulated hand pose annotations for in-vivo videos. Our dataset contains 76 video clips from 28 publicly available surgical videos and over 8.1k annotated hand pose instances. 
We provide bounding boxes, articulated hand pose annotations, and tracking IDs to enable multi-instance area-based and articulated tracking. When evaluated on Surgical Hands, we show our method outperforms the state-of-the-art method using mean Average Precision (mAP), to measure pose estimation accuracy, and Multiple Object Tracking Accuracy (MOTA), to assess pose tracking performance.	翻訳日:2021-04-04 12:52:20 公開日:2021-01-12
# (参考訳) メタラーニングと一般AIの関連性に関する簡単な調査 A Brief Survey of Associations Between Meta-Learning and General AI ( http://arxiv.org/abs/2101.04283v1 ) ライセンス: CC BY 4.0	Huimin Peng	(参考訳) 本稿では,メタラーニングの歴史を概観し,一般AIへの貢献について述べる。メタラーニングはモデル一般化能力を向上し、分散処理と分散処理の両方に適用可能な汎用アルゴリズムを考案する。汎用AIは、タスク固有のモデルを、AIを使用して多様なタスクを解決するための高度な自動化を導入する一般的なアルゴリズムシステムに置き換える。我々は、メモリモジュール、メタラーナー、共進化、好奇心、忘れること、AI生成アルゴリズムなど、一般的なAI開発へのメタラーニングの主な貢献を要約する。メタラーニングと一般AIの関連性を示し、一般AIアルゴリズムの定式化にメタラーニングをどのように使用できるかについて議論する。 This paper briefly reviews the history of meta-learning and describes its contribution to general AI. Meta-learning improves model generalization capacity and devises general algorithms applicable to both in-distribution and out-of-distribution tasks potentially. General AI replaces task-specific models with general algorithmic systems introducing higher level of automation in solving diverse tasks using AI. We summarize main contributions of meta-learning to the developments in general AI, including memory module, meta-learner, coevolution, curiosity, forgetting and AI-generating algorithm. We present connections between meta-learning and general AI and discuss how meta-learning can be used to formulate general AI algorithms.	翻訳日:2021-04-04 12:31:00 公開日:2021-01-12
# (参考訳) 3D-ANAS:高速ハイパースペクトル画像分類のための3次元非対称ニューラルネットワーク探索 3D-ANAS: 3D Asymmetric Neural Architecture Search for Fast Hyperspectral Image Classification ( http://arxiv.org/abs/2101.04287v1 ) ライセンス: CC BY 4.0	Haokui Zhang, Chengrong Gong, Yunpeng Bai, Zongwen Bai and Ying Li	(参考訳) ハイパースペクトル画像はスペクトルと空間情報を豊富に含み、土地被覆分類において不定の役割を果たす。近年,ディープラーニング技術に基づいて,有望な性能を示すHSI分類手法が提案されている。しかし、これまでの研究では、1)ほとんどのディープラーニングモデルのアーキテクチャは手作業で設計されており、専門知識に依存しており、比較的退屈である。さらに、hsi分類では、異なるセンサーによってキャプチャされたデータセットは、物理的特性が異なる。それに合わせて、異なるモデルをさまざまなデータセット用に設計する必要があるため、アーキテクチャ設計の作業負荷はさらに増加する。隣接する画素のパッチの重複領域を繰り返し計算し、計算コストと時間コストを増大させる。さらに、分類精度は広範な調査実験に基づいて人工的に設定されるパッチサイズに敏感である。上記の問題を克服するため,まず3次元非対称ニューラルネットワーク探索アルゴリズムを提案し,HSI分類のための効率的なアーキテクチャを自動検索する。 hsisの特性を解析することにより、スペクトルと空間の情報を異なる分解畳み込みで処理する3次元非対称分解探索空間を特に構築する。さらに,反復操作を行わず,全体のコストを低減できる新しい高速分類フレームワーク,すなわち画素から画素への分類フレームワークを提案する。異なるセンサーによってキャプチャされた3つの公開HSIデータセットの実験では、我々の3D-ANASが設計したネットワークは、最先端のいくつかの手法と比較して競争力を発揮するが、推論速度ははるかに速い。 Hyperspectral images involve abundant spectral and spatial information, playing an irreplaceable role in land-cover classification. Recently, based on deep learning technologies, an increasing number of HSI classification approaches have been proposed, which demonstrate promising performance. However, previous studies suffer from two major drawbacks: 1) the architecture of most deep learning models is manually designed, relies on specialized knowledge, and is relatively tedious. Moreover, in HSI classifications, datasets captured by different sensors have different physical properties. Correspondingly, different models need to be designed for different datasets, which further increases the workload of designing architectures; 2) the mainstream framework is a patch-to-pixel framework. The overlap regions of patches of adjacent pixels are calculated repeatedly, which increases computational cost and time cost. 
Besides, the classification accuracy is sensitive to the patch size, which is artificially set based on extensive investigation experiments. To overcome the issues mentioned above, we first propose a 3D asymmetric neural network search algorithm and leverage it to automatically search for efficient architectures for HSI classification. By analysing the characteristics of HSIs, we specifically build a 3D asymmetric decomposition search space, where spectral and spatial information are processed with different decomposition convolutions. Furthermore, we propose a new fast classification framework, i.e., a pixel-to-pixel classification framework, which has no repetitive operations and reduces the overall cost. Experiments on three public HSI datasets captured by different sensors demonstrate that the networks designed by our 3D-ANAS achieve competitive performance compared to several state-of-the-art methods, while having a much faster inference speed.	翻訳日:2021-04-04 12:16:30 公開日:2021-01-12
# (参考訳) Fits and Starts: AutoMLの企業利用とループにおける人間の役割 Fits and Starts: Enterprise Use of AutoML and the Role of Humans in the Loop ( http://arxiv.org/abs/2101.04296v1 ) ライセンス: CC BY 4.0	Anamaria Crisan, Brittany Fiore-Gartland	(参考訳) AutoMLシステムは、通常のデータサイエンス作業のスピードアップと、統計学やコンピュータサイエンスの専門知識を持たない人たちの機械学習利用を可能にする。これらのシステムは、熟練したデータワーカーのプールが限られている企業環境で勢いを増している。本研究では,異なる規模の組織から29名の個人を対象に,データサイエンスにおけるAutoMLシステムの利用状況や利用意図についてインタビューを行った。また,データ可視化とAutoMLシステムとの併用について検討した。分析の結果,AutoMLの3つの利用シナリオは,さまざまなレベルの専門知識を持つデータワーカーが望む自動化レベルを要約するフレームワークとなった。スピードと人間の監視の緊張関係を表面化し、データの視覚化によって両者のバランスが悪くなることを発見した。本研究は,人間のループ内視覚分析手法の設計と実装に影響を及ぼすものである。 AutoML systems can speed up routine data science work and make machine learning available to those without expertise in statistics and computer science. These systems have gained traction in enterprise settings where pools of skilled data workers are limited. In this study, we conduct interviews with 29 individuals from organizations of different sizes to characterize how they currently use, or intend to use, AutoML systems in their data science work. Our investigation also captures how data visualization is used in conjunction with AutoML systems. Our findings identify three usage scenarios for AutoML that resulted in a framework summarizing the level of automation desired by data workers with different levels of expertise. We surfaced the tension between speed and human oversight and found that data visualization can do a poor job balancing the two. Our findings have implications for the design and implementation of human-in-the-loop visual analytics approaches.	翻訳日:2021-04-04 11:30:28 公開日:2021-01-12
# (参考訳) DeepiSign:CNNの統合性と認証を保護するために、目に見えないフレジブルな透かし DeepiSign: Invisible Fragile Watermark to Protect the Integrityand Authenticity of CNN ( http://arxiv.org/abs/2101.04319v1 ) ライセンス: CC BY 4.0	Alsharif Abuadbba, Hyoungshick Kim, Surya Nepal	(参考訳) 自動運転車のような現実のアプリケーションでデプロイされる畳み込みニューラルネットワーク(cnns)は、毒殺攻撃や微調整といった操作攻撃に弱いことが示されている。したがって、妥協されたモデルは不正な出力を生成し、悪意ある振る舞いをするので、CNNの完全性と信頼性を保証することが不可欠である。本稿では,CNNモデルの整合性と信頼性を確保するために,DeepiSignと呼ばれる自己完結型タンパ保護手法を提案する。 DeepiSignは、秘密とハッシュ値をCNNモデルに安全に埋め込むために、脆弱な目に見えない透かしというアイデアを適用している。モデルの完全性と信頼性を検証するために、モデルからシークレットを取得し、シークレットのハッシュ値を計算し、それを埋め込みハッシュ値と比較する。 CNNモデルに埋め込まれたシークレットの影響を最小限に抑えるため、ウェーブレットベースの手法を用いて重みを周波数領域に変換し、そのシークレットをより少ない有意な係数に埋め込む。理論的解析により,DeepiSignは各層に最大1KBのシークレットを隠蔽し,モデルの精度を最小限に抑えることができた。 deepisignのセキュリティと性能を評価するために,3種類の操作攻撃(ターゲット入力中毒,アウトプット中毒,微調整)に対する3つのデータセット(mnist,cifar-10,imagenet)を用いて,事前学習した4つのモデル(resnet18,vgg16,alexnet,mobilenet)について実験を行った。その結果,DeepiSignは分類精度を低下させることなく検証可能であり,CNNによる攻撃に対して堅牢であることがわかった。 Convolutional Neural Networks (CNNs) deployed in real-life applications such as autonomous vehicles have shown to be vulnerable to manipulation attacks, such as poisoning attacks and fine-tuning. Hence, it is essential to ensure the integrity and authenticity of CNNs because compromised models can produce incorrect outputs and behave maliciously. In this paper, we propose a self-contained tamper-proofing method, called DeepiSign, to ensure the integrity and authenticity of CNN models against such manipulation attacks. DeepiSign applies the idea of fragile invisible watermarking to securely embed a secret and its hash value into a CNN model. To verify the integrity and authenticity of the model, we retrieve the secret from the model, compute the hash value of the secret, and compare it with the embedded hash value. To minimize the effects of the embedded secret on the CNN model, we use a wavelet-based technique to transform weights into the frequency domain and embed the secret into less significant coefficients. 
Our theoretical analysis shows that DeepiSign can hide a secret of up to 1 KB in each layer with minimal loss of the model's accuracy. To evaluate the security and performance of DeepiSign, we performed experiments on four pre-trained models (ResNet18, VGG16, AlexNet, and MobileNet) using three datasets (MNIST, CIFAR-10, and ImageNet) against three types of manipulation attacks (targeted input poisoning, output poisoning, and fine-tuning). The results demonstrate that DeepiSign is verifiable without degrading the classification accuracy, and robust against representative CNN manipulation attacks.	翻訳日:2021-04-04 11:02:07 公開日:2021-01-12
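The verification step described in the DeepiSign abstract (retrieve the secret, hash it, compare with the embedded hash) can be sketched as follows. This is only an illustration of the check itself: SHA-256, the function names `pack_secret` and `verify_payload`, and the "secret followed by digest" byte layout are assumptions, and the wavelet-domain embedding into model weights is not modeled here.

```python
import hashlib
import hmac

DIGEST_LEN = 32  # SHA-256 digest size in bytes (hash choice is an assumption)


def pack_secret(secret: bytes) -> bytes:
    """Build the payload to embed: the secret followed by its hash."""
    return secret + hashlib.sha256(secret).digest()


def verify_payload(payload: bytes) -> bool:
    """Split a retrieved payload, re-hash the secret, and compare digests."""
    if len(payload) < DIGEST_LEN:
        return False
    secret, stored_digest = payload[:-DIGEST_LEN], payload[-DIGEST_LEN:]
    recomputed = hashlib.sha256(secret).digest()
    # Constant-time comparison avoids leaking where the digests differ.
    return hmac.compare_digest(recomputed, stored_digest)
```

Under these assumptions, any change to the retrieved bytes (for instance after fine-tuning alters the carrier weights) makes the recomputed hash disagree with the stored one, so verification fails.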
# (参考訳) 機械学習と信号特徴抽出を組み合わせたブラインド変調分類 Blind Modulation Classification via Combined Machine Learning and Signal Feature Extraction ( http://arxiv.org/abs/2101.04337v1 ) ライセンス: CC BY 4.0	Jafar Norolahi, Paeiz Azmi	(参考訳) 本研究では,ブラインド自動変調分類のためのアルゴリズムを提案する。低い信号対雑音比(SNR)で様々な変調を識別するために、機械学習と信号特徴抽出の組み合わせが有効である。提案アルゴリズムは4つのステップからなる。まず、正規および不規則なスペクトル特性に基づく変調信号の分岐に対するスペクトル分析に有利である。次に、受信信号に非線形ソフトマージン支持ベクトル(NS SVM)問題を適用し、そのシンボルを正しいシンボルと誤った(サポートベクトル)シンボルに分類する。 NS SVMの雇用は変調信号に対する物理層ノイズ効果の低減につながる。その後、k-centerクラスタリングは各クラスの中央を見つけることができる。最後に, 散乱図の相関関数推定は, 変調の既設理想散乱図と相関する。相関結果は分類結果である。さらなる評価のために、多くの公開手法と比較して成功率、性能、複雑さが提供される。シミュレーションにより、提案アルゴリズムは変調された信号をより少ないSNRで分類できることを示す。例えば、SNR=-4.2dBで4-QAM、SNR=2.1dBで4-FSKを成功率99%で認識できる。さらに,ns svmと特徴ベース関数の双対問題におけるカーネル関数の利用により,提案手法は複雑性が低く,実装が簡単である。 In this study, an algorithm for blind and automatic modulation classification is proposed. It combines machine learning and signal feature extraction to recognize a diverse range of modulations at low signal-to-noise ratio (SNR). The presented algorithm contains four steps. First, it uses spectrum analysis to branch modulated signals based on regular and irregular spectrum characteristics. Second, a nonlinear soft margin support vector (NS SVM) problem is applied to the received signal, and its symbols are classified into correct and incorrect (support vector) symbols. Employing the NS SVM reduces the effect of physical layer noise on the modulated signal. After that, k-center clustering finds the center of each class. Finally, in the correlation function estimation step, the scatter diagram is correlated with the pre-saved ideal scatter diagrams of the modulations. The correlation outcome is the classification result. For further evaluation, the success rate, performance, and complexity are compared with many published methods. The simulations show that the proposed algorithm can classify modulated signals at lower SNR. For example, it can recognize 4-QAM at SNR=-4.2 dB and 4-FSK at SNR=2.1 dB with a 99% success rate. 
Moreover, owing to the use of a kernel function in the dual problem of the NS SVM and of feature-based functions, the proposed algorithm has low complexity and is simple to implement in practice.	翻訳日:2021-04-04 10:44:51 公開日:2021-01-12
# (参考訳) ハイパーネットワークに基づく期待整合信号回復アルゴリズムを用いた位相検索 Phase Retrieval using Expectation Consistent Signal Recovery Algorithm based on Hypernetwork ( http://arxiv.org/abs/2101.04348v1 ) ライセンス: CC BY 4.0	Chang-Jen Wang, Chao-Kai Wen, Shang-Ho (Lawrence) Tsai, Shi Jin, Geoffrey Ye Li	(参考訳) 位相検索(PR)は現代の計算イメージングシステムにおいて重要な要素である。過去半世紀にわたって多くのアルゴリズムが開発されてきた。近年のディープラーニングの進歩は、堅牢で高速なPRの新たな可能性を開いた。 deep unfoldingと呼ばれる新たなテクニックは、従来のモデルベースの反復アルゴリズムと、現代的なデータベースのディープラーニングとの系統的な接続を提供する。データ学習を利用した展開アルゴリズムは、元のアルゴリズムよりも顕著な性能と収束速度の向上を示した。その可能性にもかかわらず、既存の展開アルゴリズムのほとんどは、層依存パラメータを使用する場合、一定の数の反復に限られる。本研究では,既存の制約を克服するために,深い展開のための新しい枠組みを開発する。一般の逆問題に対して,我々のフレームワークが広く適用可能であるとしても,PRを例として取り上げる。我々の開発は、データ駆動学習において減衰因子が残される一般化予測整合信号回復アルゴリズム(GEC-SR)に基づいている。特に, GEC-SR の減衰係数を生成するハイパーネットワークを導入する。最適な減衰因子を直接学習する代わりに、ハイパーネットワークは、臨床設定に従って最適な減衰因子を生成する方法を学び、異なるシナリオへの適応性を確保する。ハイパーネットワークの動作を異なるレイヤ番号に適応させるため、私たちはリカレントアーキテクチャを使用して動的ハイパーネットワークを開発し、レイヤ間でオンラインに変化可能な減衰係数を生成します。また,ハイパーネットワークのロバスト性を高めるために自己アテンション機構を利用する。大規模な実験により、提案アルゴリズムは収束速度と精度で既存のアルゴリズムより優れており、多くの古典的PRアルゴリズムが不安定または失敗する非常に厳しい条件下でも機能することが示された。 Phase retrieval (PR) is an important component in modern computational imaging systems. Many algorithms have been developed over the past half century. Recent advances in deep learning have opened up a new possibility for robust and fast PR. An emerging technique, called deep unfolding, provides a systematic connection between conventional model-based iterative algorithms and modern data-based deep learning. Unfolded algorithms, powered by data learning, have shown remarkable performance and convergence speed improvement over the original algorithms. Despite their potential, most existing unfolded algorithms are strictly confined to a fixed number of iterations when employing layer-dependent parameters. In this study, we develop a novel framework for deep unfolding to overcome the existing limitations. Even if our framework can be widely applied to general inverse problems, we take PR as an example in the paper. 
Our development is based on an unfolded generalized expectation consistent signal recovery (GEC-SR) algorithm, wherein damping factors are left for data-driven learning. In particular, we introduce a hypernetwork to generate the damping factors for GEC-SR. Instead of directly learning a set of optimal damping factors, the hypernetwork learns how to generate the optimal damping factors according to the clinical settings, thus ensuring its adaptivity to different scenarios. To make the hypernetwork adapt to varying layer numbers, we use a recurrent architecture to develop a dynamic hypernetwork, which generates a damping factor that can vary online across layers. We also exploit a self-attention mechanism to enhance the robustness of the hypernetwork. Extensive experiments show that the proposed algorithm outperforms existing ones in convergence speed and accuracy, and still works well under very harsh settings in which many classical PR algorithms are unstable or even fail.	翻訳日:2021-04-04 10:32:38 公開日:2021-01-12
# On the Calibration and Uncertainty of Neural Learning to Rank Models ( http://arxiv.org/abs/2101.04356v1 ) License: CC BY 4.0	Gustavo Penha and Claudia Hauff	According to the Probability Ranking Principle (PRP), ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad-hoc retrieval. The PRP holds when two conditions are met: [C1] the models are well calibrated, and [C2] the probabilities of relevance are reported with certainty. We know, however, that deep neural networks (DNNs) are often not well calibrated and have several sources of uncertainty, and thus [C1] and [C2] might not be satisfied by neural rankers. Given the success of neural Learning to Rank (L2R) approaches, and here especially BERT-based approaches, we first analyze under which circumstances deterministic neural rankers, i.e., those that output point estimates, are calibrated. Then, motivated by our findings, we use two techniques to model the uncertainty of neural rankers, leading to the proposed stochastic rankers, which output a predictive distribution of relevance as opposed to point estimates. Our experimental results on the ad-hoc retrieval task of conversation response ranking reveal that (i) BERT-based rankers are not robustly calibrated and that stochastic BERT-based rankers yield better calibration; and (ii) uncertainty estimation is beneficial both for risk-aware neural ranking, i.e., taking into account the uncertainty when ranking documents, and for predicting unanswerable conversational contexts.	Translation date: 2021-04-04 09:55:48 Publication date: 2021-01-12
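Condition [C1] can be made concrete with a calibration metric. The abstract does not say which metric the authors use, so the sketch below assumes one common choice, the expected calibration error (ECE), in a binary-relevance form that compares the mean predicted relevance probability with the observed fraction of relevant items in each probability bin. The function name and the bin count are illustrative assumptions.

```python
def expected_calibration_error(probs, labels, num_bins=10):
    """Weighted mean of |observed relevance rate - mean predicted probability|.

    probs: predicted probabilities of relevance in [0, 1].
    labels: 1 for relevant, 0 for not relevant.
    """
    n = len(probs)
    bins = [[] for _ in range(num_bins)]
    for p, y in zip(probs, labels):
        # Equal-width bins; p == 1.0 falls into the last bin.
        idx = min(int(p * num_bins), num_bins - 1)
        bins[idx].append((p, y))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        relevant_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(relevant_rate - mean_prob)
    return ece


# A ranker that says 0.8 and is right 80% of the time is calibrated.
calibrated = expected_calibration_error([0.8] * 5, [1, 1, 1, 1, 0])
# A ranker that says 0.9 but is right only 25% of the time is overconfident.
overconfident = expected_calibration_error([0.9] * 4, [1, 0, 0, 0])
```

A lower value means predicted probabilities track observed relevance more closely.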
# A SOM-based Gradient-Free Deep Learning Method with Convergence Analysis ( http://arxiv.org/abs/2101.05612v1 ) License: CC BY 4.0	Shaosheng Xu, Jinde Cao, Yichao Cao, Tong Wang	Because the gradient descent method in deep learning raises a series of issues, this paper proposes a novel gradient-free deep learning structure. By adding a new module to the traditional Self-Organizing Map and introducing a residual into the map, a Deep Valued Self-Organizing Map network is constructed. The paper also gives a convergence analysis of this Deep Valued Self-Organizing Map network, which yields an inequality relating the designed parameters to the dimension of the inputs and the loss of prediction.	Translation date: 2021-04-04 09:39:12 Publication date: 2021-01-12
# Data-driven peakon and periodic peakon travelling wave solutions of some nonlinear dispersive equations via deep learning ( http://arxiv.org/abs/2101.04371v1 ) License: CC BY 4.0	Li Wang and Zhenya Yan	In the field of mathematical physics, there exist many physically interesting nonlinear dispersive equations with peakon solutions, which are solitary waves with a discontinuous first-order derivative at the wave peak. In this paper, we apply multi-layer physics-informed neural network (PINN) deep learning to successfully study the data-driven peakon and periodic peakon solutions of some well-known nonlinear dispersive equations with initial-boundary value conditions, such as the Camassa-Holm (CH) equation, the Degasperis-Procesi equation, the modified CH equation with cubic nonlinearity, the Novikov equation with cubic nonlinearity, the mCH-Novikov equation, the b-family equation with quartic nonlinearity, the generalized modified CH equation with quintic nonlinearity, etc. These results will be useful for further study of the peakon solutions and the corresponding experimental design of nonlinear dispersive equations.	Translation date: 2021-04-04 09:24:32 Publication date: 2021-01-12
# Dynamic Spectrum Access using Stochastic Multi-User Bandits ( http://arxiv.org/abs/2101.04388v1 ) License: CC BY 4.0	Meghana Bande, Akshayaa Magesh, Venugopal V. Veeravalli	A stochastic multi-user multi-armed bandit framework is used to develop algorithms for uncoordinated spectrum access. In contrast to prior work, it is assumed that rewards can be non-zero even under collisions, thus allowing the number of users to be greater than the number of channels. The proposed algorithm consists of an estimation phase and an allocation phase. It is shown that if every user adopts the algorithm, the system-wide regret is order-optimal, of order $O(\log T)$ over a time horizon of duration $T$. The regret guarantees hold both when the number of users is greater than the number of channels and when it is less. The algorithm is extended to the dynamic case where the number of users in the system evolves over time, and is shown to lead to sub-linear regret.	Translation date: 2021-04-04 08:47:53 Publication date: 2021-01-12
# Measuring Recommender System Effects with Simulated Users ( http://arxiv.org/abs/2101.04526v1 ) License: CC BY 4.0	Sirui Yao and Yoni Halpern and Nithum Thain and Xuezhi Wang and Kang Lee and Flavien Prost and Ed H. Chi and Jilin Chen and Alex Beutel	Imagine a food recommender system: how would we check if it is causing and fostering unhealthy eating habits or merely reflecting users' interests? How much of a user's experience over time with a recommender is caused by the recommender system's choices and biases, and how much is based on the user's preferences and biases? Popularity bias and filter bubbles are two of the most well-studied recommender system biases, but most of the prior research has focused on understanding the system behavior in a single recommendation step. How do these biases interplay with user behavior, and what types of user experiences are created from repeated interactions? In this work, we offer a simulation framework for measuring the impact of a recommender system under different types of user behavior. Using this simulation framework, we can (a) isolate the effect of the recommender system from the user preferences, and (b) examine how the system performs not just on average for an "average user" but also in the extreme experiences under atypical user behavior. As part of the simulation framework, we propose a set of evaluation metrics over the simulations to understand the recommender system's behavior. Finally, we present two empirical case studies, one on traditional collaborative filtering in MovieLens and one on a large-scale production recommender system, to understand how popularity bias manifests over time.	Translation date: 2021-04-04 08:14:03 Publication date: 2021-01-12
# Superpixel-based Refinement for Object Proposal Generation ( http://arxiv.org/abs/2101.04574v1 ) License: CC BY 4.0	Christian Wilms and Simone Frintrop	Precise segmentation of objects is an important problem in tasks like class-agnostic object proposal generation or instance segmentation. Deep learning-based systems usually generate segmentations of objects based on coarse feature maps, due to the inherent downsampling in CNNs. This leads to segmentation boundaries that do not adhere well to the object boundaries in the image. To tackle this problem, we introduce a new superpixel-based refinement approach on top of the state-of-the-art object proposal system AttentionMask. The refinement utilizes superpixel pooling for feature extraction and a novel superpixel classifier to determine whether a high-precision superpixel belongs to an object or not. Our experiments show an improvement of up to 26.0% in terms of average recall compared to the original AttentionMask. Furthermore, qualitative and quantitative analyses of the segmentations reveal significant improvements in terms of boundary adherence for the proposed refinement compared to various deep learning-based state-of-the-art object proposal generation systems.	Translation date: 2021-04-04 07:54:55 Publication date: 2021-01-12
# Sharp detection boundaries on testing dense subhypergraph ( http://arxiv.org/abs/2101.04584v1 ) License: CC BY 4.0	Mingao Yuan and Zuofeng Shang	We study the problem of testing the existence of a dense subhypergraph. The null hypothesis is an Erdős-Rényi uniform random hypergraph and the alternative hypothesis is a uniform random hypergraph that contains a dense subhypergraph. We establish sharp detection boundaries in two scenarios: (1) the edge probabilities are known; (2) the edge probabilities are unknown. In both scenarios, the sharp detection boundaries are characterized by the appropriate model parameters. Asymptotically powerful tests are provided when the model parameters fall in the detectable regions. Our results indicate that the detectable regions for general hypergraph models are dramatically different from their graph counterparts.	Translation date: 2021-04-04 07:41:53 Publication date: 2021-01-12
# Dimensions of Commonsense Knowledge ( http://arxiv.org/abs/2101.04640v1 ) License: CC0 1.0	Filip Ilievski, Alessandro Oltramari, Kaixin Ma, Bin Zhang, Deborah L. McGuinness, Pedro Szekely	Commonsense knowledge is essential for many AI applications, including those in natural language processing, visual processing, and planning. Consequently, many sources that include commonsense knowledge have been designed and constructed over the past decades. Recently, the focus has been on large text-based sources, which facilitate easier integration with neural (language) models and application to textual tasks, typically at the expense of the semantics of the sources. Such practice prevents the harmonization of these sources and an understanding of their coverage and gaps, and may hinder the semantic alignment of their knowledge with downstream tasks. Efforts to consolidate commonsense knowledge have yielded partial success, but provide no clear path towards a comprehensive consolidation of existing commonsense knowledge. The ambition of this paper is to organize these sources around a common set of dimensions of commonsense knowledge. For this purpose, we survey a wide range of popular commonsense sources with a special focus on their relations. We consolidate these relations into 13 knowledge dimensions, each abstracting over more specific relations found in sources. This consolidation allows us to unify the separate sources and to compute indications of their coverage, overlap, and gaps with respect to the knowledge dimensions. Moreover, we analyze the impact of each dimension on downstream reasoning tasks that require commonsense knowledge, observing that the temporal and desire/goal dimensions are very beneficial for reasoning on current downstream tasks, while distinctness and lexical knowledge have little impact. These results reveal a focus on some dimensions in current evaluation, and potential neglect of others.	Translation date: 2021-04-04 07:12:40 Publication date: 2021-01-12
# Real or Virtual? Using Brain Activity Patterns to differentiate Attended Targets during Augmented Reality Scenarios ( http://arxiv.org/abs/2101.05272v1 ) License: CC BY 4.0	Lisa-Marie Vortmann, Leonid Schwenke, Felix Putze	Augmented Reality is the fusion of virtual components and our real surroundings. The simultaneous visibility of generated and natural objects often requires users to direct their selective attention to a specific target that is either real or virtual. In this study, we investigated whether this target is real or virtual by using machine learning techniques to classify electroencephalographic (EEG) data collected in Augmented Reality scenarios. A shallow convolutional neural net classified 3-second data windows from 20 participants in a person-dependent manner with an average accuracy above 70% if the testing data and training data came from different trials. Person-independent classification was possible above chance level for 6 out of 20 participants. Thus, the reliability of such a Brain-Computer Interface is high enough for it to be treated as a useful input mechanism for Augmented Reality applications.	Translation date: 2021-04-04 06:45:02 Publication date: 2021-01-12
# Boundary-Aware Segmentation Network for Mobile and Web Applications ( http://arxiv.org/abs/2101.04704v1 ) License: CC BY 4.0	Xuebin Qin and Deng-Ping Fan and Chenyang Huang and Cyril Diagne and Zichen Zhang and Adrià Cabeza Sant'Anna and Albert Suàrez and Martin Jagersand and Ling Shao	Although deep models have greatly improved the accuracy and robustness of image segmentation, obtaining segmentation results with highly accurate boundaries and fine structures is still a challenging problem. In this paper, we propose a simple yet powerful Boundary-Aware Segmentation Network (BASNet), which comprises a predict-refine architecture and a hybrid loss, for highly accurate image segmentation. The predict-refine architecture consists of a densely supervised encoder-decoder network and a residual refinement module, which are respectively used to predict and refine a segmentation probability map. The hybrid loss is a combination of the binary cross entropy, structural similarity, and intersection-over-union losses, which guide the network to learn three-level (i.e., pixel-, patch-, and map-level) hierarchy representations. We evaluate our BASNet on two reverse tasks, salient object segmentation and camouflaged object segmentation, showing that it achieves very competitive performance with sharp segmentation boundaries. Importantly, BASNet runs at over 70 fps on a single GPU, which benefits many potential real applications. Based on BASNet, we further developed two (close to) commercial applications: AR COPY & PASTE, in which BASNet is integrated with augmented reality for "COPYING" and "PASTING" real-world objects, and OBJECT CUT, which is a web-based tool for automatic object background removal. Both applications have already drawn a huge amount of attention and have important real-world impacts. The code and two applications will be publicly available at: https://github.com/NathanUA/BASNet.	Translation date: 2021-04-04 06:27:02 Publication date: 2021-01-12
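The hybrid loss named in the abstract combines binary cross entropy, structural similarity, and intersection-over-union terms. The sketch below shows one way such a sum could look on flattened probability maps. It rests on several assumptions that are not taken from the abstract: the three terms are weighted equally, SSIM is computed over a single global window rather than local patches, and plain Python lists stand in for image tensors. All function names are hypothetical.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Pixel-level term: mean binary cross entropy."""
    total = 0.0
    for p, g in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= g * math.log(p) + (1.0 - g) * math.log(1.0 - p)
    return total / len(pred)


def ssim_loss(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    """Patch-level term: 1 - SSIM, simplified to one global window."""
    n = len(pred)
    mu_p = sum(pred) / n
    mu_g = sum(target) / n
    var_p = sum((p - mu_p) ** 2 for p in pred) / n
    var_g = sum((g - mu_g) ** 2 for g in target) / n
    cov = sum((p - mu_p) * (g - mu_g) for p, g in zip(pred, target)) / n
    ssim = ((2 * mu_p * mu_g + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_g ** 2 + c1) * (var_p + var_g + c2)
    )
    return 1.0 - ssim


def iou_loss(pred, target, eps=1e-7):
    """Map-level term: 1 - soft intersection over union."""
    inter = sum(p * g for p, g in zip(pred, target))
    union = sum(p + g - p * g for p, g in zip(pred, target))
    return 1.0 - inter / (union + eps)


def hybrid_loss(pred, target):
    # Equal weighting of the three terms is an assumption of this sketch.
    return bce_loss(pred, target) + ssim_loss(pred, target) + iou_loss(pred, target)


target = [0.0, 0.0, 1.0, 1.0]
good = hybrid_loss([0.05, 0.1, 0.9, 0.95], target)
bad = hybrid_loss([0.9, 0.8, 0.2, 0.1], target)
```

A prediction close to the mask scores a lower loss than an inverted one, and each term penalizes a different kind of error (per-pixel, structural, and whole-map overlap).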
# Explicit homography estimation improves contrastive self-supervised learning ( http://arxiv.org/abs/2101.04713v1 ) License: CC BY 4.0	David Torpey and Richard Klein	The typical contrastive self-supervised algorithm uses a similarity measure in latent space as the supervision signal by contrasting positive and negative images directly or indirectly. Although the utility of self-supervised algorithms has improved recently, there are still bottlenecks hindering their widespread use, such as the compute needed. In this paper, we propose a module that serves as an additional objective in the self-supervised contrastive learning paradigm. We show how the inclusion of this module to regress the parameters of an affine transformation or homography, in addition to the original contrastive objective, improves both performance and learning speed. Importantly, we ensure that this module does not enforce invariance to the various components of the affine transform, as this is not always ideal. We demonstrate the effectiveness of the additional objective on two recent, popular self-supervised algorithms. We perform an extensive experimental analysis of the proposed method and show an improvement in performance for all considered datasets. Further, we find that although both the general homography and affine transformation are sufficient to improve performance and convergence, the affine transformation performs better in all cases.	Translation date: 2021-04-04 05:51:41 Publication date: 2021-01-12
# Towards fast machine-learning-assisted Bayesian posterior inference of realistic microseismic events ( http://arxiv.org/abs/2101.04724v1 ) License: CC BY 4.0	Davide Piras, Alessio Spurio Mancini, Benjamin Joachimi, Michael P. Hobson	Bayesian inference applied to microseismic activity monitoring allows for principled estimation of the coordinates of microseismic events from recorded seismograms, and their associated uncertainties. However, forward modelling of these microseismic events, necessary to perform Bayesian source inversion, can be prohibitively expensive in terms of computational resources. A viable solution is to train a surrogate model based on machine learning techniques, to emulate the forward model and thus accelerate Bayesian inference. In this paper, we improve on previous work, which considered only sources with an isotropic moment tensor. We train a machine learning algorithm on the power spectrum of the recorded pressure wave and show that the trained emulator allows for the complete and fast retrieval of the event coordinates for any source mechanism. Moreover, we show that our approach is computationally inexpensive, as it can be run in less than 1 hour on a commercial laptop, while yielding accurate results using fewer than $10^4$ training seismograms. We additionally demonstrate how the trained emulators can be used to identify the source mechanism through the estimation of the Bayesian evidence. This work lays the foundations for the efficient localisation and characterisation of any recorded seismogram, thus helping to quantify human impact on seismic activity and mitigate seismic hazard.	Translation date: 2021-04-04 05:39:47 Publication date: 2021-01-12
# SEED: Self-supervised Distillation For Visual Representation ( http://arxiv.org/abs/2101.04731v1 ) License: CC BY 4.0	Zhiyuan Fang, Jianfeng Wang, Lijuan Wang, Lei Zhang, Yezhou Yang, Zicheng Liu	This paper is concerned with self-supervised learning for small models. The problem is motivated by our empirical studies showing that while the widely used contrastive self-supervised learning method has made great progress on large model training, it does not work well for small models. To address this problem, we propose a new learning paradigm, named SElf-SupErvised Distillation (SEED), where we leverage a larger network (as Teacher) to transfer its representational knowledge into a smaller architecture (as Student) in a self-supervised fashion. Instead of directly learning from unlabeled data, we train a student encoder to mimic the similarity score distribution inferred by a teacher over a set of instances. We show that SEED dramatically boosts the performance of small networks on downstream tasks. Compared with self-supervised baselines, SEED improves the top-1 accuracy from 42.2% to 67.6% on EfficientNet-B0 and from 36.3% to 68.2% on MobileNet-v3-Large on the ImageNet-1k dataset.	Translation date: 2021-04-04 05:01:10 Publication date: 2021-01-12
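The idea of a student mimicking the teacher's similarity score distribution over a set of instances can be sketched as a cross-entropy between two softmax distributions. The details below are assumptions made for illustration and are not taken from the abstract: cosine similarity as the score, a single shared temperature, a small list `queue` of reference embeddings, and the function names.

```python
import math


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity_distribution(embedding, queue, temperature):
    """Softmax over cosine similarities between one embedding and a queue."""
    z = l2_normalize(embedding)
    logits = [
        sum(a * b for a, b in zip(z, l2_normalize(q))) / temperature
        for q in queue
    ]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - m) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_emb, student_emb, queue, temperature=0.1):
    """Cross-entropy of the student's distribution against the teacher's."""
    p_teacher = similarity_distribution(teacher_emb, queue, temperature)
    p_student = similarity_distribution(student_emb, queue, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


queue = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
teacher = [1.0, 0.1]
loss_close = distillation_loss(teacher, [1.0, 0.2], queue)
loss_far = distillation_loss(teacher, [-1.0, 0.3], queue)
```

The loss is smallest when the student ranks the queued instances the same way the teacher does, so no labels are needed to train the student.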
# Bootstrapping Motor Skill Learning with Motion Planning ( http://arxiv.org/abs/2101.04736v1 ) License: CC BY 4.0	Ben Abbatematteo, Eric Rosen, Stefanie Tellex, George Konidaris	Learning a robot motor skill from scratch is impractically slow; so much so that in practice, learning must be bootstrapped using a good skill policy obtained from human demonstration. However, relying on human demonstration necessarily degrades the autonomy of robots that must learn a wide variety of skills over their operational lifetimes. We propose using kinematic motion planning as a completely autonomous, sample-efficient way to bootstrap motor skill learning for object manipulation. We demonstrate the use of motion planners to bootstrap motor skills in two complex object manipulation scenarios with different policy representations: opening a drawer with a dynamic movement primitive representation, and closing a microwave door with a deep neural network policy. We also show how our method can bootstrap a motor skill for the challenging dynamic task of learning to hit a ball off a tee, where a kinematic plan based on treating the scene as static is insufficient to solve the task, but sufficient to bootstrap a more dynamic policy. In all three cases, our method is competitive with human-demonstrated initialization, and significantly outperforms starting with a random policy. This approach enables robots to efficiently and autonomously learn motor policies for dynamic tasks without human demonstration.	Translation date: 2021-04-04 04:36:19 Publication date: 2021-01-12
# Classification of Schizophrenia from Functional MRI Using Large-scale Extended Granger Causality ( http://arxiv.org/abs/2101.10471v1 ) License: CC BY 4.0	Axel Wismüller and M. Ali Vosoughi	The literature indicates that schizophrenia is associated with alterations in brain network connectivity. We investigate whether large-scale Extended Granger Causality (lsXGC) can capture such alterations using resting-state fMRI data. Our method utilizes dimension reduction combined with the augmentation of source time-series in a predictive time-series model for estimating directed causal relationships among fMRI time-series. The lsXGC is a multivariate approach since it identifies the relationship of the underlying dynamic system in the presence of all other time-series. Here lsXGC serves as a biomarker for classifying schizophrenia patients from typical controls using a subset of 62 subjects from the Centers of Biomedical Research Excellence (COBRE) data repository. We use brain connections estimated by lsXGC as features for classification. After feature extraction, we perform feature selection by Kendall's tau rank correlation coefficient followed by classification using a support vector machine. As a reference method, we compare our results with cross-correlation, typically used in the literature as a standard measure of functional connectivity. We cross-validate over 100 different training/test (90%/10%) data splits to obtain the mean accuracy and the mean Area Under the receiver operating characteristic Curve (AUC) across all tested numbers of features for lsXGC. Our results demonstrate a mean accuracy range of [0.767, 0.940] and a mean AUC range of [0.861, 0.983] for lsXGC. The result of lsXGC is significantly higher than the results obtained with cross-correlation, namely a mean accuracy of [0.721, 0.751] and a mean AUC of [0.744, 0.860]. Our results suggest the applicability of lsXGC as a potential biomarker for schizophrenia.	Translation date: 2021-04-04 04:21:57 Publication date: 2021-01-12
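The feature-selection step (ranking connectivity features by Kendall's tau against the class label) can be sketched as follows. This is an illustration under stated assumptions: it uses the simple tau-a variant, which scores tied pairs as zero, because the abstract does not specify a variant, and the function names and toy data are hypothetical. The support vector machine step is omitted.

```python
def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                score += 1  # concordant pair
            elif s < 0:
                score -= 1  # discordant pair
            # tied pairs contribute nothing
    return score / (n * (n - 1) / 2)


def select_top_features(features, labels, k):
    """Return indices of the k features with the largest |tau| to the labels.

    features: list of samples, each a list of feature values.
    labels: one class label (0 or 1) per sample.
    """
    num_features = len(features[0])
    scored = []
    for f in range(num_features):
        column = [row[f] for row in features]
        scored.append((abs(kendall_tau(column, labels)), f))
    # Sort by descending |tau|, breaking ties by feature index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [f for _, f in scored[:k]]


# Toy data: feature 0 tracks the label, feature 1 does not.
features = [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]]
labels = [0, 0, 1, 1]
selected = select_top_features(features, labels, k=1)
```

The selected indices would then be used to subset the feature matrix before training the classifier on each training split.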
# A Compact Deep Learning Model for Face Spoofing Detection ( http://arxiv.org/abs/2101.04756v1 ) License: CC BY 4.0	Seyedkooshan Hashemifard and Mohammad Akbari	In recent years, face biometric security systems have been spreading rapidly; as a result, presentation attack detection (PAD) has received significant attention from research communities and has become a major field of research. Researchers have tackled the problem with various methods, from exploiting conventional texture feature extraction such as LBP, BSIF, and LPQ to using deep neural networks with different architectures. Despite the results each of these techniques has achieved for a certain attack scenario or dataset, most of them still fail to generalize to unseen conditions, as the efficiency of each is limited to certain types of presentation attacks and instruments (PAI). In this paper, instead of completely extracting hand-crafted texture features or relying only on deep neural networks, we address the problem by fusing both wide and deep features in a unified neural architecture. The main idea is to take advantage of the strengths of both methods to derive a well-generalized solution to the problem. We also evaluate the effectiveness of our method by comparing the results with each of the mentioned techniques separately. The procedure is carried out on different spoofing datasets such as the ROSE-Youtu, SiW, and NUAA Imposter datasets. In particular, we simultaneously learn a low-dimensional latent space empowered with data-driven features learnt via a Convolutional Neural Network designed for the spoofing detection task (i.e., the deep channel) and leverage spoofing detection features already popular for spoofing in the frequency and temporal dimensions (i.e., the wide channel).	Translation date: 2021-04-04 04:09:29 Publication date: 2021-01-12
# (参考訳) 基数評価と順序評価の合同集約と学生用紙コンテストへの応用 Joint aggregation of cardinal and ordinal evaluations with an application to a student paper competition ( http://arxiv.org/abs/2101.04765v1 ) ライセンス: CC BY 4.0	Dorit S. Hochbaum and Erick Moreno-Centeno	(参考訳) 決定論における重要な問題は、個々のランク/レーティングを集団評価に集約することである。 2007 MSOMの学生論文コンペティションにおける新たな集約手法について述べる。この競争における集合問題は2つの課題をもたらす。第一に、各論文は裁判官のごくわずかな部分でのみレビューされ、その結果、総合評価は裁判官が選択した主観的な尺度に非常に敏感である。第二に、裁判官は審査した論文の基数評価と順序評価(格付けとランク付け)の両方を提供した。ここでの貢献は、順序と基数の評価を共同で総合評価に集約する新しい堅牢な方法論である。この方法論は、不完全な評価の場合、すなわち、個人がオブジェクトの厳密なサブセットのみを評価する場合に特に適しています。このアプローチは、大規模なプロジェクトや複数の優先順位を含む資本予算からプロジェクトを選択する委員会による管理的意思決定の問題において、潜在的に有用である。 An important problem in decision theory concerns the aggregation of individual rankings/ratings into a collective evaluation. We illustrate a new aggregation method in the context of the 2007 MSOM's student paper competition. The aggregation problem in this competition poses two challenges. Firstly, each paper was reviewed only by a very small fraction of the judges; thus the aggregate evaluation is highly sensitive to the subjective scales chosen by the judges. Secondly, the judges provided both cardinal and ordinal evaluations (ratings and rankings) of the papers they reviewed. The contribution here is a new robust methodology that jointly aggregates ordinal and cardinal evaluations into a collective evaluation. This methodology is particularly suitable in cases of incomplete evaluations -- i.e., when the individuals evaluate only a strict subset of the objects. This approach is potentially useful in managerial decision making problems by a committee selecting projects from a large set or capital budgeting involving multiple priorities.	翻訳日:2021-04-04 03:14:51 公開日:2021-01-12
# (参考訳) DuctTake:時空間ビデオ合成 DuctTake: Spatiotemporal Video Compositing ( http://arxiv.org/abs/2101.04772v1 ) ライセンス: CC BY 4.0	Jan Rueegg, Oliver Wang, Aljoscha Smolic, Markus Gross	(参考訳) DuctTakeは、シーンの複数のテイクを単一のビデオに実用的な合成を可能にするように設計されたシステムである。現在の業界ソリューションはオブジェクトセグメンテーション(オブジェクトセグメンテーション)に基づいており、手動入力とクリーンアップを必要とする難しい問題であり、フィルム製造プロセスの高価な部分を構成する。そこで本手法では,映像の体積を3次元グラフで補正し,最適な時空間シームを合成する。我々は,hd動画を合成するインタラクティブなツールとして,各セクションの実行時間と性能に特に注意を払いながら,必要なコンポーネント,決定,新しいテクニックを詳細に説明する。我々は,幅広い実例を提示し,現在最先端のツールを用いて,プロのアーティストが作成した複合作品と結果品質と作成時間を比較することにより,このアプローチを検証する。 DuctTake is a system designed to enable practical compositing of multiple takes of a scene into a single video. Current industry solutions are based around object segmentation, a hard problem that requires extensive manual input and cleanup, making compositing an expensive part of the film-making process. Our method instead composites shots together by finding optimal spatiotemporal seams using motion-compensated 3D graph cuts through the video volume. We describe in detail the required components, decisions, and new techniques that together make a usable, interactive tool for compositing HD video, paying special attention to running time and performance of each section. We validate our approach by presenting a wide variety of examples and by comparing result quality and creation time to composites made by professional artists using current state-of-the-art tools.	翻訳日:2021-04-04 02:58:59 公開日:2021-01-12
# (参考訳) 音声駆動サービスにおける実践的音声再使用防止 Practical Speech Re-use Prevention in Voice-driven Services ( http://arxiv.org/abs/2101.04773v1 ) ライセンス: CC BY 4.0	Yangyong Zhang, Maliheh Shirvanian, Sunpreet S. Arora, Jianwei Huang, and Guofei Gu	(参考訳) 音声駆動サービス(VDS)は、スマートホームコントロールからデジタルアシスタントを使った支払いまで、さまざまなアプリケーションで使用されている。このようなサービスへの入力は、オープンな音声チャンネル、例えばマイクを使って、教師なしの設定でキャプチャされることが多い。このような設定における運用上のセキュリティ要件の1つは、入力音声の鮮度である。本稿では,ユーザインタラクション時に動的音響ナンスを積極的に埋め込んだセキュリティオーバーレイであるAEOLUSについて述べる。音響ナンスは, (i) 確実に埋め込み・取り出しが可能であり, (ii) VDSユーザにとって非破壊的 (かつ知覚不能) であることを示す。実用的観点から、(i)および(ii)に対して最適なパラメータ(音響ナンスの動作周波数、振幅、ビットレート)を決定する。実験の結果,AEOLUSは背景雑音レベルが異なる3つの実環境において,音声の再使用防止のために0% FARで0.5%FRRを得ることがわかった。また,120名の被験者によるユーザ調査を行い,これらの環境では,94.16%の音声サンプルにおいて,全体のユーザエクスペリエンスが低下しないことを示した。そのため、AEOLUSは音声の再使用を防止し、音声入力の鮮度を確保するために実際に使用することができる。 Voice-driven services (VDS) are being used in a variety of applications ranging from smart home control to payments using digital assistants. The input to such services is often captured via an open voice channel, e.g., using a microphone, in an unsupervised setting. One of the key operational security requirements in such a setting is the freshness of the input speech. We present AEOLUS, a security overlay that proactively embeds a dynamic acoustic nonce at the time of user interaction, and detects the presence of the embedded nonce in the recorded speech to ensure freshness. We demonstrate that the acoustic nonce can (i) be reliably embedded and retrieved, and (ii) be non-disruptive (and even imperceptible) to a VDS user. Optimal parameters (acoustic nonce's operating frequency, amplitude, and bitrate) are determined for (i) and (ii) from a practical perspective. Experimental results show that AEOLUS yields 0.5% FRR at 0% FAR for speech re-use prevention up to a distance of 4 meters in three real-world environments with different background noise levels. 
We also conduct a user study with 120 participants, which shows that the acoustic nonce does not degrade overall user experience for 94.16% of speech samples, on average, in these environments. AEOLUS can therefore be used in practice to prevent speech re-use and ensure the freshness of speech input.	翻訳日:2021-04-04 02:42:29 公開日:2021-01-12
# (参考訳) マルチエージェントmdpのためのスケーラブルなanytime planning Scalable Anytime Planning for Multi-Agent MDPs ( http://arxiv.org/abs/2101.04788v1 ) ライセンス: CC BY 4.0	Shushman Choudhury, Jayesh K. Gupta, Peter Morales, Mykel J. Kochenderfer	(参考訳) 動的協調を必要とする大規模マルチエージェントシーケンシャル決定問題に対して,スケーラブルな木探索計画アルゴリズムを提案する。エージェントのチームは多くのドメインで決定をコーディネートする必要があるが、単純なアプローチはエージェントの数と共同アクション空間が指数関数的に増加するために失敗する。私たちはこの複雑さを、近似品質と動的に協調する動作のために計算を交換できるanytimeアプローチを通じて回避します。提案アルゴリズムは,モンテカルロ木探索 (MCTS) を用いたオンライン計画,協調グラフを用いた局所エージェント相互作用の因子表現,および協調行動選択のための反復マックスプラス法からなる。我々は,静的コーディネーショングラフを用いたベンチマークSysAdminのアプローチを評価し,MCTSベースラインよりも計算コストがはるかに低い性能を実現する。また,動的,すなわち状態依存のコーディネーショングラフを持つマルチドローン配送ドメインを導入し,我々のアプローチが,他のmctsメソッドでは難解なこの領域の大きな問題にどのようにスケールするかを実証する。我々はこのアルゴリズムのオープンソース実装をhttps://github.com/JuliaPOMDP/FactoredValueMCTS.jlで公開しています。 We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration. Teams of agents need to coordinate decisions in many domains, but naive approaches fail due to the exponential growth of the joint action space with the number of agents. We circumvent this complexity through an anytime approach that allows us to trade computation for approximation quality and also dynamically coordinate actions. Our algorithm comprises three elements: online planning with Monte Carlo Tree Search (MCTS), factored representations of local agent interactions with coordination graphs, and the iterative Max-Plus method for joint action selection. We evaluate our approach on the benchmark SysAdmin domain with static coordination graphs and achieve comparable performance with much lower computation cost than our MCTS baselines. We also introduce a multi-drone delivery domain with dynamic, i.e., state-dependent coordination graphs, and demonstrate how our approach scales to large problems on this domain that are intractable for other MCTS methods. 
We provide an open-source implementation of our algorithm at https://github.com/JuliaPOMDP/FactoredValueMCTS.jl.	翻訳日:2021-04-04 02:25:43 公開日:2021-01-12
# インスタント適応のための線形表現メタ強化学習 Linear Representation Meta-Reinforcement Learning for Instant Adaptation ( http://arxiv.org/abs/2101.04750v1 ) ライセンス: Link先を確認	Matt Peng, Banghua Zhu, Jiantao Jiao	(参考訳) 本稿では,Fast Linearized Adaptive Policy (FLAP)について紹介する。これは,学習中のデータ再利用を必要とせず,かつ,テスト中のサンプル数個だけでほぼ瞬時に適応できる,新しいメタ強化学習(meta-RL)手法である。 FLAPは方針の共有線形表現を学習するアイデアに基づいており、新しいタスクに適応すると、線形重みの集合を予測するのに十分である。適応中は、MAMLのような従来のメタRL法のように勾配勾配を更新する代わりに、アダプティブネットワークを用いてこれらの線形重み付けを予測することで、新しいポリシーを得られるように、個別のアダプタネットワークを同時に訓練する。異なるフィードフォワードネットワークの応用は、適応実行時間を著しく高速化するだけでなく、以前のMeta-RLメソッドでは一般化できなかった非常に異なるタスクに非常によく一般化する。標準の連続制御メタrlベンチマーク実験では、flapは平均リターンを最大2倍にし、以前の方法と比較して最大8倍高速に適応した実行時間速度を示す。 This paper introduces Fast Linearized Adaptive Policy (FLAP), a new meta-reinforcement learning (meta-RL) method that is able to extrapolate well to out-of-distribution tasks without the need to reuse data from training, and adapt almost instantaneously with the need of only a few samples during testing. FLAP builds upon the idea of learning a shared linear representation of the policy so that when adapting to a new task, it suffices to predict a set of linear weights. A separate adapter network is trained simultaneously with the policy such that during adaptation, we can directly use the adapter network to predict these linear weights instead of updating a meta-policy via gradient descent, such as in prior meta-RL methods like MAML, to obtain the new policy. The application of the separate feed-forward network not only speeds up the adaptation run-time significantly, but also generalizes extremely well to very different tasks that prior Meta-RL methods fail to generalize to. Experiments on standard continuous-control meta-RL benchmarks show FLAP presenting significantly stronger performance on out-of-distribution tasks with up to double the average return and up to 8X faster adaptation run-time speeds when compared to prior methods.	翻訳日:2021-04-04 01:55:07 公開日:2021-01-12
# 文脈問題:手話認識のための自己認識 Context Matters: Self-Attention for Sign Language Recognition ( http://arxiv.org/abs/2101.04632v1 ) ライセンス: Link先を確認	Fares Ben Slimane and Mohamed Bouguessa	(参考訳) 本稿では,連続手話認識のための注意ネットワークを提案する。提案手法は,手話のモダリティをモデル化するために,共依存データストリームを利用する。これらの異なる情報チャネルは、互いに複雑な時間構造を共有することができる。そのため、私たちは同期に注意を払い、異なる手話コンポーネント間の絡み合った依存関係を捉えるのに役立ちます。手話はマルチチャネルであるにもかかわらず、手形は手話解釈の中心的な実体を表す。正しい文脈で手形を見ることは、記号の意味を定義する。これを考慮し、注意機構を用いて、手の特徴を適切な時空間で効率的に集約し、より優れた手話認識を実現する。これによってモデルは、支配的な手と顔の領域を中心に回転する重要な手話コンポーネントを識別できることが分かりました。ベンチマークデータセットであるRWTH-PHOENIX-Weather 2014でテストを行い、競争結果を得た。 This paper proposes an attentional network for the task of Continuous Sign Language Recognition. The proposed approach exploits co-independent streams of data to model the sign language modalities. These different channels of information can share a complex temporal structure between each other. For that reason, we apply attention to synchronize and help capture entangled dependencies between the different sign language components. Even though Sign Language is multi-channel, handshapes represent the central entities in sign interpretation. Seeing handshapes in their correct context defines the meaning of a sign. Taking that into account, we utilize the attention mechanism to efficiently aggregate the hand features with their appropriate spatio-temporal context for better sign recognition. We found that by doing so the model is able to identify the essential Sign Language components that revolve around the dominant hand and the face areas. We test our model on the benchmark dataset RWTH-PHOENIX-Weather 2014, yielding competitive results.	翻訳日:2021-04-04 01:54:42 公開日:2021-01-12
# ビデオ感性分析のための量子認知型決定融合 Quantum Cognitively Motivated Decision Fusion for Video Sentiment Analysis ( http://arxiv.org/abs/2101.04406v1 ) ライセンス: Link先を確認	Dimitris Gkoumas, Qiuchi Li, Shahram Dehdashti, Massimo Melucci, Yijun Yu, Dawei Song	(参考訳) 意思決定プロセスとしての映像感情分析は本質的に複雑であり、複数のモダリティからの意思決定の融合や、いわゆる認知バイアスが伴う。量子認知の最近の進歩に触発されて、あるモダリティからの感情判断が他のモダリティの判断と相容れないこと、すなわち秩序が問題であり、最終的な決定を下すために共同で測定できないことを示す。したがって、認知過程は古典的確率論では捉えられない「量子的」バイアスを示す。そこで本研究では,感情判断予測のための新しい量子認知的融合戦略を提案する。特に、正および負の感性判断の量子重ね合わせ状態として発話を定式化し、一様分類器を相互に相反する可観測量として、正の演算値測度を持つ複素数値ヒルベルト空間上で定式化する。 2つのベンチマークデータセットの実験は、我々のモデルが既存の決定レベルと最先端のコンテンツレベルの融合アプローチを大きく上回っていることを示している。また,不整合性の概念は,すべてのユニモーダル分類器によって誤って予測される極端な事例を含む,すべての組み合わせパターンを効果的に扱えることを示す。 Video sentiment analysis as a decision-making process is inherently complex, involving the fusion of decisions from multiple modalities and the so-caused cognitive biases. Inspired by recent advances in quantum cognition, we show that the sentiment judgment from one modality could be incompatible with the judgment from another, i.e., the order matters and they cannot be jointly measured to produce a final decision. Thus the cognitive process exhibits "quantum-like" biases that cannot be captured by classical probability theories. Accordingly, we propose a fundamentally new, quantum cognitively motivated fusion strategy for predicting sentiment judgments. In particular, we formulate utterances as quantum superposition states of positive and negative sentiment judgments, and uni-modal classifiers as mutually incompatible observables, on a complex-valued Hilbert space with positive-operator valued measures. Experiments on two benchmarking datasets illustrate that our model significantly outperforms various existing decision level and a range of state-of-the-art content-level fusion approaches. 
The results also show that the concept of incompatibility allows effective handling of all combination patterns, including those extreme cases that are wrongly predicted by all uni-modal classifiers.	翻訳日:2021-04-04 01:54:30 公開日:2021-01-12
# マルチモーダルレシピにおける手続き的概念の潜在アライメント Latent Alignment of Procedural Concepts in Multimodal Recipes ( http://arxiv.org/abs/2101.04727v1 ) ライセンス: Link先を確認	Hossein Rajaby Faghihi, Roshanak Mirzaee, Sudarshan Paliwal, and Parisa Kordjamshidi	(参考訳) 本稿では、新たにリリースされたマルチモーダルQAデータセットRecipeQAの手続き的推論を扱うための新しいアライメント機構を提案する。私たちのモデルは,画像と指示を含むレシピの読み解き理解であるテキストクローゼタスクを解決している。我々は,アテンションネットワーク,クロスモーダル表現,命令と候補回答間の潜在アライメント空間のパワーを活用し,この問題を解決した。本稿では,アライメント行列の最大プーリング操作を洗練し,モデルの出力間に不一致な制約を課す制約付きマックスプーリングを提案する。評価の結果,ベースラインに対して19%の改善が見られた。 We propose a novel alignment mechanism to deal with procedural reasoning on a newly released multimodal QA dataset, named RecipeQA. Our model solves the textual cloze task, which is a reading comprehension task on a recipe containing images and instructions. We exploit the power of attention networks, cross-modal representations, and a latent alignment space between instructions and candidate answers to solve the problem. We introduce constrained max-pooling which refines the max-pooling operation on the alignment matrix to impose disjoint constraints among the outputs of the model. Our evaluation result indicates a 19% improvement over the baselines.	翻訳日:2021-04-04 01:53:52 公開日:2021-01-12
# UFA-FUSE:多焦点画像融合のための新しい深層教師付きハイブリッドモデル UFA-FUSE: A novel deep supervised and hybrid model for multi-focus image fusion ( http://arxiv.org/abs/2101.04506v1 ) ライセンス: Link先を確認	Yongsheng Zang, Dongming Zhou, Changcheng Wang, Rencan Nie, and Yanbu Guo	(参考訳) 従来の融合法および深層学習に基づく融合法は中間決定マップを生成し、一連の後処理手順を通じて融合画像を得る。しかし、これらの方法で生成された融合結果は、ソース画像の詳細を失ったり、アーティファクトを生じたりしやすい。ディープラーニングに基づく画像再構成技術に着想を得て,これらの課題をエンドツーエンドかつ教師付き学習方法で解決するために,ポストプロセッシングを伴わないマルチフォーカス画像融合ネットワークフレームワークを提案する。融合モデルを十分に訓練するために,正解融合画像を用いた大規模マルチフォーカス画像データセットを作成した。さらに,より情報的な融合画像を得るために,チャネルアテンションモジュールと空間アテンションモジュールから構成されるユニティフュージョンアテンションに基づく新しい融合戦略を設計した。具体的には,提案手法は主に特徴抽出,特徴融合,画像再構成の3つの要素からなる。まず,7つの畳み込みブロックを用いて画像の特徴を抽出する。そして, 抽出した畳み込み特性を, 特徴融合層の融合戦略により融合させる。最後に、融合画像の特徴を4つの畳み込みブロックで再構成する。実験の結果, 提案手法は19の最先端融合法と比較して, 優れた融合性能が得られることがわかった。 Traditional and deep learning-based fusion methods generate an intermediate decision map to obtain the fusion image through a series of post-processing procedures. However, the fusion results generated by these methods easily lose some source image details or contain artifacts. Inspired by the image reconstruction techniques based on deep learning, we propose a multi-focus image fusion network framework without any post-processing to solve these problems in an end-to-end and supervised learning way. To sufficiently train the fusion model, we have generated a large-scale multi-focus image dataset with ground-truth fusion images. What's more, to obtain a more informative fusion image, we further designed a novel fusion strategy based on unity fusion attention, which is composed of a channel attention module and a spatial attention module. Specifically, the proposed fusion approach mainly comprises three key components: feature extraction, feature fusion and image reconstruction. We firstly utilize seven convolutional blocks to extract the image features from source images. Then, the extracted convolutional features are fused by the proposed fusion strategy in the feature fusion layer. 
Finally, the fused image features are reconstructed by four convolutional blocks. Experimental results demonstrate that the proposed approach for multi-focus image fusion achieves remarkable fusion performance compared to 19 state-of-the-art fusion methods.	翻訳日:2021-04-04 01:53:40 公開日:2021-01-12
# 意味的特徴から物体間の相対的深さの予測 Predicting Relative Depth between Objects from Semantic Features ( http://arxiv.org/abs/2101.04626v1 ) ライセンス: Link先を確認	Stefan Cassar, Adrian Muscat, Dylan Seychell	(参考訳) 視覚関係検出や視覚的質問応答といった視覚および言語タスクは、言語を適切に接地できる意味的特徴から恩恵を受ける。 2次元画像で描かれた物体の3次元深度はそのような特徴である。しかし,シーン依存の適切な特徴を学習することなく正確な深度情報を得るのは難しい。この領域における技術の現状は、ステレオ画像データに基づいて訓練された複雑なニューラルネットワークモデルであり、ピクセルごとの深さを予測する。幸いなことに、いくつかのタスクでは、必要なのはオブジェクト間の相対的な深さのみである。本稿では,意味的特徴が粗い相対深さを予測できる程度について検討する。この問題を分類として、オブジェクト境界ボックスに基づく幾何学的特徴として、オブジェクトラベルとシーン属性を計算し、パターン認識モデルの入力として使用して相対深さ(すなわち、後ろ、前、中立)を予測する。結果は,最先端技術を表すmonodepthニューラルネットワークモデルの出力を平均化した結果と比較する。monodepthモデルから計算した相対深度に対する相対深度精度の14%の総合的な増加が達成された。 Vision and language tasks such as Visual Relation Detection and Visual Question Answering benefit from semantic features that afford proper grounding of language. The 3D depth of objects depicted in 2D images is one such feature. However, it is very difficult to obtain accurate depth information without learning the appropriate features, which are scene dependent. The state of the art in this area consists of complex neural network models trained on stereo image data to predict depth per pixel. Fortunately, in some tasks, it is only the relative depth between objects that is required. In this paper the extent to which semantic features can predict coarse relative depth is investigated. The problem is cast as a classification one, and geometrical features based on object bounding boxes, object labels and scene attributes are computed and used as inputs to pattern recognition models to predict relative depth, i.e., behind, in front, and neutral. The results are compared to those obtained from averaging the output of the monodepth neural network model, which represents the state of the art. An overall increase of 14% in relative depth accuracy over relative depth computed from the monodepth model derived results is achieved.	翻訳日:2021-04-04 01:53:22 公開日:2021-01-12
# 高精細画像合成のための高速安定化GAN訓練に向けて Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis ( http://arxiv.org/abs/2101.04775v1 ) ライセンス: Link先を確認	Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal	(参考訳) 高忠実度画像に対するGAN(Generative Adversarial Networks)のトレーニングは通常、大規模なGPUクラスタと大量のトレーニングイメージを必要とする。本稿では,最小計算コストでGANの少数ショット画像合成タスクについて検討する。 1024×1024の解像度で優れた品質が得られる軽量GAN構造を提案する。特に、モデルは1つのRTX-2080 GPUでわずか数時間のトレーニングでゼロから収束し、100以下のトレーニングサンプルでも一貫したパフォーマンスを持つ。機能エンコーダとして訓練されたスキップ層チャネル方向励振モジュールと自己教師付き判別器である。さまざまなイメージドメインをカバーする13のデータセット(データセットとコードはhttps://github.com/odegeasslbc/FastGAN-pytorchで利用可能)では、データとコンピューティング予算が限られている場合、最先端のStyleGAN2よりも優れたパフォーマンスを示しています。 Training Generative Adversarial Networks (GAN) on high-fidelity images usually requires large-scale GPU-clusters and a vast number of training images. In this paper, we study the few-shot image synthesis task for GAN with minimum computing cost. We propose a light-weight GAN structure that gains superior quality on 1024×1024 resolution. Notably, the model converges from scratch with just a few hours of training on a single RTX-2080 GPU, and has a consistent performance, even with less than 100 training samples. Two technique designs constitute our work, a skip-layer channel-wise excitation module and a self-supervised discriminator trained as a feature-encoder. With thirteen datasets covering a wide variety of image domains (The datasets and code are available at: https://github.com/odegeasslbc/FastGAN-pytorch), we show our model's superior performance compared to the state-of-the-art StyleGAN2, when data and computing budget are limited.	翻訳日:2021-04-04 01:53:05 公開日:2021-01-12
# オンライン旅行目的地予測のための統一フレームワーク A Unified Framework for Online Trip Destination Prediction ( http://arxiv.org/abs/2101.04520v1 ) ライセンス: Link先を確認	Victor Eberstein, Jonas Sjöblom, Nikolce Murgovski, Morteza Haghir Chehreghani	(参考訳) 旅行先予測は、旅行計画、自動運転、電気自動車など、多くのアプリケーションで重要性を増している分野である。この問題は、データがシーケンシャルな方法で到着するオンライン学習パラダイムで自然に解決することができるが、研究の大半はむしろオフライン設定だと考えている。本稿では,オンライントレーニングとオンライン予測の両方に適したオンライン環境での旅行先予測の統一フレームワークを提案する。この目的のために,2つのクラスタリングアルゴリズムを開発し,この問題に対する2つのオンライン予測モデルに統合する。実世界のデータセットにおけるクラスタリングアルゴリズムと予測モデルの異なる構成について検討する。従来のクラスタリングのメトリクスと精度を用いて、クラスタリングとフレームワーク全体がオフライン環境と比べて一貫した結果をもたらすことを実証する。最後に、オフラインのフレームワークと比較し、オンラインフレームワーク全体を評価するための新しい後悔の指標を提案する。このメトリックにより、誤った予測のソースをクラスタリングまたは予測モデルのいずれかに関連付けることができる。このメトリックを用いて,提案手法が真の分布に類似した確率分布に収束し,ベースラインのすべてよりも低い後悔を味わうことを示す。 Trip destination prediction is an area of increasing importance in many applications such as trip planning, autonomous driving and electric vehicles. Even though this problem could be naturally addressed in an online learning paradigm where data is arriving in a sequential fashion, the majority of research has rather considered the offline setting. In this paper, we present a unified framework for trip destination prediction in an online setting, which is suitable for both online training and online prediction. For this purpose, we develop two clustering algorithms and integrate them within two online prediction models for this problem. We investigate the different configurations of clustering algorithms and prediction models on a real-world dataset. By using traditional clustering metrics and accuracy, we demonstrate that both the clustering and the entire framework yield consistent results compared to the offline setting. Finally, we propose a novel regret metric for evaluating the entire online framework in comparison to its offline counterpart. This metric makes it possible to relate the source of erroneous predictions to either the clustering or the prediction model. 
Using this metric, we show that the proposed methods converge to a probability distribution resembling the true underlying distribution and enjoy a lower regret than all of the baselines.	翻訳日:2021-04-04 01:52:27 公開日:2021-01-12
# ベンチマークシミュレーションに基づく推論 Benchmarking Simulation-Based Inference ( http://arxiv.org/abs/2101.04653v1 ) ライセンス: Link先を確認	Jan-Matthis Lueckmann, Jan Boelts, David S. Greenberg, Pedro J. Gonçalves, Jakob H. Macke	(参考訳) 確率的モデリングの最近の進歩は、確率の数値的評価を必要としない多くのシミュレーションに基づく推論アルゴリズムを生み出した。しかし、このような'likelihood-free'アルゴリズムに適切なパフォーマンス指標を持つ公開ベンチマークは欠落している。これにより、アルゴリズムの比較と、その強みと弱みの特定が難しくなった。私たちは、推論タスクと適切なパフォーマンスメトリクスを備えたベンチマークを提供し、ニューラルネットワークと古典的な近似ベイズ計算手法を用いた最近のアプローチを含むアルゴリズムを初期選択します。性能指標の選択は重要であり、最先端のアルゴリズムでさえ改善の余地があり、逐次推定によりサンプリング効率が向上することがわかった。ニューラルネットワークベースのアプローチは一般的にパフォーマンスが向上するが、一様に最適なアルゴリズムはない。我々は,問題を診断し,アルゴリズムを改善するためのベンチマークの可能性を強調し,実践的なアドバイスを提供する。結果はコンパニオンwebサイトでインタラクティブに探すことができる。すべてのコードはオープンソースであり、さらなるベンチマークタスクと推論アルゴリズムに貢献することができる。 Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods. However, a public benchmark with appropriate performance metrics for such 'likelihood-free' algorithms has been lacking. This has made it difficult to compare algorithms and identify their strengths and weaknesses. We set out to fill this gap: We provide a benchmark with inference tasks and suitable performance metrics, with an initial selection of algorithms including recent approaches employing neural networks and classical Approximate Bayesian Computation methods. We found that the choice of performance metric is critical, that even state-of-the-art algorithms have substantial room for improvement, and that sequential estimation improves sample efficiency. Neural network-based approaches generally exhibit better performance, but there is no uniformly best algorithm. We provide practical advice and highlight the potential of the benchmark to diagnose problems and improve algorithms. The results can be explored interactively on a companion website. All code is open source, making it possible to contribute further benchmark tasks and inference algorithms.	翻訳日:2021-04-04 01:52:12 公開日:2021-01-12
# コミュニケーションのためのモデルベース機械学習 Model-Based Machine Learning for Communications ( http://arxiv.org/abs/2101.04726v1 ) ライセンス: Link先を確認	Nir Shlezinger, Nariman Farsad, Yonina C. Eldar, and Andrea J. Goldsmith	(参考訳) 本稿では,コミュニケーションシステムのためのモデルベース機械学習について紹介する。まず、モデルベースアルゴリズムと機械学習を組み合わせる既存の戦略を高レベルの観点から見直し、エンドツーエンドでトレーニングされた確立されたディープニューラルネットワーク(DNN)アーキテクチャを利用した従来のディープラーニングアプローチと比較する。次に,通信受信機の基本的なタスクの一つであるシンボル検出に注目する。本稿では,従来のディープアーキテクチャ,ディープ展開,DNN支援ハイブリッドアルゴリズムの異なる戦略が,この問題にどのように適用できるかを示す。最後の2つのアプローチは、純粋にモデルベースとdnnベースのレシーバーの中間に位置する。この特定のタスクに注目することで,各戦略の利点と欠点を強調し,コミュニケーションのためのモデルベース深層学習システムの設計を容易にするためのガイドラインを提案する。 We present an introduction to model-based machine learning for communication systems. We begin by reviewing existing strategies for combining model-based algorithms and machine learning from a high level perspective, and compare them to the conventional deep learning approach which utilizes established deep neural network (DNN) architectures trained in an end-to-end manner. Then, we focus on symbol detection, which is one of the fundamental tasks of communication receivers. We show how the different strategies of conventional deep architectures, deep unfolding, and DNN-aided hybrid algorithms, can be applied to this problem. The last two approaches constitute a middle ground between purely model-based and solely DNN-based receivers. By focusing on this specific task, we highlight the advantages and drawbacks of each strategy, and present guidelines to facilitate the design of future model-based deep learning systems for communications.	翻訳日:2021-04-04 01:51:46 公開日:2021-01-12
# CleftNet:脳電子顕微鏡によるシナプス下肢検出のための深層学習 CleftNet: Augmented Deep Learning for Synaptic Cleft Detection from Brain Electron Microscopy ( http://arxiv.org/abs/2101.04266v1 ) ライセンス: Link先を確認	Yi Liu, Shuiwang Ji	(参考訳) シナプス裂の検出はシナプスの生物学的機能を調べる上で重要なステップである。体積電子顕微鏡(em)は、em像を高分解能で微細に撮影することでシナプス裂の同定を可能にする。 em画像からシナプス裂を自動的に予測するために、機械学習のアプローチが採用されている。そこで本研究では,脳EM画像からのシナプス・クリフ検出を改善するための,CleftNetと呼ばれる新しい深層学習モデルを提案する。まず,機能拡張器とラベル拡張器という2つの新しいネットワークコンポーネントを提案する。機能拡張子は、入力からグローバル情報を融合し、cleftで共通の形態的パターンを学習し、拡張されたcleft機能に繋がる。さらに、さまざまな次元の出力を生成して、任意のディープネットワークに柔軟に統合することができる。提案するラベル拡張器は,各ボクセルのラベルを値からベクトルに拡張し,セグメンテーションラベルと境界ラベルの両方を含む。これにより、ネットワークは重要な形状情報を学び、より情報的なクリフ表現を生成することができる。提案する機能拡張子とラベル拡張子に基づき、cleftnetをu-netライクなネットワークとして構築する。本手法の有効性は,オンラインタスクとオフラインタスクの両方で評価される。私たちのCleftNetは現在、CREMIオープンチャレンジのオンラインタスクで#1にランク付けしています。さらに,オフラインタスクにおける定量的および定性的な結果から,本手法がベースラインアプローチを大きく上回っていることが示された。 Detecting synaptic clefts is a crucial step to investigate the biological function of synapses. The volume electron microscopy (EM) allows the identification of synaptic clefts by photoing EM images with high resolution and fine details. Machine learning approaches have been employed to automatically predict synaptic clefts from EM images. In this work, we propose a novel and augmented deep learning model, known as CleftNet, for improving synaptic cleft detection from brain EM images. We first propose two novel network components, known as the feature augmentor and the label augmentor, for augmenting features and labels to improve cleft representations. The feature augmentor can fuse global information from inputs and learn common morphological patterns in clefts, leading to augmented cleft features. In addition, it can generate outputs with varying dimensions, making it flexible to be integrated in any deep network. The proposed label augmentor augments the label of each voxel from a value to a vector, which contains both the segmentation label and boundary label. 
This allows the network to learn important shape information and to produce more informative cleft representations. Based on the proposed feature augmentor and label augmentor, we build CleftNet as a U-Net-like network. The effectiveness of our methods is evaluated on both online and offline tasks. Our CleftNet currently ranks #1 on the online task of the CREMI open challenge. In addition, both quantitative and qualitative results in the offline tasks show that our method outperforms the baseline approaches significantly.	翻訳日:2021-04-04 01:51:32 公開日:2021-01-12
# PvDeConv:3次元CAD構築のためのポイントボクセルデコンボリューション PvDeConv: Point-Voxel Deconvolution for Autoencoding CAD Construction in 3D ( http://arxiv.org/abs/2101.04493v1 ) ライセンス: Link先を確認	Kseniya Cherenkova, Djamila Aouada, Gleb Gusev	(参考訳) 本稿では,3次元データオートエンコーダのためのPoint-Voxel DeConvolution (PVDeConv) モジュールを提案する。その効率を示すために、コンピュータ支援設計(cad)モデルの基盤となる幾何学を密に記述した10k点の高分解能点雲を合成することを学ぶ。プロトルージョン、欠落した部分、円滑な縁、穴などのスキャンはCADオブジェクトの実際の3Dスキャンに必然的に現れる。元のCADモデル構築を3Dスキャンから学習するには、対応するオブジェクトの3Dスキャンとともに、真理を理解する必要がある。このギャップを解決するために、50k以上のCADモデルとその対応する3Dメッシュを含む、新しい専用データセットCC3Dを導入する。このデータセットは、3Dスキャン(CADモデル)のペアからサンプリングされた点雲の畳み込みオートエンコーダを学ぶために使用される。この新しいデータセットの課題は、ShapeNetでトレーニングされた他の生成点クラウドサンプリングモデルと比較できる。 CC3Dオートエンコーダは、3Dデータ生成の最先端モデルと比較してメモリ消費とトレーニング時間に関して効率的である。 We propose a Point-Voxel DeConvolution (PVDeConv) module for 3D data autoencoder. To demonstrate its efficiency we learn to synthesize high-resolution point clouds of 10k points that densely describe the underlying geometry of Computer Aided Design (CAD) models. Scanning artifacts, such as protrusions, missing parts, smoothed edges and holes, inevitably appear in real 3D scans of fabricated CAD objects. Learning the original CAD model construction from a 3D scan requires a ground truth to be available together with the corresponding 3D scan of an object. To close this gap, we introduce a new dedicated dataset, the CC3D, containing 50k+ pairs of CAD models and their corresponding 3D meshes. This dataset is used to learn a convolutional autoencoder for point clouds sampled from the pairs of 3D scans - CAD models. The challenges of this new dataset are demonstrated in comparison with other generative point cloud sampling models trained on ShapeNet. The CC3D autoencoder is efficient with respect to memory consumption and training time as compared to state-of-the-art models for 3D data generation.	翻訳日:2021-04-04 01:51:09 公開日:2021-01-12
# プログレッシブリトレーニングによる畳み込みニューラルネットワークの単純化 Convolutional Neural Network Simplification with Progressive Retraining ( http://arxiv.org/abs/2101.04699v1 ) ライセンス: Link先を確認	D. Osaku, J.F. Gomes, A.X. Falcão	(参考訳) カーネルプルーニング法は、畳み込みニューラルネットワーク(CNN)モデルの説明を高速化、単純化、改善するために提案されている。しかし、単純化されたモデルの有効性は、しばしば元のモデルよりも低い。本稿では,カーネル除去の客観的および主観的妥当性基準に基づく新しい手法を提案する。プロセス中、cnnモデルは、次の層から最初の層まで重みを調整し、プロセスに関わらない後の層の重みを保存することによって、現在の層が完全に単純化された場合にのみ再訓練される。私たちはこの戦略を「progressive retraining」と呼び、各単純化アクションの後にモデル全体を再トレーニングするカーネルプルーニングメソッドとは異なる。我々の主観的関連性基準は、視覚パターン認識における人間の能力を活用し、デザイナーによる単純化プロセスの理解を改善する。適切な適合基準とプログレッシブ・リトレーニングの組み合わせは,モデルの単純化によって有効性を向上できることを示す。また,提案手法は,4つの課題の画像データセットを用いて,最先端技術による2つの手法よりも優れた結果が得られることを示す。 Kernel pruning methods have been proposed to speed up, simplify, and improve explanation of convolutional neural network (CNN) models. However, the effectiveness of a simplified model is often below the original one. In this letter, we present new methods based on objective and subjective relevance criteria for kernel elimination in a layer-by-layer fashion. During the process, a CNN model is retrained only when the current layer is entirely simplified, by adjusting the weights from the next layer to the first one and preserving weights of subsequent layers not involved in the process. We call this strategy progressive retraining, differently from kernel pruning methods that usually retrain the entire model after each simplification action -- e.g., the elimination of one or a few kernels. Our subjective relevance criterion exploits the ability of humans in recognizing visual patterns and improves the designer's understanding of the simplification process. The combination of suitable relevance criteria and progressive retraining shows that our methods can increase effectiveness with considerable model simplification. 
We also demonstrate that our methods can provide better results than two popular ones and another one from the state-of-the-art using four challenging image datasets.	翻訳日:2021-04-04 01:50:38 公開日:2021-01-12
# 顔画像からの痛み推定のための個人化深層学習 Personalized Federated Deep Learning for Pain Estimation From Face Images ( http://arxiv.org/abs/2101.04800v1 ) ライセンス: Link先を確認	Ognjen Rudovic, Nicolas Tobis, Sebastian Kaltwang, Björn Schuller, Daniel Rueckert, Jeffrey F. Cohn and Rosalind W. Picard	(参考訳) 標準的な機械学習アプローチでは、ユーザのデータをひとつのコンピュータまたは共有データベースに集約する必要がある。したがって、特にデータ規制が厳格な医療環境では、中央アクセスを制限することが重要である。これに取り組む潜在的なアプローチは、生のトレーニングデータをローカルに保持しながら、ローカルにトレーニングされたモデルのパラメータを使用することで、複数の当事者が共有予測モデルを共同的に学習できるフェデレーション学習(fl)である。 AIによる鎮痛モニタリングの文脈では、長期の鎮痛監視のための機密性保存と非閉塞性鎮痛推定を可能とし、定期的なチェックアップを頻繁に行う看護スタッフの負担を軽減したい。この目的のために,顔画像から痛みを推定するためのPFDL(Personalized Federated Deep Learning)アプローチを提案する。 PFDLは、顔画像を共有することなく、異なるクライアント(主題など)にわたって、軽量CNNアーキテクチャを用いて実装されたディープモデルの協調トレーニングを実行する。標準FLのようにモデルのすべてのパラメータを共有する代わりに、PFDLは最後のレイヤをローカルに保持する(痛みの推定をパーソナライズするために使用される)。この(i)は、別のデータの機密性層を追加し、敵が対象者の痛みレベルを推測することを困難にし、(ii)局所的なパラメータチューニングによって各被験者の痛み推定をパーソナライズする。痛みの顔ビデオのデータセット(UNBC-McMaster Shoulder Pain Database)を用いて、PFDLは標準的な集中型およびFLアルゴリズムよりも可視的または優れた性能を示し、データのプライバシーをさらに強化する。これにより、より安全で計算効率が高く、多くの個人(家庭内の痛みモニタリングなど)にスケーラブルで、タイムリーで邪魔にならない痛み測定を提供することで、従来の痛みモニタリングを改善することができる。 Standard machine learning approaches require centralizing the users' data in one computer or a shared database, which raises data privacy and confidentiality concerns. Therefore, limiting central access is important, especially in healthcare settings, where data regulations are strict. A potential approach to tackling this is Federated Learning (FL), which enables multiple parties to collaboratively learn a shared prediction model by using parameters of locally trained models while keeping raw training data locally. In the context of AI-assisted pain-monitoring, we wish to enable confidentiality-preserving and unobtrusive pain estimation for long-term pain-monitoring and reduce the burden on the nursing staff who perform frequent routine check-ups.
To this end, we propose a novel Personalized Federated Deep Learning (PFDL) approach for pain estimation from face images. PFDL performs collaborative training of a deep model, implemented using a lightweight CNN architecture, across different clients (i.e., subjects) without sharing their face images. Instead of sharing all parameters of the model, as in standard FL, PFDL retains the last layer locally (used to personalize the pain estimates). This (i) adds another layer of data confidentiality, making it difficult for an adversary to infer pain levels of the target subject, while (ii) personalizing the pain estimation to each subject through local parameter tuning. Using a publicly available dataset of face videos of pain (UNBC-McMaster Shoulder Pain Database), we show that PFDL performs comparably to or better than the standard centralized and FL algorithms, while further enhancing data privacy. This has the potential to improve traditional pain monitoring by making it more secure, computationally efficient, and scalable to a large number of individuals (e.g., for in-home pain monitoring), providing timely and unobtrusive pain measurement.	翻訳日:2021-04-04 01:50:18 公開日:2021-01-12
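The parameter-sharing scheme described in the PFDL abstract above (share everything except the last layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `federated_round`, the dictionary layout, the parameter names `backbone` and `head`, and the use of a plain unweighted average are all assumptions made for illustration.

```python
def federated_round(clients, shared_keys):
    """Average only the shared parameter groups across clients.

    clients: list of dicts mapping a parameter-group name to a list of floats
             (hypothetical layout, chosen for illustration).
    shared_keys: names of the groups that are exchanged; every other group
                 (e.g. a personalized last layer) stays untouched on its client.
    """
    n = len(clients)
    for key in shared_keys:
        dim = len(clients[0][key])
        # Assumption: unweighted mean over clients; real systems may weight
        # each client by its number of local samples.
        avg = [sum(c[key][i] for c in clients) / n for i in range(dim)]
        for c in clients:
            c[key] = list(avg)
    return clients


# Two hypothetical clients: the "backbone" is shared, the "head" stays local.
clients = [
    {"backbone": [1.0, 3.0], "head": [10.0]},
    {"backbone": [3.0, 5.0], "head": [20.0]},
]
federated_round(clients, ["backbone"])
```

After one round both clients hold the same averaged backbone, while each keeps its own head, which is the part the abstract says is used to personalize the estimates.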
# 深層学習による膝蓋骨遠位端関節症の自動検出:多施設変形性膝関節症研究(MOST)データ Automated Detection of Patellofemoral Osteoarthritis from Knee Lateral View Radiographs Using Deep Learning: Data from the Multicenter Osteoarthritis Study (MOST) ( http://arxiv.org/abs/2101.04350v1 ) ライセンス: Link先を確認	Neslihan Bayramoglu, Miika T. Nieminen, Simo Saarakkala	(参考訳) 目的: 画像を用いた深層学習による膝蓋骨変形性膝関節症(PFOA)の予測能力を評価すること。デザイン:多中心型変形性関節症研究(MOST) (n=18,436膝) から膝側視像を抽出した。 Patellar region-of-interest(ROI)が最初に自動的に検出され、その後、終末から終末にかけての深部畳み込みニューラルネットワーク(CNN)が訓練され、パテロフェモラルOAの状態を検出した。深層学習に基づく物体検出法を用いてパテラーROIを検出した。 MOSTデータセットで提供される手動PFOAステータスアセスメントをCNNの分類結果として用いた。予測モデルの性能は, 受信機動作特性曲線 (ROC AUC) と, 層状5次元断面検証設定における精度再コール曲線 (PR) から得られた平均精度 (AP) に基づいて評価した。結果: 膝18,436例中3,425例(19%)がPFOAであった。 AUCとAPは、年齢、性別、体重指数(BMI)、西オンタリオ大学およびマクマスター大学関節炎指数(WOMAC)スコア、およびPFOAを予測するためのKelgren-Lawrence(KL)グレードが0.806と0.478であった。画像データのみを用いたCNNモデルはPFOA状態の予測を著しく改善した(ROC AUC=0.958, AP=0.862)。結論: 第1回機械学習に基づく自動pfoa検出法を提案する。さらに,膝側方x線写真から膝蓋骨領域を訓練した深層学習モデルでは,患者特性と臨床評価に基づくモデルよりもpfoaの予測が良好である。 Objective: To assess the ability of imaging-based deep learning to predict radiographic patellofemoral osteoarthritis (PFOA) from knee lateral view radiographs. Design: Knee lateral view radiographs were extracted from The Multicenter Osteoarthritis Study (MOST) (n = 18,436 knees). Patellar region-of-interest (ROI) was first automatically detected, and subsequently, end-to-end deep convolutional neural networks (CNNs) were trained and validated to detect the status of patellofemoral OA. Patellar ROI was detected using deep-learning-based object detection method. Manual PFOA status assessment provided in the MOST dataset was used as a classification outcome for the CNNs. Performance of prediction models was assessed using the area under the receiver operating characteristic curve (ROC AUC) and the average precision (AP) obtained from the precision-recall (PR) curve in the stratified 5-fold cross validation setting. Results: Of the 18,436 knees, 3,425 (19%) had PFOA. 
AUC and AP for the reference model including age, sex, body mass index (BMI), the total Western Ontario and McMaster Universities Arthritis Index (WOMAC) score, and tibiofemoral Kellgren-Lawrence (KL) grade to predict PFOA were 0.806 and 0.478, respectively. The CNN model that used only image data significantly improved the prediction of PFOA status (ROC AUC= 0.958, AP= 0.862). Conclusion: We present the first machine learning based automatic PFOA detection method. Furthermore, our deep learning based model trained on patella region from knee lateral view radiographs performs better at predicting PFOA than models based on patient characteristics and clinical assessments.	翻訳日:2021-04-04 01:49:44 公開日:2021-01-12
# 高精度ピック・アンド・プレイス作業のためのシミュレーションから実世界への移動経験 Transferring Experience from Simulation to the Real World for Precise Pick-And-Place Tasks in Highly Cluttered Scenes ( http://arxiv.org/abs/2101.04781v1 ) ライセンス: Link先を確認	Kilian Kleeberger and Markus Völk and Marius Moosmann and Erik Thiessenhusen and Florian Roth and Richard Bormann and Marco F. Huber	(参考訳) 本稿では,高度に散らばったシーンで既知の剛体物体を把握し,深度画像に基づいて正確に配置する,新しい学習手法を提案する。 pq-net (placement quality network) は、ニューラルネットワークの1回のフォワードパスにおいて、複数のオブジェクトに対して、自動的に生成された把持の各々のオブジェクトポーズと品質を92fpsで同時に推定する。全ての把握と配置の試行は物理シミュレーションで実行され、得られた経験はドメインランダム化を用いて実世界に移される。われわれの政策は実世界への移転に成功している。 PQ-Netは成功率の把握の観点から他のモデルフリーアプローチよりも優れており、人間の介入なしに任意の対称性を持つ新しいオブジェクトに自動的にスケールする。 In this paper, we introduce a novel learning-based approach for grasping known rigid objects in highly cluttered scenes and precisely placing them based on depth images. Our Placement Quality Network (PQ-Net) estimates the object pose and the quality for each automatically generated grasp pose for multiple objects simultaneously at 92 fps in a single forward pass of a neural network. All grasping and placement trials are executed in a physics simulation and the gained experience is transferred to the real world using domain randomization. We demonstrate that our policy successfully transfers to the real world. PQ-Net outperforms other model-free approaches in terms of grasping success rate and automatically scales to new objects of arbitrary symmetry without any human intervention.	翻訳日:2021-04-04 01:49:10 公開日:2021-01-12
# クラウドソーシングによる効果的なコンテンツ分析に向けて Toward Effective Automated Content Analysis via Crowdsourcing ( http://arxiv.org/abs/2101.04615v1 ) ライセンス: Link先を確認	Jiele Wu, Chau-Wai Wong, Xinyan Zhao, Xianpeng Liu	(参考訳) 多くのコンピュータ科学者は、オンラインワーカーの集約された回答を使って真実を表現している。先行研究では、多数決のような集計手法が比較的客観的な特徴を測定するのに有効であることが示されている。意味的意味づけのような主観的な機能では、時間ごとの収益を最適化することで知られるオンラインワーカーは、より長く働くと応答の質が低下する傾向がある。本稿では,品質を意識したセマンティックデータアノテーションシステムを提案することで,この問題に対処しようとする。我々は、品質スコアによって定量化された労働者のパフォーマンスに対するタイムリーなフィードバックにより、オンライン労働者が長期にわたってラベル付けの品質を維持することができることを観察した。提案するアノテーションシステムの有効性を検証するために,i) エキスパートラベルデータセットに基づく性能評価,ii) 70%から80%の精度で一貫した学習行動をもたらす機械学習タスクの実証を行った。その結果,本システムでは主観的意味的特徴の質の高い回答を大規模に収集できることが示唆された。 Many computer scientists use the aggregated answers of online workers to represent ground truth. Prior work has shown that aggregation methods such as majority voting are effective for measuring relatively objective features. For subjective features such as semantic connotation, online workers, known for optimizing their hourly earnings, tend to deteriorate in the quality of their responses as they work longer. In this paper, we aim to address this issue by proposing a quality-aware semantic data annotation system. We observe that with timely feedback on workers' performance quantified by quality scores, better informed online workers can maintain the quality of their labeling throughout an extended period of time. We validate the effectiveness of the proposed annotation system through i) evaluating performance based on an expert-labeled dataset, and ii) demonstrating machine learning tasks that can lead to consistent learning behavior with 70%-80% accuracy. Our results suggest that with our system, researchers can collect high-quality answers of subjective semantic features at a large scale.	翻訳日:2021-04-04 01:48:57 公開日:2021-01-12
# SARS-CoV-2のAIおよびHPC対応リード生成:自然言語テキストに含まれる薬物様分子の抽出モデルとプロセス AI- and HPC-enabled Lead Generation for SARS-CoV-2: Models and Processes to Extract Druglike Molecules Contained in Natural Language Text ( http://arxiv.org/abs/2101.04617v1 ) ライセンス: Link先を確認	Zhi Hong, J. Gregory Pauloski, Logan Ward, Kyle Chard, Ben Blaiszik, and Ian Foster	(参考訳) 世界中の研究者は、重症急性呼吸器症候群ウイルス(SARS-CoV-2)による病気に対抗するために、既存の薬物の再利用や新しい薬物の発見を目指している。このような研究の候補は、新型コロナウイルス研究の文脈で薬物のような分子であると科学文献で報告されている分子である。ここでは、人間と人工知能の両方を利用して、フリーテキストで薬物様分子の参照を検出するプロジェクトについて報告する。我々は、高度でない人間がラベル付きテキストのコーパスを作成し、このラベル付きコーパスを使用して名前付きエンティティ認識モデルを訓練し、訓練されたモデルを用いて198875紙のオープンリサーチデータセットチャレンジ(CORD-19)コーパスから10912の薬物様分子を抽出する。性能分析の結果, 自動抽出モデルは非熟練人間と同等の性能が得られることがわかった。 Researchers worldwide are seeking to repurpose existing drugs or discover new drugs to counter the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). A promising source of candidates for such studies is molecules that have been reported in the scientific literature to be drug-like in the context of coronavirus research. We report here on a project that leverages both human and artificial intelligence to detect references to drug-like molecules in free text. We engage non-expert humans to create a corpus of labeled text, use this labeled corpus to train a named entity recognition model, and employ the trained model to extract 10912 drug-like molecules from the COVID-19 Open Research Dataset Challenge (CORD-19) corpus of 198875 papers. Performance analyses show that our automated extraction model can achieve performance on par with that of non-expert humans.	翻訳日:2021-04-04 01:48:40 公開日:2021-01-12
# Queue-Learning: サービス品質提供のための強化学習アプローチ Queue-Learning: A Reinforcement Learning Approach for Providing Quality of Service ( http://arxiv.org/abs/2101.04627v1 ) ライセンス: Link先を確認	Majid Raeis, Ali Tizghadam, Alberto Leon-Garcia	(参考訳) エンドツーエンドの遅延は、クラウドコンピューティングやコンピュータネットワークなどのアプリケーションドメインにおけるQoS(Quality of Service)の重要な特性である。このメトリクスは、エンドツーエンドサービスがサービスチェーンを介して提供される、タンデムサービスシステムにおいて特に重要です。サービスレート制御は、サービスシステムにおいてqos保証を提供する共通のメカニズムである。本稿では、サービスリソースの過剰使用を防止しつつ、システムのエンドツーエンド遅延に対する確率的上限を提供する強化学習ベース(RLベース)サービスレートコントローラを提案する。一般的なフレームワークを得るために、私たちはキュー理論を使ってサービスシステムをモデル化します。しかし、待ち行列理論の制限を避けるためにrlベースのアプローチを採用する。特に、Deep Deterministic Policy Gradient(DDPG)を使用して、タンデムサービスシステムのキュー長(状態)の関数として、サービスレート(アクション)を学習します。システム全体の報酬によって性能を定量化する既存のrlベースの手法とは対照的に,提案するコントローラはシステムのエンド・ツー・エンドの遅延に対する明示的な確率的保証を提供する。 qosの制約を満たしたコントローラの能力を検証した,非指数的相互接続およびサービス時間を有するタンデム待ち行列システムについて評価を行った。 End-to-end delay is a critical attribute of quality of service (QoS) in application domains such as cloud computing and computer networks. This metric is particularly important in tandem service systems, where the end-to-end service is provided through a chain of services. Service-rate control is a common mechanism for providing QoS guarantees in service systems. In this paper, we introduce a reinforcement learning-based (RL-based) service-rate controller that provides probabilistic upper-bounds on the end-to-end delay of the system, while preventing the overuse of service resources. In order to have a general framework, we use queueing theory to model the service systems. However, we adopt an RL-based approach to avoid the limitations of queueing-theoretic methods. In particular, we use Deep Deterministic Policy Gradient (DDPG) to learn the service rates (action) as a function of the queue lengths (state) in tandem service systems. 
In contrast to existing RL-based methods that quantify their performance by the achieved overall reward, which could be hard to interpret or even misleading, our proposed controller provides explicit probabilistic guarantees on the end-to-end delay of the system. The evaluations are presented for a tandem queueing system with non-exponential inter-arrival and service times, the results of which validate our controller's capability in meeting QoS constraints.	翻訳日:2021-04-04 01:48:11 公開日:2021-01-12
# 計算物理学における自動モデル推薦のためのデータ拡張と特徴選択 Data augmentation and feature selection for automatic model recommendation in computational physics ( http://arxiv.org/abs/2101.04530v1 ) ライセンス: Link先を確認	Thomas Daniel, Fabien Casenave, Nissrine Akkari, David Ryckelynck	(参考訳) 分類アルゴリズムは、最近、計算物理学において、環境や物理システムの状態に適応した数値的手法やモデルの選択に応用されている。このような分類タスクでは、ラベル付きトレーニングデータは数値シミュレーションから得られ、一般にメッシュ上に離散化された物理フィールドに対応する。トレーニングデータの欠如、高次元化、物理データへの共通データ拡張技術の適用不可能という3つの難題が生まれている。この記事では、これらの問題に対処するために、2つのアルゴリズムを紹介します。1つは特徴選択による次元の削減、もう1つはデータ拡張です。これらのアルゴリズムは、評価のために様々な分類器と組み合わせられる。 6つの多層パーセプトロンからなる積層アンサンブルとリッジロジスティック回帰を組み合わせた場合、非線形構造力学の分類問題において90%の精度が得られる。 Classification algorithms have recently found applications in computational physics for the selection of numerical methods or models adapted to the environment and the state of the physical system. For such classification tasks, labeled training data come from numerical simulations and generally correspond to physical fields discretized on a mesh. Three challenging difficulties arise: the lack of training data, their high dimensionality, and the non-applicability of common data augmentation techniques to physics data. This article introduces two algorithms to address these issues, one for dimensionality reduction via feature selection, and one for data augmentation. These algorithms are combined with a wide variety of classifiers for their evaluation. When combined with a stacking ensemble made of six multilayer perceptrons and a ridge logistic regression, they enable reaching an accuracy of 90% on our classification problem for nonlinear structural mechanics.	翻訳日:2021-04-04 01:47:12 公開日:2021-01-12
# 空間情報を用いた時系列データの効率的解析のためのディープセルリカレントネットワーク Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information ( http://arxiv.org/abs/2101.05608v1 ) ライセンス: Link先を確認	Lasitha Vidyaratne, Mahbubul Alam, Alexander Glandon, Anna Shabalina, Christopher Tennant, and Khan Iftekharuddin	(参考訳) 大規模時系列データの効率的な処理は、機械学習の複雑な問題である。手動で特徴抽出を行う従来のセンサ信号処理パイプラインは、高次元データによる膨大な計算コストを伴うことが多い。ディープリカレントニューラルネットワークは、時系列処理を改善するための自動機能学習に有望である。しかし、一般的なディープ・リカレントモデルでは、データの複雑さが増すにつれてスケールと深さが大きくなる。これは、時間的および空間的特性を持つ高次元データの存在において特に困難である。そこで本研究では,複雑な多次元時系列データを空間情報で効率的に処理する新しいディープセルリカレントニューラルネットワーク(dcrnn)アーキテクチャを提案する。提案モデルにおけるセルリカレントアーキテクチャにより,空間分布センサ信号源からの時系列データの位置認識同期処理が可能となる。提案アーキテクチャにおけるセルラ性による広範なトレーニング可能なパラメータ共有は,高次元入力を用いた再帰処理ユニットの使用効率を保証している。そこで本研究では,DCRNNモデルの多クラス時系列データの分類における汎用性についても検討した。その結果、DCRNNアーキテクチャは2つの時系列データセット、つまり、発作検出のためのマルチチャネルの頭皮EEGデータセットと、社内で得られたマシン故障検出データセットを用いて評価される。その結果,本論文の手法と比較した場合,学習可能なパラメータをかなり少なくしつつ,最先端の性能を実現できることが示唆された。 Efficient processing of large-scale time series data is an intricate problem in machine learning. Conventional sensor signal processing pipelines with hand engineered feature extraction often involve huge computational cost with high dimensional data. Deep recurrent neural networks have shown promise in automated feature learning for improved time-series processing. However, generic deep recurrent models grow in scale and depth with increased complexity of the data. This is particularly challenging in presence of high dimensional data with temporal and spatial characteristics. Consequently, this work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to efficiently process complex multi-dimensional time series data with spatial information. The cellular recurrent architecture in the proposed model allows for location-aware synchronous processing of time series data from spatially distributed sensor signal sources. 
Extensive trainable parameter sharing due to cellularity in the proposed architecture ensures efficiency in the use of recurrent processing units with high-dimensional inputs. This study also investigates the versatility of the proposed DCRNN model for classification of multi-class time series data from different application domains. Consequently, the proposed DCRNN architecture is evaluated using two time-series datasets: a multichannel scalp EEG dataset for seizure detection, and a machine fault detection dataset obtained in-house. The results suggest that the proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.	翻訳日:2021-04-04 01:46:57 公開日:2021-01-12
# デバイス上インテント分類の強化された文字表現 A character representation enhanced on-device Intent Classification ( http://arxiv.org/abs/2101.04456v1 ) ライセンス: Link先を確認	Sudeep Deepak Shivnikar, Himanshu Arora, Harichandana B S S	(参考訳) 意図分類は自然言語理解システムにおいて重要なタスクである。既存のアプローチは、ベンチマークデータセットで完璧なスコアを獲得しました。しかし、モバイルやタブレットなどの低リソースデバイスへのデプロイには適していない。モデルの大きさが大きすぎるためですそこで本稿では,デバイス上で効率的に動作可能な,意図分類のための新しい軽量アーキテクチャを提案する。我々は文字特徴を使って単語表現を豊かにする。実験により,提案モデルが既存手法より優れ,ベンチマークデータセットの最先端結果が得られた。また,本モデルではメモリフットプリントが5MB程度で,推定時間は2ミリ秒程度であり,資源制約環境下での効率を実証する。 Intent classification is an important task in natural language understanding systems. Existing approaches have achieved perfect scores on the benchmark datasets. However, they are not suitable for deployment on low-resource devices like mobiles, tablets, etc. due to their massive model size. Therefore, in this paper, we present a novel light-weight architecture for intent classification that can run efficiently on a device. We use character features to enrich the word representation. Our experiments prove that our proposed model outperforms existing approaches and achieves state-of-the-art results on benchmark datasets. We also report that our model has a tiny memory footprint of ~5 MB and a low inference time of ~2 milliseconds, which proves its efficiency in a resource-constrained environment.	翻訳日:2021-04-04 01:46:40 公開日:2021-01-12
# 話題分布を持つxlnetモデルを用いた偽ニュース検出システム: constraint@aaai2021 shared task Fake News Detection System using XLNet model with Topic Distributions: CONSTRAINT@AAAI2021 Shared Task ( http://arxiv.org/abs/2101.11425v1 ) ライセンス: Link先を確認	Akansha Gautam, Venktesh V, Sarah Masud	(参考訳) 情報へのアクセスの容易さとインターネット上での急速な普及(速度とボリュームの両方)により、偽情報から真実情報をフィルタリングすることは困難になっている。研究コミュニティは現在、現実世界の政治的影響をもたらす偽ニュースの自動検出という課題に直面している。このような研究はConstraint@AAAI2021 Shared Task on COVID19 Fake News Detection in Englishという形で行われた。本稿では,この共有タスクの一環として提案した新しい手法について光を当てる。我々のチームは、LDA(Latent Dirichlet Allocation)のトピック分布とXLNetの文脈表現を組み合わせたアプローチを導入しました。また,提案手法を既存のベースラインと比較し,XLNet + Topic DistributionsがF1スコア0.967を達成することにより,他の手法よりも優れていることを示す。 With the ease of access to information, and its rapid dissemination over the internet (both velocity and volume), it has become challenging to separate truthful information from fake information. The research community is now faced with the task of automatic detection of fake news, which carries real-world socio-political impact. One such research contribution came in the form of the Constraint@AAAI2021 Shared Task on COVID19 Fake News Detection in English. In this paper, we shed light on a novel method we proposed as a part of this shared task. Our team introduced an approach to combine topical distributions from Latent Dirichlet Allocation (LDA) with contextualized representations from XLNet. We also compared our method with existing baselines to show that XLNet + Topic Distributions outperforms other approaches by attaining an F1-score of 0.967.	翻訳日:2021-04-04 01:46:31 公開日:2021-01-12
# クラウドカウントのための強化情報融合ネットワーク Enhanced Information Fusion Network for Crowd Counting ( http://arxiv.org/abs/2101.04279v1 ) ライセンス: Link先を確認	Geng Chen and Peirong Guo	(参考訳) 近年,画像中の人物数を予測する手法である群集カウントは,コンピュータビジョンにおける課題となっている。本稿では,カラム内の情報冗長性問題を解決するために,クロスカラム特徴融合ネットワークを提案する。我々は,異なる列が他の列から重要な情報を得るのを助けるために,情報フローのチャネルを提供する情報融合モジュール(IFM)を紹介する。このチャネルを通じて、異なる列が情報を交換し、他の列から有用な特徴を抽出し、キー情報を強化する。したがって、イメージ内のすべての領域に注意を払うためにカラムは必要ない。各列は異なる領域に責任を持ち、各列の負担を軽減できる。実験では、モデルの一般化性はより堅牢で、異なるデータセット間で転送した結果は最先端のモデルと同等の結果が得られます。 In recent years, crowd counting, a technique for predicting the number of people in an image, has become a challenging task in computer vision. In this paper, we propose a cross-column feature fusion network to solve the problem of information redundancy in columns. We introduce the Information Fusion Module (IFM), which provides a channel for information flow to help different columns obtain significant information from another column. Through this channel, different columns exchange information with each other and extract useful features from the other column to enhance key information. Hence, there is no need for columns to pay attention to all areas in the image. Each column can be responsible for different regions, thereby reducing the burden of each column. In experiments, our model generalizes more robustly, and its results when transferring between different datasets are comparable with those of state-of-the-art models.	翻訳日:2021-04-04 01:46:16 公開日:2021-01-12
# マルチモーダル眼球運動データセットとマルチモーダル眼球運動セグメンテーション解析 A Multimodal Eye Movement Dataset and a Multimodal Eye Movement Segmentation Analysis ( http://arxiv.org/abs/2101.04318v1 ) ライセンス: Link先を確認	Wolfgang Fuhl and Enkelejda Kasneci	(参考訳) 注視眼球運動を伴う新しいデータセットを提案する。データセットは、現実世界やシミュレーターでの乗車中に記録された80万以上の視線ポイントで構成されている。合計19名の被験者の眼球運動を注記した。このデータセットには、眼球閉鎖、瞳孔中心、光学ベクトル、眼球角の中心から始まる瞳孔中心へのベクトルなど、いくつかのデータソースがある。これらの異なるデータソースを個別に分析・評価し、眼球運動分類に適合する良さと組み合わせて評価する。これらの結果は、リアルタイムシステムやアルゴリズムの開発者がアプリケーションに最適なデータソースを見つけるのに役立つだろう。また、このデータセット上で新しいアルゴリズムをトレーニングして評価することもできる。データとmatlabコードは、https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2fa%20multimodal%20eye%20movement%20dataset%20and%20...&mode=listでダウンロードできる。 We present a new dataset with annotated eye movements. The dataset consists of over 800,000 gaze points recorded during a car ride in the real world and in the simulator. In total, the eye movements of 19 subjects were annotated. In this dataset there are several data sources such as the eyelid closure, the pupil center, the optical vector, and a vector into the pupil center starting from the center of the eye corners. These different data sources are analyzed and evaluated individually as well as in combination with respect to their goodness of fit for eye movement classification. These results will help developers of real-time systems and algorithms to find the best data sources for their application. Also, new algorithms can be trained and evaluated on this data set. The data and the Matlab code can be downloaded here https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FA%20Multimodal%20Eye%20Movement%20Dataset%20and%20...&mode=list	翻訳日:2021-04-04 01:45:40 公開日:2021-01-12
# 逆攻撃に対する画像輝度のランダム変換 Random Transformation of Image Brightness for Adversarial Attack ( http://arxiv.org/abs/2101.04321v1 ) ライセンス: Link先を確認	Bo Yang, Kaiyong Xu, Hengjun Wang, Hengwei Zhang	(参考訳) ディープニューラルネットワークは、オリジナルの画像に小さな人間の知覚できない摂動を加えることで構築される敵の例に弱いが、モデル出力の不正確な予測を行う。ディープニューラルネットワークがデプロイされる前に、敵攻撃は安全クリティカルなアプリケーションにおいて堅牢なモデルを評価し選択するための重要な方法となる。しかし、難易度の高いブラックボックス設定では、攻撃成功率、すなわち敵の例の転送可能性を改善する必要がある。画像拡張法に基づき、画像輝度のランダム変換により、逆例生成における過剰フィットを解消し、その転送性を向上させることが判明した。そこで本研究では,FGSM(Fast Gradient Sign Method)関連手法と統合して,より堅牢な勾配に基づく攻撃を構築し,より優れた転送性を持つ逆例を生成する,この現象に基づく逆例生成手法を提案する。 ImageNetデータセットに関する大規模な実験は、この方法の有効性を実証している。本手法は,通常のネットワークであろうと逆であれ,データ拡張に基づく攻撃手法よりもブラックボックス攻撃の成功率が高い。この手法がモデルの堅牢性の評価と改善に役立つことを期待している。 Deep neural networks are vulnerable to adversarial examples, which are crafted by adding small, human-imperceptible perturbations to the original images, but make the model output inaccurate predictions. Before deep neural networks are deployed, adversarial attacks can thus be an important method to evaluate and select robust models in safety-critical applications. However, under the challenging black-box setting, the attack success rate, i.e., the transferability of adversarial examples, still needs to be improved. Based on image augmentation methods, we found that random transformation of image brightness can eliminate overfitting in the generation of adversarial examples and improve their transferability. To this end, we propose an adversarial example generation method based on this phenomenon, which can be integrated with Fast Gradient Sign Method (FGSM)-related methods to build a more robust gradient-based attack and generate adversarial examples with better transferability. Extensive experiments on the ImageNet dataset demonstrate the method's effectiveness. Whether on normally or adversarially trained networks, our method has a higher success rate for black-box attacks than other attack methods based on data augmentation. 
We hope that this method can help to evaluate and improve the robustness of models.	翻訳日:2021-04-04 01:45:20 公開日:2021-01-12
# 迷わずに混ざり合う Mixup Without Hesitation ( http://arxiv.org/abs/2101.04342v1 ) ライセンス: Link先を確認	Hao Yu, Huanyu Wang, Jianxin Wu	(参考訳) ミックスアップはサンプルのペアを線形補間して新しいサンプルを作成するが、実装が容易であり、画像分類タスクに有効であることが示されている。しかし、ミックスアップには2つの欠点がある:1つは、十分に訓練されたモデルを得るために、より多くのトレーニングエポックが必要とされることである。本稿では,ミックスアップが常に表現空間を探索し,強化学習における探索・探索ジレンマにインスパイアされて,簡潔で効果的で使いやすいトレーニングアルゴリズムであるミックスアップ無湿(mWh)を提案する。我々は,mWhが基本データ拡張とミックスアップを徐々に置き換えることで,探索と搾取のバランスが良いことを示す。もともとのミキシングアップよりもトレーニング時間が短く、最適なハイパーパラメーターを探すことなく、すなわちmWhが混成アップとして振る舞うような強いベースラインを実現することができる。 mWhはCutMixに転送することもでき、オブジェクト検出などの他の機械学習やコンピュータビジョンタスクにも一貫した改善が加えられる。私たちのコードはオープンソースで、https://github.com/yuhao318/mwhで利用可能です。 Mixup linearly interpolates pairs of examples to form new samples, which is easy to implement and has been shown to be effective in image classification tasks. However, there are two drawbacks in mixup: one is that more training epochs are needed to obtain a well-trained model; the other is that mixup requires tuning a hyper-parameter to gain appropriate capacity but that is a difficult task. In this paper, we find that mixup constantly explores the representation space, and inspired by the exploration-exploitation dilemma in reinforcement learning, we propose mixup Without hesitation (mWh), a concise, effective, and easy-to-use training algorithm. We show that mWh strikes a good balance between exploration and exploitation by gradually replacing mixup with basic data augmentation. It can achieve a strong baseline with less training time than original mixup and without searching for optimal hyper-parameter, i.e., mWh acts as mixup without hesitation. mWh can also transfer to CutMix, and gain consistent improvement on other machine learning and computer vision tasks such as object detection. Our code is open-source and available at https://github.com/yuhao318/mwh	翻訳日:2021-04-04 01:44:48 公開日:2021-01-12
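The linear interpolation that the mixup abstract above refers to can be sketched in a few lines. This is only an illustration of plain mixup, not of the mWh schedule and not the authors' code: the function name `mixup_pair`, the list-based data layout, and drawing the mixing weight from a Beta(alpha, alpha) distribution are assumptions.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=1.0, lam=None):
    """Linearly interpolate two examples and their one-hot labels.

    x1, x2: feature vectors as lists of floats.
    y1, y2: one-hot (or soft) label vectors as lists of floats.
    lam: mixing weight in [0, 1]; if None it is sampled from
         Beta(alpha, alpha), which is an assumption of this sketch.
    """
    if lam is None:
        lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Mixing a class-0 example with a class-1 example using a fixed weight.
x, y, lam = mixup_pair([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], lam=0.25)
```

The mixed label is a soft label whose entries still sum to one, so a standard cross-entropy loss can be applied to it unchanged.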
# インタラクティブな画像分割再考: 機能空間アノテーション Rethinking Interactive Image Segmentation: Feature Space Annotation ( http://arxiv.org/abs/2101.04378v1 ) ライセンス: Link先を確認	Jordão Bragantini (UNICAMP), Alexandre Falcão (UNICAMP), Laurent Najman (ligm)	(参考訳) インタラクティブな画像分割手法の進歩にもかかわらず、高品質なピクセルレベルのアノテーションは依然として時間がかかり、手間がかかる。特徴空間投影によって導かれる複数の画像から対話的かつ同時的なセグメントアノテーションを提案し,ラベリングが進行するにつれてメトリック学習により最適化する。この戦略は、画像領域でアノテーションを実行する既存のインタラクティブセグメンテーション手法とは対照的である。提案手法は,iCoSeg,DAVIS,Rooftopといった前景セグメンテーションデータセットにおける最先端手法の精度を超えることができることを示す。さらに、既知のセマンティクスセグメンテーションデータセットであるcityscapesでは、元のアノテーション手順より74.75倍高速でありながら、91.5%の精度を実現している。付録は追加の質的結果を示す。コードとビデオのデモは公開時に公開される。 Despite the progress of interactive image segmentation methods, high-quality pixel-level annotation is still time-consuming and laborious -- a bottleneck for several deep learning applications. We take a step back to propose interactive and simultaneous segment annotation from multiple images guided by feature space projection and optimized by metric learning as the labeling progresses. This strategy is in stark contrast to existing interactive segmentation methodologies, which perform annotation in the image domain. We show that our approach can surpass the accuracy of state-of-the-art methods in foreground segmentation datasets: iCoSeg, DAVIS, and Rooftop. Moreover, it achieves 91.5% accuracy in a known semantic segmentation dataset, Cityscapes, being 74.75 times faster than the original annotation procedure. The appendix presents additional qualitative results. Code and video demonstration will be released upon publication.	翻訳日:2021-04-04 01:44:28 公開日:2021-01-12
# 二段階cnnに基づく木ログ認識 Two-stage CNN-based wood log recognition ( http://arxiv.org/abs/2101.04450v1 ) ライセンス: Link先を確認	Georg Wimmer and Rudolf Schraml and Heinz Hofbauer and Alexander Petutschnigg and Andreas Uhl	(参考訳) ログの起源の証明はますます重要になりつつある。 industry 4.0の文脈で、違法なロギングと戦うために、個々のログを追跡するモチベーションが高まっている。この分野でのこれまでの研究は、指紋や虹彩認識にインスパイアされた手法に基づくデジタルログエンド画像を用いたログ追跡に重点を置いていた。本研究は,CNNトレーニングのための三重項損失関数を用いて,ログ端のCNNに基づくセグメンテーションとセグメント化されたログ端の最終的な認識を組み合わせた畳み込みニューラルネットワーク(CNN)に基づくアプローチを提案する。その結果,提案手法は従来のアプローチよりも優れていることがわかった。 The proof of origin of logs is becoming increasingly important. In the context of Industry 4.0 and to combat illegal logging there is an increasing motivation to track each individual log. Our previous works in this field focused on log tracking using digital log end images based on methods inspired by fingerprint and iris-recognition. This work presents a convolutional neural network (CNN) based approach which comprises a CNN-based segmentation of the log end combined with a final CNN-based recognition of the segmented log end using the triplet loss function for CNN training. Results show that the proposed two-stage CNN-based approach outperforms traditional approaches.	翻訳日:2021-04-04 01:43:51 公開日:2021-01-12
# 画像合成におけるきめ細かいセマンティック制約 Fine-grained Semantic Constraint in Image Synthesis ( http://arxiv.org/abs/2101.04558v1 ) ライセンス: Link先を確認	Pengyang Li and Donghui Wang	(参考訳) 本稿では,精細な属性とマスクを入力として用いる多段高分解能画像合成モデルを提案する。提案モデルでは, 微粒化属性を用いて, 得られた画像の特徴を, 属性内の細粒化情報を通じて詳細に制約することができる。従来のマスクでは,生成した画像が視覚に適合するように制約され,生成する対向ネットワークから生成されたサンプルの予期せぬ多様性が低減される。また,画像の全体像とサブ領域を同時に識別することで,生成的敵ネットワークの識別能力を向上させる手法を提案する。さらに,データセットのラベル付き属性を最適化する手法を提案し,手動ラベリングノイズを低減する。その結果,画像合成モデルはよりリアルな画像を生成することがわかった。 In this paper, we propose a multi-stage and high-resolution model for image synthesis that uses fine-grained attributes and masks as input. With a fine-grained attribute, the proposed model can constrain the features of the generated image in detail through the rich and fine-grained semantic information in the attribute. With the mask as a prior, the model is constrained so that the generated images conform to visual senses, which reduces the unexpected diversity of samples generated from the generative adversarial network. This paper also proposes a scheme to improve the discriminator of the generative adversarial network by simultaneously discriminating the total image and sub-regions of the image. In addition, we propose a method for optimizing the labeled attribute in datasets, which reduces the manual labeling noise. Extensive quantitative results show that our image synthesis model generates more realistic images.	翻訳日:2021-04-04 01:43:21 公開日:2021-01-12
# 顔ランドマークの高速検出とその応用:調査 Fast Facial Landmark Detection and Applications: A Survey ( http://arxiv.org/abs/2101.10808v1 ) ライセンス: Link先を確認	Kostiantyn Khabarlak, Larysa Koriashkina	(参考訳) 本稿では,ニューラルネットワークに基づく最新の顔ランドマーク検出アルゴリズムの調査と分析を行う。姿勢や表情の変動が大きく、顔の隠蔽が多いデータセット(いずれも実世界のシナリオで典型的)において、ここ数年で品質を大幅に向上させたアプローチに焦点を当てる。改善点をカテゴリに整理し,300-W,AFLW,WFLW,COFWという難易度の高い最新のin-the-wildデータセットで品質を比較する。さらに、CPU、GPU、モバイルデバイスにおけるアルゴリズムの速度を比較する。完全性のため、オープンな実装が利用可能な確立された手法についても簡単に触れる。加えて、ランドマーク検出アルゴリズムの応用と脆弱性についても取り上げる。これらを踏まえ、将来のさらなるアルゴリズム改善につながると期待される課題を提起する。 In this paper we survey and analyze modern neural-network-based facial landmark detection algorithms. We focus on approaches that have led to a significant increase in quality over the past few years on datasets with large pose and emotion variability, high levels of face occlusions - all of which are typical in real-world scenarios. We summarize the improvements into categories, provide quality comparison on difficult and modern in-the-wild datasets: 300-W, AFLW, WFLW, COFW. Additionally, we compare algorithm speed on CPU, GPU and Mobile devices. For completeness, we also briefly touch on established methods with open implementations available. Besides, we cover applications and vulnerabilities of the landmark detection algorithms. Based on which, we raise problems that as we hope will lead to further algorithm improvements in future.	翻訳日:2021-04-04 01:42:30 公開日:2021-01-12
# HighAir:階層型グラフニューラルネットワークによる品質予測手法 HighAir: A Hierarchical Graph Neural Network-Based Air Quality Forecasting Method ( http://arxiv.org/abs/2101.04264v1 ) ライセンス: Link先を確認	Jiahui Xu, Ling Chen, Mingqi Lv, Chaoqun Zhan, Sanjian Chen, Jian Chang	(参考訳) 空気質を正確に予測することは、一般市民を肺や心臓病から守るのに不可欠である。これは、異なる汚染源と様々な影響要因の間の複雑な相互作用のため、難しい課題である。既存の大気汚染予測手法では,都市と監視局間の大気汚染物質の拡散過程を効果的にモデル化することはできない。本稿では,エンコーダ・デコーダアーキテクチャを採用し,気象や土地利用など,複雑な空気品質に影響する要因を考慮した階層型グラフニューラルネットワークによる空気品質予測手法を提案する。具体的には,都市レベルのグラフと駅レベルのグラフを階層的な視点から構築し,都市レベルのパターンと駅レベルのパターンをそれぞれ検討する。我々は,レベル間インタラクションを実装するために,上位配信と下位更新という2つの戦略を設計し,レベル内インタラクションを実装するためのメッセージパッシング機構を導入する。風向に基づくエッジウェイトを動的に調整し, 動的要因と空気質の関係をモデル化する。我々は,61,500km2以内の10大都市をカバーしているヤンツェ川デルタ市のデータセットについて,HighAirと最先端の空気質予測手法を比較した。実験の結果,HighAirは他の手法よりも優れていた。 Accurately forecasting air quality is critical to protecting general public from lung and heart diseases. This is a challenging task due to the complicated interactions among distinct pollution sources and various other influencing factors. Existing air quality forecasting methods cannot effectively model the diffusion processes of air pollutants between cities and monitoring stations, which may suddenly deteriorate the air quality of a region. In this paper, we propose HighAir, i.e., a hierarchical graph neural network-based air quality forecasting method, which adopts an encoder-decoder architecture and considers complex air quality influencing factors, e.g., weather and land usage. Specifically, we construct a city-level graph and station-level graphs from a hierarchical perspective, which can consider city-level and station-level patterns, respectively. We design two strategies, i.e., upper delivery and lower updating, to implement the inter-level interactions, and introduce message passing mechanism to implement the intra-level interactions. We dynamically adjust edge weights based on wind direction to model the correlations between dynamic factors and air quality. 
We compare HighAir with the state-of-the-art air quality forecasting methods on the dataset of Yangtze River Delta city group, which covers 10 major cities within 61,500 km². The experimental results show that HighAir significantly outperforms other methods.	翻訳日:2021-04-04 01:42:22 公開日:2021-01-12
# トランザクション不正検出のための説明可能なディープビヘイビアシーケンスクラスタリング Explainable Deep Behavioral Sequence Clustering for Transaction Fraud Detection ( http://arxiv.org/abs/2101.04285v1 ) ライセンス: Link先を確認	Wei Min, Weiming Liang, Hang Yin, Zhurong Wang, Mei Li, Alok Lal	(参考訳) eコマース業界では、ユーザー行動シーケンスデータは検索や商品販売といった多くのビジネスユニットで製品を改善するために広く使われている。しかし、その3v特性、すなわち金融サービスで使われることは稀である。体積、速度、バラエティ - しかし、その非構造的性質のためでもある。本稿では,金融サービスシナリオの深層学習に基づくクラスタ化行動データ表現手法(findeepbehaviorcluster)を提案する。動作シーケンスデータを利用するために,クリックストリームデータをイベントシーケンスとして扱い,時間アテンションに基づくBi-LSTMを用いて,教師なしの方法でシーケンス埋め込みを学習し,リスクエキスパートが生成した直感的な特徴と組み合わせてハイブリッドな特徴表現を形成する。また, FAISS プロジェクトに基づく HDBSCAN アルゴリズムのエンジニアリング最適化である GPU を用いた HDBSCAN (pHDBSCAN) アルゴリズムを提案する。アルゴリズムの計算効率は、元の実装に比べて500倍に向上し、フラッシュ詐欺パターン検出が実現された。実験の結果,提案するFinDeepBehaviorClusterフレームワークは,ビジネス価値の高い不正取引を捕捉できることがわかった。また、直感的な特徴を用いてリスククラスタからパターンを抽出するためにルール抽出法を適用し、事例調査のためにリスククラスタにナラティブ記述を付加し、未知のリスクパターンをリアルタイム詐欺検出のために掘り出すことができる。要約すると、FinDeepBehaviorClusterは、既存のリアルタイム不正検出エンジンを補完するリスク管理戦略であり、不正検出と積極的なリスク防御能力をさらに高めることができる。 In e-commerce industry, user behavior sequence data has been widely used in many business units such as search and merchandising to improve their products. However, it is rarely used in financial services not only due to its 3V characteristics - i.e. Volume, Velocity and Variety - but also due to its unstructured nature. In this paper, we propose a Financial Service scenario Deep learning based Behavior data representation method for Clustering (FinDeepBehaviorCluster) to detect fraudulent transactions. To utilize the behavior sequence data, we treat click stream data as event sequence, use time attention based Bi-LSTM to learn the sequence embedding in an unsupervised fashion, and combine them with intuitive features generated by risk experts to form a hybrid feature representation. 
We also propose a GPU powered HDBSCAN (pHDBSCAN) algorithm, which is an engineering optimization for the original HDBSCAN algorithm based on FAISS project, so that clustering can be carried out on hundreds of millions of transactions within a few minutes. The computation efficiency of the algorithm has increased 500 times compared with the original implementation, which makes flash fraud pattern detection feasible. Our experimental results show that the proposed FinDeepBehaviorCluster framework is able to catch missed fraudulent transactions with considerable business values. In addition, rule extraction method is applied to extract patterns from risky clusters using intuitive features, so that narrative descriptions can be attached to the risky clusters for case investigation, and unknown risk patterns can be mined for real-time fraud detection. In summary, FinDeepBehaviorCluster as a complementary risk management strategy to the existing real-time fraud detection engine, can further increase our fraud detection and proactive risk defense capabilities.	翻訳日:2021-04-04 01:42:00 公開日:2021-01-12
# マルチタスク学習によるシードストッキング Seed Stocking Via Multi-Task Learning ( http://arxiv.org/abs/2101.04333v1 ) ライセンス: Link先を確認	Yunhe Feng and Wenjun Zhou	(参考訳) 作物種子の販売者は、少なくとも1年は在庫する種子の種類や量を計画する必要がある。 1つの作物には多数の種子品種があり、それぞれが異なる生育条件下で最高の性能を発揮できる。天候の予測不能を考えると、農家は高い収量と低いリスクのバランスをとる決定を下さなければならない。種子ベンダーは、農家のニーズを予想し、それらを準備する必要がある。本研究では,3つの主要なステップで種子需要を推定するための分析フレームワークを提案する。まず、各品種の収量とリスクを、あたかもそれぞれの場所に植えられたかのように見積もる。異なる種種を用いた過去の実験は品種間で非常に不均衡であり, 生育条件の組合せは少ないため, 類似品種の情報を借りるためにマルチタスク学習を採用している。第2に,収量とリスクのトレードオフを求めることにより,各地の種子のベストミックスを決定する。第3に,このようなミックスを集約して,成長する各場所の収量とリスクを再バランスさせるために,上位5品種を選択します。マルチタスク学習は収率予測に有効なソリューションであり、全体的な分析フレームワークは優れたパフォーマンスをもたらしています。 Sellers of crop seeds need to plan for the variety and quantity of seeds to stock at least a year in advance. There are a large number of seed varieties of one crop, and each can perform best under different growing conditions. Given the unpredictability of weather, farmers need to make decisions that balance high yield and low risk. A seed vendor needs to be able to anticipate the needs of farmers and have them ready. In this study, we propose an analytical framework for estimating seed demand with three major steps. First, we will estimate the yield and risk of each variety as if they were planted at each location. Since past experiments performed with different seed varieties are highly unbalanced across varieties, and the combination of growing conditions is sparse, we employ multi-task learning to borrow information from similar varieties. Second, we will determine the best mix of seeds for each location by seeking a tradeoff between yield and risk. Third, we will aggregate such mix and pick the top five varieties to re-balance the yield and risk for each growing location. We find that multi-task learning provides a viable solution for yield prediction, and our overall analytical framework has resulted in a good performance.	翻訳日:2021-04-04 01:41:27 公開日:2021-01-12
# エッジIoTソリューションのための信頼性の高いフリート分析 Reliable Fleet Analytics for Edge IoT Solutions ( http://arxiv.org/abs/2101.04414v1 ) ライセンス: Link先を確認	Emmanuel Raj, Magnus Westerlund, Leonardo Espinosa-Leal	(参考訳) 近年、iot(internet of things)デバイスのデプロイメントが急増し、ビッグデータと低レイテンシ通信の需要が高まりました。インフラストラクチャの需要の変化は、IoTアプリケーションに人工知能を使用することで、リアルタイムな意思決定を可能にする。 AIoT(Artificial Intelligence of Things)は、AI(Artificial Intelligence)テクノロジとIoTインフラストラクチャの組み合わせで、堅牢で効率的な操作と意思決定を提供する。 AIoTアプリケーションを実現するためにエッジコンピューティングが登場している。エッジコンピューティングは、データソースまたはその近くで洞察と意思決定を生成し、クラウドまたは中央リポジトリに送信されるデータ量を削減することができる。本稿では,エッジにおける機械学習モデル(Edge MLOps)の継続的デリバリ,デプロイメント,監視を可能にするために,AIoTアプリケーションのエッジでの機械学習を容易にするフレームワークを提案する。コントリビューションは、大規模にフリート分析を提供するためのサービス、ツール、メソッドを含むアーキテクチャである。本稿では,大学キャンパスの部屋でiotデバイスを用いた実験を行うことで,フレームワークの予備検証を行う。機械学習実験では,各エッジデバイスに配置したモデルを用いて,各室内の空気質を予測するための多変量時系列予測を行う。これらの実験により,提案するフリート分析フレームワークの効率性とロバスト性を検証する。 In recent years we have witnessed a boom in Internet of Things (IoT) device deployments, which has resulted in big data and demand for low-latency communication. This shift in the demand for infrastructure is also enabling real-time decision making using artificial intelligence for IoT applications. Artificial Intelligence of Things (AIoT) is the combination of Artificial Intelligence (AI) technologies and the IoT infrastructure to provide robust and efficient operations and decision making. Edge computing is emerging to enable AIoT applications. Edge computing enables generating insights and making decisions at or near the data source, reducing the amount of data sent to the cloud or a central repository. In this paper, we propose a framework for facilitating machine learning at the edge for AIoT applications, to enable continuous delivery, deployment, and monitoring of machine learning models at the edge (Edge MLOps). The contribution is an architecture that includes services, tools, and methods for delivering fleet analytics at scale. 
We present a preliminary validation of the framework by performing experiments with IoT devices on a university campus's rooms. For the machine learning experiments, we forecast multivariate time series for predicting air quality in the respective rooms by using the models deployed in respective edge devices. By these experiments, we validate the proposed fleet analytics framework for efficiency and robustness.	翻訳日:2021-04-04 01:41:09 公開日:2021-01-12
# 貯水池と貯水池の大陸規模流れのモデリング : 有効性の実証と課題の定式化 Continental-scale streamflow modeling of basins with reservoirs: a demonstration of effectiveness and a delineation of challenges ( http://arxiv.org/abs/2101.04423v1 ) ライセンス: Link先を確認	Wenyu Ouyang, Kathryn Lawson, Dapeng Feng, Lei Ye, Chi Zhang, Chaopeng Shen	(参考訳) 主要水路の大部分が流水に影響を与えるダムを有しており、大規模な水理モデルで考慮する必要がある。しかし,ダムを有する流域の毎日の流量予測は,様々なモデリング手法,特に大規模において困難である。そこで我々は,情報のみを用いて長期記憶(LSTM)深層学習モデルにより,どのタイプの流域を適切に表現できるかを分割・コンカレントで検討した。アメリカ合衆国における3557の盆地(83%が減衰)のデータを解析し,貯水池の用途,容量対流出比(dor),流れの流れのディバージョンが流れモデルに及ぼす影響を明らかにした。驚いたことに、LSTMモデルは広く使われている参照ベースベースデータセットでトレーニングされたが、データセット全体でトレーニングされたモデルは、Nash-Sutcliffe効率係数(NSE)の中央値を示し、ベンチマークレベルのパフォーマンスに達した。ゼロドール, 小型ドール, 大型ドール盆地は異なる挙動を示し, カテゴリー間での移動モデルにより破滅的な結果が得られた。しかし、異なるデータセットからプールされたデータを用いたトレーニングでは、これらのグループに対してそれぞれ0.73、0.78、0.71の最適中央値NSEが得られ、既存のモデルに対して顕著な優位性を示した。これらの結果は、降雨流出プロセスの一部として小さなダムをモデル化するコヒーレントな混合モデリング戦略を支持するが、ダム化された流域を基準として扱う必要はなく、訓練セットに含める必要がある。 A large fraction of major waterways have dams influencing streamflow, which must be accounted for in large-scale hydrologic modeling. However, daily streamflow prediction for basins with dams is challenging for various modeling approaches, especially at large scales. Here we took a divide-and-conquer approach to examine which types of basins could be well represented by a long short-term memory (LSTM) deep learning model using only readily-available information. We analyzed data from 3557 basins (83% dammed) over the contiguous United States and noted strong impacts of reservoir purposes, capacity-to-runoff ratio (dor), and diversion on streamflow on streamflow modeling. Surprisingly, while the LSTM model trained on a widely-used reference-basin dataset performed poorly for more non-reference basins, the model trained on the whole dataset presented a median test Nash-Sutcliffe efficiency coefficient (NSE) of 0.74, reaching benchmark-level performance. 
The zero-dor, small-dor, and large-dor basins were found to have distinct behaviors, so migrating models between categories yielded catastrophic results. However, training with pooled data from different sets yielded optimal median NSEs of 0.73, 0.78, and 0.71 for these groups, respectively, showing noticeable advantages over existing models. These results support a coherent, mixed modeling strategy where smaller dams are modeled as part of rainfall-runoff processes, but dammed basins must not be treated as reference ones and must be included in the training set; then, large-dor reservoirs can be represented explicitly and future work should examine modeling reservoirs for fire protection and irrigation, followed by those for hydroelectric power generation, and flood control, etc.	翻訳日:2021-04-04 01:40:49 公開日:2021-01-12
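The Nash-Sutcliffe efficiency (NSE) reported in the abstract above is a standard skill score: one minus the ratio of the squared simulation error to the variance of the observations around their mean. A minimal sketch follows; the function name and the toy daily-flow values are assumptions for illustration, not data from the paper.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit; 0.0 means the model is no better than
    always predicting the mean of the observations.
    """
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - squared_error / variance


# Hypothetical daily streamflow values (arbitrary units).
observed_flow = [1.0, 2.0, 3.0, 4.0]
simulated_flow = [1.0, 2.0, 3.0, 5.0]

score = nash_sutcliffe_efficiency(observed_flow, simulated_flow)
```

A median NSE such as the 0.74 quoted above would then be the median of this score computed separately for each basin's test period.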
# 消費税の不正理解のための進化的ゲームモデル An Evolutionary Game Model for Understanding Fraud in Consumption Taxes ( http://arxiv.org/abs/2101.04424v1 ) ライセンス: Link先を確認	M. Chica and J. Hernandez and C. Manrique-de-Lara-Pe\~nate and R. Chiong	(参考訳) 本稿では,消費税体系における不正行為のダイナミクスを研究・理解するための計算進化ゲームモデルを提案する。プレイヤーは、価値付加税(vat)を正しく宣言した場合は協力者であり、そうでない場合は離反者である。各プレイヤーの支払いは、回避された金額と税務当局によって検査される主観的確率に影響される。企業間の取引は買い手と売り手の両方が宣言しなければならないため、一方が採用する戦略は他方の支払いに影響を与える。我々は,このモデルについて,個体群と異なるスケールフリーネットワークを用いて検討する。スペイン・カナリア諸島に登録された企業によるVAT宣言の実際のデータを用いて,モデルパラメータを校正した。我々は,高低取引における監査確率のシナリオと人口の頻度,社会報酬や罰則を分析し,協力者の比率を高めるための最も効率的な政策を見出すことができた。 2つの大きな洞察が得られた。第一に、低取引に対する主観的な監査確率の増加は、高取引に対するこの確率の増加よりも効率的である。第二に、協力者に対する社会的報酬や、欠陥者に対する代替罰が効果的な政策であり得るが、その成功は、低取引と高取引の監査確率の分布に依存する。 This paper presents a computational evolutionary game model to study and understand fraud dynamics in the consumption tax system. Players are cooperators if they correctly declare their value added tax (VAT), and are defectors otherwise. Each player's payoff is influenced by the amount evaded and the subjective probability of being inspected by tax authorities. Since transactions between companies must be declared by both the buyer and seller, a strategy adopted by one influences the other's payoff. We study the model with a well-mixed population and different scale-free networks. Model parameters were calibrated using real-world data of VAT declarations by businesses registered in the Canary Islands region of Spain. We analyzed several scenarios of audit probabilities for high and low transactions and their prevalence in the population, as well as social rewards and penalties to find the most efficient policy to increase the proportion of cooperators. Two major insights were found. First, increasing the subjective audit probability for low transactions is more efficient than increasing this probability for high transactions. 
Second, favoring social rewards for cooperators or alternative penalties for defectors can be effective policies, but their success depends on the distribution of the audit probability for low and high transactions.	翻訳日:2021-04-04 01:39:48 公開日:2021-01-12
# 説明可能性の拡大:AIシステムにおける社会的透明性を目指して Expanding Explainability: Towards Social Transparency in AI systems ( http://arxiv.org/abs/2101.04719v1 ) ライセンス: Link先を確認	Upol Ehsan, Q. Vera Liao, Michael Muller, Mark O. Riedl, Justin D. Weisz	(参考訳) AIを利用したシステムは、連続的な意思決定を仲介する傾向にあるため、エンドユーザーが情報と説明責任を負う行動を取ることが重要である。人間と人間の相互作用の説明は社会的に構成されている。 AIシステムはしばしば社会組織に組み込まれる。しかし、説明可能なAI(XAI)アプローチは主にアルゴリズム中心である。我々は、社会的な組織的文脈をAIによる意思決定の説明に取り入れた社会的透明性(Social Transparency, ST)を導入し、探求することで、社会的なXAIへの発展的な一歩を踏み出した。 stを概念的に探究するため,我々は投機的設計シナリオに基づく29人のaiユーザと実践者とのインタビューを行った。我々はSTの構成的設計要素を提案し、STの効果と含意を技術、意思決定、組織レベルで解き放つ概念的枠組みを開発した。このフレームワークは、STがAIに対する信頼を校正し、意思決定を改善し、組織的な集団行動を促進し、全体的説明責任を育む方法について説明している。本研究は, XAI の設計空間を拡大し,人間中心型 XAI の言説に寄与する。 As AI-powered systems increasingly mediate consequential decision-making, their explainability is critical for end-users to take informed and accountable actions. Explanations in human-human interactions are socially-situated. AI systems are often socio-organizationally embedded. However, Explainable AI (XAI) approaches have been predominantly algorithm-centered. We take a developmental step towards socially-situated XAI by introducing and exploring Social Transparency (ST), a sociotechnically informed perspective that incorporates the socio-organizational context into explaining AI-mediated decision-making. To explore ST conceptually, we conducted interviews with 29 AI users and practitioners grounded in a speculative design scenario. We suggested constitutive design elements of ST and developed a conceptual framework to unpack ST's effect and implications at the technical, decision-making, and organizational level. The framework showcases how ST can potentially calibrate trust in AI, improve decision-making, facilitate organizational collective actions, and cultivate holistic explainability. Our work contributes to the discourse of Human-Centered XAI by expanding the design space of XAI.	翻訳日:2021-04-04 01:39:27 公開日:2021-01-12
# 肺疾患におけるct画像の定量および自動解析のための患者別アプローチ--covid-19患者への応用 A patient-specific approach for quantitative and automatic analysis of computed tomography images in lung disease: application to COVID-19 patients ( http://arxiv.org/abs/2101.04430v1 ) ライセンス: Link先を確認	L. Berta, C. De Mattia, F. Rizzetto, S. Carrazza, P.E. Colombo, R. Fumagalli, T. Langer, D. Lizio, A. Vanzulli, A. Torresin	(参考訳) 肺CT画像の定量的な計測は広く用いられており、しばしば生理学との明確なつながりがない。本研究は,CT画像(WAVE)における肺の高度評価のための患者非依存モデルを提案する。肺の下部CTヒストグラムデータポイントに平均 (Mu.f) と幅 (Sigma.f) のガウスフィットを適用し, よく評価された肺体積 (WAVE.f) を推定した。肺CT画像と4DCT画像を用いて,CT再建パラメータと呼吸周期の独立性を解析した。第3のコホートで算出されたガウス測定値と第1の放射線学的特徴を健康な肺と比較した。各肺はさらに24領域に区分され, 局所密度変化を表すため, ガウスフィットパラメータmu.f由来の新しいバイオマーカーが提案されている。 WAVE.fは80%の症例で呼吸運動から独立していた。 1%, 2%, 最大14%の違いは, 適度な反復強度とFBPアルゴリズム, 1mm, 3mmのスライス厚, 異なる再構成カーネルを比較した。健康な被験者は、計算されたすべての指標について、COVID-19患者と大きく異なっていた。局所バイオマーカーのグラフィカル表現は、単一の2次元画像において空間的および定量的情報を提供する。固定ヒストグラム閾値に基づく他の指標とは異なり、このモデルは物体間および物体内変動性を考えることができる。さらに、観察者とは独立に、病気の重症度を定量化するための局所バイオマーカーを定義する。 Quantitative metrics in lung computed tomography (CT) images have been widely used, often without a clear connection with physiology. This work proposes a patient-independent model for the estimation of well-aerated volume of lungs in CT images (WAVE). A Gaussian fit, with mean (Mu.f) and width (Sigma.f) values, was applied to the lower CT histogram data points of the lung to provide the estimation of the well-aerated lung volume (WAVE.f). Independence from CT reconstruction parameters and respiratory cycle was analysed using healthy lung CT images and 4DCT acquisitions. The Gaussian metrics and first order radiomic features calculated for a third cohort of COVID-19 patients were compared with those relative to healthy lungs. Each lung was further segmented in 24 subregions and a new biomarker derived from Gaussian fit parameter Mu.f was proposed to represent the local density changes. WAVE.f resulted independent from the respiratory motion in 80% of the cases. 
Differences of 1%, 2% and up to 14% resulted comparing a moderate iterative strength and FBP algorithm, 1 and 3 mm of slice thickness and different reconstruction kernel. Healthy subjects were significantly different from COVID-19 patients for all the metrics calculated. Graphical representation of the local biomarker provides spatial and quantitative information in a single 2D picture. Unlike other metrics based on fixed histogram thresholds, this model is able to consider the inter-and intra-subject variability. In addition, it defines a local biomarker to quantify the severity of the disease, independently of the observer.	翻訳日:2021-04-04 01:39:08 公開日:2021-01-12
# KuzborskijとSzepesváriの信頼境界について A note on a confidence bound of Kuzborskij and Szepesvári ( http://arxiv.org/abs/2101.04671v1 ) ライセンス: Link先を確認	Omar Rivasplata	(参考訳) 興味深い最近の研究で、Kuzborskij と Szepesvári は独立確率変数の関数に対する信頼境界を導出した。この境界は、集中度を選択した関数の二乗摂動と関連付ける不等式に基づいている。Kuzborskij と Szepesvári はさらに、この信頼境界のPAC-Bayes版も確立した。彼らの研究の2つの重要な側面は、確率変数の値域が非有界であってもよく、必ずしも同一分布でなくてもよいことである。本ノートの目的は、簡潔化した証明とともに、これらの興味深い結果を紹介・議論することである。この解説ノートは、たとえて言えば「本編の映画」は楽しむが予告編は飛ばしたい人のために書かれている。 In an interesting recent work, Kuzborskij and Szepesvári derived a confidence bound for functions of independent random variables, which is based on an inequality that relates concentration to squared perturbations of the chosen function. Kuzborskij and Szepesvári also established the PAC-Bayes-ification of their confidence bound. Two important aspects of their work are that the random variables could be of unbounded range, and not necessarily of an identical distribution. The purpose of this note is to advertise/discuss these interesting results, with streamlined proofs. This expository note is written for persons who, metaphorically speaking, enjoy the "featured movie" but prefer to skip the preview sequence.	翻訳日:2021-04-04 01:38:41 公開日:2021-01-12
# 自己教師あり表現学習による画像からの銀河距離の推定 Estimating Galactic Distances From Images Using Self-supervised Representation Learning ( http://arxiv.org/abs/2101.04293v1 ) ライセンス: Link先を確認	Md Abul Hayat, Peter Harrington, George Stein, Zarija Luki\'c, Mustafa Mustafa	(参考訳) 対照的な自己教師付き学習フレームワークを用いて、光度画像から銀河の距離を推定する。我々は、コンピュータビジョンからのデータ拡張と、銀河塵のアプリケーション固有の拡張を取り入れた。結果として得られる銀河画像の視覚的表現は意味的に有用であり、高速に類似性検索が可能であり、赤方偏移推定のタスクでうまく微調整できることがわかった。本研究では,(1)ラベルなしデータの大規模なコーパスを事前学習し,(2)ラベル付きデータに2-4倍の精度を必要とする完全教師付きモデルの精度を達成できること,(2)Sloan Digital Sky Survey (SDSS)のMain Galaxy Sampleにあるすべてのデータラベルを用いて自己教師付き表現を微調整することにより,最先端の教師付き学習手法よりも優れていることを示す。 We use a contrastive self-supervised learning framework to estimate distances to galaxies from their photometric images. We incorporate data augmentations from computer vision as well as an application-specific augmentation accounting for galactic dust. We find that the resulting visual representations of galaxy images are semantically useful and allow for fast similarity searches, and can be successfully fine-tuned for the task of redshift estimation. We show that (1) pretraining on a large corpus of unlabeled data followed by fine-tuning on some labels can attain the accuracy of a fully-supervised model which requires 2-4x more labeled data, and (2) that by fine-tuning our self-supervised representations using all available data labels in the Main Galaxy Sample of the Sloan Digital Sky Survey (SDSS), we outperform the state-of-the-art supervised learning method.	翻訳日:2021-04-04 01:38:28 公開日:2021-01-12
# CAnet:深層学習を用いたFDD大規模MIMOにおけるアップリンク支援ダウンリンクチャネル獲得 CAnet: Uplink-aided Downlink Channel Acquisition in FDD Massive MIMO using Deep Learning ( http://arxiv.org/abs/2101.04377v1 ) ライセンス: Link先を確認	Jiajia Guo, Chao-Kai Wen, Shi Jin	(参考訳) 周波数分割二重化システムでは、ダウンリンクチャネル状態情報(CSI)取得方式は高いトレーニングとフィードバックのオーバーヘッドをもたらす。本稿では,これらのオーバーヘッドを軽減するために,ディープラーニングを用いたアップリンク支援ダウンリンクチャネル獲得フレームワークを提案する。チャネル推定やフィードバックモジュールのみに焦点を当てた既存の作業とは異なり、私たちの知る限りでは、ダウンリンクパイロット設計、チャネル推定、フィードバックを含む、ダウンリンクCSI取得プロセス全体を考慮した最初の研究である。まず,角領域の双方向チャネル間の相関を利用して適応的なパイロット設計モジュールを提案し,チャネル推定を改善する。次に、フィードバックモジュール中のビット割り当て問題を回避するため、複雑なチャネルを結合し、基地局のチャネル再構成にアップリンクチャネルの大きさを埋め込む。最後に、上記の2つのモジュールを組み合わせて、2つの人気のあるダウンリンクチャネル獲得フレームワークを比較します。前者のフレームワークは、その後、ユーザ機器のチャネルを推定し、返送する。後者のユーザ装置は、受信したパイロット信号を基地局に直接送り返す。その結果、アップリンクの助けを借りて、パイロット信号を直接フィードバックすることで、約20%のフィードバックビットを節約できることがわかった。 In frequency-division duplexing systems, the downlink channel state information (CSI) acquisition scheme leads to high training and feedback overheads. In this paper, we propose an uplink-aided downlink channel acquisition framework using deep learning to reduce these overheads. Unlike most existing works that focus only on channel estimation or feedback modules, to the best of our knowledge, this is the first study that considers the entire downlink CSI acquisition process, including downlink pilot design, channel estimation, and feedback. First, we propose an adaptive pilot design module by exploiting the correlation in magnitude among bidirectional channels in the angular domain to improve channel estimation. Next, to avoid the bit allocation problem during the feedback module, we concatenate the complex channel and embed the uplink channel magnitude to the channel reconstruction at the base station. Lastly, we combine the above two modules and compare two popular downlink channel acquisition frameworks. The former framework estimates and feeds back the channel at the user equipment subsequently. 
The user equipment in the latter one directly feeds back the received pilot signals to the base station. Our results reveal that, with the help of uplink, directly feeding back the pilot signals can save approximately 20% of feedback bits, which provides a guideline for future research.	翻訳日:2021-04-04 01:38:08 公開日:2021-01-12
# ラジオミクス特徴と対照学習を用いた胸部X線における肺炎検出 Pneumonia Detection on Chest X-ray using Radiomic Features and Contrastive Learning ( http://arxiv.org/abs/2101.04269v1 ) ライセンス: Link先を確認	Yan Han, Chongyan Chen, Ahmed H Tewfik, Ying Ding, Yifan Peng	(参考訳) 胸部X線は、その非侵襲性から最も一般的な医療診断手段の1つとなっている。胸部X線画像の数は急増しているが、読影は依然として放射線科医が手作業で行っており、深刻な燃え尽きや遅延を生んでいる。医用画像から多数の定量的特徴を抽出できる放射線学の一分野であるラジオミクスは、深層学習時代以前から、医用画像診断を支援する可能性を示してきた。深層学習が台頭した現在も、胸部X線診断における深層ニューラルネットワークの説明可能性は依然として不透明である。本研究では,ラジオミクス特徴と対照学習を用いて胸部X線中の肺炎を検出する新しい枠組みを提案する。RSNA肺炎検出チャレンジデータセットを用いた実験により,提案モデルはいくつかの最先端モデルより優れた結果(F1スコアで10%超)を達成し,モデルの解釈性も向上することを示した。 Chest X-ray becomes one of the most common medical diagnoses due to its noninvasiveness. The number of chest X-ray images has skyrocketed, but reading chest X-rays still have been manually performed by radiologists, which creates huge burnouts and delays. Traditionally, radiomics, as a subfield of radiology that can extract a large number of quantitative features from medical images, demonstrates its potential to facilitate medical imaging diagnosis before the deep learning era. With the rise of deep learning, the explainability of deep neural networks on chest X-ray diagnosis remains opaque. In this study, we proposed a novel framework that leverages radiomics features and contrastive learning to detect pneumonia in chest X-ray. Experiments on the RSNA Pneumonia Detection Challenge dataset show that our model achieves superior results to several state-of-the-art models (> 10% in F1-score) and increases the model's interpretability.	翻訳日:2021-04-04 01:37:49 公開日:2021-01-12
# LiDARおよびカメラセンサセットアップの自動外部校正法 Automatic Extrinsic Calibration Method for LiDAR and Camera Sensor Setups ( http://arxiv.org/abs/2101.04431v1 ) ライセンス: Link先を確認	Jorge Beltr\'an, Carlos Guindel, Fernando Garc\'ia	(参考訳) ほとんどのセンサーはlidarと視覚システムで構成されており、ロバストなシーン理解を得るために必要な異なるアルゴリズムの信頼性を向上させる補完的情報を提供する。しかし、異なるソースからの情報の効果的な使用には、関連するセンサー間の正確なキャリブレーションが必要である。そこで本研究では,LiDAR,モノクラーカメラ,ステレオカメラを含むセンサ対の外部パラメータを同一あるいは異なるモードで校正する手法を提案する。第1に、カスタム校正対象に属する基準点を校正するセンサによって提供されるデータから抽出し、第2に、両点セットの登録により最適な剛性変換を求める。提案手法は、通常車両のセットアップで見られるように、非常に異なる解像度とポーズのデバイスを扱うことができる。提案手法の性能を評価するため,一般的なシミュレーションフレームワーク上に構築された新しい評価スイートを紹介した。合成環境における実験により, キャリブレーションアルゴリズムは既存の手法よりも有意に優れており, 実データテストは評価スイートで得られた結果と相関することがわかった。オープンソースコードはhttps://github.com/beltransen/velo2cam_calibrationで入手できる。 Most sensor setups for onboard autonomous perception are composed of LiDARs and vision systems, as they provide complementary information that improves the reliability of the different algorithms necessary to obtain a robust scene understanding. However, the effective use of information from different sources requires an accurate calibration between the sensors involved, which usually implies a tedious and burdensome process. We present a method to calibrate the extrinsic parameters of any pair of sensors involving LiDARs, monocular or stereo cameras, of the same or different modalities. The procedure is composed of two stages: first, reference points belonging to a custom calibration target are extracted from the data provided by the sensors to be calibrated, and second, the optimal rigid transformation is found through the registration of both point sets. The proposed approach can handle devices with very different resolutions and poses, as usually found in vehicle setups. In order to assess the performance of the proposed method, a novel evaluation suite built on top of a popular simulation framework is introduced. 
Experiments on the synthetic environment show that our calibration algorithm significantly outperforms existing methods, whereas real data tests corroborate the results obtained in the evaluation suite. Open-source code is available at https://github.com/beltransen/velo2cam_calibration	翻訳日:2021-04-04 01:37:33 公開日:2021-01-12
# 野生における共同脱塩・脱鼻--地底不確かさ下での訓練を事例として Joint Demosaicking and Denoising in the Wild: The Case of Training Under Ground Truth Uncertainty ( http://arxiv.org/abs/2101.04442v1 ) ライセンス: Link先を確認	Jierun Chen, Song Wen, S.-H. Gary Chan	(参考訳) デジタルカメラパイプラインにおける2つの基本的なステップは、ノイズの多い輝度からクリーンなカラーイメージを再構築することである。本稿では,野生における共同解体・復調のための新しい学習フレームワークであるWild-JDDを提案し,研究する。トレーニングデータの基底的真理が現実の完全な反映であると一般的に仮定する先行研究とは対照的に、ここでは野生における基底的真理の不確かさのより一般的な不完全なケースを考察する。まず, ジッパー効果, カラーモアレ, 残留雑音など, 様々な種類の人工物として現れることを示す。次に,2段階データ分解過程を定式化し,基底分布に共役事前分布を課すような基礎的真理不確かさを捉える。その後、劣化した入力に基づいて条件付けられた共役事前分布のパラメータを近似するニューラルネットワークを訓練するために、下限値(elbo)損失の証拠を導出する。最後に, 分散型入力の性能をさらに高めるために, 入力を弱い情報量優先にすることで, 単純かつ効果的な微調整戦略を考案する。基礎的な真実の不確実性を考慮すると、Wild-JDDは最適化の間、よく解釈可能である。広範な実験によって、合成データセットとリアルデータセットの両方で、共同デモサイクリングとデノイジングタスクで最先端のスキームを上回ることが検証された。 Image demosaicking and denoising are the two key fundamental steps in digital camera pipelines, aiming to reconstruct clean color images from noisy luminance readings. In this paper, we propose and study Wild-JDD, a novel learning framework for joint demosaicking and denoising in the wild. In contrast to previous works which generally assume the ground truth of training data is a perfect reflection of the reality, we consider here the more common imperfect case of ground truth uncertainty in the wild. We first illustrate its manifestation as various kinds of artifacts including zipper effect, color moire and residual noise. Then we formulate a two-stage data degradation process to capture such ground truth uncertainty, where a conjugate prior distribution is imposed upon a base distribution. After that, we derive an evidence lower bound (ELBO) loss to train a neural network that approximates the parameters of the conjugate prior distribution conditioned on the degraded input. Finally, to further enhance the performance for out-of-distribution input, we design a simple but effective fine-tuning strategy by taking the input as a weakly informative prior. 
Taking into account ground truth uncertainty, Wild-JDD enjoys good interpretability during optimization. Extensive experiments validate that it outperforms state-of-the-art schemes on joint demosaicking and denoising tasks on both synthetic and realistic raw datasets.	翻訳日:2021-04-04 01:37:13 公開日:2021-01-12
# Binary TTC: 自律ナビゲーションのための時間ジオフェンス Binary TTC: A Temporal Geofence for Autonomous Navigation ( http://arxiv.org/abs/2101.04777v1 ) ライセンス: Link先を確認	Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen	(参考訳) タイム・トゥ・コンタクト(TTC、Time-to-Contact)は、物体が観測者の平面と衝突するまでの時間であり、経路計画のための強力なツールである。シーン中の物体の深度、速度、加速度よりも有益な情報となりうる。TTCには、校正されていない単眼カメラのみで済むなど、いくつかの利点がある。しかし、各画素に対するTTCの回帰は簡単ではなく、既存のほとんどの手法はシーンについて過度に単純化した仮定を置いている。この課題に対処するために、一連のより単純なバイナリ分類によってTTCを推定する。我々は、観測者が一定の時間内に障害物と衝突するかどうかを低レイテンシで予測する。これは、画素ごとの正確なTTCを知ることよりも重要な場合が多い。このようなシナリオでは、提案手法は6.4ミリ秒で時間ジオフェンスを提供し、これは既存手法の25倍以上高速である。提案手法は,計算予算が許す場合,任意に細かい量子化(連続値を含む)で画素ごとのTTCを推定することもできる。我々の知る限り,本手法は実用上十分に高いフレームレートでTTC情報(バイナリまたは粗い量子化)を提供する初めての手法である。 Time-to-contact (TTC), the time for an object to collide with the observer's plane, is a powerful tool for path planning: it is potentially more informative than the depth, velocity, and acceleration of objects in the scene -- even for humans. TTC presents several advantages, including requiring only a monocular, uncalibrated camera. However, regressing TTC for each pixel is not straightforward, and most existing methods make over-simplifying assumptions about the scene. We address this challenge by estimating TTC via a series of simpler, binary classifications. We predict with low latency whether the observer will collide with an obstacle within a certain time, which is often more critical than knowing exact, per-pixel TTC. For such scenarios, our method offers a temporal geofence in 6.4 ms -- over 25x faster than existing methods. Our approach can also estimate per-pixel TTC with arbitrarily fine quantization (including continuous values), when the computational budget allows for it. To the best of our knowledge, our method is the first to offer TTC information (binary or coarsely quantized) at sufficiently high frame-rates for practical use.	翻訳日:2021-04-04 01:36:32 公開日:2021-01-12
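The idea of replacing TTC regression with a series of binary "will we collide within this horizon?" decisions can be sketched with simple geometry. The paper uses a learned classifier; the sketch below swaps in a closed-form test and is only an illustration. It assumes a constant closing speed, so that an object whose apparent size grows by a factor `scale_ratio` between two frames `frame_dt` seconds apart has TTC = frame_dt / (scale_ratio - 1), measured from the later frame. All function and parameter names are hypothetical.

```python
def collides_within(scale_ratio, frame_dt, horizon):
    """Binary TTC test: True if time-to-contact is below `horizon` seconds.

    scale_ratio: apparent size in the current frame / size in the previous frame.
    frame_dt:    time between the two frames, in seconds.
    Assumes constant closing speed (an assumption of this sketch).
    """
    if scale_ratio <= 1.0:
        return False  # not growing in the image, so not approaching
    return frame_dt / (scale_ratio - 1.0) < horizon


def quantized_ttc(scale_ratio, frame_dt, horizons):
    """Coarsely quantised TTC built from several binary tests.

    Returns the smallest horizon whose binary test fires, or None if the
    object is not predicted to make contact within any of the horizons.
    """
    for horizon in sorted(horizons):
        if collides_within(scale_ratio, frame_dt, horizon):
            return horizon
    return None


# Hypothetical example: the object appears 5% larger after 0.1 s,
# which corresponds to a TTC of roughly 2 s.
ttc_bin = quantized_ttc(scale_ratio=1.05, frame_dt=0.1, horizons=[1.0, 3.0, 5.0])
```

Using more horizons gives a finer quantisation at the cost of more binary tests, which mirrors the trade-off between latency and resolution described in the abstract.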
# ドメインフリーな医用画像拡張のための生成逆U-Net Generative Adversarial U-Net for Domain-free Medical Image Augmentation ( http://arxiv.org/abs/2101.04793v1 ) ライセンス: Link先を確認	Xiaocong Chen and Yun Li and Lina Yao and Ehsan Adeli and Yu Zhang	(参考訳) 注釈付き医用画像の不足は、医用画像コンピューティングの分野における最大の課題の1つだ。十分な数のトレーニングサンプルがなければ、ディープラーニングベースのモデルは過剰フィッティングの問題に苦しむ可能性が高い。一般的な解決策は、画像回転、トリミング、リサイズなどの画像操作である。これらの方法は、より多くのトレーニングサンプルが導入されるにつれて、過度に適合する問題を緩和するのに役立ちます。しかし、追加情報を持つ新しい画像を導入することはなく、テストセットがトレーニングセットに現れる類似のサンプルを含む可能性があるため、データ漏洩につながる可能性がある。この課題に対処するために,生成型逆ネットワークを用いた多様な画像を生成することを提案する。本稿では, 生成逆ネットワークとU-Netの両方を利用する, 生成逆ネットワークと呼ばれる新しい生成手法を開発する。既存のアプローチとは異なり、新しく設計されたモデルはドメインフリーで、様々な医療画像に一般化できる。コンピュータ断層撮影(CT)スキャン,病理学,X線など,8つの多様なデータセットに対して大規模な実験を行った。可視化と定量化により,提案手法の有効性を実証し,高画質な医用画像の生成に有効であることを示す。 The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing. Without a sufficient number of training samples, deep learning based models are very likely to suffer from over-fitting problem. The common solution is image manipulation such as image rotation, cropping, or resizing. Those methods can help relieve the over-fitting problem as more training samples are introduced. However, they do not really introduce new images with additional information and may lead to data leakage as the test set may contain similar samples which appear in the training set. To address this challenge, we propose to generate diverse images with generative adversarial network. In this paper, we develop a novel generative method named generative adversarial U-Net , which utilizes both generative adversarial network and U-Net. Different from existing approaches, our newly designed model is domain-free and generalizable to various medical images. Extensive experiments are conducted over eight diverse datasets including computed tomography (CT) scan, pathology, X-ray, etc. 
The visualization and quantitative results demonstrate the efficacy and good generalization of the proposed method in generating a wide array of high-quality medical images.	翻訳日:2021-04-04 01:36:13 公開日:2021-01-12
# トレース比最適化と多視点学習への応用 Trace Ratio Optimization with an Application to Multi-view Learning ( http://arxiv.org/abs/2101.04292v1 ) ライセンス: Link先を確認	Li Wang and Lei-Hong Zhang and Ren-Cang Li	(参考訳) スティーフェル多様体上のトレース比最適化問題について,理論と数値計算の両方の観点から検討した。この問題の少なくとも3つの特別な場合が,それぞれフィッシャー線形判別分析,正準相関解析,非平衡プロクラステス問題から生じている。固有ベクトル依存性を持つ非線形固有値問題の形で必要条件が確立され、自己無撞着場(SCF)反復に基づく数値法が設計され、常に収束することが証明された。多視点部分空間学習への応用として,新しいフレームワークとそれを具体化したモデルを提案し,実世界データセット上で実証する。数値実験の結果,提案した数値手法の効率性と新しい多視点部分空間学習モデルの有効性が示された。 A trace ratio optimization problem over the Stiefel manifold is investigated from the perspectives of both theory and numerical computations. At least three special cases of the problem have arisen from Fisher linear discriminant analysis, canonical correlation analysis, and the unbalanced Procrustes problem, respectively. Necessary conditions in the form of a nonlinear eigenvalue problem with eigenvector dependency are established, and a numerical method based on the self-consistent field (SCF) iteration is designed and proved to be always convergent. As an application to multi-view subspace learning, a new framework and its instantiated concrete models are proposed and demonstrated on real-world data sets. Numerical results demonstrate the efficiency of the proposed numerical methods and the effectiveness of the new multi-view subspace learning models.	翻訳日:2021-04-04 01:35:57 公開日:2021-01-12
# 「NeurIPS 2020 Workshop on Machine Learning for the Developing World: Improving Resilience」論文集 Proceedings of the NeurIPS 2020 Workshop on Machine Learning for the Developing World: Improving Resilience ( http://arxiv.org/abs/2101.04347v1 ) ライセンス: Link先を確認	Tejumade Afonja, Konstantin Klemmer, Aya Salama, Paula Rodriguez Diaz, Niveditha Kalavakonda, Oluwafemi Azeez	(参考訳) 以下は、2020年12月12日土曜日に、第34回Conference on Neural Information Processing Systems (NeurIPS)の一部として開催された、第4回Machine Learning for the Developing World (ML4D)ワークショップの論文集である。 These are the proceedings of the 4th workshop on Machine Learning for the Developing World (ML4D), held as part of the Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS) on Saturday, December 12th 2020.	翻訳日:2021-04-04 01:35:45 公開日:2021-01-12
# 活性化密度に基づくエネルギー効率の良いニューラルネットワークの混合精度量子化 Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks ( http://arxiv.org/abs/2101.04354v1 ) ライセンス: Link先を確認	Karina Vasquez, Yeshwanth Venkatesha, Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda	(参考訳) ニューラルネットワークが組み込みデバイスで広く普及するにつれて、リソース制約のある環境への展開を容易にするためのモデル圧縮技術が必要である。量子化は最先端のモデル圧縮をもたらすゴートメソッドの1つである。ほとんどのアプローチは、完全に訓練されたモデルを採用し、異なるヒューリスティックを適用して、ネットワークの異なる層に対して最適なビット精度を決定する。活性化密度 (AD) に基づいて, 層内の非ゼロ活性化の比率を推定し, イントレーニング量子化法を提案する。本手法は,混合精度モデルによる学習中の各層に対するビット幅を計算する。トレーニング中に精度の低いモデルをトレーニングするため、このアプローチはトレーニング複雑性の低い最終量子化モデルをもたらし、再トレーニングの必要性も排除します。我々は、VGG19/ResNet18アーキテクチャ上で、CIFAR-10、CIFAR-100、TinyImagenetなどのベンチマークデータセットで実験を行い、その精度とエネルギー推定を報告する。推定乗算累積 (MAC) の削減と, トレーニングの複雑さを50%減らすことで, 4.5倍の利点が得られる。提案手法の省エネルギー効果を更に評価するため,pim(mixed-precision scalable process in memory)ハードウェアアクセラレーションプラットフォームを開発した。ハードウェアプラットフォームには、マルチビット精密ニューラルネットワークモデルを扱うためのシフト付加機能が含まれている。提案手法を用いて得られた量子化モデルをPIMプラットフォーム上で評価すると,16ビットモデルと比較して約5倍のエネルギー削減が得られる。さらに,広告ベースの量子化と広告ベースのプルーニング(どちらもトレーニング中)を統合すると,vgg19とresnet18アーキテクチャの最大198倍,44倍のエネルギー削減がpcmプラットフォーム上で実現されることが分かった。 As neural networks gain widespread adoption in embedded devices, there is a need for model compression techniques to facilitate deployment in resource-constrained environments. Quantization is one of the go-to methods yielding state-of-the-art model compression. Most approaches take a fully trained model, apply different heuristics to determine the optimal bit-precision for different layers of the network, and retrain the network to regain any drop in accuracy. Based on Activation Density (AD)-the proportion of non-zero activations in a layer-we propose an in-training quantization method. Our method calculates bit-width for each layer during training yielding a mixed precision model with competitive accuracy. 
Since the models are trained at lower precision, our approach yields the final quantized model at lower training complexity and also eliminates the need for re-training. We run experiments with VGG19 and ResNet18 architectures on benchmark datasets such as CIFAR-10, CIFAR-100, and TinyImagenet, and report the accuracy and energy estimates for each. We achieve a ~4.5x benefit in terms of estimated multiply-and-accumulate (MAC) reduction while reducing the training complexity by 50% in our experiments. To further evaluate the energy benefits of our proposed method, we develop a mixed-precision scalable Process In Memory (PIM) hardware accelerator platform. The hardware platform incorporates shift-add functionality for handling multi-bit precision neural network models. Evaluating the quantized models obtained with our proposed method on the PIM platform yields a ~5x energy reduction compared to 16-bit models. Additionally, we find that integrating AD-based quantization with AD-based pruning (both conducted during training) yields up to ~198x and ~44x energy reductions for the VGG19 and ResNet18 architectures, respectively, on the PIM platform compared to baseline 16-bit precision, unpruned models.	翻訳日:2021-04-04 01:35:38 公開日:2021-01-12
# 機械学習による新しい半導体の解釈可能な発見 Interpretable discovery of new semiconductors with machine learning ( http://arxiv.org/abs/2101.04383v1 ) ライセンス: Link先を確認	Hitarth Choubisa (1), Petar Todorovi\'c (1), Joao M. Pina (1), Darshan H. Parmar (1), Ziliang Li (1), Oleksandr Voznyy (4), Isaac Tamblyn (2,3), Edward Sargent (1) ((1) Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada, (2) National Research Council of Canada, Ottawa, ON, Canada, (3) Vector Institute for Artificial Intelligence, Toronto, ON, Canada, (4) Department of Physical and Environmental Sciences, University of Toronto, Scarborough, ON, Canada)	(参考訳) ディープラーニングモデルは、密度汎関数理論(DFT)で計算された結果を、DFT$^{6}$のコストの10万分の1で再現する。実験材料合成におけるガイダンスを提供するには, 正確かつ効果的な探索アルゴリズムと, 実験観測と整合したトレーニングデータを組み合わせる必要がある。本稿では,Deep Adaptive Regressive Weighted Intelligent Network (DARWIN) を用いて,高スループットハイブリッドDFTデータに基づいて学習したマシン学習サロゲートモデルを用いた進化的アルゴリズムを報告する。この戦略は、対象特性を持つ候補に対して、10$^8$三元および10$^{11}$四元数$^{7}$の材料空間の効率的な探索を可能にする。ハロゲン化物とBサイトカチオンの電気陰性度の違いが3次構造安定性の強い予測因子であることの発見など、解釈可能な設計規則を提供する。例えば、紫外線放射を求めるとき、DARWINはその電子陰性率差に基づいて、K$_2$CuX$_3$ (X = Cl, Br) を有望な物質族として予測する。我々はこれらの物質を、安定で直接バンドギャップUVエミッタとして合成し、発見した。このアプローチは、人間が使用する知識蒸留も可能にする。 Machine learning models of materials$^{1-5}$ accelerate discovery compared to ab initio methods: deep learning models now reproduce density functional theory (DFT)-calculated results at one hundred thousandths of the cost of DFT$^{6}$. To provide guidance in experimental materials synthesis, these need to be coupled with an accurate yet effective search algorithm and training data consistent with experimental observations. Here we report an evolutionary algorithm powered search which uses machine-learned surrogate models trained on high-throughput hybrid functional DFT data benchmarked against experimental bandgaps: Deep Adaptive Regressive Weighted Intelligent Network (DARWIN). 
The strategy enables efficient search over the materials space of ~10$^8$ ternaries and 10$^{11}$ quaternaries$^{7}$ for candidates with target properties. It provides interpretable design rules, such as our finding that the difference in electronegativity between the halide and the B-site cation is a strong predictor of ternary structural stability. As an example, when we seek UV emission, DARWIN predicts K$_2$CuX$_3$ (X = Cl, Br) as a promising materials family, based on its electronegativity difference. We synthesized these materials and found them to be stable, direct-bandgap UV emitters. The approach also allows knowledge distillation for use by humans.	翻訳日:2021-04-04 01:35:04 公開日:2021-01-12
# 二成分ニューラルネットワークによる高出力IoTデバイス上の音事象検出 Sound Event Detection with Binary Neural Networks on Tightly Power-Constrained IoT Devices ( http://arxiv.org/abs/2101.04446v1 ) ライセンス: Link先を確認	Gianmarco Cerutti, Renzo Andri, Lukas Cavigelli, Michele Magno, Elisabetta Farella, Luca Benini	(参考訳) サウンドイベント検出(SED)は、消費者およびスマートシティアプリケーションにおいてホットなトピックである。ディープニューラルネットワークに基づく既存のアプローチは非常に効果的だが、超低消費電力の常時オンデバイスをターゲットにする場合、メモリ、電力、スループットの面で非常に要求される。レイテンシ、可用性、コスト、プライバシ要件は、最新のIoTシステムに対して、センサに近いノード上でデータを処理し、非常に限られたエネルギー供給と、最先端のDNNを実行する前にメモリサイズと処理能力に厳しい制約を課している。本稿では,高エネルギー効率なRISC-V(8+1)コアGAP8マイクロコントローラと,極端量子化と小フットプリント型バイナリニューラルネットワーク(BNN)の組み合わせについて検討する。既存のSED用CNNのフットプリント(815kB)が、当社プラットフォームで利用可能なメモリ512kBを超えていることから、バイナリフィルタとアクティベーションを使用してネットワークを再トレーニングし、これらのメモリ制約を満たす。完全な)バイナリニューラルネットワークは、同等の完全精度のベースラインに比べて、難しいImageNetオブジェクト認識チャレンジにおいて、12-18%の精度が自然に低下する。このBNNは77.9%の精度に達し、全精度版よりわずか7%低く、重量は58kB(7.2倍)、メモリは262kB(2.4倍)である。 BNNの実装では,全ネットワーク上での最大スループットは4.6 GMAC/sと1.5 GMAC/sで,それぞれ67.1 GMAC/s/W,31.3 GMAC/s/Wの効率に対応するMel binsによる前処理を含む。 ARM Cortex-M4の実装と比較して、我々のシステムは実行時間が10.3倍速く、エネルギー効率が51.1倍高い。 Sound event detection (SED) is a hot topic in consumer and smart city applications. Existing approaches based on Deep Neural Networks are very effective, but highly demanding in terms of memory, power, and throughput when targeting ultra-low power always-on devices. Latency, availability, cost, and privacy requirements are pushing recent IoT systems to process the data on the node, close to the sensor, with a very limited energy supply, and tight constraints on the memory size and processing capabilities precluding to run state-of-the-art DNNs. In this paper, we explore the combination of extreme quantization to a small-footprint binary neural network (BNN) with the highly energy-efficient, RISC-V-based (8+1)-core GAP8 microcontroller. 
Starting from an existing CNN for SED whose footprint (815 kB) exceeds the 512 kB of memory available on our platform, we retrain the network using binary filters and activations to match these memory constraints. (Fully) binary neural networks come with a natural drop in accuracy of 12-18% on the challenging ImageNet object recognition challenge compared to their equivalent full-precision baselines. Our BNN reaches 77.9% accuracy, just 7% lower than the full-precision version, with 58 kB (7.2 times less) for the weights and 262 kB (2.4 times less) memory in total. With our BNN implementation, we reach a peak throughput of 4.6 GMAC/s, and 1.5 GMAC/s over the full network including preprocessing with Mel bins, which corresponds to an efficiency of 67.1 GMAC/s/W and 31.3 GMAC/s/W, respectively. Compared to the performance of an ARM Cortex-M4 implementation, our system has a 10.3 times faster execution time and a 51.1 times higher energy efficiency.	翻訳日:2021-04-04 01:34:44 公開日:2021-01-12
# 深層ニューラルネットワークを用いた呼吸イベントの自動検出 Automated Respiratory Event Detection Using Deep Neural Networks ( http://arxiv.org/abs/2101.04635v1 ) ライセンス: Link先を確認	Thijs E Nassi, Wolfgang Ganglberger, Haoqi Sun, Abigail A Bucklin, Siddharth Biswal, Michel J A M van Putten, Robert J Thomas, M Brandon Westover	(参考訳) 睡眠中の呼吸を評価するゴールドスタンダードはポリソムノグラフィ(polysomnography)であり、重荷が高く(分析時間と測定コストの両方において)、繰り返すのが困難である。呼吸分析の自動化は、テスト効率を改善し、世界中で利用可能な実装機会を可能にする。マサチューセッツ総合病院(MGH)の9,656個のポリソムノグラフィー記録を用いて, 閉塞性無呼吸, 中枢性無呼吸, 低呼吸, 呼吸自覚関連覚醒を検出するため, 単一呼吸帯に基づくニューラルネットワーク(WaveNet)を訓練した。パフォーマンス評価には、apnea-hypopnea index分析を用いたイベントベースおよび記録ベースのメトリクスが含まれる。このモデルは8,455枚のポリソノグラフィー記録を含む公開データセットであるSleep-Heart-Health-Study-1でさらに評価された。 MGHデータセットの2次無呼吸事象検出には、95%の精度、0.89のアパネ-ハイパネ指数$r^2$、レシーバ動作特性曲線の曲線下領域、0.93と0.74の精度-リコール曲線が得られた。マルチクラスタスクでは,全ラベル付き中枢性無呼吸の81%が正しく分類され,この指標は閉塞性無呼吸の46%,呼吸時無呼吸の29%,低呼吸の16%であった。誤った予測の大部分は、別の種類の呼吸イベントとして誤分類であった。呼吸イベントを完全自動検出し, 臨床応用に十分な精度で無呼吸ハイポネア指数を評価できる。イベントタイプの分化はより困難であり、人間の呼吸アウトプットの複雑さと、手動アノテーションで使用される臨床閾値と基準のある程度の任意性を反映している可能性がある。 The gold standard to assess respiration during sleep is polysomnography; a technique that is burdensome, expensive (both in analysis time and measurement costs), and difficult to repeat. Automation of respiratory analysis can improve test efficiency and enable accessible implementation opportunities worldwide. Using 9,656 polysomnography recordings from the Massachusetts General Hospital (MGH), we trained a neural network (WaveNet) based on a single respiratory effort belt to detect obstructive apnea, central apnea, hypopnea and respiratory-effort related arousals. Performance evaluation included event-based and recording-based metrics - using an apnea-hypopnea index analysis. The model was further evaluated on a public dataset, the Sleep-Heart-Health-Study-1, containing 8,455 polysomnographic recordings. 
For binary apnea event detection in the MGH dataset, the neural network obtained an accuracy of 95%, an apnea-hypopnea index $r^2$ of 0.89 and area under the curve for the receiver operating characteristics curve and precision-recall curve of 0.93 and 0.74, respectively. For the multiclass task, we obtained varying performances: 81% of all labeled central apneas were correctly classified, whereas this metric was 46% for obstructive apneas, 29% for respiratory effort related arousals and 16% for hypopneas. The majority of false predictions were misclassifications as another type of respiratory event. Our fully automated method can detect respiratory events and assess the apnea-hypopnea index with sufficient accuracy for clinical utilization. Differentiation of event types is more difficult and may reflect in part the complexity of human respiratory output and some degree of arbitrariness in the clinical thresholds and criteria used during manual annotation.	翻訳日:2021-04-04 01:34:10 公開日:2021-01-12
# double-adversarial activation anomaly detection: adversarial autoencoder are anomaly generators Double-Adversarial Activation Anomaly Detection: Adversarial Autoencoders are Anomaly Generators ( http://arxiv.org/abs/2101.04645v1 ) ライセンス: Link先を確認	J.-P. Schulze, P. Sperl, K. B\"ottinger	(参考訳) 異常検出は、固有のクラス不均衡のため、機械学習アルゴリズムにとって難しいタスクである。観測されたデータを手動で分析するのはコストが高く、時間を要するため、通常、使用可能な場合の既知の異常はごくわずかである。生成モデルとニューラルネットワークの隠れ活性化の解析に着想を得て,DA3Dと呼ばれる新しい教師なし異常検出手法を導入する。ここでは,通常のデータのみに基づく異常な反例を生成するために,対向オートエンコーダを用いる。これらの人工的な異常は、実際の、しかし目に見えない異常を検出することができる。新たな生成手法により,異常検出の教師なしタスクを教師付きタスクに変換する。 DA3Dは、ドメイン知識を必要としない純粋にデータ駆動の方法で最先端の異常検出手法の性能を上回る。 Anomaly detection is a challenging task for machine learning algorithms due to the inherent class imbalance. It is costly and time-demanding to manually analyse the observed data, thus usually only few known anomalies if any are available. Inspired by generative models and the analysis of the hidden activations of neural networks, we introduce a novel unsupervised anomaly detection method called DA3D. Here, we use adversarial autoencoders to generate anomalous counterexamples based on the normal data only. These artificial anomalies used during training allow the detection of real, yet unseen anomalies. With our novel generative approach, we transform the unsupervised task of anomaly detection to a supervised one, which is more tractable by machine learning and especially deep learning methods. DA3D surpasses the performance of state-of-the-art anomaly detection methods in a purely data-driven way, where no domain knowledge is required.	翻訳日:2021-04-04 01:33:37 公開日:2021-01-12
# 人・場所・つながり--社会的場所の景観と社会ネットワーク構造 People, Places, and Ties: Landscape of social places and their social network structures ( http://arxiv.org/abs/2101.04737v1 ) ライセンス: Link先を確認	Jaehyuk Park, Bogdan State, Monica Bhole, Michael C. Bailey, and Yong-Yeol Ahn	(参考訳) 社会化の場として本質的な役割から、ネットワーク科学、社会学、地理学、都市計画、地域研究など幅広い分野から「第三の場所」が研究されている。しかし、第3位に大規模な国勢調査がないため、研究者は体系的な調査を控えた。ここでは,facebookページを用いて,第三者とそのソーシャルネットワークを組織的に調査する。解析の結果,第3地点の分布は地理的に多様であり,その分布は人口動態や郡特性と高い相関関係にあることが明らかとなった。礼拝の場所」のような特定の種類のページは、コミュニティの好みや集中に対する潜在的な相補性を示唆する大量のクラスタリングを示している。また, 異なるタイプの社会的場所のソーシャルネットワークは, 既成友情の密着したコミュニティである可能性が高いのに対して, 既成友情のPlaces of Worship と「コミュニティ・アメニティ」のページカテゴリーは, 新たな友情の結びつきを橋渡しする傾向にある。本研究は,社会空間と社会関係の体系的比較研究において,今後の研究のマイルストーンとなるものと考えられる。 Due to their essential role as places for socialization, "third places" - social places where people casually visit and communicate with friends and neighbors - have been studied by a wide range of fields including network science, sociology, geography, urban planning, and regional studies. However, the lack of a large-scale census on third places kept researchers from systematic investigations. Here we provide a systematic nationwide investigation of third places and their social networks, by using Facebook pages. Our analysis reveals a large degree of geographic heterogeneity in the distribution of the types of third places, which is highly correlated with baseline demographics and county characteristics. Certain types of pages like "Places of Worship" demonstrate a large degree of clustering suggesting community preference or potential complementarities to concentration. We also found that the social networks of different types of social place differ in important ways: The social networks of 'Restaurants' and 'Indoor Recreation' pages are more likely to be tight-knit communities of pre-existing friendships whereas 'Places of Worship' and 'Community Amenities' page categories are more likely to bridge new friendship ties. 
We believe that this study can serve as an important milestone for future studies on the systematic comparative study of social spaces and their social relationships.	翻訳日:2021-04-04 01:33:21 公開日:2021-01-12
# エアフォイル gan: エアフォイルのエンコーディングと合成 foraerodynamic-aware shape optimization Airfoil GAN: Encoding and Synthesizing Airfoils forAerodynamic-aware Shape Optimization ( http://arxiv.org/abs/2101.04757v1 ) ライセンス: Link先を確認	Yuyang Wang, Kenji Shimada, Amir Barati Farimani	(参考訳) エアフォイルのような空力形状の現在の設計は、可能な設計空間を探索するための計算集約的なシミュレーションを伴う。通常、このような設計は設計パラメータの事前定義に依存し、新しい形状の合成に制限を課す。本研究では,既存の翼から表現を自動的に学習し,学習した表現を用いて新しい翼を生成するデータ駆動型形状符号化・生成法を提案する。これらの表現は、空気力学的性能に基づいて合成翼形状の最適化に使用される。我々のモデルは、変分オートエンコーダとジェネレーティブ・アドバーサリアル・ネットワークを組み合わせたニューラルネットワークであるVAEGANに基づいて構築されており、勾配に基づく手法で訓練されている。本モデルでは,(1)既存のエアフォイルを潜在ベクターにエンコードし,それからエアフォイルを再構築し,(2)潜在ベクターをランダムにサンプリングしてエアフォイル座標領域にマッピングし,(3)学習した特徴を遺伝的アルゴリズムにより最適化し,所望の空力特性を有するエアフォイルを合成する。実験の結果,事前定義された設計パラメータを使わずに,形状情報を網羅的かつ包括的に符号化できることがわかった。特徴ベクトルの補間/補間またはガウス雑音からのサンプリングにより、モデルは新しい翼形状を自動的に合成することができる。遺伝的アルゴリズムによって学習された特徴の形状を最適化することで、合成された翼は特定の空力特性を持つように進化し、空力製品の設計を効果的かつ効率的に導くことができる。 The current design of aerodynamic shapes, like airfoils, involves computationally intensive simulations to explore the possible design space. Usually, such design relies on the prior definition of design parameters and places restrictions on synthesizing novel shapes. In this work, we propose a data-driven shape encoding and generating method, which automatically learns representations from existing airfoils and uses the learned representations to generate new airfoils. The representations are then used in the optimization of synthesized airfoil shapes based on their aerodynamic performance. Our model is built upon VAEGAN, a neural network that combines Variational Autoencoder with Generative Adversarial Network and is trained by the gradient-based technique. 
Our model can (1) encode an existing airfoil into a latent vector and reconstruct the airfoil from it, (2) generate novel airfoils by randomly sampling latent vectors and mapping them to the airfoil coordinate domain, and (3) synthesize airfoils with desired aerodynamic properties by optimizing learned features via a genetic algorithm. Our experiments show that the learned features encode shape information thoroughly and comprehensively without predefined design parameters. By interpolating/extrapolating feature vectors or sampling from Gaussian noise, the model can automatically synthesize novel airfoil shapes, some of which possess competitive or even better aerodynamic properties compared with the training airfoils. By optimizing shape on learned features via a genetic algorithm, synthesized airfoils can evolve to have specific aerodynamic properties, which can guide the design of aerodynamic products effectively and efficiently.	翻訳日:2021-04-04 01:32:57 公開日:2021-01-12
# SARA(Self-Adaptive Reconfigurable Arrays):MLを用いたGEMM高速化のスケーリング支援 Self-Adaptive Reconfigurable Arrays (SARA): Using ML to Assist Scaling GEMM Acceleration ( http://arxiv.org/abs/2101.04799v1 ) ライセンス: Link先を確認	Ananda Samajdar, Michael Pellauer, Tushar Krishna	(参考訳) 層の形状とサイズの観点からディープニューラルネットワーク(DNN)モデルの多様性が増すにつれ、研究コミュニティは柔軟で再構成可能なアクセラレータ基盤を調査してきた。この研究は2つの課題を提起した。ひとつは、性能上のメリットと再構成可能性による面積オーバーヘッドとをトレードオフできる、アクセラレータアレイ内の適切な柔軟性の程度を決定することである。 2つ目は、現在のDNNモデルやレイヤに対するアレイの適切な構成を決定し、実行時にアクセラレータを再構成できることである。本稿では、Self Adaptive Reconfigurable Array (SARA)と呼ばれる新しいクラスのアクセラレータを紹介する。 SARAアーキテクチャは、再構成可能なアレイと、実行時にアレイの最適化された構成を決定できるハードウェアユニットの両方で構成されている。我々は、SAGARと呼ぶアクセラレータによってSARAの一例を示す。SAGARは、様々なサイズの小さなアレイの分散コレクションとしても、柔軟なアスペクト比を持つ単一アレイとしても動作するように構成できる、新しい再構成可能なシストリックアレイを導入している。我々はまた、現在の層パラメータに対するアレイ構成とデータフローを推奨するADAPTNETと呼ばれる新しいレコメンデーションニューラルネットワークを開発した。 ADAPTNETは、統合されたカスタムハードウェアADAPTNETX上で動作する。ADAPTNETXは実行時にADAPTNETを実行してアレイを再構成し、アクセラレータ全体を自己完結させる。 SAGARは、分散システムとして動作する1024個の4x4アレイの集合と同じマッピング柔軟性を提供しつつ、3.5倍の電力効率と3.2倍の計算密度を実現している。さらに、ADAPTNETが推奨するパラメータで達成される実行時間は、達成可能な最良の実行時間の99.93%である。 With increasing diversity in Deep Neural Network (DNN) models in terms of layer shapes and sizes, the research community has been investigating flexible/reconfigurable accelerator substrates. This line of research has opened up two challenges. The first is to determine the appropriate amount of flexibility within an accelerator array that can trade off the performance benefits versus the area overheads of the reconfigurability. The second is being able to determine the right configuration of the array for the current DNN model and/or layer and reconfigure the accelerator at runtime. This work introduces a new class of accelerators that we call Self Adaptive Reconfigurable Array (SARA). SARA architectures comprise both a reconfigurable array and a hardware unit capable of determining an optimized configuration for the array at runtime. 
We demonstrate an instance of SARA with an accelerator we call SAGAR, which introduces a novel reconfigurable systolic array that can be configured to work as a distributed collection of smaller arrays of various sizes or as a single array with flexible aspect ratios. We also develop a novel recommendation neural network called ADAPTNET, which recommends an array configuration and dataflow for the current layer parameters. ADAPTNET runs on ADAPTNETX, an integrated custom hardware unit that executes ADAPTNET at runtime and reconfigures the array, making the entire accelerator self-sufficient. SAGAR is capable of providing the same mapping flexibility as a collection of 1024 4x4 arrays working as a distributed system while achieving 3.5x more power efficiency and 3.2x higher compute density. Furthermore, the runtime achieved on the recommended parameters from ADAPTNET is 99.93% of the best achievable runtime.	翻訳日:2021-04-04 01:32:29 公開日:2021-01-12
# 4脚ラインフォロワロボットへの組込み型コンピュータビジョンシステムの適用 Embedded Computer Vision System Applied to a Four-Legged Line Follower Robot ( http://arxiv.org/abs/2101.04804v1 ) ライセンス: Link先を確認	Beatriz Arruda Asfora	(参考訳) ロボットは知覚と行動の結びつきとして定義することができる。このプロジェクトは、ロボットの視覚と動作をつなぐ自動コンピュータビジョン組み込みシステムを使用して、ロボットを駆動することを目的としている。ロボットに色認識システムを実装するために、処理言語、androidシステム、arduinoプラットフォーム、pixyカメラなどのオープンソースツールが選択される。制約は明確です – 単純さ,複製性,財務性です。ロボット工学、コンピュータビジョン、画像処理を統合するために、このロボットは典型的な移動ロボットの課題であるラインフォローに適用される。パスと背景を区別する問題は、一般的な大津法、実験による色の組み合わせに基づくしきい値、彩度と彩度による色追跡など、様々なアプローチで分析される。次に移動する場所の決定は、経路の線の中心に基づいており、完全に自動化されている。 4本足のロボットをプラットフォームとして、カメラを唯一のセンサーとして使用することで、ロボットはラインを追跡することに成功した。イメージのキャプチャからロボットの移動まで、統合ロボティクスがいかに実現可能かは明らかです。本論文の課題は機械工学、エレクトロニクス、制御システム、プログラミングに関する知識のみである。この作業に関するすべてがドキュメント化され、オープンソースのオンラインページで利用可能になったため、ロボット工学の学習と実験に役立てることができる。 Robotics can be defined as the connection of perception to action. Taking this further, this project aims to drive a robot using an automated computer vision embedded system, connecting the robot's vision to its behavior. In order to implement a color recognition system on the robot, open source tools are chosen, such as Processing language, Android system, Arduino platform and Pixy camera. The constraints are clear: simplicity, replicability and financial viability. In order to integrate Robotics, Computer Vision and Image Processing, the robot is applied on a typical mobile robot's issue: line following. The problem of distinguishing the path from the background is analyzed through different approaches: the popular Otsu's Method, thresholding based on color combinations through experimentation and color tracking via hue and saturation. Decision making of where to move next is based on the line center of the path and is fully automated. Using a four-legged robot as platform and a camera as its only sensor, the robot is capable of successfully follow a line. From capturing the image to moving the robot, it's evident how integrative Robotics can be. 
The problem addressed in this paper alone draws on knowledge of Mechanical Engineering, Electronics, Control Systems, and Programming. Everything related to this work was documented and made available on an open source online page, so it can be useful in learning and experimenting with robotics.	翻訳日:2021-04-04 01:31:44 公開日:2021-01-12
# ニューラルネットワークを用いた仮想マイクロホン推定器 Neural Network-based Virtual Microphone Estimator ( http://arxiv.org/abs/2101.04315v1 ) ライセンス: Link先を確認	Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki	(参考訳) 少数のマイクロホンのためのマイクロホンアレイ技術の開発は、多くのデバイスに制約があるため重要である。この状況に対処する一つの方向は、例えばいくつかの物理モデル仮定に基づいて、マイク信号の数を事実上増やすことである。しかし、そのような仮定は必ずしも現実的な条件で満たされない。本稿では,ニューラルネットワークを用いた仮想マイクロホン推定器(NN-VME)を提案する。 NN-VMEは、最近の時間領域ニューラルネットワークの正確な推定能力を利用して、仮想マイクロホン信号を時間領域内で直接推定する。訓練時の仮想マイクの位置での実際の観察を利用した教師あり学習フレームワークを採用する。したがって、nn-vmeはマルチチャンネルの観測のみを使用して訓練することができ、実記録を直接行うことができ、非現実的な物理モデルに基づく仮定の必要性を回避できる。提案するnn-vmeは実記録においても高い仮想マイクロホン推定性能を達成し,nn-vmeを付加したビームフォーマによって音声強調と認識性能の両方が向上することを示す。 Developing microphone array technologies for a small number of microphones is important due to the constraints of many devices. One direction to address this situation consists of virtually augmenting the number of microphone signals, e.g., based on several physical model assumptions. However, such assumptions are not necessarily met in realistic conditions. In this paper, as an alternative approach, we propose a neural network-based virtual microphone estimator (NN-VME). The NN-VME estimates virtual microphone signals directly in the time domain, by utilizing the precise estimation capability of the recent time-domain neural networks. We adopt a fully supervised learning framework that uses actual observations at the locations of the virtual microphones at training time. Consequently, the NN-VME can be trained using only multi-channel observations and thus directly on real recordings, avoiding the need for unrealistic physical model-based assumptions. Experiments on the CHiME-4 corpus show that the proposed NN-VME achieves high virtual microphone estimation performance even for real recordings and that a beamformer augmented with the NN-VME improves both the speech enhancement and recognition performance.	翻訳日:2021-04-04 01:31:24 公開日:2021-01-12
# LSTMネットワークを用いた機械型通信におけるイベント駆動ソーストラヒック予測 Event-Driven Source Traffic Prediction in Machine-Type Communications Using LSTM Networks ( http://arxiv.org/abs/2101.04365v1 ) ライセンス: Link先を確認	Thulitha Senevirathna, Bathiya Thennakoon, Tharindu Sankalpa, Chatura Seneviratne, Samad Ali and Nandana Rajatheva	(参考訳) ソーストラフィック予測は、機械型通信(MTC)における予測リソース割り当てを可能にする主な課題の1つである。本稿では,イベント駆動ソーストラフィック予測のための長期短期記憶(lstm)ベースのディープラーニング手法を提案する。ソーストラフィック予測問題は、過去の送信データに基づいて、機械型装置(MTD)の送信状態を主焦点とするシーケンス生成タスクとして定式化することができる。これは、LSTMネットワークがデバイス間の因果関係を識別できるように、送信データを再構成することで実現される。このような因果関係の知識は、イベント駆動のトラフィック予測を可能にする。提案手法の性能は、異なるエントロピー範囲のmddによる事象に関するデータを用いて検討した。我々のモデルは、既存のベースラインソリューションよりも、リソースの節約と精度を約9%で上回ります。また,我々のモデルによるランダムアクセス (RA) 要求の低減について解析し,LSTMに基づくソーストラフィック予測手法の結果として必要な信号量が少ないことを示す。 Source traffic prediction is one of the main challenges of enabling predictive resource allocation in machine type communications (MTC). In this paper, a Long Short-Term Memory (LSTM) based deep learning approach is proposed for event-driven source traffic prediction. The source traffic prediction problem can be formulated as a sequence generation task where the main focus is predicting the transmission states of machine-type devices (MTDs) based on their past transmission data. This is done by restructuring the transmission data in a way that the LSTM network can identify the causal relationship between the devices. Knowledge of such a causal relationship can enable event-driven traffic prediction. The performance of the proposed approach is studied using data regarding events from MTDs with different ranges of entropy. Our model outperforms existing baseline solutions in saving resources and accuracy with a margin of around 9%. Reduction in Random Access (RA) requests by our model is also analyzed to demonstrate the low amount of signaling required as a result of our proposed LSTM based source traffic prediction approach.	翻訳日:2021-04-04 01:31:06 公開日:2021-01-12
# Type4Py: Pythonの深い類似性学習に基づく型推論 Type4Py: Deep Similarity Learning-Based Type Inference for Python ( http://arxiv.org/abs/2101.04470v1 ) ライセンス: Link先を確認	Amir M. Mir, Evaldas Latoskinas, Sebastian Proksch, Georgios Gousios	(参考訳) PythonやJavascriptのような動的言語は、開発者の柔軟性のために静的型付けを交換する。これは生産性が向上すると言われているが、静的型付けの欠如はランタイム例外、型不整合を引き起こし、IDEサポートの弱さの大きな要因である。これらの問題を緩和するため、PEP 484はPythonのオプション型アノテーションを導入した。既存のコードベースへの型の再適合はエラーを起こしやすいため、既存の部分的に注釈付けされたコードベースに基づいた自動型アノテーションを実現するための学習ベースのアプローチが提案されている。しかし、レア型とユーザ定義型の予測は依然として困難である。本稿では,pythonの類似度学習に基づく型推論モデルtype4pyを提案する。我々は、高次元空間における同種の型と異種の型を区別することを学ぶ階層型ニューラルネットワークモデルを設計し、その結果、型をクラスタ化する。最寄りの検索では、python関数の型シグネチャが考えられる。分析されたモジュールで見える型は、軽量な依存性分析を使って表面化されます。定量的および定性的な評価の結果,Type4Pyはタイプ予測タスクにおける最先端アプローチよりも有意に優れていた。トップ1の予測を考えると、Type4PyはTypilusやTypeWriterよりも19.33%、13.49%高い精度を得られる。 Dynamic languages, such as Python and Javascript, trade static typing for developer flexibility. While this allegedly enables greater productivity, lack of static typing can cause runtime exceptions, type inconsistencies, and is a major factor for weak IDE support. To alleviate these issues, PEP 484 introduced optional type annotations for Python. As retrofitting types to existing codebases is error-prone and laborious, learning-based approaches have been proposed to enable automatic type annotations based on existing, partially annotated codebases. However, the prediction of rare and user-defined types is still challenging. In this paper, we present Type4Py, a deep similarity learning-based type inference model for Python. We design a hierarchical neural network model that learns to discriminate between types of the same kind and dissimilar types in a high-dimensional space, which results in clusters of types. Nearest neighbor search suggests likely type signatures of given Python functions. The types visible to analyzed modules are surfaced using lightweight dependency analysis. 
The results of quantitative and qualitative evaluation indicate that Type4Py significantly outperforms state-of-the-art approaches at the type prediction task. Considering the Top-1 prediction, Type4Py obtains 19.33% and 13.49% higher precision than Typilus and TypeWriter, respectively, while utilizing a much bigger vocabulary.	翻訳日:2021-04-04 01:30:35 公開日:2021-01-12
# パラメータ依存力学系の初期値問題に対する機械学習 Machine Learning for Initial Value Problems of Parameter-Dependent Dynamical Systems ( http://arxiv.org/abs/2101.04595v1 ) ライセンス: Link先を確認	Roland Pulch and Maha Youssef	(参考訳) 物理パラメータを含む非線形力学系の初期値問題を考察する。解に依存する関心量を観測する。離散化により、多数の時間点における関心量の軌道が得られる。パラメータの集合から軌道の離散値への写像について検討する。この写像の評価には初期値問題を解く必要がある。その代わりに、機械学習の概念を用いて、評価に必要な計算量が小さい近似を決定する。我々は、軌道のサンプルデータに適合させたフィードフォワードニューラルネットワークを採用する。電気回路をモデル化するテスト例に対する数値計算の結果を示す。 We consider initial value problems of nonlinear dynamical systems, which include physical parameters. A quantity of interest depending on the solution is observed. A discretisation yields the trajectories of the quantity of interest in many time points. We examine the mapping from the set of parameters to the discrete values of the trajectories. An evaluation of this mapping requires solving an initial value problem. Alternatively, we determine an approximation, where the evaluation requires little computational work, using a concept of machine learning. We employ feedforward neural networks, which are fitted to data from samples of the trajectories. Results of numerical computations are presented for a test example modelling an electric circuit.	翻訳日:2021-04-04 01:30:14 公開日:2021-01-12
# MP3net: 単純な畳み込みGANによる生オーディオからのコヒーレントな1分間の音楽生成 MP3net: coherent, minute-long music generation from raw audio with a simple convolutional GAN ( http://arxiv.org/abs/2101.04785v1 ) ライセンス: Link先を確認	Korneel van den Broek	(参考訳) 本稿では,MP3/Vorbis音声圧縮技術を利用して,長距離コヒーレンスを有する長く高品質なオーディオサンプルを生成する深層畳み込みGANを提案する。このモデルは、すべての位相情報を含むMDCT(Modified Discrete Cosine Transform)データ表現を使用する。したがって、位相生成はモデルに不可欠な部分である。人間の耳の聴覚マスキングと心理音響知覚限界を利用して、真の分布を広げ、トレーニングプロセスを安定化させる。モデルアーキテクチャは深い2次元畳み込みネットワークであり、後続の各ジェネレータモデルブロックは時間軸に沿って分解能を高め、周波数軸に沿ってより高いオクターブを追加する。より深いレイヤは出力のすべての部分に接続され、トラック全体のコンテキストを持つ。これにより、長距離コヒーレンスを示すサンプルを生成することができる。我々はMP3netを使って、1つのクラウドTPUv2で250時間トレーニングした後、サンプルレート22kHzの95秒のステレオトラックを作成する。 CNNベースのモデルアーキテクチャのさらなる利点は、新しい曲の生成がほぼ瞬時に行われることである。 We present a deep convolutional GAN which leverages techniques from MP3/Vorbis audio compression to produce long, high-quality audio samples with long-range coherence. The model uses a Modified Discrete Cosine Transform (MDCT) data representation, which includes all phase information. Phase generation is hence an integral part of the model. We leverage the auditory masking and psychoacoustic perception limit of the human ear to widen the true distribution and stabilize the training process. The model architecture is a deep 2D convolutional network, where each subsequent generator model block increases the resolution along the time axis and adds a higher octave along the frequency axis. The deeper layers are connected with all parts of the output and have the context of the full track. This enables generation of samples which exhibit long-range coherence. We use MP3net to create 95s stereo tracks with a 22kHz sample rate after training for 250h on a single Cloud TPUv2. An additional benefit of the CNN-based model architecture is that generation of new songs is almost instantaneous.	翻訳日:2021-04-04 01:29:46 公開日:2021-01-12
# UCNN:非構造化メッシュの畳み込み戦略 UCNN: A Convolutional Strategy on Unstructured Mesh ( http://arxiv.org/abs/2101.05207v1 ) ライセンス: Link先を確認	Mengfei Xu, Shufang Song, Xuxiang Sun, Weiwei Zhang	(参考訳) 流体力学の機械学習では、フルコネクテッドニューラルネットワーク(FNN)はモデリングにのみローカル機能を使用するが、畳み込みニューラルネットワーク(CNN)は構造化/非構造化メッシュのデータには適用できない。 FNNとCNNの限界を克服するため、非構造畳み込みニューラルネットワーク(UCNN)が提案され、重み関数を通じて近隣ノードの特徴を集約し、効果的に活用する。随伴ベクトルモデリングは、ucnnの性能を研究するタスクとして取られる。フローフィールド特徴から随伴ベクトルへのマッピング関数は、GPU上の効率的な並列実装によって構成される。 UCNNのモデリング能力は,テストケースにおける検証セットや空力形状の最適化においてFNNと比較される。さらに,メッシュ変化がUCNNのモデリング能力に及ぼす影響について検討した。その結果,UCNNはモデリング過程においてより正確であることが示唆された。 In machine learning for fluid mechanics, fully-connected neural network (FNN) only uses the local features for modelling, while the convolutional neural network (CNN) cannot be applied to data on structured/unstructured mesh. In order to overcome the limitations of FNN and CNN, the unstructured convolutional neural network (UCNN) is proposed, which aggregates and effectively exploits the features of neighbour nodes through the weight function. Adjoint vector modelling is taken as the task to study the performance of UCNN. The mapping function from flow-field features to adjoint vector is constructed through efficient parallel implementation on GPU. The modelling capability of UCNN is compared with that of FNN on validation set and in aerodynamic shape optimization at test case. The influence of mesh changing on the modelling capability of UCNN is further studied. The results indicate that UCNN is more accurate in modelling process.	翻訳日:2021-04-04 01:29:28 公開日:2021-01-12
# 深層学習によるボアホール比抵抗測定システムの設計 Design of borehole resistivity measurement acquisition systems using deep learning ( http://arxiv.org/abs/2101.05623v1 ) ライセンス: Link先を確認	M. Shahriari, A. Hazra, D. Pardo	(参考訳) lwd(loging-while-drilling)装置で記録されたボアホール比抵抗測定は、地球の地下特性を特徴付けるために広く用いられている。石油やガスなどの天然資源の抽出を促進する。 lwd装置は、井戸付近の地表面の電気的特性を推定し、おそらく井戸軌道を補正するために、電磁的測定のリアルタイムな反転を必要とする。深層ニューラルネットワーク(dnn)ベースの手法は、トレーニングフェーズ中にオフラインで前方および逆問題を近似するので、ボアホール比抵抗測定の迅速な反転に適しており、評価にほんの1秒(すなわち予測)しか必要としない。しかし、逆問題は通常複数の解を許容する。データミスフィットに基づく従来の損失関数を持つDNNは、逆問題の解決には不適当である。これは、エンコーダ-デコーダアーキテクチャ用に特別に設計された損失関数に正規化項を追加することで部分的に克服できる。しかし、正則化を加えることで、優先すべき物理解の集合に対する可能な解の数を大幅に制限する。これを回避するために,正規化を伴わない2段階損失関数を用いる。さらに, 逆解を保証するためには, 十分な数の計測値を持つ注意深く選択した計測取得システムが必要である。そこで本研究では,DNNに基づく計測取得システムの設計のための反復アルゴリズムを提案する。いくつかの合成例を通してDNNに基づく反復アルゴリズムについて述べる。以上の結果から, 測定装置上および下方における抵抗層と導電層の両方を同定し, 特徴付けるのに十分であることがわかった。数値的な結果は有望であるが, 産業目的のためにはさらなる改良が必要である。 Borehole resistivity measurements recorded with logging-while-drilling (LWD) instruments are widely used for characterizing the earth's subsurface properties. They facilitate the extraction of natural resources such as oil and gas. LWD instruments require real-time inversions of electromagnetic measurements to estimate the electrical properties of the earth's subsurface near the well and possibly correct the well trajectory. Deep Neural Network (DNN)-based methods are suitable for the rapid inversion of borehole resistivity measurements as they approximate the forward and inverse problem offline during the training phase and they only require a fraction of a second for the evaluation (aka prediction). However, the inverse problem generally admits multiple solutions. DNNs with traditional loss functions based on data misfit are ill-equipped for solving an inverse problem. This can be partially overcome by adding regularization terms to a loss function specifically designed for encoder-decoder architectures. 
But adding regularization seriously limits the number of possible solutions to a set of a priori desirable physical solutions. To avoid this, we use a two-step loss function without any regularization. In addition, to guarantee an inverse solution, we need a carefully selected measurement acquisition system with a sufficient number of measurements. In this work, we propose a DNN-based iterative algorithm for designing such a measurement acquisition system. We illustrate our DNN-based iterative algorithm via several synthetic examples. Numerical results show that the obtained measurement acquisition system is sufficient to identify and characterize both resistive and conductive layers above and below the logging instrument. Numerical results are promising, although further improvements are required to make our method amenable for industrial purposes.	翻訳日:2021-04-04 01:29:15 公開日:2021-01-12

# 多光子一致率におけるサム則

Sum rules in multiphoton coincidence rates ( http://arxiv.org/abs/2004.11504v2 )

ライセンス: Link先を確認

David Amaro Alcalá and Dylan Spivak and Hubert de Guise

(参考訳) 多光子干渉実験では、元のユニタリ散乱行列を$0$を含むコセット行列に置き換えることで、慎重に選択された一致率の和を単純化できることを示す。これらの$0$の個数と配置は、元の率の和に影響を与えることなく、和における各項の複雑さを減少させる。特に、パーマネントの絶対値の2乗の和の評価は、ある場合において行列式の絶対値の2乗の和となることが示されている。レートの和は、干渉計における一部の光学素子の除去と等価であることが示されている。

We show that sums of carefully chosen coincidence rates in a multiphoton interferometry experiment can be simplified by replacing the original unitary scattering matrix with a coset matrix containing $0$s. The number and placement of these $0$s reduce the complexity of each term in the sum without affecting the original sum of rates. In particular, the evaluation of sums of modulus squared of permanents is shown to turn in some cases into a sum of modulus squared of determinants. The sums of rates are shown to be equivalent to the removal of some optical elements in the interferometer.

翻訳日:2023-05-22 06:25:48 公開日:2021-01-12

# D-Wave量子アニーラにおけるハミルトニアンノイズのベンチマーク

Benchmarking Hamiltonian Noise in the D-Wave Quantum Annealer ( http://arxiv.org/abs/2006.16421v3 )

ライセンス: Link先を確認

Tristan Zaborniak, Rogério de Sousa

(参考訳) 様々なノイズ源は、計算中に量子ビット状態を制御不能に変化させ、コヒーレンス時間を短縮することで、量子コンピュータの性能を制限する。量子アニーラでは、このノイズは元の問題ハミルトニアンを定義するパラメータにさらなるゆらぎをもたらし、その結果、もともとプログラムされた問題から摂動を受けた問題の基底状態が求められる。本稿では,量子アニーラにプログラムされたハミルトニアンに影響するノイズ量を評価する手法について述べる。プログラムされたハミルトニアンの係数をゼロに設定した縮退した実行の列から、量子アニーリングプロトコル中にハミルトニアンパラメータに"in situ"で影響するノイズスペクトル密度を推定できることを示す。この方法は、D-Waveの低雑音2000量子ビットデバイス(DW_2000Q_6)および最近リリースされた5000量子ビットデバイス(Advantage_system1.1)で実証される。 DW_2000Q_6のベンチマークでは、フラックス量子ビットを形成する材料に固有のフラックスノイズに特徴的な$1/f^{0.7}$の周波数依存性が支配的なハミルトニアンノイズを示す。対照的にAdvantage_system1.1は、短いアニール時間において追加のノイズ源の影響を受けており、その基礎にある固有フラックスノイズの振幅は、全てのアニール時間に対してDW_2000Q_6よりも$2-3$倍高い。

Various sources of noise limit the performance of quantum computers by altering qubit states in an uncontrolled manner throughout computations and reducing their coherence time. In quantum annealers, this noise introduces additional fluctuations to the parameters defining the original problem Hamiltonian, such that they find the ground states of problems perturbed from those originally programmed. Here we describe a method to benchmark the amount of noise affecting the programmed Hamiltonian of a quantum annealer. We show that a sequence of degenerate runs with the coefficients of the programmed Hamiltonian set to zero leads to an estimate of the noise spectral density affecting Hamiltonian parameters "in situ" during the quantum annealing protocol. The method is demonstrated in D-Wave's lower noise 2000 qubit device (DW_2000Q_6) and in its recently released 5000 qubit device (Advantage_system1.1). Our benchmarking of DW_2000Q_6 shows Hamiltonian noise dominated by the $1/f^{0.7}$ frequency dependence characteristic of flux noise intrinsic to the materials forming flux qubits. In contrast, Advantage_system1.1 is found to be affected by additional noise sources for low annealing times, with underlying intrinsic flux noise amplitudes $2-3$ times higher than in DW_2000Q_6 for all annealing times.

翻訳日:2023-05-12 03:19:49 公開日:2021-01-12

# $2\otimes d$量子システムのための分離可能かつ絶対分離可能な状態球の構築

Constructing a ball of separable and absolutely separable states for $2\otimes d$ quantum system ( http://arxiv.org/abs/2007.00891v2 )

ライセンス: Link先を確認

Satyabrata Adhikari

(参考訳) 絶対分離可能状態は、任意の大域的ユニタリ変換の作用の下で分離可能なままである分離可能状態の一種である。これらの状態は量子相関を持つ場合も持たない場合もあり、その相関は量子ディスコードによって測定できる。絶対分離可能状態は、たとえ無限小の量子相関しか含まない場合でも、量子計算において有用である。そこで、ゼロディスコードを持つ2量子ビットの絶対分離可能状態のクラスを探索するために、すべてのゼロディスコード状態を表す$\varrho$について$Tr(\varrho^{2})$の上界を導出した。一般に、上界は検討中の状態に依存するが、その状態がゼロディスコード状態の特定のクラスに属するならば、上界は状態に依存しないことが分かる。次に、これらのゼロディスコード状態の特定のクラスの中に、絶対分離可能な部分クラスが存在することを示す。さらに、与えられたqubit-qudit状態の分離可能性に対する必要条件を導出した。次に、導出した条件を用いて、$Tr(\rho^{2})\leq Tr(X^{2})+2Tr(XZ)+Tr(Z^{2})$ で記述される$2\otimes d$量子系の球を構成した。ここで$2\otimes d$量子系は密度演算子$\rho$によって記述され、$\rho$は$X,Z\geq 0$を満たすブロック行列$X,Y$と$Z$で表現できる。特に、qubit-qubit系では、新しく構成した球は、$Tr(\rho^{2})\leq \frac{1}{3}$ で記述される球と比較して、より大きな絶対分離可能状態のクラスを含むことを示す。最後に,検討中のqubit-qudit系の絶対分離可能性について、純度を用いた必要条件を導出した。

Absolutely separable states are separable states that remain separable under the action of any global unitary transformation. These states may or may not have quantum correlation, and these correlations can be measured by quantum discord. We find that absolutely separable states are useful in quantum computation even if they contain only infinitesimal quantum correlation. Thus, to search for the class of two-qubit absolutely separable states with zero discord, we have derived an upper bound for $Tr(\varrho^{2})$, where $\varrho$ denotes all zero discord states. In general, the upper bound depends on the state under consideration, but if the state belongs to some particular class of zero discord states, then we find that the upper bound is state independent. Later, it is shown that among these particular classes of zero discord states, there exist sub-classes which are absolutely separable. Furthermore, we have derived necessary conditions for the separability of a given qubit-qudit state. Then we used the derived conditions to construct a ball for the $2\otimes d$ quantum system described by $Tr(\rho^{2})\leq Tr(X^{2})+2Tr(XZ)+Tr(Z^{2})$, where the $2\otimes d$ quantum system is described by the density operator $\rho$ which can be expressed by block matrices $X,Y$ and $Z$ with $X,Z\geq 0$. In particular, for the qubit-qubit system, we show that the newly constructed ball contains a larger class of absolutely separable states compared to the ball described by $Tr(\rho^{2})\leq \frac{1}{3}$. Lastly, we have derived the necessary condition in terms of purity for the absolute separability of the qubit-qudit system under investigation.

翻訳日:2023-05-11 20:56:30 公開日:2021-01-12
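
A small numerical illustration of the purity ball $Tr(\rho^{2})\leq \frac{1}{3}$ mentioned in the abstract above, for the two-qubit case only. This is a sketch under stated assumptions, not the paper's construction: it uses the standard two-qubit Werner family $\rho(p) = p\,|\psi^-\rangle\langle\psi^-| + (1-p)\,I/4$ as an example, and the function names are illustrative.

```python
from fractions import Fraction

def werner_eigenvalues(p):
    """Eigenvalues of the two-qubit Werner state p|psi-><psi-| + (1-p) I/4.

    The Werner family is an assumed example; it is not taken from the abstract.
    """
    p = Fraction(p)
    # One eigenvalue belongs to the singlet, three to the remaining subspace.
    return [(1 + 3 * p) / 4] + [(1 - p) / 4] * 3

def purity(eigenvalues):
    """Tr(rho^2) equals the sum of squared eigenvalues of rho."""
    return sum(x * x for x in eigenvalues)

def inside_one_third_ball(p):
    """True if the Werner state with mixing weight p satisfies Tr(rho^2) <= 1/3."""
    return purity(werner_eigenvalues(p)) <= Fraction(1, 3)

if __name__ == "__main__":
    for p in (Fraction(0), Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
        print(p, purity(werner_eigenvalues(p)), inside_one_third_ball(p))
```

Purity depends only on the spectrum of $\rho$, so membership in such a ball does not change under global unitaries. That is why a purity bound is a natural way to describe a set of absolutely separable states.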

# c-Functionによる位相量子相転移の検出

Detecting Topological Quantum Phase Transitions via the c-Function ( http://arxiv.org/abs/2007.07273v2 )

ライセンス: Link先を確認

Matteo Baggioli, Dimitrios Giataganas

(参考訳) 位相的量子臨界点の位置を検出するための新しい高精度プローブとしてc関数を提案する。直接応用として、位相的に自明な絶縁相と隙間のないワイル半金属の間の位相量子相転移を示すホログラフィックモデルを考える。量子臨界点は空間方向において強いリフシッツ様の異方性を示し、量子相転移は標準ランダウパラダイムに従わない。 c-函数は量子臨界点における大域的な特徴をロバストに示し、2つの異なるゼロ温度位相を非常に精度良く区別する。 c-関数と絡み合いエントロピーの関係を考慮すると、我々の提案は量子相転移の一般的な特徴であり、ホログラフィーフレームワークを超えて適用可能であると推測する。

We propose the c-function as a new and accurate probe to detect the location of topological quantum critical points. As a direct application, we consider a holographic model which exhibits a topological quantum phase transition between a topologically trivial insulating phase and a gapless Weyl semimetal. The quantum critical point displays a strong Lifshitz-like anisotropy in the spatial directions and the quantum phase transition does not follow the standard Landau paradigm. The c-function robustly shows a global feature at the quantum criticality and distinguishes with great accuracy the two separate zero temperature phases. Taking into account the relation of the c-function with the entanglement entropy, we conjecture that our proposal is a general feature of quantum phase transitions and that it is applicable beyond the holographic framework.

翻訳日:2023-05-10 01:58:26 公開日:2021-01-12

# He様およびLi様等電子系列に対する超コンパクトで正確な波動関数と変分法 I. 基底状態

Ultra-Compact accurate wave functions for He-like and Li-like iso-electronic sequences and variational calculus. I. Ground state ( http://arxiv.org/abs/2007.11745v4 )

ライセンス: Link先を確認

A.V. Turbiner, J.C. Lopez Vieyra, J.C. del Valle, D.J. Nader

(参考訳) 点状で無限に重い原子核を仮定した静的近似の下で、He様およびLi様等電子系列の基底状態エネルギー(有効数字4〜5桁(s.d.))に対して、クーロン電荷の量子力学(QMCC)、あるいは同値に非相対論的QED(NRQED)の適用可能領域を記述する、一般化Hylleraas-Kinoshita関数およびGuevara-Harris-Turbiner関数の形をしたいくつかの超コンパクトで正確な波動関数を構成する。どちらの系列に対しても、得られたパラメータは$Z$についての単純な滑らかな関数でフィットできることが示され、一般にこれらのパラメータは変分計算で現れるパラメータとは異なる。 He様2電子系列では、エネルギーに対して絶対精度$\sim 10^{-3}$\,a.u.を与え、カスプパラメータと6つの期待値の両方に対して同じ相対精度$\sim 10^{-2}-10^{-3}$を与える基底状態関数の近似式を求めた。 Li様3電子系列では、変分試行関数として採用した最も正確な超コンパクト関数は、エネルギーに対して絶対精度$\sim 10^{-3}$\,a.u.、$Z \leq 20$の電子-核カスプパラメータに対して2〜3 s.d.、$Z=3$の2つの期待値に対して3 s.d.を与える。

Several ultra-compact accurate wave functions in the form of generalized Hylleraas-Kinoshita functions and Guevara-Harris-Turbiner functions, which describe the domain of applicability of the Quantum Mechanics of Coulomb Charges (QMCC), or, equivalently, the Non-Relativistic QED (NRQED), for the ground state energies (4-5 significant digits (s.d.)) of He-like and Li-like iso-electronic sequences in the static approximation with point-like, infinitely heavy nuclei are constructed. It is shown that for both sequences the obtained parameters can be fitted in $Z$ by simple smooth functions: in general, these parameters differ from the ones emerging in variational calculations. For the He-like two-electron sequence the approximate expression for the ground state function, which provides absolute accuracy for the energy $\sim 10^{-3}$\,a.u. and the same relative accuracies $\sim 10^{-2}-10^{-3}$ for both the cusp parameters and the six expectation values, is found. For the Li-like three-electron sequence the most accurate ultra-compact function taken as the variational trial function provides absolute accuracy for energy $\sim 10^{-3}$\,a.u., 2-3 s.d. for the electron-nuclear cusp parameter for $Z \leq 20$ and 3 s.d. for the two expectation values for $Z=3$.

翻訳日:2023-05-08 11:07:54 公開日:2021-01-12

# ナノスケールダイヤモンド磁気計による脂質二層膜のラベルフリー位相変化検出

Label-free phase change detection of lipid bilayers using nanoscale diamond magnetometry ( http://arxiv.org/abs/2007.13085v5 )

ライセンス: Link先を確認

Hitoshi Ishiwata, Hiroshi C. Watanabe, Shinya Hanashima, Takayuki Iwasaki and Mutsuko Hatano

(参考訳) ダイヤモンド中のNV中心は、NMRスペクトルと温度測定の高感度ナノスケール分析のための特別な品質の量子センサーである。本研究では,NV中心の深さによって決定される小体積~(6nm)$^{3}$のアンサンブル平均核スピン検出による脂質二層膜のナノスケール相変化検出について検討した。ナノスケールNMR信号の解析により、脂質二分子膜の厚さは6.2 nm$\pm$3.4 nmで、プロトン密度は65プロトン/nm$^{3}$で、ダイヤモンド試料の上に脂質二分子膜の形成を検証する。ナノスケール体積の相関スペクトルは, 71.8mTの印加磁場における陽子のラーモア周波数に対応する3.06MHzの量子振動を呈し, モンテカルロシミュレーションで構築した2次元分子拡散モデルと分子動力学シミュレーションの結果を比較した。 1.5$\pm$ 0.25 nm$^{2}$/${\mu}$sから3.0$\pm$ 0.5 nm$^{2}$/${\mu}$sへの拡散定数の変化があり、温度が26.5$^\circ$Cから36.0$^\circ$Cへ変化する。その結果, ナノスケールダイヤモンド磁力計を用いたラベルフリー測定では, 並進拡散と温度変化の同時観測が可能となった。本手法は, 細胞膜を無ラベルでイメージングし, その相組成と動的特性を理解する方法である。

The NV center in diamond is a quantum sensor of exceptional quality for highly sensitive nanoscale analysis of NMR spectra and thermometry. In this study, we investigate nanoscale phase change detection of lipid bilayers utilizing ensemble-averaged nuclear spin detection from a small volume of ~(6 nm)$^{3}$, which was determined by the depth of the NV center. Analysis of the nanoscale NMR signal confirms the thickness of the lipid bilayer to be 6.2 nm $\pm$ 3.4 nm with a proton density of 65 protons/nm$^{3}$, verifying formation of a lipid bilayer on top of the diamond sample. Correlation spectroscopy from the nanoscale volume reveals quantum oscillation at 3.06 MHz, corresponding to the Larmor frequency of the proton at an applied magnetic field of 71.8 mT. The result of the correlation spectroscopy was compared with the 2D molecular diffusion model constructed by Monte Carlo simulation combined with results from molecular dynamics simulation. There is a change in diffusion constant from 1.5 $\pm$ 0.25 nm$^{2}$/${\mu}$s to 3.0 $\pm$ 0.5 nm$^{2}$/${\mu}$s when the temperature changes from 26.5 $^\circ$C to 36.0 $^\circ$C. Our results demonstrate that simultaneous observation of changes in translational diffusion and temperature is possible in label-free measurements using nanoscale diamond magnetometry. Our method paves the way for label-free imaging of cell membranes for understanding their phase composition and dynamics.

翻訳日:2023-05-08 04:47:15 公開日:2021-01-12
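
As a quick consistency check on the numbers quoted in the abstract above, the proton Larmor frequency can be computed from $f = (\gamma_p / 2\pi)\,B$. This is a minimal sketch: the constant is the rounded CODATA value of the proton gyromagnetic ratio divided by $2\pi$, and the function name is illustrative, not something taken from the paper.

```python
# Proton gyromagnetic ratio divided by 2*pi, in MHz per tesla (CODATA value, rounded).
GAMMA_P_OVER_2PI_MHZ_PER_T = 42.577478

def larmor_frequency_mhz(b_tesla):
    """Larmor precession frequency f = (gamma / 2 pi) * B, returned in MHz."""
    return GAMMA_P_OVER_2PI_MHZ_PER_T * b_tesla

if __name__ == "__main__":
    # 71.8 mT is the applied field quoted in the abstract.
    f = larmor_frequency_mhz(71.8e-3)
    print(f"{f:.2f} MHz")
```

The result rounds to the 3.06 MHz oscillation reported for the correlation spectrum, which supports attributing that oscillation to proton spins.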

# 偏光エンタングル光子のサニャック光源によるTsirelson限界への接近

Approaching the Tsirelson bound with a Sagnac source of polarization-entangled photons ( http://arxiv.org/abs/2008.01575v2 )

ライセンス: Link先を確認

Sandra Meraner, Robert J. Chapman, Stefan Frick, Robert Keil, Maximilian Prilmüller and Gregor Weihs

(参考訳) 高忠実度の偏光エンタングル光子は、量子通信、エンタングルメント配送、量子テレポーテーションのための強力な資源である。ベル-CHSH不等式 $S\leq2$ は2者間エンタングルメントによって破られ、最大エンタングル状態だけがTsirelson限界である$S=2\sqrt{2}$ を達成できる。自発的パラメトリック下方変換光源は、Tsirelson限界に近い相関を持つエンタングル光子を生成することができる。サニャック構成は、固有の安定性、コンパクトなフットプリント、高い集光効率を提供するが、光源の輝度とエンタングルメントの可視度の間にはトレードオフがあることが多い。ここでは、既報の最高値に匹敵する$2\sqrt{2}-S=(5.65\pm0.57)\times10^{-3}$を示しつつ、$(4660\pm70)$ pairs/s/mWを生成・検出するサニャック型偏光エンタングル光源を示す。この輝度は、サニャック光源についてこれまでに報告された値よりも大幅に高く、最高の$S$パラメータをもつ従来のコーン型光源よりも約2桁明るい。我々の光源は、コンカレンス$0.9953\pm0.0003$、理想的なベル状態に対する忠実度$0.99743\pm0.00014$を記録した。サニャック光源の系統誤差を調べることにより、結晶内の集光焦点の精度が、実験における$S$パラメータの低下に最も大きく寄与することを明らかにした。非常に高い輝度を維持しながら、サニャック光源でこれまでに記録された最高の$S$パラメータを実現し得る道筋を示す。

High-fidelity polarization-entangled photons are a powerful resource for quantum communication, distributing entanglement and quantum teleportation. The Bell-CHSH inequality $S\leq2$ is violated by bipartite entanglement and only maximally entangled states can achieve $S=2\sqrt{2}$, the Tsirelson bound. Spontaneous parametric down-conversion sources can produce entangled photons with correlations close to the Tsirelson bound. Sagnac configurations offer intrinsic stability, compact footprint and high collection efficiency; however, there is often a trade-off between source brightness and entanglement visibility. Here, we present a Sagnac polarization-entangled source with $2\sqrt{2}-S=(5.65\pm0.57)\times10^{-3}$, on par with the highest values recorded, while generating and detecting $(4660\pm70)$ pairs/s/mW, which is a substantially higher brightness than previously reported for Sagnac sources and around two orders of magnitude brighter than for traditional cone sources with the highest $S$ parameter. Our source records $0.9953\pm0.0003$ concurrence and $0.99743\pm0.00014$ fidelity to an ideal Bell state. By studying systematic errors in Sagnac sources, we identify that the precision of the collection focal point inside the crystal plays the largest role in reducing the $S$ parameter in our experiment. We provide a pathway that could enable the highest $S$ parameter recorded with a Sagnac source to date while maintaining very high brightness.

翻訳日:2023-05-07 04:33:44 公開日:2021-01-12
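
The Tsirelson bound $S=2\sqrt{2}$ referred to in the abstract above can be reproduced with a few lines of arithmetic. This is a sketch under textbook assumptions rather than a model of the experiment: it uses the ideal spin-singlet correlation $E(a,b) = -\cos(a-b)$ and the usual optimal analyzer angles, and the function names are illustrative. For polarization-entangled photons the correlation depends on twice the angle difference, so the physical analyzer angles would be half of those used here.

```python
import math

def singlet_correlation(a, b):
    """Ideal spin-singlet correlation E(a, b) = -cos(a - b), angles in radians."""
    return -math.cos(a - b)

def chsh(a, a_prime, b, b_prime, corr=singlet_correlation):
    """CHSH combination S = |E(a,b) - E(a,b') + E(a',b) + E(a',b')|."""
    return abs(
        corr(a, b) - corr(a, b_prime) + corr(a_prime, b) + corr(a_prime, b_prime)
    )

if __name__ == "__main__":
    tsirelson = 2 * math.sqrt(2)
    # Standard optimal settings: a = 0, a' = pi/2, b = pi/4, b' = 3*pi/4.
    s_ideal = chsh(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
    print(s_ideal, tsirelson)
    # S implied by the gap 2*sqrt(2) - S = 5.65e-3 quoted in the abstract.
    print(tsirelson - 5.65e-3)
```

With all four analyzers aligned, the same function returns 2, the classical CHSH limit, which makes the gap between the two regimes easy to see.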

# Sachdev-Ye-Kitaev鎖の非ユニタリダイナミクス

Non-unitary dynamics of Sachdev-Ye-Kitaev chain ( http://arxiv.org/abs/2008.11955v2 )

ライセンス: Link先を確認

Chunxiao Liu, Pengfei Zhang, Xiao Chen

(参考訳) Sachdev-Ye-Kitaevモデルに基づき、ユニタリ発展と虚数発展の両方からなる一連の一次元非ユニタリダイナミクスを構成する。短距離エンタングル状態から始めて、大きな$N$の極限における経路積分形式を用いてエンタングルメントのダイナミクスを解析する。得られた結果のうち、次の2つが特に興味深い。(1)虚数発展の強さを変化させると、相互作用モデルは高度にエンタングルした体積則相から面積則相への1次相転移を示す。(2)1次元自由フェルミオンモデルは、創発的な二次元共形対称性を持つ広範な臨界領域を示す。

We construct a series of one-dimensional non-unitary dynamics consisting of both unitary and imaginary evolutions based on the Sachdev-Ye-Kitaev model. Starting from a short-range entangled state, we analyze the entanglement dynamics using the path integral formalism in the large $N$ limit. Among all the results that we obtain, two of them are particularly interesting: (1) By varying the strength of the imaginary evolution, the interacting model exhibits a first order phase transition from the highly entangled volume law phase to an area law phase; (2) The one-dimensional free fermion model displays an extensive critical regime with emergent two-dimensional conformal symmetry.

翻訳日:2023-05-04 19:46:15 公開日:2021-01-12

# 非可換オブザーバブルの同時連続測定による量子力学

Quantum dynamics under simultaneous and continuous measurement of noncommutative observables ( http://arxiv.org/abs/2008.12908v2 )

ライセンス: Link先を確認

Chao Jiang, Gentaro Watanabe

(参考訳) 交換子が必ずしも$c$数でない、系の2つの非可換な観測量の同時かつ連続的な測定を考える。Arthurs-Kellyモデルを再検討し、系の2つの観測量の同時測定を記述するように一般化する。この一般化モデルを用いて、ScottとMilburn [Scott and Milburn, Phys. Rev. A 63, 042101 (2001)] によって提案された方式に従って系を連続的に測定する。無条件のマスター方程式は連続極限においてリンドブラッド形式に帰着することが分かる。さらに、マスター方程式はこれら2つの測定の交差項を含まないことが分かる。最後に、2つの観測量の同時連続測定に基づくフィードバック制御により、外場中の二準位系の状態を準備する方式を提案する。

We consider simultaneous and continuous measurement of two noncommutative observables of the system whose commutator is not necessarily a $c$-number. We revisit the Arthurs-Kelly model and generalize it to describe the simultaneous measurement of two observables of the system. Using this generalized model, we continuously measure the system by following the scheme proposed by Scott and Milburn [Scott and Milburn, Phys. Rev. A 63, 042101 (2001)]. We find that the unconditioned master equation reduces to the Lindblad form in the continuous limit. In addition, we find that the master equation does not contain a cross term of these two measurements. Finally, we propose a scheme to prepare the state of a two-level system in an external field by feedback control based on the simultaneous, continuous measurement of the two observables.

翻訳日:2023-05-04 09:13:50 公開日:2021-01-12

# $B(z)$のGlaserモデルおよびべき乗則モデルを用いた円形磁気電子レンズの量子力学

Quantum mechanics of round magnetic electron lenses with Glaser and power law models of $B(z)$ ( http://arxiv.org/abs/2009.13943v2 )

ライセンス: Link先を確認

Sameen Ahmed Khan, Ramaswamy Jagannathan

(参考訳) Foldy-Wouthuysen型の変換法を用いてディラック方程式から導かれる、単一粒子レベルでの量子電子ビーム光学のスカラー理論を考察する。軸上磁場$B(z)$に対するGlaserモデルおよびべき乗則モデルをもつ円形磁気電子レンズを調べる。Glaserモデルレンズの近軸量子プロパゲータは、その近軸運動方程式のよく知られた基本解を用いて得られる。 $B(z)$のべき乗則モデルをもつレンズの場合、微分方程式を解くことで得られる近軸方程式のよく知られた基本解を、Peano-Baker級数を用いても構成する。収差の量子力学を簡潔に論じる。収差および非近軸ビームの運動方程式の非線形部分における量子不確定性の役割を指摘する。本稿の主な目的は電子ビーム光学の量子力学を理解することであるが、現在の電子ビーム装置の光学に対する量子効果の影響は無視できるかもしれない。

Scalar theory of quantum electron beam optics, at the single-particle level, derived from the Dirac equation using a Foldy-Wouthuysen-like transformation technique is considered. Round magnetic electron lenses with Glaser and power law models for the axial magnetic field $B(z)$ are studied. Paraxial quantum propagator for the Glaser model lens is obtained in terms of the well known fundamental solutions of its paraxial equation of motion. In the case of lenses with the power law model for $B(z)$ the well known fundamental solutions of the paraxial equations, obtained by solving the differential equation, are constructed using the Peano-Baker series also. Quantum mechanics of aberrations is discussed briefly. Role of quantum uncertainties in aberrations, and in the nonlinear part of the equations of motion for a nonparaxial beam, is pointed out. The main purpose of this article is to understand the quantum mechanics of electron beam optics though the influence of quantum effects on the optics of present-day electron beam devices might be negligible.

翻訳日:2023-04-30 16:31:55 公開日:2021-01-12

# ボソニックラダーにおける物質のヘリカル相の探索

Exploring helical phases of matter in bosonic ladders ( http://arxiv.org/abs/2010.02740v2 )

ライセンス: Link先を確認

Andreas Haller, Apollonas S. Matsoukas-Roubeas, Yueting Pan, Matteo Rizzi and Michele Burrello

(参考訳) 超低温原子のラダーモデルは、人工ゲージ場と相互作用の間の相互作用と関連する物質の異なる現象と相に関する実験および理論的研究のための多目的プラットフォームを提供する。強い相関を持つヘリカル状態は、粒子と磁束密度の比で現れることが知られており、しばしば分数量子ホール状態の1次元極限として解釈され、プレトポロジーと呼ばれる。しかしながら、それらのシグネチャは通常、これらの状態を特徴づける小さなギャップのために観察するのが困難である。本稿では,充填係数1のボソニックラダーモデルについて検討する。ボゾン化, 再正規化群, 行列積状態シミュレーションに基づいて, この共鳴に現れる2つの強相関ヘリカル相をピンポイントとした。 2種のハードコアボソンとオンサイト反発のみのシステムで1つがアクセス可能であることを示し,光学格子実験に応用可能であることを示した。そのシグネチャは、リアルなシステムサイズのために、幅広いパラメータで拡張可能で安定である。

Ladder models of ultracold atoms offer a versatile platform for the experimental and theoretical study of different phenomena and phases of matter linked to the interplay between artificial gauge fields and interactions. Strongly correlated helical states are known to appear for specific ratios of the particle and magnetic flux densities and they can often be interpreted as a one-dimensional limit of fractional quantum Hall states, thus being called pretopological. Their signatures, however, are typically hard to observe due to the small gaps characterizing these states. Here we investigate bosonic ladder models at filling factor 1. Based on bosonization, renormalization group and matrix product state simulations we pinpoint two strongly correlated helical phases appearing at this resonance. We show that one of them can be accessed in systems with two-species hardcore bosons and on-site repulsions only, thus amenable for optical lattice experiments. Its signatures are sizable and stable over a broad range of parameters for realistic system sizes.

翻訳日:2023-04-29 20:25:07 公開日:2021-01-12

# 分子振動ポラリトンにおける非断熱現象

Nonadiabatic phenomena in molecular vibrational polaritons ( http://arxiv.org/abs/2011.07480v2 )

ライセンス: Link先を確認

Tamás Szidarovszky, Péter Badankó, Gábor J. Halász, Ágnes Vibók

(参考訳) 赤外キャビティに閉じ込められた分子の振動回転運動における非断熱現象を調べる。電子ポラリトン表面間の円錐交差(CI)と同様の、振動ポラリトン間のCIが見出される。振動ポラリトンのスペクトル的、トポロジカル、および動的な性質は、分子の振動、回転、およびキャビティの光子モードの間の非断熱結合の明確な指紋を示す。さらに、振動回転する2つのHCl分子とキャビティモードからなる今回の系では、一方のHCl分子の$^{35}$Clを$^{37}$Clに変えて分子の置換対称性を破ると、ポラリトン表面、非断熱結合、および関連するスペクトル的、トポロジカル、動的な性質が大きく変化し得ることが分かった。これは、現実的なポラリトン系をモデル化する際に、天然に存在する異なる分子アイソトポローグを考慮する必要があることを意味する。

Nonadiabatic phenomena are investigated in the rovibrational motion of molecules confined in an infrared cavity. Conical intersections (CIs) between vibrational polaritons, similar to CIs between electronic polaritonic surfaces, are found. The spectral, topological, and dynamic properties of the vibrational polaritons show clear fingerprints of nonadiabatic couplings between molecular vibration, rotation and the cavity photonic mode. Furthermore, it is found that for the investigated system, composed of two rovibrating HCl molecules and the cavity mode, breaking the molecular permutational symmetry by changing $^{35}$Cl to $^{37}$Cl in one of the HCl molecules can substantially change the polaritonic surfaces, nonadiabatic couplings, and related spectral, topological, and dynamic properties. This implies that the natural occurrence of different molecular isotopologues needs to be considered when modeling realistic polaritonic systems.

翻訳日:2023-04-24 01:46:56 公開日:2021-01-12

# パルス光メカニカル計測による2モード機械エンタングルメントの作製と検証

Preparation and Verification of Two-Mode Mechanical Entanglement Through Pulsed Optomechanical Measurements ( http://arxiv.org/abs/2011.10289v2 )

ライセンス: Link先を確認

Pascal Neveu, Jack Clarke, Michael R. Vanner, Ewold Verhagen

(参考訳) 短い光パルスと測定を用いて、単一の光共振器に結合した2つの機械モード間の2者間ガウスエンタングルメントを生成・検証するプロトコルを提案する。我々のプロトコルは、分解サイドバンド領域も低い熱フォノン占有数も必要とせず、機械運動の1周期未満で量子エンタングルメントの生成と検証を可能にする。エンタングルメントは、位置測定による条件付けを通じた実効的な2モード機械スクイージングによって生成される。機械周波数とオプトメカニカル結合率の実験的なずれに対するエンタングルメントの堅牢性について検討する。

We propose a protocol to generate and verify bipartite Gaussian entanglement between two mechanical modes coupled to a single optical cavity, by means of short optical pulses and measurement. Our protocol requires neither the resolved sideband regime, nor low thermal phonon occupancy, and allows the generation and verification of quantum entanglement in less than a mechanical period of motion. Entanglement is generated via effective two-mode mechanical squeezing through conditioning position measurements. We study the robustness of entanglement to experimental deviations in mechanical frequencies and optomechanical coupling rates.

翻訳日:2023-04-23 15:03:10 公開日:2021-01-12

# 単一粒子ステアリングと非局所性:Stern-Gerlach実験

Single-particle steering and nonlocality: The consecutive Stern-Gerlach Experiments ( http://arxiv.org/abs/2011.11797v2 )

ライセンス: Link先を確認

E Benitez Rodriguez and E Piceno Martinez, and L M Arevalo Aguilar

(参考訳) 量子非局所性と量子ステアリング(quantum steering)は、古典的な資源だけでは生成できない量子系の基本的な相関である。非局所性(nonlocality)は、遠方系で実施される測定の結果に影響を及ぼす能力を記述し、量子ステアリングではアリスがボブの状態を遠隔操作する。非局所性やステアリングの研究は、量子情報の開発や、量子鍵配送のような非局所的な資源を必要とする多くのアプリケーションにとって基本的な関心事である。一方、Stern-Gerlach実験は量子力学と量子情報の歴史、発展、教育において重要な位置を占めている。特に、連続Stern-Gerlach実験の思考実験は、量子作用素間の非可換性の概念を例示するために一般的に用いられる。しかしながら、我々の知る限りでは、連続Stern-Gerlach実験は完全に量子的な方法では扱われておらず、連続Stern-Gerlach実験を通過する原子が古典的な経路を辿るという考えは広く受け入れられている。ここでは2つの連続Stern-Gerlach実験が非局所性とステアリングを生成することを示す。これらの非局所的効果は、この実験に対する通常の理解を大きく変える。また,この結果の意義と絡み合いとの関係について考察する。これは、質量を持つ粒子の量子相関を用いて、この由緒ある実験により非局所的タスクを生成することを示唆している。

Quantum nonlocality and quantum steering are fundamental correlations of quantum systems which cannot be created using classical resources only. Nonlocality describes the ability to influence the possible results of measurements carried out in distant systems; in quantum steering, Alice remotely steers Bob's state. Research in nonlocality and steering is of fundamental interest for the development of quantum information and for many applications requiring nonlocal resources, like quantum key distribution. On the other hand, the Stern-Gerlach experiment holds an important place in the history, development and teaching of quantum mechanics and quantum information. In particular, the thought experiment of consecutive Stern-Gerlach experiments is commonly used to exemplify the concept of non-commutativity between quantum operators. However, to the best of our knowledge, the consecutive Stern-Gerlach experiments have not been treated in a fully quantum manner yet, and it is a widely accepted idea that atoms crossing consecutive Stern-Gerlach experiments follow classical paths. Here we demonstrate that two consecutive Stern-Gerlach experiments generate nonlocality and steering; these nonlocal effects strongly modify our usual understanding of this experiment. Also, we discuss the implications of this result and its relation with entanglement. This suggests the use of quantum correlations, of particles possessing mass, to generate nonlocal tasks using this venerable experiment.

翻訳日:2023-04-23 08:41:55 公開日:2021-01-12

# グラフ上の連続時間量子ウォークの輸送効率

Transport efficiency of continuous-time quantum walks on graphs ( http://arxiv.org/abs/2011.13794v2 )

ライセンス: Link先を確認

Luca Razzoli, Matteo G. A. Paris, Paolo Bordone

(参考訳) 連続時間量子ウォークは、グラフ上で連続的に進化する量子粒子(または励起)の伝播を記述する。そのため、光ハーベストティングシステムなどの輸送プロセスをモデリングするための自然なフレームワークを提供する。特に、輸送特性は、調査中のグラフの初期状態と特定の特徴に強く依存する。本稿では,グラフトポロジの役割を論じ,正則性,対称性,接続性が異なるグラフの輸送特性について考察する。我々は障害や非一貫性を無視し、損失過程を説明できる単一のトラップ頂点を仮定する。特に、各グラフに対して、最大輸送効率を持つ状態の部分空間を解析的に決定する。本研究は,環境支援量子輸送のベンチマークを提供し,接続性が輸送効率の指標に乏しいことを示唆する。実際、あるグラフの転送効率と接続性の間には特定の相関関係があるが、一般には相関しない。

Continuous-time quantum walk describes the propagation of a quantum particle (or an excitation) evolving continuously in time on a graph. As such, it provides a natural framework for modeling transport processes, e.g., in light-harvesting systems. In particular, the transport properties strongly depend on the initial state and on the specific features of the graph under investigation. In this paper, we address the role of graph topology, and investigate the transport properties of graphs with different regularity, symmetry, and connectivity. We neglect disorder and decoherence, and assume a single trap vertex accountable for the loss processes. In particular, for each graph, we analytically determine the subspace of states having maximum transport efficiency. Our results provide a set of benchmarks for environment-assisted quantum transport, and suggest that connectivity is a poor indicator for transport efficiency. Indeed, we observe some specific correlations between transport efficiency and connectivity for certain graphs, but in general they are uncorrelated.
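
As a rough illustration of the quantity discussed above, the following sketch estimates the transport efficiency of a continuous-time quantum walk on a complete graph with a single trap vertex. It assumes (this is not stated in the abstract) that the walk Hamiltonian is the graph Laplacian, that trapping is modeled by an anti-Hermitian term $-i\kappa|w\rangle\langle w|$, and that the efficiency equals one minus the long-time survival probability. The function name and all parameters are hypothetical.

```python
def transport_efficiency(n, trap, start, kappa=1.0, dt=0.01, t_max=40.0):
    """Estimate the transport efficiency on the complete graph K_n.

    Assumed model: H = L - i*kappa*|trap><trap|, with L the graph Laplacian.
    The efficiency is taken as 1 - (norm of the state at large time)^2.
    """
    # Laplacian of K_n: degree n-1 on the diagonal, -1 off the diagonal.
    lap = [[(n - 1) if i == j else -1 for j in range(n)] for i in range(n)]

    def rhs(psi):
        # Schrodinger equation d(psi)/dt = -i * H * psi.
        out = [-1j * sum(lap[i][j] * psi[j] for j in range(n)) for i in range(n)]
        out[trap] -= kappa * psi[trap]  # -i * (-i*kappa) = -kappa: loss at the trap
        return out

    psi = [0j] * n
    psi[start] = 1 + 0j  # walker initially localized at vertex `start`

    # Fixed-step fourth-order Runge-Kutta integration.
    for _ in range(int(round(t_max / dt))):
        k1 = rhs(psi)
        k2 = rhs([p + 0.5 * dt * k for p, k in zip(psi, k1)])
        k3 = rhs([p + 0.5 * dt * k for p, k in zip(psi, k2)])
        k4 = rhs([p + dt * k for p, k in zip(psi, k3)])
        psi = [p + dt * (a + 2 * b + 2 * c + d) / 6
               for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]

    survival = sum(abs(p) ** 2 for p in psi)
    return 1.0 - survival
```

Under these assumptions, a walker that starts on a non-trap vertex of $K_n$ overlaps with the only decaying non-trap mode (the uniform superposition of the other vertices) with weight $1/(n-1)$, so `transport_efficiency(5, trap=0, start=1)` should approach 0.25 even though the complete graph is maximally connected.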

翻訳日:2023-04-22 20:28:19 公開日:2021-01-12

# Google Apple Exposure Notification (GAEN)フレームワークの批判

A Critique of the Google Apple Exposure Notification (GAEN) Framework ( http://arxiv.org/abs/2012.05097v2 )

ライセンス: Link先を確認

Jaap-Henk Hoepman

(参考訳) 新型コロナウイルス(COVID-19)の感染拡大を受け、医療当局が感染した人との接触が密接で持続しているかどうかを判断するためのツールとして、デジタルコンタクトの追跡が提案されている。 2020年4月、GoogleとAppleは、コンタクトトレースのための分散型でよりプライバシーに優しいプラットフォームとして、Google Apple Exposure Notification (GAEN)フレームワークをリリースした。 GAENフレームワークは、アプリケーション(ライセンス)層を完全に置き換えるのではなく、主にオペレーティングシステム層で露出通知を実装している。本稿では,このアプローチの結果について考察する。これはOS層における大量監視のための休眠機能を生み出すと我々は主張する。我々は、医療当局が純粋に中央集権化された接触追跡を行うのを技術的に防ぐことができないことを示す。 GAENによってGoogleとAppleは、接触追跡が健康当局によって実際にどのように実装されているか(あるいは実装されていないか)、そしてそれがいかに機能的不気味のリスクをもたらすかを決定することができる。

As a response to the COVID-19 pandemic digital contact tracing has been proposed as a tool to support the health authorities in their quest to determine who has been in close and sustained contact with a person infected by the coronavirus. In April 2020 Google and Apple released the Google Apple Exposure Notification (GAEN) framework, as a decentralised and more privacy friendly platform for contact tracing. The GAEN framework implements exposure notification mostly at the operating system layer, instead of fully at the app(lication) layer. In this paper we study the consequences of this approach. We argue that this creates a dormant functionality for mass surveillance at the operating system layer. We show how it does not technically prevent the health authorities from implementing a purely centralised form of contact tracing (even though that is the stated aim). We highlight that GAEN allows Google and Apple to dictate how contact tracing is (or rather isn't) implemented in practice by health authorities, and how it introduces the risk of function creep.

翻訳日:2023-04-21 08:07:03 公開日:2021-01-12

# ディープニューラルネットワークを用いた単発電子スピン読み出しのノイズロバスト分類

Noise-robust classification of single-shot electron spin readouts using a deep neural network ( http://arxiv.org/abs/2012.10841v2 )

ライセンス: Link先を確認

Yuta Matsumoto, Takafumi Fujita, Arne Ludwig, Andreas D. Wieck, Kazunori Komatani, Akira Oiwa

(参考訳) 量子点接触や量子ドットなどの電荷センサによる電荷とスピン状態の単発読み出しは、半導体スピン量子ビットの動作に必須の技術である。単発読み出しの忠実度は、信号対雑音比、システム温度、しきい値などの数値パラメータといった実験条件に依存する。ノイズの多い環境下で堅牢な正確な電荷検出スキームは、スケーラブルなフォールトトレラント量子計算アーキテクチャの開発には不可欠である。本研究では,ディープニューラルネットワーク(dnn)を用いて,雑音に対して頑健な単発読み出し分類手法を提案する。重要なことに、DNN分類器は、充電ラインで実験的に得られた電荷遷移信号のデータセットを用いて、トレーニング可能なパラメータを調整することにより、任意のノイズ環境でスピンアップおよびスピンダウン信号を自動的に設定する。さらに, 様々な量子ドット実験における電荷状態とスピン状態の測定に用いる2つの従来の分類法と比較して, 雑音環境下でのdnn分類が頑健であることを検証した。

Single-shot readout of charge and spin states by charge sensors such as quantum point contacts and quantum dots is an essential technology for the operation of semiconductor spin qubits. The fidelity of the single-shot readout depends both on experimental conditions, such as signal-to-noise ratio and system temperature, and on numerical parameters such as threshold values. Accurate charge sensing schemes that are robust in noisy environments are indispensable for developing a scalable fault-tolerant quantum computation architecture. In this study, we present a novel single-shot readout classification method that is robust to noise, using a deep neural network (DNN). Importantly, the DNN classifier is automatically configured for spin-up and spin-down signals in any noise environment by tuning the trainable parameters using the datasets of charge transition signals experimentally obtained at a charging line. Moreover, we verify that our DNN classification is robust in noisy environments in comparison to the two conventional classification methods used for charge and spin state measurements in various quantum dot experiments.

翻訳日:2023-04-20 02:31:01 公開日:2021-01-12

# 標準平衡状態のワイルウィグナー表現

Weyl-Wigner Representation of Canonical Equilibrium States ( http://arxiv.org/abs/2012.11674v2 )

ライセンス: Link先を確認

F. Nicacio

(参考訳) 標準熱平衡量子状態のワイルヴィグナー表現は、ハイゼンベルク作用素とメタプレクティック作用素のワイルヴィグナー記号のウィック回転を通じて二次ハミルトニアンのクラス全体に対して得られる。これらのユニタリと本質的に関連づけられた古典的構造の挙動はウィック写像の下で記述され、熱平衡状態は複素シンプレクティック行列によって完全に決定され、熱力学的性質をすべて設定する。ハミルトン力学の4つのカテゴリ(放物型、楕円型、双曲型、ロキソドロミック)を分析した。半古典的および高温の近似は、古典的および/または二次的挙動と比較される。

The Weyl-Wigner representations for canonical thermal equilibrium quantum states are obtained for the whole class of quadratic Hamiltonians through a Wick rotation of the Weyl-Wigner symbols of Heisenberg and metaplectic operators. The behavior of classical structures inherently associated to these unitaries is described under the Wick mapping, unveiling that a thermal equilibrium state is fully determined by a complex symplectic matrix, which sets all of its thermodynamical properties. The four categories of Hamiltonian dynamics (Parabolic, Elliptic, Hyperbolic, and Loxodromic) are analyzed. Semiclassical and high temperature approximations are derived and compared to the classical and/or quadratic behavior.

翻訳日:2023-04-20 00:08:53 公開日:2021-01-12

# 量子ダルブーの定理

The Quantum Darboux Theorem ( http://arxiv.org/abs/2012.15260v2 )

ライセンス: Link先を確認

Olindo Corradini, Emanuele Latini and Andrew Waldron

(参考訳) 量子力学的プロパゲータの計算問題は、波動関数のベクトル束に作用する平坦な接続による並列輸送のためのウィルソン線演算子の計算として再キャストすることができる。この図では、基底多様体は奇数次元シンプレクティック幾何学(英語版)、あるいは非常に総称的に「位相時空」と見なすことのできる接触多様体であり、ファイバーはヒルベルト空間である。このアプローチは、局所古典力学を直線に変換する接触多様体上のダルブーの定理と平行な「量子ダルブーの定理」を享受する。量子ダルブーの定理が非調和量子ポテンシャルに対してどのように機能するかを詳述する。特に,局所的に複雑な量子力学を自明にするゲージ変換の漸近性を計算するための新しい図式論的手法を開発した。

The problem of computing quantum mechanical propagators can be recast as a computation of a Wilson line operator for parallel transport by a flat connection acting on a vector bundle of wavefunctions. In this picture the base manifold is an odd dimensional symplectic geometry, or quite generically a contact manifold that can be viewed as a "phase-spacetime", while the fibers are Hilbert spaces. This approach enjoys a "quantum Darboux theorem" that parallels the Darboux theorem on contact manifolds which turns local classical dynamics into straight lines. We detail how the quantum Darboux theorem works for anharmonic quantum potentials. In particular, we develop a novel diagrammatic approach for computing the asymptotics of a gauge transformation that locally makes complicated quantum dynamics trivial.

翻訳日:2023-04-18 07:49:36 公開日:2021-01-12

# アイゼンハートリフトの定量化

Quantizing the Eisenhart Lift ( http://arxiv.org/abs/2012.15288v2 )

ライセンス: Link先を確認

Kieran Finn, Sotirios Karamitsos and Apostolos Pilaftsis

(参考訳) 古典的なアイゼンハートリフト(英: classical eisenhart lift)とは、高次元の曲面多様体(リフト多様体)で進化する自由系を用いて、ポテンシャルに従属する古典系の力学を再現する手法である。我々は、アイゼンハートリフトの定式化を量子系に拡張し、リフト多様体はポテンシャルの古典的効果だけでなく量子力学的効果も再現することを示した。特に、昇降系のシュロディンガー方程式の解は、新しい自由度を射出した後に元の系の解に還元されることが分かる。この文脈では、古典系の持ち上げ運動量に対応する保存量子数を特定する。さらに、アイゼンハートリフトを量子場理論(QFT)に適用する。昇降場空間多様体はスカラー場ポテンシャルの古典的効果と量子的効果の両方を再現できることを示す。 QFTの場合、持ち上げられた運動量の類似は、時間だけでなく空間でも保存される量子電荷である。この電荷の異なる可能な値は、すべて互いに交わらないフォック空間のアンサンブルをラベル付けする。これらの拡張フォック空間の宇宙定数とゲージ階層問題との関連性を考慮する。

The classical Eisenhart lift is a method by which the dynamics of a classical system subject to a potential can be recreated by means of a free system evolving in a higher-dimensional curved manifold, known as the lifted manifold. We extend the formulation of the Eisenhart lift to quantum systems, and show that the lifted manifold recreates not only the classical effects of the potential, but also its quantum mechanical effects. In particular, we find that the solutions of the Schrodinger equations of the lifted system reduce to those of the original system after projecting out the new degrees of freedom. In this context, we identify a conserved quantum number, which corresponds to the lifted momentum of the classical system. We further apply the Eisenhart lift to Quantum Field Theory (QFT). We show that a lifted field space manifold is able to recreate both the classical and quantum effects of a scalar field potential. We find that, in the case of QFT, the analogue of the lifted momentum is a quantum charge that is conserved not only in time, but also in space. The different possible values for this charge label an ensemble of Fock spaces that are all disjoint from one another. The relevance of these extended Fock spaces to the cosmological constant and gauge hierarchy problems is considered.

翻訳日:2023-04-18 07:38:16 公開日:2021-01-12

# 光子損失と量子会議のキーアグリーメント

Quantum Conference Key Agreement with Photon Loss ( http://arxiv.org/abs/2101.01483v2 )

ライセンス: Link先を確認

Phattharaporn Singkanipa and Pieter Kok

(参考訳) 会議鍵合意 (CKA) は、2つ以上の関係者が共通の秘密鍵を共有したいという情報処理課題である。本稿では、冗長符号化と誤り訂正に基づくCKAの損失耐性プロトコルを提案する。我々のプロトコルは、既存の損失の少ないCKAプロトコルよりも転送速度が向上する。しかし、符号化と誤り訂正には追加費用がかかる。我々は、生成確率p > 0.3の光子源を用いて、プロトコルの秘密鍵レートが既存のプロトコルを克服できることを示す。したがって、損失耐性プロトコルの現実的な実装には高い確率で絡み合う光子源が必要である。

Conference key agreement (CKA) is an information processing task where more than two parties want to share a common secret key. Here, we present a loss-resilient protocol for CKA, based on redundant encoding and error correction. Our protocol provides a speed-up in transmission rate over the existing lossy CKA protocol. However, encoding and error correction come with extra cost. We show that, using photon sources with creation probability p > 0.3, our protocol's secret key rate can overcome the existing protocol's. Hence, high probability entangled photon sources are required for realistic implementation of our loss-resilient protocol.

翻訳日:2023-04-17 19:58:08 公開日:2021-01-12

# Mobius構造における単一光子の制御可能な非相互伝達

Controllable non-reciprocal transmission of single photon in Mobius structure ( http://arxiv.org/abs/2101.04282v1 )

ライセンス: Link先を確認

Hai-Yuan Zhu, Xin-Yuan Hu, Jun-Jie Lin, Jia-Yi Wu, Shuo Li, Yan-Xiang Wang, Fu-Guo Deng, Na-Na Zhang

(参考訳) 制御可能な非相互伝送モデルを提案する。このモデルはモビウス環(mobius ring)から成り、この環は2つの1次元半無限鎖に連結され、モビウス環の空洞の内側に2段階の原子が配置されている。グリーン関数の手法を用いて、モデルによる単一光子の透過率の研究を行う。その結果、このモデルでは非相互伝達が達成でき、2レベル原子は単一光子の非相互輸送の量子スイッチとして振る舞うことができた。この制御可能な非相互伝送モデルは、新しい量子非相互デバイスを刺激することができる。

We propose a controllable non-reciprocal transmission model. The model consists of a Mobius ring, which is connected with two one-dimensional semi-infinite chains, and with a two-level atom located inside one of the cavities of the Mobius ring. We use the method of Green function to study the transmittance of a single photon through the model. The results show that the non-reciprocal transmission can be achieved in this model and the two-level atom can behave as a quantum switch for the non-reciprocal transport of the single photon. This controllable non-reciprocal transmission model may inspire new quantum non-reciprocal devices.

翻訳日:2023-04-17 00:51:03 公開日:2021-01-12

# Wick回転想像時間進化を用いた量子オプション価格設定

Quantum option pricing using Wick rotated imaginary time evolution ( http://arxiv.org/abs/2101.04280v1 )

ライセンス: Link先を確認

Santosh Kumar Radha

(参考訳) 本稿では,量子環境における価格設定の問題を再検討する。提案するアルゴリズムは,初期状態を作成し,オプション価格を表現し,既存の仮想時間シミュレーションアルゴリズムを用いて進化させる。この価格設定の方法は、初期オプション価格を量子状態にマッピングし、ウィックの想像上の時間空間における時間依存をシミュレートするものである。我々は、ヨーロッパオプションのアルゴリズムを概念実証として、特定の想像上の時間発展アルゴリズムを用いて数値的に検証し、アジアオプションのような経路依存オプションにどのように拡張できるかを示す。提案手法はハイブリッド変分アルゴリズムを用いており, 近距離量子コンピュータに関係していると考えられる。

In this paper we reformulate the problem of pricing options in a quantum setting. Our proposed algorithm involves preparing an initial state, representing the option price, and then evolving it using existing imaginary time simulation algorithms. This way of pricing options boils down to mapping an initial option price to a quantum state and then simulating the time dependence in Wick's imaginary time space. We numerically verify our algorithm for European options using a particular imaginary time evolution algorithm as proof of concept and show how it can be extended to path dependent options like Asian options. As the proposed method uses a hybrid variational algorithm, it is bound to be relevant for near-term quantum computers.
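
The mapping described above can be illustrated classically. In the log-price variable $x=\ln S$ and time to maturity $\tau$, the Black-Scholes equation reads $\partial_\tau V = -HV$ with $H = -\tfrac{\sigma^2}{2}\partial_x^2 - (r-\tfrac{\sigma^2}{2})\partial_x + r$, which has the form of a Schrödinger equation in imaginary time. The sketch below is only a classical finite-difference stand-in for that evolution, not the variational quantum algorithm of the paper: it takes the payoff as the initial "state", evolves it with explicit Euler steps, and compares the result with the closed-form Black-Scholes price. All function names, grid sizes and parameters are illustrative assumptions.

```python
import math

def bs_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)

def imaginary_time_call(s0, k, r, sigma, t, dx=0.02, dtau=0.002, half_width=3.0):
    """Evolve the payoff under dV/dtau = -H V on a log-price grid."""
    m = int(round(half_width / dx))
    xs = [math.log(s0) + (i - m) * dx for i in range(2 * m + 1)]
    v = [max(math.exp(x) - k, 0.0) for x in xs]   # initial state: call payoff
    diff = 0.5 * sigma ** 2 / dx ** 2             # diffusion coefficient / dx^2
    drift = (r - 0.5 * sigma ** 2) / (2.0 * dx)   # drift coefficient / (2 dx)
    steps = int(round(t / dtau))
    for n in range(1, steps + 1):
        new = v[:]
        for i in range(1, len(v) - 1):
            # Action of H on V using central differences.
            hv = (-diff * (v[i + 1] - 2 * v[i] + v[i - 1])
                  - drift * (v[i + 1] - v[i - 1]) + r * v[i])
            new[i] = v[i] - dtau * hv             # explicit Euler step in tau
        new[0] = 0.0                              # deep out of the money
        new[-1] = math.exp(xs[-1]) - k * math.exp(-r * n * dtau)  # deep in the money
        v = new
    return v[m]                                   # value at the grid point x = ln(s0)
```

The explicit scheme is stable here only because `dtau * sigma**2 / dx**2` stays below about 0.5; a different volatility or grid would need a smaller time step or an implicit scheme.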

翻訳日:2023-04-17 00:50:51 公開日:2021-01-12

# 量子ニューラルネットワークの表現性

Expressivity of Quantum Neural Networks ( http://arxiv.org/abs/2101.04273v1 )

ライセンス: Link先を確認

Yadong Wu, Juan Yao, Pengfei Zhang, and Hui Zhai

(参考訳) 本研究では、十分に深い量子ニューラルネットワークが目標関数をできるだけ正確に近似できるかどうかという問題に対処する。対象関数が物理的観測可能である単純だが典型的な物理的状況から始めて、学習対象が直接物理的観測可能ではなく、loshimidtエコーやrenyiエントロピーのような複数のレプリカを持つ拡大ヒルベルト空間における物理的観測可能として表現できる状況まで議論を拡大する。主な発見は、データセット内の入力波関数が量子回路が作用するヒルベルト空間全体を満たさない場合にのみ正確な近似が可能であり、より正確には、前者のヒルベルト空間次元は後者のヒルベルト空間次元の半分以下でなければならないということである。場合によっては、例えば、入力波関数が異なるレプリカ間で対称でなければならない場合、データセットの固有の特性のために、この要件を自動で満たすことができる。そして、この要求がデータセットで満たされない場合、入力時に常にウェーブ関数が固定された1つの補助キュービットを追加することで、表現能力が回復できることを示す。本研究は,古典的ニューラルネットワークの表現性の基礎となる普遍近似定理の量子ニューラルネットワークアナロジーの確立に向けたものである。

In this work, we address the question of whether a sufficiently deep quantum neural network can approximate a target function as accurately as possible. We start with simple but typical physical situations in which the target functions are physical observables, and then extend our discussion to situations in which the learning targets are not directly physical observables but can be expressed as physical observables in an enlarged Hilbert space with multiple replicas, such as the Loschmidt echo and the Renyi entropy. The main finding is that an accurate approximation is possible only when the input wave functions in the dataset do not exhaust the entire Hilbert space that the quantum circuit acts on; more precisely, the Hilbert space dimension of the former has to be less than half of the Hilbert space dimension of the latter. In some cases, this requirement is satisfied automatically because of intrinsic properties of the dataset, for instance, when the input wave function has to be symmetric between different replicas. If this requirement cannot be satisfied by the dataset, we show that the expressivity can be restored by adding one ancillary qubit whose wave function is always fixed at input. Our studies point toward establishing a quantum neural network analog of the universal approximation theorem that lays the foundation for the expressivity of classical neural networks.

翻訳日:2023-04-17 00:50:39 公開日:2021-01-12

# オープンおよび周期駆動システムにおける量子制御

Quantum Control in Open and Periodically Driven Systems ( http://arxiv.org/abs/2101.04267v1 )

ライセンス: Link先を確認

Si-Yuan Bai, Chong Chen, Hong Wu, Jun-Hong An

(参考訳) 量子技術は、技術革新を実現するために量子資源の効率的な利用を頼りにしている。これらのシステムは、それぞれの状態が異なる量子プロトコルを実現するために望ましい方法に従うように制御される。しかし、システムと環境の相互作用によって引き起こされるデコヒーレンスは、望ましい方法から逸脱する状態を引き起こす。アクティブコントロールとパッシブデコヒーレンスの共存の下で量子資源を保護する方法は重要である。近年の研究では、デコヒーレンスが系環境エネルギースペクトルの特徴によって決定されることが明らかになっている:エネルギースペクトルにおける境界状態の形成に伴うデコヒーレンスを抑制することができる。デコヒーレンスを制御するためのガイドラインを提供する。このようなアイデアは周期駆動系のシステムに一般化することができる。準エネルギースペクトルにおけるフロッケ境界状態の操作により、フロッケ工学と呼ばれる周期運転によるコヒーレント制御は、デコヒーレンスを制御するだけでなく、人工的にエキゾチックな位相を合成する際にも多用途ツールとなっている。オープンおよび周期駆動システムにおける量子制御の進展について概観する。境界国家が果たす卓越した役割と、デコヒーレンスを抑え、新しいトポロジカルフェーズを創出する周期運転による制御性に特に注目される。

Quantum technology relies on the efficient use of quantum resources to realize technological innovation. The systems are controlled such that their states evolve in the desired manner to realize different quantum protocols. However, the decoherence caused by system-environment interactions makes the states deviate from the desired evolution. How to protect quantum resources under the coexistence of active control and passive decoherence is therefore an important question. Recent studies have revealed that decoherence is determined by the features of the system-environment energy spectrum: when bound states form in the energy spectrum, decoherence can be suppressed. This supplies a guideline for controlling decoherence. The idea can be generalized to systems under periodic driving. By manipulating Floquet bound states in the quasienergy spectrum, coherent control via periodic driving, dubbed Floquet engineering, has become a versatile tool not only for controlling decoherence but also for artificially synthesizing exotic topological phases. We review the progress on quantum control in open and periodically driven systems. Special attention is paid to the distinguished role played by the bound states and their controllability via periodic driving in suppressing decoherence and generating novel topological phases.

翻訳日:2023-04-17 00:50:17 公開日:2021-01-12

# 新しいパラメータ化された絡み合いモノトーン

A new parameterized entanglement monotone ( http://arxiv.org/abs/2101.04256v1 )

ライセンス: Link先を確認

Xue Yang, Ming-Xing Luo, Yan-Han Yang, Shao-Ming Fei

(参考訳) 絡み合い共起は量子実験における絡み合いを特徴付けるために広く使われている。絡み合いモノトンとして、特定の量子タリスエントロピーに関係している。本論文の目的は,Tsallisエントロピーにインスパイアされた$q$-concurrenceと命名された,新しいパラメータ化二部絡みモノトンを提案することである。我々は、正の部分的転位基準と再無視基準を用いて、任意の二成分量子絡み合い状態のq$-共起に対する解析的下限を導出し、強い分離可能性基準と興味深い関係を示す。新しい絡み合いモノトンは二部体等方性状態の特徴付けに用いられる。最後に、2つの二成分純粋状態を重ね合わせることにより、任意の絡み合いに対するq$-concurrenceを推定する計算方法を提案する。重ね合わせ演算は、重ね合わせられている2つの状態が両側直交または片側直交である場合において、最大で1つのebitを$q$-concurrenceで増加させることができる。これらの結果は、量子通信や量子情報処理で興味深い、絡み合いに関する一連の新しい現象を明らかにする。

Entanglement concurrence has been widely used for characterizing entanglement in quantum experiments. As an entanglement monotone, it is related to a specific quantum Tsallis entropy. Our goal in this paper is to propose a new parameterized bipartite entanglement monotone, named $q$-concurrence, inspired by the general Tsallis entropy. We derive an analytical lower bound for the $q$-concurrence of any bipartite entangled quantum state by employing the positive partial transposition criterion and the realignment criterion, which shows an interesting relationship to the strong separability criteria. The new entanglement monotone is used to characterize bipartite isotropic states. Finally, we provide a computational method to estimate the $q$-concurrence for any entanglement by superposing two bipartite pure states. It shows that the superposition operations can increase the $q$-concurrence by at most one ebit in the case that the two states being superposed are bi-orthogonal or one-sided orthogonal. These results reveal a series of new phenomena about entanglement, which may be of interest in quantum communication and quantum information processing.
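
The abstract does not state the definition, so the following sketch rests on an assumption: that for a bipartite pure state the $q$-concurrence takes the Tsallis-like form $C_q = 1 - \mathrm{Tr}\,\rho_A^q$ with $q \ge 2$, where $\rho_A$ is the reduced state of one party. Given the Schmidt probabilities (the eigenvalues of $\rho_A$), this is a one-line computation. The function name and the input convention are hypothetical.

```python
def q_concurrence(schmidt_probs, q=2.0):
    """Assumed pure-state q-concurrence: 1 - sum_i p_i**q, for q >= 2.

    `schmidt_probs` are the eigenvalues of the reduced density matrix
    (squared Schmidt coefficients); they must be non-negative and sum to 1.
    """
    if q < 2:
        raise ValueError("this sketch assumes q >= 2")
    if any(p < 0 for p in schmidt_probs):
        raise ValueError("Schmidt probabilities must be non-negative")
    if abs(sum(schmidt_probs) - 1.0) > 1e-9:
        raise ValueError("Schmidt probabilities must sum to 1")
    return 1.0 - sum(p ** q for p in schmidt_probs)
```

With this assumed form, a product state (`[1.0, 0.0]`) gives zero and a maximally entangled state of local dimension $d$ gives $1 - d^{1-q}$, for example 0.5 for a Bell state at $q=2$.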

翻訳日:2023-04-17 00:49:57 公開日:2021-01-12

# 大規模量子コンピュータとシミュレータのプラットフォームとしての記憶リングにおける冷イオンビーム--研究・開発への挑戦と方向性

Cold ion beam in a storage ring as a platform for large-scale quantum computers and simulators: challenges and directions for research and development ( http://arxiv.org/abs/2101.04247v1 )

ライセンス: Link先を確認

Timur Shaftan, Boris B. Blinov

(参考訳) 本研究の目的は,スケーラブル量子コンピューティング(qc)と量子シミュレーション(qs)のためのプラットフォームとして,多数のイオンを蓄積・冷却・制御可能な大規模記憶リング型イオントラップシステムを構築する可能性を評価することである。このようなトラップでは、イオンは冷却レーザの周波数と強度によって一定速度で円路に沿って移動する結晶ビームを形成する。本稿では,現在最先端の線形イオントラップ装置で利用可能な100個未満から,ストレージリング装置における$10^5$個程度の結晶化イオンまで,量子ビット数の面で大きな飛躍的な進歩を考察する。この新しいトラップ設計は、荷電粒子の貯蔵リングとQCと質量分析に使用される線形イオントラップの2つの異なる概念を統一する。本稿では粒子加速器の言語を用いてイオン状態と動力学について論じる。上記の概念の違いを概説し、回転するイオンビームで大きな環の課題を分析し、現在の1000倍の量子ビットを持つ将来の量子コンピュータを実現するために必要な研究と開発のための目標を提案する。このような大規模量子システムを作成する上で、量子ビットのコヒーレンスと量子論理演算の高忠実さを維持しながら課題となる。アナログ量子シミュレーションを実行することは、そのようなデバイスに対する達成可能な初期目標である。複雑な量子システムの量子シミュレーションは、基礎科学と応用研究の両方を前進させる。原子核と素粒子物理学、多体量子システム、格子ゲージ理論、原子核構造計算は、大規模な量子シミュレーションシステムが自然の理解を進める上で非常に強力なツールになる数少ない例である。

The purpose of this paper is to evaluate the possibility of constructing a large-scale storage-ring-type ion-trap system capable of storing, cooling, and controlling a large number of ions as a platform for scalable quantum computing (QC) and quantum simulations (QS). In such a trap, the ions form a crystalline beam moving along a circular path with a constant velocity determined by the frequency and intensity of the cooling lasers. In this paper we consider a large leap forward in terms of the number of qubits, from fewer than 100 available in state-of-the-art linear ion-trap devices today to an order of $10^5$ crystallized ions in the storage-ring setup. This new trap design unifies two different concepts: the storage rings of charged particles and the linear ion traps used for QC and mass spectrometry. In this paper we use the language of particle accelerators to discuss the ion state and dynamics. We outline the differences between the above concepts, analyze challenges of the large ring with a revolving beam of ions, and propose goals for the research and development required to enable future quantum computers with 1000 times more qubits than available today. The challenge of creating such a large-scale quantum system while maintaining the necessary coherence of the qubits and the high fidelity of quantum logic operations is significant. Performing analog quantum simulations may be an achievable initial goal for such a device. Quantum simulations of complex quantum systems will move forward both the fundamental science and the applied research. Nuclear and particle physics, many-body quantum systems, lattice gauge theories, and nuclear structure calculations are just a few examples in which a large-scale quantum simulation system would be a very powerful tool to move forward our understanding of nature.

翻訳日:2023-04-17 00:49:35 公開日:2021-01-12

# 大規模非制約最適化のための正規化限定メモリBFGS法とその効率的な実装

A Regularized Limited Memory BFGS method for Large-Scale Unconstrained Optimization and its Efficient Implementations ( http://arxiv.org/abs/2101.04413v1 )

ライセンス: Link先を確認

Hardik Tankaria, Shinji Sugimoto and Nobuo Yamashita

(参考訳) リミテッドメモリBFGS(L-BFGS)法は、大規模非制約最適化の一般的な方法の1つである。標準L-BFGS法はラインサーチを用いてグローバル収束を保証するため、時には多数の関数評価を必要とする。難易度を克服するために,一定の正規化技術を備えた新しいl-bfgsを提案する。通常の仮定の下で、そのグローバル収束を示す。本手法をより堅牢かつ効率的にするために,ノンモノトーン法やWolfe線探索の同時利用など,いくつかの手法で拡張する。最後に, CUTEstにおけるテスト問題に対する数値計算結果について述べる。

The limited memory BFGS (L-BFGS) method is one of the popular methods for solving large-scale unconstrained optimization problems. Since the standard L-BFGS method uses a line search to guarantee its global convergence, it sometimes requires a large number of function evaluations. To overcome this difficulty, we propose a new L-BFGS method with a certain regularization technique. We show its global convergence under the usual assumptions. In order to make the method more robust and efficient, we also extend it with several techniques, such as a nonmonotone technique and simultaneous use of the Wolfe line search. Finally, we present some numerical results for test problems in CUTEst, which show that the proposed method is robust in terms of the number of problems solved.
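
The abstract does not give the update formulas, so the following is only a rough sketch of one way a regularized L-BFGS step could look. It assumes that the search direction approximates $-(B_k + \mu I)^{-1} g_k$ by running the standard two-loop recursion on modified pairs $(s, y + \mu s)$, and that $\mu$ is enlarged when a trial step fails a sufficient-decrease test, in place of a line search. The function names, the constants (`1e-4`, the factors `0.5` and `4.0`) and the acceptance rule are illustrative assumptions, not the authors' algorithm.

```python
import math

def two_loop(grad, pairs, mu):
    """Return d = -H*grad, where H approximates (B + mu*I)^(-1)."""
    reg = []
    for s, y in pairs:
        # Regularized secant pair: (B + mu*I) s = y + mu*s.
        y_hat = [yi + mu * si for si, yi in zip(s, y)]
        sy = sum(si * yi for si, yi in zip(s, y_hat))
        ss = sum(si * si for si in s)
        if sy > 1e-12 * ss:                 # keep pairs with positive curvature
            reg.append((s, y_hat, 1.0 / sy))
    q = list(grad)
    alphas = []
    for s, y_hat, rho in reversed(reg):     # newest pair first
        a = rho * sum(si * qi for si, qi in zip(s, q))
        alphas.append(a)
        q = [qi - a * yi for qi, yi in zip(q, y_hat)]
    if reg:
        _, y_hat, rho = reg[-1]
        gamma = 1.0 / (rho * sum(yi * yi for yi in y_hat))  # s.y / y.y scaling
    else:
        gamma = 1.0 / (1.0 + mu)            # B_0 = I, so H_0 = I / (1 + mu)
    r = [gamma * qi for qi in q]
    for (s, y_hat, rho), a in zip(reg, reversed(alphas)):   # oldest pair first
        b = rho * sum(yi * ri for yi, ri in zip(y_hat, r))
        r = [ri + (a - b) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]

def regularized_lbfgs(f, grad_f, x0, memory=5, tol=1e-8, max_iter=500):
    """Minimize f without a line search, adapting the regularization mu."""
    x, mu, pairs = list(x0), 1.0, []
    g = grad_f(x)
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        d = two_loop(g, pairs, mu)
        x_new = [xi + di for xi, di in zip(x, d)]
        slope = sum(gi * di for gi, di in zip(g, d))
        if f(x_new) <= f(x) + 1e-4 * slope:  # sufficient decrease: accept the step
            g_new = grad_f(x_new)
            pairs.append((d, [a - b for a, b in zip(g_new, g)]))
            pairs = pairs[-memory:]          # keep only the most recent pairs
            x, g = x_new, g_new
            mu = max(0.5 * mu, 1e-8)         # relax the regularization
        else:
            mu *= 4.0                        # reject: regularize more strongly
    return x
```

Each rejected trial costs one function evaluation and no gradient evaluation, which is the kind of saving over a line search that the abstract alludes to; the sketch does not include the nonmonotone technique or the Wolfe line search mentioned there.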

翻訳日:2023-04-17 00:42:57 公開日:2021-01-12

# 金属電極によるシリコン中のドナースピンの空間分解脱コヒーレンス

Spatially-resolved decoherence of donor spins in silicon strained by a metallic electrode ( http://arxiv.org/abs/2101.04391v1 )

ライセンス: Link先を確認

V. Ranjan, B. Albanese, E. Albertinale, E. Billaud, D. Flanigan, J. J. Pla, T. Schenkel, D. Vion, D. Esteve, E. Flurin, J. J. L. Morton, Y. M. Niquet, P. Bertet

(参考訳) 電子スピンは既知の最もコヒーレントな固体系の一つであるが、量子センシングや情報処理のデバイスで使用されるためには、通常は界面の近くに置かれなければならない。電子スピンのコヒーレンスとスペクトル特性に対するそのような界面の影響を理解し緩和することは、そのような応用を実現する上で非常に重要であるが、単一スピンの研究からそのようなデータを推測するには、意味のある結果を得るために多くの測定が必要である。ここでは, ミリケルビン温度における28シリコン中の表面近傍ビスマスドナースピンのコヒーレンスを包括的に研究する。特に、金属電極によるひずみ誘起周波数シフトを用いて、スピンコヒーレンスの空間地図を電極に対する深さと位置の関数として用いる。磁場非感受性クロック遷移の測定により、表面スピンによる磁気ノイズと電荷ノイズを分離する。本研究は, ひずみ分離スピン共鳴スペクトルの定量的モデルとシリコン表面における常磁性不純物濃度の抽出を含む。このような表面近傍の電子スピンに対するこれらのデコヒーレンス機構の相互作用は、量子技術におけるそれらの応用において重要であるが、歪分割とクロック遷移の組み合わせはコヒーレンス寿命を最大2桁まで延長し、平均深さ100nmで最大300msに達する。ここで紹介する近接面アンサンブルにおけるコヒーレンスを空間的にマッピングする手法は、ダイヤモンド、炭化ケイ素、および光学結晶中の希土類イオンの欠陥など、他の活発な興味を持つスピン系に直接適用できる。

Electron spins are amongst the most coherent solid-state systems known; however, to be used in devices for quantum sensing and information processing applications, they must typically be placed near interfaces. Understanding and mitigating the impacts of such interfaces on the coherence and spectral properties of electron spins is critical to realizing such applications, but is also challenging: inferring such data from single-spin studies requires many measurements to obtain meaningful results, while ensemble measurements typically give averaged results that hide critical information. Here, we report a comprehensive study of the coherence of near-surface bismuth donor spins in 28-silicon at millikelvin temperatures. In particular, we use strain-induced frequency shifts caused by a metallic electrode to make spatial maps of spin coherence as a function of depth and position relative to the electrode. By measuring magnetic-field-insensitive clock transitions we separate magnetic noise caused by surface spins from charge noise. Our results include quantitative models of the strain-split spin resonance spectra and extraction of paramagnetic impurity concentrations at the silicon surface. The interplay of these decoherence mechanisms for such near-surface electron spins is critical for their application in quantum technologies, while the combination of the strain splitting and clock transition extends the coherence lifetimes by up to two orders of magnitude, reaching up to 300 ms at a mean depth of only 100 nm. The technique we introduce here to spatially map coherence in near-surface ensembles is directly applicable to other spin systems of active interest, such as defects in diamond, silicon carbide, and rare earth ions in optical crystals.

翻訳日:2023-04-17 00:42:33 公開日:2021-01-12

# 不完全なデバイスを用いたソース独立量子乱数発生器のセキュリティ解析と改善

Security Analysis and Improvement of Source Independent Quantum Random Number Generators with Imperfect Devices ( http://arxiv.org/abs/2101.04327v1 )

ライセンス: Link先を確認

Xing Lin, Shuang Wang, Zhen-Qiang Yin, Guan-Jie Fan-Yuan, Rong Wang, Wei Chen, De-Yong He, Zheng Zhou, Guang-Can Guo, and Zheng-Fu Han

(参考訳) 乱数生成器(QRNG)は、数値シミュレーションや暗号など多くのアプリケーションにおいて、真の乱数発生源として不可欠である。近年,信頼できない情報源でセキュアな乱数を生成できるsi-qrng(source-independent quantum random number generator)が実現されている。しかし、SI-QRNGで使用される信頼されているが不完全なデバイスの測定の抜け穴はまだ十分に調べられていないため、特に高速システムではセキュリティ上の問題が発生する。本稿では,SI-QRNGにおける実用的不完全な測定装置のセキュリティ欠陥を指摘する。また,条件付き最小エントロピーの再計算とモニタの追加により,これらの情報漏洩を防止するための対応策を提案する。さらに, 有限サイズ効果を考慮に入れることで, 残差パルスの影響が, 多数のサンプルラウンドを持つ有限サイズ効果よりも大きいことを示す。プロトコルは単純かつ効果的であり,si-qrngのセキュリティや高速測定装置との互換性が向上し,超高速でセキュリティ認定された商用si-qrngシステムの構築方法が確立されている。

A quantum random number generator (QRNG), as a genuine source of randomness, is essential in many applications, such as numerical simulation and cryptography. Recently, a source-independent quantum random number generator (SI-QRNG), which can generate secure random numbers with untrusted sources, has been realized. However, the measurement loopholes of the trusted but imperfect devices used in SI-QRNGs have not yet been fully explored, which will cause security problems, especially in high-speed systems. Here, we point out and evaluate the security loopholes of practical imperfect measurement devices in SI-QRNGs. We also provide corresponding countermeasures to prevent these information leakages by recalculating the conditional minimum entropy and adding a monitor. Furthermore, by taking into account the finite-size effect, we show that the influence of the afterpulse can exceed that of the finite-size effect when the number of sampled rounds is large. Our protocol is simple and effective, and it promotes the security of SI-QRNGs in practice as well as their compatibility with high-speed measurement devices, thus paving the way for constructing ultrafast and security-certified commercial SI-QRNG systems.

翻訳日:2023-04-17 00:42:06 公開日:2021-01-12

# ブロックチェーンによるパンデミック予防接種証明書の共有:ケーススタディとパフォーマンス評価

Sharing pandemic vaccination certificates through blockchain: Case study and performance evaluation ( http://arxiv.org/abs/2101.04575v1 )

ライセンス: Link先を確認

José Luis Hernández-Ramos, Georgios Karopoulos, Dimitris Geneiatakis, Tania Martin, Georgios Kambourakis, and Igor Nai Fovino

(参考訳) この研究は、COVID-19やその他の疾病の予防接種証明書のセキュアな共有のための、スケーラブルなブロックチェーンベースのプラットフォームを提案している。例示的なユースケースとして,欧州連合の国を考慮し,大規模展開をシミュレートする。提案するプラットフォームは,計算資源使用量,ネットワーク応答時間,帯域幅といった観点から,幅広いシミュレーションによって評価される。その結果,提案手法はすべての主要な評価基準において満足できる性能を示し,実実装のペースを設定できることが示唆された。関連研究と比較すると、提案されたプラットフォームは、特に大規模で本格的な実装とその評価という観点において新規性がある。

This work proposes a scalable, blockchain-based platform for the secure sharing of COVID-19 or other disease vaccination certificates. As an indicative use case, we simulate a large-scale deployment by considering the countries of the European Union. The proposed platform is evaluated through extensive simulations in terms of computing resource usage, network response time and bandwidth. Based on the results, the proposed scheme shows satisfactory performance across all major evaluation criteria, suggesting that it can set the pace for real implementations. Vis-à-vis the related work, the proposed platform is novel, especially through the prism of a large-scale, full-fledged implementation and its assessment.

翻訳日:2023-04-17 00:33:30 公開日:2021-01-12

# ダイヤモンド中の窒素空洞を用いたマイスナースクリーニングの全光学・マイクロ波検出

All-Optical and Microwave-Free Detection of Meissner Screening using Nitrogen-Vacancy Centers in Diamond ( http://arxiv.org/abs/2101.04571v1 )

ライセンス: Link先を確認

D. Paone, D. Pinto, G. Kim, L. Feng, M-J. Kim, R. Stöhr, A. Singha, S. Kaiser, G. Logvenov, B. Keimer, J. Wrachtrup and K. Kern

(参考訳) 薄膜超伝導体の微視的研究は、非平衡相転移の検出とナノスケールでのダイナミクスの解明に重要な役割を果たしている。しかし、ナノスケールの空間分解能とピコ秒時間分解能を持つ磁気センサはこれらの探索に不可欠である。本稿では,非侵襲量子センサとしてダイヤモンド中の負帯電窒素空孔(nv)中心を利用し,超伝導薄膜中のマイスナー状態の空間的検出を可能にする全光マイクロ波フリー方式を提案する。超伝導LSCO薄膜上にNV注入ダイヤモンド膜を配置する。 NVフォトルミネッセンス (PL) の強いB場依存性により, 外部磁場下でのLSCOのマイスナースクリーニングを非共鳴的に検討することができる。

Microscopic studies of thin-film superconductors play an important role in probing non-equilibrium phase transitions and revealing dynamics at the nanoscale. However, magnetic sensors with nanometer-scale spatial and picosecond temporal resolution are essential for exploring these. Here, we present an all-optical, microwave-free method that utilizes the negatively charged nitrogen-vacancy (NV) center in diamond as a non-invasive quantum sensor and enables the spatial detection of the Meissner state in a superconducting thin film. We place an NV-implanted diamond membrane on a superconducting LSCO thin film. The strong B-field dependence of the NV photoluminescence (PL) allows us to investigate the Meissner screening in LSCO under an externally applied magnetic field in a non-resonant manner.

翻訳日:2023-04-17 00:33:20 公開日:2021-01-12

# 光子と圧縮コヒーレント光子の特性関数と準確率分布

Characteristic function and quasi-probability distribution of photons and of squeezed coherent photons ( http://arxiv.org/abs/2101.04549v1 )

ライセンス: Link先を確認

Moorad Alexanian

(参考訳) 我々は、光子の熱状態における圧縮コヒーレント光子のp次特性関数とその準分布関数であるフーリエ変換を考察し、圧縮コヒーレント光子の平均数と数分散を計算する。前の特性関数から計算された全ての特性は、パラメトリック変換によって圧縮されたコヒーレント光子の温度状態における光子の当該性質を計算するために使うことができる。特に、パラメトリック変換によって平均数と値のばらつきを得ることができる。

We consider the p-ordered characteristic function and its Fourier transform, the quasidistribution function, of squeezed coherent photons in a thermal state of photons and calculate the mean number and number variance of squeezed coherent photons. All the properties calculated from the previous characteristic function can be used to calculate said properties for photons in a thermal state of squeezed coherent photons via a parametric transformation. In particular, one can obtain the mean number and number variance via the parametric transformation.

翻訳日:2023-04-17 00:33:07 公開日:2021-01-12

# GSM-GPRSを用いたスマート街灯

GSM-GPRS Based Smart Street Light ( http://arxiv.org/abs/2101.06102v1 )

ライセンス: Link先を確認

Imran Kabir, Shihab Uddin Ahamad, Mohammad Naim Uddin, Shah Mohazzem Hossain, Faija Farjana, Partha Protim Datta, Md. Raduanul Alam Riad, Mohammed Hossam-E-Haider

(参考訳) 街路灯はバングラデシュの街路を照らす伝統的なマニュアルシステムであり、ゾーンの街路灯を制御するためだけに専用人が掲示され、街路灯を照らして1日に2回点灯し、日中や日中は街路灯の展示が行われる。これにより予算が削減される。これに加えて、欠陥のあるライトは、技術的な欠点に繋がる長い間、関係する当局の邪魔にならないかもしれない。本稿では,手動制御,半自動制御,完全自動制御を備えたSIM900 GSM-GPRS Shieldを用いたバングラデシュのような国の街灯制御のプロセスを示す。

Street lighting in Bangladesh has traditionally been operated manually: a dedicated person is posted solely to control the street lights of a zone and roams the area to switch the lights on and off twice a day, which leaves lights burning brightly even after sunrise and, in some cases, for the whole day. This adds to the budget. In addition, faulty lights may go unnoticed by the responsible authority for a long time, which is a technical drawback. This paper demonstrates a process for controlling street lights in a country like Bangladesh using a SIM900 GSM-GPRS Shield, which provides manual, semi-automated, and fully automated control.

翻訳日:2023-04-17 00:25:42 公開日:2021-01-12

# 重力波検出のためのボース・アインシュタイン凝縮体の2モードフォノンスクイーズ

Two-mode Phonon Squeezing in Bose-Einstein Condensates for Gravitational Wave Detection ( http://arxiv.org/abs/2101.05051v1 )

ライセンス: Link先を確認

Paul Juschitz

(参考訳) 圧縮された非古典的状態は、測定装置の感度を古典的状態の限界を超えて押し上げる能力があるため、量子メトロロジーの不可欠な道具である。光におけるそれらの生成は標準的な技術となっているが、超低温原子の気体中の集合励起の圧縮状態の生成は、ボース=アインシュタイン凝縮体(BEC)のフォノンは、比較的最近の問題である。このタスクは、becベースの量子メロジカルデバイスに対する多くの提案や、重力波の検出にそれらを適用する可能性と、継続的に関連づけられている。本論文の目的は, 均一なBECに対して振動する外部電位が最近説明された効果を利用して, 現代の技術によって2モード圧縮されたフォノン状態を生成することである。この問題は、一般相対性理論やエフィモフ物理学のような、冷たい原子以外の様々な分野の要素をまとめる。これに答えるために、初期熱フォノニック状態における振動電位による完全な変換を考慮し、摂動の大きさの上限を見つけるとともに、メトロロジーにおける使用に関して最終状態の品質を定量化することができる。これらの知見を既存の実験に応用して, スクイーズ方式の有効性を判断し, 有効性に適していないことを示す一方で, 効率的な実装が可能で, 実験範囲内と思われる設定を提案する。最適化のための広大なパラメータ空間を出発する余地を考えると、検討されたメカニズムは、もともとこの研究を動機付けた重力波検出器だけでなく、より一般的には超低温原子に基づく量子力学分野にも応用できる。

Squeezed, nonclassical states are an integral tool of quantum metrology due to their ability to push the sensitivity of a measurement apparatus beyond the limits of classical states. While their creation in light has become a standard technique, the production of squeezed states of the collective excitations in gases of ultracold atoms, the phonons of a Bose-Einstein condensate (BEC), is a comparatively recent problem. This task is continuously gaining relevance with a growing number of proposals for BEC-based quantum metrological devices and the possibility of applying them in the detection of gravitational waves. The objective of this thesis is to find whether the recently described effect of an oscillating external potential on a uniform BEC can be exploited to generate two-mode squeezed phonon states, given present-day technology. This question brings together elements of a range of fields beyond cold atoms, such as general relativity and Efimov physics. To answer it, the full transformation caused by the oscillating potential on an initially thermal phononic state is considered, which allows us to find an upper bound for the magnitude of this perturbation and to quantify the quality of the final state with respect to its use in metrology. These findings are then applied to existing experiments to judge the feasibility of the squeezing scheme. While the results indicate that these experiments are not well suited for it, a setup is proposed that allows for its efficient implementation and seems within experimental reach. In view of the vast parameter space leaving room for optimization, the considered mechanism could find applications not only in the gravitational wave detector that originally motivated this work, but more generally in the field of quantum metrology based on ultracold atoms.

翻訳日:2023-04-17 00:25:26 公開日:2021-01-12

# 非エルミート粒子アンサンブルの情報と熱力学特性

Information and thermodynamic properties of a non-Hermitian particle ensemble ( http://arxiv.org/abs/2101.04803v1 )

ライセンス: Link先を確認

F. C. E. Lima, A. R. P. Moreira, and C. A. S. Almeida

(参考訳) 非相対論的量子力学の文脈では、非エルミート系のシャノンのエントロピーを調べ、この量がサイクロトロン周波数でどのように変化するかを理解する。その後、均一磁場の存在下でこれらのスピンレス粒子のアンサンブルを構築することに注意を向ける。次に,モデルの熱力学特性について検討する。最後に、シャノンのエントロピーと熱力学的性質が磁場の作用によってどのように変化するかを示す。

In the context of non-relativistic quantum mechanics, we investigate Shannon's entropy of a non-Hermitian system to understand how this quantity is modified with the cyclotron frequency. Subsequently, we turn our attention to the construction of an ensemble of these spinless particles in the presence of a uniform magnetic field. Then, we study the thermodynamic properties of the model. Finally, we show how Shannon's entropy and thermodynamic properties are modified with the action of the magnetic field.
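
To make the notion of a position-space Shannon entropy concrete, here is a minimal sketch that evaluates S = -∫ ρ ln ρ dx for a Gaussian density ρ(x) = sqrt(a/π) exp(-a x²) and compares it with the closed form S = (1/2)(1 + ln(π/a)). The Gaussian is an assumption chosen for illustration (it is the ground-state density of an ordinary harmonic oscillator with a = mω/ħ), not the non-Hermitian model studied in the paper, and both function names are hypothetical.

```python
import math


def shannon_entropy_numeric(a, half_width=12.0, n=20001):
    """Riemann-sum estimate of S = -integral(rho * ln(rho)) dx.

    rho(x) = sqrt(a/pi) * exp(-a x^2); the grid spans +/- half_width
    standard deviations, where sigma = 1/sqrt(2a).
    """
    sigma = 1.0 / math.sqrt(2.0 * a)
    limit = half_width * sigma
    dx = 2.0 * limit / (n - 1)
    entropy = 0.0
    for i in range(n):
        x = -limit + i * dx
        rho = math.sqrt(a / math.pi) * math.exp(-a * x * x)
        if rho > 0.0:  # guard against log(0) far in the tails
            entropy -= rho * math.log(rho) * dx
    return entropy


def shannon_entropy_exact(a):
    """Closed form for the Gaussian density above."""
    return 0.5 * (1.0 + math.log(math.pi / a))
```

In this toy case a larger `a` (a larger frequency) localizes the density and lowers the entropy, which is the kind of frequency dependence the abstract describes for its own model.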

翻訳日:2023-04-17 00:24:55 公開日:2021-01-12

# 個々の窒素空洞核スピンの室温制御と電気的読み出し

Room-temperature control and electrical readout of individual nitrogen-vacancy nuclear spins ( http://arxiv.org/abs/2101.04769v1 )

ライセンス: Link先を確認

Michal Gulka, Daniel Wirtitsch, Viktor Ivády, Jelle Vodnik, Jaroslav Hruby, Goele Magchiels, Emilie Bourgeois, Adam Gali, Michael Trupke, Milos Nesladek

(参考訳) 半導体の核スピンは量子計算、通信、センシングなど量子技術の主要な候補である。ダイヤモンドの核スピンは、非常に長いコヒーレンス寿命のため特に魅力的である。窒素空孔(NV)中心では、これらの核量子ビットは、フォトニックリンクを介する絡み合いを可能にする補助電子量子ビットの恩恵を受ける。電子自体による量子情報の転送は、隣り合う中心への制御された転送や双極子相互作用によって、より高速でより小さなプロセッサが実現されるが、そのようなノードの配列の光学的読み出しは、必要な準回折サイト間距離のために困難を伴う。ここでは、nv電子に結合した1つの14n核スピンである、そのような系の基本単位の電気的読み出しを示す。本研究は, 量子ゲート動作と核量子ビットレジスタの電気的読み出しを, ナノスケール電極構造に適合して行うことを目的とする。このデモンストレーションは、半導体スケーラビリティを備えた大規模ダイヤモンド量子デバイスへのマイルストーンである。

Nuclear spins in semiconductors are leading candidates for quantum technologies, including quantum computation, communication, and sensing. Nuclear spins in diamond are particularly attractive due to their extremely long coherence lifetime. With the nitrogen-vacancy (NV) centre, such nuclear qubits benefit from an auxiliary electronic qubit, which has enabled entanglement mediated by photonic links. The transport of quantum information by the electron itself, via controlled transfer to an adjacent centre or via the dipolar interaction, would enable even faster and smaller processors, but optical readout of arrays of such nodes presents daunting challenges due to the required sub-diffraction inter-site distances. Here, we demonstrate the electrical readout of a basic unit of such systems - a single 14N nuclear spin coupled to the NV electron. Our results provide the key ingredients for quantum gate operations and electrical readout of nuclear qubit registers, in a manner compatible with nanoscale electrode structures. This demonstration is therefore a milestone towards large-scale diamond quantum devices with semiconductor scalability.

翻訳日:2023-04-17 00:24:47 公開日:2021-01-12

# 量子演算回路の現実的最悪のケース解析について

On the realistic worst case analysis of quantum arithmetic circuits ( http://arxiv.org/abs/2101.04764v1 )

ライセンス: Link先を確認

Alexandru Paler, Oumarou Oumarou, Robert Basmadjian

(参考訳) 量子回路の設計における直観は誤解を招く可能性があることを示す。特に私たちはこう示しています a) t数の減少は,総深度を増加させることができる。 b) NISQ回路における測定のためにCNOTを交換することは有益かもしれない。 c) 相対位相トフォリアンシラの計測に基づく非計算は,回路の深さの最大30%を占めることができる。 d) 面積及び容積コストの指標は、資源分析を誤報することができる。私たちの発見は、qubitsは極めて希少なリソースであり続けると仮定している。結果は、NISQとQECC保護回路の両方に適用できる。提案手法はトフォリゲートをクリフォード+Tゲートに分解する複数の方法を用いる。リップルキャリーを用いた加算回路と乗算回路について述べる。副産物として,実用的に重要な回路幅に対して,リプルキャリー付加回路はキャリーキャリー付加回路よりも資源効率が高いことを示す。これらの手法と回路はオープンソース QUANTIFY ソフトウェアで実装された。

We provide evidence that commonly held intuitions when designing quantum circuits can be misleading. In particular, we show that: a) reducing the T-count can increase the total depth; b) it may be beneficial to trade CNOTs for measurements in NISQ circuits; c) measurement-based uncomputation of relative-phase Toffoli ancillae can account for up to 30% of a circuit's depth; d) area and volume cost metrics can misreport the resource analysis. Our findings assume that qubits are and will remain a very scarce resource. The results are applicable to both NISQ and QECC-protected circuits. Our method uses multiple ways of decomposing Toffoli gates into Clifford+T gates. We illustrate our method on addition and multiplication circuits using ripple-carry. As a byproduct, we show systematically that for a practically significant range of circuit widths, ripple-carry addition circuits are more resource efficient than the carry-lookahead addition ones. The methods and circuits were implemented in the open-source QUANTIFY software.
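
For readers unfamiliar with ripple-carry addition built from Toffoli and CNOT gates, the following sketch simulates one well-known reversible design (the MAJ/UMA ripple-carry adder) on classical bits and counts the gates it uses. It is an assumption that this design resembles the circuits analyzed in the paper; the code does not use the QUANTIFY software, and the function names are hypothetical. The T-count helper assumes the standard decomposition of a Toffoli gate into 7 T gates.

```python
def ripple_carry_add(a, b, n):
    """Add two n-bit integers with a MAJ/UMA reversible ripple-carry circuit.

    Returns (sum, gate_counts). Bits are stored least significant first.
    """
    A = [(a >> i) & 1 for i in range(n)]
    B = [(b >> i) & 1 for i in range(n)]
    anc = [0, 0]  # anc[0]: carry-in ancilla, anc[1]: carry-out bit
    counts = {"cnot": 0, "toffoli": 0}

    def cnot(ctrl, i, tgt, j):
        tgt[j] ^= ctrl[i]
        counts["cnot"] += 1

    def toffoli(c1, i, c2, j, tgt, k):
        tgt[k] ^= c1[i] & c2[j]
        counts["toffoli"] += 1

    def carry_wire(i):
        # Bit 0 uses the ancilla; later bits reuse A[i-1], which holds the carry.
        return (anc, 0) if i == 0 else (A, i - 1)

    for i in range(n):  # MAJ blocks: A[i] ends up holding the next carry
        c, k = carry_wire(i)
        cnot(A, i, B, i)
        cnot(A, i, c, k)
        toffoli(c, k, B, i, A, i)
    cnot(A, n - 1, anc, 1)  # copy out the final carry
    for i in reversed(range(n)):  # UMA blocks: restore A, write sum into B
        c, k = carry_wire(i)
        toffoli(c, k, B, i, A, i)
        cnot(A, i, c, k)
        cnot(c, k, B, i)

    assert A == [(a >> i) & 1 for i in range(n)] and anc[0] == 0
    total = sum(bit << i for i, bit in enumerate(B)) + (anc[1] << n)
    return total, counts


def t_count(counts, t_per_toffoli=7):
    """T-count under the assumed 7-T Toffoli decomposition."""
    return counts["toffoli"] * t_per_toffoli
```

The carry has to pass through every bit position in sequence, so the depth of this circuit grows linearly with the width while it needs only a single ancilla, which is the qubit-scarce trade-off the abstract weighs against carry-lookahead designs.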

翻訳日:2023-04-17 00:24:28 公開日:2021-01-12

# 差分制約付きknapsack問題に対するしきい値探索に基づくメメティックアルゴリズム

A threshold search based memetic algorithm for the disjunctively constrained knapsack problem ( http://arxiv.org/abs/2101.04753v1 )

ライセンス: Link先を確認

Zequn Wei and Jin-Kao Hao

(参考訳) 連結制約クナップサック問題は、クナップサック容量を満足しながら選択されたアイテムの総利益を最大化するように、容量制約クナップサックに対向するアイテムのサブセットを充填することを特徴とする。 DCKPは多くの応用があり、計算は困難である(NP-hard)。本研究では, DCKP を解くためのしきい値探索に基づくメメティックアルゴリズムを提案し, メメティックフレームワークとしきい値探索を組み合わせた高品質な解を求める。 6340のベンチマークインスタンスの2セットに関する広範な計算的評価結果から,提案手法は最先端手法と比較して高い競合性を示す。特に,セットi (100 インスタンス) とセットii (6240 インスタンス) について,24 と 354 において,最もよく知られた結果(新しい下限)がそれぞれ改善されていることを報告した。我々は,アルゴリズムの重要成分を分析し,アルゴリズムの性能向上のためにその役割を明かす。我々のアルゴリズムのコードは公開される予定だ。

The disjunctively constrained knapsack problem (DCKP) consists in packing a subset of pairwise compatible items in a capacity-constrained knapsack such that the total profit of the selected items is maximized while satisfying the knapsack capacity. DCKP has numerous applications but is computationally challenging (NP-hard). In this work, we present a threshold search based memetic algorithm for solving the DCKP that combines the memetic framework with threshold search to find high-quality solutions. Extensive computational assessments on two sets of 6340 benchmark instances in the literature demonstrate that the proposed algorithm is highly competitive compared to the state-of-the-art methods. In particular, we report 24 and 354 improved best-known results (new lower bounds) for Set I (100 instances) and for Set II (6240 instances), respectively. We analyze the key algorithmic components and shed light on their roles in the performance of the algorithm. The code of our algorithm will be made publicly available.
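
The problem definition above can be made concrete with a tiny exhaustive solver. This is only a sketch of the problem statement for toy instances, not the threshold search based memetic algorithm of the paper; the function name and the example instance are assumptions made for illustration.

```python
from itertools import combinations


def solve_dckp_bruteforce(profits, weights, conflicts, capacity):
    """Exhaustively solve a small DCKP instance.

    conflicts is an iterable of index pairs that may not be packed together.
    Returns (best_profit, best_items). Exponential time: toy sizes only.
    """
    n = len(profits)
    conflict_set = {frozenset(pair) for pair in conflicts}
    best_profit, best_items = 0, ()
    for size in range(1, n + 1):
        for items in combinations(range(n), size):
            # Capacity constraint of the knapsack.
            if sum(weights[i] for i in items) > capacity:
                continue
            # Disjunctive constraint: no conflicting pair may be selected.
            if any(frozenset(p) in conflict_set for p in combinations(items, 2)):
                continue
            profit = sum(profits[i] for i in items)
            if profit > best_profit:
                best_profit, best_items = profit, items
    return best_profit, best_items


profits = [10, 8, 6, 5]
weights = [4, 3, 2, 2]
with_conflict = solve_dckp_bruteforce(profits, weights, [(1, 2)], capacity=7)
without_conflict = solve_dckp_bruteforce(profits, weights, [], capacity=7)
```

In this instance the conflict between items 1 and 2 rules out the otherwise best packing, so the optimum drops from 19 to 18. Exhaustive search scales exponentially, which is why heuristics are used on benchmark-sized instances.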

翻訳日:2023-04-17 00:24:17 公開日:2021-01-12

# 移動振動子の励起

Excitation of a moving oscillator ( http://arxiv.org/abs/2101.04721v1 )

ライセンス: Link先を確認

Viktor V. Dodonov

(参考訳) 運動法則の移動中心を持つ量子調和振動子のコヒーレント状態とフォック状態の間の遷移振幅と確率を計算する。これらの量は移動中心加速度のフーリエ変換によって決定される。定加速度の場合、その確率は振動子周波数と振動し、各周期の後に励起は起こらない。調和トラップ中心の振動や回転運動の例も考慮される。得られた原子トラップでは,高調波トラップ中心の動きによる振動状態の励起効果が観測できることが推定された。

We calculate transition amplitudes and probabilities between the coherent and Fock states of a quantum harmonic oscillator with a moving center for an arbitrary law of motion. These quantities are determined by the Fourier transform of the moving center acceleration. In the case of a constant acceleration, the probabilities oscillate with the oscillator frequency, so that no excitation occurs after every period. Examples of oscillating and rotating motion of the harmonic trap center are considered too. Estimations show that the effect of excitation of vibration states due to the motion of the harmonic trap center can be observed in available atomic traps.
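
The constant-acceleration statement above can be illustrated with the textbook forced-oscillator result, used here as an assumption rather than as the paper's general formulas. In the frame moving with the trap, a constant acceleration acts as a constant inertial force, and an oscillator starting in its ground state ends in a coherent state with |α|² = 2 m a² sin²(ωT/2) / (ħ ω³) after a time T, so the level populations are Poissonian. The function names and the default units (m = ħ = 1) are hypothetical choices for this sketch.

```python
import math


def coherent_amplitude_sq(accel, omega, duration, mass=1.0, hbar=1.0):
    """|alpha|^2 after a constant trap acceleration lasting `duration`.

    Assumes the oscillator starts in the ground state (textbook result for
    a harmonic oscillator driven by a constant force -mass * accel).
    """
    return (2.0 * mass * accel ** 2
            * math.sin(0.5 * omega * duration) ** 2 / (hbar * omega ** 3))


def excitation_probabilities(accel, omega, duration, n_max, mass=1.0, hbar=1.0):
    """Poisson probabilities P_n of finding n quanta, for n = 0..n_max."""
    mean_n = coherent_amplitude_sq(accel, omega, duration, mass, hbar)
    return [math.exp(-mean_n) * mean_n ** n / math.factorial(n)
            for n in range(n_max + 1)]


period = 2.0 * math.pi  # one oscillator period for omega = 1
after_full_period = excitation_probabilities(1.0, 1.0, period, n_max=5)
after_half_period = excitation_probabilities(1.0, 1.0, period / 2, n_max=40)
```

After a full period the ground-state probability returns to 1, in line with the statement that no excitation occurs after every period, while at half a period the mean excitation number is largest.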

翻訳日:2023-04-17 00:23:58 公開日:2021-01-12

# pontryagin differentiable programming: エンドツーエンドの学習と制御フレームワーク

Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework ( http://arxiv.org/abs/1912.12970v5 )

ライセンス: Link先を確認

Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou

(参考訳) 本稿では,pdp(pontryagin differentiable programming)手法を開発し,学習と制御タスクの幅広いクラスを解決するための統一フレームワークを構築した。 The PDP distinguishes from existing methods by two novel techniques: first, we differentiate through Pontryagin's Maximum Principle, and this allows to obtain the analytical derivative of a trajectory with respect to tunable parameters within an optimal control system, enabling end-to-end learning of dynamics, policies, or/and control objective functions; and second, we propose an auxiliary control system in the backward pass of the PDP framework, and the output of this auxiliary control system is the analytical derivative of the original system's trajectory with respect to the parameters, which can be iteratively solved using standard control tools. 逆強化学習,システム識別,制御・計画の3つの学習モードについて検討した。マルチリンクロボットアーム,6-DoF操縦四極子,6-DoFロケット搭載着陸など,多次元システムにおける学習モード毎のPDPの能力を示す。

This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework to solve a broad class of learning and control tasks. The PDP differs from existing methods in two novel techniques: first, we differentiate through Pontryagin's Maximum Principle, which allows us to obtain the analytical derivative of a trajectory with respect to tunable parameters within an optimal control system, enabling end-to-end learning of dynamics, policies, and/or control objective functions; second, we propose an auxiliary control system in the backward pass of the PDP framework, whose output is the analytical derivative of the original system's trajectory with respect to the parameters and which can be iteratively solved using standard control tools. We investigate three learning modes of the PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of the PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and 6-DoF rocket-powered landing.

翻訳日:2023-01-17 03:12:21 公開日:2021-01-12

# 競争型生成分類のための情報ボトルネックを用いた正規化流れの訓練

Training Normalizing Flows with the Information Bottleneck for Competitive Generative Classification ( http://arxiv.org/abs/2001.06448v5 )

ライセンス: Link先を確認

Lynton Ardizzone, Radek Mackowiak, Carsten Rother, Ullrich Köthe

(参考訳) Information Bottleneck (IB) の目的は、情報理論を用いてタスクパフォーマンス対堅牢性トレードオフを定式化することである。標準判別分類設定においてうまく適用されている。 IBが正規化フローなどの生成可能性モデルのトレーニングにも利用できるかどうかという疑問が提起される。正規化フローは非可逆ネットワークアーキテクチャ(INN)を使用するため、構築によって情報保存される。これはボトルネックの概念と矛盾しているようだ。本稿では,まず,innをトレーニングする条件付き正規化フローのクラスであるib-innsの理論と方法論を開発する。少数の"em control"情報損失を導入することで,innの生成能力を無傷に保ちながら,ibの漸近的に正確な定式化が可能になる。次に,これらのモデルの性質を実験的に検討し,特に生成的分類器として用いた。このモデルクラスは、不確実性定量化の改善や分布外検出などの利点を提供するが、従来の生成型分類法は分類精度がかなり低い。 IBのトレードオフパラメータは、標準分類器に近い生成能力と精度の混合を制御している。経験的に、この混合状態における我々の不確実性推定は、従来の生成的および識別的分類器と好意的に比較できる。

The Information Bottleneck (IB) objective uses information theory to formulate a task-performance versus robustness trade-off. It has been successfully applied in the standard discriminative classification setting. We ask whether the IB can also be used to train generative likelihood models such as normalizing flows. Since normalizing flows use invertible network architectures (INNs), they are information-preserving by construction. This seems contradictory to the idea of a bottleneck. In this work, firstly, we develop the theory and methodology of IB-INNs, a class of conditional normalizing flows where INNs are trained using the IB objective: introducing a small amount of controlled information loss allows for an asymptotically exact formulation of the IB, while keeping the INN's generative capabilities intact. Secondly, we investigate the properties of these models experimentally, specifically when used as generative classifiers. This model class offers advantages such as improved uncertainty quantification and out-of-distribution detection, but traditional generative classifier solutions suffer considerably in classification accuracy. We find that the trade-off parameter in the IB controls a mix of generative capabilities and accuracy close to that of standard classifiers. Empirically, our uncertainty estimates in this mixed regime compare favourably to conventional generative and discriminative classifiers.

翻訳日:2023-01-10 09:59:32 公開日:2021-01-12

# AI-GAN: 攻撃に触発された敵の例の生成

AI-GAN: Attack-Inspired Generation of Adversarial Examples ( http://arxiv.org/abs/2002.02196v2 )

ライセンス: Link先を確認

Tao Bai, Jun Zhao, Jinlin Zhu, Shoudong Han, Jiefeng Chen, Bo Li, Alex Kot

(参考訳) ディープニューラルネットワーク(DNN)は、入力に知覚できない摂動を加えることで構築される敵の例に対して脆弱である。近年、様々な攻撃や戦略が提案されているが、現実的かつ効率的に敵の例を生成する方法は未解決のままである。本稿では、ジェネレータ、識別器、攻撃者が共同で訓練されるAI-GAN(Attack-Inspired GAN)と呼ばれる新しいフレームワークを提案する。トレーニングが完了すると、入力画像とターゲットクラスを効率よく生成できる。一般的なデータセット \eg MNIST と CIFAR-10 に関する広範な実験を通じて、AI-GAN は高い攻撃成功率を達成し、さまざまな設定で生成時間を大幅に短縮する。さらに、AI-GANは初めて、すべてのクラスで約90\%の成功率で、複雑なデータセット \eg CIFAR-100にスケールすることに成功した。

Deep neural networks (DNNs) are vulnerable to adversarial examples, which are crafted by adding imperceptible perturbations to inputs. Recently, different attacks and strategies have been proposed, but how to generate perceptually realistic adversarial examples more efficiently remains unsolved. This paper proposes a novel framework called Attack-Inspired GAN (AI-GAN), where a generator, a discriminator, and an attacker are trained jointly. Once trained, it can generate adversarial perturbations efficiently given input images and target classes. Through extensive experiments on several popular datasets, e.g., MNIST and CIFAR-10, AI-GAN achieves high attack success rates and reduces generation time significantly in various settings. Moreover, for the first time, AI-GAN successfully scales to complicated datasets, e.g., CIFAR-100, with around 90% success rates among all classes.

翻訳日:2023-01-03 10:12:23 公開日:2021-01-12

# 摂動に基づく正則化によるアーキテクチャ探索の安定化

Stabilizing Differentiable Architecture Search via Perturbation-based Regularization ( http://arxiv.org/abs/2002.05283v3 )

ライセンス: Link先を確認

Xiangning Chen, Cho-Jui Hsieh

(参考訳) 微分可能なアーキテクチャ探索(DARTS)は、アーキテクチャを識別するための一般的なNASソリューションである。アーキテクチャ空間の継続的な緩和に基づいて、DARTSは異なるアーキテクチャウェイトを学び、探索コストを大幅に削減する。しかし、その安定性は、探索が進むにつれて劣化するアーキテクチャをもたらすことで問題視されている。最終アーキテクチャを蒸留する際の劇的な性能低下につながる急激なバリデーション損失の状況は、不安定を引き起こす重要な要因であることがわかった。そこで本研究では,DARTS法における損失景観の平滑化と一般化性の向上を目的として,摂動型正規化SmoothDARTS(SDARTS)を提案する。特に,我々の新しい定式化はDARTS法をランダムな平滑化または逆攻撃によって安定化させる。 nas-bench-1shot1の探索軌跡は,提案手法の有効性を示し,安定性の向上により,4つのデータセットの様々な検索空間における性能向上を実現する。さらに,SDARTSは,スムーズな損失景観と性能向上を考慮に入れた検証損失のヘッセン的規範を暗黙的に正則化していることを示す。

Differentiable architecture search (DARTS) is a prevailing NAS solution to identify architectures. Based on the continuous relaxation of the architecture space, DARTS learns a differentiable architecture weight and largely reduces the search cost. However, its stability has been challenged for yielding deteriorating architectures as the search proceeds. We find that the precipitous validation loss landscape, which leads to a dramatic performance drop when distilling the final architecture, is an essential factor that causes instability. Based on this observation, we propose a perturbation-based regularization, SmoothDARTS (SDARTS), to smooth the loss landscape and improve the generalizability of DARTS-based methods. In particular, our new formulations stabilize DARTS-based methods by either random smoothing or adversarial attack. The search trajectory on NAS-Bench-1Shot1 demonstrates the effectiveness of our approach, and owing to the improved stability, we achieve performance gains across various search spaces on 4 datasets. Furthermore, we mathematically show that SDARTS implicitly regularizes the Hessian norm of the validation loss, which accounts for a smoother loss landscape and improved performance.

翻訳日:2023-01-01 18:52:58 公開日:2021-01-12

# hemlets posh: 3次元ポーズと形状推定のための部分中心ヒートマップ三重項学習

HEMlets PoSh: Learning Part-Centric Heatmap Triplets for 3D Human Pose and Shape Estimation ( http://arxiv.org/abs/2003.04894v3 )

ライセンス: Link先を確認

Kun Zhou, Xiaoguang Han, Nianjuan Jiang, Kui Jia, Jiangbo Lu

(参考訳) 一つの画像から3Dのポーズを推定するのは難しい作業だ。本研究は,2次元観察と3次元解釈のギャップを短くする中間状態部分熱マップトリプレット(HEMlets)を導入することで,検出された2次元関節を3次元空間に持ち上げる不確実性に対処しようとするものである。 HEMletsは3つのジョイントヒートマップを使用して、各骨格体部に対するエンドジョイントの相対的な深さ情報を表す。提案手法では,まず入力画像からHEMletを予測するために畳み込みネットワーク(ConvNet)を訓練し,その後にボリューム結合熱マップの回帰を行う。積分演算を利用して体積熱マップから関節の位置を抽出し、エンドツーエンドの学習を保証する。ネットワーク設計の単純さにも拘わらず、定量的比較は最高の方法(例えばHuman3.6Mの$20\%)よりも大幅に性能が向上したことを示している。提案手法は,骨格関節の弱いアノテートした相対深度情報しか得られない "in-the-wild" 画像による訓練を自然に支援する。これにより,屋外画像の質的比較により,モデルの一般化能力が向上する。ヘムレットの姿勢推定の強みを活かし,さらに浅く効果的なネットワークモジュールを設計・付加し,身体の姿勢や形状のsmplパラメータを緩和する。 HEMletsベースのヒューマンポーズと形状回復パイプラインHEMlets PoShを総称する。 HEMlets PoShアプローチで得られた最先端の成果を,既存の人体回収ベンチマークの大規模定量および定性的実験により正当化した。

Estimating 3D human pose from a single image is a challenging task. This work attempts to address the uncertainty of lifting the detected 2D joints to the 3D space by introducing an intermediate state, Part-Centric Heatmap Triplets (HEMlets), which narrows the gap between the 2D observation and the 3D interpretation. The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part. In our approach, a Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image, followed by a volumetric joint-heatmap regression. We leverage the integral operation to extract the joint locations from the volumetric heatmaps, guaranteeing end-to-end learning. Despite the simplicity of the network design, the quantitative comparisons show a significant performance improvement over the best-of-grade methods (e.g., 20% on Human3.6M). The proposed method naturally supports training with "in-the-wild" images, where only weakly annotated relative depth information of skeletal joints is available. This further improves the generalization ability of our model, as validated by qualitative comparisons on outdoor images. Leveraging the strength of the HEMlets pose estimation, we further design and append a shallow yet effective network module to regress the SMPL parameters of the body pose and shape. We term the entire HEMlets-based human pose and shape recovery pipeline HEMlets PoSh. Extensive quantitative and qualitative experiments on the existing human body recovery benchmarks justify the state-of-the-art results obtained with our HEMlets PoSh approach.
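
The integral operation mentioned above is commonly implemented as a soft-argmax: the heatmap is normalized into a probability distribution and the joint location is taken as the expected coordinate, which keeps the step differentiable. The sketch below shows this for a single 2D heatmap in plain Python; the paper applies the idea to volumetric heatmaps inside a network, so the 2D reduction and the function name are assumptions made for illustration.

```python
import math


def soft_argmax_2d(heatmap):
    """Return the expected (x, y) location under a softmax over all cells.

    heatmap is a list of rows; x indexes columns and y indexes rows.
    """
    peak = max(max(row) for row in heatmap)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [[math.exp(value - peak) for value in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


flat = [[0.0] * 3 for _ in range(3)]
peaked = [[0.0, 0.0, 50.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
center = soft_argmax_2d(flat)    # a flat map averages to the middle cell
corner = soft_argmax_2d(peaked)  # a sharp peak dominates the expectation
```

Unlike a hard argmax, the expectation varies smoothly with the heatmap values, which is what allows gradients to flow back through the joint coordinates.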

翻訳日:2022-12-24 21:19:48 公開日:2021-01-12

# 自動ベイズ最適化に向けて : 獲得関数に関わる第一歩

Towards Automatic Bayesian Optimization: A first step involving acquisition functions ( http://arxiv.org/abs/2003.09643v2 )

ライセンス: Link先を確認

Eduardo C. Garrido Merchán, Luis C. Jariego Pérez

(参考訳) ベイズ最適化 (Bayesian Optimization) は、ブラックボックスの最適化(つまり、解析的表現や勾配にアクセスできない関数)の最先端技術であり、評価は高価であり、評価はノイズが多い。ベイズ最適化の最も一般的な応用は機械学習アルゴリズムの自動ハイパーパラメータチューニングであり、これらのアルゴリズムの一般化誤差を最適化して機械学習アルゴリズムの最適構成を得る。ベイズ最適化手法は成功して適用されているものの、確率的サロゲートモデルや使用する獲得関数などの設定が必要なハイパーパラメータも備えている。これらのハイパーパラメータの構成に関する悪い決定は、悪い品質結果を得ることを意味する。通常、これらのハイパーパラメータは、我々が評価したい目的関数を仮定することで調整されますが、目的関数に関する事前情報を持っていないシナリオがあります。本稿では,ベイズ最適化の獲得関数を自動的にチューニングする複数のヒューリスティックを探索し,ベイズ最適化に対する最初の試みを提案する。機械学習アルゴリズムのベンチマーク問題とハイパーパラメータチューニング問題において,これらのヒューリシックの有効性について述べる。

Bayesian optimization is the state-of-the-art technique for the optimization of black boxes, i.e., functions for which we have access to neither the analytical expression nor the gradients, and which are expensive to evaluate and return noisy evaluations. The most popular application of Bayesian optimization is the automatic hyperparameter tuning of machine learning algorithms, where we obtain the best configuration of machine learning algorithms by optimizing the estimation of the generalization error of these algorithms. Despite being applied with success, Bayesian optimization methodologies also have hyperparameters that need to be configured, such as the probabilistic surrogate model or the acquisition function used. A bad decision over the configuration of these hyperparameters implies obtaining poor-quality results. Typically, these hyperparameters are tuned by making assumptions about the objective function that we want to evaluate, but there are scenarios where we do not have any prior information about the objective function. In this paper, we propose a first attempt at automatic Bayesian optimization by exploring several heuristics that automatically tune the acquisition function of Bayesian optimization. We illustrate the effectiveness of these heuristics in a set of benchmark problems and a hyperparameter tuning problem of a machine learning algorithm.
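
As background on what an acquisition function is, the sketch below implements Expected Improvement (EI) for minimization under a Gaussian predictive distribution, EI = (f_best - μ) Φ(z) + σ φ(z) with z = (f_best - μ)/σ. EI is one widely used acquisition function; the abstract does not say which functions its heuristics choose among, so this choice, the function names, and the candidate values are all assumptions for illustration.

```python
import math


def expected_improvement(mean, std, best_observed):
    """Expected Improvement for minimization under N(mean, std^2)."""
    improvement = best_observed - mean
    if std <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + std * pdf


def pick_next(candidates, best_observed):
    """Return the index of the (mean, std) candidate with the largest EI."""
    scores = [expected_improvement(m, s, best_observed) for m, s in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical surrogate predictions (mean, std) at three candidate points.
candidates = [(1.0, 0.1), (1.2, 1.0), (0.9, 0.0)]
chosen = pick_next(candidates, best_observed=1.0)
```

Here the uncertain candidate wins over the one with a slightly better but certain prediction, which shows how the acquisition function trades exploration against exploitation and why its choice affects the results.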

翻訳日:2022-12-21 10:16:24 公開日:2021-01-12

# Bayesian ODE Solvers: The Maximum A Posteriori Estimate

Bayesian ODE Solvers: The Maximum A Posteriori Estimate ( http://arxiv.org/abs/2004.00623v2 )

ライセンス: Link先を確認

Filip Tronarp, Simo Sarkka, Philipp Hennig

(参考訳) 近年,正規微分方程式の数値解を非線形ベイズ推定問題として定式化できることが確立されており,ガウス・マルコフ前駆体を用いると,ガウスフィルタリングや平滑化によって近似的に解くことができる。ガウス推定値の分類が確立され、階層の最上部に最大後方推定値が設定され、反復された拡張カルマン平滑化によって計算できる。残りの3つのクラスは明示的、半単純、暗黙的と呼ばれ、これはベクトル場の条件に対応する古典的概念と類似しており、その下にフィルタ更新が局所的な最大a後方推定を生成する。最大後続推定は、前項に付随する再生ヒルベルト空間の最適補間に対応し、この場合、滑らか性のソボレフ空間 $\nu+1$ と等価である。その結果, ソボレフ空間における散乱データ近似と非線形解析の手法を用いて, ベクトル場上の緩やかな条件下での補間距離(最大ステップサイズ)の多項式速度で, 最大アフター推定値が真の解に収束することを示した。開発された手法は、古典的収束解析法よりも、これらの推定器の収束を研究するための新しい、より自然なアプローチを提供する。これらの方法と理論的結果は数値的な例で示される。

It has recently been established that the numerical solution of ordinary differential equations can be posed as a nonlinear Bayesian inference problem, which can be approximately solved via Gaussian filtering and smoothing, whenever a Gauss-Markov prior is used. In this paper, the class of $\nu$-times differentiable linear time-invariant Gauss-Markov priors is considered. A taxonomy of Gaussian estimators is established, with the maximum a posteriori estimate at the top of the hierarchy, which can be computed with the iterated extended Kalman smoother. The remaining three classes are termed explicit, semi-implicit, and implicit, in analogy with the classical notions; they correspond to conditions on the vector field under which the filter update produces a local maximum a posteriori estimate. The maximum a posteriori estimate corresponds to an optimal interpolant in the reproducing kernel Hilbert space associated with the prior, which in the present case is equivalent to a Sobolev space of smoothness $\nu+1$. Consequently, using methods from scattered data approximation and nonlinear analysis in Sobolev spaces, it is shown that the maximum a posteriori estimate converges to the true solution at a polynomial rate in the fill-distance (maximum step size) subject to mild conditions on the vector field. The methodology developed provides a novel and more natural approach to studying the convergence of these estimators than classical methods of convergence analysis. The methods and theoretical results are demonstrated in numerical examples.

翻訳日:2022-12-17 19:33:24 公開日:2021-01-12

# YOLOv3によるドローン検出におけるアノテーションエラーの影響

Effect of Annotation Errors on Drone Detection with YOLOv3 ( http://arxiv.org/abs/2004.01059v4 )

ライセンス: Link先を確認

Aybora Koksal, Kutalmis Gokalp Ince, A. Aydin Alatan

(参考訳) 近年の深層ネットワークの進歩により、ディープラーニングバックボーンを用いた物体検出と追跡アルゴリズムが大幅に改善されているが、この急速な発展により大量の注釈付きラベルが必要となった。たとえこのような半自動的なアノテーションプロセスの詳細が、特にビデオアノテーションに関して、正確には分かっていないとしても、自動化されたラベリングプロセスが使われることが多い。残念ながら、このようなアプローチは誤ったアノテーションをもたらす可能性がある。本研究では,物体検出問題に対する異なる種類のアノテーションエラーをシミュレートし,トレーニングおよび試験段階における誤アノテーションを用いた最先端オブジェクト検出器YOLOv3の性能について検討する。さらに、cvpr-2020アンチuavチャレンジデータセットにおける避けられないアノテーションエラーについても、この有用なデータセットのアノテーションエラーを修正するソリューションを提案しながら、この方法で検討している。

Following the recent advances in deep networks, object detection and tracking algorithms with deep learning backbones have been improved significantly; however, this rapid development has resulted in the need for large amounts of annotated labels. Even if the details of the semi-automatic annotation processes for most of these datasets are not known precisely, especially for the video annotations, some automated labeling processes are usually employed. Unfortunately, such approaches might result in erroneous annotations. In this work, different types of annotation errors for the object detection problem are simulated, and the performance of a popular state-of-the-art object detector, YOLOv3, with erroneous annotations during the training and testing stages is examined. Moreover, some inevitable annotation errors in the CVPR-2020 Anti-UAV Challenge dataset are also examined in this manner, and a solution is proposed to correct such annotation errors in this valuable dataset.
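
One simple way to picture simulated annotation errors is to perturb a ground-truth bounding box and measure how far it drifts from the original using intersection over union (IoU). The abstract does not list the error types it simulates, so the shift and scale perturbations below, and all function names, are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def shift_box(box, dx, dy):
    """Simulate a mislocated annotation by translating the box."""
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


def scale_box(box, factor):
    """Simulate a loose or tight annotation by scaling about the center."""
    cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
    half_w = (box[2] - box[0]) * factor / 2.0
    half_h = (box[3] - box[1]) * factor / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


truth = (0.0, 0.0, 10.0, 10.0)
shifted_iou = iou(truth, shift_box(truth, 5.0, 0.0))  # half-width shift
loose_iou = iou(truth, scale_box(truth, 2.0))         # box twice as large
```

For small objects such as distant drones, a shift of only a few pixels already lowers the IoU substantially, so annotation noise of this kind can affect both training labels and evaluation scores.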

翻訳日:2022-12-17 12:45:46 公開日:2021-01-12

# 低次元関数のロバストテスト

Robust testing of low-dimensional functions ( http://arxiv.org/abs/2004.11642v2 )

ライセンス: Link先を確認

Anindya De, Elchanan Mossel and Joe Neeman

(参考訳) 高次元推論における自然問題は、分類器 $f:\mathbb{R}^n \rightarrow \{-1,1\}$ がその入力データの少数の線形方向に依存するかどうかを決定することである。関数 $g: \mathbb{r}^n \rightarrow \{-1,1\}$、入力空間の$k$-次元部分空間によって完全に決定される場合、線型 $k$-junta と呼ぶ。著者たちの最近の研究は、線型$k$-juntasがテスト可能であることを示した。したがって、次のように区別するアルゴリズムが存在する。 1.$f: \mathbb{R}^n \rightarrow \{-1,1\}$ これは表面積が$s$の線型$k$-juntaである。 2.$f$ is $\epsilon$-far from any linear $k$-junta with surface area $(1+\epsilon)s$ ここでアルゴリズムのクエリの複雑さは周囲の次元$n$とは独立である。耐雑音性試験への関心が高まった後, 本論文では, 耐雑音性(あるいは頑健性)の検証を行った。すなわち、任意の$c>0$,$\epsilon>0$を与えられたアルゴリズムを区別する。 1.$f: \mathbb{R}^n \rightarrow \{-1,1\}$は、少なくとも$c$と、表面積が$s$の線型$k$-juntaとの相関を持つ。 2.$f$ は、最大 $c-\epsilon$ と、表面積が最大 $s$ の線形 $k$-junta との相関を持つ。テスト担当者のクエリの複雑さは$k^{\mathsf{poly}(s/\epsilon)}$である。この手法を用いることで、任意のクラス$\mathcal{C}$ of linear $k$-juntas with surface area with $s$に対して同じクエリ複雑性を持つ完全耐雑音試験器も得られる。その結果、クエリ複雑性が$k^{O(\mathsf{poly}(\log k/\epsilon))} である完全雑音耐性試験器を、ガウス空間上の$k$-半空間(定数$k$)の交叉のクラスに対して得られる。私たちのクエリの複雑さは、周囲の次元$n$とは独立です。以前は、1つのハーフスペースでさえ、非自明なノイズ耐性テスターは知られていない。

A natural problem in high-dimensional inference is to decide if a classifier $f:\mathbb{R}^n \rightarrow \{-1,1\}$ depends on a small number of linear directions of its input data. Call a function $g: \mathbb{R}^n \rightarrow \{-1,1\}$, a linear $k$-junta if it is completely determined by some $k$-dimensional subspace of the input space. A recent work of the authors showed that linear $k$-juntas are testable. Thus there exists an algorithm to distinguish between: 1. $f: \mathbb{R}^n \rightarrow \{-1,1\}$ which is a linear $k$-junta with surface area $s$, 2. $f$ is $\epsilon$-far from any linear $k$-junta with surface area $(1+\epsilon)s$, where the query complexity of the algorithm is independent of the ambient dimension $n$. Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm which given any $c>0$, $\epsilon>0$, distinguishes between 1. $f: \mathbb{R}^n \rightarrow \{-1,1\}$ has correlation at least $c$ with some linear $k$-junta with surface area $s$. 2. $f$ has correlation at most $c-\epsilon$ with any linear $k$-junta with surface area at most $s$. The query complexity of our tester is $k^{\mathsf{poly}(s/\epsilon)}$. Using our techniques, we also obtain a fully noise tolerant tester with the same query complexity for any class $\mathcal{C}$ of linear $k$-juntas with surface area bounded by $s$. As a consequence, we obtain a fully noise tolerant tester with query complexity $k^{O(\mathsf{poly}(\log k/\epsilon))}$ for the class of intersection of $k$-halfspaces (for constant $k$) over the Gaussian space. Our query complexity is independent of the ambient dimension $n$. Previously, no non-trivial noise tolerant testers were known even for a single halfspace.
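The two hypotheses of the tolerant test can be written out explicitly. The block below is only a sketch of the notation: the symbols $A$, $h$, $\gamma_n$ and $\mathcal{J}_{k,s}$ are introduced here for illustration and do not appear in the abstract, and the correlation is assumed to be taken under the standard Gaussian measure, which the abstract suggests ("over the Gaussian space") but does not state for this definition.

```latex
% A linear k-junta depends on x only through a k-dimensional linear projection
% (A and h are illustrative names):
g(x) = h(Ax), \qquad A \in \mathbb{R}^{k \times n}, \quad h : \mathbb{R}^k \to \{-1,1\}.

% Correlation between f and g (assumption: x is standard Gaussian, x ~ \gamma_n):
\mathrm{corr}(f,g) = \mathbb{E}_{x \sim \gamma_n}\bigl[ f(x)\, g(x) \bigr].

% With \mathcal{J}_{k,s} denoting the linear k-juntas of surface area at most s,
% the tolerant tester distinguishes
\sup_{g \in \mathcal{J}_{k,s}} \mathrm{corr}(f,g) \ge c
\qquad \text{from} \qquad
\sup_{g \in \mathcal{J}_{k,s}} \mathrm{corr}(f,g) \le c - \epsilon .
```

Under this reading, the non-tolerant result quoted earlier in the abstract corresponds to the case where $f$ itself is a linear $k$-junta.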

翻訳日:2022-12-10 04:26:39 公開日:2021-01-12

# DeepRx: 完全な畳み込み型ディープラーニングレシーバー

DeepRx: Fully Convolutional Deep Learning Receiver ( http://arxiv.org/abs/2005.01494v2 )

ライセンス: Link先を確認

Mikko Honkala, Dani Korpi, Janne M.J. Huttunen

(参考訳) ディープラーニングは、ヒューリスティックなアルゴリズムに届かない多くの問題を解決した。現在の無線システムはよく理解されており、多くのタスクに最適なアルゴリズムが存在するにもかかわらず、無線通信にもうまく適用されている。受信機の個々の部分を学習することで得られる利得もあるが、受信機全体を共同で学習する方がよい。しかし、これはしばしば、最適解の実装が不可能な、難解な非線形問題をもたらす。そこで本研究では,周波数領域信号ストリームから符号化されていないビットへのレシーバパイプライン全体を実行する,完全畳み込みニューラルネットワークDeepRxを提案する。本研究では,畳み込みニューラルネットワークの入力を,データとパイロットシンボルの両方を用いて非常に特異的に構成することにより,正確なチャネル推定を容易にする。また、DeepRxは5Gシステムで使用されるチャネル符号化と互換性のあるソフトビットを出力する。 3GPP定義チャネルモデルを用いて、DeepRxが従来の手法より優れていることを示す。また,検出精度を向上させるために,未知のデータシンボルの既知の星座点と局所的なシンボル分布を利用するために,DeepRx学習による高い性能が期待できることを示す。

Deep learning has solved many problems that are out of reach of heuristic algorithms. It has also been successfully applied in wireless communications, even though the current radio systems are well-understood and optimal algorithms exist for many tasks. While some gains have been obtained by learning individual parts of a receiver, a better approach is to jointly learn the whole receiver. This, however, often results in a challenging nonlinear problem, for which the optimal solution is infeasible to implement. To this end, we propose a deep fully convolutional neural network, DeepRx, which executes the whole receiver pipeline from frequency domain signal stream to uncoded bits in a 5G-compliant fashion. We facilitate accurate channel estimation by constructing the input of the convolutional neural network in a very specific manner using both the data and pilot symbols. Also, DeepRx outputs soft bits that are compatible with the channel coding used in 5G systems. Using 3GPP-defined channel models, we demonstrate that DeepRx outperforms traditional methods. We also show that the high performance can likely be attributed to DeepRx learning to utilize the known constellation points of the unknown data symbols, together with the local symbol distribution, for improved detection accuracy.

翻訳日:2022-12-07 01:40:06 公開日:2021-01-12

# BERTと従来の機械学習テキスト分類の比較

Comparing BERT against traditional machine learning text classification ( http://arxiv.org/abs/2005.13012v2 )

ライセンス: Link先を確認

Santiago González-Carvajal and Eduardo C. Garrido-Merchán

(参考訳) 近年、BERTモデルは、人間の監督なしに教師付きテキスト分類などの複数のNLPタスクに対処できる、最先端の機械学習モデルとして人気を博している。優れた成果をもたらすあらゆるタイプのコーパスに対処する柔軟性は、このアプローチをアカデミックだけでなく、業界でも非常に人気があります。しかし、成功して何年にもわたって多くの異なるアプローチが使われてきた。本研究では,BERTを初めて紹介し,古典的NLPアプローチについて概説する。そして,従来のTF-IDFボキャブラリに対するBERTの振る舞いを機械学習アルゴリズムに入力する,さまざまなシナリオを扱う一連の実験を経験的にテストした。本研究の目的は,NLPタスクのデフォルトとしてBERTの使用を支持するか拒否する経験的証拠を追加することである。実験では、BERTの優位性と、NLP問題で使用されるデフォルト技術としてBERTを使用する経験的証拠を付加するテキスト言語のような、NLP問題の特徴の独立性を示す。

The BERT model has emerged in recent years as a popular state-of-the-art machine learning model that is able to cope with multiple NLP tasks, such as supervised text classification, without human supervision. Its flexibility to cope with any type of corpus while delivering great results has made this approach very popular not only in academia but also in industry. However, many different approaches have been used successfully over the years. In this work, we first present BERT and include a short review of classical NLP approaches. Then, we empirically test, with a suite of experiments covering different scenarios, the behaviour of BERT against the traditional TF-IDF vocabulary fed to machine learning algorithms. The purpose of this work is to add empirical evidence to support or reject the use of BERT as a default on NLP tasks. Experiments show the superiority of BERT and its independence from features of the NLP problem, such as the language of the text, adding empirical evidence for using BERT as a default technique in NLP problems.

翻訳日:2022-11-28 23:22:17 公開日:2021-01-12

# GAN(Generative Adversarial Network)を用いた胸部X線画像におけるGANによる肺炎およびCOVID-19の検出

Data Augmentation using Generative Adversarial Networks (GANs) for GAN-based Detection of Pneumonia and COVID-19 in Chest X-ray Images ( http://arxiv.org/abs/2006.03622v2 )

ライセンス: Link先を確認

Saman Motamed and Patrik Rogalla and Farzad Khalvati

(参考訳) 畳み込みニューラルネットワーク(CNN)のトレーニングの成功には、かなりの量のデータが必要である。小さなデータセットでは、ネットワークは一般的でない。データ拡張技術は、既存のトレーニングデータをより効果的に利用することにより、ニューラルネットワークの一般化性を向上させる。しかし、標準的なデータ拡張手法は、限定可能な代替データを生成する。 GAN(Generative Adversarial Networks)は,新たなデータ生成とCNNの性能向上に利用されている。それでも、ganのトレーニングのためのデータ拡張技術は、cnnと比較して未検討である。そこで本研究では, 造血モデルを用いて, 肺炎およびcovid-19の半監督検出のための胸部x線増倍用ganアーキテクチャを提案する。提案したGANは, 肺炎, COVID-19の胸部X線検査において, データを効果的に増強し, 疾患の分類精度を向上させるのに有用である。我々は、GANモデルとDeep Convolutional GANと、2つの異なるX線データセット上の従来の拡張方法(回転、ズーム等)を比較し、GANに基づく拡張手法が、X線画像の異常を検出するための他の拡張方法を上回ることを示す。

Successful training of convolutional neural networks (CNNs) requires a substantial amount of data. With small datasets, networks generalize poorly. Data augmentation techniques improve the generalizability of neural networks by using existing training data more effectively. Standard data augmentation methods, however, produce limited plausible alternative data. Generative Adversarial Networks (GANs) have been utilized to generate new data and improve the performance of CNNs. Nevertheless, data augmentation techniques for training GANs are under-explored compared to CNNs. In this work, we propose a new GAN architecture for augmentation of chest X-rays for semi-supervised detection of pneumonia and COVID-19 using generative models. We show that the proposed GAN can be used to effectively augment data and improve classification accuracy of disease in chest X-rays for pneumonia and COVID-19. We compare our augmentation GAN model with Deep Convolutional GAN and traditional augmentation methods (rotation, zoom, etc.) on two different X-ray datasets and show that our GAN-based augmentation method surpasses other augmentation methods for training a GAN to detect anomalies in X-ray images.

翻訳日:2022-11-25 03:43:46 公開日:2021-01-12

# デバイス社会における分類課題のための連帯学習と連続学習

Federated and continual learning for classification tasks in a society of devices ( http://arxiv.org/abs/2006.07129v2 )

ライセンス: Link先を確認

Fernando E. Casado, Dylan Lema, Roberto Iglesias, Carlos V. Regueiro, Senén Barro

(参考訳) 現在、デバイスはますます相互接続され、センサ化され、ほぼユビキタスな状況にあります。近年、ディープラーニングは、これらのデバイスが収集できる膨大なデータから知識を抽出する一般的な方法となっている。それにもかかわらず、集中型最先端学習手法は、利用可能な情報が通常プライベートで部分的、偏りがあり、時間とともに進化する、真の分散問題に直面したときに多くの欠点がある。フェデレートラーニング(Federated Learning)は、複数の分散デバイスがリモートで、協調的にモデルをトレーニングし、データのプライバシを保存するための人気のフレームワークである。しかし、現在のフェデレートラーニングにおける提案は、スマートフォンなどの非専用デバイスで実装できないようなディープアーキテクチャに重点を置いている。また、時間とともにデータ分布が予期せぬ方法で変化し、概念ドリフト(concept drift)と呼ばれる現象を引き起こすシナリオについてはほとんど研究されていない。したがって、本研究では、軽量で伝統的な学習者を用いた新しいフェデレーションおよび継続型アーキテクチャである、ライトフェデレーションおよび継続型コンセンサス(LFedCon2)を提示したい。当社の手法では,スマートフォンやロボットなどの無力デバイスが,ローカル,継続的,自律的,あるいはユーザからリアルタイムに学習することができると同時に,クラウド上でのモデルの改善も実現している。提案手法をスマートフォン利用者の異種コミュニティに適用し,歩行認識の課題を解決した。この結果は、LFedCon2が他の最先端メソッドに対してもたらす利点を示している。

Today we live in a context in which devices are increasingly interconnected, sensorized, and almost ubiquitous. In recent years, deep learning has become a popular way to extract knowledge from the huge amount of data that these devices are able to collect. Nevertheless, centralized state-of-the-art learning methods have a number of drawbacks when facing real distributed problems, in which the available information is usually private, partial, biased, and evolving over time. Federated learning is a popular framework that allows multiple distributed devices to train models remotely and collaboratively while preserving data privacy. However, the current proposals in federated learning focus on deep architectures that in many cases are not feasible to implement on non-dedicated devices such as smartphones. Also, little research has been done regarding the scenario where the data distribution changes over time in unforeseen ways, causing what is known as concept drift. Therefore, in this work we present Light Federated and Continual Consensus (LFedCon2), a new federated and continual architecture that uses light, traditional learners. Our method allows low-power devices (such as smartphones or robots) to learn in real time, locally, continuously, autonomously, and from users, while also improving models globally, in the cloud, by combining what is learned locally on the devices. To test our proposal, we applied it to a heterogeneous community of smartphone users to solve the problem of walking recognition. The results show the advantages that LFedCon2 provides with respect to other state-of-the-art methods.

翻訳日:2022-11-22 04:26:48 公開日:2021-01-12

# ニューラル非線形トラッキング

Neural Non-Rigid Tracking ( http://arxiv.org/abs/2006.13240v2 )

ライセンス: Link先を確認

Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Angela Dai, Justus Thies, Matthias Nießner

(参考訳) そこで我々は, 最先端の非剛性復元を, 頑健な最適化によって実現できる新しい, エンドツーエンドで学習可能な, 微分可能な非剛性トラッカーを提案する。非剛体移動物体の2つの入力RGB-Dフレームを考慮し、畳み込みニューラルネットワークを用いて、密度の高い対応とその信頼性を予測する。これらの対応は、ARAP(as-rigid-as-possible)最適化問題における制約として用いられる。重み付き非線形最小二乗解法によって勾配バックプロパゲーションを可能にすることにより,非剛性追跡のタスクに最適であるように,エンドツーエンドで対応と信頼度を学習することができる。この定式化の下では、自己超越(self-supervision)を通じて対応信頼度を学習し、学習された堅牢な最適化を通知する。最新の手法と比較して,提案アルゴリズムは再構築性能を向上し,同時に85倍の精度で対応予測を行う。コードを利用可能にします。

We introduce a novel, end-to-end learnable, differentiable non-rigid tracker that enables state-of-the-art non-rigid reconstruction by a learned robust optimization. Given two input RGB-D frames of a non-rigidly moving object, we employ a convolutional neural network to predict dense correspondences and their confidences. These correspondences are used as constraints in an as-rigid-as-possible (ARAP) optimization problem. By enabling gradient back-propagation through the weighted non-linear least squares solver, we are able to learn correspondences and confidences in an end-to-end manner such that they are optimal for the task of non-rigid tracking. Under this formulation, correspondence confidences can be learned via self-supervision, informing a learned robust optimization, where outliers and wrong correspondences are automatically down-weighted to enable effective tracking. Compared to state-of-the-art approaches, our algorithm shows improved reconstruction performance, while simultaneously achieving 85 times faster correspondence prediction than comparable deep-learning based methods. We make our code available.

翻訳日:2022-11-17 22:52:49 公開日:2021-01-12

# 機能拡張リワード学習 : 人間の入力を再考する

Feature Expansive Reward Learning: Rethinking Human Input ( http://arxiv.org/abs/2006.13208v2 )

ライセンス: Link先を確認

Andreea Bobu, Marius Wiggert, Claire Tomlin, Anca D. Dragan

(参考訳) ロボットがタスクを実行する方法に満足していない場合、それを修正するために介入することができる。逆学習法により、ロボットは人間の入力に基づいて報酬関数をオンラインで適応できるが、手作りの機能に依存している。これらの特徴で補正が説明できない場合、deep inverse reinforcement learning (irl)の最近の研究は、ロボットがタスクのデモンストレーションを要求し、生の状態空間上で定義された報酬を回収できることを示唆している。私たちの洞察では、デモから欠けている機能について暗黙的に学ぶのではなく、ロボットは、何が欠けているかを明示的に教えるデータを求めるべきである。そこで我々は,ロボットが教えている特徴が表現されていない状態からロボットを誘導する新しいタイプの人間入力を紹介した。本稿では,生の状態空間から特徴を学習し,それを報酬関数に統合するアルゴリズムを提案する。人間の入力を欠落した特徴に焦点を合わせることで、サンプルの複雑さを減らし、上記の深いIRLベースラインに対する学習報酬の一般化を改善する。本研究は,7dofロボットマニピュレータを用いた実験や,シミュレーション環境でのユーザ実験で紹介する。

When a person is not satisfied with how a robot performs a task, they can intervene to correct it. Reward learning methods enable the robot to adapt its reward function online based on such human input, but they rely on handcrafted features. When the correction cannot be explained by these features, recent work in deep Inverse Reinforcement Learning (IRL) suggests that the robot could ask for task demonstrations and recover a reward defined over the raw state space. Our insight is that rather than implicitly learning about the missing feature(s) from demonstrations, the robot should instead ask for data that explicitly teaches it about what it is missing. We introduce a new type of human input in which the person guides the robot from states where the feature being taught is highly expressed to states where it is not. We propose an algorithm for learning the feature from the raw state space and integrating it into the reward function. By focusing the human input on the missing feature, our method decreases sample complexity and improves generalization of the learned reward over the above deep IRL baseline. We show this in experiments with a physical 7DOF robot manipulator, as well as in a user study conducted in a simulated environment.

翻訳日:2022-11-17 21:41:11 公開日:2021-01-12

# ProVe -- ニューラルネットワークモデルに基づく自動製品置換とコールドスタートのためのセルフ教師付きパイプライン

ProVe -- Self-supervised pipeline for automated product replacement and cold-starting based on neural language models ( http://arxiv.org/abs/2006.14994v2 )

ライセンス: Link先を確認

Andrei Ionut Damian, Laurentiu Piciu, Cosmin Mihai Marinescu

(参考訳) 小売業の垂直産業では、企業は新しい購買行動への迅速な理解と適応の人間の限界に対処している。さらに、小売業は、製品・ブランド・カテゴリの大量選択を適切に管理する人的制限を克服する必要がある。これらの制限は、商業的(セールスの損失、顧客満足度の低下など)と運用的視点(外産、過剰生産など)の両方から欠陥をもたらす。本稿では,自然言語理解に基づくパイプラインアプローチを提案し,アウトオブストックの製品に最適な代替品を推奨する。さらに,取引履歴がほとんどない小売業者のポートフォリオに新たに導入された製品を管理するためのソリューションを提案する。このソリューションはビジネスに役立つ – 新製品を適切なカテゴリに自動的に割り当てる; 1日目からクロス販売を補完する製品を推奨する; 取引履歴がほとんどなくても販売予測を行う。最後に,本論文で提示したパイプラインを適用したベクトル空間モデルを,ディープラーニングに基づく需要予測ソリューションのセマンティック情報として直接利用し,より正確な予測を行う。研究と実験のプロセスは、実際のプライベートなトランザクションデータを使用して行われたが、ソースコードはhttps://github.com/lummetry/proveで入手できる。

In retail vertical industries, businesses face the human limitation of quickly understanding and adapting to new purchasing behaviors. Moreover, retail businesses need to overcome the human limitation of properly managing a massive selection of products, brands, and categories. These limitations lead to deficiencies from both commercial (e.g., loss of sales, decrease in customer satisfaction) and operational (e.g., out-of-stock, over-stock) perspectives. In this paper, we propose a pipeline approach based on Natural Language Understanding for recommending the most suitable replacements for products that are out of stock. Moreover, we propose a solution for managing products that were newly introduced in a retailer's portfolio with almost no transactional history. This solution helps businesses automatically assign new products to the right category, recommend complementary products for cross-sell from day 1, and perform sales predictions even with almost no transactional history. Finally, the vector space model resulting from the pipeline presented in this paper is used directly as semantic information in deep learning-based demand forecasting solutions, leading to more accurate predictions. The whole research and experimentation process was carried out using real-life private transactional data; however, the source code is available at https://github.com/Lummetry/ProVe

翻訳日:2022-11-16 20:57:17 公開日:2021-01-12

# 全体はパーツを上回っていますか? AI説明が相補的チームパフォーマンスに及ぼす影響

Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance ( http://arxiv.org/abs/2006.14779v3 )

ライセンス: Link先を確認

Gagan Bansal, Tongshuang Wu, Joyce Zhou, Raymond Fok, Besmira Nushi, Ece Kamar, Marco Tulio Ribeiro, Daniel S. Weld

(参考訳) 多くの研究者は、AIが推奨を説明すると、意思決定タスクにおける人間とAIチームのパフォーマンスが改善することを示す研究で説明可能なAIを動機付けている。しかし、以前の研究では、AIが人間と最高のチームの両方を上回った場合にのみ、説明による改善が観察された。チームの正確さが人間かAIの作業ソロよりも高い場合、説明は相補的なパフォーマンスにつながるか? 我々は3つのデータセットで混合手法のユーザスタディを行い、人間に匹敵する精度のAIは、参加者がタスク(ある条件下で自身を説明する)を解決するのに役立ちます。 ai拡張による補足的な改善が見られたが、説明では改善されなかった。むしろ説明は、人間がその正確性に関わらず、AIの推奨を受け入れる可能性を高めた。我々の結果は、人間中心のAIに新しい課題をもたらす。AIへの適切な信頼を促す説明的アプローチを開発することは可能か?

Many researchers motivate explainable AI with studies showing that human-AI team performance on decision-making tasks improves when the AI explains its recommendations. However, prior studies observed improvements from explanations only when the AI, alone, outperformed both the human and the best team. Can explanations help lead to complementary performance, where team accuracy is higher than either the human or the AI working solo? We conduct mixed-method user studies on three datasets, where an AI with accuracy comparable to humans helps participants solve a task (explaining itself in some conditions). While we observed complementary improvements from AI augmentation, they were not increased by explanations. Rather, explanations increased the chance that humans will accept the AI's recommendation, regardless of its correctness. Our result poses new challenges for human-centered AI: Can we develop explanatory approaches that encourage appropriate trust in AI, and therefore help generate (or improve) complementary performance?
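The notion of complementary performance used above reduces to a single comparison. The following is a minimal sketch; the function name and the accuracy values are made up for illustration and are not results from the paper.

```python
def is_complementary(team_acc, human_acc, ai_acc):
    """True when the human-AI team beats both the human alone and the AI alone."""
    return team_acc > max(human_acc, ai_acc)

# Made-up accuracies, for illustration only (not results from the paper).
print(is_complementary(team_acc=0.90, human_acc=0.84, ai_acc=0.85))  # True
print(is_complementary(team_acc=0.86, human_acc=0.80, ai_acc=0.88))  # False
```

The second call illustrates the situation the abstract attributes to prior studies: the team improves on the human, but the AI alone is still better, so the performance is not complementary.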

翻訳日:2022-11-16 20:37:15 公開日:2021-01-12

# 距離測度空間におけるルベーグ点の最も近い近傍特性

A Nearest Neighbor Characterization of Lebesgue Points in Metric Measure Spaces ( http://arxiv.org/abs/2007.03937v4 )

ライセンス: Link先を確認

Tommaso Cesari (TSE), Roberto Colomboni (IIT)

(参考訳) ほぼ全ての点がルベーグ点である性質は、近接近傍に基づくいくつかの分類アルゴリズムの一貫性に不可欠であることが証明されている。我々は,1-Nearest Neighbor回帰アルゴリズムを用いて点推定を行い,対応する収束問題におけるタイブレーキング規則で果たす役割を具体化する。次に、結果を応用して、ほぼすべての点がルベーグ点である一般距離空間において、大きな1-Nearest Neighbor分類アルゴリズムのリスクの収束を証明した。

The property of almost every point being a Lebesgue point has proven to be crucial for the consistency of several classification algorithms based on nearest neighbors. We characterize Lebesgue points in terms of a 1-Nearest Neighbor regression algorithm for pointwise estimation, fleshing out the role played by tie-breaking rules in the corresponding convergence problem. We then give an application of our results, proving the convergence of the risk of a large class of 1-Nearest Neighbor classification algorithms in general metric spaces where almost every point is a Lebesgue point.
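To make the role of tie-breaking concrete, here is a minimal sketch of 1-Nearest Neighbor prediction in an arbitrary metric space. The function names, the toy data, and the choice of uniform random tie-breaking are assumptions of this sketch; the paper studies tie-breaking rules in general, and this is only one possible rule.

```python
import random

def one_nn_predict(samples, query, dist, rng=None):
    """Return the label of a sample nearest to `query`.

    samples: list of (point, label) pairs
    dist:    a metric on points
    Ties between equidistant nearest neighbors are broken uniformly at
    random (one possible rule, chosen here for illustration).
    """
    rng = rng or random.Random(0)
    best = min(dist(p, query) for p, _ in samples)
    tied = [y for p, y in samples if dist(p, query) == best]
    return rng.choice(tied)

def abs_dist(a, b):
    """Usual metric on the real line."""
    return abs(a - b)

# Toy data on the real line (hypothetical).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
print(one_nn_predict(data, 0.9, abs_dist))  # unique nearest neighbor at 1.0
print(one_nn_predict(data, 0.5, abs_dist))  # tie between 0.0 and 1.0
```

At the query 0.5 the prediction depends entirely on the tie-breaking rule, which is the kind of situation where such rules matter for convergence.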

翻訳日:2022-11-12 12:47:32 公開日:2021-01-12

# 学習モデルにおける境界厚さと堅牢性

Boundary thickness and robustness in learning models ( http://arxiv.org/abs/2007.05086v2 )

ライセンス: Link先を確認

Yaoqing Yang, Rajiv Khanna, Yaodong Yu, Amir Gholami, Kurt Keutzer, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney

(参考訳) さまざまな敵対的および非敵対的腐敗に対する機械学習モデルの堅牢性は、依然として注目されている。本稿では,分類器の境界厚さの概念を導入し,その関係とモデルロバスト性の有用性について述べる。厚い決定境界は性能を向上し、薄い決定境界は過剰適合(例えば、トレーニングとテストの間の堅牢な一般化ギャップによって測定される)と低い堅牢性をもたらす。より厚い境界は、敵の例に対する堅牢性(例えば、敵の訓練の堅牢性テスト精度の向上)といわゆるアウト・オブ・ディストリビューション(OOD)変換の改善に役立つことを示す。理論的には、トレーニング中の境界の厚さを最大化することは、いわゆるミックスアップトレーニングに似ている。これらの結果から,混合訓練における雑音増強は境界厚さをさらに増加させ,様々な形態の敵攻撃やOOD変換に対する脆弱性に対処することを示した。また、最近の作業のいくつかの行のパフォーマンス改善は、より厚い境界と共に起こることも示せる。

Robustness of machine learning models to various adversarial and non-adversarial corruptions continues to be of interest. In this paper, we introduce the notion of the boundary thickness of a classifier, and we describe its connection with and usefulness for model robustness. Thick decision boundaries lead to improved performance, while thin decision boundaries lead to overfitting (e.g., measured by the robust generalization gap between training and testing) and lower robustness. We show that a thicker boundary helps improve robustness against adversarial examples (e.g., improving the robust test accuracy of adversarial training) as well as so-called out-of-distribution (OOD) transforms, and we show that many commonly-used regularization and data augmentation procedures can increase boundary thickness. On the theoretical side, we establish that maximizing boundary thickness during training is akin to the so-called mixup training. Using these observations, we show that noise-augmentation on mixup training further increases boundary thickness, thereby combating vulnerability to various forms of adversarial attacks and OOD transforms. We can also show that the performance improvement in several lines of recent work happens in conjunction with a thicker boundary.
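For readers unfamiliar with mixup training, which the abstract relates to boundary thickness, the following is a minimal sketch of the standard mixup step: two examples and their one-hot labels are combined with a weight drawn from a Beta distribution. The function name, the default `alpha`, and the toy inputs are assumptions of this sketch. It does not implement the noise-augmented variant mentioned in the abstract, whose details are not given here.

```python
import random

def mixup(x_i, y_i, x_j, y_j, alpha=1.0, rng=None):
    """Convexly combine two (input, one-hot label) pairs.

    lam ~ Beta(alpha, alpha); both inputs and labels are mixed with the
    same weight, as in standard mixup.
    """
    rng = rng or random.Random(0)
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam

# Toy two-dimensional inputs with two classes (hypothetical).
x, y, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
print(lam, x, y)
```

Because the mixed label is a convex combination of one-hot vectors, it still sums to one, so it can be used directly as a soft target.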

翻訳日:2022-11-12 03:40:10 公開日:2021-01-12

# 深い相互作用で再重み付けを学ぶ

Learning to Reweight with Deep Interactions ( http://arxiv.org/abs/2007.04649v2 )

ライセンス: Link先を確認

Yang Fan, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Jiang Bian, Tao Qin, Xiang-Yang Li

(参考訳) 近年,教師モデルを用いて,データ選択や損失関数設計などを通じて,学生モデル(実際のタスクで使用される)のトレーニングを指導する機械学習の概念が導入されている。教師モデルを用いてトレーニングデータを重み付けする特定の種類の授業であるリウェイトへの学習は、その単純さと有効性から多くの注目を集める。教師モデルでは,学習繰り返し数や学習モデルの損失/正確性などの浅面情報のみを学習/評価セットから活用するが,学習モデルの内部状態を無視し,学習結果の再重み付けの可能性を制限する。本研究では,教師モデルが教師モデルに内部状態を提供する改良データ重み付けアルゴリズムを提案し,教師モデルが学習サンプルの適応重み付けを返し,生徒モデルのトレーニングを強化する。教師モデルは、検証セットから伝播するメタ勾配を用いて、生徒モデルと共同で訓練される。クリーン/ノイズラベルとニューラルマシン翻訳を用いた画像分類実験により,従来の手法に比べて大きな改善が得られた。

Recently, the concept of teaching has been introduced into machine learning, in which a teacher model is used to guide the training of a student model (which will be used in real tasks) through data selection, loss function design, etc. Learning to reweight, which is a specific kind of teaching that reweights training data using a teacher model, receives much attention due to its simplicity and effectiveness. In existing learning to reweight works, the teacher model only utilizes shallow/surface information such as training iteration number and loss/accuracy of the student model from training/validation sets, but ignores the internal states of the student model, which limits the potential of learning to reweight. In this work, we propose an improved data reweighting algorithm, in which the student model provides its internal states to the teacher model, and the teacher model returns adaptive weights of training samples to enhance the training of the student model. The teacher model is jointly trained with the student model using meta gradients propagated from a validation set. Experiments on image classification with clean/noisy labels and neural machine translation empirically demonstrate that our algorithm makes significant improvement over previous methods.

翻訳日:2022-11-12 03:21:29 公開日:2021-01-12

# 非負行列分解のための逆アニーリング

Reverse Annealing for Nonnegative/Binary Matrix Factorization ( http://arxiv.org/abs/2007.05565v2 )

ライセンス: Link先を確認

John Golden, Daniel O'Malley

(参考訳) 近年、量子アニールはある種の行列分解アルゴリズムにおいて有効で高速なサブルーチンとして用いられることが示されている。量子アニーリングアルゴリズムは、迅速で近似的な答えに最適だったが、性能は急速に高まった。本稿では,非負行列分解問題に対する量子アニーリングサブルーチンの前方アニーリングの代わりに逆アニーリングを利用する。フォワードアニーリングによる最初のグローバルサーチの後、リバースアニーリングは、既存のソリューションを洗練する一連のローカルサーチを実行する。前と逆の焼鈍の組み合わせは、最も短い実行時間を除いて、前と逆の焼鈍だけでは性能が著しく向上する。

It was recently shown that quantum annealing can be used as an effective, fast subroutine in certain types of matrix factorization algorithms. The quantum annealing algorithm performed best for quick, approximate answers, but performance rapidly plateaued. In this paper, we utilize reverse annealing instead of forward annealing in the quantum annealing subroutine for nonnegative/binary matrix factorization problems. After an initial global search with forward annealing, reverse annealing performs a series of local searches that refine existing solutions. The combination of forward and reverse annealing significantly improves performance compared to forward annealing alone for all but the shortest run times.

翻訳日:2022-11-11 21:41:14 公開日:2021-01-12

# 正しい理由: 画像分類を堅牢にすること

Right for the Right Reason: Making Image Classification Robust ( http://arxiv.org/abs/2007.11924v2 )

ライセンス: Link先を確認

Anna Nguyen, Adrian Oberföll, Michael Färber

(参考訳) 画像データの分類における畳み込みニューラルネットワーク(CNN)の有効性を徹底的に実証した。人間への分類を説明するため,近年,分類証拠を可視化する手法が開発されている。これらの説明は、時々画像は正しく分類されるが、間違った理由、すなわち偶然の証拠に基づいて、正しく分類される。もちろん、画像が正しい理由、すなわち実際の証拠に基づいて正しく分類されることは望ましい。そこで本稿では,画像分類におけるオブジェクト整列説明量を測定するための新しい説明品質指標を提案する。オブジェクト検出手法、説明手法、ObAlExを用いて、実際の証拠に対するCNNの焦点を定量化する。さらに,CNNのさらなるトレーニングにより,CNNの精度を低下させることなく,CNNの焦点を向上できることを示す。

The effectiveness of Convolutional Neural Networks (CNNs) in classifying image data has been thoroughly demonstrated. In order to explain the classification to humans, methods for visualizing classification evidence have been developed in recent years. These explanations reveal that sometimes images are classified correctly, but for the wrong reasons, i.e., based on incidental evidence. Of course, it is desirable that images are classified correctly for the right reasons, i.e., based on the actual evidence. To this end, we propose a new explanation quality metric to measure object-aligned explanation in image classification, which we refer to as the ObAlEx metric. Using object detection approaches, explanation approaches, and ObAlEx, we quantify the focus of CNNs on the actual evidence. Moreover, we show that additional training of the CNNs can improve the focus of CNNs without decreasing their accuracy.

翻訳日:2022-11-07 12:03:31 公開日:2021-01-12

# 文文脈が単語意味に及ぼす影響を特徴づける:脳を行動にマッピングする

Characterizing the Effect of Sentence Context on Word Meanings: Mapping Brain to Behavior ( http://arxiv.org/abs/2007.13840v3 )

ライセンス: Link先を確認

N. Aguirre-Celis and R. Miikkulainen

(参考訳) 意味的特徴モデルはfMRIデータの予測と解釈に人気がある。特に、先行研究により、文読解におけるfMRIパターンの違いは、単語の意味的特徴表現における文脈依存的な変化によって説明できることが示されている。しかし、これらの変化を認識し、それに同意するかどうかは、明らかに疑問視されている。本論文は,人間-対象研究を通じてこの問題に答えることを目的とする。対象者は、特定の文で単語が使われたとき、単語が属する意味からどのように変化するか判断するよう求められた。判断は、偶然よりもはるかに高いモデル予測と一致した。その結果,単語の意味は文の文脈によって体系的に変化するという仮説を支持した。

Semantic feature models have become a popular tool for prediction and interpretation of fMRI data. In particular, prior work has shown that differences in the fMRI patterns in sentence reading can be explained by context-dependent changes in the semantic feature representations of the words. However, whether the subjects are aware of such changes and agree with them has been an open question. This paper aims to answer this question through a human-subject study. Subjects were asked to judge how the words changed from their generic meaning when the words were used in specific sentences. The judgements were consistent with the model predictions well above chance. Thus, the results support the hypothesis that word meanings change systematically depending on sentence context.

翻訳日:2022-11-06 07:33:57 公開日:2021-01-12

# マルチスケール特徴抽出による高効率膵分節化

Efficient, high-performance pancreatic segmentation using multi-scale feature extraction ( http://arxiv.org/abs/2009.00872v2 )

ライセンス: Link先を確認

Moritz Knolle (1 and 2), Georgios Kaissis (1 and 2 and 3 and 4), Friederike Jungmann (1), Sebastian Ziegelmayer (1), Daniel Sasse (1), Marcus Makowski (1), Daniel Rueckert (2 and 4), Rickmer Braren (1) ((1) Department of diagnostic and interventional Radiology, Technical University of Munich, Munich, Germany, (2) Institute for Artificial Intelligence and Data Science in Medicine and Healthcare, Technical University of Munich, Munich, Germany, (3) OpenMined Research, (4) Department of Computing, Imperial College London, London, United Kingdom)

(参考訳) 人工知能を用いた画像解析手法を臨床応用するには, 高性能アルゴリズムの開発が不可欠である。例えば、自然画像に基づく既存のセグメンテーションアルゴリズムは、パラメータの使用において効率的でなく、医用画像に最適化されていない。本稿では,高効率なマルチスケール画像特徴量利用による高性能化に焦点をあてた,高度に最適化されたニューラルネットワークに基づく膵分画アルゴリズムであるmonetを提案する。

For artificial intelligence-based image analysis methods to reach clinical applicability, the development of high-performance algorithms is crucial. For example, existing segmentation algorithms based on natural images are neither efficient in their parameter use nor optimized for medical imaging. Here we present MoNet, a highly optimized neural-network-based pancreatic segmentation algorithm focused on achieving high performance by efficient multi-scale image feature utilization.

翻訳日:2022-10-22 19:02:44 公開日:2021-01-12

# 大規模マルチタスク言語理解の測定

Measuring Massive Multitask Language Understanding ( http://arxiv.org/abs/2009.03300v3 )

ライセンス: Link先を確認

Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt

(参考訳) テキストモデルのマルチタスク精度を測定するための新しいテストを提案する。このテストは、初等数学、アメリカ史、コンピュータ科学、法律など、57のタスクをカバーする。このテストで高い精度を達成するためには、モデルは広範な世界知識と問題解決能力を持つ必要がある。近年のモデルではほぼランダム率の精度が高いが、最大のgpt-3モデルは平均で20ポイント近い確率でランダム確率を改善できることがわかった。しかし、57タスクのすべてにおいて、最高のモデルには、専門家レベルの精度に到達する前に、かなりの改善が必要である。モデルは性能も劣悪であり、いつ間違っているか分からないことが多い。さらに悪いことに、道徳や法のような社会的に重要な主題について、いまだにほぼランダムな正確さを持っている。モデルの学術的および専門的な理解の幅と深さを包括的に評価することにより、我々のテストは、多くのタスクにわたるモデルを分析し、重要な欠点を特定するのに使用できる。

We propose a new test to measure a text model's multitask accuracy. The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more. To attain high accuracy on this test, models must possess extensive world knowledge and problem solving ability. We find that while most recent models have near random-chance accuracy, the very largest GPT-3 model improves over random chance by almost 20 percentage points on average. However, on every one of the 57 tasks, the best models still need substantial improvements before they can reach expert-level accuracy. Models also have lopsided performance and frequently do not know when they are wrong. Worse, they still have near-random accuracy on some socially important subjects such as morality and law. By comprehensively evaluating the breadth and depth of a model's academic and professional understanding, our test can be used to analyze models across many tasks and to identify important shortcomings.

翻訳日:2022-10-21 02:04:21 公開日:2021-01-12

# Censored, Quantized, and Generalized Group ADMMを用いたコミュニケーション効率の高い分散学習

Communication Efficient Distributed Learning with Censored, Quantized, and Generalized Group ADMM ( http://arxiv.org/abs/2009.06459v2 )

ライセンス: Link先を確認

Chaouki Ben Issaid, Anis Elgabli, Jihong Park, Mehdi Bennis, Mérouane Debbah

(参考訳) 本稿では,相互接続作業者のネットワーク上で定義されたコンセンサス最適化問題を解決する,コミュニケーション効率のよい分散機械学習フレームワークを提案する。提案するアルゴリズムであるCensored and Quantized Generalized GADMM(CQ-GGADMM)は,グループ交代方向乗算器(GADMM)の作業者グループ化と分散学習の考え方を活用し,その適用性を一般化ネットワークトポロジに拡張し,量子化後の無視可能な更新のためのリンク検閲を取り入れた通信効率のフロンティアを推し進める。我々はCQ-GGADMMが局所目的関数がいくつかの軽度仮定の下で強く凸であるときに線形収束率を達成することを理論的に証明する。数値シミュレーションにより、cq-ggadmmは、検閲された分散admmとgadmmのワーカーグループ化法と比較して、通信ラウンド数で通信効率が高く、精度と収束速度を損なうことなく、エネルギー消費を伝達する。

In this paper, we propose a communication-efficient decentralized machine learning framework that solves a consensus optimization problem defined over a network of inter-connected workers. The proposed algorithm, Censored and Quantized Generalized GADMM (CQ-GGADMM), leverages the worker grouping and decentralized learning ideas of Group Alternating Direction Method of Multipliers (GADMM), and pushes the frontier in communication efficiency by extending its applicability to generalized network topologies, while incorporating link censoring for negligible updates after quantization. We theoretically prove that CQ-GGADMM achieves a linear convergence rate when the local objective functions are strongly convex under some mild assumptions. Numerical simulations corroborate that CQ-GGADMM exhibits higher communication efficiency in terms of the number of communication rounds and transmit energy consumption, without compromising accuracy or convergence speed, compared to the censored decentralized ADMM and the worker grouping method of GADMM.
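The idea of censoring negligible updates after quantization can be sketched as follows: a worker quantizes its new model, compares it with the value it last transmitted, and stays silent when the change is below a threshold. The function names, the uniform rounding quantizer, the Euclidean norm, and the fixed threshold are all assumptions of this sketch; the abstract does not specify the quantizer or the censoring rule used by CQ-GGADMM.

```python
import math

def quantize(vec, step):
    """Round each entry to a grid of width `step` (illustrative quantizer)."""
    return [step * round(v / step) for v in vec]

def censored_send(new_model, last_sent, step, threshold):
    """Return (payload, state).

    payload is None when the quantized update is censored; state is the
    value that neighbors should treat as this worker's current model.
    """
    q = quantize(new_model, step)
    change = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, last_sent)))
    if change < threshold:
        return None, last_sent  # negligible update: nothing is transmitted
    return q, q                 # transmit and remember the quantized model

# A small change is censored; a larger one is transmitted (toy values).
print(censored_send([1.02, 2.01], [1.0, 2.0], step=0.1, threshold=0.05))
print(censored_send([1.5, 2.0], [1.0, 2.0], step=0.1, threshold=0.05))
```

Each censored round saves one transmission on the corresponding links, which is the kind of saving that communication rounds and transmit energy would reflect.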

翻訳日:2022-10-18 12:06:57 公開日:2021-01-12

# ディープラーニングを用いたソースコードの自動生成と自動補完:現在の言語モデルに基づくアプローチの比較と検討

Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches ( http://arxiv.org/abs/2009.07740v4 )

ライセンス: Link先を確認

Juan Cruz-Benito, Sanjay Vishwakarma, Francisco Martin-Fernandez, Ismael Faro

(参考訳) 近年,言語モデルにおけるディープラーニングの利用が注目されている。いくつかの研究プロジェクトは、人間が書くと解釈できるテキストを生成することができ、多くのアプリケーション領域で新しい可能性を可能にすると主張している。言語処理に関連するさまざまな分野のうち、この種のモデリングを適用する際に最も注目すべきはプログラミング言語である。機械学習コミュニティは長年にわたり、このソフトウェアエンジニアリング領域を調査し、人間がプログラムするコードの自動補完、生成、修正、評価など、さまざまなアプローチの適用といった目標を追求してきた。 Deep-Learning対応の言語モデルアプローチの人気が高まっていることを踏まえ、異なるディープラーニングアーキテクチャを比較して、プログラミングコードに基づいて言語モデルを作成し、使用する経験的な論文が不足していることを発見した。本稿では,awd-lstms,awd-qrnns,transformerなどのニューラルネットワークアーキテクチャを比較し,トランスファーラーニングとトークン化を用いて,pythonデータセットを用いたコード生成とマスクタスクの補完を行う。結果を踏まえて,それぞれのアプローチの強みと弱み,言語モデルの評価や実際のプログラミングコンテキストに適用する上でのギャップについて検討する。

In recent years, the use of deep learning in language models has gained much attention. Some research projects claim that they can generate text that can be interpreted as human-written, enabling new possibilities in many application areas. Among the different areas related to language processing, one of the most notable in applying this type of modeling is programming languages. For years, the Machine Learning community has been researching this software engineering area, pursuing goals like applying different approaches to auto-complete, generate, fix, or evaluate code programmed by humans. Considering the increasing popularity of the Deep-Learning-enabled language models approach, we detected a lack of empirical papers that compare different deep learning architectures to create and use language models based on programming code. This paper compares different neural network architectures like AWD-LSTMs, AWD-QRNNs, and Transformer while using transfer learning and different tokenizations to see how they behave in building language models using a Python dataset for code generation and filling mask tasks. Considering the results, we discuss each approach's different strengths and weaknesses and what gaps we find to evaluate the language models or apply them in a real programming context.

翻訳日:2022-10-17 23:28:29 公開日:2021-01-12

# ニューロシンボリック神経変性疾患の確率論的プログラムによるモデル化

Neuro-symbolic Neurodegenerative Disease Modeling as Probabilistic Programmed Deep Kernels ( http://arxiv.org/abs/2009.07738v3 )

ライセンス: Link先を確認

Alexander Lavin

(参考訳) 神経変性疾患のパーソナライズされた予測モデリングのための,確率的プログラムによる深層学習手法を提案する。分析は、予測性能と、解釈可能性、不確実性推論、データ効率、ドメイン知識の活用といった重要な医療ai特性を評価する、ニューラルネットワークとシンボリック機械学習のアプローチのスペクトルを考察する。我々のベイズ的アプローチは、ガウス的プロセスの柔軟性とニューラルネットワークの構造的パワーを組み合わせてバイオマーカーの進行をモデル化する。アルツハイマー病予測の問題点について評価を行い、神経変性予測の正確性と時系列性の両方においてディープラーニングを上回り、ベイズ型非パラメトリックスと確率的プログラミングの実用的利点を生かした。

We present a probabilistic programmed deep kernel learning approach to personalized, predictive modeling of neurodegenerative diseases. Our analysis considers a spectrum of neural and symbolic machine learning approaches, which we assess for predictive performance and important medical AI properties such as interpretability, uncertainty reasoning, data-efficiency, and leveraging domain knowledge. Our Bayesian approach combines the flexibility of Gaussian processes with the structural power of neural networks to model biomarker progressions, without needing clinical labels for training. We run evaluations on the problem of Alzheimer's disease prediction, yielding results that surpass deep learning in both accuracy and timeliness of predicting neurodegeneration, and with the practical advantages of Bayesian nonparametrics and probabilistic programming.

翻訳日:2022-10-17 23:20:44 公開日:2021-01-12

# 経験的利得最大化による学習の枠組み

A Framework of Learning Through Empirical Gain Maximization ( http://arxiv.org/abs/2009.14250v2 )

ライセンス: Link先を確認

Yunlong Feng and Qiang Wu

(参考訳) 本稿では,裾の重い雑音や外れ値が応答変数に現れるような頑健な回帰問題に対処するために,経験的ゲイン最大化(EGM)の枠組みを開発する。 EGMの考え方は、通常のように真理関数を直接近似するのではなく、雑音分布の密度関数を近似することである。全ての観測を同等に重要視し、異常観測の存在下で問題となる古典的な最尤推定とは異なり、EGMスキームは最小距離推定の観点から解釈でき、これらの観測を無視することができる。さらに,いくつかのロバストな非凸回帰パラダイム(例えば,テューキー回帰や切断最小二乗回帰)を新しいフレームワークに再構成できることが示されている。そこで我々は,これら十分に確立されているが,完全には理解されていない回帰アプローチに対して,統一的な解析を行うことにより,EGMの学習理論を開発する。新しい枠組みから、既存の有界非凸損失関数の新たな解釈を結論付けることができる。この新しい枠組みでは、ロバスト回帰のための有名なテューキーのバイウェイト損失と非パラメトリックスムージングのためのトライウェイトカーネルという、一見無関係な2つの用語が密接に関連している。より正確には、テューキーのバイウェイト損失はトライウェイトカーネルから導出できることが示されている。同様に、切り詰められた二乗損失、Geman-McClure損失、指数二乗損失といった機械学習における有界な非凸損失関数は、統計学におけるある種の平滑化カーネルから再構成することもできる。さらに,新しいフレームワークにより,ロバスト学習のための新たな有界非凸損失関数の考案が可能となった。

We develop in this paper a framework of empirical gain maximization (EGM) to address the robust regression problem where heavy-tailed noise or outliers may be present in the response variable. The idea of EGM is to approximate the density function of the noise distribution instead of approximating the truth function directly as usual. Unlike the classical maximum likelihood estimation, which gives equal importance to all observations and could be problematic in the presence of abnormal observations, EGM schemes can be interpreted from a minimum distance estimation viewpoint and allow those observations to be ignored. Furthermore, it is shown that several well-known robust nonconvex regression paradigms, such as Tukey regression and truncated least square regression, can be reformulated into this new framework. We then develop a learning theory for EGM, by means of which a unified analysis can be conducted for these well-established but not fully understood regression approaches. The new framework yields a novel interpretation of existing bounded nonconvex loss functions. Within this new framework, two seemingly unrelated notions, the well-known Tukey's biweight loss for robust regression and the triweight kernel for nonparametric smoothing, are closely related. More precisely, it is shown that Tukey's biweight loss can be derived from the triweight kernel. Similarly, other frequently employed bounded nonconvex loss functions in machine learning such as the truncated square loss, the Geman-McClure loss, and the exponential squared loss can also be reformulated from certain smoothing kernels in statistics. In addition, the new framework enables us to devise new bounded nonconvex loss functions for robust learning.
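
The link between Tukey's biweight loss and the triweight kernel can be made concrete with the textbook forms of both functions. The sketch below uses the standard definitions with a scale parameter $c$; the notation and normalization are assumptions and may differ from those used in the paper.

```latex
\[
\rho_c(t) =
\begin{cases}
\dfrac{c^2}{6}\left[1 - \left(1 - \dfrac{t^2}{c^2}\right)^{3}\right], & |t| \le c,\\[2ex]
\dfrac{c^2}{6}, & |t| > c,
\end{cases}
\qquad
K(u) = \frac{35}{32}\,\bigl(1 - u^2\bigr)^{3}\,\mathbf{1}\{|u| \le 1\}
\]
\[
\rho_c(t) = \frac{c^2}{6}\left[1 - \frac{32}{35}\,K\!\left(\frac{t}{c}\right)\right]
\]
```

Under these definitions, minimizing the summed biweight loss over residuals is equivalent to maximizing the summed kernel values $K(r_i/c)$, which is one way to read the "gain maximization" viewpoint.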

翻訳日:2022-10-13 05:34:17 公開日:2021-01-12

# クロスコネクテッド$\psi$-netによる感度重み付け画像の位相からの定量的感受性マップの再構成

Reconstruction of Quantitative Susceptibility Maps from Phase of Susceptibility Weighted Imaging with Cross-Connected $\Psi$-Net ( http://arxiv.org/abs/2010.05395v3 )

ライセンス: Link先を確認

Zhiyang Lu, Jun Li, Zheng Li, Hongjian He, Jun Shi

(参考訳) 定量的感受性マッピング(QSM)は、磁気感受性を定量化する新しい位相ベースの手法である。既存のQSM再構成法は、通常、高品質の位相データに対する複雑な前処理を必要とする。本研究では,サセプティビティ重み付け画像(SWI)で生成された高域通過フィルタ位相データの新たな値について検討し,QSMをこれらの位相データから直接SWIに再構成するためのエンドツーエンドの$\Psi$-Net(C$\Psi$-Net)を開発する。 C$\Psi$-Net は古典的な U-Net の中間分岐を加えて$\Psi$-like 構造を形成する。特別に設計された拡張された相互作用ブロックは、このブランチの各レベルに埋め込まれ、より広い空間範囲の位相画像からより感受性情報を取得するための受容野を拡大する。さらに、クロスコネクションはブランチ間で利用され、C$\Psi$-Netでリッチなコンテキスト情報をキャプチャして正確な再構成を行うマルチレゾリューション機能融合スキームを実装している。ヒトのデータセットにおける実験結果は、c$\psi$-netが他のqsm再構成アルゴリズムよりも優れた性能を達成していることを示している。

Quantitative Susceptibility Mapping (QSM) is a new phase-based technique for quantifying magnetic susceptibility. The existing QSM reconstruction methods generally require complicated pre-processing on high-quality phase data. In this work, we propose to explore a new value of the high-pass filtered phase data generated in susceptibility weighted imaging (SWI), and develop an end-to-end Cross-connected $\Psi$-Net (C$\Psi$-Net) to reconstruct QSM directly from these phase data in SWI without additional pre-processing. C$\Psi$-Net adds an intermediate branch in the classical U-Net to form a $\Psi$-like structure. The specially designed dilated interaction block is embedded in each level of this branch to enlarge the receptive fields for capturing more susceptibility information from a wider spatial range of phase images. Moreover, the crossed connections are utilized between branches to implement a multi-resolution feature fusion scheme, which helps C$\Psi$-Net capture rich contextual information for accurate reconstruction. The experimental results on a human dataset show that C$\Psi$-Net achieves superior performance in our task over other QSM reconstruction algorithms.

翻訳日:2022-10-08 08:10:50 公開日:2021-01-12

# k-simplex2vec: node2vecの単体的拡張

k-simplex2vec: a simplicial extension of node2vec ( http://arxiv.org/abs/2010.05636v2 )

ライセンス: Link先を確認

Celia Hacker

(参考訳) 本稿では, ユークリッド特徴を単純な複合体に関連付ける新しい手法を提案し, 統計的および機械学習ツールの入力として利用する方法を提案する。この方法は、ノード2vecアルゴリズムを高次元の単純化に拡張し、単純複素体の構造やグラフ内の高次相互作用に関する洞察を与える。

We present a novel method of associating Euclidean features to simplicial complexes, providing a way to use them as input to statistical and machine learning tools. This method extends the node2vec algorithm to simplices of higher dimensions, providing insight into the structure of a simplicial complex, or into the higher-order interactions in a graph.

翻訳日:2022-10-08 08:02:01 公開日:2021-01-12

# 機械学習力場

Machine Learning Force Fields ( http://arxiv.org/abs/2010.07067v2 )

ライセンス: Link先を確認

Oliver T. Unke, Stefan Chmiela, Huziel E. Sauceda, Michael Gastegger, Igor Poltavsky, Kristof T. Schütt, Alexandre Tkatchenko, Klaus-Robert Müller

(参考訳) 近年,計算化学における機械学習 (ML) の利用は,従来の電子構造手法の計算複雑性により,これまでも多くの進歩を遂げてきた。最も有望な応用の1つは、ab initio法の精度と古典的なFFの効率のギャップを狭めることを目的としたMLベースの力場(FF)の構築である。鍵となるアイデアは、化学結合や関連する相互作用に関する知識の先入観に頼らずに、化学構造とポテンシャルエネルギーの間の統計的関係を学ぶことである。このような普遍的なML近似は、原則として訓練に使用される参照データの品質と量によってのみ制限される。本稿ではML-FFの応用の概要とそれらから得られる化学的知見を紹介する。 ML-FFの基礎となる概念は詳細に説明され、それらをスクラッチから構築およびテストするためのステップバイステップガイドが提供される。このテキストは、次世代のML-FFによって克服されるであろう課題に関する議論で締めくくられている。

In recent years, the use of Machine Learning (ML) in computational chemistry has enabled numerous advances previously out of reach due to the computational complexity of traditional electronic-structure methods. One of the most promising applications is the construction of ML-based force fields (FFs), with the aim to narrow the gap between the accuracy of ab initio methods and the efficiency of classical FFs. The key idea is to learn the statistical relation between chemical structure and potential energy without relying on a preconceived notion of fixed chemical bonds or knowledge about the relevant interactions. Such universal ML approximations are in principle only limited by the quality and quantity of the reference data used to train them. This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them. The core concepts underlying ML-FFs are described in detail and a step-by-step guide for constructing and testing them from scratch is given. The text concludes with a discussion of the challenges that remain to be overcome by the next generation of ML-FFs.

翻訳日:2022-10-07 14:03:01 公開日:2021-01-12

# 衛星周波数計画設計における深部強化学習の適用性と課題

Applicability and Challenges of Deep Reinforcement Learning for Satellite Frequency Plan Design ( http://arxiv.org/abs/2010.08015v2 )

ライセンス: Link先を確認

Juan Jose Garau Luis, Edward Crawley and Bruce Cameron

(参考訳) 深層強化学習(DRL)モデルの研究とベンチマークは、航空宇宙工学や通信を含む多くの産業でトレンドとなっている。これらの分野での最近の研究は、古典的アプローチが時間要件を満たしていない、あるいは最適解を得ることができない、複雑なリアルタイム意思決定問題に対処するこの種のモデルを提案する。 DRLモデルの優れた性能は特定のユースケースやシナリオに対して証明されているが、ほとんどの研究は実際の運用においてそのようなモデルの妥協や一般化可能性について論じていない。本稿では,DRLモデルの異なる要素のトレードオフと,それらが最終性能に与える影響について検討する。そこで我々は、マルチビーム衛星コンステレーションをユースケースとして、周波数計画設計(FPD)問題を選択し、それに対処するためのDRLモデルを提案する。ポリシ,ポリシオプティマイザ,状態,アクション,報酬表現,トレーニング環境という,パフォーマンスに大きな影響を与える6つのコア要素を特定した。これらの要素ごとに異なる選択肢を分析し、その効果を特徴づける。また、異なるシナリオを考慮に入れたり、環境を非定常にしたりするために、複数の環境も利用しています。以上の結果から,DRLは実業務におけるFPD問題,特に意思決定の高速化に対処する潜在的手法である可能性が示唆された。しかし、すべてのシナリオでDRLモデルが他のモデルよりも優れており、6つのコア要素のそれぞれに最適なアプローチは、運用環境の特徴に依存している。航空産業における将来的な複雑な問題を解決するためのDRLの可能性について合意する一方で、適切なモデルや訓練手順を設計することの重要性、それらのモデルの適用性を理解し、主な性能トレードオフを報告することについても考察する。

The study and benchmarking of Deep Reinforcement Learning (DRL) models has become a trend in many industries, including aerospace engineering and communications. Recent studies in these fields propose these kinds of models to address certain complex real-time decision-making problems in which classic approaches do not meet time requirements or fail to obtain optimal solutions. While the good performance of DRL models has been proved for specific use cases or scenarios, most studies do not discuss the compromises and generalizability of such models during real operations. In this paper we explore the tradeoffs of different elements of DRL models and how they might impact the final performance. To that end, we choose the Frequency Plan Design (FPD) problem in the context of multibeam satellite constellations as our use case and propose a DRL model to address it. We identify 6 different core elements that have a major effect in its performance: the policy, the policy optimizer, the state, action, and reward representations, and the training environment. We analyze different alternatives for each of these elements and characterize their effect. We also use multiple environments to account for different scenarios in which we vary the dimensionality or make the environment nonstationary. Our findings show that DRL is a potential method to address the FPD problem in real operations, especially because of its speed in decision-making. However, no single DRL model is able to outperform the rest in all scenarios, and the best approach for each of the 6 core elements depends on the features of the operation environment. While we agree on the potential of DRL to solve future complex problems in the aerospace industry, we also reflect on the importance of designing appropriate models and training procedures, understanding the applicability of such models, and reporting the main performance tradeoffs.

翻訳日:2022-10-07 04:44:07 公開日:2021-01-12

# 球面知識蒸留による教師・生徒ギャップの低減

Reducing the Teacher-Student Gap via Spherical Knowledge Disitllation ( http://arxiv.org/abs/2010.07485v5 )

ライセンス: Link先を確認

Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai

(参考訳) 知識蒸留は、より大きいものからマッピング関数を学習することで、コンパクトで効果的なモデルを得ることを目的としている。生徒の能力が限られているため、生徒は教師に不利になる。そのため、大容量教師からの蒸留では、学生のパフォーマンスが予想外に低下し、キャパシティギャップ問題と呼ばれた。本研究では,教師と学生の信頼のギャップについて検討する。知識蒸留には信頼度は必要とせず,学生が自信を習得せざるを得ない場合には,学生のパフォーマンスを損なう可能性がある。我々は,このギャップを明示的に解消するために球面知識蒸留法を提案する。この新しい知識表現は、はるかに大きな教師でコンパクトモデルを改善することができ、温度に対して堅牢である。 CIFAR100とImageNetの両方で実験を行い,大幅な改良を行った。具体的には、以前のSOTAよりも大幅に改善されたResNet18から73.0の精度をトレーニングし、生徒の約2倍のresnet34と同等である。実装はhttps://github.com/forjiuzhou/Spherical-Knowledge-Distillationで共有されている。

Knowledge distillation aims at obtaining a compact and effective model by learning the mapping function from a much larger one. Due to the limited capacity of the student, the student would underfit the teacher. Therefore, student performance would unexpectedly drop when distilling from an oversized teacher, termed the capacity gap problem. We investigate this problem by studying the gap in confidence between teacher and student. We find that the magnitude of confidence is not necessary for knowledge distillation and could harm the student performance if the student is forced to learn confidence. We propose Spherical Knowledge Distillation to eliminate this gap explicitly, which eases the underfitting problem. We find this novel knowledge representation can improve compact models with much larger teachers and is robust to temperature. We conducted experiments on both CIFAR100 and ImageNet, and achieved significant improvement. Specifically, we train ResNet18 to 73.0 accuracy, which is a substantial improvement over previous SOTA and is on par with ResNet34, which is almost twice the student size. The implementation has been shared at https://github.com/forjiuzhou/Spherical-Knowledge-Distillation.
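
To illustrate the idea that the magnitude of confidence can be removed before distillation, here is a minimal sketch. It is one plausible reading of the abstract and not the authors' implementation: the function names (`to_sphere`, `spherical_kd_loss`), the centering step, and the `radius` parameter are all assumptions made for illustration. Logits are centered and rescaled to a fixed norm, so only their direction affects the distillation loss.

```python
import math


def to_sphere(logits, radius=1.0):
    """Center the logits and rescale them to a fixed L2 norm (hypothetical helper)."""
    mean = sum(logits) / len(logits)
    centered = [z - mean for z in logits]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        # All logits equal: no direction to keep, return the origin.
        return [0.0] * len(logits)
    return [radius * c / norm for c in centered]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def spherical_kd_loss(teacher_logits, student_logits, radius=1.0, temperature=1.0):
    """Distillation loss computed on magnitude-normalized logits (assumed form)."""
    p = softmax(to_sphere(teacher_logits, radius), temperature)
    q = softmax(to_sphere(student_logits, radius), temperature)
    return kl_divergence(p, q)
```

In this sketch a teacher whose logits are ten times larger than the student's, but point in the same direction, contributes essentially zero loss, which is the sense in which the confidence gap is taken out of the objective.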

翻訳日:2022-10-07 03:24:59 公開日:2021-01-12

# 自然言語が情報と処理手順の両方を符号化するという考えに基づく自然言語理解の新しいアプローチ

New Approaches for Natural Language Understanding based on the Idea that Natural Language encodes both Information and its Processing Procedures ( http://arxiv.org/abs/2010.12789v3 )

ライセンス: Link先を確認

Limin Zhang

(参考訳) 自然言語は情報エンコーディングの方法であり、情報だけでなく、情報がどのように処理されるかの手順もエンコードする必要がある。自然言語を理解するには、コンピュータ言語を想像し設計するのと同じで、最初のステップは情報(またはデータ)と情報(またはデータ)の処理手順を分離することである。自然言語では、データの処理手順を構造チャンクとポインタチャンクとして直接符号化する(この論文では、語彙チャンクをデータチャンク、構造チャンク、ポインタチャンクに再分類する)。データ部分については属性情報の分類符号化システムと情報編成アーキテクチャ(情報集合の構成構造と情報集合の階層構造を含む)について論じた。第2節では、第2節で詳述した理論的な部分が実例で検証され、本論文の研究では、機械が対話で伝達される情報を理解することを目標としている。第4節では、"Understanding"の基本条件を要約し、"Understanding"とは何か、どのように進むべきかを再考する。本研究では,NLUの実用的,理論的基礎および研究手法について述べる。また、人工知能(AI)領域における大規模かつ多種類の情報処理にも適用することができる。

We must recognize that natural language is a way of information encoding, and it encodes not only the information but also the procedures for how information is processed. To understand natural language, just as when we conceive and design computer languages, the first step is to separate information (or data) and the processing procedures of information (or data). In natural language, some processing procedures of data are encoded directly as the structure chunk and the pointer chunk (this paper has reclassified lexical chunks as the data chunk, structure chunk, and the pointer chunk); some processing procedures of data are implied in sentence structures; some requests of processing procedures are expressed by information senders and processed by information receivers. For the data parts, the classification encoding system of attribute information and the information organization architecture (including constitutional structures of information sets and the hierarchy between the information sets) were discussed. In section 2, the theoretical part elaborated in section 2 has been verified in examples, which show that the studies in this paper have achieved the goal of enabling machines to understand the information conveyed in the dialogue. In section 4, the author summarizes the basic conditions of "Understanding", and rethinks what "Understanding" is and how to proceed. The study in this paper provides a practical, theoretical basis and research methods for NLU. It can also be applied in large-scale and multi-type information processing in the artificial intelligence (AI) area.

翻訳日:2022-10-03 12:52:20 公開日:2021-01-12

# 深層学習ニューラルネットワークを用いた放射線治療用線量予測におけるモンテカルロドロップアウトとブートストラップアグリゲーションの比較

A comparison of Monte Carlo dropout and bootstrap aggregation on the performance and uncertainty estimation in radiation therapy dose prediction with deep learning neural networks ( http://arxiv.org/abs/2011.00388v2 )

ライセンス: Link先を確認

Dan Nguyen, Azar Sadeghnejad Barkousaraie, Gyanendra Bohara, Anjali Balagopal, Rafe McBeth, Mu-Han Lin, Steve Jiang

(参考訳) 近年,人工知能技術とアルゴリズムが放射線治療における治療計画の進展に重点を置いている。これらは臨床ワークフローに取り入れられ始めているため、臨床医からの懸念は、モデルが正確かどうかではなく、その答えが正しいかどうか分からない場合に、モデルが人間のオペレータに表現できるかどうかである。深層学習モデルにおいてモンテカルロドロップアウト(mcdo)とブートストラップ凝集(bagging)技術を用いて放射線治療用線量予測のための不確実性推定を行う。我々は,両モデルとも合理的な不確実性マップを生成できることを示し,提案手法により,予測と関連する指標に対する解釈可能な不確実性と境界を生成する。性能面では,バグングは統計的に有意な損失値の減少と誤差をもたらす。ベージの追加により,ベースラインフレームワークと比較して,Dmeanでは0.34%,Dmaxでは0.19%のエラー削減が可能になった。全体として、ベージフレームワークは、ベースラインフレームワークの2.87のMAEとは対照的に、2.62のMAEをかなり低くした。バッグングの有用性は、単にパフォーマンスの観点からは、問題と許容できる予測誤差に大きく依存しており、トレーニング中の高い事前計算コストは、その使用が有利かどうかを決定するために考慮すべきである。不確かさを見積もったデプロイメントでは、どちらのフレームワークも、約12秒の時間で同じパフォーマンスを提供する。アンサンブルベースのメタヒューリスティックとして、既存の機械学習アーキテクチャを使って安定性とパフォーマンスを向上させることができ、MCDOはアーキテクチャの一部としてドロップアウトしたディープラーニングモデルに適用することができる。

Recently, artificial intelligence technologies and algorithms have become a major focus for advancements in treatment planning for radiation therapy. As these are starting to become incorporated into the clinical workflow, a major concern from clinicians is not whether the model is accurate, but whether the model can express to a human operator when it does not know if its answer is correct. We propose to use Monte Carlo dropout (MCDO) and the bootstrap aggregation (bagging) technique on deep learning models to produce uncertainty estimations for radiation therapy dose prediction. We show that both models are capable of generating a reasonable uncertainty map, and, with our proposed scaling technique, creating interpretable uncertainties and bounds on the prediction and any relevant metrics. Performance-wise, bagging provides a statistically significant reduction in loss value and errors in most of the metrics investigated in this study. The addition of bagging was able to further reduce errors by another 0.34% for Dmean and 0.19% for Dmax, on average, when compared to the baseline framework. Overall, the bagging framework provided a significantly lower MAE of 2.62, as opposed to the baseline framework's MAE of 2.87. The usefulness of bagging, from solely a performance standpoint, does highly depend on the problem and the acceptable predictive error, and its high upfront computational cost during training should be factored into deciding whether it is advantageous to use it. In terms of deployment with uncertainty estimations turned on, both frameworks offer the same performance time of about 12 seconds. As an ensemble-based metaheuristic, bagging can be used with existing machine learning architectures to improve stability and performance, and MCDO can be applied to any deep learning models that have dropout as part of their architecture.
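
The two uncertainty estimators compared in this abstract can be sketched on a toy linear model. This is an illustration only, not the dose-prediction networks from the paper: the model, the function names, and all default parameters below are assumptions. Monte Carlo dropout keeps dropout active at prediction time and summarizes repeated stochastic passes, while bagging refits the model on bootstrap resamples and summarizes the ensemble.

```python
import random
import statistics


def forward(x, weights, drop_prob=0.0, rng=None):
    """Toy linear model; each weight is dropped with probability drop_prob."""
    total = 0.0
    for xi, wi in zip(x, weights):
        if drop_prob > 0.0 and rng.random() < drop_prob:
            continue  # this weight is dropped for the current pass
        # Inverted-dropout scaling keeps the expected output unchanged.
        total += xi * wi / (1.0 - drop_prob)
    return total


def mc_dropout_predict(x, weights, n_samples=100, drop_prob=0.2, seed=0):
    """Mean and spread of repeated stochastic forward passes (MC dropout)."""
    rng = random.Random(seed)
    samples = [forward(x, weights, drop_prob, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.pstdev(samples)


def fit_slope(xs, ys):
    """Least-squares slope through the origin for 1-D data (xs must not be all zero)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def bagging_predict(xs, ys, x_new, n_models=25, seed=0):
    """Mean and spread of predictions from models fit on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        slope = fit_slope([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(slope * x_new)
    return statistics.mean(preds), statistics.pstdev(preds)
```

In both cases the standard deviation across samples serves as the uncertainty estimate. The sketch also reflects the cost trade-off mentioned above: bagging pays for several training runs up front, whereas MC dropout reuses a single trained model.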

翻訳日:2022-09-30 23:12:33 公開日:2021-01-12

# 仮説的介入による予測を可能にする因果法の検討

A scoping review of causal methods enabling predictions under hypothetical interventions ( http://arxiv.org/abs/2011.09815v2 )

ライセンス: Link先を確認

Lijing Lin, Matthew Sperrin, David A. Jenkins, Glen P. Martin, Niels Peek

(参考訳) 背景と目的: 予測モデルが通常開発される手法は、パラメータや予測を因果的に解釈するべきではないことを意味する。しかしながら、意思決定を支援するために予測モデルが使用される場合、仮定的な介入の下で結果を予測する必要性がしばしばある。本研究の目的は,仮説的介入による結果のリスク推定を可能にする予測モデルの開発・検証,因果推論の活用,主要な方法論的アプローチ,基礎となる仮定,目標推定,本手法を用いた潜在的な落とし穴や課題,未解決の方法論的課題の検証である。方法:2019年12月までに刊行された文献を体系的にレビューし,仮説的介入による予測モデルの使用を可能にするために因果的考察を用いた健康領域の論文を考察した。結果: データベース検索により4919件の論文を同定し, さらに115件の論文を手作業で検索し, その内13件を統計的および機械学習の文献から抽出した。観測データから因果推定を行う手法の多くは,境界構造モデルとg推定に基づいていた。結論: 臨床予測モデルへの仮説的介入下での予測を可能にする方法は2つある。 1)臨床試験及びメタアナリシスから推定される因果効果による観察研究に由来する予測モデルの改善 2)観測データから直接予測モデルと因果効果を推定する。これらの方法は、ダイナミックな治療体制への拡張と、臨床決定支援システムを運用するための複数の介入を考慮する必要がある。因果予測モデル」を検証する技術はまだ初期段階にある。

Background and Aims: The methods with which prediction models are usually developed mean that neither the parameters nor the predictions should be interpreted causally. However, when prediction models are used to support decision making, there is often a need for predicting outcomes under hypothetical interventions. We aimed to identify published methods for developing and validating prediction models that enable risk estimation of outcomes under hypothetical interventions, utilizing causal inference: their main methodological approaches, underlying assumptions, targeted estimands, and potential pitfalls and challenges with using the method, and unresolved methodological challenges. Methods: We systematically reviewed literature published by December 2019, considering papers in the health domain that used causal considerations to enable prediction models to be used for predictions under hypothetical interventions. Results: We identified 4919 papers through database searches and a further 115 papers through manual searches, of which 13 were selected for inclusion, from both the statistical and the machine learning literature. Most of the identified methods for causal inference from observational data were based on marginal structural models and g-estimation. Conclusions: There exist two broad methodological approaches for allowing prediction under hypothetical intervention into clinical prediction models: 1) enriching prediction models derived from observational studies with estimated causal effects from clinical trials and meta-analyses; and 2) estimating prediction models and causal effects directly from observational data. These methods require extending to dynamic treatment regimes, and consideration of multiple interventions to operationalise a clinical decision support system. Techniques for validating 'causal prediction models' are still in their infancy.

翻訳日:2022-09-23 21:44:02 公開日:2021-01-12

# ロボットにおける「能動的自己」のための感覚運動表現学習--モデル調査

Sensorimotor representation learning for an "active self" in robots: A model survey ( http://arxiv.org/abs/2011.12860v2 )

ライセンス: Link先を確認

Phuong D.H. Nguyen, Yasmin Kim Georgie, Ezgi Kayhan, Manfred Eppe, Verena Vanessa Hafner, and Stefan Wermter

(参考訳) 安全な人間とロボットの相互作用は、ロボットが「人間」の世界で適切に振る舞う方法を学ばなければならないため、操作のための厳格なルールを提供するのではなく、動的で非構造的な環境によって引き起こされる課題に対処する必要がある。人間では、これらの能力は私たちの身体を宇宙で知覚し、運動中の手足の位置を感知し、他の物体やエージェントを認識し、身体の一部が故意に相互作用するように制御する能力と関係していると考えられている。バイオインスパイアされた能力を持つ次世代ロボットについて,まず,身体スキーマの感覚表現,対人空間,人間の活動的自己など,これらの能力の根底にあるメカニズムの発達過程を概観する。第二に、これらの感覚表現のロボットモデルと自己のロボットモデルについての調査を行い、これらのモデルを人間モデルと比較する。最後に,これらのロボットモデルに欠けているものを解析し,自己爆発による感覚表現を発達させることにより,人工エージェントにおける自己感覚の出現を可能にするための理論的計算枠組みを提案する。

Safe human-robot interactions require robots to be able to learn how to behave appropriately in spaces populated by people and thus to cope with the challenges posed by our dynamic and unstructured environment, rather than being provided a rigid set of rules for operations. In humans, these capabilities are thought to be related to our ability to perceive our body in space, sensing the location of our limbs during movement, being aware of other objects and agents, and controlling our body parts to interact with them intentionally. Toward the next generation of robots with bio-inspired capacities, in this paper, we first review the developmental processes of underlying mechanisms of these abilities: The sensory representations of body schema, peripersonal space, and the active self in humans. Second, we provide a survey of robotics models of these sensory representations and robotics models of the self; and we compare these models with the human counterparts. Finally, we analyse what is missing from these robotics models and propose a theoretical computational framework, which aims to allow the emergence of the sense of self in artificial agents by developing sensory representations through self-exploration.

翻訳日:2022-09-21 03:21:12 公開日:2021-01-12

# (参考訳) GANにおけるWasserstein距離の一般化実装に向けて

Towards Generalized Implementation of Wasserstein Distance in GANs ( http://arxiv.org/abs/2012.03420v2 )

ライセンス: CC BY 4.0

Minkai Xu, Zhiming Zhou, Guansong Lu, Jian Tang, Weinan Zhang, Yong Yu

(参考訳) ワッサーシュタイン距離のカントロヴィチ・ルビンシュタイン(KR)双対性に基づいて構築されたワッサーシュタイン GAN (WGAN) は、理論上最も健全なGANモデルの一つである。しかし実際には、GANの他の変種よりも常に優れているわけではない。これは主にKR双対性によって要求されるリプシッツ条件の不完全な実装のためである。リプシッツ制約の異なる実装でコミュニティで大規模な作業が行われてきたが、実際にはその制約を完全に満たすのは難しい。本稿では,強いリプシッツ制約が最適化に不要である可能性を論じる。その代わり、一歩後退して、リプシッツ制約を緩和しようとする。理論的には、ワッサーシュタイン距離のより一般的な双対形式であるソボレフ双対性は、リプシッツの制約を緩和するが、ワッサーシュタイン距離の好ましい勾配特性を維持している。さらに、KR双対性は実際にはソボレフ双対性の特別な場合であることを示す。さらに, 緩和双対性に基づき, sobolev wasserstein gan (swgan) という一般化した wgan トレーニングスキームを提案し, 既存の手法に対する swgan の改善を広範囲な実験で実証した。

Wasserstein GANs (WGANs), built upon the Kantorovich-Rubinstein (KR) duality of Wasserstein distance, are among the most theoretically sound GAN models. However, in practice they do not always outperform other variants of GANs. This is mostly due to the imperfect implementation of the Lipschitz condition required by the KR duality. Extensive work has been done in the community on different implementations of the Lipschitz constraint, yet it remains hard to satisfy the restriction perfectly in practice. In this paper, we argue that the strong Lipschitz constraint might be unnecessary for optimization. Instead, we take a step back and try to relax the Lipschitz constraint. Theoretically, we first demonstrate a more general dual form of the Wasserstein distance called the Sobolev duality, which relaxes the Lipschitz constraint but still maintains the favorable gradient property of the Wasserstein distance. Moreover, we show that the KR duality is actually a special case of the Sobolev duality. Based on the relaxed duality, we further propose a generalized WGAN training scheme named Sobolev Wasserstein GAN (SWGAN), and empirically demonstrate the improvement of SWGAN over existing methods with extensive experiments.
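
For reference, the KR duality that the abstract refers to has the following standard form. The symbols $P_r$ (data distribution), $P_g$ (generator distribution), and $f$ (critic) are notation assumed here and are not taken from the paper. The Sobolev duality proposed by the authors is not reproduced, since its exact form is not stated in the abstract.

```latex
\[
W_1(P_r, P_g) \;=\; \sup_{\|f\|_{L} \le 1}\;
\mathbb{E}_{x \sim P_r}\bigl[f(x)\bigr] \;-\; \mathbb{E}_{x \sim P_g}\bigl[f(x)\bigr]
\]
```

The supremum runs over all 1-Lipschitz functions $f$. This is the constraint that WGAN implementations approximate in practice and that the paper proposes to relax.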

翻訳日:2021-05-21 09:41:39 公開日:2021-01-12

# (参考訳) フラグメントに基づく生成モデルによる分子最適化

Molecule Optimization via Fragment-based Generative Models ( http://arxiv.org/abs/2012.04231v2 )

ライセンス: CC BY 4.0

Ziqi Chen, Martin Renqiang Min, Srinivasan Parthasarathy, Xia Ning

(参考訳) 創薬において、分子最適化は、望ましい薬物特性の観点から薬候補をより良いものにするための重要なステップである。近年の人工知能の進歩により、従来のin vitroプロセスはシリコアプローチによってますます促進されている。本稿では,計算量最適化分子に対する革新的シリコアプローチを提案し,深層生成モデルを用いて最適化された分子グラフを生成する問題を定式化する。我々の生成モデルはフラグメントベースの薬物設計の重要なアイデアに従い、小さなフラグメントを変更することで分子を最適化します。我々のモデルは、最適化されたフラグメントの特定方法と、良い性質と悪い性質を持つ分子の違いから、これらのフラグメントの修正方法を学ぶ。新しい分子を最適化するために、我々のモデルは、予測されたフラグメントの位置で最適化されたフラグメントをデコードするために学習信号を適用します。また、パイプライン内の各モデルが1つのフラグメントを最適化できるように、パイプライン内に複数のモデルを構築します。提案手法は, 分子類似性制約下で80%以上の特性改善, 高分子類似性制約下で10%以上の特性改善により, 他者よりも顕著に優れていることを示す。

In drug discovery, molecule optimization is an important step in order to modify drug candidates into better ones in terms of desired drug properties. With the recent advance of Artificial Intelligence, this traditionally in vitro process has been increasingly facilitated by in silico approaches. We present an innovative in silico approach to computationally optimizing molecules and formulate the problem as generating optimized molecular graphs via deep generative models. Our generative models follow the key idea of fragment-based drug design, and optimize molecules by modifying their small fragments. Our models learn how to identify the to-be-optimized fragments and how to modify such fragments by learning from the difference between molecules that have good and bad properties. In optimizing a new molecule, our models apply the learned signals to decode optimized fragments at the predicted location of the fragments. We also construct multiple such models into a pipeline such that each of the models in the pipeline is able to optimize one fragment, and thus the entire pipeline is able to modify multiple fragments of a molecule if needed. We compare our models with other state-of-the-art methods on benchmark datasets and demonstrate that our methods significantly outperform others with more than 80% property improvement under moderate molecular similarity constraints, and more than 10% property improvement under high molecular similarity constraints.

翻訳日:2021-05-17 09:32:38 公開日:2021-01-12

# 降雨レーダ画像と風況予測の融合による深層学習モデルの降雨ナウキャスティングへの応用

Fusion of rain radar images and wind forecasts in a deep learning model applied to rain nowcasting ( http://arxiv.org/abs/2012.05015v2 )

ライセンス: Link先を確認

Vincent Bouget and Dominique Béréziat and Julien Brajard and Anastase Charantonis and Arthur Filoche

(参考訳) 短期または中期の降雨予測は、農業管理や洪水リスクモニタリングといったいくつかの環境応用において主要な課題である。既存のデータ駆動アプローチ、特にディープラーニングモデルは、降雨レーダイメージのみを入力として、このタスクにおいて重要なスキルを示してきた。風などの気象パラメータが予測を改善するかどうかを判断するために,降雨レーダ画像と気象予報モデルによる風速の融合に関するディープラーニングモデルを訓練した。ネットワークはレーダーデータのみに基づいてトレーニングされた類似アーキテクチャと、基本的な永続化モデル、光学フローに基づくアプローチと比較された。地平線時間30分で予測する中・高降雨時の光流量をF1スコアで計算し, ネットワークの性能は8%向上した。さらに、降雨レーダイメージのみを使用してトレーニングされた同じアーキテクチャを7%上回っている。降雨量と風速データを組み合わせることでトレーニングプロセスを安定させ,特に降雨予測の難しい降雨量で大幅な改善が達成されている。

Short- or mid-term rainfall forecasting is a major task with several environmental applications such as agricultural management or flood risk monitoring. Existing data-driven approaches, especially deep learning models, have shown significant skill at this task, using only rainfall radar images as inputs. In order to determine whether using other meteorological parameters such as wind would improve forecasts, we trained a deep learning model on a fusion of rainfall radar images and wind velocity produced by a weather forecast model. The network was compared to a similar architecture trained only on radar data, to a basic persistence model and to an approach based on optical flow. Our network outperforms by 8% the F1-score calculated for the optical flow on moderate and higher rain events for forecasts at a horizon time of 30 min. Furthermore, it outperforms by 7% the same architecture trained using only rainfall radar images. Merging rain and wind data has also proven to stabilize the training process and enabled significant improvement especially on the difficult-to-predict high precipitation rainfalls.
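
The comparison against a persistence baseline using an F1-score can be sketched as follows. This is a toy illustration under stated assumptions: the function names, the flat list representation of a radar frame, and the rain threshold are hypothetical and are not taken from the paper. A persistence forecast simply reuses the last observed frame, and the F1-score measures how well thresholded rain events in the forecast match those in the observation.

```python
def persistence_forecast(last_frame):
    """Persistence baseline: the forecast is a copy of the last observed frame."""
    return list(last_frame)


def f1_score(forecast, observed, threshold):
    """F1-score of rain events, where an event is a pixel value >= threshold."""
    tp = fp = fn = 0
    for f, o in zip(forecast, observed):
        f_hit = f >= threshold
        o_hit = o >= threshold
        if f_hit and o_hit:
            tp += 1  # event forecast and observed
        elif f_hit:
            fp += 1  # false alarm
        elif o_hit:
            fn += 1  # missed event
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical rain rates for four pixels.
last_frame = [0.0, 1.2, 3.0, 0.1]
observed = [0.0, 2.0, 0.2, 1.5]
score = f1_score(persistence_forecast(last_frame), observed, threshold=1.0)
```

Scoring any other forecast against the same observed frame and threshold then gives a like-for-like comparison with this baseline.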

翻訳日:2021-05-16 01:47:19 公開日:2021-01-12

# マルチリード心電図信号からの27の異常の同定:Sign Loss関数を用いたアンサンブルSe-ResNetフレームワーク

Identification of 27 abnormalities from multi-lead ECG signals: An ensembled Se-ResNet framework with Sign Loss function ( http://arxiv.org/abs/2101.03895v2 )

ライセンス: Link先を確認

Zhaowei Zhu, Xiang Lan, Tingting Zhao, Yangming Guo, Pipin Kojodjojo, Zhuoyang Xu, Zhuo Liu, Siqi Liu, Han Wang, Xingzhi Sun, Mengling Feng

(参考訳) 心臓血管疾患は健康にとって大きな脅威であり、世界中の主要な死因の1つである。 12誘導心電図は、心臓の異常を識別するための安価で一般的なツールである。早期かつ正確な診断は、早期の治療と介入を可能にし、心血管疾患の重篤な合併症を予防する。PhysioNet/Computing in Cardiology Challenge 2020における我々の目的は,12誘導心電図記録から27個の心電図異常を自動的に識別するアルゴリズムを開発することである。

Cardiovascular disease is a major threat to health and one of the primary causes of death globally. The 12-lead ECG is a cheap and commonly accessible tool to identify cardiac abnormalities. Early and accurate diagnosis will allow early treatment and intervention to prevent severe complications of cardiovascular disease. In the PhysioNet/Computing in Cardiology Challenge 2020, our objective is to develop an algorithm that automatically identifies 27 ECG abnormalities from 12-lead ECG recordings.

翻訳日:2021-05-10 05:10:35 公開日:2021-01-12

# データ拡張ポリシとネットワークアーキテクチャの統合検索

Joint Search of Data Augmentation Policies and Network Architectures ( http://arxiv.org/abs/2012.09407v2 )

ライセンス: Link先を確認

Taiga Kashima, Yoshihiro Yamada, Shunta Saito

(参考訳) ディープニューラルネットワークをトレーニングする一般的なパイプラインは、データ拡張やネットワークアーキテクチャの選択など、いくつかのビルディングブロックで構成される。 AutoMLは、これらのパーツを自動的に設計することを目的とした研究分野だが、すべてのパーツを同時に探索することはより難しいため、ほとんどの手法は各パーツを独立して探索する。本稿では,トレーニングパイプラインの設計にさらなる自動化を実現するために,データ拡張ポリシーとネットワークアーキテクチャを統合的に最適化する手法を提案する。私たちのアプローチの核となる考え方は、全体を微分可能にすることです。提案手法は,拡張ポリシー探索とネットワークアーキテクチャ探索のための微分可能な手法を組み合わせることで,エンドツーエンドでそれらを同時に最適化する。実験の結果, 本手法は独立に探索した結果と同等以上の性能が得られることがわかった。

The common pipeline of training deep neural networks consists of several building blocks such as data augmentation and network architecture selection. AutoML is a research field that aims at automatically designing those parts, but most methods explore each part independently because it is more challenging to simultaneously search all the parts. In this paper, we propose a joint optimization method for data augmentation policies and network architectures to bring more automation to the design of training pipeline. The core idea of our approach is to make the whole part differentiable. The proposed method combines differentiable methods for augmentation policy search and network architecture search to jointly optimize them in the end-to-end manner. The experimental results show our method achieves competitive or superior performance to the independently searched results.

翻訳日:2021-05-02 07:36:44 公開日:2021-01-12

# ロバスト話者照合のための周波数選択付きマルチストリーム畳み込みニューラルネットワーク

Multi-stream Convolutional Neural Network with Frequency Selection for Robust Speaker Verification ( http://arxiv.org/abs/2012.11159v2 )

ライセンス: Link先を確認

Wei Yao, Shen Chen, Jiamin Cui, Yaolin Lou

(参考訳) 話者検証は、入力音声がクレーム話者に対応するかどうかを検証することを目的としており、従来は、特徴抽出器が全周波数範囲で動作する単一ストリームシナリオに基づいて、この種のシステムが展開されている。本稿では,完全周波数範囲ではなく部分周波数範囲を聴きながら分類タスクを行うのに十分な知識を機械が学べる,いわゆる周波数選択手法を仮定し,この手法を話者照合タスクに適用したマルチストリーム畳み込みニューラルネットワーク(cnn)の新たな枠組みを提案する。提案フレームワークは,複数のストリームから発生する多様な時間的埋め込みに対応し,音響モデリングの堅牢性を高める。時間的埋め込みの多様性については,周波数の完全帯域を複数のサブバンドに手作業で分割し,各ストリームの特徴抽出器が対象周波数領域として使用するサブバンドを選択することで,周波数選択による特徴拡張を検討する。従来の単一ストリームソリューションとは異なり、各発話は一度だけ処理されるが、このフレームワークでは複数のストリームが並列に処理される。各ストリームの入力発話は、所定の周波数範囲内の周波数セレクタによって前処理され、平均正規化により後処理される。各ストリームの正規化された時間埋め込みはプール層に流れ込み、融合した埋め込みを生成する。本稿では,voxcelebデータセットの広範な実験を行い,マルチストリームcnnが最小決定コスト関数 (mindcf) の相対的改善率20.53パーセントで,シングルストリームベースラインを有意に上回っていることを示す。

Speaker verification aims to verify whether an input speech corresponds to the claimed speaker, and conventionally, this kind of system is deployed based on single-stream scenario, wherein the feature extractor operates in full frequency range. In this paper, we hypothesize that machine can learn enough knowledge to do classification task when listening to partial frequency range instead of full frequency range, which is so called frequency selection technique, and further propose a novel framework of multi-stream Convolutional Neural Network (CNN) with this technique for speaker verification tasks. The proposed framework accommodates diverse temporal embeddings generated from multiple streams to enhance the robustness of acoustic modeling. For the diversity of temporal embeddings, we consider feature augmentation with frequency selection, which is to manually segment the full-band of frequency into several sub-bands, and the feature extractor of each stream can select which sub-bands to use as target frequency domain. Different from conventional single-stream solution wherein each utterance would only be processed for one time, in this framework, there are multiple streams processing it in parallel. The input utterance for each stream is pre-processed by a frequency selector within specified frequency range, and post-processed by mean normalization. The normalized temporal embeddings of each stream will flow into a pooling layer to generate fused embeddings. We conduct extensive experiments on VoxCeleb dataset, and the experimental results demonstrate that multi-stream CNN significantly outperforms single-stream baseline with 20.53 % of relative improvement in minimum Decision Cost Function (minDCF).
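
The frequency-selection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the band boundaries, and the identity `extractor` standing in for each stream's CNN are assumptions made for illustration, and the pooling and fusion step is omitted.

```python
from statistics import fmean


def select_band(frames, lo, hi):
    """Frequency selector: keep bins lo..hi-1 of every time frame."""
    return [frame[lo:hi] for frame in frames]


def mean_normalize(embeddings):
    """Subtract the per-dimension mean over time from every embedding."""
    dims = range(len(embeddings[0]))
    means = [fmean(e[d] for e in embeddings) for d in dims]
    return [[e[d] - means[d] for d in dims] for e in embeddings]


def multi_stream(frames, bands, extractor=lambda x: x):
    """Run one stream per sub-band; `extractor` is a placeholder for the stream's CNN."""
    return [mean_normalize(extractor(select_band(frames, lo, hi))) for lo, hi in bands]


# Two time frames with four frequency bins, split into two hypothetical sub-bands.
frames = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
streams = multi_stream(frames, bands=[(0, 2), (2, 4)])
```

Each stream sees only its own sub-band, so the same utterance is processed once per stream rather than once overall.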

翻訳日:2021-04-27 06:19:04 公開日:2021-01-12

# (参考訳) MOOCにおける学習ニーズ改善のための教育コンテンツリンク

Educational Content Linking for Enhancing Learning Need Remediation in MOOCs ( http://arxiv.org/abs/2012.15826v2 )

ライセンス: CC BY 4.0

Shang-Wen Li

(参考訳) 2011年に導入されて以来、web上のさまざまなテーマに4000以上のmoocがあり、3500万人以上の学習者が参加している。 MOOCは、知識の普及を民主化し、世界最高の教育を学習者にもたらす能力を示した。しかし, 参加者間の距離, 学習者の人数, 学習者の背景の不均一性は, 学習経験に悪影響を及ぼすタイムリーな方法で学習者との対話を極めて困難にしている。課題に対処するため,本論文では,教育コンテンツリンクという枠組みを提案する。様々なコース教材に散在する学習コンテンツの断片を、容易にアクセス可能な構造にリンクし、整理することにより、このフレームワークが学習者の指導とコンテンツナビゲーションを改善することができると仮定する。 MOOCにおけるほとんどの指導と知識獲得は、学習者がコース資料を調査する際に行われるので、より良いコンテンツナビゲーションは、学習者が自分の混乱を解消し、学習結果と経験を改善するのに役立つ。予想を裏付けるために,1)手動でリンクを生成すれば学習が改善できるか,という2つの研究の枠組みについて,エンドツーエンドの研究を提示する。 2)機械学習による学習コンテンツの生成は可能か? 最初の質問を学習するために,学習教材を提示し,それらを同時に視覚化するインタフェースを構築した。このインターフェースにより,希望する教材をより効率的に検索し,より多くの概念をより容易に維持できることがわかった。第2の質問に対して,条件付き確率場に基づく自動コンテンツリンクアルゴリズムを提案する。リンクのないインターフェースに対する改善の規模は小さいものの、自動生成リンクは依然として学習の改善につながることを実証する。

Since its introduction in 2011, there have been over 4000 MOOCs on various subjects on the Web, serving over 35 million learners. MOOCs have shown the ability to democratize knowledge dissemination and bring the best education in the world to every learner. However, the disparate distances between participants, the size of the learner population, and the heterogeneity of the learners' backgrounds make it extremely difficult for instructors to interact with the learners in a timely manner, which adversely affects the learning experience. To address these challenges, in this thesis, we propose a framework: educational content linking. By linking and organizing pieces of learning content scattered in various course materials into an easily accessible structure, we hypothesize that this framework can provide learners guidance and improve content navigation. Since most instruction and knowledge acquisition in MOOCs takes place when learners are surveying course materials, better content navigation may help learners find supporting information to resolve their confusion and thus improve learning outcomes and experience. To support our conjecture, we present end-to-end studies to investigate our framework around two research questions: 1) can manually generated linking improve learning? 2) can learning content be generated with machine learning methods? For studying the first question, we built an interface that presents learning materials and visualizes the linking among them simultaneously. We found the interface enables users to search for desired course materials more efficiently, and retain more concepts more readily. For the second question, we propose an automatic content linking algorithm based on conditional random fields. We demonstrate that automatically generated linking can still lead to better learning, although the magnitude of the improvement over the unlinked interface is smaller.

翻訳日:2021-04-17 20:34:51 公開日:2021-01-12

# (参考訳) SUMOを用いた意味モデリング

Semantic Modeling with SUMO ( http://arxiv.org/abs/2012.15835v3 )

ライセンス: CC BY-SA 4.0

Robert B. Allen

(参考訳) 我々は,Suggested Upper Merged Ontology (SUMO) を用いてセマンティック・シミュレーションを開発することを検討する。汎用プログラミング言語を用いて,シミュレーションされたガソリンエンジンの遷移をモデル化した2つの概念実証デモを示す。計算負荷の高い手法に注目するのではなく、慣れ親しんだソフトウェア工学のテスト手順に関連する、計算負荷の低いアプローチを探求する。さらに,辞書学の言語的アプローチに基づく用語の構造化表現を提案する。

We explore using the Suggested Upper Merged Ontology (SUMO) to develop a semantic simulation. We provide two proof-of-concept demonstrations modeling transitions in a simulated gasoline engine using a general-purpose programming language. Rather than focusing on computationally highly intensive techniques, we explore a less computationally intensive approach related to familiar software engineering testing procedures. In addition, we propose structured representations of terms based on linguistic approaches to lexicography.

翻訳日:2021-04-17 20:32:36 公開日:2021-01-12

# 中国農村部における"Brilliant AI Doctor" : AIによるCDSS展開の緊張と課題

"Brilliant AI Doctor" in Rural China: Tensions and Challenges in AI-Powered CDSS Deployment ( http://arxiv.org/abs/2101.01524v2 )

ライセンス: Link先を確認

Dakuo Wang and Liuping Wang and Zhan Zhang and Ding Wang and Haiyi Zhu and Yvonne Gao and Xiangmin Fan and Feng Tian

(参考訳) 人工知能(AI)技術は、先進的な臨床決定支援システム(CDSS)の実装にますます利用されている。臨床意思決定シナリオにおけるAI-CDSS(AI-CDSS)の有用性について検討した。しかし、特に発展途上国では、広告後のユーザー知覚と経験は未熟である。中国の6つの農村クリニックの22人の臨床医の観察とインタビューを通じて、AI-CDSSシステム(Brilliant Doctor)の設計と、現地のコンテキストやワークフローとの相違、技術的制限とユーザビリティ障壁、およびAI-CDSSの透明性と信頼性に関する問題など、農村の臨床的コンテキストとのさまざまな緊張関係を報告する。これらの緊張にもかかわらず、すべての参加者はAI-CDSSの将来に対する肯定的な態度を示し、特に臨床環境でのヒト-AIコラボレーションの未来を実現するために「医師のAIアシスタント」として機能した。最後に、発展途上国の農村臨床状況におけるAI-CDSS介入設計の意義について考察する。

Artificial intelligence (AI) technology has been increasingly used in the implementation of advanced Clinical Decision Support Systems (CDSS). Research demonstrated the potential usefulness of AI-powered CDSS (AI-CDSS) in clinical decision making scenarios. However, post-adoption user perception and experience remain understudied, especially in developing countries. Through observations and interviews with 22 clinicians from 6 rural clinics in China, this paper reports the various tensions between the design of an AI-CDSS system ("Brilliant Doctor") and the rural clinical context, such as the misalignment with local context and workflow, the technical limitations and usability barriers, as well as issues related to transparency and trustworthiness of AI-CDSS. Despite these tensions, all participants expressed positive attitudes toward the future of AI-CDSS, especially acting as "a doctor's AI assistant" to realize a Human-AI Collaboration future in clinical settings. Finally we draw on our findings to discuss implications for designing AI-CDSS interventions for rural clinical contexts in developing countries.

翻訳日:2021-04-11 22:59:54 公開日:2021-01-12

# (参考訳) 分類におけるバイアスと分散分析の統一的アプローチ

A unifying approach on bias and variance analysis for classification ( http://arxiv.org/abs/2101.01765v2 )

ライセンス: CC BY 4.0

Cemre Zor and Terry Windeatt

(参考訳) 標準バイアスと分散(B&V)の用語は、もともと回帰設定のために定義され、分類への拡張によって、文献においていくつかの異なるモデル/定義が導かれた。本稿では,Tumer & Ghosh (T&G) の一般的なフレームワークと James との関係について述べる。 2つのアプローチを統一することにより、0/1の損失に対して定義されたB&Vと、二乗誤差損失に対して与えられる境界分布の標準B&Vを関連付ける。クローズドフォームの関係は分類性能をより深く理解し、2つのケーススタディでその使用が実証されている。

Standard bias and variance (B&V) terminologies were originally defined for the regression setting and their extensions to classification have led to several different models / definitions in the literature. In this paper, we aim to provide the link between the commonly used frameworks of Tumer & Ghosh (T&G) and James. By unifying the two approaches, we relate the B&V defined for the 0/1 loss to the standard B&V of the boundary distributions given for the squared error loss. The closed form relationships provide a deeper understanding of classification performance, and their use is demonstrated in two case studies.

翻訳日:2021-04-11 12:57:36 公開日:2021-01-12

# (参考訳) 連携・協調・自動化産業システムにおけるフェデレーション学習の可能性

Opportunities of Federated Learning in Connected, Cooperative and Automated Industrial Systems ( http://arxiv.org/abs/2101.03367v2 )

ライセンス: CC BY 4.0

Stefano Savazzi, Monica Nicoli, Mehdi Bennis, Sanaz Kianoush, Luca Barbieri

(参考訳) 次世代の自律・ネットワーク産業システム(ロボット、車両、ドローン)は、超信頼性、低遅延通信(URLLC)およびコンピューティングの進歩を推進してきた。これらのネットワーク化されたマルチエージェントシステムは、ミッションクリティカルコントロール機能を提供するために、高速で通信効率のよい分散機械学習(ML)を必要とする。フェデレートラーニング(FL)を含む分散ML技術は、センシング、コミュニケーション、学習に精通する多分野の研究領域である。集中型サーバで生データサンプルを使用するのではなく、urllcを介して接続されたネットワークエージェントが、ローカルにトレーニングされたモデルのパラメータを定期的に交換する分散学習者として機能する、協調的な融合アプローチを活用する。本稿では,次世代ネットワーク産業システムにおけるFLの新たな可能性について考察する。スマートマニュファクチャリングにおけるコラボレーティブな自動車両とコラボレーティブなロボティクスにおける協調運転に焦点を当てたオープンな問題について議論する。

Next-generation autonomous and networked industrial systems (i.e., robots, vehicles, drones) have driven advances in ultra-reliable, low latency communications (URLLC) and computing. These networked multi-agent systems require fast, communication-efficient and distributed machine learning (ML) to provide mission critical control functionalities. Distributed ML techniques, including federated learning (FL), represent a mushrooming multidisciplinary research area weaving in sensing, communication and learning. FL enables continual model training in distributed wireless systems: rather than fusing raw data samples at a centralized server, FL leverages a cooperative fusion approach where networked agents, connected via URLLC, act as distributed learners that periodically exchange their locally trained model parameters. This article explores emerging opportunities of FL for the next-generation networked industrial systems. Open problems are discussed, focusing on cooperative driving in connected automated vehicles and collaborative robotics in smart manufacturing.
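
The parameter-exchange step described above can be sketched as a weighted average of locally trained parameters. This is a minimal illustration, not the article's method: the sample-count weighting (as in the common FedAvg scheme) and the function name `federated_average` are assumptions, and a fully decentralized deployment would average over neighbors rather than all agents.

```python
def federated_average(local_params, num_samples):
    """Fuse locally trained parameter vectors, weighting each agent by its sample count."""
    total = sum(num_samples)
    n_params = len(local_params[0])
    return [
        sum(params[i] * n for params, n in zip(local_params, num_samples)) / total
        for i in range(n_params)
    ]


# Two hypothetical agents: the second trained on three times as much local data.
fused = federated_average([[1.0, 2.0], [3.0, 4.0]], num_samples=[1, 3])
```

Only the parameter vectors are exchanged; the raw data samples never leave the agents.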

翻訳日:2021-04-09 09:35:19 公開日:2021-01-12

# HypoSVI: Stein変分推論と物理インフォームドニューラルネットワークを用いた震源インバージョン

HypoSVI: Hypocenter inversion with Stein variational inference and Physics Informed Neural Networks ( http://arxiv.org/abs/2101.03271v2 )

ライセンス: Link先を確認

Jonathan D. Smith, Zachary E. Ross, Kamyar Azizzadenesheli, Jack B. Muir

(参考訳) Stein変分推論を用いた確率的震源インバージョンの手法を提案する。我々のアプローチは、アイコナール方程式を解くように訓練した物理インフォームドニューラルネットワークの形で、微分可能なフォワードモデルを用いている。これにより、カーネル化Stein discrepancyに対して粒子の集まりを反復的に最適化することで、事後分布を迅速に近似することができる。本手法は,震源逆問題によく見られる強く非凸な事後分布の扱いに適していることを示す。様々なハイパーパラメータの影響を調べるために一連の実験を行った。一度トレーニングすれば、走時表を構築する必要なしに、研究領域内の任意の観測網配置に対して有効である。計算負荷は走時差の数に対して効率的にスケールし,分散音響センシングのような大規模N型センシング技術に最適であることを示す。

We introduce a scheme for probabilistic hypocenter inversion with Stein variational inference. Our approach uses a differentiable forward model in the form of a physics-informed neural network, which we train to solve the Eikonal equation. This allows for rapid approximation of the posterior by iteratively optimizing a collection of particles against a kernelized Stein discrepancy. We show that the method is well-equipped to handle highly non-convex posterior distributions, which are common in hypocentral inverse problems. A suite of experiments is performed to examine the influence of the various hyperparameters. Once trained, the method is valid for any network geometry within the study area without the need to build travel time tables. We show that the computational demands scale efficiently with the number of differential times, making it ideal for large-N sensing technologies like Distributed Acoustic Sensing.
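
The particle optimization mentioned above can be illustrated with the generic Stein variational gradient descent update. This is a toy sketch only: the one-dimensional Gaussian target, the fixed RBF bandwidth, and the step size are assumptions chosen for illustration, whereas the paper's posterior comes from a neural-network forward model for travel times.

```python
import math


def svgd_step(particles, grad_log_p, step=0.1, bandwidth=1.0):
    """One SVGD update: kernel-weighted attraction to high density plus repulsion."""
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2 * bandwidth ** 2))
            grad_k = -(xj - xi) / bandwidth ** 2 * k  # derivative of k(xj, xi) w.r.t. xj
            phi += k * grad_log_p(xj) + grad_k
        updated.append(xi + step * phi / n)
    return updated


def grad_log_gaussian(x, mean=2.0, std=1.0):
    """Score function of the toy target N(mean, std^2)."""
    return -(x - mean) / std ** 2


# Start 20 particles spread over [-1, 1] and move them toward the target.
particles = [-1.0 + 2.0 * i / 19 for i in range(20)]
for _ in range(500):
    particles = svgd_step(particles, grad_log_gaussian)
```

The repulsive kernel-gradient term keeps the particles from collapsing onto the mode, which is what lets a particle set represent a multi-modal or non-convex posterior.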

翻訳日:2021-04-09 07:20:55 公開日:2021-01-12

# 平均回帰戦略における関数特性を持つ深層強化学習

Deep Reinforcement Learning with Function Properties in Mean Reversion Strategies ( http://arxiv.org/abs/2101.03418v2 )

ライセンス: Link先を確認

Sophia Gu

(参考訳) ゲーム産業におけるDeep Reinforcement Learningの最近の進歩を受けて、同じ技術が一般的な定量的金融の問題にも同様に有効かどうかに関心がある。本稿では,OpenAIによって開発された既製のライブラリが,平均回帰戦略に容易に適応できるかどうかを考察する。さらに、エージェントが探索する必要のある関数空間を狭めることで、よりよいパフォーマンスが得られるかどうかを設計し、検証する。慎重に選択したペナルティ項を報酬関数に加えることで、これを実現する。

With the recent advancement in Deep Reinforcement Learning in the gaming industry, we are curious if the same technology would work as well for common quantitative financial problems. In this paper, we will investigate if an off-the-shelf library developed by OpenAI can be easily adapted to mean reversion strategy. Moreover, we will design and test to see if we can get better performance by narrowing the function space that the agent needs to search for. We achieve this through augmenting the reward function by a carefully picked penalty term.

翻訳日:2021-04-09 07:19:56 公開日:2021-01-12

# (参考訳) AT-BERT: 頭字語識別のための敵対的学習BERT (SDU@AAAI-21優勝ソリューション)

AT-BERT: Adversarial Training BERT for Acronym Identification Winning Solution for SDU@AAAI-21 ( http://arxiv.org/abs/2101.03700v2 )

ライセンス: CC BY 4.0

Danqing Zhu, Wangli Lin, Yang Zhang, Qiwei Zhong, Guanxiong Zeng, Weilin Wu, Jiayu Tang

(参考訳) 頭字語識別は、省略された頭字語と句を見つけることに焦点を当てており、これは科学文書理解タスクに不可欠である。しかし、手動でアノテートされたデータセットの限られたサイズは、問題のさらなる改善を妨げる。大規模コーパス上で事前学習された言語モデルの最近のブレークスルーは、教師なし事前学習が下流タスクの性能を大幅に改善できることを示している。本稿では,AAAI 2021 の学術文書理解 (SDU) チャレンジにおいて,AT-BERT と名づけられた逆トレーニング BERT 手法を提案する。具体的には、事前訓練されたBERTが、より良いセマンティック表現をキャプチャするために採用されている。次に、FGMの対向訓練戦略をBERTの微調整に取り入れ、モデルをより堅牢で一般化する。さらに、複数のBERT変種から学んだ表現を包含するアンサンブル機構が考案された。これらすべてのコンポーネントを組み立てることにより,sciaiデータセットの実験結果から,提案手法が他手法よりも優れていることが示された。

Acronym identification focuses on finding the acronyms and the phrases that have been abbreviated, which is crucial for scientific document understanding tasks. However, the limited size of manually annotated datasets hinders further improvement for the problem. Recent breakthroughs of language models pre-trained on large corpora clearly show that unsupervised pre-training can vastly improve the performance of downstream tasks. In this paper, we present an Adversarial Training BERT method named AT-BERT, our winning solution to acronym identification task for Scientific Document Understanding (SDU) Challenge of AAAI 2021. Specifically, the pre-trained BERT is adopted to capture better semantic representation. Then we incorporate the FGM adversarial training strategy into the fine-tuning of BERT, which makes the model more robust and generalized. Furthermore, an ensemble mechanism is devised to involve the representations learned from multiple BERT variants. Assembling all these components together, the experimental results on the SciAI dataset show that our proposed approach outperforms all other competitive state-of-the-art methods.

翻訳日:2021-04-04 21:35:35 公開日:2021-01-12

# 階層的微分可能なアーキテクチャ探索による検索空間のアンチェーン

Unchain the Search Space with Hierarchical Differentiable Architecture Search ( http://arxiv.org/abs/2101.04028v2 )

ライセンス: Link先を確認

Guanting Liu, Yujie Zhong, Sheng Guo, Matthew R. Scott, Weilin Huang

(参考訳) 微分可能なアーキテクチャサーチ (DAS) は計算コストを削減した高性能アーキテクチャの探索に大きく進歩している。しかし、DASベースの手法は主に繰り返し可能なセル構造を探索することに集中しており、複数のステージに順次積み重ねてネットワークを形成する。この構成は検索空間を大幅に減らし、細胞間の接続の重要性を無視する。本稿では,この制限を克服するために,セルレベルとステージレベルの両方でアーキテクチャ検索を行う階層的微分可能アーキテクチャ探索(h-das)を提案する。具体的には、ネットワークがステージ固有の細胞構造を学習できるように、細胞レベルの検索空間を緩和する。ステージレベルの探索では,各ステージ内の細胞数やセル間の接続など,ステージのアーキテクチャを体系的に研究する。洞察に富んだ観察に基づいて,いくつかの探索ルールと損失をデザインし,より優れたステージレベルのアーキテクチャを探索する。このような階層的検索空間は、高価な検索コストを伴わずにネットワークの性能を大幅に向上させる。 CIFAR10とImageNetの大規模な実験により,提案したH-DASの有効性が示された。さらに、探索されたステージレベルのアーキテクチャは、既存のDAS法で探索されたセル構造と組み合わせることで、パフォーマンスをさらに向上させることができる。コードは、https://github.com/MalongTech/research-HDASで入手できる。

Differentiable architecture search (DAS) has made great progress in searching for high-performance architectures with reduced computational cost. However, DAS-based methods mainly focus on searching for a repeatable cell structure, which is then stacked sequentially in multiple stages to form the networks. This configuration significantly reduces the search space, and ignores the importance of connections between the cells. To overcome this limitation, in this paper, we propose a Hierarchical Differentiable Architecture Search (H-DAS) that performs architecture search both at the cell level and at the stage level. Specifically, the cell-level search space is relaxed so that the networks can learn stage-specific cell structures. For the stage-level search, we systematically study the architectures of stages, including the number of cells in each stage and the connections between the cells. Based on insightful observations, we design several search rules and losses, and manage to search for better stage-level architectures. Such hierarchical search space greatly improves the performance of the networks without introducing expensive search cost. Extensive experiments on CIFAR10 and ImageNet demonstrate the effectiveness of the proposed H-DAS. Moreover, the searched stage-level architectures can be combined with the cell structures searched by existing DAS methods to further boost the performance. Code is available at: https://github.com/MalongTech/research-HDAS

翻訳日:2021-04-04 14:39:33 公開日:2021-01-12

# (参考訳) 意味表現からのマルチコンディション生成の変換

Transforming Multi-Conditioned Generation from Meaning Representation ( http://arxiv.org/abs/2101.04257v1 )

ライセンス: CC BY 4.0

Joosung Lee

(参考訳) タスク指向会話システムでは,会話の流れに関連する特定の情報を生成する自然言語生成システムが有用である。本研究では,発話の意味を表す様々な情報を生成条件として考慮し,言語生成に焦点を当てた。意味表現からのNLG(文の意味の条件)は、通常、文計画と表面実現の2段階を経る。しかし、MR(Meaning Representation)から直接発話を生成するための単純なワンステージフレームワークを提案する。我々のモデルはGPT2に基づいており、文の構造を決定する必要がないスロットと値対の平らな条件の発話を生成する。 E2Eデータセット内の複数のシステムと6つの自動メトリクスを評価した。私たちのシステムは単純な手法ですが、従来のシステムと同等のパフォーマンスを自動測定で示しています。さらに,他の手法を使わずにデータセットの10%しか使用せず,同等の性能を実現し,ゼロショット生成や他のデータセットへの拡張の可能性を示す。

In task-oriented conversation systems, natural language generation systems that generate sentences with specific information related to conversation flow are useful. Our study focuses on language generation by considering various information representing the meaning of utterances as multiple conditions of generation. NLG from meaning representations, the conditions for sentence meaning, generally goes through two steps: sentence planning and surface realization. However, we propose a simple one-stage framework to generate utterances directly from MR (Meaning Representation). Our model is based on GPT2 and generates utterances with flat conditions on slot and value pairs, which does not need to determine the structure of the sentence. We evaluate several systems in the E2E dataset with 6 automatic metrics. Our system is a simple method, but it demonstrates comparable performance to previous systems in automated metrics. In addition, using only 10% of the data set without any other techniques, our model achieves comparable performance, and shows the possibility of performing zero-shot generation and expanding to other datasets.
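
The "flat conditions on slot and value pairs" above can be pictured as a linearized string placed before the utterance to be generated. The sketch below is an assumption about what such an input might look like: the separator strings and the function names are hypothetical and are not taken from the paper.

```python
def linearize_mr(mr):
    """Flatten a meaning representation into 'slot = value' pairs with no sentence plan."""
    return " | ".join(f"{slot} = {value}" for slot, value in mr.items())


def build_prompt(mr, sep=" <|sep|> "):
    """Conditioning prefix for a language model; `sep` is a hypothetical delimiter."""
    return linearize_mr(mr) + sep


# Slot names follow the style of the E2E dataset; the values are made up.
mr = {"name": "Aromi", "eatType": "coffee shop", "area": "city centre"}
prompt = build_prompt(mr)
```

A language model fine-tuned on such prefixes would then continue the prompt with the utterance, so no separate sentence-planning stage is needed.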

翻訳日:2021-04-04 13:11:30 公開日:2021-01-12

# (参考訳) Clutter Slicesアプローチによる屋内空間のオンザフライ識別

Clutter Slices Approach for Identification-on-the-fly of Indoor Spaces ( http://arxiv.org/abs/2101.04262v1 )

ライセンス: CC BY 4.0

Upinder Kaur, Praveen Abbaraju, Harrison McCarty, and Richard M. Voyles

(参考訳) 建設空間は絶えず進化しており、継続的な測量、検査、評価を必要とする動的環境である。このような空間の伝統的な手動検査は、困難で時間を要する活動であることが証明されている。ロボットエージェントによる自動化は効果的なソリューションである。知覚能力を持つロボットは、屋内建設空間を自律的に分類し、調査することができる。本稿では,クラッタの一意なシグネチャを用いた室内空間の粗さ分類のための新しい識別・オン・ザ・フライ手法を提案する。乱雑に付与された文脈を用いて,廊下,階段,共用空間,トイレなどの一般的な屋内空間を認識する。提案したクラッタスライスパイプラインは,提案したクラッタスライスデータセットにおいて最大精度93.6%を達成する。このセンサ独立アプローチは、知的自律エージェントを環境をよりよく知覚するために様々な領域に一般化することができる。

Construction spaces are constantly evolving, dynamic environments in need of continuous surveying, inspection, and assessment. Traditional manual inspection of such spaces proves to be an arduous and time-consuming activity. Automation using robotic agents can be an effective solution. Robots, with perception capabilities can autonomously classify and survey indoor construction spaces. In this paper, we present a novel identification-on-the-fly approach for coarse classification of indoor spaces using the unique signature of clutter. Using the context granted by clutter, we recognize common indoor spaces such as corridors, staircases, shared spaces, and restrooms. The proposed clutter slices pipeline achieves a maximum accuracy of 93.6% on the presented clutter slices dataset. This sensor independent approach can be generalized to various domains to equip intelligent autonomous agents in better perceiving their environment.

翻訳日:2021-04-04 12:58:55 公開日:2021-01-12

# (参考訳) 手術映像における時間的ガイド付き関節手ポーズ追跡

Temporally Guided Articulated Hand Pose Tracking in Surgical Videos ( http://arxiv.org/abs/2101.04281v1 )

ライセンス: CC BY 4.0

Nathan Louis, Luowei Zhou, Steven J. Yule, Roger D. Dias, Milisa Manojlovich, Francis D. Pagani, Donald S. Likosky, Jason J. Corso

(参考訳) 関節を考慮した手のポーズ追跡は未開拓の問題であり、特に医療領域において、広範囲のアプリケーションで使用される可能性を持っている。生体内手術ビデオに対するロバストで正確な追跡システムがあれば、手の動きのダイナミクスや動作パターンを捉えて解析することができ、スキル評価、外科レジデントの訓練、時間的行動認識などのタスクに役立てることができる。本研究では,ポーズ予測に手ポーズの事前情報を組み込むことでトラッキング精度を向上させる新しい手ポーズ推定モデルRes152-CondPoseを提案する。過去の予測を効果的に活用する時間的ガイド付きアプローチにより,フレーム単位で独立に予測する最先端手法に対する改善を示す。さらに,生体内ビデオに対するマルチインスタンスの関節手ポーズアノテーションを提供する最初のデータセットであるSurgical Handsを収集した。我々のデータセットには、28の公開手術ビデオから得た76の動画クリップと8.1k以上の注釈付き手ポーズインスタンスが含まれている。バウンディングボックス,関節手ポーズアノテーション,トラッキングIDを提供し,マルチインスタンスの領域ベース追跡および関節追跡を可能にした。Surgical Handsでの評価では,ポーズ推定精度を測る平均適合率の平均(mAP)と,ポーズ追跡性能を測る複数物体追跡精度(MOTA)の両方で,本手法が最先端手法を上回ることを示す。

Articulated hand pose tracking is an underexplored problem that carries the potential for use in an extensive number of applications, especially in the medical domain. With a robust and accurate tracking system on in-vivo surgical videos, the motion dynamics and movement patterns of the hands can be captured and analyzed for rich tasks including skills assessment, training surgical residents, and temporal action recognition. In this work, we propose a novel hand pose estimation model, Res152-CondPose, which improves tracking accuracy by incorporating a hand pose prior into its pose prediction. We show improvements over state-of-the-art methods which provide frame-wise independent predictions, by following a temporally guided approach that effectively leverages past predictions. Additionally, we collect the first dataset, Surgical Hands, that provides multi-instance articulated hand pose annotations for in-vivo videos. Our dataset contains 76 video clips from 28 publicly available surgical videos and over 8.1k annotated hand pose instances. We provide bounding boxes, articulated hand pose annotations, and tracking IDs to enable multi-instance area-based and articulated tracking. When evaluated on Surgical Hands, we show our method outperforms the state-of-the-art method using mean Average Precision (mAP), to measure pose estimation accuracy, and Multiple Object Tracking Accuracy (MOTA), to assess pose tracking performance.

翻訳日:2021-04-04 12:52:20 公開日:2021-01-12

# (参考訳) メタラーニングと一般AIの関連性に関する簡単な調査

A Brief Survey of Associations Between Meta-Learning and General AI ( http://arxiv.org/abs/2101.04283v1 )

ライセンス: CC BY 4.0

Huimin Peng

(参考訳) 本稿では,メタラーニングの歴史を概観し,一般AIへの貢献について述べる。メタラーニングはモデル一般化能力を向上し、分散処理と分散処理の両方に適用可能な汎用アルゴリズムを考案する。汎用AIは、タスク固有のモデルを、AIを使用して多様なタスクを解決するための高度な自動化を導入する一般的なアルゴリズムシステムに置き換える。我々は、メモリモジュール、メタラーナー、共進化、好奇心、忘れること、AI生成アルゴリズムなど、一般的なAI開発へのメタラーニングの主な貢献を要約する。メタラーニングと一般AIの関連性を示し、一般AIアルゴリズムの定式化にメタラーニングをどのように使用できるかについて議論する。

This paper briefly reviews the history of meta-learning and describes its contribution to general AI. Meta-learning improves model generalization capacity and devises general algorithms applicable to both in-distribution and out-of-distribution tasks potentially. General AI replaces task-specific models with general algorithmic systems introducing higher level of automation in solving diverse tasks using AI. We summarize main contributions of meta-learning to the developments in general AI, including memory module, meta-learner, coevolution, curiosity, forgetting and AI-generating algorithm. We present connections between meta-learning and general AI and discuss how meta-learning can be used to formulate general AI algorithms.

翻訳日:2021-04-04 12:31:00 公開日:2021-01-12

# (参考訳) 3D-ANAS:高速ハイパースペクトル画像分類のための3次元非対称ニューラルネットワーク探索

3D-ANAS: 3D Asymmetric Neural Architecture Search for Fast Hyperspectral Image Classification ( http://arxiv.org/abs/2101.04287v1 )

ライセンス: CC BY 4.0

Haokui Zhang, Chengrong Gong, Yunpeng Bai, Zongwen Bai and Ying Li

(参考訳) ハイパースペクトル画像(HSI)はスペクトル情報と空間情報を豊富に含み、土地被覆分類においてかけがえのない役割を果たす。近年,ディープラーニング技術に基づいて,有望な性能を示すHSI分類手法が数多く提案されている。しかし、これまでの研究には2つの大きな欠点がある。1)ほとんどのディープラーニングモデルのアーキテクチャは手作業で設計されており、専門知識に依存し、比較的手間がかかる。さらに、HSI分類では、異なるセンサーによって取得されたデータセットは物理的特性が異なる。そのため、データセットごとに異なるモデルを設計する必要があり、アーキテクチャ設計の作業負荷はさらに増加する。2)主流のフレームワークはパッチから画素への分類フレームワークである。隣接する画素のパッチの重複領域が繰り返し計算されるため、計算コストと時間コストが増大する。さらに、分類精度はパッチサイズに敏感であり、パッチサイズは広範な調査実験に基づいて人手で設定される。上記の問題を克服するため,まず3次元非対称ニューラルネットワーク探索アルゴリズムを提案し,HSI分類のための効率的なアーキテクチャを自動探索する。 HSIの特性を解析することにより、スペクトル情報と空間情報を異なる分解畳み込みで処理する3次元非対称分解探索空間を構築する。さらに,重複する演算がなく全体のコストを低減できる新しい高速分類フレームワーク,すなわち画素から画素への分類フレームワークを提案する。異なるセンサーによって取得された3つの公開HSIデータセットの実験では、我々の3D-ANASが設計したネットワークは、最先端のいくつかの手法と同等の性能を達成しつつ、推論速度ははるかに速い。

Hyperspectral images involve abundant spectral and spatial information, playing an irreplaceable role in land-cover classification. Recently, based on deep learning technologies, an increasing number of HSI classification approaches have been proposed, which demonstrate promising performance. However, previous studies suffer from two major drawbacks: 1) the architecture of most deep learning models is manually designed, relies on specialized knowledge, and is relatively tedious. Moreover, in HSI classifications, datasets captured by different sensors have different physical properties. Correspondingly, different models need to be designed for different datasets, which further increases the workload of designing architectures; 2) the mainstream framework is a patch-to-pixel framework. The overlap regions of patches of adjacent pixels are calculated repeatedly, which increases computational cost and time cost. Besides, the classification accuracy is sensitive to the patch size, which is artificially set based on extensive investigation experiments. To overcome the issues mentioned above, we first propose a 3D asymmetric neural network search algorithm and leverage it to automatically search for efficient architectures for HSI classifications. By analysing the characteristics of HSIs, we specifically build a 3D asymmetric decomposition search space, where spectral and spatial information are processed with different decomposition convolutions. Furthermore, we propose a new fast classification framework, i.e., pixel-to-pixel classification framework, which has no repetitive operations and reduces the overall cost. Experiments on three public HSI datasets captured by different sensors demonstrate the networks designed by our 3D-ANAS achieve competitive performance compared to several state-of-the-art methods, while having a much faster inference speed.

翻訳日:2021-04-04 12:16:30 公開日:2021-01-12

# (参考訳) Fits and Starts: AutoMLの企業利用とループにおける人間の役割

Fits and Starts: Enterprise Use of AutoML and the Role of Humans in the Loop ( http://arxiv.org/abs/2101.04296v1 )

ライセンス: CC BY 4.0

Anamaria Crisan, Brittany Fiore-Gartland

(参考訳) AutoMLシステムは、定型的なデータサイエンス作業をスピードアップし、統計学やコンピュータサイエンスの専門知識を持たない人たちにも機械学習を利用可能にする。これらのシステムは、熟練したデータワーカーの人材が限られている企業環境で勢いを増している。本研究では,異なる規模の組織に属する29名を対象に,データサイエンス業務におけるAutoMLシステムの利用状況や利用意図についてインタビューを行った。また,データ可視化がAutoMLシステムとどのように併用されているかも調査した。分析の結果,AutoMLの3つの利用シナリオを特定し,さまざまなレベルの専門知識を持つデータワーカーが望む自動化レベルを要約するフレームワークを得た。スピードと人間による監督の間の緊張関係を明らかにし、データ可視化がこの2つのバランスをうまく取れない場合があることを発見した。本研究の知見は,ヒューマン・イン・ザ・ループの視覚分析手法の設計と実装に示唆を与える。

AutoML systems can speed up routine data science work and make machine learning available to those without expertise in statistics and computer science. These systems have gained traction in enterprise settings where pools of skilled data workers are limited. In this study, we conduct interviews with 29 individuals from organizations of different sizes to characterize how they currently use, or intend to use, AutoML systems in their data science work. Our investigation also captures how data visualization is used in conjunction with AutoML systems. Our findings identify three usage scenarios for AutoML that resulted in a framework summarizing the level of automation desired by data workers with different levels of expertise. We surfaced the tension between speed and human oversight and found that data visualization can do a poor job balancing the two. Our findings have implications for the design and implementation of human-in-the-loop visual analytics approaches.

翻訳日:2021-04-04 11:30:28 公開日:2021-01-12

# (参考訳) DeepiSign:CNNの統合性と認証を保護するために、目に見えないフレジブルな透かし

DeepiSign: Invisible Fragile Watermark to Protect the Integrity and Authenticity of CNN ( http://arxiv.org/abs/2101.04319v1 )

ライセンス: CC BY 4.0

Alsharif Abuadbba, Hyoungshick Kim, Surya Nepal

(参考訳) 自動運転車のような現実のアプリケーションでデプロイされる畳み込みニューラルネットワーク(cnns)は、毒殺攻撃や微調整といった操作攻撃に弱いことが示されている。したがって、妥協されたモデルは不正な出力を生成し、悪意ある振る舞いをするので、CNNの完全性と信頼性を保証することが不可欠である。本稿では,CNNモデルの整合性と信頼性を確保するために,DeepiSignと呼ばれる自己完結型タンパ保護手法を提案する。 DeepiSignは、秘密とハッシュ値をCNNモデルに安全に埋め込むために、脆弱な目に見えない透かしというアイデアを適用している。モデルの完全性と信頼性を検証するために、モデルからシークレットを取得し、シークレットのハッシュ値を計算し、それを埋め込みハッシュ値と比較する。 CNNモデルに埋め込まれたシークレットの影響を最小限に抑えるため、ウェーブレットベースの手法を用いて重みを周波数領域に変換し、そのシークレットをより少ない有意な係数に埋め込む。理論的解析により,DeepiSignは各層に最大1KBのシークレットを隠蔽し,モデルの精度を最小限に抑えることができた。 deepisignのセキュリティと性能を評価するために,3種類の操作攻撃(ターゲット入力中毒,アウトプット中毒,微調整)に対する3つのデータセット(mnist,cifar-10,imagenet)を用いて,事前学習した4つのモデル(resnet18,vgg16,alexnet,mobilenet)について実験を行った。その結果,DeepiSignは分類精度を低下させることなく検証可能であり,CNNによる攻撃に対して堅牢であることがわかった。

Convolutional Neural Networks (CNNs) deployed in real-life applications such as autonomous vehicles have been shown to be vulnerable to manipulation attacks, such as poisoning attacks and fine-tuning. Hence, it is essential to ensure the integrity and authenticity of CNNs because compromised models can produce incorrect outputs and behave maliciously. In this paper, we propose a self-contained tamper-proofing method, called DeepiSign, to ensure the integrity and authenticity of CNN models against such manipulation attacks. DeepiSign applies the idea of fragile invisible watermarking to securely embed a secret and its hash value into a CNN model. To verify the integrity and authenticity of the model, we retrieve the secret from the model, compute the hash value of the secret, and compare it with the embedded hash value. To minimize the effects of the embedded secret on the CNN model, we use a wavelet-based technique to transform weights into the frequency domain and embed the secret into less significant coefficients. Our theoretical analysis shows that DeepiSign can hide up to a 1 KB secret in each layer with minimal loss of the model's accuracy. To evaluate the security and performance of DeepiSign, we performed experiments on four pre-trained models (ResNet18, VGG16, AlexNet, and MobileNet) using three datasets (MNIST, CIFAR-10, and ImageNet) against three types of manipulation attacks (targeted input poisoning, output poisoning, and fine-tuning). The results demonstrate that DeepiSign is verifiable without degrading the classification accuracy, and robust against representative CNN manipulation attacks.
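
The verification step described in the abstract (retrieve the secret, hash it, compare with the embedded hash) can be sketched as follows. This is a minimal illustration only: SHA-256 and the function names `build_payload` and `verify_payload` are assumptions made for the example, and the wavelet-domain embedding and extraction of the payload in the model weights is not shown.

```python
import hashlib
import hmac

DIGEST_SIZE = 32  # SHA-256 digest length in bytes (hash choice is an assumption)


def build_payload(secret: bytes) -> bytes:
    """Return the secret followed by its SHA-256 digest."""
    return secret + hashlib.sha256(secret).digest()


def verify_payload(payload: bytes) -> bool:
    """Recompute the digest of the retrieved secret and compare it
    with the digest stored alongside it."""
    if len(payload) < DIGEST_SIZE:
        return False
    secret, stored_digest = payload[:-DIGEST_SIZE], payload[-DIGEST_SIZE:]
    recomputed = hashlib.sha256(secret).digest()
    # compare_digest performs a constant-time comparison
    return hmac.compare_digest(recomputed, stored_digest)
```

Any change to the bytes recovered from the model, whether in the secret or in the stored digest, makes `verify_payload` return `False`, which is the fragile behaviour the abstract relies on.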

翻訳日:2021-04-04 11:02:07 公開日:2021-01-12

# (参考訳) 機械学習と信号特徴抽出を組み合わせたブラインド変調分類

Blind Modulation Classification via Combined Machine Learning and Signal Feature Extraction ( http://arxiv.org/abs/2101.04337v1 )

ライセンス: CC BY 4.0

Jafar Norolahi, Paeiz Azmi

(参考訳) 本研究では,視覚・自動変調分類のためのアルゴリズムを提案する。低信号パワーから雑音比(SNR)の様々な変調を識別するために、機械傾きと信号特徴抽出の組み合わせが有効である。提案アルゴリズムは4つを含む。まず、正規および不規則なスペクトル特性に基づく変調信号の分岐に対するスペクトル分析に有利である。次に、受信信号に非線形ソフトマージン支持ベクトル(NS SVM)問題を適用し、そのシンボルを正しい(サポートベクトル)シンボルに分類する。 NS SVMの雇用は変調信号に対する物理層ノイズ効果の低減につながる。その後、k-centerクラスタリングは各クラスの中央を見つけることができる。最後に, 散乱図の相関関数推定は, 変調の既設理想散乱図と相関する。相関結果は分類結果である。さらなる評価のために、多くの公開手法と比較して成功率、性能、複雑さが提供される。シミュレーションにより、提案アルゴリズムは変調された信号をより少ないSNRで分類できることを示す。例えば、SNR=4.2dBで4-QAM、SNR=2.1dBで4-FSK、成功率は%99である。さらに,ns svmと特徴ベース関数の双対問題におけるカーネル関数の利用により,提案手法は複雑性が低く,実装が簡単である。

In this study, an algorithm for blind and automatic modulation classification is proposed. It combines machine learning and signal feature extraction to recognize a diverse range of modulations at low signal-to-noise ratio (SNR). The presented algorithm contains four steps. First, it uses spectrum analysis to branch the modulated signal based on regular and irregular spectrum characteristics. Second, a nonlinear soft-margin support vector machine (NS SVM) problem is applied to the received signal, and its symbols are classified into correct and incorrect (support vector) symbols. Employing the NS SVM reduces the effect of physical-layer noise on the modulated signal. After that, k-center clustering finds the center of each class. Finally, the estimated scatter diagram is correlated with the pre-saved ideal scatter diagrams of the modulations, and the correlation outcome is the classification result. For further evaluation, the success rate, performance, and complexity are compared with those of many published methods. The simulations show that the proposed algorithm can classify the modulated signal at lower SNR. For example, it can recognize 4-QAM at SNR = -4.2 dB and 4-FSK at SNR = 2.1 dB with a 99% success rate. Moreover, owing to the use of a kernel function in the dual problem of the NS SVM and of feature-based functions, the proposed algorithm has low complexity and is simple to implement in practice.

翻訳日:2021-04-04 10:44:51 公開日:2021-01-12

# (参考訳) ハイパーネットワークに基づく期待整合信号回復アルゴリズムを用いた位相検索

Phase Retrieval using Expectation Consistent Signal Recovery Algorithm based on Hypernetwork ( http://arxiv.org/abs/2101.04348v1 )

ライセンス: CC BY 4.0

Chang-Jen Wang, Chao-Kai Wen, Shang-Ho (Lawrence) Tsai, Shi Jin, Geoffrey Ye Li

(参考訳) 位相検索(PR)は現代の計算イメージングシステムにおいて重要な要素である。過去半世紀にわたって多くのアルゴリズムが開発されてきた。近年のディープラーニングの進歩は、堅牢で高速なPRの新たな可能性を開いた。 deep unfoldingと呼ばれる新たなテクニックは、従来のモデルベースの反復アルゴリズムと、現代的なデータベースのディープラーニングとの系統的な接続を提供する。データ学習を利用した展開アルゴリズムは、元のアルゴリズムよりも顕著な性能と収束速度の向上を示した。その可能性にもかかわらず、既存の展開アルゴリズムのほとんどは、層依存パラメータを使用する場合、一定の数の反復に限られる。本研究では,既存の制約を克服するために,深い展開のための新しい枠組みを開発する。一般の逆問題に対して,我々のフレームワークが広く適用可能であるとしても,PRを例として取り上げる。我々の開発は、データ駆動学習において減衰因子が残される一般化予測整合信号回復アルゴリズム(GEC-SR)に基づいている。特に, GEC-SR の減衰係数を生成するハイパーネットワークを導入する。最適な減衰因子を直接学習する代わりに、ハイパーネットワークは、臨床設定に従って最適な減衰因子を生成する方法を学び、異なるシナリオへの適応性を確保する。ハイパーネットワークの動作を異なるレイヤ番号に適応させるため、私たちはリカレントアーキテクチャを使用して動的ハイパーネットワークを開発し、レイヤ間でオンラインに変化可能な減衰係数を生成します。また,ハイパーネットワークのロバスト性を高めるために自己アテンション機構を利用する。大規模な実験により、提案アルゴリズムは収束速度と精度で既存のアルゴリズムより優れており、多くの古典的PRアルゴリズムが不安定または失敗する非常に厳しい条件下でも機能することが示された。

Phase retrieval (PR) is an important component in modern computational imaging systems. Many algorithms have been developed over the past half century. Recent advances in deep learning have opened up a new possibility for robust and fast PR. An emerging technique, called deep unfolding, provides a systematic connection between conventional model-based iterative algorithms and modern data-based deep learning. Unfolded algorithms, powered by data learning, have shown remarkable performance and convergence speed improvement over the original algorithms. Despite their potential, most existing unfolded algorithms are strictly confined to a fixed number of iterations when employing layer-dependent parameters. In this study, we develop a novel framework for deep unfolding to overcome the existing limitations. Although our framework can be widely applied to general inverse problems, we take PR as an example in the paper. Our development is based on an unfolded generalized expectation consistent signal recovery (GEC-SR) algorithm, wherein damping factors are left for data-driven learning. In particular, we introduce a hypernetwork to generate the damping factors for GEC-SR. Instead of directly learning a set of optimal damping factors, the hypernetwork learns how to generate the optimal damping factors according to the clinical settings, thus ensuring its adaptivity to different scenarios. To make the hypernetwork adapt to varying layer numbers, we use a recurrent architecture to develop a dynamic hypernetwork, which generates a damping factor that can vary online across layers. We also exploit a self-attention mechanism to enhance the robustness of the hypernetwork. Extensive experiments show that the proposed algorithm outperforms existing ones in convergence speed and accuracy, and still works well under very harsh settings in which many classical PR algorithms are unstable or even fail.
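
To make the role of a per-layer damping factor concrete, the sketch below applies a generic damped fixed-point iteration in which each unfolded layer blends the previous estimate with a fresh update. It is an assumption-laden illustration: the convex-combination form of the damping, the toy `step` function, and the hand-picked `betas` are all hypothetical, and neither the actual GEC-SR update nor the hypernetwork that would produce the damping factors is reproduced here.

```python
def damped_update(old, new, beta):
    """Blend the previous estimate with the fresh one.
    beta = 1 means no damping; smaller beta means heavier damping."""
    return [(1.0 - beta) * o + beta * n for o, n in zip(old, new)]


def run_damped_iterations(step, x0, betas):
    """Run one damped step per layer, using that layer's damping factor.
    `step` maps the current estimate to an undamped update."""
    x = list(x0)
    for beta in betas:
        x = damped_update(x, step(x), beta)
    return x


def toy_step(x):
    """Hypothetical contraction with fixed point 2.0 in every coordinate."""
    return [0.5 * v + 1.0 for v in x]
```

Because `betas` is just a sequence, its length sets the number of layers, which mirrors the idea of generating damping factors layer by layer rather than fixing them for a predetermined depth.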

翻訳日:2021-04-04 10:32:38 公開日:2021-01-12

# (参考訳) ランクモデルに対するニューラルラーニングの校正と不確かさについて

On the Calibration and Uncertainty of Neural Learning to Rank Models ( http://arxiv.org/abs/2101.04356v1 )

ライセンス: CC BY 4.0

Gustavo Penha and Claudia Hauff

(参考訳) Probability Ranking Principle (PRP) によれば、関連性確率の順に文書をランク付けすると、アドホック検索に最適な文書ランキングが得られる。 PRPは、2つの条件が満たされたときに成り立つ: [C1] モデルが十分に校正され、[C2] 関連性の確率が確実に報告される。しかし、ディープニューラルネットワーク(DNN)はよく校正されておらず、不確実性の原因がいくつかあるため、[C1]と[C2]はニューラルランサーによって満たされない可能性がある。ニューラルラーニング・トゥ・ランク(L2R)のアプローチの成功を考えると、特にBERTベースのアプローチは、まずどの状況を決定論的に分析する。出力ポイント推定神経ローダは校正されるそこで,本研究では,2つの手法を用いて,提案した確率的ランク付けに導かれるニューラルランク付けの不確かさをモデル化し,点推定とは対照的に関連性の予測分布を出力する。会話応答ランク付けのアドホック検索タスクにおける実験結果から, (i) bertベースのランク付けはロバストに調整されないこと, 確率的bertベースのランク付けがより良いキャリブレーションをもたらすこと, (ii) 不確実性推定は, リスク認識型ニューラルネットワークのランキング, すなわち, ランク付け時の不確実性を考慮し, 不可解な会話コンテキストの予測に有効であることが明らかとなった。

According to the Probability Ranking Principle (PRP), ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad-hoc retrieval. The PRP holds when two conditions are met: [C1] the models are well calibrated, and [C2] the probabilities of relevance are reported with certainty. We know, however, that deep neural networks (DNNs) are often not well calibrated and have several sources of uncertainty, and thus [C1] and [C2] might not be satisfied by neural rankers. Given the success of neural Learning to Rank (L2R) approaches, and here especially BERT-based approaches, we first analyze under which circumstances deterministic neural rankers, i.e., those that output point estimates, are calibrated. Then, motivated by our findings, we use two techniques to model the uncertainty of neural rankers, leading to the proposed stochastic rankers, which output a predictive distribution of relevance as opposed to point estimates. Our experimental results on the ad-hoc retrieval task of conversation response ranking reveal that (i) BERT-based rankers are not robustly calibrated and that stochastic BERT-based rankers yield better calibration; and (ii) uncertainty estimation is beneficial for both risk-aware neural ranking, i.e., taking into account the uncertainty when ranking documents, and for predicting unanswerable conversational contexts.
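
Condition [C1] can be checked empirically by comparing predicted relevance probabilities with observed relevance frequencies. The sketch below computes a binned expected calibration error (ECE), a commonly used calibration measure; using ECE here is an assumption for illustration, since the abstract does not say which metric the authors report, and the function name and the choice of ten bins are likewise hypothetical.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for binary relevance.

    probs  -- predicted probabilities of relevance, each in [0, 1]
    labels -- 1 if the document is relevant, 0 otherwise
    """
    total = len(probs)
    if total == 0:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp so that p == 1.0 falls into the last bin
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_confidence = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        # Weight each bin's gap by the fraction of samples it holds
        ece += (len(members) / total) * abs(avg_confidence - observed_rate)
    return ece
```

A well-calibrated ranker yields a value near zero: among documents scored around 0.8, roughly 80% should turn out to be relevant.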

翻訳日:2021-04-04 09:55:48 公開日:2021-01-12

# (参考訳) 収束解析を用いたSOMに基づく勾配自由深層学習法

A SOM-based Gradient-Free Deep Learning Method with Convergence Analysis ( http://arxiv.org/abs/2101.05612v1 )

ライセンス: CC BY 4.0

Shaosheng Xu, Jinde Cao, Yichao Cao, Tong Wang

(参考訳) 深層学習における勾配降下法は一連の疑問を引き起こすため,新しい勾配フリー深層学習構造を提案する。従来の自己組織化マップに新たなモジュールを追加し、マップに残余を導入することで、Deep Valued Self-Organizing Mapネットワークを構築する。そして,このような深い価値を持つ自己組織化マップネットワークの収束性能に関する解析を行い,入力の次元と予測の損失を考慮に入れた設計パラメータの不平等性について述べる。

As the gradient descent method in deep learning raises a series of issues, this paper proposes a novel gradient-free deep learning structure. By adding a new module to the traditional Self-Organizing Map and introducing residuals into the map, a Deep Valued Self-Organizing Map network is constructed. An analysis of the convergence performance of such a Deep Valued Self-Organizing Map network is also given in this paper, which yields an inequality relating the designed parameters to the dimension of the inputs and the loss of prediction.

翻訳日:2021-04-04 09:39:12 公開日:2021-01-12

# (参考訳) 深層学習による非線形分散方程式のデータ駆動ピークと周期ピーク移動波解

Data-driven peakon and periodic peakon travelling wave solutions of some nonlinear dispersive equations via deep learning ( http://arxiv.org/abs/2101.04371v1 )

ライセンス: CC BY 4.0

Li Wang and Zhenya Yan

(参考訳) 数学物理学の分野では、波のピークに不連続な一階微分を持つ孤立波であるピークロン解を持つ多くの物理的に興味深い非線形分散方程式が存在する。 In this paper, we apply the multi-layer physics-informed neural networks (PINNs) deep learning to successfully study the data-driven peakon and periodic peakon solutions of some well-known nonlinear dispersion equations with initial-boundary value conditions such as the Camassa-Holm (CH) equation, Degasperis-Procesi equation, modified CH equation with cubic nonlinearity, Novikov equation with cubic nonlinearity, mCH-Novikov equation, b-family equation with quartic nonlinearity, generalized modified CH equation with quintic nonlinearity, and etc. これらの結果は、ピークン解とそれに対応する非線形分散方程式の実験設計をさらに研究するのに有用である。

In the field of mathematical physics, there exist many physically interesting nonlinear dispersive equations with peakon solutions, which are solitary waves with a discontinuous first-order derivative at the wave peak. In this paper, we apply multi-layer physics-informed neural networks (PINNs) to successfully study the data-driven peakon and periodic peakon solutions of some well-known nonlinear dispersive equations with initial-boundary value conditions, such as the Camassa-Holm (CH) equation, the Degasperis-Procesi equation, the modified CH equation with cubic nonlinearity, the Novikov equation with cubic nonlinearity, the mCH-Novikov equation, the b-family equation with quartic nonlinearity, the generalized modified CH equation with quintic nonlinearity, etc. These results will be useful for further study of the peakon solutions and the corresponding experimental design of nonlinear dispersive equations.

翻訳日:2021-04-04 09:24:32 公開日:2021-01-12

# (参考訳) 確率的マルチユーザバンディットを用いた動的スペクトルアクセス

Dynamic Spectrum Access using Stochastic Multi-User Bandits ( http://arxiv.org/abs/2101.04388v1 )

ライセンス: CC BY 4.0

Meghana Bande, Akshayaa Magesh, Venugopal V. Veeravalli

(参考訳) 非コーディネートスペクトルアクセスのためのアルゴリズムを開発するために、確率的マルチユーザーマルチアームバンディットフレームワークが使用される。先行研究とは対照的に、衝突しても報酬はゼロではないと仮定され、それによってユーザ数をチャネル数よりも多くすることができる。提案アルゴリズムは推定フェーズと割り当てフェーズから構成される。各ユーザがアルゴリズムを採用すると、システム全体の後悔は、持続時間$t$の時間ホリゾンよりも、オーダー$o(\log t)$のオーダーオプティマイズであることが示される。後悔の保証は、ユーザ数がチャネル数以上である場合とチャネル数未満の場合の両方に適用される。このアルゴリズムは、システムのユーザ数が時間とともに進化する動的ケースに拡張され、サブ線形後悔につながることが示されている。

A stochastic multi-user multi-armed bandit framework is used to develop algorithms for uncoordinated spectrum access. In contrast to prior work, it is assumed that rewards can be non-zero even under collisions, thus allowing the number of users to be greater than the number of channels. The proposed algorithm consists of an estimation phase and an allocation phase. It is shown that if every user adopts the algorithm, the system-wide regret is order-optimal, of order $O(\log T)$ over a time horizon of duration $T$. The regret guarantees hold both when the number of users is greater than the number of channels and when it is smaller. The algorithm is extended to the dynamic case where the number of users in the system evolves over time, and is shown to lead to sub-linear regret.
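
The two-phase structure (estimate first, then allocate) can be illustrated with a deliberately simplified single-user explore-then-commit sketch. This is not the paper's multi-user algorithm: collisions, multiple users, and the regret analysis are all left out, and the Bernoulli reward model, the function name, and the parameters are assumptions chosen only to show the phases.

```python
import random


def explore_then_commit(means, explore_rounds, horizon, seed=0):
    """Single-user sketch with Bernoulli channel rewards.

    means          -- true success probability of each channel (hypothetical)
    explore_rounds -- samples per channel in the estimation phase (>= 1)
    horizon        -- total number of time steps
    Returns (index of the committed channel, total reward collected).
    """
    rng = random.Random(seed)
    k = len(means)
    sums = [0.0] * k
    total_reward = 0.0

    # Estimation phase: sample every channel in round-robin order
    for t in range(explore_rounds * k):
        arm = t % k
        reward = 1.0 if rng.random() < means[arm] else 0.0
        sums[arm] += reward
        total_reward += reward

    estimates = [s / explore_rounds for s in sums]
    best = max(range(k), key=lambda a: estimates[a])

    # Allocation phase: stay on the channel with the best estimate
    for _ in range(max(0, horizon - explore_rounds * k)):
        total_reward += 1.0 if rng.random() < means[best] else 0.0

    return best, total_reward
```

In the multi-user setting of the abstract, the allocation phase would additionally have to assign users to channels without coordination, which this sketch does not attempt.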

翻訳日:2021-04-04 08:47:53 公開日:2021-01-12

# (参考訳) シミュレーションユーザによるレコメンダシステム効果の測定

Measuring Recommender System Effects with Simulated Users ( http://arxiv.org/abs/2101.04526v1 )

ライセンス: CC BY 4.0

Sirui Yao and Yoni Halpern and Nithum Thain and Xuezhi Wang and Kang Lee and Flavien Prost and Ed H. Chi and Jilin Chen and Alex Beutel

(参考訳) 食べ物レコメンデーションシステム -- が『emph{causing}』かどうかを確認し、不健康な食事習慣を育むか、単にユーザーの興味を反映させるだけか? レコメンダシステムの選択とバイアスによって、レコメンダシステムでのユーザの経験のどのくらいが時間の経過とともに引き起こされ、ユーザの好みとバイアスに基づいたものなのでしょうか? 人気バイアスとフィルターバブルは、最もよく研究されているシステムバイアスの2つだが、以前の研究のほとんどは、単一のレコメンデーションステップでシステムの振る舞いを理解することに集中している。これらのバイアスはユーザ行動とどのように相互作用し、反復的なインタラクションからどのようなユーザエクスペリエンスが生成されるのか? 本研究では,ユーザ行動の違いによる推薦システムの影響を測定するためのシミュレーションフレームワークを提案する。このシミュレーションフレームワークを用いて、(a)ユーザの好みからレコメンダシステムの効果を分離し、(b)「平均ユーザ」だけでなく、非定型ユーザ行動下での極端な体験についてもシステムがどのように機能するかを検討する。本稿では,シミュレーションフレームワークの一部として,シミュレーション上の評価指標のセットを提案し,レコメンダシステムの振る舞いを理解する。最後に,映画レンズにおける従来の協調フィルタリングと大規模生産レコメンデーションシステムに関する2つの実証的なケーススタディを提示し,人気バイアスが時間とともにどのように現れるかを理解する。

Imagine a food recommender system -- how would we check if it is \emph{causing} and fostering unhealthy eating habits or merely reflecting users' interests? How much of a user's experience over time with a recommender is caused by the recommender system's choices and biases, and how much is based on the user's preferences and biases? Popularity bias and filter bubbles are two of the most well-studied recommender system biases, but most of the prior research has focused on understanding the system behavior in a single recommendation step. How do these biases interplay with user behavior, and what types of user experiences are created from repeated interactions? In this work, we offer a simulation framework for measuring the impact of a recommender system under different types of user behavior. Using this simulation framework, we can (a) isolate the effect of the recommender system from the user preferences, and (b) examine how the system performs not just on average for an "average user" but also the extreme experiences under atypical user behavior. As part of the simulation framework, we propose a set of evaluation metrics over the simulations to understand the recommender system's behavior. Finally, we present two empirical case studies -- one on traditional collaborative filtering in MovieLens and one on a large-scale production recommender system -- to understand how popularity bias manifests over time.

翻訳日:2021-04-04 08:14:03 公開日:2021-01-12

# (参考訳) オブジェクト提案生成のためのスーパーピクセルベースリファインメント

Superpixel-based Refinement for Object Proposal Generation ( http://arxiv.org/abs/2101.04574v1 )

ライセンス: CC BY 4.0

Christian Wilms and Simone Frintrop

(参考訳) オブジェクトの正確なセグメンテーションは、クラスに依存しないオブジェクトの提案生成やインスタンスセグメンテーションといったタスクにおいて重要な問題である。ディープラーニングベースのシステムは通常、cnnの固有のダウンサンプリングのため、粗い特徴マップに基づいてオブジェクトのセグメンテーションを生成する。これにより、画像内のオブジェクト境界に順応しないセグメンテーション境界が導かれる。そこで本研究では,最新のオブジェクト提案システムであるAttentionMask上に,新たなスーパーピクセルベースの改良手法を提案する。特徴抽出にスーパーピクセルプーリングと、新しいスーパーピクセル分類器を用いて、高精度スーパーピクセルが対象物に属しているか否かを判定する。実験の結果,AttentionMaskに比べて平均リコール率では最大26.0%の改善が見られた。さらに, セグメンテーションの質的, 定量的分析により, 様々な深層学習に基づくオブジェクト提案生成システムと比較して, 改良のための境界の定着度が著しく向上した。

Precise segmentation of objects is an important problem in tasks like class-agnostic object proposal generation or instance segmentation. Deep learning-based systems usually generate segmentations of objects based on coarse feature maps, due to the inherent downsampling in CNNs. This leads to segmentation boundaries not adhering well to the object boundaries in the image. To tackle this problem, we introduce a new superpixel-based refinement approach on top of the state-of-the-art object proposal system AttentionMask. The refinement utilizes superpixel pooling for feature extraction and a novel superpixel classifier to determine if a high-precision superpixel belongs to an object or not. Our experiments show an improvement of up to 26.0% in terms of average recall compared to the original AttentionMask. Furthermore, qualitative and quantitative analyses of the segmentations reveal significant improvements in terms of boundary adherence for the proposed refinement compared to various deep learning-based state-of-the-art object proposal generation systems.
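
Superpixel pooling aggregates the feature vectors of all pixels that share a superpixel label into a single descriptor per superpixel. The sketch below uses average pooling over plain nested lists; the choice of averaging, the data layout, and the function name are assumptions for illustration, since the abstract only names the operation.

```python
def superpixel_average_pool(features, labels):
    """Average the per-pixel feature vectors inside each superpixel.

    features -- H x W grid (nested lists) of feature vectors
    labels   -- H x W grid of integer superpixel ids
    Returns a dict mapping superpixel id -> mean feature vector.
    """
    sums = {}
    counts = {}
    for feature_row, label_row in zip(features, labels):
        for vector, sp_id in zip(feature_row, label_row):
            if sp_id not in sums:
                sums[sp_id] = [0.0] * len(vector)
                counts[sp_id] = 0
            for channel, value in enumerate(vector):
                sums[sp_id][channel] += value
            counts[sp_id] += 1
    return {
        sp_id: [s / counts[sp_id] for s in channel_sums]
        for sp_id, channel_sums in sums.items()
    }
```

Each pooled vector could then be passed to a per-superpixel classifier that decides whether the superpixel belongs to the object.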

翻訳日:2021-04-04 07:54:55 公開日:2021-01-12

# (参考訳) 高密度ハイパーグラフ試験におけるシャープ検出境界

Sharp detection boundaries on testing dense subhypergraph ( http://arxiv.org/abs/2101.04584v1 )

ライセンス: CC BY 4.0

Mingao Yuan and Zuofeng Shang

(参考訳) 本研究では,高密度ハイパーグラフの存在を検査する問題について検討する。ヌル仮説はエルドス=レーニ一様ランダムハイパーグラフであり、代替仮説は高密度な部分ハイパーグラフを含む一様ランダムハイパーグラフである。 1) エッジ確率は既知のもの,(2) エッジ確率は未知のもの,という2つのシナリオにおいて,鋭い検出境界を確立する。どちらのシナリオでも、鋭い検出可能な境界は適切なモデルパラメータによって特徴づけられる。モデルパラメータが検出可能な領域に落ちると漸近的に強力なテストが提供される。以上の結果から,一般的なハイパーグラフモデルの検出可能な領域は,グラフと大きく異なることがわかった。

We study the problem of testing the existence of a dense subhypergraph. The null hypothesis is an Erdos-Renyi uniform random hypergraph and the alternative hypothesis is a uniform random hypergraph that contains a dense subhypergraph. We establish sharp detection boundaries in two scenarios: (1) the edge probabilities are known; (2) the edge probabilities are unknown. In both scenarios, sharp detectable boundaries are characterized by the appropriate model parameters. Asymptotically powerful tests are provided when the model parameters fall in the detectable regions. Our results indicate that the detectable regions for general hypergraph models are dramatically different from their graph counterparts.

翻訳日:2021-04-04 07:41:53 公開日:2021-01-12

# (参考訳) 常識知識の次元

Dimensions of Commonsense Knowledge ( http://arxiv.org/abs/2101.04640v1 )

ライセンス: CC0 1.0

Filip Ilievski, Alessandro Oltramari, Kaixin Ma, Bin Zhang, Deborah L. McGuinness, Pedro Szekely

(参考訳) commonsenseの知識は、自然言語処理、ビジュアル処理、計画など、多くのaiアプリケーションにとって不可欠である。そのため、過去数十年にわたって、常識知識を含む多くの資料が設計され、構築されてきた。近年、大きなテキストベースのソースに焦点が当てられ、ニューラルネットワーク(言語)モデルとの統合が容易になり、典型的にはソースのセマンティクスを犠牲にして、テキストのタスクへの応用が容易になっている。このようなプラクティスは、これらのソースの調和を防ぎ、そのカバレッジとギャップを理解し、ダウンストリームタスクと知識のセマンティックアライメントを妨げる可能性がある。コモンセンス知識の統合は部分的成功をもたらしたが、既存のコモンセンス知識の包括的統合への明確な道筋はない。本稿では,コモンセンス知識の共通次元の周辺にこれらの情報源を整理することを目的とする。この目的のために,我々は,その関係に特に焦点をあてた,幅広い一般的なコモンセンスソースを調査した。我々はこれらの関係を13の知識次元に集約し、それぞれがソースにあるより具体的な関係を抽象化する。この統合により、私たちは別々のソースを統一し、それらのカバレッジ、重複、および知識次元に関するギャップの表示を計算することができます。さらに,コモンセンス知識を必要とする下流推論課題に対する各次元の影響を分析し,時間的・欲求的次元が下流課題の推論に非常に有益であるのに対し,識別性や語彙的知識は影響が少ないことを観察した。これらの結果は、現在の評価におけるいくつかの次元に焦点をあて、他を無視する可能性を明らかにしている。

Commonsense knowledge is essential for many AI applications, including those in natural language processing, visual processing, and planning. Consequently, many sources that include commonsense knowledge have been designed and constructed over the past decades. Recently, the focus has been on large text-based sources, which facilitate easier integration with neural (language) models and application on textual tasks, typically at the expense of the semantics of the sources. Such practice prevents the harmonization of these sources, understanding their coverage and gaps, and may hinder the semantic alignment of their knowledge with downstream tasks. Efforts to consolidate commonsense knowledge have yielded partial success, but provide no clear path towards a comprehensive consolidation of existing commonsense knowledge. The ambition of this paper is to organize these sources around a common set of dimensions of commonsense knowledge. For this purpose, we survey a wide range of popular commonsense sources with a special focus on their relations. We consolidate these relations into 13 knowledge dimensions, each abstracting over more specific relations found in sources. This consolidation allows us to unify the separate sources and to compute indications of their coverage, overlap, and gaps with respect to the knowledge dimensions. Moreover, we analyze the impact of each dimension on downstream reasoning tasks that require commonsense knowledge, observing that the temporal and desire/goal dimensions are very beneficial for reasoning on current downstream tasks, while distinctness and lexical knowledge have little impact. These results reveal focus towards some dimensions in current evaluation, and potential neglect of others.

翻訳日:2021-04-04 07:12:40 公開日:2021-01-12

# (参考訳) リアルか仮想か? 拡張現実シナリオにおける脳活動パターンを用いた参加者ターゲットの識別

Real or Virtual? Using Brain Activity Patterns to differentiate Attended Targets during Augmented Reality Scenarios ( http://arxiv.org/abs/2101.05272v1 )

ライセンス: CC BY 4.0

Lisa-Marie Vortmann, Leonid Schwenke, Felix Putze

(参考訳) 拡張現実(Augmented Reality)は、仮想コンポーネントと実際の環境の融合である。生成されたオブジェクトと自然オブジェクトの同時可視性は、ユーザがリアルまたは仮想の特定のターゲットに選択的に注意を向ける必要がある場合が多い。本研究では,拡張現実のシナリオで収集された脳波(eeg)データを分類する機械学習手法を用いて,この目標が現実か仮想かを検討した。浅い畳み込みニューラルネットワークは、テストデータとトレーニングデータが異なる試行で得られた場合、20人の参加者から平均70%以上の精度で3秒間のデータウィンドウを分類した。 20名中6名に対して, 人別分類が可能であった。このように、脳-コンピュータインタフェースの信頼性は、拡張現実アプリケーションに有用な入力メカニズムとして扱うのに十分である。

Augmented Reality is the fusion of virtual components and our real surroundings. The simultaneous visibility of generated and natural objects often requires users to direct their selective attention to a specific target that is either real or virtual. In this study, we investigated whether this target is real or virtual by using machine learning techniques to classify electroencephalographic (EEG) data collected in Augmented Reality scenarios. A shallow convolutional neural net classified 3-second data windows from 20 participants in a person-dependent manner with an average accuracy above 70% if the testing data and training data came from different trials. Person-independent classification was possible above chance level for 6 out of 20 participants. Thus, the reliability of such a Brain-Computer Interface is high enough for it to be treated as a useful input mechanism for Augmented Reality applications.

翻訳日:2021-04-04 06:45:02 公開日:2021-01-12

# (参考訳) モバイルおよびwebアプリケーションのための境界対応セグメンテーションネットワーク

Boundary-Aware Segmentation Network for Mobile and Web Applications ( http://arxiv.org/abs/2101.04704v1 )

ライセンス: CC BY 4.0

Xuebin Qin and Deng-Ping Fan and Chenyang Huang and Cyril Diagne and Zichen Zhang and Adrià Cabeza Sant'Anna and Albert Suàrez and Martin Jagersand and Ling Shao

(参考訳) 深層モデルは画像分割の精度とロバスト性を大幅に向上させたが、高精度な境界と微細構造を持つセグメンテーション結果を得ることは依然として課題である。本稿では,予測再定義アーキテクチャとハイブリッド損失を含む,シンプルながら強力な境界認識セグメンテーションネットワーク(BASNet)を提案し,高精度な画像セグメンテーションを実現する。予測再定義アーキテクチャは、分割確率マップの予測と精錬にそれぞれ使用される、密集した教師付きエンコーダ-デコーダネットワークと残留精細モジュールで構成される。ハイブリッド損失は、二進的クロスエントロピー、構造的類似性、および交叉対ユニオン損失の組み合わせであり、ネットワークは3レベル(ピクセルレベル、パッチレベル、マップレベル)の階層表現を学習するよう誘導する。我々は,有能なオブジェクトセグメンテーション,カモフラージュされたオブジェクトセグメンテーションを含む2つの逆タスクに対して,BASNetを評価し,鋭いセグメンテーション境界で非常に競争的な性能を実現することを示す。重要な点として、BASNetは単一のGPU上で70fps以上で動作する。 basnetをベースにして、arコピー&ペースト(ar copy & paste)という2つの商用アプリケーションを開発し、basnetは現実世界のオブジェクトの「コピー」と「ペースト」のために拡張現実と統合され、オブジェクトの背景を自動的に除去するwebベースのツールであるobject cut(オブジェクトカット)を開発した。どちらのアプリケーションもすでに多くの注目を集めており、現実世界に大きな影響を与えている。コードと2つのアプリケーションは、https://github.com/NathanUA/BASNetで公開される。

Although deep models have greatly improved the accuracy and robustness of image segmentation, obtaining segmentation results with highly accurate boundaries and fine structures is still a challenging problem. In this paper, we propose a simple yet powerful Boundary-Aware Segmentation Network (BASNet), which comprises a predict-refine architecture and a hybrid loss, for highly accurate image segmentation. The predict-refine architecture consists of a densely supervised encoder-decoder network and a residual refinement module, which are respectively used to predict and refine a segmentation probability map. The hybrid loss is a combination of the binary cross entropy, structural similarity and intersection-over-union losses, which guide the network to learn three-level (i.e., pixel-, patch- and map-level) hierarchy representations. We evaluate our BASNet on two reverse tasks, salient object segmentation and camouflaged object segmentation, showing that it achieves very competitive performance with sharp segmentation boundaries. Importantly, BASNet runs at over 70 fps on a single GPU, which benefits many potential real applications. Based on BASNet, we further developed two (close to) commercial applications: AR COPY & PASTE, in which BASNet is integrated with augmented reality for "COPYING" and "PASTING" real-world objects, and OBJECT CUT, which is a web-based tool for automatic object background removal. Both applications have already drawn a huge amount of attention and have important real-world impacts. The code and two applications will be publicly available at: https://github.com/NathanUA/BASNet.
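
The hybrid loss combines a pixel-level term (binary cross entropy), a patch-level term (structural similarity), and a map-level term (intersection over union). The sketch below implements only the pixel-level and map-level terms on flattened probability maps and accepts the structural similarity term as a precomputed number. The unweighted sum, the soft IoU formulation, and all function names are assumptions for illustration rather than the released BASNet code.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross entropy over pixels (pixel-level term)."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def soft_iou_loss(pred, target, eps=1e-7):
    """One minus a differentiable IoU over the whole map (map-level term)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return 1.0 - intersection / (union + eps)


def hybrid_loss(pred, target, ssim_loss=0.0):
    """Unweighted sum of the three terms; `ssim_loss` is supplied by the
    caller because the patch-level SSIM computation is not sketched here."""
    return bce_loss(pred, target) + ssim_loss + soft_iou_loss(pred, target)
```

A prediction that matches the mask gives a value close to zero, while an inverted prediction is penalised by both the pixel-level and the map-level term.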

翻訳日:2021-04-04 06:27:02 公開日:2021-01-12

# (参考訳) 対照的な自己教師付き学習を改善する明示的ホモグラフィ推定

Explicit homography estimation improves contrastive self-supervised learning ( http://arxiv.org/abs/2101.04713v1 )

ライセンス: CC BY 4.0

David Torpey and Richard Klein

(参考訳) 典型的なコントラスト自己監督アルゴリズムは、正と負の画像を直接または間接的に対比して監督信号として潜時空間の類似度尺度を用いる。自己教師付きアルゴリズムの実用性は近年改善されているが,計算処理など,その普及を妨げるボトルネックが依然として残っている。本稿では,自己教師付きコントラスト学習パラダイムにおける追加目標としてのモジュールを提案する。このモジュールをアフィン変換やホモグラフィーのパラメータに組み込むことによって、元のコントラスト目的に加えて、パフォーマンスと学習速度を向上することを示す。重要なことは、この加群がアフィン変換の様々な成分に不変性を強制しないことを保証する。本稿では,最近普及している2つの自己教師型アルゴリズムに対する追加目的の有効性を示す。提案手法の広範な実験的解析を行い,検討した全てのデータセットの性能向上を示す。さらに,一般ホモグラフィとアフィン変換はともに性能と収束性を改善するのに十分であるが,全ての場合においてアフィン変換は良好であることがわかった。

The typical contrastive self-supervised algorithm uses a similarity measure in latent space as the supervision signal by contrasting positive and negative images directly or indirectly. Although the utility of self-supervised algorithms has improved recently, there are still bottlenecks hindering their widespread use, such as the compute needed. In this paper, we propose a module that serves as an additional objective in the self-supervised contrastive learning paradigm. We show how the inclusion of this module to regress the parameters of an affine transformation or homography, in addition to the original contrastive objective, improves both performance and learning speed. Importantly, we ensure that this module does not enforce invariance to the various components of the affine transform, as this is not always ideal. We demonstrate the effectiveness of the additional objective on two recent, popular self-supervised algorithms. We perform an extensive experimental analysis of the proposed method and show an improvement in performance for all considered datasets. Further, we find that although both the general homography and affine transformation are sufficient to improve performance and convergence, the affine transformation performs better in all cases.

翻訳日:2021-04-04 05:51:41 公開日:2021-01-12

# (参考訳) リアルな微小地震事象のベイジアン後方推定を指向した高速機械学習

Towards fast machine-learning-assisted Bayesian posterior inference of realistic microseismic events ( http://arxiv.org/abs/2101.04724v1 )

ライセンス: CC BY 4.0

Davide Piras, Alessio Spurio Mancini, Benjamin Joachimi, Michael P. Hobson

(参考訳) 微小地震活動モニタリングに応用されたベイズ推定は、記録された地震計からの微小地震事象の座標とその関連する不確かさを原理的に推定することができる。しかしながら、これらのマイクロ地震事象の前方モデリングは、ベイズ源の反転を行うのに必要であり、計算資源の面では極めて高価である。実現可能な解決策は、機械学習技術に基づくサロゲートモデルをトレーニングし、前方モデルをエミュレートし、ベイズ推論を加速することだ。本稿では,等方性モーメントテンソルのソースのみを考慮した先行研究について改善する。記録された圧力波のパワースペクトルに基づいて機械学習アルゴリズムをトレーニングし、トレーニングされたエミュレータが$\textit{any}$ソースメカニズムのイベント座標の完全かつ高速な検索を可能にすることを示す。さらに,本手法は商用ノートパソコン上で1時間未満で動作可能であり,トレーニング地震計10^4ドル以下で正確な結果が得られるため,計算コストが低いことを示す。さらに,ベイズ証拠を推定することにより,トレーニングしたエミュレータを用いてソースメカニズムを同定する方法を実証する。この研究は、記録された地震計の効率的な局所化と特徴付けの基礎を築き、地震活動に対する人間の影響を定量化し、地震の危険を軽減するのに役立つ。

Bayesian inference applied to microseismic activity monitoring allows for principled estimation of the coordinates of microseismic events from recorded seismograms, and their associated uncertainties. However, forward modelling of these microseismic events, necessary to perform Bayesian source inversion, can be prohibitively expensive in terms of computational resources. A viable solution is to train a surrogate model based on machine learning techniques, to emulate the forward model and thus accelerate Bayesian inference. In this paper, we improve on previous work, which considered only sources with an isotropic moment tensor. We train a machine learning algorithm on the power spectrum of the recorded pressure wave and show that the trained emulator allows for the complete and fast retrieval of the event coordinates for $\textit{any}$ source mechanism. Moreover, we show that our approach is computationally inexpensive, as it can be run in less than 1 hour on a commercial laptop, while yielding accurate results using fewer than $10^4$ training seismograms. We additionally demonstrate how the trained emulators can be used to identify the source mechanism through the estimation of the Bayesian evidence. This work lays the foundations for the efficient localisation and characterisation of any recorded seismogram, thus helping to quantify human impact on seismic activity and mitigate seismic hazard.

翻訳日:2021-04-04 05:39:47 公開日:2021-01-12

# (参考訳) SEED:視覚表現のための自己教師型蒸留

SEED: Self-supervised Distillation For Visual Representation ( http://arxiv.org/abs/2101.04731v1 )

ライセンス: CC BY 4.0

Zhiyuan Fang, Jianfeng Wang, Lijuan Wang, Lei Zhang, Yezhou Yang, Zicheng Liu

(参考訳) 本稿では,小型モデルの自己教師型学習について述べる。この問題は,広範に使用されているコントラスト型自己教師付き学習手法が大規模モデルトレーニングにおいて大きな進歩を遂げているが,小モデルではうまく機能しないという経験的研究が動機である。この問題に対処するため,我々はSelf-SupErvised Distillation (SEED)という新たな学習パラダイムを提案し,より大規模なネットワーク(教師として)を利用して,表現的知識をより小さなアーキテクチャ(学生として)に自己管理的に伝達する。ラベルのないデータから直接学習する代わりに、教師が一連のインスタンスに対して推定する類似度スコア分布を模倣するように学生エンコーダを訓練する。シードはダウンストリームタスクにおける小さなネットワークのパフォーマンスを劇的に向上させる。自己監督ベースラインと比較して、SEEDはトップ1の精度を、EfficientNet-B0で42.2%から67.6%、ImageNet-1kデータセットでMobileNet-v3-Largeで36.3%から68.2%に改善している。

This paper is concerned with self-supervised learning for small models. The problem is motivated by our empirical studies that while the widely used contrastive self-supervised learning method has shown great progress on large model training, it does not work well for small models. To address this problem, we propose a new learning paradigm, named SElf-SupErvised Distillation (SEED), where we leverage a larger network (as Teacher) to transfer its representational knowledge into a smaller architecture (as Student) in a self-supervised fashion. Instead of directly learning from unlabeled data, we train a student encoder to mimic the similarity score distribution inferred by a teacher over a set of instances. We show that SEED dramatically boosts the performance of small networks on downstream tasks. Compared with self-supervised baselines, SEED improves the top-1 accuracy from 42.2% to 67.6% on EfficientNet-B0 and from 36.3% to 68.2% on MobileNet-v3-Large on the ImageNet-1k dataset.
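
The distillation objective described above, in which the student mimics the teacher's similarity score distribution over a set of instances, can be sketched as a cross entropy between two softmax distributions. The temperature value, the use of dot-product similarity, and the names `instances` and `seed_style_loss` are assumptions for illustration; details such as how the instance set is maintained are not given in the abstract and are not modelled here.

```python
import math


def softmax(scores, temperature):
    """Numerically stable softmax of scores divided by a temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def seed_style_loss(teacher_emb, student_emb, instances, temperature=0.07):
    """Cross entropy between the teacher's and the student's similarity
    distributions over the same set of instance embeddings."""
    teacher_dist = softmax([dot(teacher_emb, z) for z in instances], temperature)
    student_dist = softmax([dot(student_emb, z) for z in instances], temperature)
    # A small constant keeps log() finite if a probability underflows
    return -sum(t * math.log(s + 1e-12) for t, s in zip(teacher_dist, student_dist))
```

The loss is smallest when the student ranks the instances the same way as the teacher, so no labels are needed; only the teacher's embeddings act as the supervisory signal.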

翻訳日:2021-04-04 05:01:10 公開日:2021-01-12
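
The SEED abstract above says the student is trained to mimic the teacher's similarity score distribution over a set of instances. A minimal sketch of such an objective follows, assuming cosine similarities, a softmax with a temperature, and a cross-entropy loss; the function names, the `teacher_queue` argument and the default temperature are illustrative assumptions, not the authors' implementation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def softmax(scores, temperature):
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def similarity_distillation_loss(teacher_emb, student_emb, teacher_queue,
                                 temperature=0.07):
    """Cross-entropy between teacher and student similarity distributions.

    teacher_queue is a hypothetical set of teacher embeddings of other
    instances; both networks are scored against the same set.
    """
    queue = [normalize(q) for q in teacher_queue]
    t = normalize(teacher_emb)
    s = normalize(student_emb)
    p_teacher = softmax([dot(t, q) for q in queue], temperature)
    p_student = softmax([dot(s, q) for q in queue], temperature)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
```

The loss is smallest when the student's distribution equals the teacher's, so minimizing it pulls the student embedding toward the teacher's view of which instances are similar.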

# (参考訳) 運動計画を用いたブートストラップモータスキル学習

Bootstrapping Motor Skill Learning with Motion Planning ( http://arxiv.org/abs/2101.04736v1 )

ライセンス: CC BY 4.0

Ben Abbatematteo, Eric Rosen, Stefanie Tellex, George Konidaris

(参考訳) ロボットの運動スキルをゼロから学習するのは非現実的なほど遅いため、実際には人間のデモから得られる優れたスキルポリシーを使って学習をブートストラップする必要がある。しかし、人間の実演に頼ると、運用期間を通じて様々なスキルを学習しなければならないロボットの自律性が必然的に低下する。我々は,物体操作のための運動スキル学習をブートストラップする,完全に自律的でサンプル効率の良い方法として,キネマティックな運動計画を用いることを提案する。本研究では,運動プランナーを用いて,動的運動プリミティブ表現を用いた引き出しの開放と,ディープニューラルネットワークポリシによる電子レンジのドアの閉鎖という,ポリシー表現の異なる2つの複雑な物体操作シナリオにおいて,運動スキルのブートストラップを行う。また,ティーに置かれたボールを打つという難しい動的タスクにおいて,シーンを静的とみなしたキネマティック計画はタスクを解決するには不十分であるが,より動的なポリシーをブートストラップするには十分であることを示す。これら3例すべてにおいて,本手法は人間の実演による初期化と同等の性能を示し,ランダムなポリシーから始める場合を大きく上回る。このアプローチにより、ロボットは人間の実演なしに動的タスクの運動ポリシーを効率的かつ自律的に学習することができる。

Learning a robot motor skill from scratch is impractically slow; so much so that in practice, learning must be bootstrapped using a good skill policy obtained from human demonstration. However, relying on human demonstration necessarily degrades the autonomy of robots that must learn a wide variety of skills over their operational lifetimes. We propose using kinematic motion planning as a completely autonomous, sample-efficient way to bootstrap motor skill learning for object manipulation. We demonstrate the use of motion planners to bootstrap motor skills in two complex object manipulation scenarios with different policy representations: opening a drawer with a dynamic movement primitive representation, and closing a microwave door with a deep neural network policy. We also show how our method can bootstrap a motor skill for the challenging dynamic task of learning to hit a ball off a tee, where a kinematic plan based on treating the scene as static is insufficient to solve the task, but sufficient to bootstrap a more dynamic policy. In all three cases, our method is competitive with human-demonstrated initialization, and significantly outperforms starting with a random policy. This approach enables robots to efficiently and autonomously learn motor policies for dynamic tasks without human demonstration.

翻訳日:2021-04-04 04:36:19 公開日:2021-01-12

# (参考訳) 大規模拡張グランガー因果性を用いた機能MRIからの統合失調症の分類

Classification of Schizophrenia from Functional MRI Using Large-scale Extended Granger Causality ( http://arxiv.org/abs/2101.10471v1 )

ライセンス: CC BY 4.0

Axel Wismüller and M. Ali Vosoughi

(参考訳) この文献は統合失調症が脳ネットワーク接続の変化と関連していることを示している。本研究では, 大規模拡張グランガー因果性 (lsXGC) が静止状態fMRIデータを用いてこのような変化を捉えることができるか検討する。本手法は,fMRI時系列間の有向因果関係を推定するための予測時系列モデルにおいて,ソース時系列の増大と合わせて次元削減を利用する。 lsXGCは、他のすべての時系列の存在下で、基礎となる動的システムとの関係を特定するため、多変量アプローチである。ここでlsxgcは、cobre(center of biomedical research excellence)データリポジトリから62名の被験者のサブセットを使用して、統合失調症患者を典型的なコントロールから分類するためのバイオマーカーとして機能する。分類の特徴としてlsxgcによって推定される脳結合を用いる。特徴抽出後,kendallのtauランク相関係数による特徴抽出を行い,サポートベクターマシンを用いた分類を行った。参考法として, 機能的接続性の標準尺度として文献で一般的に用いられる相互相関法と比較した。我々は,100種類の異なるトレーニング/テスト (90%/10%) データを分割して平均精度と受信機動作特性曲線 (auc) 下の平均領域を得る。その結果,lsXGCの平均精度範囲は[0.767,0.940],平均AUC範囲は[0.861,0.983]であった。 lsXGCの結果は, [0.721, 0.751] の平均精度と [0.744, 0.860] の平均 AUC との相互相関の結果よりも有意に高い。統合失調症のバイオマーカーとしてのlsXGCの有用性が示唆された。

The literature indicates that schizophrenia is associated with alterations in brain network connectivity. We investigate whether large-scale Extended Granger Causality (lsXGC) can capture such alterations using resting-state fMRI data. Our method utilizes dimension reduction combined with the augmentation of source time-series in a predictive time-series model for estimating directed causal relationships among fMRI time-series. The lsXGC is a multivariate approach since it identifies the relationship of the underlying dynamic system in the presence of all other time-series. Here lsXGC serves as a biomarker for classifying schizophrenia patients from typical controls using a subset of 62 subjects from the Centers of Biomedical Research Excellence (COBRE) data repository. We use brain connections estimated by lsXGC as features for classification. After feature extraction, we perform feature selection by Kendall's tau rank correlation coefficient followed by classification using a support vector machine. As a reference method, we compare our results with cross-correlation, typically used in the literature as a standard measure of functional connectivity. We cross-validate over 100 different training/test (90%/10%) data splits to obtain the mean accuracy and the mean Area Under the receiver operating characteristic Curve (AUC) across all tested numbers of features for lsXGC. Our results demonstrate a mean accuracy range of [0.767, 0.940] and a mean AUC range of [0.861, 0.983] for lsXGC. The lsXGC results are significantly higher than those obtained with cross-correlation, namely a mean accuracy range of [0.721, 0.751] and a mean AUC range of [0.744, 0.860]. Our results suggest the applicability of lsXGC as a potential biomarker for schizophrenia.

翻訳日:2021-04-04 04:21:57 公開日:2021-01-12
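
The lsXGC abstract above selects features with Kendall's tau rank correlation before the support vector machine. The sketch below illustrates only that selection step, assuming the tau-a variant and a ranking by absolute correlation with the class label; the abstract does not state which variant or ranking rule the authors used, and the SVM itself is left out.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def select_features(samples, labels, k):
    """Return the indices of the k features with the largest |tau|."""
    scored = []
    for f in range(len(samples[0])):
        column = [row[f] for row in samples]
        scored.append((abs(kendall_tau(column, labels)), f))
    # Sort by descending score; break ties by feature index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [f for _, f in scored[:k]]
```

In a cross-validated pipeline like the one described, the selection would be repeated inside each training split so that test subjects never influence which connections are kept.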

# (参考訳) 顔のスプーフィング検出のためのコンパクトなディープラーニングモデル

A Compact Deep Learning Model for Face Spoofing Detection ( http://arxiv.org/abs/2101.04756v1 )

ライセンス: CC BY 4.0

Seyedkooshan Hashemifard and Mohammad Akbari

(参考訳) 近年,顔バイオメトリック・セキュリティシステムが急速に普及しているため,プレゼンテーションアタック検出(PAD)は研究コミュニティから注目され,主要な研究分野となっている。研究者は、lpp、bsif、lpqなどの従来のテクスチャ特徴抽出の活用から、異なるアーキテクチャのディープニューラルネットワークの利用まで、様々な方法でこの問題に取り組んでいる。これらの技術は特定の攻撃シナリオやデータセットに対してそれぞれ達成されているが、その効率は特定の種類のプレゼンテーションアタックや機器(PAI)に限られているため、そのほとんどが目に見えない条件の問題を一般化できなかった。本稿では,手作りのテクスチャ特徴を完全に抽出したり,深層ニューラルネットワークにのみ依存するのではなく,広部と深部の両方を統合型ニューラルネットワークアーキテクチャで融合することで,この問題に対処する。主なアイデアは、両方の方法の強みを生かして、問題に対するよく一般化された解決策を導出することである。また,提案手法をそれぞれ別々に比較することにより,本手法の有効性を評価した。この手順は、ROSE-Youtu、SiW、NUAA Imposterデータセットなど、さまざまなスプーフィングデータセットで実行される。特に,スプーフィング検出タスク(ディープチャネル)のための畳み込みニューラルネットワーク設計を通じて学習したデータ駆動型特徴を応用した低次元潜在空間を同時学習し,スプーフィング検出機能を利用した周波数・時間次元(ワイドチャネル)のスプーフィング検出機能を活用する。

In recent years, face biometric security systems have been spreading rapidly; as a result, presentation attack detection (PAD) has received significant attention from research communities and has become a major field of research. Researchers have tackled the problem with various methods, from exploiting conventional texture feature extraction such as LBP, BSIF, and LPQ to using deep neural networks with different architectures. Despite the results each of these techniques has achieved for a certain attack scenario or dataset, most of them still fail to generalize to unseen conditions, as the effectiveness of each is limited to certain types of presentation attacks and instruments (PAI). In this paper, instead of completely extracting hand-crafted texture features or relying only on deep neural networks, we address the problem via fusing both wide and deep features in a unified neural architecture. The main idea is to take advantage of the strength of both methods to derive a well-generalized solution for the problem. We also evaluated the effectiveness of our method by comparing the results with each of the mentioned techniques separately. The procedure is done on different spoofing datasets such as ROSE-Youtu, SiW and NUAA Imposter datasets. In particular, we simultaneously learn a low-dimensional latent space empowered with data-driven features learnt via a Convolutional Neural Network designed for the spoofing detection task (i.e., deep channel) and leverage spoofing detection features already popular for spoofing in the frequency and temporal dimensions (i.e., via wide channel).

翻訳日:2021-04-04 04:09:29 公開日:2021-01-12

# (参考訳) 基数評価と順序評価の合同集約と学生用紙コンテストへの応用

Joint aggregation of cardinal and ordinal evaluations with an application to a student paper competition ( http://arxiv.org/abs/2101.04765v1 )

ライセンス: CC BY 4.0

Dorit S. Hochbaum and Erick Moreno-Centeno

(参考訳) 決定論における重要な問題は、個々のランク/レーティングを集団評価に集約することである。 2007 MSOMの学生論文コンペティションにおける新たな集約手法について述べる。この競争における集合問題は2つの課題をもたらす。第一に、各論文は裁判官のごくわずかな部分でのみレビューされ、その結果、総合評価は裁判官が選択した主観的な尺度に非常に敏感である。第二に、裁判官は審査した論文の基数評価と順序評価(格付けとランク付け)の両方を提供した。ここでの貢献は、順序と基数の評価を共同で総合評価に集約する新しい堅牢な方法論である。この方法論は、不完全な評価の場合、すなわち、個人がオブジェクトの厳密なサブセットのみを評価する場合に特に適しています。このアプローチは、大規模なプロジェクトや複数の優先順位を含む資本予算からプロジェクトを選択する委員会による管理的意思決定の問題において、潜在的に有用である。

An important problem in decision theory concerns the aggregation of individual rankings/ratings into a collective evaluation. We illustrate a new aggregation method in the context of the 2007 MSOM's student paper competition. The aggregation problem in this competition poses two challenges. Firstly, each paper was reviewed only by a very small fraction of the judges; thus the aggregate evaluation is highly sensitive to the subjective scales chosen by the judges. Secondly, the judges provided both cardinal and ordinal evaluations (ratings and rankings) of the papers they reviewed. The contribution here is a new robust methodology that jointly aggregates ordinal and cardinal evaluations into a collective evaluation. This methodology is particularly suitable in cases of incomplete evaluations -- i.e., when the individuals evaluate only a strict subset of the objects. This approach is potentially useful in managerial decision making problems by a committee selecting projects from a large set or capital budgeting involving multiple priorities.

翻訳日:2021-04-04 03:14:51 公開日:2021-01-12

# (参考訳) DuctTake:時空間ビデオ合成

DuctTake: Spatiotemporal Video Compositing ( http://arxiv.org/abs/2101.04772v1 )

ライセンス: CC BY 4.0

Jan Rueegg, Oliver Wang, Aljoscha Smolic, Markus Gross

(参考訳) DuctTakeは、シーンの複数のテイクを単一のビデオに実用的な合成を可能にするように設計されたシステムである。現在の業界ソリューションはオブジェクトセグメンテーション(オブジェクトセグメンテーション)に基づいており、手動入力とクリーンアップを必要とする難しい問題であり、フィルム製造プロセスの高価な部分を構成する。そこで本手法では,映像の体積を3次元グラフで補正し,最適な時空間シームを合成する。我々は,hd動画を合成するインタラクティブなツールとして,各セクションの実行時間と性能に特に注意を払いながら,必要なコンポーネント,決定,新しいテクニックを詳細に説明する。我々は,幅広い実例を提示し,現在最先端のツールを用いて,プロのアーティストが作成した複合作品と結果品質と作成時間を比較することにより,このアプローチを検証する。

DuctTake is a system designed to enable practical compositing of multiple takes of a scene into a single video. Current industry solutions are based around object segmentation, a hard problem that requires extensive manual input and cleanup, making compositing an expensive part of the film-making process. Our method instead composites shots together by finding optimal spatiotemporal seams using motion-compensated 3D graph cuts through the video volume. We describe in detail the required components, decisions, and new techniques that together make a usable, interactive tool for compositing HD video, paying special attention to running time and performance of each section. We validate our approach by presenting a wide variety of examples and by comparing result quality and creation time to composites made by professional artists using current state-of-the-art tools.

翻訳日:2021-04-04 02:58:59 公開日:2021-01-12

# (参考訳) 音声駆動サービスにおける実践的音声再使用防止

Practical Speech Re-use Prevention in Voice-driven Services ( http://arxiv.org/abs/2101.04773v1 )

ライセンス: CC BY 4.0

Yangyong Zhang, Maliheh Shirvanian, Sunpreet S. Arora, Jianwei Huang, and Guofei Gu

(参考訳) 音声駆動サービス(VDS)は、スマートホームコントロールからデジタルアシスタントを使った支払いまで、さまざまなアプリケーションで使用されている。このようなサービスへの入力は、オープンな音声チャンネル、例えばマイクを使って、教師なしの設定でキャプチャされることが多い。このような設定における運用上のセキュリティ要件の1つは、入力音声の鮮度である。本稿では,ユーザインタラクション時に動的音響ノイズを積極的に埋め込んだセキュリティオーバーレイであるAEOLUSについて述べる。音響ノイズは, (i) 確実に組込み, 取り出しが可能であり, (ii) 非破壊的 (かつ, 不可避) なvdsユーザであることを示す。実用的観点から、(i)および(ii)に対して最適なパラメータ(音響ナンスの動作周波数、振幅、ビットレート)を決定する。実験の結果,AEOLUSは背景雑音レベルが異なる3つの実環境において,音声の再使用防止のために0% FARで0.5%FRRを得ることがわかった。また,120名の被験者によるユーザ調査を行い,これらの環境では,94.16%の音声サンプルにおいて,全体のユーザエクスペリエンスが低下しないことを示した。そのため、AEOLUSは音声の再使用を防止し、音声入力の鮮度を確保するために実際に使用することができる。

Voice-driven services (VDS) are being used in a variety of applications ranging from smart home control to payments using digital assistants. The input to such services is often captured via an open voice channel, e.g., using a microphone, in an unsupervised setting. One of the key operational security requirements in such a setting is the freshness of the input speech. We present AEOLUS, a security overlay that proactively embeds a dynamic acoustic nonce at the time of user interaction, and detects the presence of the embedded nonce in the recorded speech to ensure freshness. We demonstrate that the acoustic nonce can (i) be reliably embedded and retrieved, and (ii) be non-disruptive (and even imperceptible) to a VDS user. Optimal parameters (acoustic nonce's operating frequency, amplitude, and bitrate) are determined for (i) and (ii) from a practical perspective. Experimental results show that AEOLUS yields 0.5% FRR at 0% FAR for speech re-use prevention up to a distance of 4 meters in three real-world environments with different background noise levels. We also conduct a user study with 120 participants, which shows that the acoustic nonce does not degrade overall user experience for 94.16% of speech samples, on average, in these environments. AEOLUS can therefore be used in practice to prevent speech re-use and ensure the freshness of speech input.

翻訳日:2021-04-04 02:42:29 公開日:2021-01-12
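
The AEOLUS abstract above relies on a nonce that is fresh for each interaction and must be found again in the recorded speech. The sketch below covers only the bookkeeping side of such a freshness check, under the assumption that some audio front end (not shown) embeds the nonce and later reports the value it detected; the class `NonceRegistry` and its methods are hypothetical names, not part of AEOLUS.

```python
import secrets


class NonceRegistry:
    """Tracks nonces that were issued but not yet seen in recorded speech."""

    def __init__(self):
        self._outstanding = set()

    def issue(self, n_bits=32):
        # A fresh random value for this interaction; the (assumed) audio
        # front end would embed it acoustically while the user speaks.
        nonce = secrets.randbits(n_bits)
        self._outstanding.add(nonce)
        return nonce

    def accept(self, detected_nonce):
        # Speech is fresh only if it carries a nonce that is still
        # outstanding. The nonce is consumed, so a replay of the same
        # recording is rejected.
        if detected_nonce in self._outstanding:
            self._outstanding.remove(detected_nonce)
            return True
        return False
```

A recording without any detectable nonce would reach `accept` with a value such as `None` and be rejected in the same way as a replay.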

# (参考訳) マルチエージェントmdpのためのスケーラブルなanytime planning

Scalable Anytime Planning for Multi-Agent MDPs ( http://arxiv.org/abs/2101.04788v1 )

ライセンス: CC BY 4.0

Shushman Choudhury, Jayesh K. Gupta, Peter Morales, Mykel J. Kochenderfer

(参考訳) 動的協調を必要とする大規模マルチエージェントシーケンシャル決定問題に対して,スケーラブルな木探索計画アルゴリズムを提案する。エージェントのチームは多くのドメインで決定をコーディネートする必要があるが、単純なアプローチはエージェントの数と共同アクション空間が指数関数的に増加するために失敗する。私たちはこの複雑さを、近似品質と動的に協調する動作のために計算を交換できるanytimeアプローチを通じて回避します。提案アルゴリズムは,モンテカルロ木探索 (MCTS) を用いたオンライン計画,協調グラフを用いた局所エージェント相互作用の因子表現,および協調行動選択のための反復マックスプラス法からなる。我々は,静的コーディネーショングラフを用いたベンチマークSysAdminのアプローチを評価し,MCTSベースラインよりも計算コストがはるかに低い性能を実現する。また,動的,すなわち状態依存のコーディネーショングラフを持つマルチドローン配送ドメインを導入し,我々のアプローチが,他のmctsメソッドでは難解なこの領域の大きな問題にどのようにスケールするかを実証する。我々はこのアルゴリズムのオープンソース実装をhttps://github.com/JuliaPOMDP/FactoredValueMCTS.jlで公開しています。

We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration. Teams of agents need to coordinate decisions in many domains, but naive approaches fail due to the exponential growth of the joint action space with the number of agents. We circumvent this complexity through an anytime approach that allows us to trade computation for approximation quality and also dynamically coordinate actions. Our algorithm comprises three elements: online planning with Monte Carlo Tree Search (MCTS), factored representations of local agent interactions with coordination graphs, and the iterative Max-Plus method for joint action selection. We evaluate our approach on the benchmark SysAdmin domain with static coordination graphs and achieve comparable performance with much lower computation cost than our MCTS baselines. We also introduce a multi-drone delivery domain with dynamic, i.e., state-dependent coordination graphs, and demonstrate how our approach scales to large problems on this domain that are intractable for other MCTS methods. We provide an open-source implementation of our algorithm at https://github.com/JuliaPOMDP/FactoredValueMCTS.jl.

翻訳日:2021-04-04 02:25:43 公開日:2021-01-12
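
The multi-agent planning abstract above names two ideas that a small example can make concrete: the joint action space grows as `n_actions ** n_agents`, and Max-Plus message passing on a coordination graph avoids enumerating it. The sketch below shows Max-Plus for a single one-shot decision with pairwise payoffs only; the MCTS part of the paper's algorithm is not reproduced, and the function names and payoff layout are assumptions made for illustration.

```python
from itertools import product


def brute_force(n_agents, n_actions, edges, payoff):
    """Enumerate all n_actions ** n_agents joint actions (exponential)."""
    def value(joint):
        return sum(payoff[(i, j)][joint[i]][joint[j]] for i, j in edges)
    return list(max(product(range(n_actions), repeat=n_agents), key=value))


def max_plus(n_agents, n_actions, edges, payoff, n_iters=10):
    """Iterative Max-Plus; payoff[(i, j)][a_i][a_j] is the edge payoff."""
    neighbors = {i: [] for i in range(n_agents)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def local(i, j, a_i, a_j):
        if (i, j) in payoff:
            return payoff[(i, j)][a_i][a_j]
        return payoff[(j, i)][a_j][a_i]

    # msgs[(i, j)][a_j]: what agent i tells agent j about j's action a_j.
    msgs = {(i, j): [0.0] * n_actions for i in neighbors for j in neighbors[i]}
    for _ in range(n_iters):
        updated = {}
        for (i, j) in msgs:
            out = [
                max(local(i, j, a_i, a_j)
                    + sum(msgs[(k, i)][a_i] for k in neighbors[i] if k != j)
                    for a_i in range(n_actions))
                for a_j in range(n_actions)
            ]
            mean = sum(out) / n_actions
            updated[(i, j)] = [v - mean for v in out]  # keep values bounded
        msgs = updated
    # Each agent picks the action with the largest sum of incoming messages.
    return [
        max(range(n_actions),
            key=lambda a, i=i: sum(msgs[(k, i)][a] for k in neighbors[i]))
        for i in range(n_agents)
    ]
```

On tree-structured graphs such as a chain, Max-Plus recovers the optimal joint action when that optimum is unique; on graphs with cycles it is an approximation, which fits the anytime trade-off between computation and quality described in the abstract.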

# インスタント適応のための線形表現メタ強化学習

Linear Representation Meta-Reinforcement Learning for Instant Adaptation ( http://arxiv.org/abs/2101.04750v1 )

ライセンス: Link先を確認

Matt Peng, Banghua Zhu, Jiantao Jiao

(参考訳) 本稿では,Fast Linearized Adaptive Policy (FLAP)について紹介する。これは,学習中のデータ再利用を必要とせず,かつ,テスト中のサンプル数個だけでほぼ瞬時に適応できる,新しいメタ強化学習(meta-RL)手法である。 FLAPは方針の共有線形表現を学習するアイデアに基づいており、新しいタスクに適応すると、線形重みの集合を予測するのに十分である。適応中は、MAMLのような従来のメタRL法のように勾配勾配を更新する代わりに、アダプティブネットワークを用いてこれらの線形重み付けを予測することで、新しいポリシーを得られるように、個別のアダプタネットワークを同時に訓練する。異なるフィードフォワードネットワークの応用は、適応実行時間を著しく高速化するだけでなく、以前のMeta-RLメソッドでは一般化できなかった非常に異なるタスクに非常によく一般化する。標準の連続制御メタrlベンチマーク実験では、flapは平均リターンを最大2倍にし、以前の方法と比較して最大8倍高速に適応した実行時間速度を示す。

This paper introduces Fast Linearized Adaptive Policy (FLAP), a new meta-reinforcement learning (meta-RL) method that is able to extrapolate well to out-of-distribution tasks without the need to reuse data from training, and adapt almost instantaneously with the need of only a few samples during testing. FLAP builds upon the idea of learning a shared linear representation of the policy so that when adapting to a new task, it suffices to predict a set of linear weights. A separate adapter network is trained simultaneously with the policy such that during adaptation, we can directly use the adapter network to predict these linear weights instead of updating a meta-policy via gradient descent, such as in prior meta-RL methods like MAML, to obtain the new policy. The application of the separate feed-forward network not only speeds up the adaptation run-time significantly, but also generalizes extremely well to very different tasks that prior Meta-RL methods fail to generalize to. Experiments on standard continuous-control meta-RL benchmarks show FLAP presenting significantly stronger performance on out-of-distribution tasks with up to double the average return and up to 8X faster adaptation run-time speeds when compared to prior methods.

翻訳日:2021-04-04 01:55:07 公開日:2021-01-12

# 文脈問題:手話認識のための自己認識

Context Matters: Self-Attention for Sign Language Recognition ( http://arxiv.org/abs/2101.04632v1 )

ライセンス: Link先を確認

Fares Ben Slimane and Mohamed Bouguessa

(参考訳) 本稿では,連続手話認識のための注意ネットワークを提案する。提案手法は,手話のモダリティをモデル化するために,共依存データストリームを利用する。これらの異なる情報チャネルは、互いに複雑な時間構造を共有することができる。そのため、私たちは同期に注意を払い、異なる手話コンポーネント間の絡み合った依存関係を捉えるのに役立ちます。手話はマルチチャネルであるにもかかわらず、手形は手話解釈の中心的な実体を表す。正しい文脈で手形を見ることは、記号の意味を定義する。これを考慮し、注意機構を用いて、手の特徴を適切な時空間で効率的に集約し、より優れた手話認識を実現する。これによってモデルは、支配的な手と顔の領域を中心に回転する重要な手話コンポーネントを識別できることが分かりました。ベンチマークデータセットであるRWTH-PHOENIX-Weather 2014でテストを行い、競争結果を得た。

This paper proposes an attentional network for the task of Continuous Sign Language Recognition. The proposed approach exploits co-independent streams of data to model the sign language modalities. These different channels of information can share a complex temporal structure between each other. For that reason, we apply attention to synchronize and help capture entangled dependencies between the different sign language components. Even though Sign Language is multi-channel, handshapes represent the central entities in sign interpretation. Seeing handshapes in their correct context defines the meaning of a sign. Taking that into account, we utilize the attention mechanism to efficiently aggregate the hand features with their appropriate spatio-temporal context for better sign recognition. We found that by doing so the model is able to identify the essential Sign Language components that revolve around the dominant hand and the face areas. We test our model on the benchmark dataset RWTH-PHOENIX-Weather 2014, yielding competitive results.

翻訳日:2021-04-04 01:54:42 公開日:2021-01-12
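
The sign language abstract above uses attention to aggregate hand features with their spatio-temporal context. As a rough illustration, the sketch below assumes the standard scaled dot-product form of attention, with a hand feature as the query and per-frame context features as keys and values; the abstract does not spell out the exact formulation, so this is an assumption.

```python
import math


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention-weighted context vector and the weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [
        sum(w * value[d] for w, value in zip(weights, values))
        for d in range(len(values[0]))
    ]
    return context, weights
```

Frames whose keys align with the hand query receive larger weights, so the resulting context vector is dominated by the frames most relevant to the current handshape.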

# ビデオ感性分析のための量子認知型決定融合

Quantum Cognitively Motivated Decision Fusion for Video Sentiment Analysis ( http://arxiv.org/abs/2101.04406v1 )

ライセンス: Link先を確認

Dimitris Gkoumas, Qiuchi Li, Shahram Dehdashti, Massimo Melucci, Yijun Yu, Dawei Song

(参考訳) 意思決定プロセスとしての映像感情分析は本質的に複雑であり、複数のモダリティからの意思決定の融合や、いわゆる認知バイアスが伴う。量子認知の最近の進歩に触発されて、あるモダリティからの感情判断が他のモダリティの判断と相容れないこと、すなわち秩序が問題であり、最終的な決定を下すために共同で測定できないことを示す。したがって、認知過程は古典的確率論では捉えられない「量子的」バイアスを示す。そこで本研究では,感情判断予測のための新しい量子認知的融合戦略を提案する。特に、正および負の感性判断の量子重ね合わせ状態として発話を定式化し、一様分類器を相互に相反する可観測量として、正の演算値測度を持つ複素数値ヒルベルト空間上で定式化する。 2つのベンチマークデータセットの実験は、我々のモデルが既存の決定レベルと最先端のコンテンツレベルの融合アプローチを大きく上回っていることを示している。また,不整合性の概念は,すべてのユニモーダル分類器によって誤って予測される極端な事例を含む,すべての組み合わせパターンを効果的に扱えることを示す。

Video sentiment analysis as a decision-making process is inherently complex, involving the fusion of decisions from multiple modalities and the so-caused cognitive biases. Inspired by recent advances in quantum cognition, we show that the sentiment judgment from one modality could be incompatible with the judgment from another, i.e., the order matters and they cannot be jointly measured to produce a final decision. Thus the cognitive process exhibits "quantum-like" biases that cannot be captured by classical probability theories. Accordingly, we propose a fundamentally new, quantum cognitively motivated fusion strategy for predicting sentiment judgments. In particular, we formulate utterances as quantum superposition states of positive and negative sentiment judgments, and uni-modal classifiers as mutually incompatible observables, on a complex-valued Hilbert space with positive-operator valued measures. Experiments on two benchmarking datasets illustrate that our model significantly outperforms various existing decision level and a range of state-of-the-art content-level fusion approaches. The results also show that the concept of incompatibility allows effective handling of all combination patterns, including those extreme cases that are wrongly predicted by all uni-modal classifiers.

翻訳日:2021-04-04 01:54:30 公開日:2021-01-12
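
The quantum cognition abstract above hinges on incompatible observables, for which the order of judgments matters. A worked example in a real two-dimensional space shows the effect with ordinary projectors; the paper itself works in a complex-valued Hilbert space with positive-operator valued measures, so the sketch below is a simplification, and the angles used in the usage note are arbitrary.

```python
import math


def projector(theta):
    """Projector onto the unit vector (cos(theta), sin(theta))."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def apply(matrix, vec):
    return [
        matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
        matrix[1][0] * vec[0] + matrix[1][1] * vec[1],
    ]


def sequential_yes_probability(state, first, second):
    """Probability of answering 'yes' to `first` and then to `second`."""
    after_both = apply(second, apply(first, state))
    return after_both[0] ** 2 + after_both[1] ** 2
```

For the state `[1.0, 0.0]` and projectors at angles pi/6 and pi/3, measuring in one order gives cos²(pi/6)·cos²(pi/6) = 0.5625, while the reverse order gives cos²(pi/3)·cos²(pi/6) = 0.1875. A single classical joint distribution over the two judgments cannot produce both numbers, which is the order effect the abstract refers to.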

# マルチモーダルレシピにおける手続き的概念の潜在アライメント

Latent Alignment of Procedural Concepts in Multimodal Recipes ( http://arxiv.org/abs/2101.04727v1 )

ライセンス: Link先を確認

Hossein Rajaby Faghihi, Roshanak Mirzaee, Sudarshan Paliwal, and Parisa Kordjamshidi

(参考訳) 本稿では、新たにリリースされたマルチモーダルQAデータセットRecipeQAの手続き的推論を扱うための新しいアライメント機構を提案する。私たちのモデルは,画像と指示を含むレシピの読み解き理解であるテキストクローゼタスクを解決している。我々は,アテンションネットワーク,クロスモーダル表現,命令と候補回答間の潜在アライメント空間のパワーを活用し,この問題を解決した。本稿では,アライメント行列の最大プーリング操作を洗練し,モデルの出力間に不一致な制約を課す制約付きマックスプーリングを提案する。評価の結果,ベースラインに対して19%の改善が見られた。

We propose a novel alignment mechanism to deal with procedural reasoning on a newly released multimodal QA dataset, named RecipeQA. Our model solves the textual cloze task, which is a reading comprehension task on a recipe containing images and instructions. We exploit the power of attention networks, cross-modal representations, and a latent alignment space between instructions and candidate answers to solve the problem. We introduce constrained max-pooling, which refines the max-pooling operation on the alignment matrix to impose disjoint constraints among the outputs of the model. Our evaluation result indicates a 19% improvement over the baselines.

翻訳日:2021-04-04 01:53:52 公開日:2021-01-12

# UFA-FUSE:多焦点画像融合のための新しい深層教師付きハイブリッドモデル

UFA-FUSE: A novel deep supervised and hybrid model for multi-focus image fusion ( http://arxiv.org/abs/2101.04506v1 )

ライセンス: Link先を確認

Yongsheng Zang, Dongming Zhou, Changcheng Wang, Rencan Nie, and Yanbu Guo

(参考訳) 従来の深層学習に基づく融合法は中間決定マップを生成し、一連の後処理手順を通じて融合画像を得る。しかし、これらの方法で生成された融合結果は、ソースイメージの詳細や成果物を失うことは容易である。ディープラーニングに基づく画像再構成技術に着想を得て,これらの課題をエンドツーエンドかつ教師付き学習方法で解決するために,ポストプロセッシングを伴わないマルチフォーカス画像融合ネットワークフレームワークを提案する。融合モデルを十分に訓練するために,地上融合画像を用いた大規模マルチフォーカス画像データセットを作成した。さらに,より情報的な融合画像を得るために,チャネルアテンションモジュールと空間アテンションモジュールから構成されるユニタリフュージョンアテンションに基づく新しい融合戦略を設計した。具体的には,提案手法は主に特徴抽出,特徴融合,画像再構成の3つの要素からなる。まず,7つの畳み込みブロックを用いて画像の特徴を抽出する。そして, 抽出した畳み込み特性を, 特徴融合層の融合戦略により融合させる。最後に、融合画像の特徴を4つの畳み込みブロックで再構成する。実験の結果, 提案手法は19の最先端融合法と比較して, 優れた核融合性能が得られることがわかった。

Traditional and deep learning-based fusion methods generate an intermediate decision map and obtain the fused image through a series of post-processing procedures. However, the fusion results generated by these methods tend to lose some source image details or contain artifacts. Inspired by the image reconstruction techniques based on deep learning, we propose a multi-focus image fusion network framework without any post-processing to solve these problems in an end-to-end, supervised learning way. To sufficiently train the fusion model, we have generated a large-scale multi-focus image dataset with ground-truth fusion images. Moreover, to obtain a more informative fused image, we further design a novel fusion strategy based on unity fusion attention, which is composed of a channel attention module and a spatial attention module. Specifically, the proposed fusion approach mainly comprises three key components: feature extraction, feature fusion and image reconstruction. We first utilize seven convolutional blocks to extract the image features from source images. Then, the extracted convolutional features are fused by the proposed fusion strategy in the feature fusion layer. Finally, the fused image features are reconstructed by four convolutional blocks. Experimental results demonstrate that the proposed approach for multi-focus image fusion achieves remarkable fusion performance compared to 19 state-of-the-art fusion methods.

翻訳日:2021-04-04 01:53:40 公開日:2021-01-12

# 意味的特徴から物体間の相対的深さの予測

Predicting Relative Depth between Objects from Semantic Features ( http://arxiv.org/abs/2101.04626v1 )

ライセンス: Link先を確認

Stefan Cassar, Adrian Muscat, Dylan Seychell

(参考訳) 視覚関係検出や視覚的質問応答といった視覚および言語タスクは、言語を適切に接地できる意味的特徴から恩恵を受ける。 2次元画像で描かれた物体の3次元深度はそのような特徴である。しかし,シーン依存の適切な特徴を学習することなく正確な深度情報を得るのは難しい。この領域における技術の現状は、ステレオ画像データに基づいて訓練された複雑なニューラルネットワークモデルであり、ピクセルごとの深さを予測する。幸いなことに、いくつかのタスクでは、必要なオブジェクト間の相対的な深さのみである。本稿では,意味的特徴がコース相対深さを予測できる程度について検討する。この問題を分類として、オブジェクト境界ボックスに基づく幾何学的特徴として、オブジェクトラベルとシーン属性を計算し、パターン認識モデルの入力として使用して相対深さを予測する。後ろに、正面に、中立に。結果は,最先端技術を表すモノデプスニューラルネットワークモデルの出力を平均化した結果と比較する。モノディープスモデルから計算した相対深度に対する相対深度精度の14%の総合的な増加が達成された。

Vision and language tasks such as Visual Relation Detection and Visual Question Answering benefit from semantic features that afford proper grounding of language. The 3D depth of objects depicted in 2D images is one such feature. However, it is very difficult to obtain accurate depth information without learning the appropriate features, which are scene dependent. The state of the art in this area is complex Neural Network models trained on stereo image data to predict depth per pixel. Fortunately, in some tasks, it is only the relative depth between objects that is required. In this paper, the extent to which semantic features can predict coarse relative depth is investigated. The problem is cast as a classification one: geometrical features based on object bounding boxes, object labels and scene attributes are computed and used as inputs to pattern recognition models to predict relative depth, i.e., behind, in front, or neutral. The results are compared to those obtained from averaging the output of the monodepth neural network model, which represents the state of the art. An overall increase of 14% in relative depth accuracy over relative depth computed from the monodepth model output is achieved.

翻訳日:2021-04-04 01:53:22 公開日:2021-01-12
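
The relative depth abstract above feeds geometrical features computed from object bounding boxes into a classifier. The abstract does not list the features, so the sketch below computes three plausible ones (area ratio, offset of the bottom edges, and overlap) purely as hypothetical examples of what such an input vector might contain; object labels and scene attributes are not covered.

```python
def box_area(box):
    """Area of a box given as (x_min, y_min, x_max, y_max) in pixels."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def box_features(box_a, box_b):
    """Hypothetical geometric cues for the relative depth of two objects."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    union = box_area(box_a) + box_area(box_b) - inter
    return {
        # Objects that appear larger are often (not always) closer.
        "area_ratio": box_area(box_a) / box_area(box_b),
        # With image y growing downwards, a lower bottom edge often
        # indicates an object nearer to the camera.
        "bottom_offset": box_a[3] - box_b[3],
        # Overlap hints at possible occlusion between the two objects.
        "iou": inter / union if union > 0 else 0.0,
    }
```

A classifier over the three classes named in the abstract (behind, in front, neutral) would take such a dictionary, flattened to a vector, together with encoded object labels.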

# 高精細画像合成のための高速安定化GAN訓練に向けて

Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis ( http://arxiv.org/abs/2101.04775v1 )

ライセンス: Link先を確認

Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal

(参考訳) 高忠実度画像に対するGAN(Generative Adversarial Networks)のトレーニングは通常、大規模なGPUクラスタと大量のトレーニングイメージを必要とする。本稿では,最小計算コストでganの少数ショット画像合成タスクについて検討する。 1024*1024の解像度で優れた品質が得られる軽量gan構造を提案する。特に、モデルは1つのRTX-2080 GPUでわずか数時間のトレーニングでゼロから収束し、100以下のトレーニングサンプルでも一貫したパフォーマンスを持つ。機能エンコーダとして訓練されたスキップ層チャネル方向励振モジュールと自己教師付き判別器である。さまざまなイメージドメインをカバーする13のデータセット(データセットとコードはhttps://github.com/odegeasslbc/fastgan-pytorchで利用可能)では、データとコンピューティング予算が限られている場合、最先端のstylegan2よりも優れたパフォーマンスを示しています。

Training Generative Adversarial Networks (GAN) on high-fidelity images usually requires large-scale GPU-clusters and a vast number of training images. In this paper, we study the few-shot image synthesis task for GAN with minimum computing cost. We propose a light-weight GAN structure that gains superior quality on 1024*1024 resolution. Notably, the model converges from scratch with just a few hours of training on a single RTX-2080 GPU, and has a consistent performance, even with fewer than 100 training samples. Two technique designs constitute our work: a skip-layer channel-wise excitation module and a self-supervised discriminator trained as a feature-encoder. With thirteen datasets covering a wide variety of image domains (The datasets and code are available at: https://github.com/odegeasslbc/FastGAN-pytorch), we show our model's superior performance compared to the state-of-the-art StyleGAN2, when data and computing budget are limited.

翻訳日:2021-04-04 01:53:05 公開日:2021-01-12

# オンライン旅行目的地予測のための統一フレームワーク

A Unified Framework for Online Trip Destination Prediction ( http://arxiv.org/abs/2101.04520v1 )

ライセンス: Link先を確認

Victor Eberstein, Jonas Sjöblom, Nikolce Murgovski, Morteza Haghir Chehreghani

(参考訳) 旅行先予測は、旅行計画、自動運転、電気自動車など、多くのアプリケーションで重要性を増している分野である。この問題は、データがシーケンシャルな方法で到着するオンライン学習パラダイムで自然に解決することができるが、研究の大半はむしろオフライン設定だと考えている。本稿では,オンライントレーニングとオンライン予測の両方に適したオンライン環境での旅行先予測の統一フレームワークを提案する。この目的のために,2つのクラスタリングアルゴリズムを開発し,この問題に対する2つのオンライン予測モデルに統合する。実世界のデータセットにおけるクラスタリングアルゴリズムと予測モデルの異なる構成について検討する。従来のクラスタリングのメトリクスと精度を用いて、クラスタリングとフレームワーク全体がオフライン環境と比べて一貫した結果をもたらすことを実証する。最後に、オフラインのフレームワークと比較し、オンラインフレームワーク全体を評価するための新しい後悔の指標を提案する。このメトリックにより、誤った予測のソースをクラスタリングまたは予測モデルのいずれかに関連付けることができる。このメトリックを用いて,提案手法が真の分布に類似した確率分布に収束し,ベースラインのすべてよりも低い後悔を味わうことを示す。

Trip destination prediction is an area of increasing importance in many applications such as trip planning, autonomous driving and electric vehicles. Even though this problem could be naturally addressed in an online learning paradigm where data is arriving in a sequential fashion, the majority of research has rather considered the offline setting. In this paper, we present a unified framework for trip destination prediction in an online setting, which is suitable for both online training and online prediction. For this purpose, we develop two clustering algorithms and integrate them within two online prediction models for this problem. We investigate the different configurations of clustering algorithms and prediction models on a real-world dataset. By using traditional clustering metrics and accuracy, we demonstrate that both the clustering and the entire framework yield consistent results compared to the offline setting. Finally, we propose a novel regret metric for evaluating the entire online framework in comparison to its offline counterpart. This metric makes it possible to relate the source of erroneous predictions to either the clustering or the prediction model. Using this metric, we show that the proposed methods converge to a probability distribution resembling the true underlying distribution and enjoy a lower regret than all of the baselines.

翻訳日:2021-04-04 01:52:27 公開日:2021-01-12

# ベンチマークシミュレーションに基づく推論

Benchmarking Simulation-Based Inference ( http://arxiv.org/abs/2101.04653v1 )

ライセンス: Link先を確認

Jan-Matthis Lueckmann, Jan Boelts, David S. Greenberg, Pedro J. Gonçalves, Jakob H. Macke

(参考訳) 確率的モデリングの最近の進歩は、確率の数値的評価を必要としない多くのシミュレーションに基づく推論アルゴリズムを生み出した。しかし、このような'likelihood-free'アルゴリズムに適切なパフォーマンス指標を持つ公開ベンチマークは欠落している。これにより、アルゴリズムの比較と、その強みと弱みの特定が難しくなった。私たちは、推論タスクと適切なパフォーマンスメトリクスを備えたベンチマークを提供し、ニューラルネットワークと古典的な近似ベイズ計算手法を用いた最近のアプローチを含むアルゴリズムを初期選択します。性能指標の選択は重要であり、最先端のアルゴリズムでさえ改善の余地があり、逐次推定によりサンプリング効率が向上することがわかった。ニューラルネットワークベースのアプローチは一般的にパフォーマンスが向上するが、一様に最適なアルゴリズムはない。我々は,問題を診断し,アルゴリズムを改善するためのベンチマークの可能性を強調し,実践的なアドバイスを提供する。結果はコンパニオンwebサイトでインタラクティブに探すことができる。すべてのコードはオープンソースであり、さらなるベンチマークタスクと推論アルゴリズムに貢献することができる。

Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods. However, a public benchmark with appropriate performance metrics for such 'likelihood-free' algorithms has been lacking. This has made it difficult to compare algorithms and identify their strengths and weaknesses. We set out to fill this gap: We provide a benchmark with inference tasks and suitable performance metrics, with an initial selection of algorithms including recent approaches employing neural networks and classical Approximate Bayesian Computation methods. We found that the choice of performance metric is critical, that even state-of-the-art algorithms have substantial room for improvement, and that sequential estimation improves sample efficiency. Neural network-based approaches generally exhibit better performance, but there is no uniformly best algorithm. We provide practical advice and highlight the potential of the benchmark to diagnose problems and improve algorithms. The results can be explored interactively on a companion website. All code is open source, making it possible to contribute further benchmark tasks and inference algorithms.

翻訳日:2021-04-04 01:52:12 公開日:2021-01-12
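
The benchmark abstract above includes classical Approximate Bayesian Computation among the likelihood-free algorithms it compares. Rejection ABC is the simplest member of that family and can be sketched in a few lines; the function signature and the seeded generator below are illustrative choices, not the benchmark's API.

```python
import random


def rejection_abc(observed, simulate, prior_sample, distance,
                  epsilon, n_draws, seed=0):
    """Keep prior draws whose simulated data land within epsilon of the
    observation. No likelihood is evaluated at any point."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        simulated = simulate(theta, rng)
        if distance(simulated, observed) <= epsilon:
            accepted.append(theta)
    return accepted
```

The accepted draws approximate the posterior. A smaller `epsilon` gives a better approximation but rejects more simulations, which is one reason sample efficiency is a central metric when such algorithms are compared.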

# コミュニケーションのためのモデルベース機械学習

Model-Based Machine Learning for Communications ( http://arxiv.org/abs/2101.04726v1 )

ライセンス: Link先を確認

Nir Shlezinger, Nariman Farsad, Yonina C. Eldar, and Andrea J. Goldsmith

(参考訳) 本稿では,コミュニケーションシステムのためのモデルベース機械学習について紹介する。まず、モデルベースアルゴリズムと機械学習を組み合わせる既存の戦略を高レベルの観点から見直し、エンドツーエンドでトレーニングされた確立されたディープニューラルネットワーク(DNN)アーキテクチャを利用した従来のディープラーニングアプローチと比較する。次に,通信受信機の基本的なタスクの一つであるシンボル検出に注目する。本稿では,従来のディープアーキテクチャ,ディープ展開,DNN支援ハイブリッドアルゴリズムの異なる戦略が,この問題にどのように適用できるかを示す。最後の2つのアプローチは、純粋にモデルベースとdnnベースのレシーバーの中間に位置する。この特定のタスクに注目することで,各戦略の利点と欠点を強調し,コミュニケーションのためのモデルベース深層学習システムの設計を容易にするためのガイドラインを提案する。

We present an introduction to model-based machine learning for communication systems. We begin by reviewing existing strategies for combining model-based algorithms and machine learning from a high-level perspective, and compare them to the conventional deep learning approach which utilizes established deep neural network (DNN) architectures trained in an end-to-end manner. Then, we focus on symbol detection, which is one of the fundamental tasks of communication receivers. We show how the different strategies of conventional deep architectures, deep unfolding, and DNN-aided hybrid algorithms can be applied to this problem. The last two approaches constitute a middle ground between purely model-based and solely DNN-based receivers. By focusing on this specific task, we highlight the advantages and drawbacks of each strategy, and present guidelines to facilitate the design of future model-based deep learning systems for communications.

翻訳日:2021-04-04 01:51:46 公開日:2021-01-12

# CleftNet:脳電子顕微鏡画像からのシナプス間隙検出のための拡張深層学習

CleftNet: Augmented Deep Learning for Synaptic Cleft Detection from Brain Electron Microscopy ( http://arxiv.org/abs/2101.04266v1 )

ライセンス: Link先を確認

Yi Liu, Shuiwang Ji

(参考訳) シナプス裂の検出はシナプスの生物学的機能を調べる上で重要なステップである。体積電子顕微鏡(em)は、em像を高分解能で微細に撮影することでシナプス裂の同定を可能にする。 em画像からシナプス裂を自動的に予測するために、機械学習のアプローチが採用されている。そこで本研究では,脳EM画像からのシナプス・クリフ検出を改善するための,CleftNetと呼ばれる新しい深層学習モデルを提案する。まず,機能拡張器とラベル拡張器という2つの新しいネットワークコンポーネントを提案する。機能拡張子は、入力からグローバル情報を融合し、cleftで共通の形態的パターンを学習し、拡張されたcleft機能に繋がる。さらに、さまざまな次元の出力を生成して、任意のディープネットワークに柔軟に統合することができる。提案するラベル拡張器は,各ボクセルのラベルを値からベクトルに拡張し,セグメンテーションラベルと境界ラベルの両方を含む。これにより、ネットワークは重要な形状情報を学び、より情報的なクリフ表現を生成することができる。提案する機能拡張子とラベル拡張子に基づき、cleftnetをu-netライクなネットワークとして構築する。本手法の有効性は,オンラインタスクとオフラインタスクの両方で評価される。私たちのCleftNetは現在、CREMIオープンチャレンジのオンラインタスクで#1にランク付けしています。さらに,オフラインタスクにおける定量的および定性的な結果から,本手法がベースラインアプローチを大きく上回っていることが示された。

Detecting synaptic clefts is a crucial step to investigate the biological function of synapses. Volume electron microscopy (EM) allows the identification of synaptic clefts by capturing EM images with high resolution and fine detail. Machine learning approaches have been employed to automatically predict synaptic clefts from EM images. In this work, we propose a novel and augmented deep learning model, known as CleftNet, for improving synaptic cleft detection from brain EM images. We first propose two novel network components, known as the feature augmentor and the label augmentor, for augmenting features and labels to improve cleft representations. The feature augmentor can fuse global information from inputs and learn common morphological patterns in clefts, leading to augmented cleft features. In addition, it can generate outputs with varying dimensions, making it flexible to integrate into any deep network. The proposed label augmentor augments the label of each voxel from a value to a vector, which contains both the segmentation label and boundary label. This allows the network to learn important shape information and to produce more informative cleft representations. Based on the proposed feature augmentor and label augmentor, we build CleftNet as a U-Net-like network. The effectiveness of our methods is evaluated on both online and offline tasks. Our CleftNet currently ranks #1 on the online task of the CREMI open challenge. In addition, both quantitative and qualitative results in the offline tasks show that our method outperforms the baseline approaches significantly.
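
The label augmentation described above, turning each voxel's scalar label into a vector holding a segmentation label and a boundary label, can be sketched as follows. This is an illustration only: it works on a 2D mask rather than a 3D volume for brevity, and the boundary rule (a foreground pixel with at least one 4-connected background neighbour) and the name `augment_labels` are assumptions, not the paper's definition.

```python
def augment_labels(mask):
    """Turn a binary 2D mask into per-pixel (segmentation, boundary) label vectors.

    Boundary rule (an assumption): a foreground pixel is a boundary pixel if any
    4-connected neighbour is background or lies outside the image.
    """
    height, width = len(mask), len(mask[0])

    def is_background(r, c):
        # Out-of-image positions count as background; checked first to avoid indexing.
        return not (0 <= r < height and 0 <= c < width) or mask[r][c] == 0

    augmented = []
    for r in range(height):
        row = []
        for c in range(width):
            seg = mask[r][c]
            boundary = int(
                seg == 1
                and any(is_background(r + dr, c + dc)
                        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)))
            )
            row.append((seg, boundary))
        augmented.append(row)
    return augmented


# Usage: a 3x3 foreground block inside a 5x5 image.
mask = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)] for r in range(5)]
labels = augment_labels(mask)
```

A network trained against such two-component targets receives an explicit signal about where cleft edges lie, in addition to which voxels belong to a cleft.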

翻訳日:2021-04-04 01:51:32 公開日:2021-01-12

# PvDeConv:3次元CAD構築のためのポイントボクセルデコンボリューション

PvDeConv: Point-Voxel Deconvolution for Autoencoding CAD Construction in 3D ( http://arxiv.org/abs/2101.04493v1 )

ライセンス: Link先を確認

Kseniya Cherenkova, Djamila Aouada, Gleb Gusev

(参考訳) 本稿では,3次元データオートエンコーダのためのPoint-Voxel DeConvolution (PVDeConv) モジュールを提案する。その効率を示すために、コンピュータ支援設計(cad)モデルの基盤となる幾何学を密に記述した10k点の高分解能点雲を合成することを学ぶ。プロトルージョン、欠落した部分、円滑な縁、穴などのスキャンはCADオブジェクトの実際の3Dスキャンに必然的に現れる。元のCADモデル構築を3Dスキャンから学習するには、対応するオブジェクトの3Dスキャンとともに、真理を理解する必要がある。このギャップを解決するために、50k以上のCADモデルとその対応する3Dメッシュを含む、新しい専用データセットCC3Dを導入する。このデータセットは、3Dスキャン(CADモデル)のペアからサンプリングされた点雲の畳み込みオートエンコーダを学ぶために使用される。この新しいデータセットの課題は、ShapeNetでトレーニングされた他の生成点クラウドサンプリングモデルと比較できる。 CC3Dオートエンコーダは、3Dデータ生成の最先端モデルと比較してメモリ消費とトレーニング時間に関して効率的である。

We propose a Point-Voxel DeConvolution (PVDeConv) module for 3D data autoencoders. To demonstrate its efficiency, we learn to synthesize high-resolution point clouds of 10k points that densely describe the underlying geometry of Computer Aided Design (CAD) models. Scanning artifacts, such as protrusions, missing parts, smoothed edges and holes, inevitably appear in real 3D scans of fabricated CAD objects. Learning the original CAD model construction from a 3D scan requires a ground truth to be available together with the corresponding 3D scan of an object. To close this gap, we introduce a new dedicated dataset, the CC3D, containing 50k+ pairs of CAD models and their corresponding 3D meshes. This dataset is used to learn a convolutional autoencoder for point clouds sampled from the 3D scan and CAD model pairs. The challenges of this new dataset are demonstrated in comparison with other generative point cloud sampling models trained on ShapeNet. The CC3D autoencoder is efficient with respect to memory consumption and training time as compared to state-of-the-art models for 3D data generation.

翻訳日:2021-04-04 01:51:09 公開日:2021-01-12

# プログレッシブリトレーニングによる畳み込みニューラルネットワークの単純化

Convolutional Neural Network Simplification with Progressive Retraining ( http://arxiv.org/abs/2101.04699v1 )

ライセンス: Link先を確認

D. Osaku, J.F. Gomes, A.X. Falcão

(参考訳) カーネルプルーニング法は、畳み込みニューラルネットワーク(CNN)モデルの説明を高速化、単純化、改善するために提案されている。しかし、単純化されたモデルの有効性は、しばしば元のモデルよりも低い。本稿では,カーネル除去の客観的および主観的妥当性基準に基づく新しい手法を提案する。プロセス中、cnnモデルは、次の層から最初の層まで重みを調整し、プロセスに関わらない後の層の重みを保存することによって、現在の層が完全に単純化された場合にのみ再訓練される。私たちはこの戦略を「emph{progressive retraining}」と呼び、各単純化アクションの後にモデル全体を再トレーニングするカーネルプルーニングメソッドとは異なる。我々の主観的関連性基準は、視覚パターン認識における人間の能力を活用し、デザイナーによる単純化プロセスの理解を改善する。適切な適合基準とプログレッシブ・リトレーニングの組み合わせは,モデルの単純化によって有効性を向上できることを示す。また,提案手法は,4つの課題の画像データセットを用いて,最先端技術による2つの手法よりも優れた結果が得られることを示す。

Kernel pruning methods have been proposed to speed up, simplify, and improve explanation of convolutional neural network (CNN) models. However, the effectiveness of a simplified model is often below that of the original one. In this letter, we present new methods based on objective and subjective relevance criteria for kernel elimination in a layer-by-layer fashion. During the process, a CNN model is retrained only when the current layer is entirely simplified, by adjusting the weights from the next layer to the first one and preserving weights of subsequent layers not involved in the process. We call this strategy progressive retraining, in contrast to kernel pruning methods that usually retrain the entire model after each simplification action -- e.g., the elimination of one or a few kernels. Our subjective relevance criterion exploits the human ability to recognize visual patterns and improves the designer's understanding of the simplification process. The combination of suitable relevance criteria and progressive retraining shows that our methods can increase effectiveness with considerable model simplification. We also demonstrate that our methods can provide better results than two popular ones and another one from the state-of-the-art using four challenging image datasets.

翻訳日:2021-04-04 01:50:38 公開日:2021-01-12

# 顔画像からの痛み推定のための個人化深層学習

Personalized Federated Deep Learning for Pain Estimation From Face Images ( http://arxiv.org/abs/2101.04800v1 )

ライセンス: Link先を確認

Ognjen Rudovic, Nicolas Tobis, Sebastian Kaltwang, Björn Schuller, Daniel Rueckert, Jeffrey F. Cohn and Rosalind W. Picard

(参考訳) 標準的な機械学習アプローチでは、ユーザのデータをひとつのコンピュータまたは共有データベースに集約する必要がある。したがって、特にデータ規制が厳格な医療環境では、中央アクセスを制限することが重要である。これに取り組む潜在的なアプローチは、生のトレーニングデータをローカルに保持しながら、ローカルにトレーニングされたモデルのパラメータを使用することで、複数の当事者が共有予測モデルを共同的に学習できるフェデレーション学習(fl)である。 AIによる鎮痛モニタリングの文脈では、長期の鎮痛監視のための機密性保存と非閉塞性鎮痛推定を可能とし、定期的なチェックアップを頻繁に行う看護スタッフの負担を軽減したい。この目的のために,顔画像から痛みを推定するためのPFDL(Personalized Federated Deep Learning)アプローチを提案する。 PFDLは、顔画像を共有することなく、異なるクライアント(主題など)にわたって、軽量CNNアーキテクチャを用いて実装されたディープモデルの協調トレーニングを実行する。標準FLのようにモデルのすべてのパラメータを共有する代わりに、PFDLは最後のレイヤをローカルに保持する(痛みの推定をパーソナライズするために使用される)。この(i)は、別のデータの機密性層を追加し、敵が対象者の痛みレベルを推測することを困難にし、(ii)局所的なパラメータチューニングによって各被験者の痛み推定をパーソナライズする。痛みの顔ビデオのデータセット(UNBC-McMaster Shoulder Pain Database)を用いて、PFDLは標準的な集中型およびFLアルゴリズムよりも可視的または優れた性能を示し、データのプライバシーをさらに強化する。これにより、より安全で計算効率が高く、多くの個人(家庭内の痛みモニタリングなど)にスケーラブルで、タイムリーで邪魔にならない痛み測定を提供することで、従来の痛みモニタリングを改善することができる。

Standard machine learning approaches require centralizing the users' data in one computer or a shared database, which raises data privacy and confidentiality concerns. Therefore, limiting central access is important, especially in healthcare settings, where data regulations are strict. A potential approach to tackling this is Federated Learning (FL), which enables multiple parties to collaboratively learn a shared prediction model by using parameters of locally trained models while keeping raw training data local. In the context of AI-assisted pain-monitoring, we wish to enable confidentiality-preserving and unobtrusive pain estimation for long-term pain-monitoring and reduce the burden on the nursing staff who perform frequent routine check-ups. To this end, we propose a novel Personalized Federated Deep Learning (PFDL) approach for pain estimation from face images. PFDL performs collaborative training of a deep model, implemented using a lightweight CNN architecture, across different clients (i.e., subjects) without sharing their face images. Instead of sharing all parameters of the model, as in standard FL, PFDL retains the last layer locally (used to personalize the pain estimates). This (i) adds another layer of data confidentiality, making it difficult for an adversary to infer pain levels of the target subject, while (ii) personalizing the pain estimation to each subject through local parameter tuning. We show, using a publicly available dataset of face videos of pain (UNBC-McMaster Shoulder Pain Database), that PFDL performs comparably to or better than the standard centralized and FL algorithms, while further enhancing data privacy. This has the potential to improve traditional pain monitoring by making it more secure, computationally efficient, and scalable to a large number of individuals (e.g., for in-home pain monitoring), providing timely and unobtrusive pain measurement.
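
The key mechanism, sharing every layer except the last one, can be sketched as a single aggregation round. This is a simplified illustration, not the PFDL implementation: models are plain dictionaries of weight lists, the layer names `conv` and `head` are hypothetical, and the unweighted mean is an assumption (federated averaging commonly weights clients by their number of samples).

```python
def federated_round(client_models, local_layers=("head",)):
    """Average shared layers across clients; leave `local_layers` untouched."""
    shared_names = [name for name in client_models[0] if name not in local_layers]
    num_clients = len(client_models)
    averaged = {
        name: [sum(model[name][i] for model in client_models) / num_clients
               for i in range(len(client_models[0][name]))]
        for name in shared_names
    }
    # Each client receives the averaged shared layers and keeps its own local layers.
    return [
        {name: (list(averaged[name]) if name in averaged else list(weights))
         for name, weights in model.items()}
        for model in client_models
    ]


# Usage: two clients with one shared layer and one personal output layer.
clients = [
    {"conv": [1.0, 3.0], "head": [0.5]},
    {"conv": [3.0, 5.0], "head": [-0.5]},
]
updated = federated_round(clients)
```

Because the `head` weights never leave a client in this sketch, the server only ever sees the shared feature-extractor parameters.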

翻訳日:2021-04-04 01:50:18 公開日:2021-01-12

# 深層学習による膝蓋骨遠位端関節症の自動検出:多施設変形性膝関節症研究(MOST)データ

Automated Detection of Patellofemoral Osteoarthritis from Knee Lateral View Radiographs Using Deep Learning: Data from the Multicenter Osteoarthritis Study (MOST) ( http://arxiv.org/abs/2101.04350v1 )

ライセンス: Link先を確認

Neslihan Bayramoglu, Miika T. Nieminen, Simo Saarakkala

(参考訳) 目的: 画像を用いた深層学習による膝蓋骨変形性膝関節症(PFOA)の予測能力を評価すること。デザイン:多中心型変形性関節症研究(MOST) (n=18,436膝) から膝側視像を抽出した。 Patellar region-of-interest(ROI)が最初に自動的に検出され、その後、終末から終末にかけての深部畳み込みニューラルネットワーク(CNN)が訓練され、パテロフェモラルOAの状態を検出した。深層学習に基づく物体検出法を用いてパテラーROIを検出した。 MOSTデータセットで提供される手動PFOAステータスアセスメントをCNNの分類結果として用いた。予測モデルの性能は, 受信機動作特性曲線 (ROC AUC) と, 層状5次元断面検証設定における精度再コール曲線 (PR) から得られた平均精度 (AP) に基づいて評価した。結果: 膝18,436例中3,425例(19%)がPFOAであった。 AUCとAPは、年齢、性別、体重指数(BMI)、西オンタリオ大学およびマクマスター大学関節炎指数(WOMAC)スコア、およびPFOAを予測するためのKelgren-Lawrence(KL)グレードが0.806と0.478であった。画像データのみを用いたCNNモデルはPFOA状態の予測を著しく改善した(ROC AUC=0.958, AP=0.862)。結論: 第1回機械学習に基づく自動pfoa検出法を提案する。さらに,膝側方x線写真から膝蓋骨領域を訓練した深層学習モデルでは,患者特性と臨床評価に基づくモデルよりもpfoaの予測が良好である。

Objective: To assess the ability of imaging-based deep learning to predict radiographic patellofemoral osteoarthritis (PFOA) from knee lateral view radiographs. Design: Knee lateral view radiographs were extracted from The Multicenter Osteoarthritis Study (MOST) (n = 18,436 knees). Patellar region-of-interest (ROI) was first automatically detected, and subsequently, end-to-end deep convolutional neural networks (CNNs) were trained and validated to detect the status of patellofemoral OA. Patellar ROI was detected using deep-learning-based object detection method. Manual PFOA status assessment provided in the MOST dataset was used as a classification outcome for the CNNs. Performance of prediction models was assessed using the area under the receiver operating characteristic curve (ROC AUC) and the average precision (AP) obtained from the precision-recall (PR) curve in the stratified 5-fold cross validation setting. Results: Of the 18,436 knees, 3,425 (19%) had PFOA. AUC and AP for the reference model including age, sex, body mass index (BMI), the total Western Ontario and McMaster Universities Arthritis Index (WOMAC) score, and tibiofemoral Kellgren-Lawrence (KL) grade to predict PFOA were 0.806 and 0.478, respectively. The CNN model that used only image data significantly improved the prediction of PFOA status (ROC AUC= 0.958, AP= 0.862). Conclusion: We present the first machine learning based automatic PFOA detection method. Furthermore, our deep learning based model trained on patella region from knee lateral view radiographs performs better at predicting PFOA than models based on patient characteristics and clinical assessments.
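
To help read the two reported metrics, the sketch below computes ROC AUC and average precision (AP) from scratch on made-up scores, and checks the prevalence quoted in the abstract. The function names and the toy data are illustrative assumptions; a real evaluation would normally rely on an established library.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive (no tie handling)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, precisions = 0, []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits


# Prevalence reported in the abstract: 3,425 PFOA knees out of 18,436.
prevalence = 3425 / 18436
print(round(prevalence, 2))
```

A classifier that ranks knees at random has an expected AP of roughly the positive-class prevalence, about 0.19 here, which gives some context for the reported AP values of 0.478 and 0.862.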

翻訳日:2021-04-04 01:49:44 公開日:2021-01-12

# 高精度ピック・アンド・プレイス作業のためのシミュレーションから実世界への移動経験

Transferring Experience from Simulation to the Real World for Precise Pick-And-Place Tasks in Highly Cluttered Scenes ( http://arxiv.org/abs/2101.04781v1 )

ライセンス: Link先を確認

Kilian Kleeberger and Markus Völk and Marius Moosmann and Erik Thiessenhusen and Florian Roth and Richard Bormann and Marco F. Huber

(参考訳) 本稿では,高度に散らばったシーンで既知の剛体物体を把握し,深度画像に基づいて正確に配置する,新しい学習手法を提案する。 pq-net (placement quality network) は、ニューラルネットワークの1回のフォワードパスにおいて、複数のオブジェクトに対して、自動的に生成された把持の各々のオブジェクトポーズと品質を92fpsで同時に推定する。全ての把握と配置の試行は物理シミュレーションで実行され、得られた経験はドメインランダム化を用いて実世界に移される。われわれの政策は実世界への移転に成功している。 PQ-Netは成功率の把握の観点から他のモデルフリーアプローチよりも優れており、人間の介入なしに任意の対称性を持つ新しいオブジェクトに自動的にスケールする。

In this paper, we introduce a novel learning-based approach for grasping known rigid objects in highly cluttered scenes and precisely placing them based on depth images. Our Placement Quality Network (PQ-Net) estimates the object pose and the quality for each automatically generated grasp pose for multiple objects simultaneously at 92 fps in a single forward pass of a neural network. All grasping and placement trials are executed in a physics simulation and the gained experience is transferred to the real world using domain randomization. We demonstrate that our policy successfully transfers to the real world. PQ-Net outperforms other model-free approaches in terms of grasping success rate and automatically scales to new objects of arbitrary symmetry without any human intervention.

翻訳日:2021-04-04 01:49:10 公開日:2021-01-12

# クラウドソーシングによる効果的なコンテンツ分析に向けて

Toward Effective Automated Content Analysis via Crowdsourcing ( http://arxiv.org/abs/2101.04615v1 )

ライセンス: Link先を確認

Jiele Wu, Chau-Wai Wong, Xinyan Zhao, Xianpeng Liu

(参考訳) 多くのコンピュータ科学者は、オンラインワーカーの集約された回答を使って真実を表現している。先行研究では、多数決のような集計手法が比較的客観的な特徴を測定するのに有効であることが示されている。意味的意味づけのような主観的な機能では、時間ごとの収益を最適化することで知られるオンラインワーカーは、より長く働くと応答の質が低下する傾向がある。本稿では,品質を意識したセマンティックデータアノテーションシステムを提案することで,この問題に対処しようとする。我々は、品質スコアによって定量化された労働者のパフォーマンスに対するタイムリーなフィードバックにより、オンライン労働者が長期にわたってラベル付けの品質を維持することができることを観察した。提案するアノテーションシステムの有効性を検証するために,i) エキスパートラベルデータセットに基づく性能評価,ii) 70%から80%の精度で一貫した学習行動をもたらす機械学習タスクの実証を行った。その結果,本システムでは主観的意味的特徴の質の高い回答を大規模に収集できることが示唆された。

Many computer scientists use the aggregated answers of online workers to represent ground truth. Prior work has shown that aggregation methods such as majority voting are effective for measuring relatively objective features. For subjective features such as semantic connotation, online workers, known for optimizing their hourly earnings, tend to deteriorate in the quality of their responses as they work longer. In this paper, we aim to address this issue by proposing a quality-aware semantic data annotation system. We observe that with timely feedback on workers' performance quantified by quality scores, better informed online workers can maintain the quality of their labeling throughout an extended period of time. We validate the effectiveness of the proposed annotation system through i) evaluating performance based on an expert-labeled dataset, and ii) demonstrating machine learning tasks that can lead to consistent learning behavior with 70%-80% accuracy. Our results suggest that with our system, researchers can collect high-quality answers of subjective semantic features at a large scale.
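
Majority voting, the aggregation method mentioned above, can be sketched in a few lines. The second function is a hypothetical stand-in for a per-worker quality score (the share of a worker's labels that agree with the majority); the abstract does not define its quality score, so that part is purely an assumption.

```python
from collections import Counter


def majority_vote(labels_by_item):
    """Aggregate worker labels per item by majority vote (ties: first label seen wins)."""
    return {
        item: Counter(worker_labels.values()).most_common(1)[0][0]
        for item, worker_labels in labels_by_item.items()
    }


def agreement_scores(labels_by_item):
    """Hypothetical quality score: share of a worker's labels that match the majority."""
    consensus = majority_vote(labels_by_item)
    matches, totals = Counter(), Counter()
    for item, worker_labels in labels_by_item.items():
        for worker, label in worker_labels.items():
            totals[worker] += 1
            matches[worker] += int(label == consensus[item])
    return {worker: matches[worker] / totals[worker] for worker in totals}


# Usage: three workers label two texts.
labels = {
    "text1": {"a": "pos", "b": "pos", "c": "neg"},
    "text2": {"a": "neg", "b": "neg", "c": "neg"},
}
print(majority_vote(labels), agreement_scores(labels))
```

A score like this can only be computed against some reference, here the majority itself; for subjective features the majority may be a weak reference, which is part of the problem the abstract describes.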

翻訳日:2021-04-04 01:48:57 公開日:2021-01-12

# SARS-CoV-2のAIおよびHPC対応リード生成:自然言語テキストに含まれる薬物様分子の抽出モデルとプロセス

AI- and HPC-enabled Lead Generation for SARS-CoV-2: Models and Processes to Extract Druglike Molecules Contained in Natural Language Text ( http://arxiv.org/abs/2101.04617v1 )

ライセンス: Link先を確認

Zhi Hong, J. Gregory Pauloski, Logan Ward, Kyle Chard, Ben Blaiszik, and Ian Foster

(参考訳) 世界中の研究者は、重症急性呼吸器症候群ウイルス(SARS-CoV-2)による病気に対抗するために、既存の薬物の再利用や新しい薬物の発見を目指している。このような研究の候補は、新型コロナウイルス研究の文脈で薬物のような分子であると科学文献で報告されている分子である。ここでは、人間と人工知能の両方を利用して、フリーテキストで薬物様分子の参照を検出するプロジェクトについて報告する。我々は、高度でない人間がラベル付きテキストのコーパスを作成し、このラベル付きコーパスを使用して名前付きエンティティ認識モデルを訓練し、訓練されたモデルを用いて198875紙のオープンリサーチデータセットチャレンジ(CORD-19)コーパスから10912の薬物様分子を抽出する。性能分析の結果, 自動抽出モデルは非熟練人間と同等の性能が得られることがわかった。

Researchers worldwide are seeking to repurpose existing drugs or discover new drugs to counter the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). A promising source of candidates for such studies is molecules that have been reported in the scientific literature to be drug-like in the context of coronavirus research. We report here on a project that leverages both human and artificial intelligence to detect references to drug-like molecules in free text. We engage non-expert humans to create a corpus of labeled text, use this labeled corpus to train a named entity recognition model, and employ the trained model to extract 10912 drug-like molecules from the COVID-19 Open Research Dataset Challenge (CORD-19) corpus of 198875 papers. Performance analyses show that our automated extraction model can achieve performance on par with that of non-expert humans.

翻訳日:2021-04-04 01:48:40 公開日:2021-01-12

# Queue-Learning: サービス品質提供のための強化学習アプローチ

Queue-Learning: A Reinforcement Learning Approach for Providing Quality of Service ( http://arxiv.org/abs/2101.04627v1 )

ライセンス: Link先を確認

Majid Raeis, Ali Tizghadam, Alberto Leon-Garcia

(参考訳) エンドツーエンドの遅延は、クラウドコンピューティングやコンピュータネットワークなどのアプリケーションドメインにおけるQoS(Quality of Service)の重要な特性である。このメトリクスは、エンドツーエンドサービスがサービスチェーンを介して提供される、タンデムサービスシステムにおいて特に重要です。サービスレート制御は、サービスシステムにおいてqos保証を提供する共通のメカニズムである。本稿では、サービスリソースの過剰使用を防止しつつ、システムのエンドツーエンド遅延に対する確率的上限を提供する強化学習ベース(RLベース)サービスレートコントローラを提案する。一般的なフレームワークを得るために、私たちはキュー理論を使ってサービスシステムをモデル化します。しかし、待ち行列理論の制限を避けるためにrlベースのアプローチを採用する。特に、Deep Deterministic Policy Gradient(DDPG)を使用して、タンデムサービスシステムのキュー長(状態)の関数として、サービスレート(アクション)を学習します。システム全体の報酬によって性能を定量化する既存のrlベースの手法とは対照的に,提案するコントローラはシステムのエンド・ツー・エンドの遅延に対する明示的な確率的保証を提供する。 qosの制約を満たしたコントローラの能力を検証した,非指数的相互接続およびサービス時間を有するタンデム待ち行列システムについて評価を行った。

End-to-end delay is a critical attribute of quality of service (QoS) in application domains such as cloud computing and computer networks. This metric is particularly important in tandem service systems, where the end-to-end service is provided through a chain of services. Service-rate control is a common mechanism for providing QoS guarantees in service systems. In this paper, we introduce a reinforcement learning-based (RL-based) service-rate controller that provides probabilistic upper-bounds on the end-to-end delay of the system, while preventing the overuse of service resources. In order to have a general framework, we use queueing theory to model the service systems. However, we adopt an RL-based approach to avoid the limitations of queueing-theoretic methods. In particular, we use Deep Deterministic Policy Gradient (DDPG) to learn the service rates (action) as a function of the queue lengths (state) in tandem service systems. In contrast to existing RL-based methods that quantify their performance by the achieved overall reward, which could be hard to interpret or even misleading, our proposed controller provides explicit probabilistic guarantees on the end-to-end delay of the system. The evaluations are presented for a tandem queueing system with non-exponential inter-arrival and service times, the results of which validate our controller's capability in meeting QoS constraints.
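
The quantity being controlled, the probability that the end-to-end delay of a tandem system exceeds a bound, can be estimated by simulation. The sketch below is not the paper's controller: service rates are fixed rather than learned with DDPG, and exponential inter-arrival and service times are assumed for simplicity even though the paper evaluates non-exponential ones. All names are illustrative.

```python
import random


def tandem_delays(arrival_rate, service_rates, num_jobs=20000, seed=0):
    """Simulate FIFO single-server queues in series; return each job's end-to-end delay."""
    rng = random.Random(seed)
    last_departure = [0.0] * len(service_rates)  # previous job's departure per stage
    arrival, delays = 0.0, []
    for _ in range(num_jobs):
        arrival += rng.expovariate(arrival_rate)
        ready = arrival  # time at which the job is available to the next stage
        for stage, rate in enumerate(service_rates):
            start = max(ready, last_departure[stage])  # wait until the server is free
            ready = start + rng.expovariate(rate)      # service completes
            last_departure[stage] = ready
        delays.append(ready - arrival)
    return delays


def violation_probability(delays, bound):
    """Empirical estimate of P(end-to-end delay > bound)."""
    return sum(d > bound for d in delays) / len(delays)


# Usage: two stages, compare slow and fast service rates against a delay bound of 3.
slow = violation_probability(tandem_delays(1.0, [2.0, 2.0]), 3.0)
fast = violation_probability(tandem_delays(1.0, [4.0, 4.0]), 3.0)
print(slow, fast)
```

Raising the service rates lowers the violation probability but consumes more resources; a service-rate controller has to trade these two off while keeping the violation probability under a target.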

翻訳日:2021-04-04 01:48:11 公開日:2021-01-12

# 計算物理学における自動モデル推薦のためのデータ拡張と特徴選択

Data augmentation and feature selection for automatic model recommendation in computational physics ( http://arxiv.org/abs/2101.04530v1 )

ライセンス: Link先を確認

Thomas Daniel, Fabien Casenave, Nissrine Akkari, David Ryckelynck

(参考訳) 分類アルゴリズムは、最近、計算物理学において、環境や物理システムの状態に適応した数値的手法やモデルの選択に応用されている。このような分類タスクでは、ラベル付きトレーニングデータは数値シミュレーションから得られ、一般にメッシュ上に離散化された物理フィールドに対応する。トレーニングデータの欠如、高次元化、物理データへの共通データ拡張技術の適用不可能という3つの難題が生まれている。この記事では、これらの問題に対処するために、2つのアルゴリズムを紹介します。1つは特徴選択による次元の削減、もう1つはデータ拡張です。これらのアルゴリズムは、評価のために様々な分類器と組み合わせられる。 6つの多層パーセプトロンからなる積層アンサンブルとリッジロジスティック回帰を組み合わせた場合、非線形構造力学の分類問題において90%の精度が得られる。

Classification algorithms have recently found applications in computational physics for the selection of numerical methods or models adapted to the environment and the state of the physical system. For such classification tasks, labeled training data come from numerical simulations and generally correspond to physical fields discretized on a mesh. Three challenging difficulties arise: the lack of training data, their high dimensionality, and the non-applicability of common data augmentation techniques to physics data. This article introduces two algorithms to address these issues, one for dimensionality reduction via feature selection, and one for data augmentation. These algorithms are combined with a wide variety of classifiers for their evaluation. When combined with a stacking ensemble made of six multilayer perceptrons and a ridge logistic regression, they enable reaching an accuracy of 90% on our classification problem for nonlinear structural mechanics.

翻訳日:2021-04-04 01:47:12 公開日:2021-01-12

# 空間情報を用いた時系列データの効率的解析のためのディープセルリカレントネットワーク

Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information ( http://arxiv.org/abs/2101.05608v1 )

ライセンス: Link先を確認

Lasitha Vidyaratne, Mahbubul Alam, Alexander Glandon, Anna Shabalina, Christopher Tennant, and Khan Iftekharuddin

(参考訳) 大規模時系列データの効率的な処理は、機械学習の複雑な問題である。手動で特徴抽出を行う従来のセンサ信号処理パイプラインは、高次元データによる膨大な計算コストを伴うことが多い。ディープリカレントニューラルネットワークは、時系列処理を改善するための自動機能学習に有望である。しかし、一般的なディープ・リカレントモデルでは、データの複雑さが増すにつれてスケールと深さが大きくなる。これは、時間的および空間的特性を持つ高次元データの存在において特に困難である。そこで本研究では,複雑な多次元時系列データを空間情報で効率的に処理する新しいディープセルリカレントニューラルネットワーク(dcrnn)アーキテクチャを提案する。提案モデルにおけるセルリカレントアーキテクチャにより,空間分布センサ信号源からの時系列データの位置認識同期処理が可能となる。提案アーキテクチャにおけるセルラ性による広範なトレーニング可能なパラメータ共有は,高次元入力を用いた再帰処理ユニットの使用効率を保証している。そこで本研究では,DCRNNモデルの多クラス時系列データの分類における汎用性についても検討した。その結果、DCRNNアーキテクチャは2つの時系列データセット、つまり、発作検出のためのマルチチャネルの頭皮EEGデータセットと、社内で得られたマシン故障検出データセットを用いて評価される。その結果,本論文の手法と比較した場合,学習可能なパラメータをかなり少なくしつつ,最先端の性能を実現できることが示唆された。

Efficient processing of large-scale time series data is an intricate problem in machine learning. Conventional sensor signal processing pipelines with hand-engineered feature extraction often involve huge computational cost with high dimensional data. Deep recurrent neural networks have shown promise in automated feature learning for improved time-series processing. However, generic deep recurrent models grow in scale and depth with increased complexity of the data. This is particularly challenging in the presence of high dimensional data with temporal and spatial characteristics. Consequently, this work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to efficiently process complex multi-dimensional time series data with spatial information. The cellular recurrent architecture in the proposed model allows for location-aware synchronous processing of time series data from spatially distributed sensor signal sources. Extensive trainable parameter sharing due to cellularity in the proposed architecture ensures efficiency in the use of recurrent processing units with high-dimensional inputs. This study also investigates the versatility of the proposed DCRNN model for classification of multi-class time series data from different application domains. Consequently, the proposed DCRNN architecture is evaluated using two time-series datasets: a multichannel scalp EEG dataset for seizure detection, and a machine fault detection dataset obtained in-house. The results suggest that the proposed architecture achieves state-of-the-art performance while utilizing substantially fewer trainable parameters when compared to comparable methods in the literature.

翻訳日:2021-04-04 01:46:57 公開日:2021-01-12

# デバイス上インテント分類の強化された文字表現

A character representation enhanced on-device Intent Classification ( http://arxiv.org/abs/2101.04456v1 )

ライセンス: Link先を確認

Sudeep Deepak Shivnikar, Himanshu Arora, Harichandana B S S

(参考訳) 意図分類は自然言語理解システムにおいて重要なタスクである。既存のアプローチは、ベンチマークデータセットで完璧なスコアを獲得しました。しかし、モバイルやタブレットなどの低リソースデバイスへのデプロイには適していない。モデルの大きさが大きすぎるためですそこで本稿では,デバイス上で効率的に動作可能な,意図分類のための新しい軽量アーキテクチャを提案する。我々は文字特徴を使って単語表現を豊かにする。実験により,提案モデルが既存手法より優れ,ベンチマークデータセットの最先端結果が得られた。また,本モデルではメモリフットプリントが5MB程度で,推定時間は2ミリ秒程度であり,資源制約環境下での効率を実証する。

Intent classification is an important task in natural language understanding systems. Existing approaches have achieved perfect scores on the benchmark datasets. However, they are not suitable for deployment on low-resource devices like mobiles, tablets, etc. due to their massive model size. Therefore, in this paper, we present a novel lightweight architecture for intent classification that can run efficiently on a device. We use character features to enrich the word representation. Our experiments prove that our proposed model outperforms existing approaches and achieves state-of-the-art results on benchmark datasets. We also report that our model has a tiny memory footprint of ~5 MB and a low inference time of ~2 milliseconds, which proves its efficiency in a resource-constrained environment.

翻訳日:2021-04-04 01:46:40 公開日:2021-01-12

# 話題分布を持つxlnetモデルを用いた偽ニュース検出システム: constraint@aaai2021 shared task

Fake News Detection System using XLNet model with Topic Distributions: CONSTRAINT@AAAI2021 Shared Task ( http://arxiv.org/abs/2101.11425v1 )

ライセンス: Link先を確認

Akansha Gautam, Venktesh V, Sarah Masud

(参考訳) 情報へのアクセスの容易さとインターネット上での急速な普及(速度とボリュームの両方)により、偽情報から真実情報をフィルタリングすることは困難になっている。研究コミュニティは現在、現実世界の政治的影響をもたらす偽ニュースの自動検出という課題に直面している。このような研究はConstraint@AAAI2021 Shared Task on COVID19 Fake News Detection in Englishという形で行われた。本稿では,この共有タスクの一環として提案した新しい手法について光を当てる。我々のチームは、LDA(Latent Dirichlet Allocation)のトピック分布とXLNetの文脈表現を組み合わせたアプローチを導入しました。また,提案手法を既存のベースラインと比較し,XLNet + Topic DistributionsがF1スコア0.967を達成することにより,他の手法よりも優れていることを示す。

With the ease of access to information, and its rapid dissemination over the internet (both velocity and volume), it has become challenging to separate truthful information from fake information. The research community is now faced with the task of automatic detection of fake news, which carries real-world socio-political impact. One such research contribution came in the form of the Constraint@AAAI2021 Shared Task on COVID19 Fake News Detection in English. In this paper, we shed light on a novel method we proposed as a part of this shared task. Our team introduced an approach to combine topical distributions from Latent Dirichlet Allocation (LDA) with contextualized representations from XLNet. We also compared our method with existing baselines to show that XLNet + Topic Distributions outperforms other approaches by attaining an F1-score of 0.967.

翻訳日:2021-04-04 01:46:31 公開日:2021-01-12

# クラウドカウントのための強化情報融合ネットワーク

Enhanced Information Fusion Network for Crowd Counting ( http://arxiv.org/abs/2101.04279v1 )

ライセンス: Link先を確認

Geng Chen and Peirong Guo

(参考訳) 近年,画像中の人物数を予測する手法である群集カウントは,コンピュータビジョンにおける課題となっている。本稿では,カラム内の情報冗長性問題を解決するために,クロスカラム特徴融合ネットワークを提案する。我々は,異なる列が他の列から重要な情報を得るのを助けるために,情報フローのチャネルを提供する情報融合モジュール(IFM)を紹介する。このチャネルを通じて、異なる列が情報を交換し、他の列から有用な特徴を抽出し、キー情報を強化する。したがって、イメージ内のすべての領域に注意を払うためにカラムは必要ない。各列は異なる領域に責任を持ち、各列の負担を軽減できる。実験では、モデルの一般化性はより堅牢で、異なるデータセット間で転送した結果は最先端のモデルと同等の結果が得られます。

In recent years, crowd counting, a technique for predicting the number of people in an image, has become a challenging task in computer vision. In this paper, we propose a cross-column feature fusion network to solve the problem of information redundancy in columns. We introduce the Information Fusion Module (IFM) which provides a channel for information flow to help different columns to obtain significant information from another column. Through this channel, different columns exchange information with each other and extract useful features from the other column to enhance key information. Hence, there is no need for columns to pay attention to all areas in the image. Each column can be responsible for different regions, thereby reducing the burden of each column. In experiments, our model generalizes more robustly, and the results of transferring between different datasets are comparable to those of state-of-the-art models.

翻訳日:2021-04-04 01:46:16 公開日:2021-01-12

# マルチモーダル眼球運動データセットとマルチモーダル眼球運動セグメンテーション解析

A Multimodal Eye Movement Dataset and a Multimodal Eye Movement Segmentation Analysis ( http://arxiv.org/abs/2101.04318v1 )

ライセンス: Link先を確認

Wolfgang Fuhl and Enkelejda Kasneci

(参考訳) 注視眼球運動を伴う新しいデータセットを提案する。データセットは、現実世界やシミュレーターでの乗車中に記録された80万以上の視線ポイントで構成されている。合計19名の被験者の眼球運動を注記した。このデータセットには、眼球閉鎖、瞳孔中心、光学ベクトル、眼球角の中心から始まる瞳孔中心へのベクトルなど、いくつかのデータソースがある。これらの異なるデータソースを個別に分析・評価し、眼球運動分類に適合する良さと組み合わせて評価する。これらの結果は、リアルタイムシステムやアルゴリズムの開発者がアプリケーションに最適なデータソースを見つけるのに役立つだろう。また、このデータセット上で新しいアルゴリズムをトレーニングして評価することもできる。データとmatlabコードは、https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2fa%20multimodal%20eye%20movement%20dataset%20and%20...&mode=listでダウンロードできる。

We present a new dataset with annotated eye movements. The dataset consists of over 800,000 gaze points recorded during a car ride in the real world and in the simulator. In total, the eye movements of 19 subjects were annotated. In this dataset there are several data sources such as the eyelid closure, the pupil center, the optical vector, and a vector into the pupil center starting from the center of the eye corners. These different data sources are analyzed and evaluated individually as well as in combination with respect to their goodness of fit for eye movement classification. These results will help developers of real-time systems and algorithms to find the best data sources for their application. Also, new algorithms can be trained and evaluated on this dataset. The data and the Matlab code can be downloaded here: https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FA%20Multimodal%20Eye%20Movement%20Dataset%20and%20...&mode=list

翻訳日:2021-04-04 01:45:40 公開日:2021-01-12

# 逆攻撃に対する画像輝度のランダム変換

Random Transformation of Image Brightness for Adversarial Attack ( http://arxiv.org/abs/2101.04321v1 )

ライセンス: Link先を確認

Bo Yang, Kaiyong Xu, Hengjun Wang, Hengwei Zhang

(参考訳) ディープニューラルネットワークは、オリジナルの画像に小さな人間の知覚できない摂動を加えることで構築される敵の例に弱いが、モデル出力の不正確な予測を行う。ディープニューラルネットワークがデプロイされる前に、敵攻撃は安全クリティカルなアプリケーションにおいて堅牢なモデルを評価し選択するための重要な方法となる。しかし、難易度の高いブラックボックス設定では、攻撃成功率、すなわち敵の例の転送可能性を改善する必要がある。画像拡張法に基づき、画像輝度のランダム変換により、逆例生成における過剰フィットを解消し、その転送性を向上させることが判明した。そこで本研究では,FGSM(Fast Gradient Sign Method)関連手法と統合して,より堅牢な勾配に基づく攻撃を構築し,より優れた転送性を持つ逆例を生成する,この現象に基づく逆例生成手法を提案する。 ImageNetデータセットに関する大規模な実験は、この方法の有効性を実証している。本手法は,通常のネットワークであろうと逆であれ,データ拡張に基づく攻撃手法よりもブラックボックス攻撃の成功率が高い。この手法がモデルの堅牢性の評価と改善に役立つことを期待している。

Deep neural networks are vulnerable to adversarial examples, which are crafted by adding small, human-imperceptible perturbations to the original images, but make the model output inaccurate predictions. Before deep neural networks are deployed, adversarial attacks can thus be an important method to evaluate and select robust models in safety-critical applications. However, under the challenging black-box setting, the attack success rate, i.e., the transferability of adversarial examples, still needs to be improved. Based on image augmentation methods, we found that random transformation of image brightness can eliminate overfitting in the generation of adversarial examples and improve their transferability. To this end, we propose an adversarial example generation method based on this phenomenon, which can be integrated with Fast Gradient Sign Method (FGSM)-related methods to build a more robust gradient-based attack and generate adversarial examples with better transferability. Extensive experiments on the ImageNet dataset demonstrate the method's effectiveness. Whether on normally or adversarially trained networks, our method has a higher success rate for black-box attacks than other attack methods based on data augmentation. We hope that this method can help to evaluate and improve the robustness of models.

翻訳日:2021-04-04 01:45:20 公開日:2021-01-12
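The abstract above describes combining random brightness transformation with FGSM-style attacks. The following is a minimal sketch of that idea on a toy logistic model, not the authors' implementation: the model, the brightness range `low`/`high`, the number of copies `n_copies`, and the function names are all assumptions made for illustration. It averages the input gradient over randomly brightness-scaled copies of the input and then takes a single signed step of size `eps`.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy of a toy logistic model and its gradient w.r.t. x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
    grad = [(p - y) * wi for wi in w]  # dL/dx for a logistic model
    return loss, grad


def brightness_fgsm(w, b, x, y, eps=0.1, n_copies=8, low=0.5, high=1.5, rng=None):
    """One signed-gradient step using gradients averaged over brightness-scaled copies.

    The uniform scaling range and the averaging scheme are illustrative assumptions.
    """
    rng = rng or random.Random(0)
    avg_grad = [0.0] * len(x)
    for _ in range(n_copies):
        s = rng.uniform(low, high)          # random brightness factor
        x_bright = [s * xi for xi in x]     # brightness-scaled copy of the input
        _, g = loss_and_input_grad(w, b, x_bright, y)
        for i, gi in enumerate(g):
            avg_grad[i] += s * gi / n_copies  # chain rule through the scaling
    sign = lambda v: (v > 0) - (v < 0)
    # Step in the direction that increases the loss, then clip to the valid pixel range.
    return [min(1.0, max(0.0, xi + eps * sign(gi))) for xi, gi in zip(x, avg_grad)]
```

In a real attack the gradient would come from a deep network via automatic differentiation, and the step would typically be iterated; the toy model only makes the data flow visible.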

# 迷いのないMixup

Mixup Without Hesitation ( http://arxiv.org/abs/2101.04342v1 )

ライセンス: Link先を確認

Hao Yu, Huanyu Wang, Jianxin Wu

(参考訳) ミックスアップはサンプルのペアを線形補間して新しいサンプルを作成する手法であり、実装が容易で、画像分類タスクに有効であることが示されている。しかし、ミックスアップには2つの欠点がある。1つは、十分に訓練されたモデルを得るために、より多くのトレーニングエポックが必要とされることである。もう1つは、適切なキャパシティを得るためにハイパーパラメータの調整が必要であり、それが難しい作業であることである。本稿では、ミックスアップが常に表現空間を探索していることを見出し、強化学習における探索と活用のジレンマに着想を得て、簡潔で効果的で使いやすいトレーニングアルゴリズムであるmixup Without hesitation(mWh)を提案する。mWhは、ミックスアップを基本的なデータ拡張に徐々に置き換えることで、探索と活用のバランスをうまくとることを示す。元のミックスアップよりも短いトレーニング時間で、最適なハイパーパラメータを探索することなく強いベースラインを達成できる。すなわち、mWhは迷いのないミックスアップとして振る舞う。mWhはCutMixにも適用でき、オブジェクト検出などの他の機械学習やコンピュータビジョンタスクでも一貫した改善が得られる。コードはオープンソースで、https://github.com/yuhao318/mwh で利用可能である。

Mixup linearly interpolates pairs of examples to form new samples, which is easy to implement and has been shown to be effective in image classification tasks. However, mixup has two drawbacks: first, more training epochs are needed to obtain a well-trained model; second, mixup requires tuning a hyper-parameter to gain appropriate capacity, which is a difficult task. In this paper, we find that mixup constantly explores the representation space, and inspired by the exploration-exploitation dilemma in reinforcement learning, we propose mixup Without hesitation (mWh), a concise, effective, and easy-to-use training algorithm. We show that mWh strikes a good balance between exploration and exploitation by gradually replacing mixup with basic data augmentation. It can achieve a strong baseline with less training time than the original mixup and without searching for an optimal hyper-parameter, i.e., mWh acts as mixup without hesitation. mWh can also transfer to CutMix, and gains consistent improvement on other machine learning and computer vision tasks such as object detection. Our code is open-source and available at https://github.com/yuhao318/mwh

翻訳日:2021-04-04 01:44:48 公開日:2021-01-12
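To make the mixup description above concrete, here is a small sketch of the linear interpolation step, together with a purely hypothetical schedule that gradually replaces mixup with unmixed examples. The linear schedule in `mixup_probability` is an assumption for illustration only; the abstract does not specify how mWh performs the replacement, and the authors' repository is the reference for the actual algorithm.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Linearly interpolate two examples and their one-hot labels.

    lam is drawn from Beta(alpha, alpha), as in the usual mixup formulation.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


def mixup_probability(epoch, total_epochs):
    """Hypothetical linear schedule: always mix at epoch 0, never mix at the end."""
    return max(0.0, 1.0 - epoch / total_epochs)


def make_training_example(x1, y1, x2, y2, epoch, total_epochs, rng=random):
    """Return a mixed example early in training and the plain example later on."""
    if rng.random() < mixup_probability(epoch, total_epochs):
        x, y, _ = mixup_pair(x1, y1, x2, y2, rng=rng)
        return x, y
    return list(x1), list(y1)  # stands in for basic data augmentation
```

The mixed label stays a valid probability vector because both inputs are one-hot and the two weights sum to one.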

# インタラクティブな画像分割再考: 機能空間アノテーション

Rethinking Interactive Image Segmentation: Feature Space Annotation ( http://arxiv.org/abs/2101.04378v1 )

ライセンス: Link先を確認

Jordão Bragantini (UNICAMP), Alexandre Falcão (UNICAMP), Laurent Najman (LIGM)

(参考訳) インタラクティブな画像分割手法の進歩にもかかわらず、高品質なピクセルレベルのアノテーションは依然として時間と手間がかかり、複数の深層学習アプリケーションにとってボトルネックとなっている。我々は一歩立ち返り、特徴空間投影によって導かれ、ラベリングの進行に合わせてメトリック学習により最適化される、複数の画像からの対話的かつ同時的なセグメントアノテーションを提案する。この戦略は、画像領域でアノテーションを実行する既存のインタラクティブセグメンテーション手法とは対照的である。提案手法は,iCoSeg,DAVIS,Rooftopといった前景セグメンテーションデータセットにおいて最先端手法の精度を上回りうることを示す。さらに、既知のセマンティックセグメンテーションデータセットであるCityscapesでは、元のアノテーション手順より74.75倍高速でありながら、91.5%の精度を達成している。付録では追加の定性的結果を示す。コードとビデオのデモは論文の出版時に公開される。

Despite the progress of interactive image segmentation methods, high-quality pixel-level annotation is still time-consuming and laborious -- a bottleneck for several deep learning applications. We take a step back to propose interactive and simultaneous segment annotation from multiple images guided by feature space projection and optimized by metric learning as the labeling progresses. This strategy is in stark contrast to existing interactive segmentation methodologies, which perform annotation in the image domain. We show that our approach can surpass the accuracy of state-of-the-art methods in foreground segmentation datasets: iCoSeg, DAVIS, and Rooftop. Moreover, it achieves 91.5% accuracy in a known semantic segmentation dataset, Cityscapes, being 74.75 times faster than the original annotation procedure. The appendix presents additional qualitative results. Code and video demonstration will be released upon publication.

翻訳日:2021-04-04 01:44:28 公開日:2021-01-12

# 2段階CNNに基づく丸太認識

Two-stage CNN-based wood log recognition ( http://arxiv.org/abs/2101.04450v1 )

ライセンス: Link先を確認

Georg Wimmer and Rudolf Schraml and Heinz Hofbauer and Alexander Petutschnigg and Andreas Uhl

(参考訳) 丸太の原産地証明はますます重要になりつつある。Industry 4.0の文脈において、また違法伐採と戦うために、個々の丸太を追跡する動機が高まっている。この分野における我々のこれまでの研究は、指紋認識や虹彩認識に着想を得た手法に基づき、丸太の端面のデジタル画像を用いた丸太追跡に重点を置いていた。本研究は、丸太端面のCNNに基づくセグメンテーションと、セグメント化された端面のCNNに基づく最終的な認識とを組み合わせた、畳み込みニューラルネットワーク(CNN)に基づくアプローチを提案する。CNNの訓練にはトリプレット損失関数を用いる。その結果、提案する2段階のCNNベースの手法は従来のアプローチよりも優れていることがわかった。

The proof of origin of logs is becoming increasingly important. In the context of Industry 4.0 and to combat illegal logging there is an increasing motivation to track each individual log. Our previous works in this field focused on log tracking using digital log end images based on methods inspired by fingerprint and iris-recognition. This work presents a convolutional neural network (CNN) based approach which comprises a CNN-based segmentation of the log end combined with a final CNN-based recognition of the segmented log end using the triplet loss function for CNN training. Results show that the proposed two-stage CNN-based approach outperforms traditional approaches.

翻訳日:2021-04-04 01:43:51 公開日:2021-01-12
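The abstract above mentions training the recognition CNN with a triplet loss. As a reminder of what that loss computes, here is a minimal sketch of one common form, using plain Euclidean distances on already-computed embedding vectors. The margin value and the choice of unsquared distance are assumptions; the abstract does not state which variant the authors use.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss.

    Zero once the negative is at least `margin` farther from the anchor
    than the positive is; positive otherwise.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

For log recognition, the anchor and positive would be embeddings of two images of the same log end, and the negative an embedding of a different log.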

# 画像合成におけるきめ細かいセマンティック制約

Fine-grained Semantic Constraint in Image Synthesis ( http://arxiv.org/abs/2101.04558v1 )

ライセンス: Link先を確認

Pengyang Li and Donghui Wang

(参考訳) 本稿では,精細な属性とマスクを入力として用いる多段高分解能画像合成モデルを提案する。提案モデルでは, 微粒化属性を用いて, 得られた画像の特徴を, 属性内の細粒化情報を通じて詳細に制約することができる。従来のマスクでは,生成した画像が視覚に適合するように制約され,生成する対向ネットワークから生成されたサンプルの予期せぬ多様性が低減される。また,画像の全体像とサブ領域を同時に識別することで,生成的敵ネットワークの識別能力を向上させる手法を提案する。さらに,データセットのラベル付き属性を最適化する手法を提案し,手動ラベリングノイズを低減する。その結果,画像合成モデルはよりリアルな画像を生成することがわかった。

In this paper, we propose a multi-stage and high-resolution model for image synthesis that uses fine-grained attributes and masks as input. With a fine-grained attribute, the proposed model can constrain the features of the generated image in detail through the rich, fine-grained semantic information in the attribute. With the mask as a prior, the model is constrained so that the generated images conform to visual senses, which reduces the unexpected diversity of samples generated from the generative adversarial network. This paper also proposes a scheme to improve the discriminator of the generative adversarial network by simultaneously discriminating the whole image and sub-regions of the image. In addition, we propose a method for optimizing the labeled attributes in datasets, which reduces manual labeling noise. Extensive quantitative results show that our image synthesis model generates more realistic images.

翻訳日:2021-04-04 01:43:21 公開日:2021-01-12

# 顔ランドマークの高速検出とその応用:調査

Fast Facial Landmark Detection and Applications: A Survey ( http://arxiv.org/abs/2101.10808v1 )

ライセンス: Link先を確認

Kostiantyn Khabarlak, Larysa Koriashkina

(参考訳) 本稿では,ニューラルネットワークに基づく顔のランドマーク検出アルゴリズムの探索と解析を行う。ここ数年で品質が大幅に向上したアプローチは、大きなポーズと感情の多様性、高いレベルの顔隠蔽を備えたデータセットに重点を置いています。本稿では,300-W,AFLW,WFLW,COFWという,難易度と最新度のデータセットの品質比較を行った。さらに、CPU、GPU、モバイルデバイスのアルゴリズム速度を比較します。完全性については、オープン実装で利用可能な確立されたメソッドについても簡単に触れます。さらに、ランドマーク検出アルゴリズムのアプリケーションと脆弱性についても取り上げる。それによって、将来さらなるアルゴリズム改善につながるであろう課題が生まれます。

In this paper we survey and analyze modern neural-network-based facial landmark detection algorithms. We focus on approaches that have led to a significant increase in quality over the past few years on datasets with large pose and emotion variability and high levels of face occlusion, all of which are typical in real-world scenarios. We summarize the improvements into categories and provide a quality comparison on difficult and modern in-the-wild datasets: 300-W, AFLW, WFLW, COFW. Additionally, we compare algorithm speed on CPU, GPU and mobile devices. For completeness, we also briefly touch on established methods with open implementations available. We also cover applications and vulnerabilities of landmark detection algorithms. Based on these, we raise problems that we hope will lead to further algorithm improvements in the future.

翻訳日:2021-04-04 01:42:30 公開日:2021-01-12

# HighAir:階層型グラフニューラルネットワークに基づく空気質予測手法

HighAir: A Hierarchical Graph Neural Network-Based Air Quality Forecasting Method ( http://arxiv.org/abs/2101.04264v1 )

ライセンス: Link先を確認

Jiahui Xu, Ling Chen, Mingqi Lv, Chaoqun Zhan, Sanjian Chen, Jian Chang

(参考訳) 空気質を正確に予測することは、一般市民を肺や心臓病から守るのに不可欠である。これは、異なる汚染源と様々な影響要因の間の複雑な相互作用のため、難しい課題である。既存の大気汚染予測手法では,都市と監視局間の大気汚染物質の拡散過程を効果的にモデル化することはできない。本稿では,エンコーダ・デコーダアーキテクチャを採用し,気象や土地利用など,複雑な空気品質に影響する要因を考慮した階層型グラフニューラルネットワークによる空気品質予測手法を提案する。具体的には,都市レベルのグラフと駅レベルのグラフを階層的な視点から構築し,都市レベルのパターンと駅レベルのパターンをそれぞれ検討する。我々は,レベル間インタラクションを実装するために,上位配信と下位更新という2つの戦略を設計し,レベル内インタラクションを実装するためのメッセージパッシング機構を導入する。風向に基づくエッジウェイトを動的に調整し, 動的要因と空気質の関係をモデル化する。我々は,61,500km2以内の10大都市をカバーしているヤンツェ川デルタ市のデータセットについて,HighAirと最先端の空気質予測手法を比較した。実験の結果,HighAirは他の手法よりも優れていた。

Accurately forecasting air quality is critical to protecting the general public from lung and heart diseases. This is a challenging task due to the complicated interactions among distinct pollution sources and various other influencing factors. Existing air quality forecasting methods cannot effectively model the diffusion processes of air pollutants between cities and monitoring stations, which may suddenly deteriorate the air quality of a region. In this paper, we propose HighAir, i.e., a hierarchical graph neural network-based air quality forecasting method, which adopts an encoder-decoder architecture and considers complex air quality influencing factors, e.g., weather and land usage. Specifically, we construct a city-level graph and station-level graphs from a hierarchical perspective, which can consider city-level and station-level patterns, respectively. We design two strategies, i.e., upper delivery and lower updating, to implement the inter-level interactions, and introduce a message passing mechanism to implement the intra-level interactions. We dynamically adjust edge weights based on wind direction to model the correlations between dynamic factors and air quality. We compare HighAir with state-of-the-art air quality forecasting methods on the dataset of the Yangtze River Delta city group, which covers 10 major cities within 61,500 km². The experimental results show that HighAir significantly outperforms other methods.

翻訳日:2021-04-04 01:42:22 公開日:2021-01-12

# トランザクション不正検出のための説明可能なディープビヘイビアシーケンスクラスタリング

Explainable Deep Behavioral Sequence Clustering for Transaction Fraud Detection ( http://arxiv.org/abs/2101.04285v1 )

ライセンス: Link先を確認

Wei Min, Weiming Liang, Hang Yin, Zhurong Wang, Mei Li, Alok Lal

(参考訳) eコマース業界では、ユーザー行動シーケンスデータは検索や商品販売といった多くのビジネスユニットで製品を改善するために広く使われている。しかし、その3V特性(Volume、Velocity、Variety)だけでなく非構造的な性質のために、金融サービスで使われることは稀である。本稿では,不正取引を検出するために,金融サービスシナリオ向けの深層学習に基づくクラスタリング用行動データ表現手法(FinDeepBehaviorCluster)を提案する。行動シーケンスデータを利用するために,クリックストリームデータをイベントシーケンスとして扱い,時間アテンションに基づくBi-LSTMを用いて,教師なしの方法でシーケンス埋め込みを学習し,リスクエキスパートが生成した直感的な特徴と組み合わせてハイブリッドな特徴表現を形成する。また,FAISSプロジェクトに基づいた、元のHDBSCANアルゴリズムのエンジニアリング最適化である,GPUを用いたHDBSCAN(pHDBSCAN)アルゴリズムを提案し,数億件の取引のクラスタリングを数分以内に実行できるようにする。アルゴリズムの計算効率は元の実装に比べて500倍に向上し、フラッシュ詐欺パターンの検出が可能になった。実験の結果,提案するFinDeepBehaviorClusterフレームワークは,見逃されていたビジネス価値の高い不正取引を捕捉できることがわかった。また、直感的な特徴を用いてリスクの高いクラスタからパターンを抽出するためにルール抽出法を適用し、事例調査のためにリスククラスタに説明文を付加するとともに、未知のリスクパターンをリアルタイム不正検出のために発掘できるようにする。要約すると、FinDeepBehaviorClusterは、既存のリアルタイム不正検出エンジンを補完するリスク管理戦略として、不正検出と積極的なリスク防御能力をさらに高めることができる。

In the e-commerce industry, user behavior sequence data has been widely used in many business units such as search and merchandising to improve their products. However, it is rarely used in financial services, not only due to its 3V characteristics (i.e., Volume, Velocity and Variety) but also due to its unstructured nature. In this paper, we propose a Financial Service scenario Deep learning based Behavior data representation method for Clustering (FinDeepBehaviorCluster) to detect fraudulent transactions. To utilize the behavior sequence data, we treat click stream data as an event sequence, use a time-attention-based Bi-LSTM to learn the sequence embedding in an unsupervised fashion, and combine the embeddings with intuitive features generated by risk experts to form a hybrid feature representation. We also propose a GPU-powered HDBSCAN (pHDBSCAN) algorithm, which is an engineering optimization of the original HDBSCAN algorithm based on the FAISS project, so that clustering can be carried out on hundreds of millions of transactions within a few minutes. The computation efficiency of the algorithm has increased 500 times compared with the original implementation, which makes flash fraud pattern detection feasible. Our experimental results show that the proposed FinDeepBehaviorCluster framework is able to catch missed fraudulent transactions with considerable business value. In addition, a rule extraction method is applied to extract patterns from risky clusters using intuitive features, so that narrative descriptions can be attached to the risky clusters for case investigation, and unknown risk patterns can be mined for real-time fraud detection. In summary, FinDeepBehaviorCluster, as a complementary risk management strategy to the existing real-time fraud detection engine, can further increase our fraud detection and proactive risk defense capabilities.

翻訳日:2021-04-04 01:42:00 公開日:2021-01-12

# マルチタスク学習によるシードストッキング

Seed Stocking Via Multi-Task Learning ( http://arxiv.org/abs/2101.04333v1 )

ライセンス: Link先を確認

Yunhe Feng and Wenjun Zhou

(参考訳) 作物種子の販売者は、少なくとも1年は在庫する種子の種類や量を計画する必要がある。 1つの作物には多数の種子品種があり、それぞれが異なる生育条件下で最高の性能を発揮できる。天候の予測不能を考えると、農家は高い収量と低いリスクのバランスをとる決定を下さなければならない。種子ベンダーは、農家のニーズを予想し、それらを準備する必要がある。本研究では,3つの主要なステップで種子需要を推定するための分析フレームワークを提案する。まず、各品種の収量とリスクを、あたかもそれぞれの場所に植えられたかのように見積もる。異なる種種を用いた過去の実験は品種間で非常に不均衡であり, 生育条件の組合せは少ないため, 類似品種の情報を借りるためにマルチタスク学習を採用している。第2に,収量とリスクのトレードオフを求めることにより,各地の種子のベストミックスを決定する。第3に,このようなミックスを集約して,成長する各場所の収量とリスクを再バランスさせるために,上位5品種を選択します。マルチタスク学習は収率予測に有効なソリューションであり、全体的な分析フレームワークは優れたパフォーマンスをもたらしています。

Sellers of crop seeds need to plan for the variety and quantity of seeds to stock at least a year in advance. There are a large number of seed varieties of one crop, and each can perform best under different growing conditions. Given the unpredictability of weather, farmers need to make decisions that balance high yield and low risk. A seed vendor needs to be able to anticipate the needs of farmers and have them ready. In this study, we propose an analytical framework for estimating seed demand with three major steps. First, we will estimate the yield and risk of each variety as if they were planted at each location. Since past experiments performed with different seed varieties are highly unbalanced across varieties, and the combination of growing conditions is sparse, we employ multi-task learning to borrow information from similar varieties. Second, we will determine the best mix of seeds for each location by seeking a tradeoff between yield and risk. Third, we will aggregate such mix and pick the top five varieties to re-balance the yield and risk for each growing location. We find that multi-task learning provides a viable solution for yield prediction, and our overall analytical framework has resulted in a good performance.

翻訳日:2021-04-04 01:41:27 公開日:2021-01-12

# エッジIoTソリューションのための信頼性の高いフリート分析

Reliable Fleet Analytics for Edge IoT Solutions ( http://arxiv.org/abs/2101.04414v1 )

ライセンス: Link先を確認

Emmanuel Raj, Magnus Westerlund, Leonardo Espinosa-Leal

(参考訳) 近年、iot(internet of things)デバイスのデプロイメントが急増し、ビッグデータと低レイテンシ通信の需要が高まりました。インフラストラクチャの需要の変化は、IoTアプリケーションに人工知能を使用することで、リアルタイムな意思決定を可能にする。 AIoT(Artificial Intelligence of Things)は、AI(Artificial Intelligence)テクノロジとIoTインフラストラクチャの組み合わせで、堅牢で効率的な操作と意思決定を提供する。 AIoTアプリケーションを実現するためにエッジコンピューティングが登場している。エッジコンピューティングは、データソースまたはその近くで洞察と意思決定を生成し、クラウドまたは中央リポジトリに送信されるデータ量を削減することができる。本稿では,エッジにおける機械学習モデル(Edge MLOps)の継続的デリバリ,デプロイメント,監視を可能にするために,AIoTアプリケーションのエッジでの機械学習を容易にするフレームワークを提案する。コントリビューションは、大規模にフリート分析を提供するためのサービス、ツール、メソッドを含むアーキテクチャである。本稿では,大学キャンパスの部屋でiotデバイスを用いた実験を行うことで,フレームワークの予備検証を行う。機械学習実験では,各エッジデバイスに配置したモデルを用いて,各室内の空気質を予測するための多変量時系列予測を行う。これらの実験により,提案するフリート分析フレームワークの効率性とロバスト性を検証する。

In recent years we have witnessed a boom in Internet of Things (IoT) device deployments, which has resulted in big data and demand for low-latency communication. This shift in the demand for infrastructure is also enabling real-time decision making using artificial intelligence for IoT applications. Artificial Intelligence of Things (AIoT) is the combination of Artificial Intelligence (AI) technologies and the IoT infrastructure to provide robust and efficient operations and decision making. Edge computing is emerging to enable AIoT applications. Edge computing enables generating insights and making decisions at or near the data source, reducing the amount of data sent to the cloud or a central repository. In this paper, we propose a framework for facilitating machine learning at the edge for AIoT applications, to enable continuous delivery, deployment, and monitoring of machine learning models at the edge (Edge MLOps). The contribution is an architecture that includes services, tools, and methods for delivering fleet analytics at scale. We present a preliminary validation of the framework by performing experiments with IoT devices on a university campus's rooms. For the machine learning experiments, we forecast multivariate time series for predicting air quality in the respective rooms by using the models deployed in respective edge devices. By these experiments, we validate the proposed fleet analytics framework for efficiency and robustness.

翻訳日:2021-04-04 01:41:09 公開日:2021-01-12

# 貯水池を有する流域の大陸規模流量モデリング:有効性の実証と課題の明確化

Continental-scale streamflow modeling of basins with reservoirs: a demonstration of effectiveness and a delineation of challenges ( http://arxiv.org/abs/2101.04423v1 )

ライセンス: Link先を確認

Wenyu Ouyang, Kathryn Lawson, Dapeng Feng, Lei Ye, Chi Zhang, Chaopeng Shen

(参考訳) 主要水路の大部分には流量に影響を与えるダムがあり、大規模な水文モデリングではこれを考慮する必要がある。しかし,ダムを有する流域の日流量予測は,特に大規模では,様々なモデリング手法にとって困難である。そこで我々は,容易に入手可能な情報のみを用いた長短期記憶(LSTM)深層学習モデルによって,どのタイプの流域を適切に表現できるかを,分割統治のアプローチで検討した。アメリカ合衆国本土の3557流域(83%がダムあり)のデータを解析し,貯水池の用途,容量対流出比(dor),および分水が流量モデリングに強く影響することを確認した。驚いたことに,広く使われている参照流域データセットで訓練したLSTMモデルは非参照流域では性能が低かったが,データセット全体で訓練したモデルは,テストにおけるNash-Sutcliffe効率係数(NSE)の中央値が0.74となり,ベンチマークレベルの性能に達した。dorがゼロ,小,大の流域はそれぞれ異なる挙動を示し,カテゴリー間でモデルを移し替えると壊滅的な結果となった。しかし,異なる集合のデータをプールして訓練すると,これらのグループに対してそれぞれ0.73,0.78,0.71という最適な中央値NSEが得られ,既存のモデルに対して顕著な優位性を示した。これらの結果は,小規模なダムを降雨流出プロセスの一部としてモデル化する一貫した混合モデリング戦略を支持する。ただし,ダムのある流域を参照流域として扱ってはならず,訓練セットに含めなければならない。そのうえで,dorの大きい貯水池は明示的に表現でき,今後の研究では,防火用と灌漑用の貯水池,続いて水力発電用や洪水調節用などの貯水池のモデリングを検討すべきである。

A large fraction of major waterways have dams influencing streamflow, which must be accounted for in large-scale hydrologic modeling. However, daily streamflow prediction for basins with dams is challenging for various modeling approaches, especially at large scales. Here we took a divide-and-conquer approach to examine which types of basins could be well represented by a long short-term memory (LSTM) deep learning model using only readily-available information. We analyzed data from 3557 basins (83% dammed) over the contiguous United States and noted strong impacts of reservoir purposes, capacity-to-runoff ratio (dor), and diversion on streamflow modeling. Surprisingly, while the LSTM model trained on a widely-used reference-basin dataset performed poorly for more non-reference basins, the model trained on the whole dataset presented a median test Nash-Sutcliffe efficiency coefficient (NSE) of 0.74, reaching benchmark-level performance. The zero-dor, small-dor, and large-dor basins were found to have distinct behaviors, so migrating models between categories yielded catastrophic results. However, training with pooled data from different sets yielded optimal median NSEs of 0.73, 0.78, and 0.71 for these groups, respectively, showing noticeable advantages over existing models. These results support a coherent, mixed modeling strategy where smaller dams are modeled as part of rainfall-runoff processes, but dammed basins must not be treated as reference ones and must be included in the training set; then, large-dor reservoirs can be represented explicitly and future work should examine modeling reservoirs for fire protection and irrigation, followed by those for hydroelectric power generation, and flood control, etc.

翻訳日:2021-04-04 01:40:49 公開日:2021-01-12
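The results above are reported as median Nash-Sutcliffe efficiency (NSE) values. For readers unfamiliar with the metric, the sketch below computes NSE for one basin from observed and simulated streamflow, and the median across basins. The function names and the list-of-pairs input format are assumptions chosen for illustration.

```python
import statistics


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect match; 0.0 means the simulation is no better than
    always predicting the mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    denom = sum((o - mean_obs) ** 2 for o in observed)
    if denom == 0.0:
        raise ValueError("NSE is undefined when the observations have zero variance")
    numer = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return 1.0 - numer / denom


def median_nse(basins):
    """Median NSE over an iterable of (observed, simulated) pairs, one per basin."""
    return statistics.median(nse(obs, sim) for obs, sim in basins)
```

NSE has no lower bound, so a model that is worse than the observed mean produces negative values.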

# 消費税の不正理解のための進化的ゲームモデル

An Evolutionary Game Model for Understanding Fraud in Consumption Taxes ( http://arxiv.org/abs/2101.04424v1 )

ライセンス: Link先を確認

M. Chica and J. Hernandez and C. Manrique-de-Lara-Peñate and R. Chiong

(参考訳) 本稿では,消費税体系における不正行為のダイナミクスを研究・理解するための計算進化ゲームモデルを提案する。プレイヤーは、価値付加税(vat)を正しく宣言した場合は協力者であり、そうでない場合は離反者である。各プレイヤーの支払いは、回避された金額と税務当局によって検査される主観的確率に影響される。企業間の取引は買い手と売り手の両方が宣言しなければならないため、一方が採用する戦略は他方の支払いに影響を与える。我々は,このモデルについて,個体群と異なるスケールフリーネットワークを用いて検討する。スペイン・カナリア諸島に登録された企業によるVAT宣言の実際のデータを用いて,モデルパラメータを校正した。我々は,高低取引における監査確率のシナリオと人口の頻度,社会報酬や罰則を分析し,協力者の比率を高めるための最も効率的な政策を見出すことができた。 2つの大きな洞察が得られた。第一に、低取引に対する主観的な監査確率の増加は、高取引に対するこの確率の増加よりも効率的である。第二に、協力者に対する社会的報酬や、欠陥者に対する代替罰が効果的な政策であり得るが、その成功は、低取引と高取引の監査確率の分布に依存する。

This paper presents a computational evolutionary game model to study and understand fraud dynamics in the consumption tax system. Players are cooperators if they correctly declare their value added tax (VAT), and are defectors otherwise. Each player's payoff is influenced by the amount evaded and the subjective probability of being inspected by tax authorities. Since transactions between companies must be declared by both the buyer and seller, a strategy adopted by one influences the other's payoff. We study the model with a well-mixed population and different scale-free networks. Model parameters were calibrated using real-world data of VAT declarations by businesses registered in the Canary Islands region of Spain. We analyzed several scenarios of audit probabilities for high and low transactions and their prevalence in the population, as well as social rewards and penalties to find the most efficient policy to increase the proportion of cooperators. Two major insights were found. First, increasing the subjective audit probability for low transactions is more efficient than increasing this probability for high transactions. Second, favoring social rewards for cooperators or alternative penalties for defectors can be effective policies, but their success depends on the distribution of the audit probability for low and high transactions.

翻訳日:2021-04-04 01:39:48 公開日:2021-01-12

# 説明可能性の拡大:AIシステムにおける社会的透明性を目指して

Expanding Explainability: Towards Social Transparency in AI systems ( http://arxiv.org/abs/2101.04719v1 )

ライセンス: Link先を確認

Upol Ehsan, Q. Vera Liao, Michael Muller, Mark O. Riedl, Justin D. Weisz

(参考訳) AIを利用したシステムは、連続的な意思決定を仲介する傾向にあるため、エンドユーザーが情報と説明責任を負う行動を取ることが重要である。人間と人間の相互作用の説明は社会的に構成されている。 AIシステムはしばしば社会組織に組み込まれる。しかし、説明可能なAI(XAI)アプローチは主にアルゴリズム中心である。我々は、社会的な組織的文脈をAIによる意思決定の説明に取り入れた社会的透明性(Social Transparency, ST)を導入し、探求することで、社会的なXAIへの発展的な一歩を踏み出した。 stを概念的に探究するため,我々は投機的設計シナリオに基づく29人のaiユーザと実践者とのインタビューを行った。我々はSTの構成的設計要素を提案し、STの効果と含意を技術、意思決定、組織レベルで解き放つ概念的枠組みを開発した。このフレームワークは、STがAIに対する信頼を校正し、意思決定を改善し、組織的な集団行動を促進し、全体的説明責任を育む方法について説明している。本研究は, XAI の設計空間を拡大し,人間中心型 XAI の言説に寄与する。

As AI-powered systems increasingly mediate consequential decision-making, their explainability is critical for end-users to take informed and accountable actions. Explanations in human-human interactions are socially-situated. AI systems are often socio-organizationally embedded. However, Explainable AI (XAI) approaches have been predominantly algorithm-centered. We take a developmental step towards socially-situated XAI by introducing and exploring Social Transparency (ST), a sociotechnically informed perspective that incorporates the socio-organizational context into explaining AI-mediated decision-making. To explore ST conceptually, we conducted interviews with 29 AI users and practitioners grounded in a speculative design scenario. We suggested constitutive design elements of ST and developed a conceptual framework to unpack ST's effect and implications at the technical, decision-making, and organizational level. The framework showcases how ST can potentially calibrate trust in AI, improve decision-making, facilitate organizational collective actions, and cultivate holistic explainability. Our work contributes to the discourse of Human-Centered XAI by expanding the design space of XAI.

翻訳日:2021-04-04 01:39:27 公開日:2021-01-12

# 肺疾患におけるCT画像の定量および自動解析のための患者別アプローチ:COVID-19患者への応用

A patient-specific approach for quantitative and automatic analysis of computed tomography images in lung disease: application to COVID-19 patients ( http://arxiv.org/abs/2101.04430v1 )

ライセンス: Link先を確認

L. Berta, C. De Mattia, F. Rizzetto, S. Carrazza, P.E. Colombo, R. Fumagalli, T. Langer, D. Lizio, A. Vanzulli, A. Torresin

(参考訳) 肺CT画像の定量的な計測は広く用いられており、しばしば生理学との明確なつながりがない。本研究は,CT画像(WAVE)における肺の高度評価のための患者非依存モデルを提案する。肺の下部CTヒストグラムデータポイントに平均 (Mu.f) と幅 (Sigma.f) のガウスフィットを適用し, よく評価された肺体積 (WAVE.f) を推定した。肺CT画像と4DCT画像を用いて,CT再建パラメータと呼吸周期の独立性を解析した。第3のコホートで算出されたガウス測定値と第1の放射線学的特徴を健康な肺と比較した。各肺はさらに24領域に区分され, 局所密度変化を表すため, ガウスフィットパラメータmu.f由来の新しいバイオマーカーが提案されている。 WAVE.fは80%の症例で呼吸運動から独立していた。 1%, 2%, 最大14%の違いは, 適度な反復強度とFBPアルゴリズム, 1mm, 3mmのスライス厚, 異なる再構成カーネルを比較した。健康な被験者は、計算されたすべての指標について、COVID-19患者と大きく異なっていた。局所バイオマーカーのグラフィカル表現は、単一の2次元画像において空間的および定量的情報を提供する。固定ヒストグラム閾値に基づく他の指標とは異なり、このモデルは物体間および物体内変動性を考えることができる。さらに、観察者とは独立に、病気の重症度を定量化するための局所バイオマーカーを定義する。

Quantitative metrics in lung computed tomography (CT) images have been widely used, often without a clear connection with physiology. This work proposes a patient-independent model for the estimation of well-aerated volume of lungs in CT images (WAVE). A Gaussian fit, with mean (Mu.f) and width (Sigma.f) values, was applied to the lower CT histogram data points of the lung to provide the estimation of the well-aerated lung volume (WAVE.f). Independence from CT reconstruction parameters and respiratory cycle was analysed using healthy lung CT images and 4DCT acquisitions. The Gaussian metrics and first order radiomic features calculated for a third cohort of COVID-19 patients were compared with those of healthy lungs. Each lung was further segmented into 24 subregions and a new biomarker derived from the Gaussian fit parameter Mu.f was proposed to represent the local density changes. WAVE.f was independent of respiratory motion in 80% of the cases. Differences of 1%, 2% and up to 14% were found when comparing a moderate iterative strength with the FBP algorithm, 1 mm with 3 mm slice thickness, and different reconstruction kernels. Healthy subjects were significantly different from COVID-19 patients for all the metrics calculated. Graphical representation of the local biomarker provides spatial and quantitative information in a single 2D picture. Unlike other metrics based on fixed histogram thresholds, this model is able to consider the inter- and intra-subject variability. In addition, it defines a local biomarker to quantify the severity of the disease, independently of the observer.

翻訳日:2021-04-04 01:39:08 公開日:2021-01-12

# KuzborskijとSzepesváriの信頼境界について

A note on a confidence bound of Kuzborskij and Szepesvári ( http://arxiv.org/abs/2101.04671v1 )

ライセンス: Link先を確認

Omar Rivasplata

(参考訳) 興味深い最近の研究で、Kuzborskij と Szepesvári は独立確率変数の関数に対する信頼境界を導出した。これは、集中性と選択した関数の二乗摂動とを関係づける不等式に基づいている。Kuzborskij と Szepesvári はまた、この信頼境界の PAC-Bayes 化も確立した。彼らの研究の2つの重要な側面は、確率変数が非有界な範囲をとりうること、そして必ずしも同一の分布に従うとは限らないことである。このノートの目的は、証明を簡素化したうえで、これらの興味深い結果を紹介し議論することである。この解説ノートは、例えて言えば「本編の映画」は楽しむが予告編は飛ばしたい人のために書かれている。

In an interesting recent work, Kuzborskij and Szepesvári derived a confidence bound for functions of independent random variables, which is based on an inequality that relates concentration to squared perturbations of the chosen function. Kuzborskij and Szepesvári also established the PAC-Bayes-ification of their confidence bound. Two important aspects of their work are that the random variables could be of unbounded range, and not necessarily of an identical distribution. The purpose of this note is to advertise/discuss these interesting results, with streamlined proofs. This expository note is written for persons who, metaphorically speaking, enjoy the "featured movie" but prefer to skip the preview sequence.

翻訳日:2021-04-04 01:38:41 公開日:2021-01-12

# 自己教師あり表現学習による画像からの銀河距離の推定

Estimating Galactic Distances From Images Using Self-supervised Representation Learning ( http://arxiv.org/abs/2101.04293v1 )

ライセンス: Link先を確認

Md Abul Hayat, Peter Harrington, George Stein, Zarija Lukić, Mustafa Mustafa

(参考訳) 対照的な自己教師あり学習フレームワークを用いて、測光画像から銀河までの距離を推定する。我々は、コンピュータビジョンのデータ拡張に加えて、銀河系のダストを考慮したアプリケーション固有の拡張を取り入れた。結果として得られる銀河画像の視覚的表現は意味的に有用であり、高速な類似性検索が可能で、赤方偏移推定のタスクに対してうまく微調整できることがわかった。本研究では,(1)ラベルなしデータの大規模なコーパスで事前学習した後に一部のラベルで微調整することで,2〜4倍のラベル付きデータを必要とする完全教師ありモデルの精度を達成できること,(2)Sloan Digital Sky Survey (SDSS)のMain Galaxy Sampleで利用可能なすべてのデータラベルを用いて自己教師あり表現を微調整することにより,最先端の教師あり学習手法を上回ることを示す。

We use a contrastive self-supervised learning framework to estimate distances to galaxies from their photometric images. We incorporate data augmentations from computer vision as well as an application-specific augmentation accounting for galactic dust. We find that the resulting visual representations of galaxy images are semantically useful and allow for fast similarity searches, and can be successfully fine-tuned for the task of redshift estimation. We show that (1) pretraining on a large corpus of unlabeled data followed by fine-tuning on some labels can attain the accuracy of a fully-supervised model which requires 2-4x more labeled data, and (2) that by fine-tuning our self-supervised representations using all available data labels in the Main Galaxy Sample of the Sloan Digital Sky Survey (SDSS), we outperform the state-of-the-art supervised learning method.

翻訳日:2021-04-04 01:38:28 公開日:2021-01-12
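The abstract above relies on contrastive self-supervised learning but does not name a specific loss. As an illustration only, the sketch below implements InfoNCE, one widely used contrastive objective, on plain Python vectors. Treating two augmented views of the same galaxy image as the query and positive, the temperature value, and the use of cosine similarity are all assumptions here, not details taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE loss: cross-entropy of picking the positive among all candidates."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, n) / temperature for n in negatives]
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]
```

The loss is small when the query is closer to its positive than to every negative, which is what pushes augmented views of the same image together in representation space.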

# CAnet:深層学習を用いたFDD大規模MIMOにおけるアップリンク支援ダウンリンクチャネル獲得

CAnet: Uplink-aided Downlink Channel Acquisition in FDD Massive MIMO using Deep Learning ( http://arxiv.org/abs/2101.04377v1 )

ライセンス: Link先を確認

Jiajia Guo, Chao-Kai Wen, Shi Jin

(参考訳) 周波数分割二重化システムでは、ダウンリンクチャネル状態情報(CSI)取得方式は高いトレーニングとフィードバックのオーバーヘッドをもたらす。本稿では,これらのオーバーヘッドを軽減するために,ディープラーニングを用いたアップリンク支援ダウンリンクチャネル獲得フレームワークを提案する。チャネル推定やフィードバックモジュールのみに焦点を当てた既存の作業とは異なり、私たちの知る限りでは、ダウンリンクパイロット設計、チャネル推定、フィードバックを含む、ダウンリンクCSI取得プロセス全体を考慮した最初の研究である。まず,角領域の双方向チャネル間の相関を利用して適応的なパイロット設計モジュールを提案し,チャネル推定を改善する。次に、フィードバックモジュール中のビット割り当て問題を回避するため、複雑なチャネルを結合し、基地局のチャネル再構成にアップリンクチャネルの大きさを埋め込む。最後に、上記の2つのモジュールを組み合わせて、2つの人気のあるダウンリンクチャネル獲得フレームワークを比較します。前者のフレームワークは、その後、ユーザ機器のチャネルを推定し、返送する。後者のユーザ装置は、受信したパイロット信号を基地局に直接送り返す。その結果、アップリンクの助けを借りて、パイロット信号を直接フィードバックすることで、約20%のフィードバックビットを節約できることがわかった。

In frequency-division duplexing systems, the downlink channel state information (CSI) acquisition scheme leads to high training and feedback overheads. In this paper, we propose an uplink-aided downlink channel acquisition framework using deep learning to reduce these overheads. Unlike most existing works that focus only on channel estimation or feedback modules, to the best of our knowledge, this is the first study that considers the entire downlink CSI acquisition process, including downlink pilot design, channel estimation, and feedback. First, we propose an adaptive pilot design module by exploiting the correlation in magnitude among bidirectional channels in the angular domain to improve channel estimation. Next, to avoid the bit allocation problem during the feedback module, we concatenate the complex channel and embed the uplink channel magnitude to the channel reconstruction at the base station. Lastly, we combine the above two modules and compare two popular downlink channel acquisition frameworks. The former framework estimates and feeds back the channel at the user equipment subsequently. The user equipment in the latter one directly feeds back the received pilot signals to the base station. Our results reveal that, with the help of uplink, directly feeding back the pilot signals can save approximately 20% of feedback bits, which provides a guideline for future research.

翻訳日:2021-04-04 01:38:08 公開日:2021-01-12

# ラジオミクス特徴とコントラスト学習を用いた胸部X線上の肺炎検出

Pneumonia Detection on Chest X-ray using Radiomic Features and Contrastive Learning ( http://arxiv.org/abs/2101.04269v1 )

ライセンス: Link先を確認

Yan Han, Chongyan Chen, Ahmed H Tewfik, Ying Ding, Yifan Peng

(参考訳) 胸部X線は非侵襲性から最も一般的な診断手段の1つとなっている。胸部X線画像の数は急増しているが、その読影は依然として放射線科医が手作業で行っており、深刻な燃え尽き(バーンアウト)や遅延を招いている。医用画像から多数の定量的特徴を抽出できる放射線医学のサブフィールドであるラジオミクスは、深層学習時代以前から医用画像診断を支援する可能性を示してきた。深層学習の台頭に伴い、胸部X線診断における深層ニューラルネットワークの説明可能性はいまだ不透明である。本研究では,ラジオミクス特徴とコントラスト学習を活用して胸部X線中の肺炎を検出する新しい枠組みを提案する。 RSNA肺炎検出チャレンジデータセットを用いた実験により,提案モデルはいくつかの最先端モデルを上回る結果(F1スコアで10%以上)を達成し,モデルの解釈性も向上することが示された。

Chest X-ray has become one of the most common medical diagnostic tools due to its noninvasiveness. The number of chest X-ray images has skyrocketed, but reading chest X-rays is still performed manually by radiologists, which causes severe burnout and delays. Traditionally, radiomics, a subfield of radiology that can extract a large number of quantitative features from medical images, demonstrated its potential to facilitate medical imaging diagnosis before the deep learning era. With the rise of deep learning, the explainability of deep neural networks on chest X-ray diagnosis remains opaque. In this study, we propose a novel framework that leverages radiomics features and contrastive learning to detect pneumonia in chest X-rays. Experiments on the RSNA Pneumonia Detection Challenge dataset show that our model achieves superior results to several state-of-the-art models (> 10% in F1-score) and increases the model's interpretability.

翻訳日:2021-04-04 01:37:49 公開日:2021-01-12

# LiDARおよびカメラセンサセットアップの自動外部校正法

Automatic Extrinsic Calibration Method for LiDAR and Camera Sensor Setups ( http://arxiv.org/abs/2101.04431v1 )

ライセンス: Link先を確認

Jorge Beltrán, Carlos Guindel, Fernando García

(参考訳) ほとんどのセンサーはlidarと視覚システムで構成されており、ロバストなシーン理解を得るために必要な異なるアルゴリズムの信頼性を向上させる補完的情報を提供する。しかし、異なるソースからの情報の効果的な使用には、関連するセンサー間の正確なキャリブレーションが必要である。そこで本研究では,LiDAR,モノクラーカメラ,ステレオカメラを含むセンサ対の外部パラメータを同一あるいは異なるモードで校正する手法を提案する。第1に、カスタム校正対象に属する基準点を校正するセンサによって提供されるデータから抽出し、第2に、両点セットの登録により最適な剛性変換を求める。提案手法は、通常車両のセットアップで見られるように、非常に異なる解像度とポーズのデバイスを扱うことができる。提案手法の性能を評価するため,一般的なシミュレーションフレームワーク上に構築された新しい評価スイートを紹介した。合成環境における実験により, キャリブレーションアルゴリズムは既存の手法よりも有意に優れており, 実データテストは評価スイートで得られた結果と相関することがわかった。オープンソースコードはhttps://github.com/beltransen/velo2cam_calibrationで入手できる。

Most sensor setups for onboard autonomous perception are composed of LiDARs and vision systems, as they provide complementary information that improves the reliability of the different algorithms necessary to obtain a robust scene understanding. However, the effective use of information from different sources requires an accurate calibration between the sensors involved, which usually implies a tedious and burdensome process. We present a method to calibrate the extrinsic parameters of any pair of sensors involving LiDARs, monocular or stereo cameras, of the same or different modalities. The procedure is composed of two stages: first, reference points belonging to a custom calibration target are extracted from the data provided by the sensors to be calibrated, and second, the optimal rigid transformation is found through the registration of both point sets. The proposed approach can handle devices with very different resolutions and poses, as usually found in vehicle setups. In order to assess the performance of the proposed method, a novel evaluation suite built on top of a popular simulation framework is introduced. Experiments on the synthetic environment show that our calibration algorithm significantly outperforms existing methods, whereas real data tests corroborate the results obtained in the evaluation suite. Open-source code is available at https://github.com/beltransen/velo2cam_calibration

翻訳日:2021-04-04 01:37:33 公開日:2021-01-12
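The second stage described in the abstract above (finding the optimal rigid transformation by registering two point sets) can be illustrated with a minimal sketch. The abstract does not say which registration algorithm the authors use; this example assumes known point correspondences and uses the standard SVD-based (Kabsch) closed-form solution. The function name `rigid_transform` is hypothetical and is not taken from the linked repository.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that dst ~ src @ R.T + t.

    Assumes src[i] corresponds to dst[i] (e.g. matched reference points
    of a calibration target seen by two sensors).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) in the SVD solution
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

With at least three non-collinear correspondences the solution is unique, so a planar target with four reference points would be sufficient for this sketch.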

# 実環境における共同デモザイキング・デノイジング: 正解データの不確かさの下での学習の場合

Joint Demosaicking and Denoising in the Wild: The Case of Training Under Ground Truth Uncertainty ( http://arxiv.org/abs/2101.04442v1 )

ライセンス: Link先を確認

Jierun Chen, Song Wen, S.-H. Gary Chan

(参考訳) 画像のデモザイキングとデノイジングは、デジタルカメラパイプラインにおける2つの基本的なステップであり、ノイズの多い輝度値からクリーンなカラー画像を再構成することを目的とする。本稿では,実環境における共同デモザイキング・デノイジングのための新しい学習フレームワークであるWild-JDDを提案し,検討する。トレーニングデータの正解(ground truth)が現実を完全に反映していると一般的に仮定する先行研究とは対照的に、ここでは実環境でより一般的な、正解に不確かさがある不完全なケースを考察する。まず, この不確かさがジッパー効果, 色モアレ, 残留ノイズなど, 様々な種類のアーティファクトとして現れることを示す。次に,基底分布に共役事前分布を課す2段階のデータ劣化過程を定式化し,正解の不確かさを捉える。その後、劣化した入力を条件とする共役事前分布のパラメータを近似するニューラルネットワークを訓練するために、エビデンス下界(ELBO)損失を導出する。最後に, 分布外入力に対する性能をさらに高めるために, 入力を弱情報事前分布として用いる単純かつ効果的な微調整戦略を考案する。正解の不確かさを考慮することで、Wild-JDDは最適化において良好な解釈可能性を持つ。広範な実験により、合成データセットと現実的なRAWデータセットの両方で、共同デモザイキング・デノイジングタスクにおいて最先端の手法を上回ることが検証された。

Image demosaicking and denoising are the two key fundamental steps in digital camera pipelines, aiming to reconstruct clean color images from noisy luminance readings. In this paper, we propose and study Wild-JDD, a novel learning framework for joint demosaicking and denoising in the wild. In contrast to previous works which generally assume the ground truth of training data is a perfect reflection of the reality, we consider here the more common imperfect case of ground truth uncertainty in the wild. We first illustrate its manifestation as various kinds of artifacts including zipper effect, color moire and residual noise. Then we formulate a two-stage data degradation process to capture such ground truth uncertainty, where a conjugate prior distribution is imposed upon a base distribution. After that, we derive an evidence lower bound (ELBO) loss to train a neural network that approximates the parameters of the conjugate prior distribution conditioned on the degraded input. Finally, to further enhance the performance for out-of-distribution input, we design a simple but effective fine-tuning strategy by taking the input as a weakly informative prior. Taking into account ground truth uncertainty, Wild-JDD enjoys good interpretability during optimization. Extensive experiments validate that it outperforms state-of-the-art schemes on joint demosaicking and denoising tasks on both synthetic and realistic raw datasets.

翻訳日:2021-04-04 01:37:13 公開日:2021-01-12

# Binary TTC: 自律ナビゲーションのための時間ジオフェンス

Binary TTC: A Temporal Geofence for Autonomous Navigation ( http://arxiv.org/abs/2101.04777v1 )

ライセンス: Link先を確認

Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

(参考訳) タイム・トゥ・コンタクト(TTC、Time-to-Contact)は、物体が観測者の平面と衝突するまでの時間であり、経路計画のための強力なツールである。シーン内の物体の深度、速度、加速度よりも多くの情報を持つ可能性があり、これは人間にとっても同様である。 TTCには、校正されていない単眼カメラのみで済むなど、いくつかの利点がある。しかし、各画素に対するTTCの回帰は簡単ではなく、既存のほとんどの手法はシーンについて過度に単純化した仮定を置いている。この課題に対処するために、TTCを一連のより単純な二値分類によって推定する。我々は、観測者が一定の時間内に障害物と衝突するかどうかを低レイテンシで予測する。これは、画素ごとの正確なTTCを知ることよりも重要であることが多い。このようなシナリオでは、提案手法は6.4ミリ秒で時間的ジオフェンスを提供し、これは既存手法の25倍以上高速である。提案手法は,計算予算が許す場合,任意に細かい量子化(連続値を含む)で画素ごとのTTCを推定することもできる。我々の知る限り,提案手法は実用に十分な高フレームレートでTTC情報(二値または粗く量子化されたもの)を提供する初めての手法である。

Time-to-contact (TTC), the time for an object to collide with the observer's plane, is a powerful tool for path planning: it is potentially more informative than the depth, velocity, and acceleration of objects in the scene -- even for humans. TTC presents several advantages, including requiring only a monocular, uncalibrated camera. However, regressing TTC for each pixel is not straightforward, and most existing methods make over-simplifying assumptions about the scene. We address this challenge by estimating TTC via a series of simpler, binary classifications. We predict with low latency whether the observer will collide with an obstacle within a certain time, which is often more critical than knowing exact, per-pixel TTC. For such scenarios, our method offers a temporal geofence in 6.4 ms -- over 25x faster than existing methods. Our approach can also estimate per-pixel TTC with arbitrarily fine quantization (including continuous values), when the computational budget allows for it. To the best of our knowledge, our method is the first to offer TTC information (binary or coarsely quantized) at sufficiently high frame-rates for practical use.

翻訳日:2021-04-04 01:36:32 公開日:2021-01-12
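The idea of replacing per-pixel TTC regression with a series of binary questions can be illustrated with plain geometry. This is only a sketch under stated assumptions, not the authors' learned model: it assumes a constant closing speed and that the apparent-size ratio of an object between two frames is already known. All function names are hypothetical.

```python
def time_to_contact(scale_ratio, dt):
    """TTC in seconds, measured from the second frame.

    scale_ratio = apparent size in frame 2 / apparent size in frame 1,
    dt = time between the two frames. Assumes constant closing speed.
    """
    if scale_ratio <= 1.0:
        return float("inf")  # object is not getting closer
    return dt / (scale_ratio - 1.0)

def collides_within(scale_ratio, dt, horizon):
    """Binary question: is TTC below `horizon`? Reduces to a scale threshold."""
    return scale_ratio > 1.0 + dt / horizon

def quantized_ttc(scale_ratio, dt, horizons):
    """Bracket TTC with a series of binary tests over increasing horizons."""
    lower = 0.0
    for h in sorted(horizons):
        if collides_within(scale_ratio, dt, h):
            return (lower, h)
        lower = h
    return (lower, float("inf"))
```

For an object 10 m away closing at 5 m/s with frames 0.1 s apart, the scale ratio is 10/9.5 and the TTC is 1.9 s; a single threshold test answers "within 2 s?" without computing the exact value, and adding more horizons refines the quantization.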

# ドメインフリーな医用画像拡張のための生成逆U-Net

Generative Adversarial U-Net for Domain-free Medical Image Augmentation ( http://arxiv.org/abs/2101.04793v1 )

ライセンス: Link先を確認

Xiaocong Chen and Yun Li and Lina Yao and Ehsan Adeli and Yu Zhang

(参考訳) 注釈付き医用画像の不足は、医用画像コンピューティングの分野における最大の課題の1つだ。十分な数のトレーニングサンプルがなければ、ディープラーニングベースのモデルは過剰適合の問題に苦しむ可能性が高い。一般的な解決策は、画像の回転、トリミング、リサイズなどの画像操作である。これらの方法は、より多くのトレーニングサンプルが導入されるため、過剰適合の問題を緩和するのに役立つ。しかし、追加情報を持つ新しい画像を実際に導入するわけではなく、テストセットがトレーニングセットに現れるものと類似したサンプルを含む可能性があるため、データ漏洩につながる可能性がある。この課題に対処するために,生成逆ネットワークを用いて多様な画像を生成することを提案する。本稿では, 生成逆ネットワークとU-Netの両方を利用する, 生成逆U-Net(generative adversarial U-Net)と呼ばれる新しい生成手法を開発する。既存のアプローチとは異なり、新しく設計されたモデルはドメインフリーで、様々な医用画像に一般化できる。コンピュータ断層撮影(CT)スキャン,病理画像,X線など,8つの多様なデータセットに対して広範な実験を行った。可視化と定量的結果により,提案手法が多様で高品質な医用画像の生成において有効であり,良好な汎化性を持つことを示す。

The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing. Without a sufficient number of training samples, deep learning based models are very likely to suffer from the over-fitting problem. The common solution is image manipulation such as image rotation, cropping, or resizing. Those methods can help relieve the over-fitting problem as more training samples are introduced. However, they do not really introduce new images with additional information and may lead to data leakage, as the test set may contain samples similar to those in the training set. To address this challenge, we propose to generate diverse images with a generative adversarial network. In this paper, we develop a novel generative method named generative adversarial U-Net, which utilizes both a generative adversarial network and U-Net. Different from existing approaches, our newly designed model is domain-free and generalizable to various medical images. Extensive experiments are conducted over eight diverse datasets including computed tomography (CT) scan, pathology, X-ray, etc. The visualization and quantitative results demonstrate the efficacy and good generalization of the proposed method in generating a wide array of high-quality medical images.

翻訳日:2021-04-04 01:36:13 公開日:2021-01-12

# トレース比最適化と多視点学習への応用

Trace Ratio Optimization with an Application to Multi-view Learning ( http://arxiv.org/abs/2101.04292v1 )

ライセンス: Link先を確認

Li Wang and Lei-Hong Zhang and Ren-Cang Li

(参考訳) スティーフェル多様体上のトレース比最適化問題について,理論と数値計算の両方の観点から検討した。この問題の少なくとも3つの特殊ケースが,フィッシャー線形判別分析,正準相関分析,非平衡プロクラステス問題からそれぞれ生じている。固有ベクトル依存性を持つ非線形固有値問題の形で必要条件が確立され、自己無撞着場(SCF)反復に基づく数値法が設計され、常に収束することが証明された。多視点部分空間学習への応用として,新しいフレームワークとそれを具体化したモデルを提案し,実世界のデータセット上で実証した。数値実験の結果,提案した数値法の効率性と,新しい多視点部分空間学習モデルの有効性が示された。

A trace ratio optimization problem over the Stiefel manifold is investigated from the perspectives of both theory and numerical computations. At least three special cases of the problem have arisen from Fisher linear discriminant analysis, canonical correlation analysis, and the unbalanced Procrustes problem, respectively. Necessary conditions in the form of a nonlinear eigenvalue problem with eigenvector dependency are established, and a numerical method based on the self-consistent field (SCF) iteration is designed and proved to be always convergent. As an application to multi-view subspace learning, a new framework and its instantiated concrete models are proposed and demonstrated on real world data sets. Numerical results demonstrate the efficiency of the proposed numerical methods and the effectiveness of the new multi-view subspace learning models.

翻訳日:2021-04-04 01:35:57 公開日:2021-01-12
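To make the phrase "nonlinear eigenvalue problem with eigenvector dependency" concrete, the sketch below runs an SCF-style iteration for the classic trace ratio problem, maximizing tr(VᵀAV) / tr(VᵀBV) over matrices V with orthonormal columns. This is an assumption-laden illustration: the abstract does not state the exact objective studied in the paper, so only the well-known special case is shown, with A symmetric and B symmetric positive definite. The function name `trace_ratio` is hypothetical.

```python
import numpy as np

def trace_ratio(A, B, k, iters=50, tol=1e-10):
    """Maximize tr(V.T A V) / tr(V.T B V) over n-by-k V with V.T V = I.

    Assumes A is symmetric and B is symmetric positive definite.
    """
    n = A.shape[0]
    V = np.eye(n)[:, :k]  # arbitrary orthonormal starting point
    rho = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
    for _ in range(iters):
        # The matrix A - rho*B depends on the current V through rho,
        # which is the eigenvector dependency of the eigenproblem.
        _, Q = np.linalg.eigh(A - rho * B)
        V = Q[:, -k:]  # eigenvectors of the k largest eigenvalues
        new_rho = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return V, rho
```

Each step cannot decrease the ratio, because the new V maximizes tr(Vᵀ(A − ρB)V), which equals zero at the previous V.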

# 「NeurIPS 2020 Workshop on Machine Learning for the Developing World: Improving Resilience」の議事録

Proceedings of the NeurIPS 2020 Workshop on Machine Learning for the Developing World: Improving Resilience ( http://arxiv.org/abs/2101.04347v1 )

ライセンス: Link先を確認

Tejumade Afonja, Konstantin Klemmer, Aya Salama, Paula Rodriguez Diaz, Niveditha Kalavakonda, Oluwafemi Azeez

(参考訳) 本稿は、2020年12月12日土曜日に第34回Conference on Neural Information Processing Systems (NeurIPS)の一部として開催された、第4回Machine Learning for the Developing World (ML4D)ワークショップの議事録である。

These are the proceedings of the 4th workshop on Machine Learning for the Developing World (ML4D), held as part of the Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS) on Saturday, December 12th 2020.

翻訳日:2021-04-04 01:35:45 公開日:2021-01-12

# 活性化密度に基づくエネルギー効率の良いニューラルネットワークの混合精度量子化

Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks ( http://arxiv.org/abs/2101.04354v1 )

ライセンス: Link先を確認

Karina Vasquez, Yeshwanth Venkatesha, Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda

(参考訳) ニューラルネットワークが組み込みデバイスで広く採用されるにつれて、リソース制約のある環境への展開を容易にするモデル圧縮技術が必要とされている。量子化は、最先端のモデル圧縮をもたらす定番の手法の1つである。ほとんどのアプローチは、完全に訓練されたモデルを用い、様々なヒューリスティックを適用してネットワークの各層に最適なビット精度を決定し、精度の低下を取り戻すためにネットワークを再訓練する。我々は、活性化密度 (AD)、すなわち層内の非ゼロ活性化の割合に基づいて, 訓練中に行う量子化法を提案する。本手法は,訓練中に各層のビット幅を計算し,競争力のある精度を持つ混合精度モデルを得る。訓練中に低精度のモデルを学習するため、このアプローチは低い訓練複雑性で最終的な量子化モデルをもたらし、再訓練の必要性も排除する。我々は、VGG19/ResNet18アーキテクチャ上で、CIFAR-10、CIFAR-100、TinyImagenetなどのベンチマークデータセットで実験を行い、その精度とエネルギー推定値を報告する。実験では、訓練の複雑さを50%削減しつつ、推定積和演算(MAC)の削減において約4.5倍の利点を達成した。提案手法の省エネルギー効果をさらに評価するため,混合精度に対応したスケーラブルなPIM(Process In Memory)ハードウェアアクセラレータプラットフォームを開発した。このハードウェアプラットフォームは、マルチビット精度のニューラルネットワークモデルを扱うためのシフト加算機能を備えている。提案手法で得られた量子化モデルをPIMプラットフォーム上で評価すると,16ビットモデルと比較して約5倍のエネルギー削減が得られる。さらに,ADベースの量子化とADベースのプルーニング(いずれも訓練中に実施)を統合すると,ベースラインである16ビット精度の未プルーニングモデルと比較して,PIMプラットフォーム上でVGG19とResNet18アーキテクチャに対しそれぞれ最大約198倍,約44倍のエネルギー削減が得られることが分かった。

As neural networks gain widespread adoption in embedded devices, there is a need for model compression techniques to facilitate deployment in resource-constrained environments. Quantization is one of the go-to methods yielding state-of-the-art model compression. Most approaches take a fully trained model, apply different heuristics to determine the optimal bit-precision for different layers of the network, and retrain the network to regain any drop in accuracy. Based on Activation Density (AD), the proportion of non-zero activations in a layer, we propose an in-training quantization method. Our method calculates the bit-width for each layer during training, yielding a mixed-precision model with competitive accuracy. Since we train lower precision models during training, our approach yields the final quantized model at lower training complexity and also eliminates the need for re-training. We run experiments on benchmark datasets like CIFAR-10, CIFAR-100, and TinyImagenet on VGG19/ResNet18 architectures and report the accuracy and energy estimates for the same. We achieve ~4.5x benefit in terms of estimated multiply-and-accumulate (MAC) reduction while reducing the training complexity by 50% in our experiments. To further evaluate the energy benefits of our proposed method, we develop a mixed-precision scalable Process In Memory (PIM) hardware accelerator platform. The hardware platform incorporates shift-add functionality for handling multi-bit precision neural network models. Evaluating the quantized models obtained with our proposed method on the PIM platform yields ~5x energy reduction compared to 16-bit models. Additionally, we find that integrating AD based quantization with AD based pruning (both conducted during training) yields up to ~198x and ~44x energy reductions for VGG19 and ResNet18 architectures, respectively, on the PIM platform compared to baseline 16-bit precision, unpruned models.

翻訳日:2021-04-04 01:35:38 公開日:2021-01-12
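The definition of activation density given in the abstract (the proportion of non-zero activations in a layer) is easy to compute. The sketch below does so and then maps the density to a bit-width; the mapping rule (scale the current bit-width by the density and round) and the `min_bits` floor are assumptions for illustration only, since the abstract does not state the rule the authors use. Both function names are hypothetical.

```python
import numpy as np

def activation_density(activations):
    """Fraction of non-zero entries in a layer's activation tensor."""
    a = np.asarray(activations)
    return np.count_nonzero(a) / a.size

def bit_width_from_density(density, current_bits, min_bits=2):
    """Hypothetical rule: shrink the bit-width in proportion to the density."""
    return max(min_bits, int(round(current_bits * density)))
```

For example, a layer whose post-ReLU output is half zeros has a density of 0.5, and under this assumed rule a 16-bit layer would be reduced to 8 bits.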

# 機械学習による新しい半導体の解釈可能な発見

Interpretable discovery of new semiconductors with machine learning ( http://arxiv.org/abs/2101.04383v1 )

ライセンス: Link先を確認

Hitarth Choubisa (1), Petar Todorović (1), Joao M. Pina (1), Darshan H. Parmar (1), Ziliang Li (1), Oleksandr Voznyy (4), Isaac Tamblyn (2,3), Edward Sargent (1) ((1) Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada, (2) National Research Council of Canada, Ottawa, ON, Canada, (3) Vector Institute for Artificial Intelligence, Toronto, ON, Canada, (4) Department of Physical and Environmental Sciences, University of Toronto, Scarborough, ON, Canada)

(参考訳) 材料の機械学習モデル$^{1-5}$は、第一原理(ab initio)手法と比較して探索を加速する。ディープラーニングモデルは現在、密度汎関数理論(DFT)で計算された結果を、DFT$^{6}$のコストの10万分の1で再現する。実験的な材料合成に指針を与えるには, これらを正確かつ効果的な探索アルゴリズムと, 実験観測と整合したトレーニングデータと組み合わせる必要がある。本稿では,実験的バンドギャップに対してベンチマークされた高スループットのハイブリッド汎関数DFTデータで訓練した機械学習サロゲートモデルを用いる,進化的アルゴリズムによる探索であるDeep Adaptive Regressive Weighted Intelligent Network (DARWIN) を報告する。この戦略は、目標特性を持つ候補を求めて、約10$^8$個の三元系および10$^{11}$個の四元系$^{7}$からなる材料空間を効率的に探索することを可能にする。また、ハロゲン化物とBサイトカチオンの電気陰性度の差が三元系の構造安定性の強い予測因子であるという発見など、解釈可能な設計規則を提供する。例えば、紫外発光を求める場合、DARWINは電気陰性度の差に基づいて、K$_2$CuX$_3$ (X = Cl, Br) を有望な材料群として予測する。我々はこれらの材料を合成し、安定な直接バンドギャップUVエミッタであることを確認した。このアプローチは、人間が利用するための知識蒸留も可能にする。

Machine learning models of materials$^{1-5}$ accelerate discovery compared to ab initio methods: deep learning models now reproduce density functional theory (DFT)-calculated results at one hundred thousandths of the cost of DFT$^{6}$. To provide guidance in experimental materials synthesis, these need to be coupled with an accurate yet effective search algorithm and training data consistent with experimental observations. Here we report an evolutionary algorithm powered search which uses machine-learned surrogate models trained on high-throughput hybrid functional DFT data benchmarked against experimental bandgaps: Deep Adaptive Regressive Weighted Intelligent Network (DARWIN). The strategy enables efficient search over the materials space of ~10$^8$ ternaries and 10$^{11}$ quaternaries$^{7}$ for candidates with target properties. It provides interpretable design rules, such as our finding that the difference in electronegativity between the halide and the B-site cation is a strong predictor of ternary structural stability. As an example, when we seek UV emission, DARWIN predicts K$_2$CuX$_3$ (X = Cl, Br) as a promising materials family, based on its electronegativity difference. We synthesized these materials and found them to be stable, direct bandgap UV emitters. The approach also allows knowledge distillation for use by humans.

翻訳日:2021-04-04 01:35:04 公開日:2021-01-12

# バイナリニューラルネットワークによる電力制約の厳しいIoTデバイス上での音響イベント検出

Sound Event Detection with Binary Neural Networks on Tightly Power-Constrained IoT Devices ( http://arxiv.org/abs/2101.04446v1 )

ライセンス: Link先を確認

Gianmarco Cerutti, Renzo Andri, Lukas Cavigelli, Michele Magno, Elisabetta Farella, Luca Benini

(参考訳) サウンドイベント検出(SED)は、消費者向けおよびスマートシティのアプリケーションにおいてホットなトピックである。ディープニューラルネットワークに基づく既存のアプローチは非常に効果的だが、超低消費電力の常時オンデバイスをターゲットにする場合、メモリ、電力、スループットの面で要求が非常に高い。レイテンシ、可用性、コスト、プライバシの要件により、最近のIoTシステムはセンサに近いノード上でデータを処理することを求められているが、そこではエネルギー供給が非常に限られ、メモリサイズと処理能力の制約が厳しいため、最先端のDNNを実行できない。本稿では,小フットプリントのバイナリニューラルネットワーク(BNN)への極端な量子化と、高エネルギー効率のRISC-Vベース(8+1)コアGAP8マイクロコントローラとの組み合わせについて検討する。フットプリント(815kB)が我々のプラットフォームで利用可能なメモリ512kBを超える既存のSED用CNNを出発点として、バイナリのフィルタと活性化を用いてネットワークを再訓練し、このメモリ制約を満たす。(完全な)バイナリニューラルネットワークは、同等の完全精度のベースラインと比べて、難易度の高いImageNet物体認識タスクにおいて12-18%の精度低下を伴うのが通例である。このBNNは77.9%の精度に達し、完全精度版よりわずか7%低いだけで、重みは58kB(7.2分の1)、メモリは合計262kB(2.4分の1)である。 BNNの実装では,ピークスループット4.6 GMAC/s,Melビンによる前処理を含むネットワーク全体で1.5 GMAC/sを達成し,これはそれぞれ67.1 GMAC/s/W,31.3 GMAC/s/Wの効率に相当する。 ARM Cortex-M4実装の性能と比較して、我々のシステムは実行時間が10.3倍速く、エネルギー効率が51.1倍高い。

Sound event detection (SED) is a hot topic in consumer and smart city applications. Existing approaches based on Deep Neural Networks are very effective, but highly demanding in terms of memory, power, and throughput when targeting ultra-low power always-on devices. Latency, availability, cost, and privacy requirements are pushing recent IoT systems to process the data on the node, close to the sensor, with a very limited energy supply and tight constraints on memory size and processing capabilities that preclude running state-of-the-art DNNs. In this paper, we explore the combination of extreme quantization to a small-footprint binary neural network (BNN) with the highly energy-efficient, RISC-V-based (8+1)-core GAP8 microcontroller. Starting from an existing CNN for SED whose footprint (815 kB) exceeds the 512 kB of memory available on our platform, we retrain the network using binary filters and activations to match these memory constraints. (Fully) binary neural networks come with a natural drop in accuracy of 12-18% on the challenging ImageNet object recognition task compared to their equivalent full-precision baselines. This BNN reaches a 77.9% accuracy, just 7% lower than the full-precision version, with 58 kB (7.2 times less) for the weights and 262 kB (2.4 times less) memory in total. With our BNN implementation, we reach a peak throughput of 4.6 GMAC/s and 1.5 GMAC/s over the full network, including preprocessing with Mel bins, which corresponds to an efficiency of 67.1 GMAC/s/W and 31.3 GMAC/s/W, respectively. Compared to the performance of an ARM Cortex-M4 implementation, our system has a 10.3 times faster execution time and a 51.1 times higher energy-efficiency.

翻訳日:2021-04-04 01:34:44 公開日:2021-01-12
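Binary filters and activations save memory because each value needs a single bit, and they also allow a dot product to be computed with XNOR and a bit count instead of multiplications. The sketch below shows that standard BNN identity in plain Python; it is a generic illustration and makes no claim about how the authors' GAP8 kernels are implemented. The helper names `pack` and `binary_dot` are hypothetical.

```python
def pack(signs):
    """Pack a sequence of +1/-1 values into an int (bit i set means +1)."""
    bits = 0
    for i, s in enumerate(signs):
        if s > 0:
            bits |= 1 << i
    return bits

def binary_dot(x_bits, w_bits, n):
    """Dot product of two length-n vectors over {-1, +1} via XNOR + popcount."""
    mask = (1 << n) - 1
    agree = ~(x_bits ^ w_bits) & mask      # 1 where the signs match
    matches = bin(agree).count("1")        # popcount
    return 2 * matches - n                 # matches minus mismatches
```

The result equals the ordinary sum of element-wise products, since each matching pair contributes +1 and each mismatching pair contributes -1.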

# 深層ニューラルネットワークを用いた呼吸イベントの自動検出

Automated Respiratory Event Detection Using Deep Neural Networks ( http://arxiv.org/abs/2101.04635v1 )

ライセンス: Link先を確認

Thijs E Nassi, Wolfgang Ganglberger, Haoqi Sun, Abigail A Bucklin, Siddharth Biswal, Michel J A M van Putten, Robert J Thomas, M Brandon Westover

(参考訳) 睡眠中の呼吸を評価するゴールドスタンダードはポリソムノグラフィ(polysomnography)であるが、この手法は負担が大きく、(解析時間と測定コストの両面で)高価であり、繰り返し実施することが難しい。呼吸解析の自動化は、検査の効率を改善し、世界中で利用しやすい形での実施を可能にする。マサチューセッツ総合病院(MGH)の9,656件のポリソムノグラフィ記録を用いて, 閉塞性無呼吸, 中枢性無呼吸, 低呼吸, 呼吸努力関連覚醒を検出するため, 単一の呼吸努力ベルトに基づくニューラルネットワーク(WaveNet)を訓練した。性能評価には、無呼吸低呼吸指数(apnea-hypopnea index)の解析を用いた、イベント単位および記録単位の指標が含まれる。このモデルは、8,455件のポリソムノグラフィ記録を含む公開データセットであるSleep-Heart-Health-Study-1でさらに評価された。 MGHデータセットにおける二値の無呼吸イベント検出では、精度95%、無呼吸低呼吸指数の$r^2$は0.89、受信者動作特性曲線と適合率-再現率曲線の曲線下面積はそれぞれ0.93と0.74であった。多クラスタスクでは性能にばらつきがあり、ラベル付けされた中枢性無呼吸全体の81%が正しく分類されたのに対し、この指標は閉塞性無呼吸で46%、呼吸努力関連覚醒で29%、低呼吸で16%であった。誤った予測の大部分は、別の種類の呼吸イベントへの誤分類であった。我々の完全自動化手法は、呼吸イベントを検出し, 臨床利用に十分な精度で無呼吸低呼吸指数を評価できる。イベントタイプの区別はより困難であり、これは人間の呼吸出力の複雑さと、手動アノテーションで使用される臨床上の閾値や基準のある程度の恣意性を部分的に反映している可能性がある。

The gold standard to assess respiration during sleep is polysomnography; a technique that is burdensome, expensive (both in analysis time and measurement costs), and difficult to repeat. Automation of respiratory analysis can improve test efficiency and enable accessible implementation opportunities worldwide. Using 9,656 polysomnography recordings from the Massachusetts General Hospital (MGH), we trained a neural network (WaveNet) based on a single respiratory effort belt to detect obstructive apnea, central apnea, hypopnea and respiratory-effort related arousals. Performance evaluation included event-based and recording-based metrics - using an apnea-hypopnea index analysis. The model was further evaluated on a public dataset, the Sleep-Heart-Health-Study-1, containing 8,455 polysomnographic recordings. For binary apnea event detection in the MGH dataset, the neural network obtained an accuracy of 95%, an apnea-hypopnea index $r^2$ of 0.89 and area under the curve for the receiver operating characteristics curve and precision-recall curve of 0.93 and 0.74, respectively. For the multiclass task, we obtained varying performances: 81% of all labeled central apneas were correctly classified, whereas this metric was 46% for obstructive apneas, 29% for respiratory effort related arousals and 16% for hypopneas. The majority of false predictions were misclassifications as another type of respiratory event. Our fully automated method can detect respiratory events and assess the apnea-hypopnea index with sufficient accuracy for clinical utilization. Differentiation of event types is more difficult and may reflect in part the complexity of human respiratory output and some degree of arbitrariness in the clinical thresholds and criteria used during manual annotation.

翻訳日:2021-04-04 01:34:10 公開日:2021-01-12
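The apnea-hypopnea index (AHI) used as a recording-level metric in the abstract above is the number of apneas plus hypopneas per hour of sleep. A minimal sketch of that arithmetic follows; the severity cut-offs (5, 15, and 30 events per hour) are the commonly used adult thresholds and are an addition not taken from the abstract. Function names are hypothetical.

```python
def apnea_hypopnea_index(n_apneas, n_hypopneas, sleep_hours):
    """Events per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return (n_apneas + n_hypopneas) / sleep_hours

def severity(ahi):
    """Common adult categories (assumed here, not stated in the abstract)."""
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"
```

For instance, 30 apneas and 60 hypopneas detected over 7.5 hours of sleep give an AHI of 12 events per hour.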

# 二重敵対的活性化異常検出: 敵対的オートエンコーダは異常生成器である

Double-Adversarial Activation Anomaly Detection: Adversarial Autoencoders are Anomaly Generators ( http://arxiv.org/abs/2101.04645v1 )

ライセンス: Link先を確認

J.-P. Schulze, P. Sperl, K. Böttinger

(参考訳) 異常検出は、固有のクラス不均衡のため、機械学習アルゴリズムにとって難しいタスクである。観測されたデータを手動で分析するのはコストが高く、時間を要するため、通常、使用可能な場合の既知の異常はごくわずかである。生成モデルとニューラルネットワークの隠れ活性化の解析に着想を得て,DA3Dと呼ばれる新しい教師なし異常検出手法を導入する。ここでは,通常のデータのみに基づく異常な反例を生成するために,対向オートエンコーダを用いる。これらの人工的な異常は、実際の、しかし目に見えない異常を検出することができる。新たな生成手法により,異常検出の教師なしタスクを教師付きタスクに変換する。 DA3Dは、ドメイン知識を必要としない純粋にデータ駆動の方法で最先端の異常検出手法の性能を上回る。

Anomaly detection is a challenging task for machine learning algorithms due to the inherent class imbalance. It is costly and time-consuming to manually analyse the observed data, so usually only a few known anomalies, if any, are available. Inspired by generative models and the analysis of the hidden activations of neural networks, we introduce a novel unsupervised anomaly detection method called DA3D. Here, we use adversarial autoencoders to generate anomalous counterexamples based on the normal data only. These artificial anomalies used during training allow the detection of real, yet unseen anomalies. With our novel generative approach, we transform the unsupervised task of anomaly detection to a supervised one, which is more tractable by machine learning and especially deep learning methods. DA3D surpasses the performance of state-of-the-art anomaly detection methods in a purely data-driven way, where no domain knowledge is required.

翻訳日:2021-04-04 01:33:37 公開日:2021-01-12

# 人・場所・つながり--社会的場所の景観と社会ネットワーク構造

People, Places, and Ties: Landscape of social places and their social network structures ( http://arxiv.org/abs/2101.04737v1 )

ライセンス: Link先を確認

Jaehyuk Park, Bogdan State, Monica Bhole, Michael C. Bailey, and Yong-Yeol Ahn

(参考訳) 人々が気軽に訪れ、友人や隣人と交流する社会的な場所である「サードプレイス」は、社会化の場として不可欠な役割を果たすことから、ネットワーク科学、社会学、地理学、都市計画、地域研究など幅広い分野で研究されてきた。しかし、サードプレイスに関する大規模な調査データが存在しないため、研究者は体系的な調査を行えなかった。ここでは,Facebookページを用いて,サードプレイスとそのソーシャルネットワークを全国規模で体系的に調査する。解析の結果,サードプレイスの種類の分布は地理的に大きく異なり,その分布は基本的な人口統計や郡の特性と高い相関関係にあることが明らかとなった。「礼拝の場所(Places of Worship)」のような特定の種類のページは高度なクラスタリングを示し、コミュニティの選好、あるいは集中に対する潜在的な補完性を示唆している。また, 異なるタイプの社会的場所のソーシャルネットワークには重要な違いがあることも分かった。「レストラン」や「屋内レクリエーション」のページのソーシャルネットワークは、既存の友人関係からなる緊密なコミュニティである可能性が高いのに対し、「礼拝の場所」や「コミュニティ・アメニティ」のページカテゴリは、新たな友人関係の結びつきを橋渡しする可能性が高い。本研究は,社会空間と社会関係の体系的な比較研究に向けた今後の研究にとって,重要なマイルストーンとなると考える。

Due to their essential role as places for socialization, "third places" - social places where people casually visit and communicate with friends and neighbors - have been studied by a wide range of fields including network science, sociology, geography, urban planning, and regional studies. However, the lack of a large-scale census of third places has kept researchers from systematic investigations. Here we provide a systematic nationwide investigation of third places and their social networks, by using Facebook pages. Our analysis reveals a large degree of geographic heterogeneity in the distribution of the types of third places, which is highly correlated with baseline demographics and county characteristics. Certain types of pages like "Places of Worship" demonstrate a large degree of clustering, suggesting community preference or potential complementarities to concentration. We also found that the social networks of different types of social places differ in important ways: the social networks of 'Restaurants' and 'Indoor Recreation' pages are more likely to be tight-knit communities of pre-existing friendships, whereas 'Places of Worship' and 'Community Amenities' page categories are more likely to bridge new friendship ties. We believe that this study can serve as an important milestone for future systematic comparative studies of social spaces and their social relationships.

翻訳日:2021-04-04 01:33:21 公開日:2021-01-12

# Airfoil GAN: 空力特性を考慮した形状最適化のための翼型の符号化と合成

Airfoil GAN: Encoding and Synthesizing Airfoils for Aerodynamic-aware Shape Optimization ( http://arxiv.org/abs/2101.04757v1 )

ライセンス: Link先を確認

Yuyang Wang, Kenji Shimada, Amir Barati Farimani

(参考訳) エアフォイルのような空力形状の現在の設計は、可能な設計空間を探索するための計算集約的なシミュレーションを伴う。通常、このような設計は設計パラメータの事前定義に依存し、新しい形状の合成に制限を課す。本研究では,既存の翼から表現を自動的に学習し,学習した表現を用いて新しい翼を生成するデータ駆動型形状符号化・生成法を提案する。これらの表現は、空気力学的性能に基づいて合成翼形状の最適化に使用される。我々のモデルは、変分オートエンコーダとジェネレーティブ・アドバーサリアル・ネットワークを組み合わせたニューラルネットワークであるVAEGANに基づいて構築されており、勾配に基づく手法で訓練されている。本モデルでは,(1)既存のエアフォイルを潜在ベクターにエンコードし,それからエアフォイルを再構築し,(2)潜在ベクターをランダムにサンプリングしてエアフォイル座標領域にマッピングし,(3)学習した特徴を遺伝的アルゴリズムにより最適化し,所望の空力特性を有するエアフォイルを合成する。実験の結果,事前定義された設計パラメータを使わずに,形状情報を網羅的かつ包括的に符号化できることがわかった。特徴ベクトルの補間/補間またはガウス雑音からのサンプリングにより、モデルは新しい翼形状を自動的に合成することができる。遺伝的アルゴリズムによって学習された特徴の形状を最適化することで、合成された翼は特定の空力特性を持つように進化し、空力製品の設計を効果的かつ効率的に導くことができる。

The current design of aerodynamic shapes, like airfoils, involves computationally intensive simulations to explore the possible design space. Usually, such design relies on the prior definition of design parameters and places restrictions on synthesizing novel shapes. In this work, we propose a data-driven shape encoding and generating method, which automatically learns representations from existing airfoils and uses the learned representations to generate new airfoils. The representations are then used in the optimization of synthesized airfoil shapes based on their aerodynamic performance. Our model is built upon VAEGAN, a neural network that combines a Variational Autoencoder with a Generative Adversarial Network and is trained with gradient-based techniques. Our model can (1) encode an existing airfoil into a latent vector and reconstruct the airfoil from it, (2) generate novel airfoils by randomly sampling latent vectors and mapping them to the airfoil coordinate domain, and (3) synthesize airfoils with desired aerodynamic properties by optimizing learned features via a genetic algorithm. Our experiments show that the learned features encode shape information thoroughly and comprehensively without predefined design parameters. By interpolating/extrapolating feature vectors or sampling from Gaussian noise, the model can automatically synthesize novel airfoil shapes, some of which possess competitive or even better aerodynamic properties compared with the training airfoils. By optimizing shape on learned features via a genetic algorithm, synthesized airfoils can evolve to have specific aerodynamic properties, which can guide the design of aerodynamic products effectively and efficiently.

翻訳日:2021-04-04 01:32:57 公開日:2021-01-12

# SARA(Self-Adaptive Reconfigurable Arrays):スケーリングGEMM高速化を支援するML

Self-Adaptive Reconfigurable Arrays (SARA): Using ML to Assist Scaling GEMM Acceleration ( http://arxiv.org/abs/2101.04799v1 )

ライセンス: Link先を確認

Ananda Samajdar, Michael Pellauer, Tushar Krishna

(参考訳) 層の形状とサイズの観点からディープニューラルネットワーク(DNN)モデルの多様性が増すにつれ、研究コミュニティは柔軟/再構成可能なアクセラレータ基盤を研究してきた。この研究の流れは2つの課題を提起した。 1つ目は、性能上の利点と再構成可能性による面積オーバーヘッドとをトレードオフできる、アクセラレータアレイ内の適切な柔軟性の量を決定することである。 2つ目は、現在のDNNモデルやレイヤに対するアレイの適切な構成を決定し、実行時にアクセラレータを再構成できることである。本稿では、Self Adaptive Reconfigurable Array (SARA)と呼ぶ新しいクラスのアクセラレータを紹介する。 SARAアーキテクチャは、再構成可能なアレイと、実行時にアレイの最適化された構成を決定できるハードウェアユニットの両方で構成される。我々は、SAGARと呼ぶアクセラレータでSARAの一例を示す。SAGARは、様々なサイズの小さなアレイの分散コレクションとして、または柔軟なアスペクト比を持つ単一アレイとして動作するように構成できる、新しい再構成可能なシストリックアレイを導入する。また、現在の層パラメータに対してアレイ構成とデータフローを推薦する、ADAPTNETと呼ぶ新しい推薦ニューラルネットワークを開発した。 ADAPTNETは、統合されたカスタムハードウェアADAPTNETX上で動作し、ADAPTNETXは実行時にADAPTNETを実行してアレイを再構成するため、アクセラレータ全体が自己完結する。 SAGARは、分散システムとして動作する1024個の4x4アレイの集合と同じマッピングの柔軟性を提供しつつ、3.5倍の電力効率と3.2倍の計算密度を実現する。さらに、ADAPTNETが推薦したパラメータで達成される実行時間は、達成可能な最良の実行時間の99.93%である。

With increasing diversity in Deep Neural Network (DNN) models in terms of layer shapes and sizes, the research community has been investigating flexible/reconfigurable accelerator substrates. This line of research has opened up two challenges. The first is to determine the appropriate amount of flexibility within an accelerator array that can trade off the performance benefits against the area overheads of the reconfigurability. The second is being able to determine the right configuration of the array for the current DNN model and/or layer and reconfigure the accelerator at runtime. This work introduces a new class of accelerators that we call Self Adaptive Reconfigurable Array (SARA). SARA architectures comprise both a reconfigurable array and a hardware unit capable of determining an optimized configuration for the array at runtime. We demonstrate an instance of SARA with an accelerator we call SAGAR, which introduces a novel reconfigurable systolic array that can be configured to work as a distributed collection of smaller arrays of various sizes or as a single array with flexible aspect ratios. We also develop a novel recommendation neural network called ADAPTNET which recommends an array configuration and dataflow for the current layer parameters. ADAPTNET runs on an integrated custom hardware ADAPTNETX that runs ADAPTNET at runtime and reconfigures the array, making the entire accelerator self-sufficient. SAGAR is capable of providing the same mapping flexibility as a collection of 1024 4x4 arrays working as a distributed system while achieving 3.5x more power efficiency and 3.2x higher compute density. Furthermore, the runtime achieved on the recommended parameters from ADAPTNET is 99.93% of the best achievable runtime.

翻訳日:2021-04-04 01:32:29 公開日:2021-01-12

# 4脚ラインフォロワロボットへの組込み型コンピュータビジョンシステムの適用

Embedded Computer Vision System Applied to a Four-Legged Line Follower Robot ( http://arxiv.org/abs/2101.04804v1 )

ライセンス: Link先を確認

Beatriz Arruda Asfora

(参考訳) ロボティクスは知覚と行動の結びつきとして定義することができる。このプロジェクトは、ロボットの視覚と動作をつなぐ自動コンピュータビジョン組み込みシステムを使用して、ロボットを駆動することを目的としている。ロボットに色認識システムを実装するために、Processing言語、Androidシステム、Arduinoプラットフォーム、Pixyカメラなどのオープンソースツールが選択される。制約は明確で、単純さ、再現性、経済性である。ロボティクス、コンピュータビジョン、画像処理を統合するために、このロボットは典型的な移動ロボットの課題であるラインフォローに適用される。経路と背景を区別する問題は、一般的な大津法、実験による色の組み合わせに基づくしきい値処理、色相と彩度による色追跡など、様々なアプローチで分析される。次に移動する場所の決定は、経路の線の中心に基づいており、完全に自動化されている。4本足のロボットをプラットフォームとして、カメラを唯一のセンサーとして使用することで、ロボットはラインを追跡することに成功した。画像のキャプチャからロボットの移動まで、ロボティクスがいかに統合的な分野であるかは明らかである。本論文の課題だけでも、機械工学、エレクトロニクス、制御システム、プログラミングの知識を必要とする。この作業に関するすべてがドキュメント化され、オープンソースのオンラインページで利用可能になっているため、ロボット工学の学習と実験に役立てることができる。

Robotics can be defined as the connection of perception to action. Taking this further, this project aims to drive a robot using an automated computer vision embedded system, connecting the robot's vision to its behavior. In order to implement a color recognition system on the robot, open source tools are chosen, such as the Processing language, the Android system, the Arduino platform, and the Pixy camera. The constraints are clear: simplicity, replicability and financial viability. In order to integrate Robotics, Computer Vision and Image Processing, the robot is applied to a typical mobile-robot task: line following. The problem of distinguishing the path from the background is analyzed through different approaches: the popular Otsu's Method, thresholding based on color combinations through experimentation, and color tracking via hue and saturation. The decision of where to move next is based on the line center of the path and is fully automated. Using a four-legged robot as the platform and a camera as its only sensor, the robot is capable of successfully following a line. From capturing the image to moving the robot, it's evident how integrative Robotics can be. The problem addressed in this paper alone involves knowledge of Mechanical Engineering, Electronics, Control Systems and Programming. Everything related to this work was documented and made available on an open source online page, so it can be useful in learning and experimenting with robotics.

翻訳日:2021-04-04 01:31:44 公開日:2021-01-12
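
The path/background separation and line-center decision described in the abstract above can be illustrated with a minimal sketch. This is not the author's implementation: it is a pure-Python version of Otsu's method on a list of 8-bit grayscale values, and the helper names `otsu_threshold` and `line_center`, as well as the assumption of a dark line on a light background, are illustrative only.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0          # number of pixels in the "dark" class
    sum0 = 0        # sum of gray levels in the "dark" class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break               # light class empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def line_center(row, threshold):
    """Mean column index of the dark (line) pixels in one image row, or None."""
    cols = [i for i, v in enumerate(row) if v <= threshold]
    if not cols:
        return None
    return sum(cols) / len(cols)


if __name__ == "__main__":
    # Hypothetical image row: light floor (200) with a dark line (10)
    row = [200, 200, 10, 10, 10, 200]
    t = otsu_threshold(row)
    center = line_center(row, t)
    # Positive offset: line is to the right of the image center
    offset = center - (len(row) - 1) / 2
    print(t, center, offset)
```

A controller could turn the robot in proportion to `offset`; how the four-legged gait is actually commanded is outside what the abstract states.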

# ニューラルネットワークを用いた仮想マイクロホン推定器

Neural Network-based Virtual Microphone Estimator ( http://arxiv.org/abs/2101.04315v1 )

ライセンス: Link先を確認

Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki

(参考訳) 少数のマイクロホンのためのマイクロホンアレイ技術の開発は、多くのデバイスに制約があるため重要である。この状況に対処する一つの方向は、例えばいくつかの物理モデル仮定に基づいて、マイク信号の数を事実上増やすことである。しかし、そのような仮定は必ずしも現実的な条件で満たされない。本稿では,ニューラルネットワークを用いた仮想マイクロホン推定器(NN-VME)を提案する。 NN-VMEは、最近の時間領域ニューラルネットワークの正確な推定能力を利用して、仮想マイクロホン信号を時間領域内で直接推定する。訓練時の仮想マイクの位置での実際の観察を利用した教師あり学習フレームワークを採用する。したがって、nn-vmeはマルチチャンネルの観測のみを使用して訓練することができ、実記録を直接行うことができ、非現実的な物理モデルに基づく仮定の必要性を回避できる。提案するnn-vmeは実記録においても高い仮想マイクロホン推定性能を達成し,nn-vmeを付加したビームフォーマによって音声強調と認識性能の両方が向上することを示す。

Developing microphone array technologies for a small number of microphones is important due to the constraints of many devices. One direction to address this situation consists of virtually augmenting the number of microphone signals, e.g., based on several physical model assumptions. However, such assumptions are not necessarily met in realistic conditions. In this paper, as an alternative approach, we propose a neural network-based virtual microphone estimator (NN-VME). The NN-VME estimates virtual microphone signals directly in the time domain, by utilizing the precise estimation capability of the recent time-domain neural networks. We adopt a fully supervised learning framework that uses actual observations at the locations of the virtual microphones at training time. Consequently, the NN-VME can be trained using only multi-channel observations and thus directly on real recordings, avoiding the need for unrealistic physical model-based assumptions. Experiments on the CHiME-4 corpus show that the proposed NN-VME achieves high virtual microphone estimation performance even for real recordings and that a beamformer augmented with the NN-VME improves both the speech enhancement and recognition performance.

翻訳日:2021-04-04 01:31:24 公開日:2021-01-12

# LSTMネットワークを用いた機械型通信におけるイベント駆動ソーストラヒック予測

Event-Driven Source Traffic Prediction in Machine-Type Communications Using LSTM Networks ( http://arxiv.org/abs/2101.04365v1 )

ライセンス: Link先を確認

Thulitha Senevirathna, Bathiya Thennakoon, Tharindu Sankalpa, Chatura Seneviratne, Samad Ali and Nandana Rajatheva

(参考訳) ソーストラフィック予測は、機械型通信(MTC)における予測リソース割り当てを可能にする主な課題の1つである。本稿では,イベント駆動ソーストラフィック予測のための長期短期記憶(lstm)ベースのディープラーニング手法を提案する。ソーストラフィック予測問題は、過去の送信データに基づいて、機械型装置(MTD)の送信状態を主焦点とするシーケンス生成タスクとして定式化することができる。これは、LSTMネットワークがデバイス間の因果関係を識別できるように、送信データを再構成することで実現される。このような因果関係の知識は、イベント駆動のトラフィック予測を可能にする。提案手法の性能は、異なるエントロピー範囲のmddによる事象に関するデータを用いて検討した。我々のモデルは、既存のベースラインソリューションよりも、リソースの節約と精度を約9%で上回ります。また,我々のモデルによるランダムアクセス (RA) 要求の低減について解析し,LSTMに基づくソーストラフィック予測手法の結果として必要な信号量が少ないことを示す。

Source traffic prediction is one of the main challenges of enabling predictive resource allocation in machine-type communications (MTC). In this paper, a Long Short-Term Memory (LSTM) based deep learning approach is proposed for event-driven source traffic prediction. The source traffic prediction problem can be formulated as a sequence generation task where the main focus is predicting the transmission states of machine-type devices (MTDs) based on their past transmission data. This is done by restructuring the transmission data in a way that the LSTM network can identify the causal relationship between the devices. Knowledge of such a causal relationship can enable event-driven traffic prediction. The performance of the proposed approach is studied using data regarding events from MTDs with different ranges of entropy. Our model outperforms existing baseline solutions in resource savings and accuracy by a margin of around 9%. Reduction in Random Access (RA) requests by our model is also analyzed to demonstrate the low amount of signaling required as a result of our proposed LSTM-based source traffic prediction approach.

翻訳日:2021-04-04 01:31:06 公開日:2021-01-12

# Type4Py: Pythonの深い類似性学習に基づく型推論

Type4Py: Deep Similarity Learning-Based Type Inference for Python ( http://arxiv.org/abs/2101.04470v1 )

ライセンス: Link先を確認

Amir M. Mir, Evaldas Latoskinas, Sebastian Proksch, Georgios Gousios

(参考訳) PythonやJavascriptのような動的言語は、開発者の柔軟性のために静的型付けを交換する。これは生産性が向上すると言われているが、静的型付けの欠如はランタイム例外、型不整合を引き起こし、IDEサポートの弱さの大きな要因である。これらの問題を緩和するため、PEP 484はPythonのオプション型アノテーションを導入した。既存のコードベースへの型の再適合はエラーを起こしやすいため、既存の部分的に注釈付けされたコードベースに基づいた自動型アノテーションを実現するための学習ベースのアプローチが提案されている。しかし、レア型とユーザ定義型の予測は依然として困難である。本稿では,pythonの類似度学習に基づく型推論モデルtype4pyを提案する。我々は、高次元空間における同種の型と異種の型を区別することを学ぶ階層型ニューラルネットワークモデルを設計し、その結果、型をクラスタ化する。最寄りの検索では、python関数の型シグネチャが考えられる。分析されたモジュールで見える型は、軽量な依存性分析を使って表面化されます。定量的および定性的な評価の結果,Type4Pyはタイプ予測タスクにおける最先端アプローチよりも有意に優れていた。トップ1の予測を考えると、Type4PyはTypilusやTypeWriterよりも19.33%、13.49%高い精度を得られる。

Dynamic languages, such as Python and JavaScript, trade static typing for developer flexibility. While this allegedly enables greater productivity, lack of static typing can cause runtime exceptions and type inconsistencies, and is a major factor for weak IDE support. To alleviate these issues, PEP 484 introduced optional type annotations for Python. As retrofitting types to existing codebases is error-prone and laborious, learning-based approaches have been proposed to enable automatic type annotations based on existing, partially annotated codebases. However, the prediction of rare and user-defined types is still challenging. In this paper, we present Type4Py, a deep similarity learning-based type inference model for Python. We design a hierarchical neural network model that learns to discriminate between types of the same kind and dissimilar types in a high-dimensional space, which results in clusters of types. Nearest neighbor search suggests likely type signatures of given Python functions. The types visible to analyzed modules are surfaced using lightweight dependency analysis. The results of quantitative and qualitative evaluation indicate that Type4Py significantly outperforms state-of-the-art approaches at the type prediction task. Considering the Top-1 prediction, Type4Py obtains 19.33% and 13.49% higher precision than Typilus and TypeWriter, respectively, while utilizing a much bigger vocabulary.

翻訳日:2021-04-04 01:30:35 公開日:2021-01-12
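
The nearest-neighbor step mentioned in the abstract above can be sketched as follows. This is only an illustration of the general idea (look up the closest labeled embeddings and rank their types), not Type4Py's actual model or API: the 2-D vectors in `EXAMPLE_POINTS`, the function name `predict_type`, and the inverse-distance scoring are all assumptions made for the example; in the paper the embeddings come from a trained neural network in a high-dimensional space.

```python
import math
from collections import defaultdict

# Hypothetical embeddings of already-annotated code, paired with their types.
EXAMPLE_POINTS = [
    ([0.0, 0.0], "int"),
    ([0.1, 0.0], "int"),
    ([1.0, 1.0], "str"),
    ([1.1, 1.0], "str"),
    ([5.0, 5.0], "List[int]"),
]


def predict_type(query, labeled_points, k=3):
    """Rank candidate types for `query` from its k nearest labeled neighbors.

    Each neighbor votes for its type with weight 1 / distance; the scores
    are normalized so that they sum to 1.
    """
    nearest = sorted(labeled_points,
                     key=lambda item: math.dist(query, item[0]))[:k]
    scores = defaultdict(float)
    for vec, label in nearest:
        scores[label] += 1.0 / (math.dist(query, vec) + 1e-10)
    total = sum(scores.values())
    ranked = [(label, s / total) for label, s in scores.items()]
    ranked.sort(key=lambda pair: -pair[1])
    return ranked


if __name__ == "__main__":
    for label, score in predict_type([0.05, 0.1], EXAMPLE_POINTS):
        print(label, round(score, 3))
```

Because prediction is a lookup over stored embeddings rather than a fixed output layer, adding a new user-defined type only requires adding labeled points, which is one plausible reason such an approach can work with a large type vocabulary.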

# パラメータ依存力学系の初期値問題に対する機械学習

Machine Learning for Initial Value Problems of Parameter-Dependent Dynamical Systems ( http://arxiv.org/abs/2101.04595v1 )

ライセンス: Link先を確認

Roland Pulch and Maha Youssef

(参考訳) 物理パラメータを含む非線形力学系の初期値問題を考察する。溶液による利息の量が観測される。離散化は、多くの時間点における興味の量の軌跡をもたらす。パラメータの集合から軌道の離散値へのマッピングについて検討する。このマッピングの評価は初期値の問題を解決する必要がある。あるいは、機械学習の概念を用いて、評価が低い計算作業を必要とする近似を決定する。我々は、軌道のサンプルデータに適合するフィードフォワードニューラルネットワークを採用している。電気回路をモデル化する実験例に対して数値計算の結果を示す。

We consider initial value problems of nonlinear dynamical systems, which include physical parameters. A quantity of interest depending on the solution is observed. A discretisation yields the trajectories of the quantity of interest at many time points. We examine the mapping from the set of parameters to the discrete values of the trajectories. An evaluation of this mapping requires solving an initial value problem. Alternatively, we determine an approximation whose evaluation requires little computational work, using a concept of machine learning. We employ feedforward neural networks, which are fitted to data from samples of the trajectories. Results of numerical computations are presented for a test example modelling an electric circuit.

翻訳日:2021-04-04 01:30:14 公開日:2021-01-12
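
To make the "parameters to discretised trajectory" mapping in the abstract above concrete, here is a minimal sketch. It is not the authors' test example: it assumes a simple linear RC charging circuit, u'(t) = (V - u) / (R C) with u(0) = 0, takes the capacitor voltage as the quantity of interest, and integrates with the classical Runge-Kutta method. The function name `rc_trajectory` and all parameter values are hypothetical.

```python
import math


def rc_trajectory(R, C, V=1.0, t_end=1.0, n_points=10, substeps=100):
    """Map the parameters (R, C) to the capacitor voltage at n_points
    equally spaced time points in (0, t_end], by solving the initial value
    problem u' = (V - u) / (R * C), u(0) = 0 with RK4."""

    def f(u):
        return (V - u) / (R * C)

    h = t_end / (n_points * substeps)
    u = 0.0
    values = []
    for _ in range(n_points):
        for _ in range(substeps):
            k1 = f(u)
            k2 = f(u + 0.5 * h * k1)
            k3 = f(u + 0.5 * h * k2)
            k4 = f(u + h * k3)
            u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        values.append(u)
    return values


if __name__ == "__main__":
    # Each (parameters, trajectory) pair is one training sample that a
    # feedforward network could be fitted to as a cheap surrogate.
    samples = [((R, 0.5), rc_trajectory(R, 0.5)) for R in (0.5, 1.0, 2.0)]
    for params, traj in samples:
        print(params, [round(v, 4) for v in traj[:3]])
```

Every evaluation of `rc_trajectory` costs a full time integration, which is the expense a fitted surrogate is meant to avoid.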

# MP3net: 単純な畳み込みGANによる生オーディオからのコヒーレントで微小な音楽生成

MP3net: coherent, minute-long music generation from raw audio with a simple convolutional GAN ( http://arxiv.org/abs/2101.04785v1 )

ライセンス: Link先を確認

Korneel van den Broek

(参考訳) 本稿では,MP3/Vorbis音声圧縮技術を利用して,長距離コヒーレンスを有する長大な高品質オーディオサンプルを生成する深層畳み込みGANを提案する。このモデルは、すべての位相情報を含むMDCT(Modified Discrete Cosine Transform)データ表現を使用する。したがって、位相生成はモデルに不可欠な部分である。人間の耳の聴覚マスキングと心理音響知覚限界を利用して、真の分布を広げ、トレーニングプロセスを安定化させる。モデルアーキテクチャは深部2次元畳み込みネットワークであり、各ジェネレータモデルブロックは時間軸に沿って分解能を高め、周波数軸に沿って高いオクターブを追加する。より深いレイヤは出力のすべての部分に接続され、完全なトラックのコンテキストを持つ。これにより、長距離コヒーレンスを示すサンプルを生成することができる。我々はMP3netを使って、1つのクラウドTPUv2で250時間トレーニングした後、サンプルレート22kHzの95sステレオトラックを作成します。 CNNベースのモデルアーキテクチャのさらなる利点は、新しい曲の生成がほぼ瞬時に行われることである。

We present a deep convolutional GAN which leverages techniques from MP3/Vorbis audio compression to produce long, high-quality audio samples with long-range coherence. The model uses a Modified Discrete Cosine Transform (MDCT) data representation, which includes all phase information. Phase generation is hence an integral part of the model. We leverage the auditory masking and psychoacoustic perception limit of the human ear to widen the true distribution and stabilize the training process. The model architecture is a deep 2D convolutional network, where each subsequent generator model block increases the resolution along the time axis and adds a higher octave along the frequency axis. The deeper layers are connected with all parts of the output and have the context of the full track. This enables generation of samples which exhibit long-range coherence. We use MP3net to create 95s stereo tracks with a 22kHz sample rate after training for 250h on a single Cloud TPUv2. An additional benefit of the CNN-based model architecture is that generation of new songs is almost instantaneous.

翻訳日:2021-04-04 01:29:46 公開日:2021-01-12
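
The claim in the abstract above that an MDCT representation "includes all phase information" can be illustrated by showing that real-valued MDCT coefficients alone are enough to reconstruct the waveform. The sketch below is a textbook MDCT/IMDCT pair with 50% overlap-add, not code from MP3net: the sine window, the 2/N normalization of the inverse, and all function names are assumptions chosen so that overlapping frames cancel their aliasing.

```python
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(frame):
    """MDCT of a frame of 2N real samples -> N real coefficients."""
    N = len(frame) // 2
    return [
        sum(frame[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples."""
    N = len(coeffs)
    return [
        2.0 / N * sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                      for k in range(N))
        for n in range(2 * N)
    ]


def analysis_synthesis(signal, N):
    """Window, transform, invert, window again, and overlap-add with hop N."""
    w = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        frame = [w[n] * signal[start + n] for n in range(2 * N)]
        rec = imdct(mdct(frame))
        for n in range(2 * N):
            out[start + n] += w[n] * rec[n]
    return out


if __name__ == "__main__":
    N = 4
    signal = [math.sin(0.3 * i) + 0.1 * i for i in range(8 * N)]
    out = analysis_synthesis(signal, N)
    # Samples covered by two overlapping frames are reconstructed;
    # the first and last N samples are covered by only one frame.
    err = max(abs(out[i] - signal[i]) for i in range(N, len(signal) - N))
    print(err)
```

Since each frame of 2N samples yields only N real coefficients, the representation is critically sampled, and no separate phase channel has to be generated.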

# UCNN:非構造化メッシュの畳み込み戦略

UCNN: A Convolutional Strategy on Unstructured Mesh ( http://arxiv.org/abs/2101.05207v1 )

ライセンス: Link先を確認

Mengfei Xu, Shufang Song, Xuxiang Sun, Weiwei Zhang

(参考訳) 流体力学の機械学習では、フルコネクテッドニューラルネットワーク(FNN)はモデリングにのみローカル機能を使用するが、畳み込みニューラルネットワーク(CNN)は構造化/非構造化メッシュのデータには適用できない。 FNNとCNNの限界を克服するため、非構造畳み込みニューラルネットワーク(UCNN)が提案され、重み関数を通じて近隣ノードの特徴を集約し、効果的に活用する。随伴ベクトルモデリングは、ucnnの性能を研究するタスクとして取られる。フローフィールド特徴から随伴ベクトルへのマッピング関数は、GPU上の効率的な並列実装によって構成される。 UCNNのモデリング能力は,テストケースにおける検証セットや空力形状の最適化においてFNNと比較される。さらに,メッシュ変化がUCNNのモデリング能力に及ぼす影響について検討した。その結果,UCNNはモデリング過程においてより正確であることが示唆された。

In machine learning for fluid mechanics, the fully connected neural network (FNN) uses only local features for modelling, while the convolutional neural network (CNN) cannot be applied to data on structured/unstructured mesh. In order to overcome the limitations of FNN and CNN, the unstructured convolutional neural network (UCNN) is proposed, which aggregates and effectively exploits the features of neighbour nodes through a weight function. Adjoint vector modelling is taken as the task to study the performance of UCNN. The mapping function from flow-field features to the adjoint vector is constructed through an efficient parallel implementation on GPU. The modelling capability of UCNN is compared with that of FNN on the validation set and in aerodynamic shape optimization in a test case. The influence of mesh changes on the modelling capability of UCNN is further studied. The results indicate that UCNN is more accurate in the modelling process.

翻訳日:2021-04-04 01:29:28 公開日:2021-01-12

# 深層学習によるボアホール比抵抗測定システムの設計

Design of borehole resistivity measurement acquisition systems using deep learning ( http://arxiv.org/abs/2101.05623v1 )

ライセンス: Link先を確認

M. Shahriari, A. Hazra, D. Pardo

(参考訳) lwd(loging-while-drilling)装置で記録されたボアホール比抵抗測定は、地球の地下特性を特徴付けるために広く用いられている。石油やガスなどの天然資源の抽出を促進する。 lwd装置は、井戸付近の地表面の電気的特性を推定し、おそらく井戸軌道を補正するために、電磁的測定のリアルタイムな反転を必要とする。深層ニューラルネットワーク(dnn)ベースの手法は、トレーニングフェーズ中にオフラインで前方および逆問題を近似するので、ボアホール比抵抗測定の迅速な反転に適しており、評価にほんの1秒(すなわち予測)しか必要としない。しかし、逆問題は通常複数の解を許容する。データミスフィットに基づく従来の損失関数を持つDNNは、逆問題の解決には不適当である。これは、エンコーダ-デコーダアーキテクチャ用に特別に設計された損失関数に正規化項を追加することで部分的に克服できる。しかし、正則化を加えることで、優先すべき物理解の集合に対する可能な解の数を大幅に制限する。これを回避するために,正規化を伴わない2段階損失関数を用いる。さらに, 逆解を保証するためには, 十分な数の計測値を持つ注意深く選択した計測取得システムが必要である。そこで本研究では,DNNに基づく計測取得システムの設計のための反復アルゴリズムを提案する。いくつかの合成例を通してDNNに基づく反復アルゴリズムについて述べる。以上の結果から, 測定装置上および下方における抵抗層と導電層の両方を同定し, 特徴付けるのに十分であることがわかった。数値的な結果は有望であるが, 産業目的のためにはさらなる改良が必要である。

Borehole resistivity measurements recorded with logging-while-drilling (LWD) instruments are widely used for characterizing the earth's subsurface properties. They facilitate the extraction of natural resources such as oil and gas. LWD instruments require real-time inversions of electromagnetic measurements to estimate the electrical properties of the earth's subsurface near the well and possibly correct the well trajectory. Deep Neural Network (DNN)-based methods are suitable for the rapid inversion of borehole resistivity measurements as they approximate the forward and inverse problem offline during the training phase and they only require a fraction of a second for the evaluation (aka prediction). However, the inverse problem generally admits multiple solutions. DNNs with traditional loss functions based on data misfit are ill-equipped for solving an inverse problem. This can be partially overcome by adding regularization terms to a loss function specifically designed for encoder-decoder architectures. But adding regularization seriously limits the number of possible solutions to a set of a priori desirable physical solutions. To avoid this, we use a two-step loss function without any regularization. In addition, to guarantee an inverse solution, we need a carefully selected measurement acquisition system with a sufficient number of measurements. In this work, we propose a DNN-based iterative algorithm for designing such a measurement acquisition system. We illustrate our DNN-based iterative algorithm via several synthetic examples. Numerical results show that the obtained measurement acquisition system is sufficient to identify and characterize both resistive and conductive layers above and below the logging instrument. Numerical results are promising, although further improvements are required to make our method amenable for industrial purposes.

翻訳日:2021-04-04 01:29:15 公開日:2021-01-12

PDF登録状況（公開日: 20210112）