Fugu-MT: Translations of arXiv Papers

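This site provides Japanese translations of arXiv papers that are 30 pages or shorter and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.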

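The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).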

These are the papers with publication date 2022-01-28 (20220128).

Title	Authors	Abstract	Publication date / translation date
# Large Star/Rose Extra Dimension with Small Leaves/Petals ( http://arxiv.org/abs/2001.07102v5 ) License: see link	Florian Nortier	In this paper, we propose to compactify a single large extra dimension (LED) on a star/rose graph with a large number of identical leaves/petals. The 5D Planck scale can be chosen to be $\Lambda_P^{(5)} \sim \mathcal{O}(1)$ TeV, which can provide a path to solve the gauge hierarchy problem. The leaf/petal length scale is of $\mathcal{O}(1/\Lambda_{EW})$, where $\Lambda_{EW} \sim 100$ GeV is the weak scale, without the large geometrical hierarchy of the traditional LED models to stabilize. The 4D fields of the SM are localized on a 3-brane at the central vertex of the star/rose graph. We predict a tower of feebly coupled weak scale Kaluza-Klein (KK) gravitons below a regime of strongly coupled gravitational phenomena above the TeV scale. Moreover, we reformulate in our setup the LED mechanism to generate light Dirac neutrinos, where the right-handed neutrinos are KK-modes of gauge singlet fermions propagating in the bulk. A large number of KK-gravitons and KK-neutrinos interact only gravitationally, and thus constitute a hidden sector.	Translated: 2023-06-07 06:19:38 Published: 2022-01-28
# Bistochastic operators and quantum random variables ( http://arxiv.org/abs/2005.00005v2 ) License: see link	Sarah Plosker and Christopher Ramsey	Given a positive operator-valued measure $\nu$ acting on the Borel sets of a locally compact Hausdorff space $X$, with outcomes in the algebra $\mathcal B(\mathcal H)$ of all bounded operators on a (possibly infinite-dimensional) Hilbert space $\mathcal H$, one can consider $\nu$-integrable functions $X\rightarrow \mathcal B(\mathcal H)$ that are positive quantum random variables. We define a seminorm on the span of such functions which in the quotient leads to a Banach space. We consider bistochastic operators acting on this space, and majorization of quantum random variables is then defined with respect to these operators. As in classical majorization theory, we relate majorization in this context to an inequality involving all possible convex functions of a certain type. Unlike the classical setting, continuity and convergence issues arise throughout the work.	Translated: 2023-05-21 17:04:51 Published: 2022-01-28
# Guesswork of a quantum ensemble ( http://arxiv.org/abs/2012.09350v2 ) License: see link	Michele Dall'Arno, Francesco Buscemi, Takeshi Koshiba	The guesswork of a quantum ensemble quantifies the minimum number of guesses needed on average to correctly guess the state of the ensemble, when only one state can be queried at a time. Here, we derive analytical solutions of the guesswork problem subject to a finite set of conditions, including the analytical solution for any qubit ensemble with uniform probability distribution. As explicit examples, we compute the guesswork for any qubit regular polygonal and polyhedral ensemble.	Translated: 2023-04-20 08:43:25 Published: 2022-01-28
# Detecting Bias in the Presence of Spatial Autocorrelation ( http://arxiv.org/abs/2101.01703v3 ) License: see link	Subhabrata Majumdar, Cheryl Flynn, Ritwik Mitra	In spite of considerable practical importance, current algorithmic fairness literature lacks technical methods to account for underlying geographic dependency while evaluating or mitigating bias issues for spatial data. We initiate the study of bias in spatial applications in this paper, taking the first step towards formalizing this line of quantitative methods. Bias in spatial data applications often gets confounded by underlying spatial autocorrelation. We propose a hypothesis testing methodology to detect the presence and strength of this effect, then account for it using a spatial filtering-based approach, in order to enable application of existing bias detection metrics. We evaluate our proposed methodology through numerical experiments on real and synthetic datasets, demonstrating that, in the presence of several types of confounding effects due to the underlying spatial structure, our testing methods perform well in maintaining low type-II errors and nominal type-I errors.	Translated: 2023-04-17 19:48:34 Published: 2022-01-28
# Operational Theories in Phase Space: Toy Model for the Harmonic Oscillator ( http://arxiv.org/abs/2101.08323v2 ) License: see link	Martin Plávala, Matthias Kleinmann	We show how to construct general probabilistic theories that contain an energy observable dependent on position and momentum. The construction is in accordance with classical and quantum theory and allows for physical predictions, such as the probability distribution for position, momentum and energy. We demonstrate the construction by formulating a toy model for the harmonic oscillator that is neither classical nor quantum. The model features a discrete energy spectrum, a ground state with sharp position and momentum, an eigenstate with non-positive Wigner function as well as a state that has tunneling properties. The toy model demonstrates that operational theories can be a viable alternative approach for formulating physical theories.	Translated: 2023-04-14 11:07:57 Published: 2022-01-28
# Experimental demonstration of memory-enhanced scaling for entanglement connection of quantum repeater segments ( http://arxiv.org/abs/2101.08541v3 ) License: see link	Yunfei Pu, Sheng Zhang, Yukai Wu, Nan Jiang, Wei Chang, Chang Li and Luming Duan	The quantum repeater protocol is a promising approach to implement long-distance quantum communication and large-scale quantum networks. A key idea of the quantum repeater protocol is to use long-lived quantum memories to achieve efficient entanglement connection between different repeater segments with a polynomial scaling. Here we report an experiment which realizes efficient connection of two quantum repeater segments via on-demand entanglement swapping, using two atomic quantum memories with storage times of tens of milliseconds. With the memory enhancement, scaling-changing acceleration is demonstrated in the rate for a successful entanglement connection. The experimental realization of entanglement connection of two quantum repeater segments with an efficient memory-enhanced scaling demonstrates a key advantage of the quantum repeater protocol, which lays a cornerstone for future large-scale quantum networks.	Translated: 2023-04-14 08:37:03 Published: 2022-01-28
# Quantum Computational Advantage via High-Dimensional Gaussian Boson Sampling ( http://arxiv.org/abs/2102.12474v3 ) License: see link	Abhinav Deshpande, Arthur Mehta, Trevor Vincent, Nicolas Quesada, Marcel Hinsche, Marios Ioannou, Lars Madsen, Jonathan Lavoie, Haoyu Qi, Jens Eisert, Dominik Hangleiter, Bill Fefferman, Ish Dhand	Photonics is a promising platform for demonstrating a quantum computational advantage (QCA) by outperforming the most powerful classical supercomputers on a well-defined computational task. Despite this promise, existing proposals and demonstrations face challenges. Experimentally, current implementations of Gaussian boson sampling (GBS) lack programmability or have prohibitive loss rates. Theoretically, there is a comparative lack of rigorous evidence for the classical hardness of GBS. In this work, we make progress in improving both the theoretical evidence and experimental prospects. We provide evidence for the hardness of GBS, comparable to the strongest theoretical proposals for QCA. We also propose a new QCA architecture we call high-dimensional GBS, which is programmable and can be implemented with low loss using few optical components. We show that particular algorithms for simulating GBS are outperformed by high-dimensional GBS experiments at modest system sizes. This work thus opens the path to demonstrating QCA with programmable photonic processors.	Translated: 2023-04-10 00:57:51 Published: 2022-01-28
# Efficient sampling of ground and low-energy Ising spin configurations with a coherent Ising machine ( http://arxiv.org/abs/2103.05629v2 ) License: see link	Edwin Ng, Tatsuhiro Onodera, Satoshi Kako, Peter L. McMahon, Hideo Mabuchi, Yoshihisa Yamamoto	We show that the nonlinear stochastic dynamics of a measurement-feedback-based coherent Ising machine (MFB-CIM) in the presence of quantum noise can be exploited to sample degenerate ground and low-energy spin configurations of the Ising model. We formulate a general discrete-time Gaussian-state model of the MFB-CIM which faithfully captures the nonlinear dynamics present at and above system threshold. This model overcomes the limitations of both mean-field models, which neglect quantum noise, and continuous-time models, which assume long photon lifetimes. Numerical simulations of our model show that when the MFB-CIM is operated in a quantum-noise-dominated regime with short photon lifetimes (i.e., low cavity finesse), homodyne monitoring of the system can efficiently produce samples of low-energy Ising spin configurations, requiring many fewer roundtrips to sample than suggested by established high-finesse, continuous-time models. We find that sampling performance is robust to, or even improved by, turning off or altogether reversing the sign of the parametric drive, but performance is critically reduced in the absence of optical nonlinearity. For the class of MAX-CUT problems with binary-signed edge weights, the number of roundtrips sufficient to fully sample all spin configurations up to the first-excited Ising energy, including all degeneracies, scales as $1.08^N$. At a problem size of $N = 100$ with a few dozen (median of 20) such desired configurations per instance, we have found median sufficient sampling times of $6\times10^6$ roundtrips; in an experimental implementation of an MFB-CIM with a 10 GHz repetition rate, this corresponds to a wall-clock sampling time of 60 ms.	Translated: 2023-04-08 15:40:15 Published: 2022-01-28
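The wall-clock figure quoted in the abstract above can be reproduced with a short calculation. The sketch below rests on an assumption that the abstract does not state explicitly: one cavity roundtrip carries one pulse per spin, so a roundtrip lasts $N$ divided by the pulse repetition rate. The function names are illustrative only.

```python
def roundtrip_time_s(num_spins: int, repetition_rate_hz: float) -> float:
    """Duration of one cavity roundtrip.

    Assumption (not stated in the abstract): the cavity holds one
    pulse per spin, so a roundtrip takes num_spins pulse periods.
    """
    return num_spins / repetition_rate_hz


def wall_clock_time_s(num_roundtrips: float, num_spins: int,
                      repetition_rate_hz: float) -> float:
    """Total sampling time for a given number of roundtrips."""
    return num_roundtrips * roundtrip_time_s(num_spins, repetition_rate_hz)


if __name__ == "__main__":
    # Numbers taken from the abstract: N = 100, 6e6 roundtrips, 10 GHz.
    t = wall_clock_time_s(6e6, 100, 10e9)
    print(f"wall-clock sampling time: {t * 1e3:.1f} ms")
```

Under this assumption a roundtrip lasts 10 ns, which is consistent with the 60 ms quoted for $6\times10^6$ roundtrips.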
# Hierarchical tensile structures with ultralow mechanical dissipation ( http://arxiv.org/abs/2103.09785v3 ) License: see link	Mohammad J. Bereyhi, Alberto Beccari, Robin Groth, Sergey A. Fedorov, Amirali Arabmoheghi, Tobias J. Kippenberg, Nils J. Engelsen	Structural hierarchy is found in myriad biological systems and has improved man-made structures ranging from the Eiffel Tower to optical cavities. Hierarchical metamaterials utilize structure at multiple size scales to realize new and highly desirable properties which can be strikingly different from those of the constituent materials. In mechanical resonators whose rigidity is provided by static tension, structural hierarchy can reduce the dissipation of the fundamental mode to ultralow levels due to an unconventional form of soft clamping. Here, we apply hierarchical design to silicon nitride nanomechanical resonators and realize binary tree-shaped resonators with quality factors as high as $10^9$ at 107 kHz frequency, reaching the parameter regime of levitated particles. The resonators' thermal-noise-limited force sensitivities reach $740\ \mathrm{zN/\sqrt{Hz}}$ at room temperature and $90\ \mathrm{zN/\sqrt{Hz}}$ at 6 K, surpassing state-of-the-art cantilevers currently used for force microscopy. We also find that the self-similar structure of binary tree resonators results in fractional spectral dimensions, which is characteristic of fractal geometries. Moreover, we show that the hierarchical design principles can be extended to 2D trampoline membranes, and we fabricate ultralow dissipation membranes suitable for interferometric position measurements in Fabry-Pérot cavities. Hierarchical nanomechanical resonators open new avenues in force sensing, signal transduction and quantum optomechanics, where low dissipation is paramount and operation with the fundamental mode is often advantageous.	Translated: 2023-04-07 21:11:33 Published: 2022-01-28
# Precision tomography of a three-qubit donor quantum processor in silicon ( http://arxiv.org/abs/2106.03082v3 ) License: see link	Mateusz T. Mądzik, Serwan Asaad, Akram Youssry, Benjamin Joecker, Kenneth M. Rudinger, Erik Nielsen, Kevin C. Young, Timothy J. Proctor, Andrew D. Baczewski, Arne Laucht, Vivien Schmitt, Fay E. Hudson, Kohei M. Itoh, Alexander M. Jakob, Brett C. Johnson, David N. Jamieson, Andrew S. Dzurak, Christopher Ferrie, Robin Blume-Kohout and Andrea Morello	Nuclear spins were among the first physical platforms to be considered for quantum information processing, because of their exceptional quantum coherence and atomic-scale footprint. However, their full potential for quantum computing has not yet been realized, due to the lack of methods to link nuclear qubits within a scalable device combined with multi-qubit operations with sufficient fidelity to sustain fault-tolerant quantum computation. Here we demonstrate universal quantum logic operations using a pair of ion-implanted $^{31}$P donor nuclei in a silicon nanoelectronic device. A nuclear two-qubit controlled-Z gate is obtained by imparting a geometric phase to a shared electron spin, and used to prepare entangled Bell states with fidelities up to 94.2(2.7)%. The quantum operations are precisely characterised using gate set tomography (GST), yielding one-qubit average gate fidelities up to 99.95(2)%, two-qubit average gate fidelity of 99.37(11)% and two-qubit preparation/measurement fidelities of 98.95(4)%. These three metrics indicate that nuclear spins in silicon are approaching the performance demanded in fault-tolerant quantum processors. We then demonstrate entanglement between the two nuclei and the shared electron by producing a Greenberger-Horne-Zeilinger three-qubit state with 92.5(1.0)% fidelity. Since electron spin qubits in semiconductors can be further coupled to other electrons or physically shuttled across different locations, these results establish a viable route for scalable quantum information processing using donor nuclear and electron spins.	Translated: 2023-03-27 11:40:07 Published: 2022-01-28
# Witten-type topological field theory of self-organized criticality for stochastic neural networks ( http://arxiv.org/abs/2106.10851v2 ) License: see link	Jian Zhai, Chaojun Yu, You Zhai	We study the Witten-type topological field theory (W-TFT) of self-organized criticality (SOC) for stochastic neural networks. The Parisi-Sourlas-Wu quantization of general stochastic differential equations (SDEs) for neural networks, the Becchi-Rouet-Stora-Tyutin (BRST) symmetry of the diffusion system and the relation between spontaneous breaking and instantons connecting steady states of the SDEs, as well as the sufficient and necessary condition on pseudo-supersymmetric stochastic neural networks are obtained. Suppose that neuronal avalanches are a mechanism of cortical information processing and storage \cite{Beggs}\cite{Plenz1}\cite{Plenz2}, that the model of stochastic neural networks \cite{Dayan} is correct, and that the SOC system can be regarded as a W-TFT with spontaneously broken BRST symmetry. Then we should recover the neuronal avalanches and spontaneously broken BRST symmetry from the model of stochastic neural networks. We find that, provided the divergence of drift coefficients is small and non-constant, the model of stochastic neural networks is BRST symmetric. That is, if the SOC of the brain's neural network system can be regarded as a W-TFT with spontaneously broken BRST symmetry, then the general model of stochastic neural networks which is extensively used in neuroscience \cite{Dayan} is not enough to describe the SOC. On the other hand, using the Fokker-Planck equation, we show the sufficient condition on diffusion so that there exists a steady state probability distribution for the stochastic neural networks. Rhythms of the firing rates of the neuronal networks arise from the process, while some biological laws are conserved.	Translated: 2023-03-25 23:28:25 Published: 2022-01-28
# Heating Rates under Fast Periodic Driving beyond Linear Response ( http://arxiv.org/abs/2107.12587v2 ) License: see link	Takashi Mori	Heating under periodic driving is a generic nonequilibrium phenomenon, and it is a challenging problem in nonequilibrium statistical physics to derive a quantitatively accurate heating rate. In this work, we provide a simple formula for the heating rate under fast and strong periodic driving in classical and quantum many-body systems. The key idea behind the formula is constructing a time-dependent dressed Hamiltonian by moving to a rotating frame, which is found by a truncation of the high-frequency expansion of the micromotion operator, and applying the linear-response theory. It is confirmed for specific classical and quantum models that the second-order truncation of the high-frequency expansion yields quantitatively accurate heating rates beyond the linear-response regime. Our result implies that the information on heating dynamics is encoded in the first few terms of the high-frequency expansion, although heating is often associated with an asymptotically divergent behavior of the high-frequency expansion.	Translated: 2023-03-20 19:28:28 Published: 2022-01-28
# Engineering strong chiral light-matter interactions in a waveguide-coupled nanocavity ( http://arxiv.org/abs/2108.01462v3 ) License: see link	D. Hallett, A. P. Foster, D. M. Whittaker, M. S. Skolnick, L. R. Wilson	Spin-dependent, directional light-matter interactions form the basis of chiral quantum networks. In the solid state, quantum emitters commonly possess circularly polarised optical transitions with spin-dependent handedness. We demonstrate numerically that spin-dependent chiral coupling can be realised by embedding such an emitter in a waveguide-coupled nanocavity, which supports two near-degenerate, orthogonally-polarised cavity modes. The chiral behaviour arises due to direction-dependent interference between the cavity modes upon coupling to two single-mode output waveguides. Notably, an experimentally realistic cavity design simultaneously supports near-unity chiral contrast, efficient ($\beta > 0.95$) waveguide coupling and enhanced light-matter interaction strength (Purcell factor $F_P > 70$). In combination, these parameters could enable the development of highly coherent spin-photon interfaces, ready for integration into nanophotonic circuits.	Translated: 2023-03-20 00:55:59 Published: 2022-01-28
# Single-domain Bose condensate magnetometer achieves energy resolution per bandwidth below $\hbar$ ( http://arxiv.org/abs/2108.11716v2 ) License: see link	Silvana Palacios Alvarez, Pau Gomez, Simon Coop, Roberto Zamora-Zamora, Chiara Mazzinghi and Morgan W. Mitchell	We present a magnetic sensor with energy resolution per bandwidth $E_R < \hbar$. We show how a $^{87}\mathrm{Rb}$ single-domain spinor Bose-Einstein condensate, detected by non-destructive Faraday-rotation probing, achieves single-shot dc magnetic sensitivity of $72(8)~\mathrm{fT}$ measuring a volume $V= 1091(30)~\mu\mathrm{m}^3$ for $3.5~\mathrm{s}$, and thus $E_R = 0.075(16)~\hbar$. We measure experimentally the condensate volume, spin coherence time, and readout noise, and use phase-space methods, backed by 3+1D mean-field simulations, to compute the spin noise. Contributions to the spin noise include one-body and three-body losses and shearing of the projection noise distribution, due to competition of ferromagnetic contact interactions and quadratic Zeeman shifts. Nonetheless, the fully-coherent nature of the single-domain, ultracold two-body interactions allows the system to escape the coherence vs. density trade-off that imposes an energy resolution limit on traditional spin-precession sensors. We predict that other Bose-condensed alkalis, especially the antiferromagnetic $^{23}\mathrm{Na}$, can further improve the energy resolution of this method.	Translated: 2023-03-17 03:19:00 Published: 2022-01-28
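The quoted value $E_R = 0.075~\hbar$ follows from the other numbers in the abstract if one assumes the commonly used definition $E_R = \langle\delta B^2\rangle V T / (2\mu_0)$. The abstract does not spell this definition out, so treat it as an assumption of this sketch; the function name and the rounded constants are illustrative.

```python
HBAR = 1.054571817e-34    # reduced Planck constant, J*s
MU_0 = 1.25663706212e-6   # vacuum permeability, N/A^2


def energy_resolution_per_bandwidth(delta_b_tesla: float,
                                    volume_m3: float,
                                    duration_s: float) -> float:
    """Energy resolution per bandwidth in J*s.

    Assumed definition (not given in the abstract):
        E_R = delta_B^2 * V * T / (2 * mu_0)
    """
    return delta_b_tesla ** 2 * volume_m3 * duration_s / (2.0 * MU_0)


if __name__ == "__main__":
    # Central values from the abstract: 72 fT, 1091 um^3, 3.5 s.
    e_r = energy_resolution_per_bandwidth(72e-15, 1091e-18, 3.5)
    print(f"E_R = {e_r / HBAR:.3f} hbar")
```

With the central values this gives roughly 0.075 in units of $\hbar$, matching the abstract; the quoted uncertainties are not propagated here.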
# Topological two-particle dynamics in a periodically driven lattice model with on-site interactions ( http://arxiv.org/abs/2109.05220v2 ) License: see link	Anna Berti and Iacopo Carusotto	We develop a realistic protocol to observe a robust topological dynamics of two-particle bound states in a lattice model with on-site interactions and suitably designed time-dependent hoppings. This Floquet scheme can be realistically implemented on existing digital quantum computer platforms. Marked differences from the topological single-particle dynamics of two independent particles and clear signatures of the entanglement between the two constituent particles are highlighted.	Translated: 2023-03-15 11:28:12 Published: 2022-01-28
# Universal behavior beyond multifractality of wave-functions at measurement-induced phase transitions ( http://arxiv.org/abs/2109.06882v3 ) License: see link	Piotr Sierant, Xhek Turkeshi	We investigate the structure of many-body wave functions of 1D quantum circuits with local measurements employing the participation entropies. The leading term in the system size dependence of the participation entropy indicates a model-dependent multifractal scaling of the wave functions at any non-zero measurement rate. The sub-leading term contains universal information about measurement-induced phase transitions and plays the role of an order parameter, being constant non-zero in the error correcting phase and vanishing in the quantum Zeno phase. We provide robust numerical evidence investigating a variety of quantum many-body systems, and provide an analytical interpretation of this behavior expressing the participation entropy in terms of partition functions of classical statistical models in 2D.	Translated: 2023-03-15 02:53:51 Published: 2022-01-28
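For readers unfamiliar with the quantity used in the abstract above, the following minimal sketch computes a participation entropy from a list of wave-function amplitudes. It assumes the standard Rényi-type definition $S_q = \frac{1}{1-q}\ln\sum_i |\psi_i|^{2q}$ in a fixed computational basis, with the Shannon form in the limit $q \to 1$. The abstract does not state the definition or the basis, so both are assumptions here, and the function name is illustrative.

```python
import math


def participation_entropy(amplitudes, q=2.0):
    """Participation entropy S_q of a state given by basis amplitudes.

    Assumed definition: S_q = ln(sum_i p_i^q) / (1 - q), with
    p_i = |psi_i|^2 (normalized), and S_1 = -sum_i p_i ln p_i.
    """
    probs = [abs(a) ** 2 for a in amplitudes]
    norm = sum(probs)
    # Normalize and drop zero-weight basis states (they contribute nothing).
    probs = [p / norm for p in probs if p > 0.0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** q for p in probs)) / (1.0 - q)


if __name__ == "__main__":
    uniform = [1.0] * 8              # fully delocalized over 8 basis states
    localized = [1.0, 0.0, 0.0, 0.0]  # a single basis state
    print(participation_entropy(uniform, q=2.0))
    print(participation_entropy(localized, q=2.0))
```

A fully delocalized state over $\mathcal{N}$ basis states gives $S_q = \ln\mathcal{N}$ and a single basis state gives zero; the leading and sub-leading terms discussed in the abstract describe how $S_q$ grows with system size between these extremes.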
# Towards a Systematic Survey for Carbon Neutral Data Centers ( http://arxiv.org/abs/2110.09284v3 ) License: see link	Zhiwei Cao, Xin Zhou, Han Hu, Zhi Wang, Yonggang Wen	Data centers are carbon-intensive enterprises due to their massive energy consumption, and it is estimated that the data center industry will account for 8% of global carbon emissions by 2030. However, both technological and policy instruments for reducing or even neutralizing data center carbon emissions have not been thoroughly investigated. To bridge this gap, this survey paper proposes a roadmap towards carbon-neutral data centers that takes into account both policy instruments and technological methodologies. We begin by presenting the carbon footprint of data centers, as well as some insights into the major sources of carbon emissions. Following that, carbon neutrality plans for major global cloud providers are discussed to summarize current industrial efforts in this direction. We then introduce the carbon market as a policy instrument to explain how to offset data center carbon emissions in a cost-efficient manner. On the technological front, we propose achieving carbon-neutral data centers by increasing renewable energy penetration, improving energy efficiency, and boosting energy circulation simultaneously. A comprehensive review of existing technologies on these three topics is elaborated subsequently. Based on this, a multi-pronged approach towards carbon neutrality is envisioned and a digital twin-powered industrial artificial intelligence (AI) framework is proposed to make this solution a reality. Furthermore, three key scientific challenges for putting such a framework in place are discussed. Finally, several applications for this framework are presented to demonstrate its enormous potential.	Translated: 2023-03-11 09:52:59 Published: 2022-01-28
# A Map of Science in Wikipedia ( http://arxiv.org/abs/2110.13790v2 ) License: see link	Puyu Yang and Giovanni Colavizza	In recent decades, the rapid growth of Internet adoption has offered opportunities for convenient and inexpensive access to scientific information. Wikipedia, one of the largest encyclopedias worldwide, has become a reference in this respect, and has attracted widespread attention from scholars. However, a clear understanding of the scientific sources underpinning Wikipedia's contents remains elusive. In this work, we rely on an open dataset of citations from Wikipedia to map the relationship between Wikipedia articles and scientific journal articles. We find that most journal articles cited from Wikipedia belong to STEM fields, in particular biology and medicine (47.6% of citations; 46.1% of cited articles). Furthermore, Wikipedia's biographies play an important role in connecting STEM fields with the humanities, especially history. These results contribute to our understanding of Wikipedia's reliance on scientific sources, and its role as knowledge broker to the public.	Translated: 2023-03-10 05:32:38 Published: 2022-01-28
# カルノー限界を超える有限時間量子計測冷却 Finite-time quantum measurement cooling beyond the Carnot limit ( http://arxiv.org/abs/2111.12467v2 ) ライセンス: Link先を確認	Tong Fu, Jianying Du, Jingyi Chen, Jincan Chen, Chikako Uchiyama, Shanhe Su	(参考訳) 我々は, 侵襲的計測が冷却サイクルを駆動する力を与える計測系量子クーラーの有限時間サイクルモデルを提案する。そのようなクーラーは、マクスウェルの悪魔の代替思考実験と見なすことができる。測定フィードバック情報は、仕事の入力なしで低温浴から高温浴へ熱を移動させ、最大成績係数をカルノー限界より大きくすることさえできる。この一見パラドックスな結果が熱力学の法則に違反しない理由は、相互情報を含む一般化されたクラウシウスの不等式を導出することで明確に説明できる。 We propose the finite-time cycle model of a measurement-based quantum cooler, where invasive measurement provides the power to drive the cooling cycle. Such a cooler may be regarded as an alternative thought experiment to Maxwell's demon. The measurement-feedback information is capable of moving heat from the cold to the hot bath without any work input and even making the maximum coefficient of performance larger than the Carnot limit. The reason this seemingly paradoxical result does not violate the laws of thermodynamics can be clearly explained through the derivation of a generalized Clausius inequality including the mutual information.	翻訳日:2023-03-07 00:06:47 公開日:2022-01-28
# 一般化回転対称性で保護される非エルミート$C_{NH} = 2$チャーン絶縁体 Non-Hermitian $C_{NH} = 2$ Chern insulator protected by generalized rotational symmetry ( http://arxiv.org/abs/2111.12573v2 ) ライセンス: Link先を確認	Kai Chen and Alexander B. Khanikaev	(参考訳) 空間の回転とエルミート共役を誘導する一般化回転対称性によって保護される非エルミート位相系を提案する。この系は、強結合モデルと非相互ホッピングにより記述され、ギャップ付き位相位相において2対のギャップ内エッジモードをホストし、非エルミート(NH)チャーン数$C_{NH}=2$で特徴づけられる。非エルミートチャーン数の量子化は、系の一般化された回転対称性 $\hat{H}^{+}=\hat{U}\hat{H}\hat{U}^{+}$ によって保護される。我々の発見は、トポロジカル不変量の大きい値と、トポロジカルにレジリエントな多重化に使用できる複数のエッジ状態のホストを特徴とする、新しい非エルミート的トポロジカルシステムへの道を開くものである。 We propose a non-Hermitian topological system protected by the generalized rotational symmetry which invokes rotation in space and Hermitian conjugation. The system, described by the tight-binding model with nonreciprocal hopping, is found to host two pairs of in-gap edge modes in the gapped topological phase and is characterized by the non-Hermitian (NH) Chern number $C_{NH}=2$. The quantization of the non-Hermitian Chern number is shown to be protected by the generalized rotational symmetry $\hat{H}^{+}=\hat{U}\hat{H}\hat{U}^{+}$ of the system. Our finding paves the way towards novel non-Hermitian topological systems characterized by large values of topological invariants and hosting multiple in-gap edge states, which can be used for topologically resilient multiplexing.	翻訳日:2023-03-06 23:57:31 公開日:2022-01-28
# qubitノイズデコンボリューション Qubit noise deconvolution ( http://arxiv.org/abs/2112.03043v2 ) ライセンス: Link先を確認	Stefano Mangini, Lorenzo Maccone, Chiara Macchiavello	(参考訳) 量子ビットシステム上で任意の測定を行う際に,広いクラスのノイズを除去するノイズデコンボリューション手法を提案する。特に、最も一般的な単一キュービットノイズチャネルの逆写像を導出し、データ処理ステップで利用して、既知の雑音を受けるキュービットシステムで評価された可観測物のノイズフリー推定値を得る。本稿では、ノイズ特性評価が正確であることを保証するための自己整合性チェックを示し、一般的なパウリチャネルのデコンボリューションに対するシミュレーション結果と、Rigetti量子ハードウェア上で発生するデコヒーレンスノイズのデコンボリューションの実験的証拠を提供する。 We present a noise deconvolution technique to remove a wide class of noises when performing arbitrary measurements on qubit systems. In particular, we derive the inverse map of the most common single qubit noisy channels and exploit it at the data processing step to obtain noise-free estimates of observables evaluated on a qubit system subject to known noise. We illustrate a self-consistency check to ensure that the noise characterization is accurate, providing simulation results for the deconvolution of a generic Pauli channel, as well as experimental evidence of the deconvolution of decoherence noise occurring on Rigetti quantum hardware.	翻訳日:2023-03-05 10:05:39 公開日:2022-01-28
# 3量子ドットスピン量子ビットにおける高速で高忠実な状態形成と測定 Fast and high-fidelity state preparation and measurement in triple-quantum-dot spin qubits ( http://arxiv.org/abs/2112.09801v2 ) ライセンス: Link先を確認	Jacob Z. Blumoff, Andrew S. Pan, Tyler E. Keating, Reed W. Andrews, David W. Barnes, Teresa L. Brecht, Edward T. Croke, Larken E. Euliss, Jacob A. Fast, Clayton A. C. Jackson, Aaron M. Jones, Joseph Kerckhoff, Robert K. Lanza, Kate Raach, Bryan J. Thomas, Roland Velunta, Aaron J. Weinstein, Thaddeus D. Ladd, Kevin Eng, Matthew G. Borselli, Andrew T. Hunter, and Matthew T. Rakher	(参考訳) 交換専用Si/SiGeトリプル量子ドット量子ビットにおける高速で高忠実な状態形成と測定を示す。高速測定統合($980$ ns)と初期化($\approx 300$ ns)の操作は、全電気的、ベースバンド制御で実行される。我々は,交換専用量子ビットの文脈で開発された漏洩感度結合初期化および測定基準を強調し,$(2.5\pm0.5)\times 10^{-3}$の不忠実さを報告する。この結果は、高い谷分裂を持つヘテロ構造、2-to-3電子電荷境界における初期化、およびスピン・ツー・チャージ変換における$T_1$の慎重な評価と緩和によって実現される。最終的な忠実度は,多くの重要な要因によって制限され,さらに改善された忠実度と速度への明確な道筋が特定される。観測されたシングルキュービットランダム化ベンチマークエラーレート$1.7\times 10^{-3}$と並んで、スケーラブルな量子情報処理を約束する忠実度と持続時間におけるSi/SiGe三重ドット量子ビットの初期化、制御、測定を示す。 We demonstrate rapid, high-fidelity state preparation and measurement in exchange-only Si/SiGe triple-quantum-dot qubits. Fast measurement integration ($980$ ns) and initialization ($\approx 300$ ns) operations are performed with all-electrical, baseband control. We emphasize a leakage-sensitive joint initialization and measurement metric, developed in the context of exchange-only qubits but applicable more broadly, and report an infidelity of $(2.5\pm0.5)\times 10^{-3}$. This result is enabled by a high-valley-splitting heterostructure, initialization at the 2-to-3 electron charge boundary, and careful assessment and mitigation of $T_1$ during spin-to-charge conversion. The ultimate fidelity is limited by a number of comparably-important factors, and we identify clear paths towards further improved fidelity and speed. Along with an observed single-qubit randomized benchmarking error rate of $1.7\times 10^{-3}$, this work demonstrates initialization, control, and measurement of Si/SiGe triple-dot qubits at fidelities and durations which are promising for scalable quantum information processing.	翻訳日:2023-03-04 06:51:59 公開日:2022-01-28
# 超短パルスおよび高強度レーザーパルスによる原子イオン化におけるトンネルの役割 The role of tunneling in the ionization of atoms by ultrashort and intense laser pulses ( http://arxiv.org/abs/2112.14336v2 ) ライセンス: Link先を確認	Gabriel M. Lando	(参考訳) 古典的に許容される輸送は、ケルディシュパラメータがユニティよりも小さいにもかかわらず、超短パルスおよび強レーザーパルスによる原子のイオン化中に量子トンネルと競合することが示される。これは、Truncated Wigner Approximation を用いて、純粋に古典的な伝播から得られるものと正確な確率密度を比較することによって行われる。古典輸送は、軌道を核から遠ざけることができるだけでなく、実験で現在使われている強度に対して量子輸送と同じ位のイオン化確率を提供することもできる。本研究の結果は,強磁場物理における半古典的ステップモデルの概念的補正から、アト秒時計(attoclock)実験におけるトンネル時間測定に関する議論まで多岐にわたる。 Classically allowed transport is shown to compete with quantum tunneling during the ionization of atoms by ultrashort and intense laser pulses, despite Keldysh parameters smaller than unity. This is done by comparing exact probability densities with the ones obtained from purely classical propagation using the Truncated Wigner Approximation. Not only is classical transport capable of moving trajectories away from the core, but it can also furnish ionization probabilities of the same order as the quantum ones for intensities currently employed in experiments. Our results have implications ranging from a conceptual correction to semiclassical step models in strong-field physics to the ongoing debate about tunneling time measurements in attoclock experiments.	翻訳日:2023-03-02 23:42:30 公開日:2022-01-28
# 量子多重アクセスチャネル上のプライベート古典的通信 Private Classical Communication over Quantum Multiple-Access Channels ( http://arxiv.org/abs/2201.11899v1 ) ライセンス: Link先を確認	Remi A. Chou	(参考訳) 量子多重アクセスチャネル上でのプライベート古典通信について検討する。任意の数の送信機に対して、容量領域の正規化表現を導出する。分解可能なチャネルの場合、最善の達成可能な和率に対する単一レター式を確立し、この量もまた分解可能な量子多重アクセスチャネル上の量子通信における最良の達成可能な和率に対応することを証明します。達成可能性の結果として、信頼性とプライバシーの制約を分離し、それぞれ、量子側情報と普遍ハッシュによるソースコーディングによって処理する。したがって、検討中のマルチユーザコーディング問題は、ポイント・ツー・ポイントのコーディング技術でのみ扱うことができる。独立利害の副産物として、我々は、達成可能な結果におけるプライバシを保証する量子側情報に対する分散剰余ハッシュ補題を導出する。 We study private classical communication over quantum multiple-access channels. For an arbitrary number of transmitters, we derive a regularized expression of the capacity region. In the case of degradable channels, we establish a single-letter expression for the best achievable sum-rate and prove that this quantity also corresponds to the best achievable sum-rate for quantum communication over degradable quantum multiple-access channels. In our achievability result, we decouple the reliability and privacy constraints, which are handled via source coding with quantum side information and universal hashing, respectively. Hence, we also establish that the multi-user coding problem under consideration can be handled solely via point-to-point coding techniques. As a by-product of independent interest, we derive a distributed leftover hash lemma against quantum side information that ensures privacy in our achievability result.	翻訳日:2023-02-27 16:19:13 公開日:2022-01-28
# Pseudo-Hermiticityにより保護されたユニタリ散乱 Unitary Scattering Protected by Pseudo-Hermiticity ( http://arxiv.org/abs/2201.11894v1 ) ライセンス: Link先を確認	L. Jin	(参考訳) エルミート系はユニタリ散乱を持つが、ハーミート性はユニタリ散乱には不要であるが、非ハーミート性の影響下での散乱は主に非ユニタリ散乱である。ここでは、ユニタリ散乱がある種の擬ハーミティティーによって保護され、非ハーミティティーの次数の影響を受けないことを証明する。エネルギー保存は散乱過程において破れ、散乱後に回復する。接続点のみを含む擬エルミート散乱中心のサブシステムはエルミートである。これらの発見はユニタリ散乱、擬エルミティシティ、エネルギー保存に関する基本的な知見を提供し、非エルミティアン系における光伝播、メソスコピック電子輸送、量子干渉に有望である。 The Hermitian systems possess unitary scattering; however, the Hermiticity is unnecessary for a unitary scattering although the scattering under the influence of non-Hermiticity is mostly non-unitary. Here we prove that the unitary scattering is protected by certain type of pseudo-Hermiticity and unaffected by the degree of non-Hermiticity. The energy conservation is violated in the scattering process and recovers after scattering. The subsystem of the pseudo-Hermitian scattering center including only the connection sites is Hermitian. These findings provide fundamental insights on the unitary scattering, pseudo-Hermiticity, and energy conservation; and are promising for the light propagation, mesoscopic electron transport, and quantum interference in the non-Hermitian systems.	翻訳日:2023-02-27 16:18:59 公開日:2022-01-28
# 量子鍵分布のための統合室温単一光子源 Integrated Room Temperature Single Photon Source for Quantum Key Distribution ( http://arxiv.org/abs/2201.11882v1 ) ライセンス: Link先を確認	Helen Zhi Jie Zeng, Minh Anh Phan Ngyuen, Xiaoyu Ai, Adam Bennet, Alexander Solnstev, Arne Laucht, Ali Al-Juboori, Milos Toth, Rich Mildren, Robert Malaney, and Igor Aharonovich	(参考訳) 室温で動作可能な高純度単一光子源(SPS)は、量子フォトニクスや量子鍵分布を含む無数のアプリケーションにとって非常に望ましい。本研究では、六方晶窒化ホウ素(hBN)の原子欠陥と固体浸漬レンズ(SIL)を融合した超高輝度固体SPSを実現する。 SILはソース効率を6倍に向上させ、統合システムは室温で毎秒1000万個の光子を生成することができる。この結果は、量子通信プロトコルにおけるspsの実用化に有望である。 High-purity single photon sources (SPS) that can operate at room temperature are highly desirable for a myriad of applications, including quantum photonics and quantum key distribution. In this work, we realise an ultra-bright solid-state SPS based on an atomic defect in hexagonal boron nitride (hBN) integrated with a solid immersion lens (SIL). The SIL increases the source efficiency by a factor of six, and the integrated system is capable of producing over ten million single photons per second at room temperature. Our results are promising for practical applications of SPS in quantum communication protocols.	翻訳日:2023-02-27 16:18:30 公開日:2022-01-28
# 従来型結合クラスタとユニタリ結合クラスタとの演算子関係 Operator relationship between conventional coupled cluster and unitary coupled cluster ( http://arxiv.org/abs/2201.11881v1 ) ライセンス: Link先を確認	James K. Freericks	(参考訳) 化学コミュニティは、特に量子コンピュータ上で量子化学を実行することに関心があるため、単一参照システムにおいて、従来と一元結合クラスタアンサッツの正確な関係を求めてきた。本研究では、指数的不等式とアダマール補題によって与えられた演算子操作を、ユニタリ結合クラスター近似の因子化形式と従来の結合クラスター近似の因子化形式とを関連付ける方法を示す(一部の振幅は演算子値であり、他の項に可換ではないため、因子化形式が必要である)。トロッター積公式を用いることで、分解された形式をユニタリ結合クラスター ansatz の標準形式に関連付けることができる。結合クラスタ近似の分解形式の演算子依存は、さらに高階演算子を必要とするために除去され、最終的に従来の結合クラスタが生成される。このアプローチの代数的操作は手作業で行うのが難しいが、十分に小さなシステムのためにコンピュータ上で自動化することができる。 The chemistry community has long sought the exact relationship between the conventional and the unitary coupled cluster ansatz for a single-reference system, especially given the interest in performing quantum chemistry on quantum computers. In this work, we show how one can use the operator manipulations given by the exponential disentangling identity and the Hadamard lemma to relate the factorized form of the unitary coupled-cluster approximation to a factorized form of the conventional coupled cluster approximation (the factorized form is required, because some amplitudes are operator-valued and do not commute with other terms). By employing the Trotter product formula, one can then relate the factorized form to the standard form of the unitary coupled cluster ansatz. The operator dependence of the factorized form of the coupled cluster approximation can also be removed at the expense of requiring even more higher-rank operators, finally yielding the conventional coupled cluster. The algebraic manipulations of this approach are daunting to carry out by hand, but can be automated on a computer for small enough systems.	翻訳日:2023-02-27 16:18:21 公開日:2022-01-28
# 古典情報伝達の熱力学的基準 Thermodynamic Criterion of Transmitting Classical Information ( http://arxiv.org/abs/2201.12110v1 ) ライセンス: Link先を確認	Chung-Yun Hsieh	(参考訳) 古典情報伝達の熱力学的基準とは何か? 任意に与えられた超チャネルのクラスによって補助される1ショット古典キャパシティに対する熱力学的な上界および下界を証明した。これらの境界は、送信チャネルによって維持される古典的相関から抽出可能な仕事によって与えられ、選択されたスーパーチャネルのクラスに依存する追加の熱力学的制約を受ける。これは、ワンショット方式で、古典情報の$n$ビットをチャネルを介して送信することは、維持された古典的相関から抽出可能な$n\times k_BT\ln2$の仕事と等価であるという物理メッセージを与え、その結果、古典的情報の伝達に必要な熱力学的基準を明らかにする。この結果は漸近理論にまで拡張でき、ホールボ=シュマハ=ウェストモアランドの定理に熱力学的意味を与えることができる。最後に,仕事抽出はチャネルの資源理論と密接に関連していることを示す。この課題を定量的に解くために, 仕事抽出タスクはダイナミクスの一般的な資源を検出できることを示し, 広い範囲のチャネル資源を初めて熱力学的に解釈する。以上の知見は,コミュニケーションと熱力学の間に明らかなつながりをもたらし,その相互作用から新たな物理メッセージを発見する可能性を示す。 What is the thermodynamic criterion of transmitting classical information? We prove thermodynamic upper and lower bounds on the one-shot classical capacity assisted by an arbitrarily given class of superchannels. These bounds are given by the extractable work from classical correlation maintained by the transmission channel, subject to additional thermodynamic constraints depending on the chosen class of superchannels. It provides the physical message that, in the one-shot regime, transmitting $n$ bits of classical information through a channel is equivalent to $n\times k_BT\ln2$ extractable work from the maintained classical correlation, consequently revealing the thermodynamic criterion that is necessary to transmit classical information. This result can be further extended to the asymptotic regime, providing thermodynamic meanings for the Holevo-Schumacher-Westmoreland theorem. Finally, our study suggests that work extraction is closely related to resource theories of channels. To quantitatively address this question, we show that work extraction tasks can witness general resources of dynamics, providing the first thermodynamic interpretation of a broad class of channel resources. Our findings provide explicit connections between communication and thermodynamics, demonstrating the possibility of discovering new physical messages from their interplay.	翻訳日:2023-02-27 16:13:54 公開日:2022-01-28
# 量子マイクロ波フォトニクスの原理実証 A proof-of-principle demonstration of quantum microwave photonics ( http://arxiv.org/abs/2201.12106v1 ) ライセンス: Link先を確認	Yaqing Jin, Ye Yang, Huibo Hong, Xiao Xiang, Runai Quan, Tao Liu, Shougang Zhang, Ninghua Zhu, Ming Li, and Ruifang Dong	(参考訳) マイクロ波フォトニクスの急速な発展により、商業的重要性の多くの応用へと発展し、出現するボトルネックを取り除くことが重要となる。例えば、マイクロ波フォトニクスのメインブランチとして、無線オーバーファイバー技術は高帯域幅、低損失、長距離伝搬能力を提供し、通信から無線ネットワークまで幅広い応用を促進する。光キャリアとして超短パルスを用いると、さらに大きな容量が与えられる。しかし、超短パルスの広い帯域幅は、高周波RF信号のファイバ分散に対する深刻な脆弱性をもたらす。光キャリアとして時間エネルギーの絡み合った二光子源と単一光子検出技術を組み合わせた量子マイクロ波フォトニクス法の提案と実証を行った。その結果,超短パルスキャリアによる分散に強い耐性を持つ非局所RF信号変調を実現するだけでなく,分散からRF信号を効果的に抽出する機構を提供することがわかった。さらに,非局所変調RF信号と蒸留RF信号の両方のスプリアスフリーダイナミックレンジが大幅に改善された。低タイミングジッタの単一光子検出による超弱検出と高速処理の利点により、量子マイクロ波フォトニクス法は現代の通信やネットワークにおいて新たな可能性を開く。 With the rapid development of microwave photonics, which has expanded to numerous applications of commercial importance, eliminating the emerging bottlenecks becomes of vital importance. For example, as the main branch of microwave photonics, radio-over-fiber technology provides high bandwidth, low-loss, and long-distance propagation capability, facilitating wide applications ranging from telecommunication to wireless networks. With ultrashort pulses as the optical carrier, huge capacity is further endowed. However, the wide bandwidth of ultrashort pulses results in the severe vulnerability of high-frequency RF signals to fiber dispersion. With a time-energy entangled biphoton source as the optical carrier and combined with the single-photon detection technique, a quantum microwave photonics method is proposed and demonstrated experimentally. The results show that it not only realizes unprecedented nonlocal RF signal modulation with strong resistance to the dispersion associated with ultrashort pulse carriers but also provides an alternative mechanism to effectively distill the RF signal out from the dispersion. Furthermore, the spurious-free dynamic range of both the nonlocally modulated and distilled RF signals has been significantly improved. With the ultra-weak detection and high-speed processing advantages endowed by the low-timing-jitter single-photon detection, the quantum microwave photonics method opens up new possibilities in modern communication and networks.	翻訳日:2023-02-27 16:13:32 公開日:2022-01-28
# 1-2-3 量子ソフトウェア実験の再現性 1-2-3 Reproducibility for Quantum Software Experiments ( http://arxiv.org/abs/2201.12031v1 ) ライセンス: Link先を確認	Wolfgang Mauerer and Stefanie Scherzinger	(参考訳) 様々な科学分野が再現性危機に直面している。量子ソフトウェア工学が新興分野であるためには、最初から適切な再現性工学に重点を置くことが不可欠である。しかし、複製パッケージの提供はほとんど普遍的に欠落している。このようなパッケージの作り方に関する実践的なアドバイスはまれであり、コンピュータサイエンス以外のバックグラウンドを持つ研究者から多くの貢献を受けている分野において、特に不幸である。本稿では,量子ソフトウェア実験における再現性工学への1-2-3アプローチを提案することで,この不足を是正する方法について議論する。メタ生成機構を用いて、DOIセーフで長期的に機能し依存関係のない複製パッケージを生成する。これらは、プロジェクト固有の研究成果物(ソースコード、測定データ、構成データ)のみに基づいて、専門的および学習的な社会の要求を満たすように設計されており、研究者による時間的投資をほとんど必要としない。我々の方式は、量子プロセッサ自体がもはやアクセスできない場合でも、長期的トレーサビリティを確保する。技術的バーを劇的に下げることで、量子ソフトウェア実験における複製パッケージの増殖を促進し、非CS研究者の分野への参加を容易にする。 Various fields of science face a reproducibility crisis. For quantum software engineering as an emerging field, it is therefore imperative to focus on proper reproducibility engineering from the start. Yet the provision of reproduction packages is almost universally lacking. Actionable advice on how to build such packages is rare, which is particularly unfortunate in a field with many contributions from researchers with backgrounds outside computer science. In this article, we argue how to rectify this deficiency by proposing a 1-2-3 approach to reproducibility engineering for quantum software experiments: Using a meta-generation mechanism, we generate DOI-safe, long-term functioning and dependency-free reproduction packages. They are designed to satisfy the requirements of professional and learned societies solely on the basis of project-specific research artefacts (source code, measurement and configuration data), and require little temporal investment by researchers. Our scheme ascertains long-term traceability even when the quantum processor itself is no longer accessible. By drastically lowering the technical bar, we foster the proliferation of reproduction packages in quantum software experiments and ease the inclusion of non-CS researchers entering the field.	翻訳日:2023-02-27 16:12:18 公開日:2022-01-28
# 標準量子アニールはデコヒーレンスで断熱逆アニールより優れる Standard quantum annealing outperforms adiabatic reverse annealing with decoherence ( http://arxiv.org/abs/2201.11997v1 ) ライセンス: Link先を確認	Gianluca Passarelli, Ka-Wa Yip, Daniel A. Lidar, Procolo Lucignano	(参考訳) オープンシステムにおけるAdiabatic reverse annealing(ARA)について検討した。閉系(単位)設定では、このアニーリングプロトコルは選択されたモデルの1次量子相転移を回避し、アルゴリズムの初期状態がターゲットモデルとハミング距離に近いことを条件として、標準量子アニーリングと比較して指数的なスピードアップをもたらす。ここで、デコヒーレンスは、この結論を著しく修正できることを示す: 断熱マスター方程式のアプローチを用いて、独立かつ集合的デファスメントの下で$p=3$の強磁性(p$-spin)モデルのダイナミクスをシミュレートする。いずれのデコヒーレンスモデルにおいても、オープンシステムaraの性能は、ユニタリシステムよりも初期状態の選択に対する感受性が低く、最も顕著なのは、オープンシステムaraが標準量子アニーリングに比べてソリューションアドバンテージの時間を失うことである。これらの結果は、ARAが単独の戦略として、標準の「前方」量子アニールを実験的に上回ることは不可能であり、現実的でノイズの多い環境でのARAの利点を実現するためにはエラー軽減戦略が必要であることを示唆している。 We study adiabatic reverse annealing (ARA) in an open system. In the closed system (unitary) setting, this annealing protocol allows avoidance of first-order quantum phase transitions of selected models, resulting in an exponential speedup compared with standard quantum annealing, provided that the initial state of the algorithm is close in Hamming distance to the target one. Here, we show that decoherence can significantly modify this conclusion: by resorting to the adiabatic master equation approach, we simulate the dynamics of the ferromagnetic $p$-spin model with $p=3$ under independent and collective dephasing. For both models of decoherence, we show that the performance of open system ARA is far less sensitive to the choice of the initial state than its unitary counterpart, and, most significantly, that open system ARA by and large loses its time to solution advantage compared to standard quantum annealing. These results suggest that as a stand-alone strategy, ARA is unlikely to experimentally outperform standard "forward" quantum annealing, and that error mitigation strategies will likely be required in order to realize the benefits of ARA in realistic, noisy settings.	翻訳日:2023-02-27 16:12:00 公開日:2022-01-28
# 2次解析によるBennett-Brassard 1984プロトコルにおける2塩基間の最適比 Optimum ratio between two bases in Bennett-Brassard 1984 protocol with second order analysis ( http://arxiv.org/abs/2201.11960v1 ) ライセンス: Link先を確認	Masahito Hayashi	(参考訳) ベネット・ブラッサード 1984 (BB84) プロトコルでは,コヒーレント攻撃時の生成鍵の長さに対する2次拡張を用いて,2つのベース,ビットベース,位相ベースの選択比率を最適化する。この最適化は、ベースの不一致による送信ビットの損失と、位相ベースにおける誤差率の推定誤差とのトレードオフに対処する。次に、第2次漸近性を有する生成鍵の最適比と最適長さを求める。驚くべきことに、2次の順序は$n^{3/4}$であり、これは従来の設定では$n$が量子通信の数であるとき、$n^{1/2}$よりもはるかに大きい。この事実は、我々の設定が従来の問題よりも2階解析においてはるかに重要であることを示している。この重要性を説明するために,第2次補正の効果を数値的にプロットする。 In the Bennett-Brassard 1984 (BB84) protocol, we optimize the ratio of the choice of two bases, the bit basis and the phase basis, by using the second order expansion for the length of the generation keys under the coherent attack. This optimization addresses the trade-off between the loss of transmitted bits due to the disagreement of their bases and the estimation error of the error rate in the phase basis. Then, we derive the optimum ratio and the optimum length of the generation keys with the second order asymptotics. Surprisingly, the second order has the order $n^{3/4}$, which is much larger than the second order $n^{1/2}$ in the conventional setting, where $n$ is the number of quantum communications. This fact shows that our setting has much larger importance for the second order analysis than the conventional problem. To illustrate this importance, we numerically plot the effect of the second order correction.	翻訳日:2023-02-27 16:11:37 公開日:2022-01-28
# 遠距離量子メモリの絡み合い Entangling metropolitan-distance separated quantum memories ( http://arxiv.org/abs/2201.11953v1 ) ライセンス: Link先を確認	Xi-Yu Luo, Yong Yu, Jian-Long Liu, Ming-Yang Zheng, Chao-Yang Wang, Bin Wang, Jun Li, Xiao Jiang, Xiu-Ping Xie, Qiang Zhang, Xiao-Hui Bao, Jian-Wei Pan	(参考訳) 量子インターネットは、すべての量子リソースを接続するという約束を与え、ローカライズされたシナリオをはるかに超えるアプリケーションを可能にする。プロトタイプは、絡み合って分離された量子記憶のネットワークである。従来は距離が限られていた。本稿では,2つの原子量子メモリ間の遠隔絡み合いを,大都市圏で直接12.5km間隔で物理的に分離した。原子-光子結合を1つのノードに生成し、光子を第2ノードに送信して記憶する。 20.5kmのフィールド展開ファイバによる低損失伝送を周波数ダウンコンバージョンとアップコンバージョンを用いて活用する。最終的なメモリ・メモリの絡み合いは、光子を回収することで90%の忠実さが証明される。我々の実験は、実用的なシナリオで量子ネットワークアプリケーションを研究する方法である。 Quantum internet gives the promise of getting all quantum resources connected, and it will enable applications far beyond a localized scenario. A prototype is a network of quantum memories that are entangled and well separated. Previous realizations are limited in the distance. In this paper, we report the establishment of remote entanglement between two atomic quantum memories physically separated by 12.5 km directly in a metropolitan area. We create atom-photon entanglement in one node and send the photon to a second node for storage. We harness low-loss transmission through a field-deployed fiber of 20.5 km by making use of frequency down-conversion and up-conversion. The final memory-memory entanglement is verified to have a fidelity of 90% via retrieving to photons. Our experiment paves the way to study quantum network applications in a practical scenario.	翻訳日:2023-02-27 16:11:22 公開日:2022-01-28
# トラップイオン量子コンピュータ上のマルチラウンドqaoaおよびadvancedミキサー Multi-round QAOA and advanced mixers on a trapped-ion quantum computer ( http://arxiv.org/abs/2201.12335v1 ) ライセンス: Link先を確認	Yingyue Zhu, Zewen Zhang, Bhuvanesh Sundar, Alaina M. Green, C. Huerta Alderete, Nhung H. Nguyen, Kaden R. A. Hazzard, Norbert M. Linke	(参考訳) グラフ上の組合せ最適化問題は、科学と工学に幅広い応用がある。量子近似最適化アルゴリズム(Quantum Approximate Optimization Algorithm, QAOA)は、変分回路の複数ラウンドを適用して量子コンピュータ上でこれらの問題を解く方法である。しかし、QAOAの実際の応用を制限するいくつかの課題が存在する。本稿では、複数の任意のグラフ上の複数の問題に対するラウンド数によってqaoa結果が改善するトラップイオン量子コンピュータについて述べる。また,任意の重みを持つ最適解をサンプリングできる高度な混合ハミルトニアンを示す。結果は,実世界の問題に量子アルゴリズムを適用するための一歩である。 Combinatorial optimization problems on graphs have broad applications in science and engineering. The Quantum Approximate Optimization Algorithm (QAOA) is a method to solve these problems on a quantum computer by applying multiple rounds of variational circuits. However, there exist several challenges limiting the real-world applications of QAOA. In this paper, we demonstrate on a trapped-ion quantum computer that QAOA results improve with the number of rounds for multiple problems on several arbitrary graphs. We also demonstrate an advanced mixing Hamiltonian that allows sampling of all optimal solutions with predetermined weights. Our results are a step towards applying quantum algorithms to real-world problems.	翻訳日:2023-02-27 16:04:08 公開日:2022-01-28
# 散逸工学による非ユニタリゲート操作 Nonunitary Gate Operations by Dissipation Engineering ( http://arxiv.org/abs/2201.12330v1 ) ライセンス: Link先を確認	E. Zapusek, A. Javadi, F. Reiter	(参考訳) 無可逆論理はユニタリ量子進化と相反する。そのような操作を古典的な測定でエミュレートすることは、外乱と高いリソース要求をもたらす可能性がある。これらの制限を克服するために, 不可逆ゲート操作に必要な非単位進化を実現するために, 散逸を利用するプロトコルを提案する。崩壊する新たな励起状態を用いて、最小の安定ヒルベルト空間上で所望のゲート演算を行う効果的な崩壊過程を設計する。これらは、測定を必要とせず、決定論的かつ自律的に動作する。我々は、OR、NOR、XORゲートなどの古典論理演算を考慮に入れたアプローチを例証する。実験的な実現に向けて、量子ドットの実装の可能性について議論する。本研究では,非可逆論理演算を現実的な量子システム上で効率的に行うことができ,非単体進化を得るためには散逸工学が不可欠であることを示す。提案したオペレーションは、量子エンジニアのツールボックスを拡張し、NISQアルゴリズムと量子機械学習に有望な応用をもたらす。 Irreversible logic is at odds with unitary quantum evolution. Emulating such operations by classical measurements can result in disturbances and high resource demands. To overcome these limitations, we propose protocols that harness dissipation to realize the nonunitary evolution required for irreversible gate operations. Using additional excited states subject to decay, we engineer effective decay processes that perform the desired gate operations on the smallest stable Hilbert space. These operate deterministically and in an autonomous fashion, without the need for measurements. We exemplify our approach considering several classical logic operations, such as the OR, NOR, and XOR gates. Towards experimental realization, we discuss a possible implementation in quantum dots. Our study shows that irreversible logic operations can be efficiently performed on realistic quantum systems and that dissipation engineering is an essential tool for obtaining nonunitary evolutions. The proposed operations expand the quantum engineers' toolbox and have promising applications in NISQ algorithms and quantum machine learning.	翻訳日:2023-02-27 16:03:57 公開日:2022-01-28
# 量子後連想記憶 A Post-Quantum Associative Memory ( http://arxiv.org/abs/2201.12305v1 ) ライセンス: Link先を確認	Ludovico Lami, Daniel Goldwater, Gerardo Adesso	(参考訳) 連想記憶(Associative memory)は、その部分的開示によって完全に検索できる情報を記憶する装置である。我々は,いくつかの基本的な操作公理を満足する物理理論の最も一般的なクラスを表現する一般確率論(GPTs)の枠組みの中で,連想記憶のおもちゃモデルとそれが受ける究極の限界について検討する。任意の$N$個が完全に区別可能であるという性質を持つ$2^m$個の状態を収容するために、GPTの次元がどれだけ大きくなければならないかを問う。DanzerとGrünbaumによる古い結果を用いて、$N=2$のときこの問いに対する最適な答えは$m+1$であることを証明する。これは、理論が古典的あるいは量子的であることを要求した場合の$O(2^m)$と対照的である。これは、GPTが古典理論と量子理論の両方を指数関数的に上回るタスクの例をもたらす。$N\geq 3$の場合の同じ問題は未解決のままである。 Associative memories are devices storing information that can be fully retrieved given partial disclosure of it. We examine a toy model of associative memory and the ultimate limitations it is subjected to within the framework of general probabilistic theories (GPTs), which represent the most general class of physical theories satisfying some basic operational axioms. We ask ourselves how large the dimension of a GPT should be so that it can accommodate $2^m$ states with the property that any $N$ of them are perfectly distinguishable. Invoking an old result by Danzer and Grünbaum, we prove that when $N=2$ the optimal answer to this question is $m+1$, to be compared with $O(2^m)$ when the theory is required to be either classical or quantum. This yields an example of a task where GPTs outperform both classical and quantum theory exponentially. The same problem for $N\geq 3$ is left open.	翻訳日:2023-02-27 16:03:31 公開日:2022-01-28
# 宇宙の絡み合いに対する幾何学的補正 Geometric corrections to cosmological entanglement ( http://arxiv.org/abs/2201.12299v1 ) ライセンス: Link先を確認	Alessio Belfiglio, Orlando Luongo, Stefano Mancini	(参考訳) 均質および等方的宇宙背景上の不均質摂動による絡み合い生成について検討し、量子効果と幾何効果の相互作用が、均質なシナリオに関して絡み合いエントロピーに関連があることを示した。そのため、共形結合したスカラー場に注目し、スカラー粒子の幾何的生成が絡み合うかについて議論する。摂動的に、第一階ではエントロピー補正の振動を見出すが、第二階では下層幾何が絡み合い生成のモード混合を誘導する。したがって,幾何学的貢献のみによる絡み合いを定量化し,これまでの結果と比較した。ダークマター候補として解釈された幾何学的(準)粒子による幾何学的寄与を特徴付ける。 We investigate entanglement production by inhomogeneous perturbations over a homogeneous and isotropic cosmic background, demonstrating that the interplay between quantum and geometric effects can have relevant consequences on entanglement entropy, with respect to homogeneous scenarios. To do so, we focus on a conformally coupled scalar field and discuss how geometric production of scalar particles leads to entanglement. Perturbatively, at first order we find oscillations in entropy correction, whereas at second order the underlying geometry induces mode-mixing on entanglement production. We thus quantify entanglement solely due to geometrical contribution and compare our outcomes with previous findings. We characterize the geometric contribution through geometric (quasi)-particles, interpreted as dark matter candidates.	翻訳日:2023-02-27 16:03:12 公開日:2022-01-28
# 監視量子回路における絡み合いダイナミクスの3次元解法 Three-fold way of entanglement dynamics in monitored quantum circuits ( http://arxiv.org/abs/2201.12259v1 ) ライセンス: Link先を確認	Tara Kalsi, Alessandro Romito, Henning Schomerus	(参考訳) ダイソンの3つの円形アンサンブル(円形ユニタリ,直交,シンプレクティックアンサンブル,CUE,COE,CSE)上に構築された量子回路における測定誘起エンタングルメント遷移について検討する。局所ランダムユニタリゲートの交互に発展する一次元回路の確立したモデルと、測定速度が増加するにつれて広範囲から集中的な絡み合いスケーリングへの遷移を示すことで、キューから引き出すゲートに対して可変速度で行う射影計測を活用した。このケースをCOEとCSEと対比することにより、ゲートによる局所的な絡み合い発生と測定による絡み合い低減との相互作用の洞察を得る。このために,各ゲートが異なるアンサンブルで生成する絡み合いに対する解析的ランダム行列結果と,完全量子回路に対する数値結果を組み合わせた。これらの考察は、カルタンのKAK分解の本質を捉えた特性エンタングルメント行列の観点で統計エンタングリングパワーの効率的な言い換え、CSEに関連する反対称行列の固有値統計に対する一般的な結果を含む。 We investigate the measurement-induced entanglement transition in quantum circuits built upon Dyson's three circular ensembles (circular unitary, orthogonal, and symplectic ensembles; CUE, COE and CSE). We utilise the established model of a one-dimensional circuit evolving under alternating local random unitary gates and projective measurements performed with tunable rate, which for gates drawn from the CUE is known to display a transition from extensive to intensive entanglement scaling as the measurement rate is increased. By contrasting this case to the COE and CSE, we obtain insights into the interplay between the local entanglement generation by the gates and the entanglement reduction by the measurements. For this, we combine exact analytical random-matrix results for the entanglement generated by the individual gates in the different ensembles, and numerical results for the complete quantum circuit. These considerations include an efficient rephrasing of the statistical entangling power in terms of a characteristic entanglement matrix capturing the essence of Cartan's KAK decomposition, and a general result for the eigenvalue statistics of antisymmetric matrices associated with the CSE.	翻訳日:2023-02-27 16:02:58 公開日:2022-01-28
# 非線形蹴りマッハ・ツェンダー干渉計を用いた量子計測 Quantum metrology with a non-linear kicked Mach-Zehnder interferometer ( http://arxiv.org/abs/2201.12255v1 ) ライセンス: Link先を確認	Sabrina Müller and Daniel Braun	(参考訳) 位相シフト器に加えて非線形素子を含むマッハ・ツェンダー干渉計の感度について検討した。キャビティまたは光が何度も通過するループに両方の要素を含めることで、干渉計の非線形キックバージョンが生まれる。本研究では, 位相シフト, キック強度, 最大平均光子数, および初期コヒーレント状態における光子損失による減衰の関数としての感度について検討した。減衰が消失する極限では、スクイーズが全光子数を支配している場合に、ハイゼンベルク限界の感度スケーリングが生じることがわかった。小から中程度の減衰率では、非線形キックは単位時間当たりの量子フィッシャー情報によって測定される感度をかなり高めることができる。 We study the sensitivity of a Mach-Zehnder interferometer that contains a non-linear element in addition to the phase shifter. By including both elements in a cavity or a loop that the light traverses many times, a non-linear kicked version of the interferometer arises. We study its sensitivity as a function of the phase shift, the kicking strength, the maximally reached average number of photons, and damping due to photon loss for an initial coherent state. We find that for vanishing damping Heisenberg-limited scaling of the sensitivity arises if squeezing dominates the total photon number. For small to moderate damping rates the non-linear kicks can considerably increase the sensitivity as measured by the quantum Fisher information per unit time.	翻訳日:2023-02-27 16:02:36 公開日:2022-01-28
# 量子イマジナリー時間進化によるMaxCutの解法 Solving MaxCut with Quantum Imaginary Time Evolution ( http://arxiv.org/abs/2201.12221v1 ) ライセンス: Link先を確認	Rizwanul Alam, George Siopsis, Rebekah Herrman, James Ostrowski, Phillip Lotshaw, Travis Humble	(参考訳) 量子イマジナリー時間発展(QITE)に基づくMaxCut問題を効率的に解く手法を提案する。ユニタリ更新には線形Ansatzを使用し、絡み合いを伴わない初期状態とする。この手法を頂点数 $|V| = 4, 6, 8, 10$ のグラフに適用し、10回のQITEステップ後に平均解がそれぞれ最大MaxCut解の 100%, 99%, 98%, 97% となることを示す。与えられたグラフと2つのエッジを除去した部分グラフとを補間する虚時間依存ハミルトニアンを用いることで、修正アルゴリズムは最大8頂点のすべてのグラフと約100個の10頂点グラフのランダムサンプルに対して最大解に100%の精度で収束することを示した。この改良された手法は頂点数の多項式であるオーバーヘッドを持つ。 We introduce a method to solve the MaxCut problem efficiently based on quantum imaginary time evolution (QITE). We employ a linear Ansatz for unitary updates and an initial state that involve no entanglement. We apply the method to graphs with $|V| = 4, 6, 8, 10$ vertices, and show that after ten QITE steps, the average solution is 100%, 99%, 98%, 97%, respectively, of the maximum MaxCut solution. By employing an imaginary-time-dependent Hamiltonian interpolating between a given graph and a subgraph with two edges excised, we show that the modified algorithm has a 100% performance converging to the maximum solution of the MaxCut problem for all graphs up to eight vertices as well as about 100 random samples of ten-vertex graphs. This improved method has an overhead which is polynomial in the number of vertices.	翻訳日:2023-02-27 16:02:22 公開日:2022-01-28
# 熱力学過程における量子コヒーレンスの役割 The roles of quantum coherence in thermodynamic processes ( http://arxiv.org/abs/2201.12202v1 ) ライセンス: Link先を確認	Jingyi Chen, Guozhen Su, Jincan Chen, and Shanhe Su	(参考訳) 2つの異なる固有基底ベクトルの重ね合わせに付随する量子コヒーレンスは熱力学において必須であると考えられている。系密度作用素とハミルトニアンの基底ベクトルの拡張として観測因子を記述することにより、コヒーレント因子を決定することができる。スピンの偏差や光子の自発的放出といった有限時間熱力学過程におけるコヒーレンスの役割を明らかにする。その結果,スピン沈降と自然放出過程における熱は,主にコヒーレンスによって生成されることがわかった。 Quantum coherence associated with the superpositions of two different sets of eigenbasis vectors has been regarded as essential in thermodynamics. It is found that coherent factors can be determined by writing observables as an expansion in the basis vectors of the systemic density operator and Hamiltonian. We reveal the roles of coherence in finite-time thermodynamic processes, such as the spin precession and the spontaneous emission of a photon. Results show that the work in the spin precession and the heat in the spontaneous emission process are mainly generated by coherence.	翻訳日:2023-02-27 16:02:05 公開日:2022-01-28
# UofA-Truth at Factify 2022 : トランスフォーマーとトランスファー学習に基づくマルチモーダルファクトチェック UofA-Truth at Factify 2022 : Transformer And Transfer Learning Based Multi-Modal Fact-Checking ( http://arxiv.org/abs/2203.07990v1 ) ライセンス: Link先を確認	Abhishek Dhankar, Osmar R. Zaïane and Francois Bolduc	(参考訳) 特にテキスト、画像、ビデオ、音声を通じて情報を伝達する複数のモードを考える場合、偽ニュースを特定することは非常に難しい作業である。我々は,De-Factify@AAAI2022におけるFACTIFY共有タスクにおいて,複数のモーダルニュースソース(テキストや画像を含む)の自動誤報・偽情報検出の問題に,単純かつ効果的なアプローチで対処する試みを行った。私たちのモデルはF1重み付けスコア74.807%を達成し、これは全応募の中で4番目に良い成績であった。本稿では,共有タスクに取り組むためのアプローチについて説明する。 Identifying fake news is a very difficult task, especially when considering the multiple modes of conveying information through text, image, video and/or audio. We attempted to tackle the problem of automated misinformation/disinformation detection in multi-modal news sources (including text and images) through our simple, yet effective, approach in the FACTIFY shared task at De-Factify@AAAI2022. Our model produced an F1-weighted score of 74.807%, which was the fourth best out of all the submissions. In this paper we will explain our approach to undertake the shared task.	翻訳日:2023-02-27 15:56:01 公開日:2022-01-28
# 2次元Tiny外窓を用いた3次元量子力学の二重量子化 A Double Quantization for 3d Quantum Mechanics with 2d Tiny Extra Window ( http://arxiv.org/abs/2202.00539v1 ) ライセンス: Link先を確認	Zahra Ghahreman, Mehdi Dehghani, Majid Monemzadeh	(参考訳) 我々は、検出しようとする粒子の既存のコンパクト余剰次元の仮説に基づいて量子力学を構築する。確率関数を導入することにより、粒子の外部2dウィンドウへの遷移を表現する。この関数の一般的な性質について検討し、余分な窓への粒子発生のための長さスケールが与えられる。多様な視点から考えると、新しい長さスケールはプランク定数のほかに別の量子化のための別の量子基準となる。第二級制約系の正準量子化は、所望の量子力学を構築するための方法であり、その中に確率関数が第二級制約の構造に入る。これは、余剰次元の現象を効果的に3次元量子力学にインポートする。この効果的な二重量子論のいくつかの側面が述べられており、場の理論的な視界とは対照的に、余剰次元の量子を機械的に経験することに焦点を当てている。特に、線形微分方程式を解くためのフロベニウス法則を用いて、自由粒子の波動関数とスペクトルの解を作ろうとする。この文脈では、余剰次元の長さスケールは3次元空間に接続された小さな余剰ウィンドウの境界における波動方程式の特異性を特徴付ける。 We construct a quantum mechanics based on the hypothesis that compact extra dimensions exist for a particle that is to detect them. By introducing a probability function, we express the transition of the particle to the extra 2d window. The general properties of this function are examined, and a length scale for the occurrence of the particle in the extra window is given. From a different viewpoint, we consider that the new length scale serves as a second quantum criterion, for a second quantization, besides the Planck constant. Canonical quantization of second-class constrained systems is our method for constructing the desired quantum mechanics, in which the probability function enters the structure of the second-class constraints. This effectively imports the phenomena of the extra dimension into 3d quantum mechanics. Some aspects of this effective double quantum theory are mentioned; these may be investigated further in order to experience the extra dimension quantum mechanically, in contrast to field-theoretic approaches. In particular, we try to construct solutions for the wave function and spectrum of the free particle using the Frobenius method for solving linear differential equations. In this context, the length scale of the extra dimension characterizes the singularity of the wave equation at the boundary where the tiny extra window connects to 3d space.	翻訳日:2023-02-27 15:55:33 公開日:2022-01-28
# 結合線形ポテンシャル上の準安定状態の非断熱崩壊 Nonadiabatic decay of metastable states on coupled linear potentials ( http://arxiv.org/abs/2201.12388v1 ) ライセンス: Link先を確認	Alisher Duspayev, Ansh Shah, Georg Raithel	(参考訳) 反対の傾斜を持つレベル対の回避交差は、量子粒子の外部自由度に対するポテンシャルエネルギー曲線を形成することができる。本研究では, このような回避交差上の準安定状態(MSAC)の非断熱的崩壊について, ダイアバティックおよび断熱的表現を用いて検討した。このシステムは単一のスケールされた断熱パラメータ $V$ によって記述される。時間非依存の2成分Schrödinger方程式は両表現で解かれ、MSACの非断熱寿命は波動関数のフラックス計算とブライト・ウィグナー式から決定され、各MSACについて4つの寿命値が得られる。また,両表現における時間依存Schrödinger方程式を解き,波動関数の崩壊からMSAC寿命を導出する。MSAC寿命の6つの非摂動的値の集合はよく一致し、アプローチを検証する。断熱パラメータ $V$ が約10倍に増加するにつれて、MSACの性質はかろうじて安定な状態から高度に安定な状態へと移行し、寿命はおよそ10桁増加する。いくつかの領域における寿命の $\nu$ 依存性について論じる。時間依存摂動理論は、非摂動結果から $\lesssim 30\%$ ずれる近似寿命を与えるのに対し、半古典的なランダウ・ツェナートンネル方程式に基づく予測は、研究された $V$ と $\nu$ の範囲で最大20倍ずれる。結果は、交差し結合したポテンシャルエネルギー曲線上に量子状態を持つ多くの原子系と分子系に関係している。 Avoided crossings of level pairs with opposite slopes can form potential energy curves for the external degree of freedom of quantum particles. We investigate nonadiabatic decay of metastable states on such avoided crossings (MSACs) using diabatic and adiabatic representations. The system is described by a single scaled adiabaticity parameter, $V$. The time-independent two-component Schrödinger equation is solved in both representations, and the nonadiabatic lifetimes of MSACs are determined from a wave-function flux calculation and from the Breit-Wigner formula, leading to four lifetime values for each MSAC. We also solve the time-dependent Schrödinger equation in both pictures and derive the MSAC lifetimes from wave-function decay. The sets of six non-perturbative values for the MSAC lifetimes agree well, validating the approaches. As the adiabaticity parameter $V$ is increased by about a factor of ten, the MSAC character transitions from marginally to highly stable, with the lifetimes increasing by about ten orders of magnitude. The $\nu$-dependence of the lifetimes in several regimes is discussed. Time-dependent perturbation theory is found to yield approximate lifetimes that deviate by $\lesssim 30\%$ from the non-perturbative results, while predictions based on the semi-classical Landau-Zener tunneling equation are found to be up to a factor of twenty off, over the ranges of $V$ and $\nu$ studied. The results are relevant to numerous atomic and molecular systems with quantum states on intersecting, coupled potential energy curves.	翻訳日:2023-02-27 15:54:59 公開日:2022-01-28
# 開群系におけるデコヒーレンスフリー部分空間の断熱制御 Adiabatic Control of Decoherence-Free-Subspaces in an Open Collective System ( http://arxiv.org/abs/2201.12379v1 ) ライセンス: Link先を確認	Jarrod T. Reilly, Simon B. Jäger, John Cooper, Murray J. Holland	(参考訳) 本稿では,散逸的なキャビティ内でデコヒーレンスフリー部分空間 (DFS) を用いて原子アンサンブルを断熱的に制御する手法を提案する。我々は,アンサンブルの放射振幅と破壊的に干渉する場をキャビティに注入することにより,システムのリンドブラッドジャンプ演算子の特定の固有状態を設計できる。従来の断熱的 DFS 提案とは対照的に,提案手法は集団的デコヒーレンスの存在下で DFS を作成する。したがって、量子情報科学やメトロロジーに利用できる、高い多粒子エンタングルメントを持つ状態を設計することができる。さらに,いわゆる断熱条件から得られる、起こりうるダイアバティック進化の知識を利用する,より最適化された駆動方式を示す。これにより、アンサンブル内の原子数に依存しない時間スケールで、非常に高い忠実度を持つ所望の状態へと進化することができる。 DFS固有状態を断熱的に設計することにより、この手法は、散逸のみを用いて所望の状態に減衰する従来のスキームよりも高速な状態準備を可能にする。 We propose a method to adiabatically control an atomic ensemble using a decoherence-free subspace (DFS) within a dissipative cavity. We can engineer a specific eigenstate of the system's Lindblad jump operators by injecting a field into the cavity which destructively interferes with the emission amplitude of the ensemble. In contrast to previous adiabatic DFS proposals, our scheme creates a DFS in the presence of collective decoherence. We therefore have the ability to engineer states that have high multi-particle entanglement, which may be exploited for quantum information science or metrology. We further demonstrate a more optimized driving scheme that utilizes the knowledge of possible diabatic evolution gained from the so-called adiabatic criteria. This allows us to evolve to a desired state with exceptionally high fidelity on a time scale that does not depend on the number of atoms in the ensemble. By engineering the DFS eigenstate adiabatically, our method allows for faster state preparation than previous schemes that rely on damping into a desired state solely using dissipation.	翻訳日:2023-02-27 15:54:30 公開日:2022-01-28
# $\mathbb{Z}_2$対称性を持つ巡回アーベル格子ゲージ理論の電磁双対性 Electric-magnetic duality of $\mathbb{Z}_2$ symmetry enriched cyclic Abelian lattice gauge theory ( http://arxiv.org/abs/2201.12361v1 ) ライセンス: Link先を確認	Zhian Jia, Dagomir Kaszlikowski	(参考訳) キタエフの量子二重モデルはダイクラーフ-ウィッテン位相量子場理論(TQFT)の格子ゲージ理論による実現であり、その位相的に保護された基底状態空間は位相量子計算と位相量子記憶に広く応用されている。我々は、圏的枠組みにおいて、巡回アーベル群に対するこのモデルの $\mathbb{Z}_2$ 対称性付加型の一般化を調べ、明示的なハミルトニアン構成を示す。このモデルは、$\mathbb{Z}_2$対称性リッチトポロジカル位相(SET)の格子実現を提供する。我々は、電磁(EM)双対性対称性が特別な場合である位相のカテゴリー対称性について詳細に論じる。対称性欠陥の側面を, UBFC ($G$-crossed Unitary Braided fusion category) を用いて検討した。また, 対応するエニオン凝縮を決定することにより, ギャップ付き境界と境界バルク双対性についても検討した。そして、これらのSET相に対するEM双対性の明示的な格子実現を慎重に構築する。最後に、トポロジカル量子計算とトポロジカルメモリ理論におけるそれらの可能性について論じる。 Kitaev's quantum double model is a lattice gauge theoretic realization of Dijkgraaf-Witten topological quantum field theory (TQFT); its topologically protected ground state space has broad applications for topological quantum computation and topological quantum memory. We investigate the $\mathbb{Z}_2$ symmetry enriched generalization of the model for the cyclic Abelian group in a categorical framework and present an explicit Hamiltonian construction. This model provides a lattice realization of the $\mathbb{Z}_2$ symmetry enriched topological (SET) phase. We discuss in detail the categorical symmetry of the phase, for which the electric-magnetic (EM) duality symmetry is a special case. The aspects of symmetry defects are investigated using the $G$-crossed unitary braided fusion category (UBFC). By determining the corresponding anyon condensation, the gapped boundaries and boundary-bulk duality are also investigated. Then we carefully construct the explicit lattice realization of EM duality for these SET phases. Finally, their potential applications in topological quantum computation and topological memory theories are discussed.	翻訳日:2023-02-27 15:53:52 公開日:2022-01-28
# 最近の配車会社参入後のミュンヘン市における交通モードの嗜好性の検討 Exploring Preferences for Transportation Modes in the City of Munich after the Recent Incorporation of Ride-Hailing Companies ( http://arxiv.org/abs/2201.13284v1 ) ライセンス: Link先を確認	Maged Shoman and Ana Tsui Moreno	(参考訳) 近年の配車(RH)企業の成長は、多くの点で都市移動に影響を与えている。このようなサービスのメリットに関する広範な主張にもかかわらず、この話題に関する研究は限られている。本稿では、ミュンヘンの交通利用者のRHサービスに対する支払い意欲を評価する。 RH企業から直接データを取得することの難しさを踏まえ、表明選好調査を設計した。データセットには500人の通勤者からの回答が含まれている。 RHサービスとそれに類似するモード(自動車と公共交通)を用いた8kmの移動シナリオにおける社会人口統計学的属性,現在の移動行動および交通モードの嗜好を収集した。所得グループ間でRHサービスを利用する際の時間とコストの係数を推定するために多項ロジットモデルを用い、これをRHの時間価値(VOT)の推定に使用した。モデルの結果、RHサービスは18歳から39歳の層、大人数の世帯、自動車保有数の少ない世帯で人気があることが示された。高所得グループは、RHサービスの利用に対してより多くを支払う意思がある。ミュンヘン市におけるRHサービスのモーダルスプリットへの影響を検討するため、インクリメンタルロジットを用いた既存のネストロジットモード選択モデルにRHを新しいモードとして組み込んだ。旅行時間、旅行費、VOTは、通勤者がRHと最も近いモードであるメトロのいずれかを選択する際の指標として用いられた。 4つの異なる混雑レベルと4つの価格レベルで合計20のシナリオを評価し、許容されるコストと時間のトレードオフに応じた需要を反映した。 The growth of ridehailing (RH) companies over the past few years has affected urban mobility in numerous ways. Despite widespread claims about the benefits of such services, limited research has been conducted on the topic. This paper assesses the willingness of Munich transportation users to pay for RH services. Realizing the difficulty of obtaining data directly from RH companies, a stated preference survey was designed. The dataset includes responses from 500 commuters. Sociodemographic attributes, current travel behavior and transportation mode preference in an 8 km trip scenario using RH service and its similar modes (auto and transit), were collected. A multinomial logit model was used to estimate the time and cost coefficients for using RH services across income groups, which was then used to estimate the value of time (VOT) for RH. The model results indicate that RH services are popular among those aged 18 to 39, larger households and households with fewer autos. Higher income groups are also willing to pay more for using RH services. To examine the impact of RH services on modal split in the city of Munich, we incorporated RH as a new mode into an existing nested logit mode choice model using an incremental logit. Travel time, travel cost and VOT were used as measures for the choice commuters make when choosing between RH and its closest mode, metro. A total of 20 scenarios were evaluated at four different congestion levels and four price levels to reflect the demand in response to acceptable costs and time tradeoffs.	翻訳日:2023-02-19 14:34:56 公開日:2022-01-28
# 統計的匿名性:ユーザーを再識別せずに再識別リスクを定量化する Statistical anonymity: Quantifying reidentification risks without reidentifying users ( http://arxiv.org/abs/2201.12306v1 ) ライセンス: Link先を確認	Gecia Bravo-Hermsdorff, Robert Busa-Fekete, Lee M. Gunderson, Andrés Muñoz Medina, Umar Syed	(参考訳) データ匿名化は、参加者の再識別を防ぐためのプライバシ保護データリリースに対するアプローチであり、ノイズの多いデータを許容できないアプリケーションにおいて、差分プライバシーに対する重要な代替手段である。リリースデータに$k$-匿名性を強制する既存のアルゴリズムは、匿名化を実行するキュレーターが元のデータに完全にアクセスできると仮定している。このアクセスを制限する理由は、望ましくないものから実現不可能なものまで様々である。本稿は,$k$-匿名性の統計的概念を維持しつつ,キュレーターに置かれる信頼を減らすための,目的,メトリクス,プロトコル,拡張のアイデアを探求する。このようなフレームワークの主な目的として,信頼(キュレーターに提供する情報量)とプライバシ(参加者の匿名性)を提案する。我々は、これらの目標を達成することを目的としたプロトコルのクラスを説明し、その過程で新たなプライバシー指標を提案し、関連する境界を証明する。最後に、中央キュレーターの必要性を完全に排除するこの作業の自然な拡張について論じる。 Data anonymization is an approach to privacy-preserving data release aimed at preventing participant reidentification, and it is an important alternative to differential privacy in applications that cannot tolerate noisy data. Existing algorithms for enforcing $k$-anonymity in the released data assume that the curator performing the anonymization has complete access to the original data. Reasons for limiting this access range from undesirability to complete infeasibility. This paper explores ideas -- objectives, metrics, protocols, and extensions -- for reducing the trust that must be placed in the curator, while still maintaining a statistical notion of $k$-anonymity. We suggest trust (amount of information provided to the curator) and privacy (anonymity of the participants) as the primary objectives of such a framework. We describe a class of protocols aimed at achieving these goals, proposing new metrics of privacy in the process, and proving related bounds. We conclude by discussing a natural extension of this work that completely removes the need for a central curator.	翻訳日:2023-02-19 14:33:27 公開日:2022-01-28
# TikTokのパーソナライズ要因に関する実証的研究 An Empirical Investigation of Personalization Factors on TikTok ( http://arxiv.org/abs/2201.12271v1 ) ライセンス: Link先を確認	Maximilian Boeker, Aleksandra Urman	(参考訳) tiktokは現在、急速に成長しているソーシャルメディアプラットフォームであり、月間アクティブユーザー数は10億人を超えている。 TikTokのアルゴリズムがプラットフォームの成功とコンテンツの配布に重要であるにもかかわらず、アルゴリズムの実証的な分析はほとんど行われていない。私たちの研究は、この研究ギャップを埋めるための基礎を築いた。当社が開発したカスタムアルゴリズムを用いたsock-puppet監査手法を用いて,tiktok,フォロー機能,いいね!機能へのアクセスに使用される言語とロケーションの効果と,ユーザが特定の投稿を長く見ることによって推奨されるコンテンツがどう変化するかをテストおよび分析した。テストされたすべての要素がTikTokユーザに推奨されるコンテンツに影響を与える証拠を提供する。さらに,フォロー機能の影響が最も強く,追従機能やビデオ視聴率が高いことがわかった。また,tiktokにおけるフィルタバブルの形成と問題コンテンツの拡散の文脈において,本研究の意義について考察する。 TikTok currently is the fastest growing social media platform with over 1 billion active monthly users of which the majority is from generation Z. Arguably, its most important success driver is its recommendation system. Despite the importance of TikTok's algorithm to the platform's success and content distribution, little work has been done on the empirical analysis of the algorithm. Our work lays the foundation to fill this research gap. Using a sock-puppet audit methodology with a custom algorithm developed by us, we tested and analysed the effect of the language and location used to access TikTok, follow- and like-feature, as well as how the recommended content changes as a user watches certain posts longer than others. We provide evidence that all the tested factors influence the content recommended to TikTok users. Further, we identified that the follow-feature has the strongest influence, followed by the like-feature and video view rate. We also discuss the implications of our findings in the context of the formation of filter bubbles on TikTok and the proliferation of problematic content.	翻訳日:2023-02-19 14:33:10 公開日:2022-01-28
# 正当な決定支援とは何か? What is Legitimate Decision Support? ( http://arxiv.org/abs/2201.12071v1 ) ライセンス: Link先を確認	Yves Meinard, Alexis Tsoukiàs	(参考訳) 意思決定支援(英: decision support)とは、利用可能な理論知識と経験的データに基づいて、問題に直面した意思決定者への勧告を提供する科学と関連する実践である。この活動は数学的な問題の解決とアルゴリズムの考案に関係していると見なされることが多いが、本質的には経験的かつ社会的に枠づけられた活動であり、クライアントとアナリスト、そしてそれらと関係する第三者との相互作用が重要な役割を果たす。 80年代以降、意思決定支援のこの側面を分析する文献は、妥当性と正当性という2つの概念によって構成されてきた。妥当性はクライアントとアナリストの相互作用に焦点が当てられているが、正当性は、組織的状況、全体的な問題状況、環境、文化、歴史といった、より広い視点を指す。その重要性にも拘わらず、この概念は決定支援の文献にふさわしい関心を受けていない。本論文は,このギャップを埋めることを目的とする。そこで我々は,意思決定支援の文脈において有効な正当性の概念について,他の分野の文献を精査する。本稿では,このレビューに基づき,文献で見いだされた関連貢献を包含して,意思決定支援の文脈に適応した正当性に関する一般的な理論を提案する。この一般的な理論によれば、正当な意思決定支援介入とは、決定支援提供者が2つの条件を満たす正当化を行うものである。 (i)決定支援提供者の対話相手を効果的に説得し(有効性条件)、 (ii)できるだけ多く多様な反論の積極的な引き出しを中心に組織されている(真実性条件)。その概念的単純さにもかかわらず、この意味で理解されている正当性は非常に厳格な要件であり、本稿で概説する野心的な研究の道を開く。 Decision support is the science and associated practice that consist in providing recommendations to decision makers facing problems, based on available theoretical knowledge and empirical data. Although this activity is often seen as being concerned with solving mathematical problems and conceiving algorithms, it is essentially an empirical and socially framed activity, where interactions between clients and analysts, and between them and concerned third parties, play a crucial role. Since the 80s, two concepts have structured the literature devoted to analysing this aspect of decision support: validity and legitimacy. Whereas validity is focused on the interactions between the client and the analyst, legitimacy refers to the broader picture: the organisational context, the overall problem situation, the environment, culture, history. Despite its importance, this concept has not received the attention it deserves in the literature in decision support. The present paper aims at filling this gap. For that purpose, we review the literature in other disciplines relevant to elaborate a concept of legitimacy useful in decision support contexts. Based on this review, we propose a general theory of legitimacy, adapted to decision support contexts, encompassing the relevant contributions we found in the literature. According to this general theory, a legitimate decision support intervention is one for which the decision support provider produces a justification that satisfies two conditions: (i) it effectively convinces the decision support provider's interlocutors (effectiveness condition) and (ii) it is organised around the active elicitation of as many and as diverse counterarguments as possible (truthfulness condition). Despite its conceptual simplicity, legitimacy, understood in this sense, is a very exacting requirement, opening ambitious research avenues that we delineate.	翻訳日:2023-02-19 14:32:51 公開日:2022-01-28
# マルチエージェント強化学習における異種エージェントのパラメータ共有 Parameter Sharing For Heterogeneous Agents in Multi-Agent Reinforcement Learning ( http://arxiv.org/abs/2005.13625v7 ) ライセンス: Link先を確認	J. K. Terry, Nathaniel Grammel, Sanghyun Son, Benjamin Black	(参考訳) パラメータ共有は、各エージェントが独立して、すべてのポリシー間で完全に共有されたパラメータを持つポリシーを学習するものである。残念ながら、すべてのエージェントが同じポリシーネットワークを共有しているので、異なるポリシーやタスクを学べません。この問題は、観察にエージェント特異的な指標信号を加えることで実験的に回避され、「エージェント表示」と呼ばれる。エージェント表示は制限されているが、修正なしでは、アクション空間や観測空間が不均一な環境にパラメータ共有を適用することはできない。この研究はエージェント指示の概念を形式化し、それが最適ポリシーへの収束を可能にすることを初めて証明する。次に,不均一な観測と行動空間における学習へのパラメータ共有の拡張手法を正式に導入し,これらの手法が最適ポリシーへの収束を可能にすることを示す。最後に,関数を経験的に導入する方法を実験的に検証し,多数の異なるエージェント表示方式のグラフィカルな観測空間に対する経験的有効性について検討した。 Parameter sharing, where each agent independently learns a policy with fully shared parameters between all policies, is a popular baseline method for multi-agent deep reinforcement learning. Unfortunately, since all agents share the same policy network, they cannot learn different policies or tasks. This issue has been circumvented experimentally by adding an agent-specific indicator signal to observations, which we term "agent indication." Agent indication is limited, however, in that without modification it does not allow parameter sharing to be applied to environments where the action spaces and/or observation spaces are heterogeneous. This work formalizes the notion of agent indication and proves that it enables convergence to optimal policies for the first time. Next, we formally introduce methods to extend parameter sharing to learning in heterogeneous observation and action spaces, and prove that these methods allow for convergence to optimal policies. Finally, we experimentally confirm that the methods we introduce function empirically, and conduct a wide array of experiments studying the empirical efficacy of many different agent indication schemes for graphical observation spaces.	翻訳日:2022-11-28 07:53:25 公開日:2022-01-28
# 分散トレーニングにおける最適複雑性 Optimal Complexity in Decentralized Training ( http://arxiv.org/abs/2006.08085v4 ) ライセンス: Link先を確認	Yucheng Lu, Christopher De Sa	(参考訳) 分散化は、並列機械学習システムをスケールアップする有望な方法である。本稿では、確率的非凸設定において、そのような手法の反復複雑性の厳密な下限を提供する。我々の下限は、D-PSGDのような多くの既存の分散トレーニングアルゴリズムの既知の収束率の理論的ギャップを明らかにしている。我々は、この下限がきつく達成可能であることを構築によって証明する。この知見に動機づけられて,我々はさらに,対数ギャップだけで下限を達成する,実用的なゴシップ型分散アルゴリズムであるdetagを提案する。経験的に,画像分類タスクにおけるdetagと他の分散アルゴリズムを比較し,detagがベースライン,特に非シャッフルデータやスパースネットワークよりも高速に収束することを示す。 Decentralization is a promising method of scaling up parallel machine learning systems. In this paper, we provide a tight lower bound on the iteration complexity for such methods in a stochastic non-convex setting. Our lower bound reveals a theoretical gap in known convergence rates of many existing decentralized training algorithms, such as D-PSGD. We prove by construction this lower bound is tight and achievable. Motivated by our insights, we further propose DeTAG, a practical gossip-style decentralized algorithm that achieves the lower bound with only a logarithm gap. Empirically, we compare DeTAG with other decentralized algorithms on image classification tasks, and we show DeTAG enjoys faster convergence compared to baselines, especially on unshuffled data and in sparse networks.	翻訳日:2022-11-21 02:49:07 公開日:2022-01-28
# バイナリ分類のための量子判別器 Quantum Discriminator for Binary Classification ( http://arxiv.org/abs/2009.01235v3 ) ライセンス: Link先を確認	Prasanna Date and Wyatt Smith	(参考訳) 量子コンピュータは、高次元空間において比較的迅速に動作できるユニークな能力を持っている。本研究では,量子コンピュータが高次元空間で動作する能力を活用する量子識別器(Quantum Discriminator)と呼ばれる新しい量子機械学習モデルを提案する。量子判別器は、O(N logN)時間で量子古典ハイブリッドアルゴリズムを用いて訓練され、線形時間で普遍量子コンピュータ上で推論を行う。量子判別器は、ゼロ状態に初期化された予測キュービットと共に所定のデータムから抽出されたバイナリ特徴を入力とし、予測ラベルを出力する。我々は、irisデータセット上での性能を分析し、量子判別器が99%の精度が得られることを示す。 Quantum computers have the unique ability to operate relatively quickly in high-dimensional spaces -- this is sought to give them a competitive advantage over classical computers. In this work, we propose a novel quantum machine learning model called the Quantum Discriminator, which leverages the ability of quantum computers to operate in the high-dimensional spaces. The quantum discriminator is trained using a quantum-classical hybrid algorithm in O(N logN) time, and inferencing is performed on a universal quantum computer in linear time. The quantum discriminator takes as input the binary features extracted from a given datum along with a prediction qubit initialized to the zero state and outputs the predicted label. We analyze its performance on the Iris data set and show that the quantum discriminator can attain 99% accuracy in simulation.	翻訳日:2022-10-22 18:52:13 公開日:2022-01-28
# 交互K平均によるビクラスタリング Biclustering with Alternating K-Means ( http://arxiv.org/abs/2009.04550v3 ) ライセンス: Link先を確認	Nicolas Fraiman, Zichao Li	(参考訳) ビクラスタリングは、データマトリックスの行と列を、サブグループ内の行と列が同様のパターンを示すように、異なるサブグループに同時にクラスタ化するタスクである。本稿では,ブロック対角二クラスターの生成事例について考察する。我々は,経験的クラスタリングリスクを最小限に抑えるというアイデアに基づいて,ビクラスタリング問題の新たな定式化を行う。経験的クラスタリングリスクに関して一貫性のある結果を開発し,証明する。最適化問題は本質的に組合せ的であるため、大域的な最小値の探索は計算的に難解である。そこで本研究では,カラムと行間のk-meansクラスタリングアルゴリズムの適応バージョンを交互に使用することにより,局所最小値を求める,シンプルで斬新なアルゴリズムを提案する。我々は,シミュレーションデータと実世界の遺伝子発現データセットを用いて,アルゴリズムの性能を他のビクラスタリング手法と比較した。その結果,本アルゴリズムは,データ中の有意義な構造を検知し,様々な設定や状況において競合する2クラスタリング手法より優れていることを示す。 Biclustering is the task of simultaneously clustering the rows and columns of the data matrix into different subgroups such that the rows and columns within a subgroup exhibit similar patterns. In this paper, we consider the case of producing block-diagonal biclusters. We provide a new formulation of the biclustering problem based on the idea of minimizing the empirical clustering risk. We develop and prove a consistency result with respect to the empirical clustering risk. Since the optimization problem is combinatorial in nature, finding the global minimum is computationally intractable. In light of this fact, we propose a simple and novel algorithm that finds a local minimum by alternating the use of an adapted version of the k-means clustering algorithm between columns and rows. We evaluate and compare the performance of our algorithm to other related biclustering methods on both simulated data and real-world gene expression data sets. The results demonstrate that our algorithm is able to detect meaningful structures in the data and outperform other competing biclustering methods in various settings and situations.	翻訳日:2022-10-20 08:56:48 公開日:2022-01-28
# 衝突バイアスによる分類器フェアネスの評価 Assessing Classifier Fairness with Collider Bias ( http://arxiv.org/abs/2010.03933v2 ) ライセンス: Link先を確認	Zhenlong Xu (1), Ziqi Xu (1), Jixue Liu (1), Debo Cheng (1), Jiuyong Li (1), Lin Liu (1), Ke Wang (2) ((1) STEM, University of South Australia, Adelaide, Australia, (2) Simon Fraser University, Burnaby, Canada) Ziqi Xu and Zhenlong Xu contributed equally to this paper	(参考訳) 日々の意思決定プロセスにおける機械学習技術の適用が増えているため、アルゴリズムによる意思決定の公平性が懸念されている。本稿では, 公平性評価に見せかけの関連を生じさせる衝突型バイアスの問題を扱い, 衝突型バイアスを回避する公平性評価を導くための定理を考案する。監査機関が訓練済みの分類器を監査する実世界の応用について検討する。本研究では, 開発した定理を用いて非バイアス評価アルゴリズムを提案する。実験およびシミュレーションにより, 提案手法は, 評価における衝突バイアスを有意に低減し, 訓練された分類器の監査に有望であることが示された。 The increasing application of machine learning techniques in everyday decision-making processes has brought concerns about the fairness of algorithmic decision-making. This paper concerns the problem of collider bias which produces spurious associations in fairness assessment and develops theorems to guide fairness assessment avoiding the collider bias. We consider a real-world application of auditing a trained classifier by an audit agency. We propose an unbiased assessment algorithm by utilising the developed theorems to reduce collider biases in the assessment. Experiments and simulations show the proposed algorithm reduces collider biases significantly in the assessment and is promising in auditing trained classifiers.	翻訳日:2022-10-09 11:23:08 公開日:2022-01-28
# 検索:バイリンガル辞書がニューラルマシン翻訳を改善する Look It Up: Bilingual Dictionaries Improve Neural Machine Translation ( http://arxiv.org/abs/2010.05997v2 ) ライセンス: Link先を確認	Xing Jie Zhong, and David Chiang	(参考訳) ニューラルマシン翻訳(nmt)の品質は向上しているが、希少な単語が問題となっている。人間にとって、レアワード問題の解法は、長い間辞書であったが、辞書は直接NMTに組み込むことはできない。本稿では,辞書の定義をレアな単語に"アタッチ"する新しい手法について述べる。二言語辞書を用いて最大1.8 bleuの改善を示す。 Despite advances in neural machine translation (NMT) quality, rare words continue to be problematic. For humans, the solution to the rare-word problem has long been dictionaries, but dictionaries cannot be straightforwardly incorporated into NMT. In this paper, we describe a new method for "attaching" dictionary definitions to rare words so that the network can learn the best way to use them. We demonstrate improvements of up to 1.8 BLEU using bilingual dictionaries.	翻訳日:2022-10-08 05:21:16 公開日:2022-01-28
# コードとテキストによる計算可能科学モデルの自動作成と人力支援 Automated Creation and Human-assisted Curation of Computable Scientific Models from Code and Text ( http://arxiv.org/abs/2202.13739v1 ) ライセンス: Link先を確認	Varish Mulwad, Andrew Crapo, Vijay S. Kumar, James Jobin, Alfredo Gabaldon, Nurali Virani, Sharad Dixit, Narendra Joshi	(参考訳) 科学モデルは複雑なシステムの振る舞いをよりよく理解し予測するための鍵を握る。科学的モデルの最も包括的な表現は、そのユーザビリティを支える重要な仮定やパラメータを含むが、通常は関連するソースコードやドキュメントに埋め込まれ、様々な(潜在的に時代遅れの)プログラミングプラクティスや言語が用いられる。ドメインの専門家は、コードに精通していない場合、科学的モデルの実装を完全に理解することができません。さらに、急速な研究と開発イテレーションは、絶え間なく進化する科学モデルコードベースに追いつくのを難しくします。これらの課題に対処するため、我々は、関連するインラインコメントや外部文書の文脈でモデルのコードを解析する計算可能な科学モデルの知識グラフの自動作成と人力によるキュレーションシステムを開発する。本システムでは,知識駆動型およびデータ駆動型アプローチを用いて,コードから関連する概念を,テキスト文書から方程式を識別・抽出し,ドメイン用語を用いてモデルに意味的注釈を付与する。これらのモデルは実行可能なPython関数に変換され、さらに複雑なワークフローに構成され、異なる形式のドメイン駆動質問に答えることができる。我々は、NASAのHypersonic Aerodynamics Webサイトから派生したコードと関連するテキストのデータセットを用いて実験結果を示す。 Scientific models hold the key to better understanding and predicting the behavior of complex systems. The most comprehensive manifestation of a scientific model, including crucial assumptions and parameters that underpin its usability, is usually embedded in associated source code and documentation, which may employ a variety of (potentially outdated) programming practices and languages. Domain experts cannot gain a complete understanding of the implementation of a scientific model if they are not familiar with the code. Furthermore, rapid research and development iterations make it challenging to keep up with constantly evolving scientific model codebases. To address these challenges, we develop a system for the automated creation and human-assisted curation of a knowledge graph of computable scientific models that analyzes a model's code in the context of any associated inline comments and external documentation. Our system uses knowledge-driven as well as data-driven approaches to identify and extract relevant concepts from code and equations from textual documents to semantically annotate models using domain terminology. These models are converted into executable Python functions and then can further be composed into complex workflows to answer different forms of domain-driven questions. We present experimental results obtained using a dataset of code and associated text derived from NASA's Hypersonic Aerodynamics website.	翻訳日:2022-03-06 13:08:17 公開日:2022-01-28
# ディープラーニングを活用した自動能力評価 Towards automated Capability Assessment leveraging Deep Learning ( http://arxiv.org/abs/2202.04051v1 ) ライセンス: Link先を確認	Raoul Schönhof and Manuel Fechter	(参考訳) 製造業における経済効率の向上を目指して、自動化の度合いの向上が鍵となる。しかしながら、専用プロセスのための自動組立ソリューションの技術的実現可能性を評価することは困難であり、与えられた製品部品の形状によってしばしば決定される。自動化の実現可能性に関する決定的な基準は、単一部分の分離と分離、最終位置でのコンポーネントの自己調整の能力である。この実現可能性を評価するために,Fraunhofer 研究者によるアンケートに基づく評価手法を開発した。しかし、結果は、単一のエンジニアが評価を行うという暗黙の知識と経験に強く依存する。本稿では,voxelizationを用いた評価を自動化するソフトウェアツールNeuroCADを提案する。この手法によりCADファイルに基づくディープラーニングにより,抽象的かつ生産的なジオメトリ機能の評価が可能となる。 Aiming for a higher economic efficiency in manufacturing, an increased degree of automation is a key enabler. However, assessing the technical feasibility of an automated assembly solution for a dedicated process is difficult and often determined by the geometry of the given product parts. Among others, decisive criteria of the automation feasibility are the ability to separate and isolate single parts or the capability of component self-alignment in final position. To assess the feasibility, a questionnaire-based evaluation scheme has been developed and applied by Fraunhofer researchers. However, the results strongly depend on the implicit knowledge and experience of the single engineer performing the assessment. This paper presents NeuroCAD, a software tool that automates the assessment using voxelization techniques. The approach enables the assessment of abstract and production-relevant geometric features through deep learning based on CAD files.	翻訳日:2022-02-13 14:54:28 公開日:2022-01-28
# (参考訳) 画像分類のための低ランク特徴に基づく二重変換行列学習 Low-rank features based double transformation matrices learning for image classification ( http://arxiv.org/abs/2201.12351v1 ) ライセンス: CC BY 4.0	Yu-Hong Cai, Xiao-Jun Wu, Zhe Chen	(参考訳) 線形回帰は分類タスクで広く使われている教師付き手法である。分類タスクに線形回帰を適用するために,回帰目標を緩和する手法を提案した。しかし、この手法に基づく手法は、データに含まれる複雑な情報によって単一の変換行列の圧力を無視する。この場合、単一の変換行列はフレキシブルな射影を提供するには厳密すぎるため、変換行列に緩和を導入する必要がある。本稿では,潜在低ランク特徴抽出に基づく二重変換行列学習手法を提案する。中心となる考え方は、緩和のために二重変換行列を使い、学習した主特徴と正則な特徴を2方向からラベル空間に共同投影し、単一の変換行列の圧力を共有することである。まず、低ランク特徴を潜在低ランク表現(latlrr)法により学習し、2方向から元のデータを処理する。このプロセスでは、スパースノイズも分離され、射影学習に対する干渉がある程度軽減される。そして、2つの変換行列を導入して2つの特徴を別々に処理し、分類に有用な情報を抽出する。最後に、2つの変換行列は代替最適化法により容易に得ることができる。このような処理により,サンプル中に大量の冗長情報が含まれている場合でも,分類が容易な投影結果を得ることができる。複数のデータセットに対する実験は、特に複雑なシナリオにおいて、分類のためのアプローチの有効性を示す。 Linear regression is a supervised method that has been widely used in classification tasks. In order to apply linear regression to classification tasks, a technique for relaxing regression targets was proposed. However, methods based on this technique ignore the pressure on a single transformation matrix due to the complex information contained in the data. A single transformation matrix in this case is too strict to provide a flexible projection, thus it is necessary to adopt relaxation on transformation matrix. This paper proposes a double transformation matrices learning method based on latent low-rank feature extraction. The core idea is to use double transformation matrices for relaxation, and jointly projecting the learned principal and salient features from two directions into the label space, which can share the pressure of a single transformation matrix. Firstly, the low-rank features are learned by the latent low rank representation (LatLRR) method which processes the original data from two directions. In this process, sparse noise is also separated, which alleviates its interference on projection learning to some extent. 
Then, two transformation matrices are introduced to process the two features separately, and the information useful for the classification is extracted. Finally, the two transformation matrices can be easily obtained by alternate optimization methods. Through such processing, even when a large amount of redundant information is contained in samples, our method can also obtain projection results that are easy to classify. Experiments on multiple data sets demonstrate the effectiveness of our approach for classification, especially for complex scenarios.	翻訳日:2022-02-05 09:03:18 公開日:2022-01-28
# (参考訳) LULC画像解析のための3次元可視化と空間データマイニング 3D Visualization and Spatial Data Mining for Analysis of LULC Images ( http://arxiv.org/abs/2202.00123v1 ) ライセンス: CC BY 4.0	B. G. Kodge	(参考訳) 本研究では,3次元可視化における土地利用土地被覆(LULC)画像解析のための新しいツールの開発を試みた。本研究は主に高分解能LULC衛星画像の空間データマイニング技術を用いて行う。特徴空間の可視化は、画像データのパターンの探索と分類過程と関連する不確実性に関する洞察を可能にする。視覚的データマイニングは、ユーザーが分類プロセスに関与し、結果に対する自信を高め、理解することができるため、画像分類に付加価値を提供する。本研究では,LULC衛星画像の視覚データマイニング(VDM)のための画像分割,k-meansクラスタリング,および3次元可視化ツールの試作を行った。この体積に基づく表現は、特徴空間を球面またはボクセルに分割する。可視化ツールは,インド・マハラシュトラ州ラトゥル地区の高解像度LULC画像の分類研究において,サンプルデータとして用いられている。 The present study attempts to create a new tool for the analysis of Land Use Land Cover (LULC) images in 3D visualization. This study mainly uses spatial data mining techniques on high-resolution LULC satellite imagery. Visualization of feature space allows exploration of patterns in the image data and insight into the classification process and related uncertainty. Visual Data Mining provides added value to image classifications as the user can be involved in the classification process, providing increased confidence in and understanding of the results. In this study, we present a prototype image segmentation, K-Means clustering, and 3D visualization tool for visual data mining (VDM) of LULC satellite imagery with volume visualization. This volume-based representation divides feature space into spheres or voxels. The visualization tool is showcased in a classification study in which high-resolution LULC imagery of Latur district (Maharashtra state, India) is used as sample data.	翻訳日:2022-02-05 09:01:17 公開日:2022-01-28
# (参考訳) テキスト分類のための集合型アクティブラーニングとそのオンラインソーシャルメディアへの応用 Dominant Set-based Active Learning for Text Classification and its Application to Online Social Media ( http://arxiv.org/abs/2202.00540v1 ) ライセンス: CC BY 4.0	Toktam A. Oghaz, Ivan Garibay	(参考訳) オンラインソーシャルメディアにおける自然言語処理(NLP)の最近の進歩は、明らかに大規模なデータセットに負っている。しかし、大量のテキストデータポイント(例えばツイート)のラベル付け、保存、処理は依然として困難である。それに加えて、ヘイトスピーチ検出などのアプリケーションでは、攻撃的コンテンツを含む十分に大きなデータセットをラベル付けすることは、人間のアノテータに対して精神的および感情的に課税することができる。したがって、ラベル付きデータポイントを著しく少ないものにできるNLP手法は非常に興味深い。本稿では,最小のアノテーションコストで大規模未ラベルコーパスのトレーニングに使用できる,プールベースのアクティブラーニング手法を提案する。そこで我々は,局所クラスタ群を特徴空間に配置する手法を提案する。これらの集合はデータの最大結合構造を表す。すると、支配的な集合のどれにも属さないサンプルは、局所クラスタの境界を表すため、モデルのトレーニングに使用されるように選択され、分類することがより困難になる。提案手法は,データセットに依存しないパラメータを持たず,完全なトレーニングデータと同等の分類精度をほぼ達成でき,データポイントも大幅に少ない。さらに,本手法は,最先端のアクティブ学習戦略と比較して高い性能を実現する。さらに,提案アルゴリズムは,不確実性に基づくスコアなどの従来のアクティブな学習スコアを選択基準に組み込むことができる。異なるデータセットと異なるニューラルネットワークアーキテクチャを用いて,本手法の有効性を示す。 Recent advances in natural language processing (NLP) in online social media are evidently owed to large-scale datasets. However, labeling, storing, and processing a large number of textual data points, e.g., tweets, has remained challenging. On top of that, in applications such as hate speech detection, labeling a sufficiently large dataset containing offensive content can be mentally and emotionally taxing for human annotators. Thus, NLP methods that can make the best use of significantly less labeled data points are of great interest. In this paper, we present a novel pool-based active learning method that can be used for the training of large unlabeled corpus with minimum annotation cost. For that, we propose to find the dominant sets of local clusters in the feature space. These sets represent maximally cohesive structures in the data. Then, the samples that do not belong to any of the dominant sets are selected to be used to train the model, as they represent the boundaries of the local clusters and are more challenging to classify. 
Our proposed method does not have any parameters to be tuned, making it dataset-independent, and it can approximately achieve the same classification accuracy as full training data, with significantly fewer data points. Additionally, our method achieves a higher performance in comparison to the state-of-the-art active learning strategies. Furthermore, our proposed algorithm is able to incorporate conventional active learning scores, such as uncertainty-based scores, into its selection criteria. We show the effectiveness of our method on different datasets and using different neural network architectures.	翻訳日:2022-02-05 08:55:14 公開日:2022-01-28
# (参考訳) プライベート(ディープ)学習におけるバウンディングトレーニングデータ再構成 Bounding Training Data Reconstruction in Private (Deep) Learning ( http://arxiv.org/abs/2201.12383v1 ) ライセンス: CC BY 4.0	Chuan Guo, Brian Karrer, Kamalika Chaudhuri, Laurens van der Maaten	(参考訳) 差分プライバシーは、MLにおけるデータ漏洩を防ぐデファクト方法として広く受け入れられており、従来の知恵は、プライバシ攻撃に対する強力な保護を提供することを示している。しかし、既存のDPのセマンティックな保証は、相手の能力を過大評価する可能性があり、メンバーシップステータス自体が非感受性である場合には適用できないメンバーシップ推論に焦点を当てている。本稿では,形式的脅威モデルの下でのトレーニングデータ再構成攻撃に対するDP機構の最初のセマンティック保証を導出する。我々は,renyi differential privacyとfisher information leakという2つの異なるプライバシー会計手法が,データ復元攻撃に対して強い意味的保護を提供することを示した。 Differential privacy is widely accepted as the de facto method for preventing data leakage in ML, and conventional wisdom suggests that it offers strong protection against privacy attacks. However, existing semantic guarantees for DP focus on membership inference, which may overestimate the adversary's capabilities and is not applicable when membership status itself is non-sensitive. In this paper, we derive the first semantic guarantees for DP mechanisms against training data reconstruction attacks under a formal threat model. We show that two distinct privacy accounting methods -- Renyi differential privacy and Fisher information leakage -- both offer strong semantic protection against data reconstruction attacks.	翻訳日:2022-02-05 08:41:42 公開日:2022-01-28
# (参考訳) 加齢黄斑変性診断のための機械学習アルゴリズムの開発 Developing a Machine-Learning Algorithm to Diagnose Age-Related Macular Degeneration ( http://arxiv.org/abs/2201.12384v1 ) ライセンス: CC BY 4.0	Ananya Dua, Pham Hung Minh, Sajid Fahmid, Shikhar Gupta, Sophia Zheng, Vanessa Moyo, Yanran Elisa Xue	(参考訳) 現在、40歳以上の1200万人以上が眼疾患を患っている。最も一般的には、高齢の患者は加齢に伴う黄斑変性(網膜の劣化による中心視のぼやけを引き起こす眼疾患)の影響を受けやすい。前者は、複雑で高価な画像ソフトウェアでしか検出できず、目視検査が行われ、未治療の眼疾患を持つかなりの集団を残し、完全な視力喪失のリスクを負っている。眼疾患に対する機械学習アルゴリズムの使用が提案されている。しかしながら、これらのモデルの開発は、モデル性能を最大化するための適切なモデルとトレーニングパラメータに関する理解の欠如によって制限される。本研究では,n が 0, -1, -2, ... -6 である場合の学習速度 1 * 10^n の6つのモデルを生成し,各モデルに対する f1 スコアを算出した。分析の結果、サンプルの不均衡は機械学習モデルのトレーニングにおいて重要な課題であり、モデル予測性能の真の改善とはならない、トレーニングコストの騙し込みの改善をもたらす可能性があることが示された。この病気の幅広い影響と悪影響を考慮すると、我々は同じことを処理するための機械学習アルゴリズムを開発した。 5000人以上の患者による眼疾患データセットと、その感染した目の画像に基づいて、我々のモデルを訓練した。将来的には、このモデルが特に未資源の地域で広く使われ、眼疾患の診断や人間性の改善に活用されることを願っています。 Today, more than 12 million people over the age of 40 suffer from ocular diseases. Most commonly, older patients are susceptible to age related macular degeneration, an eye disease that causes blurring of the central vision due to the deterioration of the retina. The former can only be detected through complex and expensive imaging software, markedly a visual field test; this leaves a significant population with untreated eye disease and holds them at risk for complete vision loss. The use of machine learning algorithms has been proposed for treating eye disease. However, the development of these models is limited by a lack of understanding regarding appropriate model and training parameters to maximize model performance. In our study, we address these points by generating 6 models, each with a learning rate of 1 * 10^n where n is 0, -1, -2, ... -6, and calculated a f1 score for each of the models. 
Our analysis shows that sample imbalance is a key challenge in training machine learning models and can result in deceptive improvements in training cost that do not translate to true improvements in model predictive performance. Considering the wide-ranging impact of the disease and its adverse effects, we developed a machine learning algorithm to address it. We trained our model on varying eye disease datasets consisting of over 5000 patients, and the pictures of their infected eyes. In the future, we hope this model is used extensively, especially in areas that are under-resourced, to better diagnose eye disease and improve well-being for humanity.	翻訳日:2022-02-05 07:53:49 公開日:2022-01-28
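The sample-imbalance point in the abstract above can be illustrated with a minimal sketch. The labels are hypothetical (95 healthy, 5 diseased) and are not taken from the paper; the learning-rate grid only mirrors the abstract's "1 * 10^n" description, which for n = 0 through -6 gives seven values. The sketch shows how a classifier that always predicts the majority class can reach high accuracy while its F1 score on the disease class is zero.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical imbalanced labels (assumption, not the paper's data):
# 95 healthy eyes (0) and 5 diseased eyes (1).
y_true = [0] * 95 + [1] * 5
# A degenerate model that always predicts "healthy".
y_majority = [0] * 100

# Learning-rate grid as described in the abstract: 1 * 10^n, n = 0, -1, ..., -6.
learning_rates = [10.0 ** n for n in range(0, -7, -1)]

if __name__ == "__main__":
    print("accuracy:", accuracy(y_true, y_majority))
    print("precision, recall, F1:", precision_recall_f1(y_true, y_majority))
    print("number of learning rates:", len(learning_rates))
```

Accuracy rewards the majority-class predictor here, whereas F1 exposes that no diseased eye was found, which is presumably why an F1 score is the metric reported per model.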
# (参考訳) マルチモーダル心内画像分割のための非教師なし領域適応 Few-shot Unsupervised Domain Adaptation for Multi-modal Cardiac Image Segmentation ( http://arxiv.org/abs/2201.12386v1 ) ライセンス: CC BY 4.0	Mingxuan Gu, Sulaiman Vesal, Ronak Kosti, Andreas Maier	(参考訳) 非教師なしドメイン適応(UDA)手法は、ラベル付けされていないターゲットドメインとラベル付けされたソースドメインデータを使用することで、ソースとターゲットドメイン間のギャップを減らすことを目的としている。これにより、新しいドメインに対するUDAメソッドの開発が制限される。本稿では,1つの未ラベル患者サンプルのみを利用できる現実的なシナリオにおいて,UDAの可能性を探る。これをマイショット非教師なしドメイン適応(fuda)と呼ぶ。まず、ソース画像からターゲットスタイルの画像を生成し、ランダム適応インスタンス正規化(rain)のある単一のターゲット患者から多様なターゲットスタイルを探索する。そして、生成された対象画像に教師付きでセグメント化ネットワークを訓練する。実験の結果,FUDAはベースラインに比べて目標領域でのDiceスコアの0.33向上し,より厳密なワンショット設定でDiceスコアの0.28向上を達成できた。私たちのコードは \url{https://github.com/MingxuanGu/Few-shot-UDA} で利用可能です。 Unsupervised domain adaptation (UDA) methods intend to reduce the gap between source and target domains by using unlabeled target domain and labeled source domain data, however, in the medical domain, target domain data may not always be easily available, and acquiring new samples is generally time-consuming. This restricts the development of UDA methods for new domains. In this paper, we explore the potential of UDA in a more challenging while realistic scenario where only one unlabeled target patient sample is available. We call it Few-shot Unsupervised Domain adaptation (FUDA). We first generate target-style images from source images and explore diverse target styles from a single target patient with Random Adaptive Instance Normalization (RAIN). Then, a segmentation network is trained in a supervised manner with the generated target images. Our experiments demonstrate that FUDA improves the segmentation performance by 0.33 of Dice score on the target domain compared with the baseline, and it also gives 0.28 of Dice score improvement in a more rigorous one-shot setting. Our code is available at \url{https://github.com/MingxuanGu/Few-shot-UDA}.	翻訳日:2022-02-05 07:49:36 公開日:2022-01-28
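The results above are reported as Dice score improvements. As a reminder of what that metric measures, here is a minimal sketch of the Dice coefficient for binary masks; the flat-list mask format and the toy masks are assumptions for illustration and are unrelated to the authors' released code.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flat binary masks (0/1 lists)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    # Toy 4-pixel masks (hypothetical): one overlapping pixel out of 2 + 1.
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because the score lies between 0 and 1, a gain of 0.33 corresponds to a third of the full range of the metric.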
# (参考訳) DoubleU-Net++:Vertebraeセグメンテーションのための爆発的マルチスケール機能を備えたアーキテクチャ DoubleU-Net++: Architecture with Exploit Multiscale Features for Vertebrae Segmentation ( http://arxiv.org/abs/2201.12389v1 ) ライセンス: CC BY 4.0	Simindokht Jahangard, Mahdi Bonyani, Abbas Khosravi	(参考訳) 脊椎の正確な分節は、外科医を支援する様々な医学的応用(例えば遠隔手術)において重要な前提条件である。ディープニューラルネットワークの開発が成功した後、最近の研究は脊椎分節の本質的な規則に焦点を当てている。以前の作業には多数のパラメータが含まれており、セグメンテーションは1つのビューに制限されている。 DoubleU-Netに触発されてDoubleU-Net++と呼ばれる新しいモデルを提案し、DenseNetを特徴抽出モジュールとして、CBAM(Convolutional Block Attention Module)から特別注意モジュールを、抽出機能を改善するためにPyramid Squeeze Attention(PSA)モジュールを採用する。我々はVerSe2020とxVertSegデータセットの3つの異なるビュー(sagittal, coronal, axial)で提案モデルを評価する。最新の研究と比較すると,我々のアーキテクチャはより高速に訓練され,評価として高い精度,リコール,およびF1-score(4～6%)を達成し,またVerSe2020データセットでは矢状図が94%以上,コロナビューが94%以上,軸線ビューが93%以上となった。また,xVertSegデータセットでは,矢状視では97%,コロナ視では93%,軸視では96%以上の精度,リコール,F1-scoreを達成している。 Accurate segmentation of the vertebra is an important prerequisite in various medical applications (e.g., telesurgery) to assist surgeons. Following the successful development of deep neural networks, recent studies have focused on the essential rule of vertebral segmentation. Prior works contain a large number of parameters, and their segmentation is restricted to only one view. Inspired by DoubleU-Net, we propose a novel model named DoubleU-Net++ in which DenseNet as feature extractor, the special attention module from the Convolutional Block Attention Module (CBAM), and the Pyramid Squeeze Attention (PSA) module are employed to improve extracted features. We evaluate our proposed model on three different views (sagittal, coronal, and axial) of the VerSe2020 and xVertSeg datasets. Compared with state-of-the-art studies, our architecture trains faster and achieves higher precision, recall, and F1-score (improved by 4-6%); on the VerSe2020 dataset, results were above 94% for the sagittal view, above 94% for the coronal view, and above 93% for the axial view.
Also, for the xVertSeg dataset, we achieved precision, recall, and F1-score of above 97% for the sagittal view, above 93% for the coronal view, and above 96% for the axial view.	翻訳日:2022-02-05 07:44:29 公開日:2022-01-28
# (参考訳) compound-protein相互作用予測法の性能評価に関する知見 Insights into performance evaluation of compound-protein interaction prediction methods ( http://arxiv.org/abs/2202.00001v1 ) ライセンス: CC BY-SA 4.0	Adiba Yaseen (1), Imran Amin (2), Naeem Akhter (1), Asa Ben-Hur (3) and Fayyaz Minhas (4) ((1) Department of Computer and Information Sciences (DCIS), Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad, Pakistan, (2) National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan, (3) Department of Computer Science, Colorado State University, Fort Collins, USA, (4) Tissue Image Analytics Centre, Department of Computer Science, University of Warwick, Coventry, UK)	(参考訳) モチベーション: 複合タンパク質相互作用(CPI)の機械学習による予測は, 薬物設計, スクリーニング, 再資源化研究において重要であり, 湿式ラボアッセイの効率性と費用対効果を向上させることができる。近年,CPI予測因子を報告する多くの研究論文が公表されているが,モデル性能の楽観的評価に繋がる実験設計の問題点が数多く報告されている。結果:本論文では,既存の研究で見落としているCPI予測器の一般化に影響を及ぼすいくつかの重要な要因について分析する。 1. クロスバリデーションにおけるトレーニングとテスト例の類似性。 2. 実験的に検証された否定例がない場合に、否定例を生成するための戦略。 3. 評価プロトコルと性能指標の選択と大規模複合ライブラリのスクリーニングにおけるCPI予測器の現実利用との整合性。既存の最先端手法(CPI-NN)とカーネルベースのアプローチの両方を用いて、CPI予測器の予測性能の評価には、トレーニングとテスト例の類似性について慎重に検討する必要があることが分かった。また、訓練や性能評価のための合成陰性例生成のためのランダムペアリングは、既存の研究で使われているより洗練された戦略と比較して、より一般化された性能を持つモデルに結果をもたらすことを示した。さらに、カーネルベースのアプローチは、そのシンプルな設計にもかかわらず、CPI-NNの予測性能を上回ることが判明した。提案したモデルを用いてSARS-CoV-2 SpikeやHuman ACE2などのタンパク質の複合スクリーニングを行い,そのトップヒットを裏付ける強い証拠を見出した。可用性: https://github.com/adibayaseen/HKRCPI Contact: Fayyaz.minhas@warwick.ac.uk Motivation: Machine learning based prediction of compound-protein interactions (CPIs) is important for drug design, screening and repurposing studies and can improve the efficiency and cost-effectiveness of wet lab assays. Despite the publication of many research papers reporting CPI predictors in recent years, we have observed a number of fundamental issues in experiment design that lead to overoptimistic estimates of model performance.
Results: In this paper, we analyze the impact of several important factors affecting generalization performance of CPI predictors that are overlooked in existing work: 1. Similarity between training and test examples in cross-validation. 2. The strategy for generating negative examples, in the absence of experimentally verified negative examples. 3. Choice of evaluation protocols and performance metrics and their alignment with real-world use of CPI predictors in screening large compound libraries. Using both an existing state-of-the-art method (CPI-NN) and a proposed kernel-based approach, we have found that assessment of predictive performance of CPI predictors requires careful control over similarity between training and test examples. We also show that random pairing for generating synthetic negative examples for training and performance evaluation results in models with better generalization performance in comparison to more sophisticated strategies used in existing studies. Furthermore, we have found that our kernel-based approach, despite its simple design, exceeds the prediction performance of CPI-NN. We have used the proposed model for compound screening of several proteins including SARS-CoV-2 Spike and Human ACE2 proteins and found strong evidence in support of its top hits. Availability: Code and raw experimental results available at https://github.com/adibayaseen/HKRCPI Contact: Fayyaz.minhas@warwick.ac.uk	翻訳日:2022-02-05 07:35:36 公開日:2022-01-28
# (参考訳) Syfer: プライベートデータリリースのためのニューラル難読化 Syfer: Neural Obfuscation for Private Data Release ( http://arxiv.org/abs/2201.12406v1 ) ライセンス: CC BY 4.0	Adam Yala, Victor Quach, Homa Esfahanizadeh, Rafael G. L. D'Oliveira, Ken R. Duffy, Muriel Médard, Tommi S. Jaakkola, Regina Barzilay	(参考訳) プライバシと予測ユーティリティのバランスは、医療におけるマシンラーニングの中心的な課題である。本稿では,再同定攻撃から保護するニューラル難読化法Syferを開発した。 Syferはトレーニングされた層をランダムなニューラルネットワークで構成し、元のデータ(例えばX線)をエンコードすると同時に、エンコードされたデータから診断を予測する能力を維持する。エンコーダのランダム性は、データ所有者のプライベートキーとして振る舞う。 1つの画像を再特定するのに必要な攻撃者の推測回数(ゲスワーク)として、プライバシーを定量化する。ゲスワークを推定するためのコントラスト学習アルゴリズムを提案する。 DP-Imageなどの差分プライベートな手法は,実用性を大きく損なう代償としてプライバシを獲得することを実証的に示す。対照的に、Syferはユーティリティを保ちながら強力なプライバシーを実現している。例えば、DP-Image、Syfer、およびオリジナルのデータで構築されたX線分類器は平均AUCをそれぞれ0.53、0.78、0.86とする。 Balancing privacy and predictive utility remains a central challenge for machine learning in healthcare. In this paper, we develop Syfer, a neural obfuscation method to protect against re-identification attacks. Syfer composes trained layers with random neural networks to encode the original data (e.g., X-rays) while maintaining the ability to predict diagnoses from the encoded data. The randomness in the encoder acts as the private key for the data owner. We quantify privacy as the number of attacker guesses required to re-identify a single image (guesswork). We propose a contrastive learning algorithm to estimate guesswork. We show empirically that differentially private methods, such as DP-Image, obtain privacy at a significant loss of utility. In contrast, Syfer achieves strong privacy while preserving utility. For example, X-ray classifiers built with DP-Image, Syfer, and original data achieve average AUCs of 0.53, 0.78, and 0.86, respectively.	翻訳日:2022-02-05 07:19:26 公開日:2022-01-28
# (参考訳) モバイル介入のためのネットワークレストレストレストマルチアームバンディット Networked Restless Multi-Armed Bandits for Mobile Interventions ( http://arxiv.org/abs/2201.12408v1 ) ライセンス: CC BY 4.0	Han-Ching Ou, Christoph Siebenbrunner, Jackson Killian, Meredith B Brooks, David Kempe, Yevgeniy Vorobeychik, Milind Tambe	(参考訳) 幅広い種類のモバイル介入問題に動機づけられ,ネットワーク効果を持つレストレス・マルチアーム・バンディット(rmabs)を提案し,検討した。我々のモデルでは、アームは部分的にリチャージされ、グラフを介して接続されているため、一方のアームを引くことで隣接するアームの状態も改善され、ネットワーク効果のない完全リチャージバンディットの設定が大幅に拡張される。モバイル介入では、ネットワーク効果は通常の人口移動(家と仕事の通勤など)によって生じることがある。 RMABのネットワーク効果は,既存の解法では考慮されていない強い報酬結合をもたらすことを示す。本稿では,ネットワーク化RMABに対する新しい解法を提案し,介入効果の構造に対する自然な仮定の下で生じる凹凸特性を利用する。理想化された環境でのアプローチの最適性に十分な条件を提供し,実世界グラフを用いた3つのモバイル介入領域における最先端のベースラインを経験的に上回っていることを示す。 Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects. In our model, arms are partially recharging and connected through a graph, so that pulling one arm also improves the state of neighboring arms, significantly extending the previously studied setting of fully recharging bandits with no network effects. In mobile interventions, network effects may arise due to regular population movements (such as commuting between home and work). We show that network effects in RMABs induce strong reward coupling that is not accounted for by existing solution methods. We propose a new solution approach for networked RMABs, exploiting concavity properties which arise under natural assumptions on the structure of intervention effects. We provide sufficient conditions for optimality of our approach in idealized settings and demonstrate that it empirically outperforms state-of-the art baselines in three mobile intervention domains using real-world graphs.	翻訳日:2022-02-05 06:53:05 公開日:2022-01-28
# (参考訳) 社会会話におけるエンティティ中心コンテキスト追跡への統一的アプローチ A Unified Approach to Entity-Centric Context Tracking in Social Conversations ( http://arxiv.org/abs/2201.12409v1 ) ライセンス: CC BY 4.0	Ulrich Rückert, Srinivas Sunkara, Abhinav Rastogi, Sushant Prakash, Pranav Khaitan	(参考訳) 人間と人間の会話では、コンテキストトラッキングは重要なエンティティを識別し、その特性と関係を追跡する。これはスロットタグ、コア参照解決、複数の参照の解決、エンティティリンクなど、いくつかのサブタスクを含む難しい問題である。本稿では,これまで述べたエンティティ参照,それらの特性,それらの関係を含むエンティティリポジトリによって,会話コンテキストを表現したエンドツーエンドモデリングタスクとして,この問題にアプローチする。リポジトリはターンバイターンで更新されるため、長い会話であっても、トレーニングと推論が計算的に効率的になる。本稿は,この枠組みを2つの方法で検討するための基礎研究を行う。まず、人間と位置アノテーションによるコンテキスト追跡のための、大規模な人間と人間の会話コーパスであるContrackをリリースする。平均11.8ターン、5.8エンティティ、15.2参照を持つ7000以上の会話を含んでいる。次に、コンテキストトラッキングのためのニューラルネットワークアーキテクチャをオープンソース化します。最後に、このネットワークをサブタスクの最先端のアプローチと比較し、関連するトレードオフに関する結果を報告します。 In human-human conversations, Context Tracking deals with identifying important entities and keeping track of their properties and relationships. This is a challenging problem that encompasses several subtasks such as slot tagging, coreference resolution, resolving plural mentions and entity linking. We approach this problem as an end-to-end modeling task where the conversational context is represented by an entity repository containing the entity references mentioned so far, their properties and the relationships between them. The repository is updated turn-by-turn, thus making training and inference computationally efficient even for long conversations. This paper lays the groundwork for an investigation of this framework in two ways. First, we release Contrack, a large-scale human-human conversation corpus for context tracking with people and location annotations. It contains over 7000 conversations with an average of 11.8 turns, 5.8 entities and 15.2 references per conversation. Second, we open-source a neural network architecture for context tracking. Finally, we compare this network to state-of-the-art approaches for the subtasks it subsumes and report results on the involved tradeoffs.	翻訳日:2022-02-05 06:36:19 公開日:2022-01-28
# (参考訳) どんな変分オートエンコーダでも任意の条件付けができる Any Variational Autoencoder Can Do Arbitrary Conditioning ( http://arxiv.org/abs/2201.12414v1 ) ライセンス: CC BY 4.0	Ryan R. Strauss, Junier B. Oliva	(参考訳) 任意条件付けは教師なし学習において重要な問題であり、ここでは条件密度$p(\mathbf{x}_u \mid \mathbf{x}_o)$をモデル化する。しかし、密度推定の大多数は、特徴間の重要な条件依存が不透明な共同分布 $p(\mathbf{x})$ をモデル化することのみに焦点を当てている。本稿では,任意の変分オートエンコーダ(VAE)をVAE自体を変更することなく任意の条件付けを行うことのできる,シンプルで汎用的なフレームワークであるPosterior Matchingを提案する。 Posterior Matchingは、既存のVAEに基づくジョイント密度推定法に応用され、任意の条件付けに対する以前のアプローチが要求する特殊モデルを回避している。 Posterior Matchingは、様々なタスクに対する現在の最先端メソッドに匹敵する、あるいは優れているパフォーマンスを実現する。 Arbitrary conditioning is an important problem in unsupervised learning, where we seek to model the conditional densities $p(\mathbf{x}_u \mid \mathbf{x}_o)$ that underlie some data, for all possible non-intersecting subsets $o, u \subset \{1, \dots , d\}$. However, the vast majority of density estimation methods focus only on modeling the joint distribution $p(\mathbf{x})$, in which important conditional dependencies between features are opaque. We propose a simple and general framework, coined Posterior Matching, that enables any Variational Autoencoder (VAE) to perform arbitrary conditioning, without modification to the VAE itself. Posterior Matching applies to the numerous existing VAE-based approaches to joint density estimation, thereby circumventing the specialized models required by previous approaches to arbitrary conditioning. We find that Posterior Matching achieves performance that is comparable or superior to current state-of-the-art methods for a variety of tasks.	翻訳日:2022-02-04 13:25:17 公開日:2022-01-28
# (参考訳) なぜ君を信頼すべきなのか、ベルマン? Bellman Errorは価値エラーの少ない代替品 Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error ( http://arxiv.org/abs/2201.12417v1 ) ライセンス: CC BY 4.0	Scott Fujimoto, David Meger, Doina Precup, Ofir Nachum, Shixiang Shane Gu	(参考訳) 本研究では,ベルマン方程式を数値予測精度の代用目的として利用することを検討した。ベルマン方程式はすべての状態-作用対上の真の値関数によって一意に解かれるが、ベルマン誤差(方程式の両側の違い)は値関数の精度の指標として不十分である。特に, 1) ベルマン方程式の両辺のキャンセルにより, ベルマン誤差の大きさは, すべての状態-作用対を考慮に入れた場合でも, 真の値関数との距離と弱い関係しかなく, 2) 有限データ状態においては, ベルマン方程式は無限に多くの準最適解によって正確に満たされることを示す。これは、値関数の精度を向上することなくベルマン誤差を最小化できることを意味する。これらの現象を、一連の命題、例示的なトイ例、標準ベンチマークドメインにおける経験的分析を通じて実証する。 In this work, we study the use of the Bellman equation as a surrogate objective for value prediction accuracy. While the Bellman equation is uniquely solved by the true value function over all state-action pairs, we find that the Bellman error (the difference between both sides of the equation) is a poor proxy for the accuracy of the value function. In particular, we show that (1) due to cancellations from both sides of the Bellman equation, the magnitude of the Bellman error is only weakly related to the distance to the true value function, even when considering all state-action pairs, and (2) in the finite data regime, the Bellman equation can be satisfied exactly by infinitely many suboptimal solutions. This means that the Bellman error can be minimized without improving the accuracy of the value function. We demonstrate these phenomena through a series of propositions, illustrative toy examples, and empirical analysis in standard benchmark domains.	翻訳日:2022-02-04 13:02:47 公開日:2022-01-28
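Claim (2) in the abstract above, that with finite data the Bellman equation can be satisfied exactly by infinitely many suboptimal solutions, can be reproduced on a toy problem. The following is a minimal sketch under stated assumptions: the two-state chain, the discount factor 0.5, and the single-transition dataset are all hypothetical and chosen for exact arithmetic; this is not the authors' experimental setup.

```python
GAMMA = 0.5

# Hypothetical chain under a fixed policy:
#   s0 -> s1 with reward 0, and s1 -> s1 with reward 1.
# True values: V(s1) = 1 / (1 - 0.5) = 2 and V(s0) = 0 + 0.5 * 2 = 1.
TRUE_V = {"s0": 1.0, "s1": 2.0}

# Finite dataset: only the s0 -> s1 transition was observed.
PARTIAL_DATA = [("s0", 0.0, "s1")]
# Full coverage of both transitions, for comparison.
FULL_DATA = [("s0", 0.0, "s1"), ("s1", 1.0, "s1")]


def mean_abs_bellman_error(v, transitions, gamma=GAMMA):
    """Mean |V(s) - (r + gamma * V(s'))| over the given transitions."""
    errors = [abs(v[s] - (r + gamma * v[s_next])) for s, r, s_next in transitions]
    return sum(errors) / len(errors)


def mean_abs_value_error(v, true_v=TRUE_V):
    """Mean |V(s) - V_true(s)| over all states."""
    return sum(abs(v[s] - true_v[s]) for s in true_v) / len(true_v)


def candidate(c):
    """Value function with V(s1) = c and V(s0) = gamma * c.

    Every choice of c satisfies the Bellman equation on PARTIAL_DATA exactly.
    """
    return {"s0": GAMMA * c, "s1": c}


if __name__ == "__main__":
    for c in (2.0, 0.0, 100.0):
        v = candidate(c)
        print(
            c,
            mean_abs_bellman_error(v, PARTIAL_DATA),
            mean_abs_bellman_error(v, FULL_DATA),
            mean_abs_value_error(v),
        )
```

Only c = 2 recovers the true value function, yet every candidate has zero Bellman error on the partial dataset, so driving that error to zero says nothing about value accuracy in this toy case.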
# (参考訳) FastFlows: 分子グラフ生成のためのフローベースモデル FastFlows: Flow-Based Models for Molecular Graph Generation ( http://arxiv.org/abs/2201.12419v1 ) ライセンス: CC BY 4.0	Nathan C. Frey, Vijay Gadepally, Bharath Ramsundar	(参考訳) 本稿では, 正規化フローモデル, SELF参照組込み文字列, 小分子を効率的に生成する多目的最適化を用いたフレームワークを提案する。最初のトレーニングセットは100個の小さな分子で、FastFlowsは数秒で何千もの化学的に有効な分子を生成する。効率的なサンプリングのため、サブ構造フィルターは不合理なモーティーを持つ化合物を除去するために必要に応じて適用することができる。薬物類似性, 合成アクセシビリティ, 合成複雑性の計算が容易で学習可能なメトリクスを用いて, マルチオブジェクト最適化を行い, 高速な仮想スクリーニング環境でのFastFlowsの動作を実証する。我々のモデルは自己回帰型分子生成モデルよりもはるかにシンプルで訓練が容易であり、薬物様合成可能な分子の高速な生成と同定を可能にする。 We propose a framework using normalizing-flow based models, SELF-Referencing Embedded Strings, and multi-objective optimization that efficiently generates small molecules. With an initial training set of only 100 small molecules, FastFlows generates thousands of chemically valid molecules in seconds. Because of the efficient sampling, substructure filters can be applied as desired to eliminate compounds with unreasonable moieties. Using easily computable and learned metrics for druglikeness, synthetic accessibility, and synthetic complexity, we perform a multi-objective optimization to demonstrate how FastFlows functions in a high-throughput virtual screening context. Our model is significantly simpler and easier to train than autoregressive molecular generative models, and enables fast generation and identification of druglike, synthesizable molecules.	翻訳日:2022-02-04 12:24:56 公開日:2022-01-28
# (参考訳) 効率的な分散ディープラーニングのためのリソース利用ベンチマーク Benchmarking Resource Usage for Efficient Distributed Deep Learning ( http://arxiv.org/abs/2201.12423v1 ) ライセンス: CC BY 4.0	Nathan C. Frey, Baolin Li, Joseph McDonald, Dan Zhao, Michael Jones, David Bestor, Devesh Tiwari, Vijay Gadepally, Siddharth Samsi	(参考訳) ディープラーニング(DL)ワークフローは、はるかに大きな利益を達成するために、計算とエネルギーの予算を継続的に増やすことを要求する。ニューラルネットワークの検索、ハイパーパラメータスイープ、ラピッドプロトタイピングは大量のリソースを消費し、リソース制約のある研究者が大規模なモデルの実験を行なわず、環境への影響も大きい。そのため、ディープニューラルネットワーク(DNN)とトレーニングの違いが、計算資源とエネルギー資源の増大をどのように活用するかを理解することが不可欠である。本稿では,最大424のグラフィックス処理ユニット(GPU)上で,さまざまなドメイン/タスク(自然言語処理,コンピュータビジョン,化学)を表すディープネットワークの配列を3,400以上の実験を行った。実験では,計算資源特性と電力利用やgpuクロックレート制限などの省エネ機構を系統的に変化させ,各代表モデルが様々な資源・エネルギー制約条件下で提示するトレードオフやスケーリング行動の把握と説明を行う。トレーニング時間が利用可能な計算資源とエネルギー制約によってどのようにスケールするかを記述する、パワーローモデルに適合します。これらの知見は,各種ディープラーニングタスク/ワークフローのエネルギー消費を,トレーニングへの影響を最小限に抑えて選択的に削減し,資源利用の最適化において,高性能コンピューティングプロバイダに情報提供と指導を支援することを期待する。 Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. Neural architecture searches, hyperparameter sweeps, and rapid prototyping consume immense resources that can prevent resource-constrained researchers from experimenting with large models and carry considerable environmental impact. As such, it becomes essential to understand how different deep neural networks (DNNs) and training leverage increasing compute and energy resources -- especially specialized computationally-intensive models across different domains and applications. In this paper, we conduct over 3,400 experiments training an array of deep networks representing various domains/tasks -- natural language processing, computer vision, and chemistry -- on up to 424 graphics processing units (GPUs). 
During training, our experiments systematically vary compute resource characteristics and energy-saving mechanisms such as power utilization and GPU clock rate limits to capture and illustrate the different trade-offs and scaling behaviors each representative model exhibits under various resource and energy-constrained regimes. We fit power law models that describe how training time scales with available compute resources and energy constraints. We anticipate that these findings will help inform and guide high-performance computing providers in optimizing resource utilization, by selectively reducing energy consumption for different deep learning tasks/workflows with minimal impact on training.	翻訳日:2022-02-04 11:36:10 公開日:2022-01-28
# (参考訳) CoordX: 分割型MLPアーキテクチャによる暗黙のニューラル表現の高速化 CoordX: Accelerating Implicit Neural Representation with a Split MLP Architecture ( http://arxiv.org/abs/2201.12425v1 ) ライセンス: CC BY 4.0	Ruofan Liang, Hongyi Sun, Nandita Vijaykumar	(参考訳) 多層パーセプトロン(MLP)を用いた暗黙的神経表現は、近年、新しいビュー合成や3Dオブジェクト表現やレンダリングなど、様々なタスクで注目されている。しかし、これらの表現の重大な課題は、画像、ビデオ、または3Dオブジェクトを学習し、表現するために、多数の入力座標に対するMLPによるトレーニングと推論の両方が大量の計算と長い処理時間を必要とすることである。本研究では,新たな分割型MLPアーキテクチャであるCoordXを提案することにより,暗黙的ニューラル表現のための座標ベースMLPの推論と訓練を高速化することを目的とする。 CoordXでは、初期層を分割して入力座標の各次元を個別に学習する。中間の特徴は最後の層によって融合され、対応する座標点で学習信号を生成する。これにより、必要な計算量が大幅に削減され、トレーニングや推論のスピードアップが大きくなり、ベースラインMLPと同じような精度が達成される。このアプローチは、元の信号の分解である最初の学習機能を目標とし、学習した信号を生成するためにそれらを融合させる。提案アーキテクチャは,多くの暗黙的ニューラル表現タスクにおいて,メモリオーバーヘッドを伴わずに利用できる。画像,映像,3次元形状表現およびレンダリングタスクのベースラインモデルと比較して,最大2.92倍の高速化を示す。 Implicit neural representations with multi-layer perceptrons (MLPs) have recently gained prominence for a wide variety of tasks such as novel view synthesis and 3D object representation and rendering. However, a significant challenge with these representations is that both training and inference with an MLP over a large number of input coordinates to learn and represent an image, video, or 3D object, require large amounts of computation and incur long processing times. In this work, we aim to accelerate inference and training of coordinate-based MLPs for implicit neural representations by proposing a new split MLP architecture, CoordX. With CoordX, the initial layers are split to learn each dimension of the input coordinates separately. The intermediate features are then fused by the last layers to generate the learned signal at the corresponding coordinate point. This significantly reduces the amount of computation required and leads to large speedups in training and inference, while achieving similar accuracy as the baseline MLP. 
This approach thus aims at first learning functions that are a decomposition of the original signal and then fusing them to generate the learned signal. Our proposed architecture can be generally used for many implicit neural representation tasks with no additional memory overheads. We demonstrate a speedup of up to 2.92x compared to the baseline model for image, video, and 3D shape representation and rendering tasks.	翻訳日:2022-02-04 11:15:28 公開日:2022-01-28
# (参考訳) FedGCN:グラフ畳み込みネットワークのフェデレーショントレーニングにおける収束とコミュニケーションのトレードオフ FedGCN: Convergence and Communication Tradeoffs in Federated Training of Graph Convolutional Networks ( http://arxiv.org/abs/2201.12433v1 ) ライセンス: CC BY 4.0	Yuhang Yao, Carlee Joe-Wong	(参考訳) グラフデータセットのトレーニングモデルのための分散メソッドは、グラフデータセットのサイズと、ソーシャルネットワークのようなグラフィカルデータのプライベートな性質により、最近人気が高まっている。しかし、このデータのグラフィカルな構造は、異なる学習クライアント間で疎結合に分割できないことを意味しており、クライアント間の重要な通信オーバーヘッドまたはトレーニング方法で利用可能な情報の損失につながる。フェデレーショングラフ畳み込みネットワーク(Federated Graph Convolutional Network, FedGCN)を導入し, フェデレーション学習を用いてGCNモデルを最適収束率と通信コストで訓練する。各イテレーションでクライアント間の通信を必要とする以前の方法と比較して、FedGCNはクライアントデータのプライバシを保持し、最初のステップで通信のみを必要とするため、通信コストを大幅に削減し、コンバージェンスレートを高速化する。我々は、FedGCNの収束率と異なるデータ分布下での通信コストのトレードオフを理論的に分析し、一般的なフレームワークを導入して、すべてのエッジ補完に基づくGCNトレーニングアルゴリズムを解析することができる。実験により,本アルゴリズムの有効性を実証し,理論解析を検証した。 Distributed methods for training models on graph datasets have recently grown in popularity, due to the size of graph datasets as well as the private nature of graphical data like social networks. However, the graphical structure of this data means that it cannot be disjointly partitioned between different learning clients, leading to either significant communication overhead between clients or a loss of information available to the training method. We introduce Federated Graph Convolutional Network (FedGCN), which uses federated learning to train GCN models with optimized convergence rate and communication cost. Compared to prior methods that require communication among clients at each iteration, FedGCN preserves the privacy of client data and only needs communication at the initial step, which greatly reduces communication cost and speeds up the convergence rate. We theoretically analyze the tradeoff between FedGCN's convergence rate and communication cost under different data distributions, introducing a general framework that can be used to analyze all edge-completion-based GCN training algorithms. 
Experimental results demonstrate the effectiveness of our algorithm and validate our theoretical analysis.	翻訳日:2022-02-04 10:49:59 公開日:2022-01-28
# (参考訳) 事前学習型言語モデルを用いた常識知識推論と生成:調査 Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey ( http://arxiv.org/abs/2201.12438v1 ) ライセンス: CC BY 4.0	Prajjwal Bhargava, Vincent Ng	(参考訳) 常識知識の獲得と推論は伝統的に知識表現と推論コミュニティの中核的な研究テーマであったが、近年は自然言語処理コミュニティにおいて、事前訓練されたモデルを開発し、新しく設計された様々な常識知識の推論と生成タスクに対処する能力をテストすることへの関心が高まっている。本稿では,これらの課題に関する調査を行い,これらの課題により明らかになった常識推論と生成のための最先端の事前学習モデルの強みと弱みについて考察し,今後の研究の方向性を考察する。 While commonsense knowledge acquisition and reasoning has traditionally been a core research topic in the knowledge representation and reasoning community, recent years have seen a surge of interest in the natural language processing community in developing pre-trained models and testing their ability to address a variety of newly designed commonsense knowledge reasoning and generation tasks. This paper presents a survey of these tasks, discusses the strengths and weaknesses of state-of-the-art pre-trained models for commonsense reasoning and generation as revealed by these tasks, and reflects on future research directions.	翻訳日:2022-02-04 10:30:29 公開日:2022-01-28
# (参考訳) 分布シフトによるモデル精度の検証 Certifying Model Accuracy under Distribution Shifts ( http://arxiv.org/abs/2201.12440v1 ) ライセンス: CC BY 4.0	Aounon Kumar, Alexander Levine, Tom Goldstein and Soheil Feizi	(参考訳) 機械学習における認証された堅牢性は主に、データ分散の各点に対する固定攻撃予算による入力の逆摂動に焦点を当てている。本研究では,データ分布の有界wassersteinシフト下でのモデルの精度について,証明可能なロバスト性を保証する。変換空間内のモデルの入力をランダム化する単純な手続きは、変換の下での分布シフトに対して確実に頑健であることを示す。提案手法により, datum 特有の摂動径は入力分布の異なる点にまたがって変化し, 固定サイズの摂動も含むことができる。我々の証明は、ワッサーシュタイン球内における入力分布の(自然あるいは逆)シフトに対するモデルの性能に関する保証された低い境界を生成する。この技術を応用します一色シフト、色シフト、明るさ及び彩度の変化等の画像の自然(非逆変換)に対する堅牢性を証明すること。 (ii)入力分布の逆流に対するロバスト性を証明すること、及び (3) モデルトレーニングに干渉する有害ないわゆる「未学習」データセットで訓練されたモデルの性能について、証明可能な下限(硬度結果)を示す。 Certified robustness in machine learning has primarily focused on adversarial perturbations of the input with a fixed attack budget for each point in the data distribution. In this work, we present provable robustness guarantees on the accuracy of a model under bounded Wasserstein shifts of the data distribution. We show that a simple procedure that randomizes the input of the model within a transformation space is provably robust to distributional shifts under the transformation. Our framework allows the datum-specific perturbation size to vary across different points in the input distribution and is general enough to include fixed-sized perturbations as well. Our certificates produce guaranteed lower bounds on the performance of the model for any (natural or adversarial) shift of the input distribution within a Wasserstein ball around the original distribution. We apply our technique to: (i) certify robustness against natural (non-adversarial) transformations of images such as color shifts, hue shifts and changes in brightness and saturation, (ii) certify robustness against adversarial shifts of the input distribution, and (iii) show provable lower bounds (hardness results) on the performance of models trained on so-called "unlearnable" datasets that have been poisoned to interfere with model training.	
翻訳日:2022-02-04 10:12:54 公開日:2022-01-28
# 効率的なポリシー空間対応 oracle Efficient Policy Space Response Oracles ( http://arxiv.org/abs/2202.00633v1 ) ライセンス: Link先を確認	Ming Zhou, Jingxiao Chen, Ying Wen, Weinan Zhang, Yaodong Yang, Yong Yu	(参考訳) ポリシー空間応答 oracle method (psro)は、2人プレイのゼロサムゲームにおけるnash均衡に対する一般的な解決策を提供するが、(1)シミュレーションによって現在の人口を一貫して評価することによる計算効率の非効率、(2)各イテレーションにおける固定されたメタストラテジーに対する最善の反応を学ぶことによる探索効率の非効率の2つの問題に苦しむ。本稿では,上記の2つのステップの効率を大幅に向上させるEPSRO(Efficient PSRO)を提案する。我々の開発の中心は、制限なし(URR)ゲームにおけるミニマックス最適化の導入されたサブルーチンである。各ステップでURRを解くことで、現在のゲームを評価し、ゲームシミュレーションを必要とせずに、1回のフォワードパスでベストレスポンスを計算することができる。理論的には、EPSROの解法が、攻撃性に対する単調な改善をもたらすことを証明している。さらに、EPSROの望ましい性質は、並列化可能であり、行動多様性を誘導する政策空間の効率的な探索を可能にすることである。我々は,EPSROを3種類のゲームでテストし,壁面時間における50倍の高速化,10倍のデータ効率,および既存のKuhnおよびLeduc PokerゲームにおけるPSRO手法と同様のエクスプロイザビリティを報告した。 The Policy Space Response Oracle (PSRO) method provides a general solution to Nash equilibrium in two-player zero-sum games but suffers from two problems: (1) computational inefficiency due to consistently evaluating current populations by simulations; and (2) exploration inefficiency due to learning best responses against a fixed meta-strategy at each iteration. In this work, we propose Efficient PSRO (EPSRO), which largely improves the efficiency of the above two steps. Central to our development is the newly introduced subroutine of minimax optimization on unrestricted-restricted (URR) games. By solving URR at each step, one can evaluate the current game and compute the best response in one forward pass with no need for game simulations. Theoretically, we prove that the solution procedures of EPSRO offer a monotonic improvement on exploitability. Moreover, a desirable property of EPSRO is that it is parallelizable, which allows for efficient exploration in the policy space that induces behavioral diversity. We test EPSRO on three classes of games and report a 50x speedup in wall-time, 10x data efficiency, and similar exploitability to existing PSRO methods on Kuhn and Leduc Poker games.	
翻訳日:2022-02-02 15:38:17 公開日:2022-01-28
# 3次元CADモデルから標準と特徴を認識するニューラルネットワークの開発 Development of a neural network to recognize standards and features from 3D CAD models ( http://arxiv.org/abs/2202.00573v1 ) ライセンス: Link先を確認	Alexander Neb and Iyed Briki and Raoul Schoenhof	(参考訳) この研究の焦点は、3dcadモデルから直接標準や機能を認識することである。このため、ニューラルネットワークは9種類の機械要素を認識するように訓練された。 DIN EN ISO 8676以降の六角形ネジのように、ある部分を標準として特定した後、アプリケーションプログラミングインタフェース(API)を介してCADシステムの幾何学的情報にアクセスする。 APIでは,その部分を適切に記述するために必要な情報を検索する。この情報に基づく標準化部品を詳細に認識し、さらに情報を補うことができる。 The focus of this work is to recognize standards and further features directly from 3D CAD models. For this purpose, a neural network was trained to recognize nine classes of machine elements. After the system identifies a part as a standard, such as a hexagon head screw according to DIN EN ISO 8676, it accesses the geometrical information of the CAD system via the application programming interface (API). Through the API, the system searches for the information needed to describe the part appropriately. Based on this information, standardized parts can be recognized in detail and supplemented with further information.	翻訳日:2022-02-02 13:34:21 公開日:2022-01-28
# Dynamic-VAEによる電気自動車バッテリーの故障検出 Detecting Electric Vehicle Battery Failure via Dynamic-VAE ( http://arxiv.org/abs/2201.12358v1 ) ライセンス: Link先を確認	Haowei He, Jingzhao Zhang, Yanan Wang, Shaobo Huang, Chen Wang, Yang Zhang, Dongxu Guo, Guannan He, Minggao Ouyang	(参考訳) 本稿では,ディープラーニングモデルによってバックアップされたバッテリ故障検出パイプラインについて述べる。まず、数百台の車両のバッテリー充電データを含む、大規模な電気自動車(EV)バッテリーデータセットを紹介します。次に,バッテリ故障検出を異常検出問題として定式化し,動的システムと変分オートエンコーダに基づく動的VAEという新しいアルゴリズムを提案する。提案アルゴリズムの性能を,提案したデータセットのベースラインに対して検証し,動的VAEの有効性を実証した。 In this note, we describe a battery failure detection pipeline backed by deep learning models. We first introduce a large-scale electric vehicle (EV) battery dataset including cleaned battery-charging data from hundreds of vehicles. We then formulate battery failure detection as an outlier detection problem and propose a new algorithm named Dynamic-VAE, based on dynamic systems and variational autoencoders. We validate the performance of our proposed algorithm against several baselines on our released dataset and demonstrate the effectiveness of Dynamic-VAE.	翻訳日:2022-02-01 20:02:37 公開日:2022-01-28
# 攻撃グラフを用いた強化学習による濾過経路の検出 Discovering Exfiltration Paths Using Reinforcement Learning with Attack Graphs ( http://arxiv.org/abs/2201.12416v1 ) ライセンス: Link先を確認	Tyler Cody, Abdul Rahman, Christopher Redino, Lanxiao Huang, Ryan Clark, Akshay Kakkar, Deepak Kushwaha, Paul Park, Peter Beling, Edward Bowen	(参考訳) 強化学習 (Reinforcement Learning, RL) は, 攻撃グラフやサイバー地形とともに, 企業ネットワークにおけるデータ流出の最適な経路を決定するための報酬と状態を開発するために用いられる。この研究は以前のクラウンジュエリー(CJ)識別に基づいており、敵が近くのCJやホストを妥協する最適な経路を計算することの目標に焦点を当てている。この作業は、データが盗まれ、ネットワークから静かに流出しなければならないという仮定に基づいて、以前のCJアプローチを逆転させる。 RLは、敵が検出を減らしたいと願う経路の識別に基づいて報酬関数の開発を支援するために利用される。その結果,大規模ネットワーク環境における有望な性能が示された。 Reinforcement learning (RL), in conjunction with attack graphs and cyber terrain, is used to develop the reward and state associated with determining optimal paths for exfiltration of data in enterprise networks. This work builds on previous crown jewels (CJ) identification that focused on the target goal of computing optimal paths that adversaries may traverse toward compromising CJs or hosts within their proximity. This work inverts the previous CJ approach based on the assumption that data has been stolen and now must be quietly exfiltrated from the network. RL is utilized to support the development of a reward function based on the identification of those paths where adversaries desire reduced detection. Results demonstrate promising performance for a sizable network environment.	翻訳日:2022-02-01 20:02:27 公開日:2022-01-28
# 学習最適化のための簡易ガード A Simple Guard for Learned Optimizers ( http://arxiv.org/abs/2201.12426v1 ) ライセンス: Link先を確認	Isabeau Pr\'emont-Schwarz, Jaroslav V\'itk\r{u}, Jan Feyereisl	(参考訳) 学習したコンポーネントの傾向が最終的に手作りバージョンを上回り続けるなら、学習したオプティマイザは最終的にはSGDやAdamのような手作りのオプティマイザを上回る。しかし、たとえ学習したオプティマイザ(L2Os)が最終的に手作りのものよりも優れているとしても、それらは証明可能な収束性はなく、分布に失敗する可能性がある。これらの質問はここで取り上げている。現在、学習オプティマイザは、学習の開始時に、一般的な手作りのオプティマイザ(勾配降下など)をしばしば上回っていますが、一般的には、ジェネリックアルゴリズムが進歩を続けながら、学習したアルゴリズムをaesopのtortoiseとして上回っており、hareを上回っており、そうでない。 L2Osはまた、分布から一般化するのが難しい。 (heaton et al., 2020)は、学習したオプティマイザをジェネリックな学習アルゴリズムで保護し、2つのアルゴリズムを条件付きで切り替えることで、結果として得られるアルゴリズムが確実に収束するように保護されたl2o(gl2o)を提案した。 L2O(Los-Guarded L2O)と呼ばれる新しいセーフガード型L2O(Los-Guarded L2O)を提案する。ガード機構は、両オプティマイザの期待損失値のみに基づいて決定する。さらに,lgl2o の収束保証の理論的証明と gl2o や他のベースラインと比較し,l2o と sgd を最もよく組み合わせ,実際 gl2o よりも収束することを示す実験結果を示す。 If the trend of learned components eventually outperforming their hand-crafted versions continues, learned optimizers will eventually outperform hand-crafted optimizers like SGD or Adam. Even if learned optimizers (L2Os) eventually outpace hand-crafted ones in practice, however, they are still not provably convergent and might fail out of distribution. These are the questions addressed here. Currently, learned optimizers frequently outperform generic hand-crafted optimizers (such as gradient descent) at the beginning of learning, but they generally plateau after some time, while the generic algorithms continue to make progress and often overtake the learned algorithm, as Aesop's tortoise overtakes the hare. L2Os also still have a difficult time generalizing out of distribution. Heaton et al. (2020) proposed Safeguarded L2O (GL2O), which can take a learned optimizer and safeguard it with a generic learning algorithm so that, by conditionally switching between the two, the resulting algorithm is provably convergent. 
We propose a new class of Safeguarded L2O, called Loss-Guarded L2O (LGL2O), which is both conceptually simpler and computationally less expensive. The guarding mechanism decides solely based on the expected future loss value of both optimizers. Furthermore, we give a theoretical proof of LGL2O's convergence guarantee and empirical results comparing it to GL2O and other baselines, showing that it combines the best of both L2O and SGD and in practice converges much better than GL2O.	翻訳日:2022-02-01 20:02:15 公開日:2022-01-28
# 情報選択システムのためのトップKランキング深層帯域 Top-K Ranking Deep Contextual Bandits for Information Selection Systems ( http://arxiv.org/abs/2201.13287v1 ) ライセンス: Link先を確認	Jade Freeman and Michael Rawson	(参考訳) 今日の技術環境では、情報は豊富で、動的で、自然に異質である。情報の自動フィルタリングと優先順位付けは、その情報が目標に向かって実質的な価値を付加するかどうかの区別に基づいている。コンテキスト型マルチアームバンディットは、ユーザの関心や関連性に応じてコンテンツをフィルタリングし優先順位付けするために広く使用されている。 Learn-to-Rankテクニックはアイテムの関連ランキングを最適化し、コンテンツの選択を可能にする。本稿では,文脈的マルチアームバンディットフレームワークに基づくトップKランキングに対する新しいアプローチを提案する。確率的報酬関数をニューラルネットワークでモデル化し,非線形近似により報酬と文脈の関係を学習する。本手法を実証し,シミュレーションシナリオにおける実世界のデータセットを用いて実験結果から学習性能を評価する。実験の結果、この手法は報酬構造と高次元の文脈特徴の複雑さの下でうまく機能することが示された。 In today's technology environment, information is abundant, dynamic, and heterogeneous in nature. Automated filtering and prioritization of information is based on the distinction between whether the information adds substantial value toward one's goal or not. Contextual multi-armed bandits have been widely used for learning to filter content and prioritize it according to user interest or relevance. Learning-to-rank techniques optimize the relevance ranking of items, allowing the content to be selected accordingly. We propose a novel approach to top-K rankings under the contextual multi-armed bandit framework. We model the stochastic reward function with a neural network to allow non-linear approximation to learn the relationship between rewards and contexts. We demonstrate the approach and evaluate the learning performance in experiments using real-world data sets in simulated scenarios. Empirical results show that this approach performs well under the complexity of a reward structure and high-dimensional contextual features.	翻訳日:2022-02-01 18:21:10 公開日:2022-01-28
# 注意重み付きイベントベース埋め込みを用いた自動音声キャプション Automatic Audio Captioning using Attention weighted Event based Embeddings ( http://arxiv.org/abs/2201.12352v1 ) ライセンス: Link先を確認	Swapnil Bhosale, Rupayan Chakraborty, Sunil Kumar Kopparapu	(参考訳) 自動音声キャプション(automatic audio captioning, aac)は、音声を自然言語に翻訳し、音声イベント、イベントのソース、それらの関係を記述するタスクである。現在、aacデータセットの限られたサンプルは、転送学習とオーディオイベント検出(aed)を親タスクとして組み込むトレンドを設定している。本稿では,aacのための軽量(学習パラメータが小さい)bi-lstmリカレント層を有するエンコーダ・デコーダアーキテクチャを提案する。その結果,時間的注意と拡張技術を組み合わせた効率的なaed埋め込み抽出器は,計算集約的なアーキテクチャで既存の文献を上回ることができることがわかった。さらに,このモデルの一部として生成した不均一な注意重み付き符号化が,各トークンの生成中にオーディオの特定の部分をデコーダが見渡すことができることを示す。 Automatic Audio Captioning (AAC) refers to the task of translating audio into natural language that describes the audio events, the sources of the events, and their relationships. The limited number of samples in current AAC datasets has set a trend of incorporating transfer learning with Audio Event Detection (AED) as a parent task. In this direction, we propose an encoder-decoder architecture with lightweight (i.e., with fewer learnable parameters) Bi-LSTM recurrent layers for AAC and compare the performance of two state-of-the-art pre-trained AED models as embedding extractors. Our results show that an efficient AED-based embedding extractor combined with temporal attention and augmentation techniques is able to surpass existing literature with computationally intensive architectures. Further, we provide evidence that the non-uniform attention-weighted encoding generated as part of our model helps the decoder glance over specific sections of the audio while generating each token.	翻訳日:2022-02-01 18:19:40 公開日:2022-01-28
# ロボット操作のためのタスク焦点Few-Shotオブジェクト検出 Task-Focused Few-Shot Object Detection for Robot Manipulation ( http://arxiv.org/abs/2201.12437v1 ) ライセンス: Link先を確認	Brent Griffin	(参考訳) 本稿では,新しい物体の移動ロボット操作における検出による問題に対処する。我々のアプローチでは、現実のタスクから学習する補完関数として視覚と制御を用いる。検出のみに基づく操作手法を開発し,タスク中心の少数ショット検出を導入し,新しいオブジェクトや設定を学習する。少数ショットオブジェクト検出の現在のパラダイムは、既存のアノテーション付き例を使用する。対照的に、このパラダイムは、特定の下流タスク(例えば、深さ推定と把握)の性能を向上させるアクティブデータ収集とアノテーション選択を用いて拡張する。数ショット学習へのインタラクティブなアプローチの実験では、ロボットを訓練して、検出からオブジェクトを直接操作する(ClickBot)。 clickbotはアノテーションの1クリックでビジュアルサーボ制御を学び、クラッターや他の設定で新しいオブジェクトを把握し、既存のビジュアルサーボ制御と深さ推定ベンチマークで最先端の結果を得る。最後に、将来の研究をサポートするタスク中心の少数ショットオブジェクト検出ベンチマークを確立します。 This paper addresses the problem of mobile robot manipulation of novel objects via detection. Our approach uses vision and control as complementary functions that learn from real-world tasks. We develop a manipulation method based solely on detection then introduce task-focused few-shot object detection to learn new objects and settings. The current paradigm for few-shot object detection uses existing annotated examples. In contrast, we extend this paradigm by using active data collection and annotation selection that improves performance for specific downstream tasks (e.g., depth estimation and grasping). In experiments for our interactive approach to few-shot learning, we train a robot to manipulate objects directly from detection (ClickBot). ClickBot learns visual servo control from a single click of annotation, grasps novel objects in clutter and other settings, and achieves state-of-the-art results on an existing visual servo control and depth estimation benchmark. Finally, we establish a task-focused few-shot object detection benchmark to support future research: https://github.com/griffbr/TFOD.	翻訳日:2022-02-01 18:03:44 公開日:2022-01-28
# 物理インフォームドニューラルネットワークによる複数の電気解剖学的マップからの心臓線維配向の学習 Physics-informed neural networks to learn cardiac fiber orientation from multiple electroanatomical maps ( http://arxiv.org/abs/2201.12362v1 ) ライセンス: Link先を確認	Carlos Ruiz Herrera, Thomas Grandits, Gernot Plank, Paris Perdikaris, Francisco Sahli Costabal and Simone Pezzuto	(参考訳) 本研究では, 複数のカテーテル記録からヒト心房の心線維構造をin-vivoで推定するfibernetを提案する。心臓線維は心臓の電気機械機能において中心的な役割を担っているが、生体内決定が困難であり、それゆえ、既存の心臓モデルにおいて真に患者特異的であることは稀である。逆問題は、スパース活性化マップの集合から心臓伝播モデルの伝導速度テンソルを特定することである。局所繊維角を含む伝導速度テンソルの全ての成分を同時に同定し, 合成2次元および3次元例, 拡散テンソル繊維, 患者特有の場合についてfibernetを広範囲にテストした。 3つの地図は繊維を正確に捉えるのに十分であり、ノイズの予測にも十分であることを示す。地図が少なければ、正規化の役割は顕著になる。さらに, 適応モデルにより, 目に見えないアクティベーションマップを頑健に再現できることを示す。 FiberNetはパーソナライズされた医療のための患者固有のモデルを作成するのに役立つことを期待しています。 We propose FiberNet, a method to estimate in-vivo the cardiac fiber architecture of the human atria from multiple catheter recordings of the electrical activation. Cardiac fibers play a central role in the electro-mechanical function of the heart, yet they are difficult to determine in-vivo, and hence rarely truly patient-specific in existing cardiac models. FiberNet learns the fiber arrangement by solving an inverse problem with physics-informed neural networks. The inverse problem amounts to identifying the conduction velocity tensor of a cardiac propagation model from a set of sparse activation maps. The use of multiple maps enables the simultaneous identification of all the components of the conduction velocity tensor, including the local fiber angle. We extensively test FiberNet on synthetic 2-D and 3-D examples, diffusion tensor fibers, and a patient-specific case. We show that 3 maps are sufficient to accurately capture the fibers, also in the presence of noise. With fewer maps, the role of regularization becomes prominent. Moreover, we show that the fitted model can robustly reproduce unseen activation maps. 
We envision that FiberNet will help the creation of patient-specific models for personalized medicine. The full code is available at http://github.com/fsahli/FiberNet.	翻訳日:2022-02-01 17:59:06 公開日:2022-01-28
# Electra: 条件付き生成モデルに基づく述語対応クエリ近似 Electra: Conditional Generative Model based Predicate-Aware Query Approximation ( http://arxiv.org/abs/2201.12420v1 ) ライセンス: Link先を確認	Nikhil Sheoran, Subrata Mitra, Vibhor Porwal, Siddharth Ghetia, Jatin Varshney, Tung Mai, Anup Rao, Vikas Maddukuri	(参考訳) Approximate Query Processing(AQP)の目標は、クエリをコスト的に集約する上で、非常に高速だが“十分正確な”結果を提供することで、大規模なデータセットのインタラクティブな探索におけるユーザエクスペリエンスを向上させることだ。最近提案された機械学習ベースのaqp技術は、クエリの実行が従来のデータベースクラスタでのクエリ処理と比較してモデル推論のみを伴うため、非常に低いレイテンシを提供することができる。しかし、フィルタ述語(WHERE節)の数が増加すると、近似誤差はこれらの手法で著しく増加する。アナリストは洞察の発見に多くの述語を使ったクエリを使うことが多い。したがって、アナリストが誤った結論を出すのを防ぐためには、低い近似誤差を維持することが重要である。本稿では,より少ない近似誤差で多数の述語を用いた分析式クエリに応答できる述語認識型AQPシステムであるELECTRAを提案する。 electraは条件付き生成モデルを使用して、データの条件付き分布を学習し、実行時に小さな(約1000行)だが代表的なサンプルを生成し、クエリを実行して近似結果を計算する。実世界の3つのデータセットに対する4つの異なるベースラインによる評価の結果,ELECTRAはベースラインと比較して多数の述語に対して低いAQP誤差を提供することがわかった。 The goal of Approximate Query Processing (AQP) is to provide very fast but "accurate enough" results for costly aggregate queries, thereby improving user experience in interactive exploration of large datasets. Recently proposed machine-learning-based AQP techniques can provide very low latency, as query execution only involves model inference, in contrast to traditional query processing on database clusters. However, as the number of filtering predicates (WHERE clauses) increases, the approximation error of these methods increases significantly. Analysts often use queries with a large number of predicates for insight discovery. Thus, maintaining low approximation error is important to prevent analysts from drawing misleading conclusions. In this paper, we propose ELECTRA, a predicate-aware AQP system that can answer analytics-style queries with a large number of predicates with much smaller approximation errors. 
ELECTRA uses a conditional generative model that learns the conditional distribution of the data and at runtime generates a small (~1000 rows) but representative sample, on which the query is executed to compute the approximate result. Our evaluations with four different baselines on three real-world datasets show that ELECTRA provides lower AQP error than the baselines for queries with a large number of predicates.	翻訳日:2022-02-01 17:56:46 公開日:2022-01-28
# コミュニケーション構造を考慮した協調ゲームによるグラフレベルの予測 Explaining Graph-level Predictions with Communication Structure-Aware Cooperative Games ( http://arxiv.org/abs/2201.12380v1 ) ライセンス: Link先を確認	Shichang Zhang, Neil Shah, Yozen Liu, Yizhou Sun	(参考訳) 機械学習モデルによる予測を説明することが重要であり、興味を惹きつけている。協調ゲーム理論のShapley値は、特に画像、テキスト、表データ、および最近のグラフ上のグラフニューラルネットワーク(GNN)において、予測に対する特徴的重要性を計算するための主要なアプローチとして提案されている。本研究では,グラフ記述におけるShapley値の妥当性を再検討し,グラフレベルの予測において最も重要な部分グラフと構成ノードを特定する。我々は、Shapley値がグラフデータに対する非理想的な選択であると仮定する。本稿では、重要なグラフ構造情報を利用して説明を改善するグラフ構造対応eXplanation(GStarX)法を提案する。具体的には,HN値と呼ばれる協調ゲーム理論から,新たな構造認識値に基づくスコアリング関数を提案する。ノードの重要性を評価する場合、HN値はグラフ構造を利用して、GNNのメッセージパッシングに似た近隣ノード間の協調余剰を属性とし、ノードの重要度スコアはノードの特徴の重要性だけでなく構造的役割も反映する。我々はGstarXが定性的に直感的に説明し、化学グラフ特性予測とテキストグラフの感情分類に基づく強いベースラインを定量的に改善することを示した。 Explaining predictions made by machine learning models is important and has attracted increased interest. The Shapley value from cooperative game theory has been proposed as a prime approach to compute feature importances towards predictions, especially for images, text, tabular data, and recently graph neural networks (GNNs) on graphs. In this work, we revisit the appropriateness of the Shapley value for graph explanation, where the task is to identify the most important subgraph and constituent nodes for graph-level predictions. We purport that the Shapley value is a non-ideal choice for graph data because it is by definition not structure-aware. We propose a Graph Structure-aware eXplanation (GStarX) method to leverage the critical graph structure information to improve the explanation. Specifically, we propose a scoring function based on a new structure-aware value from cooperative game theory called the HN value. When used to score node importance, the HN value utilizes graph structures to attribute cooperation surplus between neighbor nodes, resembling message passing in GNNs, so that node importance scores reflect not only the node feature importance, but also the structural roles. 
We demonstrate that GStarX produces qualitatively more intuitive explanations, and quantitatively improves over strong baselines on chemical graph property prediction and text graph sentiment classification.	翻訳日:2022-02-01 16:56:41 公開日:2022-01-28
# 状態マージによるRNNからの有限オートマタ抽出 Extracting Finite Automata from RNNs Using State Merging ( http://arxiv.org/abs/2201.12451v1 ) ライセンス: Link先を確認	William Merrill and Nikolaos Tsilivis	(参考訳) blackbox recurrent neural network(rnn)の振る舞いを解釈する一つの方法は、より解釈可能な離散計算モデルからその振る舞いをキャプチャする有限状態機械のように抽出することである。本研究では,rnnから有限オートマトンを抽出する新しい手法を提案する。提案手法の有効性をTomita言語ベンチマークで検証したところ,ベンチマーク中のすべての言語でトレーニングされたRNNから忠実なオートマトンを抽出できることがわかった。抽出性能は抽出プロセス中に提供されるデータ数と、ターゲット言語を完全に学習した後、RNNモデルが追加のエポックのために訓練されているかどうかによって支援される。我々はこの現象を解析するためにこの手法を用い,RNNの内部状態空間の圧縮につながるため,収束を超えたトレーニングが有用であることを確認した。そこで本研究では,RNNモデルの解釈可能性と解析に本手法をどのように利用できるかを示す。 One way to interpret the behavior of a blackbox recurrent neural network (RNN) is to extract from it a more interpretable discrete computational model, like a finite state machine, that captures its behavior. In this work, we propose a new method for extracting finite automata from RNNs inspired by the state merging paradigm from grammatical inference. We demonstrate the effectiveness of our method on the Tomita languages benchmark, where we find that it is able to extract faithful automata from RNNs trained on all languages in the benchmark. We find that extraction performance is aided by the number of data provided during the extraction process, as well as, curiously, whether the RNN model is trained for additional epochs after perfectly learning its target language. We use our method to analyze this phenomenon, finding that training beyond convergence is useful because it leads to compression of the internal state space of the RNN. This finding demonstrates how our method can be used for interpretability and analysis of trained RNN models.	翻訳日:2022-02-01 16:56:21 公開日:2022-01-28
# タンパク質構造表現学習のための重み付きニューラルネットワーク Directed Weight Neural Networks for Protein Structure Representation Learning ( http://arxiv.org/abs/2201.13299v1 ) ライセンス: Link先を確認	Jiahan Li, Shitong Luo, Congyue Deng, Chaoran Cheng, Jiaqi Guan, Leonidas Guibas, Jian Peng, Jianzhu Ma	(参考訳) タンパク質は、特定の3D構造に折り畳んで生物学的機能を発揮する。タンパク質構造を正確にモデル化するには、全体的な幾何学的トポロジーとアミノ酸間の局所的細粒度関係(例えば、側鎖ねじれ角とアミノ酸間配向)の両方を慎重に考慮すべきである。本研究では,異なるアミノ酸間の幾何学的関係をよりよく捉えるために,有向重みニューラルネットワークを提案する。スカラーから3次元指向ベクトルへの単一重み拡張により,新しいフレームワークは,古典的および so(3)-表現的特徴の両方に対する幾何学的操作の豊富な集合をサポートし,その上にアミノ酸情報を処理するためのパーセプトロンユニットを構築する。さらに,有向重みを既存のグラフニューラルネットワークに接続するためのタンパク質に同変メッセージパッシングパラダイムを導入し,グローバルスケールでのSO(3)-等価性の維持に優れた長所を示す。実験により,従来のニューラルネットワークや(グローバルに)等価ネットワークと比較して,幾何学的関係を表現する上で,ネットワークの表現性が著しく向上することが示された。また、タンパク質3D構造に関連する様々な計算生物学応用の最先端性能も達成している。 A protein performs biological functions by folding to a particular 3D structure. To accurately model the protein structures, both the overall geometric topology and local fine-grained relations between amino acids (e.g. side-chain torsion angles and inter-amino-acid orientations) should be carefully considered. In this work, we propose the Directed Weight Neural Network for better capturing geometric relations among different amino acids. Extending a single weight from a scalar to a 3D directed vector, our new framework supports a rich set of geometric operations on both classical and SO(3)--representation features, on top of which we construct a perceptron unit for processing amino-acid information. In addition, we introduce an equivariant message passing paradigm on proteins for plugging the directed weight perceptrons into existing Graph Neural Networks, showing superior versatility in maintaining SO(3)-equivariance at the global scale. Experiments show that our network has remarkably better expressiveness in representing geometric relations in comparison to classical neural networks and the (globally) equivariant networks. 
It also achieves state-of-the-art performance on various computational biology applications related to protein 3D structures.	翻訳日:2022-02-01 16:18:28 公開日:2022-01-28
# 組合せインタラクションテストを用いた機械学習の系統的トレーニングとテスト Systematic Training and Testing for Machine Learning Using Combinatorial Interaction Testing ( http://arxiv.org/abs/2201.12428v1 ) ライセンス: Link先を確認	Tyler Cody, Erin Lanus, Daniel D. Doyle, Laura Freeman	(参考訳) 本稿では,機械学習モデルにおけるテストおよびトレーニングセットの選択と特徴付けに,組合せカバレッジの体系的利用を示す。提案した研究は、機械学習で使用されるデータを特徴付けるために、ソフトウェアテストの欠陥を特定するためにうまく活用されている組合せ相互作用テストに適応する。 mnist手書き桁データを使用して、コンビネートカバレッジが、マシンラーニングモデルのパフォーマンスを強調するテストセットの選択、堅牢なモデルパフォーマンスにつながるトレーニングセットの選択、新しいドメインへの微調整モデルのためのデータの選択に使用できることを実証する。したがって、結果は機械学習のトレーニングとテストのための総合的なアプローチとして組み合わせカバレッジを実証する。本稿では、ニューラルネットワークの内部におけるカバレッジの利用に注目した先行研究とは対照的に、入力と出力から得られる単純な特徴のカバレッジについて考察する。そこで本稿では,機械学習モデルに対するテストおよびトレーニングセットのサプライヤーが,モデル自体に対する知的財産権を持っていない場合について論じる。最後に、組み合わせカバレッジに対する事前の批判に対処し、機械学習アプリケーションにおけるカバレッジメトリクスの使用を推奨する反論を提供する。 This paper demonstrates the systematic use of combinatorial coverage for selecting and characterizing test and training sets for machine learning models. The presented work adapts combinatorial interaction testing, which has been successfully leveraged in identifying faults in software testing, to characterize data used in machine learning. The MNIST hand-written digits data is used to demonstrate that combinatorial coverage can be used to select test sets that stress machine learning model performance, to select training sets that lead to robust model performance, and to select data for fine-tuning models to new domains. Thus, the results posit combinatorial coverage as a holistic approach to training and testing for machine learning. In contrast to prior work which has focused on the use of coverage in regard to the internal of neural networks, this paper considers coverage over simple features derived from inputs and outputs. Thus, this paper addresses the case where the supplier of test and training sets for machine learning models does not have intellectual property rights to the models themselves. 
Finally, the paper addresses prior criticism of combinatorial coverage and provides a rebuttal which advocates the use of coverage metrics in machine learning applications.	翻訳日:2022-02-01 16:17:54 公開日:2022-01-28
# マルチエージェント・コントロールにおけるリグレット最小化手法 A Regret Minimization Approach to Multi-Agent Control ( http://arxiv.org/abs/2201.13288v1 ) ライセンス: Link先を確認	Udaya Ghai, Udari Madhushani, Naomi Leonard, Elad Hazan	(参考訳) 本研究では,動的システムのマルチエージェント制御の問題点について考察する。本研究は,中央集権的な事前計算を行なわない最適制御に焦点をあて,安定化制御のみを備えた異なるエージェントに対する適応制御ポリシーを提案する。我々は、任意の(標準的な)後悔の少ない制御方法を分散アルゴリズムに還元する。この削減により、得られた分散アルゴリズムは、最適な事前計算された共同ポリシに対して、後悔の少ないことが保証される。提案手法は,オンライン凸最適化をマルチエージェント設定に一般化し,非定型制御からの最近のツールを適用することを含む。本手法は過度に作動する航空機のモデルを用いて実験的に評価する。分散手法は, 障害に対して頑健であり, ダイナミックスにおける逆摂動に対して頑健であることを示す。 We study the problem of multi-agent control of a dynamical system with known dynamics and adversarial disturbances. Our study focuses on optimal control without centralized precomputed policies, but rather with adaptive control policies for the different agents that are only equipped with a stabilizing controller. We give a reduction from any (standard) regret minimizing control method to a distributed algorithm. The reduction guarantees that the resulting distributed algorithm has low regret relative to the optimal precomputed joint policy. Our methodology involves generalizing online convex optimization to a multi-agent setting and applying recent tools from nonstochastic control derived for a single agent. We empirically evaluate our method on a model of an overactuated aircraft. We show that the distributed method is robust to failure and to adversarial perturbations in the dynamics.	翻訳日:2022-02-01 16:15:22 公開日:2022-01-28
# シーケンス生成によるスキーマフリー依存パーシング Schema-Free Dependency Parsing via Sequence Generation ( http://arxiv.org/abs/2201.12407v1 ) ライセンス: Link先を確認	Boda Lin, Zijun Yao, Jiaxin Shi, Shulin Cao, Binghao Tang, Si Li, Yong Luo, Juanzi Li, Lei Hou	(参考訳) 依存関係解析は、文の構文依存構造や意味依存構造を抽出することを目的としている。既存の方法は、普遍性を欠いたり、補助デコーダに強く依存する欠点を負う。これらの欠点を解消するために、補助構造や解析アルゴリズムを使わずに事前学習された言語モデル(PLM)のみを利用することで、シーケンス生成(SG)DPSGを介して、普遍的でスキーマなしの依存性解析(DP)を実現することを提案する。まず、解析構造をシーケンスに変換するための異なるシリアライズ設計戦略を検討する。次に、依存ユニットを設計し、これらのユニットをDPSGのシーケンスにまとめる。シーケンス生成の柔軟性が高いため、DPSGは単一のモデルを用いて構文DPと意味DPの両方を実現できる。特定のスキーマを示すプレフィックスをシーケンスと結合することで、DPSGはマルチスキーマ解析を達成できます。 dpsgの有効性は,ptb,codt,sdp15,semeval16など,広く使用されているdpベンチマークを用いて実証した。 DPSGは、CODTとSemEval16におけるすべてのベンチマークのファーストレベルメソッドと、最先端(SOTA)のパフォーマンスで同等の結果を得る。本稿ではDPSGが新たな解析パラダイムとなる可能性を実証する。私たちは受け入れ次第コードを公開します。 Dependency parsing aims to extract syntactic dependency structure or semantic dependency structure for sentences. Existing methods suffer the drawbacks of lacking universality or highly relying on the auxiliary decoder. To remedy these drawbacks, we propose to achieve universal and schema-free Dependency Parsing (DP) via Sequence Generation (SG) DPSG by utilizing only the pre-trained language model (PLM) without any auxiliary structures or parsing algorithms. We first explore different serialization designing strategies for converting parsing structures into sequences. Then we design dependency units and concatenate these units into the sequence for DPSG. Thanks to the high flexibility of the sequence generation, our DPSG can achieve both syntactic DP and semantic DP using a single model. By concatenating the prefix to indicate the specific schema with the sequence, our DPSG can even accomplish multi-schemata parsing. The effectiveness of our DPSG is demonstrated by the experiments on widely used DP benchmarks, i.e., PTB, CODT, SDP15, and SemEval16. 
DPSG achieves results comparable to the first-tier methods on all the benchmarks, and even state-of-the-art (SOTA) performance on CODT and SemEval16. This paper demonstrates that DPSG has the potential to be a new parsing paradigm. We will release our code upon acceptance.	翻訳日:2022-02-01 16:10:49 公開日:2022-01-28
# 物理エンコード学習による希少データからの非線形pdesの検出 Discovering Nonlinear PDEs from Scarce Data with Physics-encoded Learning ( http://arxiv.org/abs/2201.12354v1 ) ライセンス: Link先を確認	Chengping Rao, Pu Ren, Yang Liu, Hao Sun	(参考訳) 複雑な物理現象を支配する偏微分方程式(pdes)を発見するために、実験的な測定値を活用することへの関心が高まっている。過去の研究はデータ駆動型PDE発見において大きな成功を収めてきたが、低品質の測定データを扱う場合、既存の手法の堅牢性は保証できない。この課題を克服するために,不足・雑音データから時空間PDEを発見するための物理符号化離散学習フレームワークを提案する。まず、1)表現能力に柔軟でありながら事前の物理知識(PDE用語、仮定されたPDE構造、初期/境界条件など)を符号化し、高忠実度データを正確に再構成し、(2)再構成されたデータでスパースレグレッションを行い、PDEの明示的な形式を特定する、新しいディープ・畳み込み・リカレント・ネットワークを導入する。本手法を非線形PDEシステムで検証する。ベースラインモデルに対する提案手法の有効性と優位性を示す。 There have been growing interests in leveraging experimental measurements to discover the underlying partial differential equations (PDEs) that govern complex physical phenomena. Although past research attempts have achieved great success in data-driven PDE discovery, the robustness of the existing methods cannot be guaranteed when dealing with low-quality measurement data. To overcome this challenge, we propose a novel physics-encoded discrete learning framework for discovering spatiotemporal PDEs from scarce and noisy data. The general idea is to (1) firstly introduce a novel deep convolutional-recurrent network, which can encode prior physics knowledge (e.g., known PDE terms, assumed PDE structure, initial/boundary conditions, etc.) while remaining flexible on representation capability, to accurately reconstruct high-fidelity data, and (2) perform sparse regression with the reconstructed data to identify the explicit form of the governing PDEs. We validate our method on three nonlinear PDE systems. The effectiveness and superiority of the proposed method over baseline models are demonstrated.	翻訳日:2022-02-01 15:40:02 公開日:2022-01-28
# 安全編集者政策による安全強化学習に向けて Towards Safe Reinforcement Learning with a Safety Editor Policy ( http://arxiv.org/abs/2201.12427v1 ) ライセンス: Link先を確認	Haonan Yu, Wei Xu, Haichao Zhang	(参考訳) 制約を満たすとともに実用性を最大化する安全強化学習(RL)問題を考察する。我々は、安全概念の事前知識や事前訓練を前提としないので、漸近的制約満足度に興味を持っている。この研究で一般的なアプローチは、ラグランジアン法とモデルなしRLアルゴリズムを組み合わせることで、制約報酬の重み付けを動的に調整することである。効用と制約報酬の衝突に対処するための単一のポリシーに依存しており、しばしば困難である。安全層設計(dalal et al., 2018)に着想を得た我々は、ユーティリティ最大化ポリシーによって出力される潜在的安全でないアクションを安全なものに変換する安全エディタポリシーを別々に学ぶことを提案する。安全編集者は、編集前後のアクションの実用Q値のヒンジ損失を最小限に抑えつつ、制約報酬を最大化するように訓練される。厳格な制約しきい値を持つ12のカスタムセーフティジム(ray et al., 2019)と2つのセーフレーシングタスクにおいて,本手法は制約に準拠しながら優れた実用性能を示す。アブレーション研究は、我々の2つの政治デザインが重要であることを示している。典型的な単一政治アプローチのモデル容量を2倍にするだけでは、同等の結果にはならない。特定の状況ではQヒンジ損失も重要であり、通常のL2距離に置き換えるには失敗する可能性がある。 We consider the safe reinforcement learning (RL) problem of maximizing utility while satisfying provided constraints. Since we do not assume any prior knowledge or pre-training of the safety concept, we are interested in asymptotic constraint satisfaction. A popular approach in this line of research is to combine the Lagrangian method with a model-free RL algorithm to adjust the weight of the constraint reward dynamically. It relies on a single policy to handle the conflict between utility and constraint rewards, which is often challenging. Inspired by the safety layer design (Dalal et al., 2018), we propose to separately learn a safety editor policy that transforms potentially unsafe actions output by a utility maximizer policy into safe ones. The safety editor is trained to maximize the constraint reward while minimizing a hinge loss of the utility Q values of actions before and after the edit. On 12 custom Safety Gym (Ray et al., 2019) tasks and 2 safe racing tasks with very harsh constraint thresholds, our approach demonstrates outstanding utility performance while complying with the constraints. Ablation studies reveal that our two-policy design is critical. 
Simply doubling the model capacity of typical single-policy approaches will not lead to comparable results. The Q hinge loss is also important in certain circumstances, and replacing it with the usual L2 distance could fail badly.	翻訳日:2022-02-01 15:39:42 公開日:2022-01-28
# エントロピー・リワード(実践)は必要か? Do You Need the Entropy Reward (in Practice)? ( http://arxiv.org/abs/2201.12434v1 ) ライセンス: Link先を確認	Haonan Yu, Haichao Zhang, Wei Xu	(参考訳) 最大エントロピー(MaxEnt) RLは、元のタスク報酬とエントロピー報酬の組み合わせを最大化する。エントロピーによって課される規則化は、政策改善と政策評価の両方において、共に良好な探索、訓練の収束、学習した政策の堅牢性に寄与していると考えられている。本稿では,MaxEnt RLの代表者であるソフトアクター・クリティック(SAC)に対する様々なアブレーション研究を行い,エントロピーを本質的な報酬としてより深く考察する。以上の結果から,エントロピー報酬は政策評価に留意して適用すべきである。一方、エントロピー報酬は他の固有の報酬と同様に、適切に管理されていない場合、メインタスク報酬を曖昧にすることができる。特にエピソード的マルコフ決定過程(MDP)におけるエントロピー報酬(entropy reward)の失敗事例を同定し,政策が過度に楽観的あるいは悲観的になる可能性を示唆した。一方,本研究は,エントロピー正規化を政策改善にのみ用いることは,政策改善と政策評価の両方で使用するよりも,同等あるいはそれ以上のパフォーマンスと堅牢性をもたらすことを示した。これらの観測に基づいて、エントロピー報酬をゼロ平均(SACZero)に正規化するか、あるいはより実用的な結果を得るために政策評価(SACLite)から単に取り除くことを推奨する。 Maximum entropy (MaxEnt) RL maximizes a combination of the original task reward and an entropy reward. It is believed that the regularization imposed by entropy, on both policy improvement and policy evaluation, together contributes to good exploration, training convergence, and robustness of learned policies. This paper takes a closer look at entropy as an intrinsic reward, by conducting various ablation studies on soft actor-critic (SAC), a popular representative of MaxEnt RL. Our findings reveal that in general, entropy rewards should be applied with caution to policy evaluation. On one hand, the entropy reward, like any other intrinsic reward, could obscure the main task reward if it is not properly managed. We identify some failure cases of the entropy reward especially in episodic Markov decision processes (MDPs), where it could cause the policy to be overly optimistic or pessimistic. On the other hand, our large-scale empirical study shows that using entropy regularization alone in policy improvement, leads to comparable or even better performance and robustness than using it in both policy improvement and policy evaluation. 
Based on these observations, we recommend either normalizing the entropy reward to a zero mean (SACZero), or simply removing it from policy evaluation (SACLite) for better practical results.	翻訳日:2022-02-01 15:38:01 公開日:2022-01-28
# Any-Play: ゼロショットコーディネーションに固有の拡張 Any-Play: An Intrinsic Augmentation for Zero-Shot Coordination ( http://arxiv.org/abs/2201.12436v1 ) ライセンス: Link先を確認	Keane Lucas and Ross E. Allen	(参考訳) 協調作業における人間または超人的能力を持つ協調人工知能は、機械学習研究のフロンティアに立っている。先行研究は、セルフプレイ(一緒に訓練されたエージェントで構成されるチーム)とクロスプレイ(同じアルゴリズムを使用して独立して訓練されたエージェントのチーム)の制限パラダイムの下で、協調AIのパフォーマンスを評価する傾向があった。最近の研究によると、これらの狭い設定に最適化されたAIは、現実世界で望ましくない協力者になる可能性がある。我々は、エージェント間のアルゴリズム的類似性を仮定することなく、実験プール内の他のすべてのエージェントとの協調性能の評価を行う、アルゴリズム間クロスプレイと呼ばれる協調AIを評価するための代替基準を定式化する。このパラダイムでは、Other-Play や Off-Belief Learning といった既存の最先端の協調型AIアルゴリズムが低パフォーマンスであることを示す。本稿では,ゼロショットコーディネーション(ZSC)のための多様性に基づく固有報酬のマルチエージェント拡張であるAny-Play学習拡張を提案する。本研究では,Any-Play学習をSAD(Simplified Action Decoder)に適用し,コラボレーションカードゲーム「はなび」の最先端性能を示す。 Cooperative artificial intelligence with human or superhuman proficiency in collaborative tasks stands at the frontier of machine learning research. Prior work has tended to evaluate cooperative AI performance under the restrictive paradigms of self-play (teams composed of agents trained together) and cross-play (teams of agents trained independently but using the same algorithm). Recent work has indicated that AI optimized for these narrow settings may make for undesirable collaborators in the real world. We formalize an alternative criterion for evaluating cooperative AI, referred to as inter-algorithm cross-play, where agents are evaluated on teaming performance with all other agents within an experiment pool with no assumption of algorithmic similarities between agents. We show that existing state-of-the-art cooperative AI algorithms, such as Other-Play and Off-Belief Learning, under-perform in this paradigm. We propose the Any-Play learning augmentation -- a multi-agent extension of diversity-based intrinsic rewards for zero-shot coordination (ZSC) -- for generalizing self-play-based algorithms to the inter-algorithm cross-play setting. 
We apply the Any-Play learning augmentation to the Simplified Action Decoder (SAD) and demonstrate state-of-the-art performance in the collaborative card game Hanabi.	翻訳日:2022-02-01 15:37:34 公開日:2022-01-28
# Automaton-augmented Retrievalを用いたニューロシンボリック言語モデリング Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval ( http://arxiv.org/abs/2201.12431v1 ) ライセンス: Link先を確認	Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig	(参考訳) 検索型言語モデル(R-LM)は、標準言語モデル(LM)とテスト時に外部データストアから取得した例を組み合わせることで、自然言語テキストの確率をモデル化する。効果的ではあるが、実際にこれらのモデルを使用する際の大きなボトルネックは計算コストのかかるデータストア検索である。本稿では,(1)エントリの"状態"へのクラスタリングと(2)前のエントリからの状態遷移に基づいて,データストア検索を近似したRetoMaton --検索オートマトンを提案する。これにより、データストアをフラットリストとして表現するのではなく、データストア上に構築された重み付き有限オートマトンが実現される。オートマトンの作成は監視されず、RetoMatonはオリジナルのトレーニングコーパスまたは他のドメインから、任意のテキストコレクションから構築することができる。このオートマトンをLM推論と並行して推論時にトラバースすることは、その難易度を減少させるか、またはkNN-LM(Khandelwal et al., 2020)上で最も近い隣人探索の83%を、難易度を損なうことなく節約する。 Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time. While effective, a major bottleneck of using these models in practice is the computationally costly datastore search, which can be performed as frequently as every time step. In this paper, we present RetoMaton -- retrieval automaton -- which approximates the datastore search, based on (1) clustering of entries into "states", and (2) state transitions from previous entries. This effectively results in a weighted finite automaton built on top of the datastore, instead of representing the datastore as a flat list. The creation of the automaton is unsupervised, and a RetoMaton can be constructed from any text collection: either the original training corpus or from another domain. Traversing this automaton at inference time, in parallel to the LM inference, reduces its perplexity, or alternatively saves up to 83% of the nearest neighbor searches over kNN-LM (Khandelwal et al., 2020), without hurting perplexity.	翻訳日:2022-02-01 15:20:15 公開日:2022-01-28
# 善の逆例:不均衡学習を指導する逆例 Adversarial Examples for Good: Adversarial Examples Guided Imbalanced Learning ( http://arxiv.org/abs/2201.12356v1 ) ライセンス: Link先を確認	Jie Zhang, Lei Zhang, Gang Li, Chao Wu	(参考訳) 逆の例は、攻撃者がモデルをミスさせるよう設計した機械学習モデルの入力である。本稿では,不均衡学習の性能向上のために,逆例も有効に活用できることを実証する。我々は,不均衡なデータの扱い方について,GAE(Guiding Adversarial Examples)によるトレーニングによってバイアス付き決定境界を調整するという,新たな視点を提供する。本手法は,少数クラスの精度を効果的に向上し,多数派の精度を損なうことができる。いくつかのベンチマークデータセットにおいて,提案手法は最先端手法に匹敵することを示す。最善の知識として、我々は、逆の例で不均衡な学習を扱う最初の人です。 Adversarial examples are inputs for machine learning models that have been designed by attackers to cause the model to make mistakes. In this paper, we demonstrate that adversarial examples can also be utilized for good to improve the performance of imbalanced learning. We provide a new perspective on how to deal with imbalanced data: adjust the biased decision boundary by training with Guiding Adversarial Examples (GAEs). Our method can effectively increase the accuracy of minority classes while sacrificing little accuracy on majority classes. We empirically show, on several benchmark datasets, our proposed method is comparable to the state-of-the-art method. To our best knowledge, we are the first to deal with imbalanced learning with adversarial examples.	翻訳日:2022-02-01 15:19:04 公開日:2022-01-28
# 適応型ルックアヘッドによる計画と学習 Planning and Learning with Adaptive Lookahead ( http://arxiv.org/abs/2201.12403v1 ) ライセンス: Link先を確認	Aviv Rosenberg and Assaf Hallak and Shie Mannor and Gal Chechik and Gal Dalal	(参考訳) 古典的ポリシーイテレーション(PI)アルゴリズムは、強欲な一段階の政策改善と政策評価を交互に行う。近年の文献では、複数ステップのルックアヘッド政策の改善は、イテレーション毎の複雑さの増加を犠牲にして、収束率の向上につながることが示されている。しかし、アルゴリズムを実行する前に、何が最良の固定された視線地平線であるかを知ることはできない。さらに、与えられた走行ごとに、地平線より大きい視線を使うのは、しばしば無駄である。本研究では,多段階の地平線を状態と推定値の関数として動的に適応する手法として,初めて提案する。 2つのPI変種を考案し、イテレーション数とイテレーション毎の計算複雑性のトレードオフを分析する。第1の変種は所望の縮小係数を目的とし、文単位の複雑さを最小化する。第2の変種は1イテレーションあたりの計算複雑性を入力とし、全体の収縮係数を最小化する。次に、適応木探索地平線を持つ対応するDQNに基づくアルゴリズムを考案する。また、オンライン学習の新たな拡張として、奥行き値関数推定器(per-deepth value function estimator)も含んでいる。最後に, 迷路環境およびアタリにおける適応型ルックアヘッド法の有効性を実証した。 The classical Policy Iteration (PI) algorithm alternates between greedy one-step policy improvement and policy evaluation. Recent literature shows that multi-step lookahead policy improvement leads to a better convergence rate at the expense of increased complexity per iteration. However, prior to running the algorithm, one cannot tell what is the best fixed lookahead horizon. Moreover, per a given run, using a lookahead of horizon larger than one is often wasteful. In this work, we propose for the first time to dynamically adapt the multi-step lookahead horizon as a function of the state and of the value estimate. We devise two PI variants and analyze the trade-off between iteration count and computational complexity per iteration. The first variant takes the desired contraction factor as the objective and minimizes the per-iteration complexity. The second variant takes as input the computational complexity per iteration and minimizes the overall contraction factor. We then devise a corresponding DQN-based algorithm with an adaptive tree search horizon. We also include a novel enhancement for on-policy learning: per-depth value function estimator. 
Lastly, we demonstrate the efficacy of our adaptive lookahead method in a maze environment and in Atari.	翻訳日:2022-02-01 15:18:54 公開日:2022-01-28
# 抽象的視覚推論のための深層学習法:レイブンの進行行列に関する調査 Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices ( http://arxiv.org/abs/2201.12382v1 ) ライセンス: Link先を確認	Mikołaj Małkiński and Jacek Mańdziuk	(参考訳) 抽象視覚推論(AVR)ドメインは、特定のシーンに存在するエンティティ間の関係を推論する能力を必要とする問題を解決する。人間は一般に「自然」な方法でAVRタスクを解くが、従来の経験がなくてもこのような問題は現在の機械学習システムでは難しいことが証明されている。本稿では,AVR問題に対するディープラーニング手法の適用の最近の進歩を,機械知能研究のプロキシとして要約する。我々は、最も一般的なタイプのAVRタスク(Raven's Progressive Matrices (RPM))に焦点を当て、RPMを解決するために適用される学習方法と深層ニューラルネットワークの包括的なレビューとRPMベンチマークセットを提供する。 RPMを解くための最先端のアプローチのパフォーマンス分析は、この分野の現在と将来のトレンドに関する特定の洞察と発言の定式化につながる。本論文は,RPM研究の発見から実世界の問題がどのように恩恵を受けるかを示すことで結論づける。 The abstract visual reasoning (AVR) domain encompasses problems whose solution requires the ability to reason about relations among entities present in a given scene. While humans generally solve AVR tasks in a "natural" way, even without prior experience, this type of problem has proven difficult for current machine learning systems. The paper summarises recent progress in applying deep learning methods to solving AVR problems, as a proxy for studying machine intelligence. We focus on the most common type of AVR tasks -- the Raven's Progressive Matrices (RPMs) -- and provide a comprehensive review of the learning methods and deep neural models applied to solve RPMs, as well as the RPM benchmark sets. Performance analysis of the state-of-the-art approaches to solving RPMs leads to the formulation of certain insights and remarks on the current and future trends in this area. We conclude the paper by demonstrating how real-world problems can benefit from the discoveries of RPM studies.	翻訳日:2022-02-01 14:38:51 公開日:2022-01-28
# 動的雑音の背景における視覚探索戦略の最適化のための深部q学習法 A deep Q-learning method for optimizing visual search strategies in backgrounds of dynamic noise ( http://arxiv.org/abs/2201.12385v1 ) ライセンス: Link先を確認	Weimin Zhou, Miguel P. Eckstein	(参考訳) 人間は様々な解像度で視覚情報を処理し(探索された視覚システム)、目の動きを通して高解像度の焦点を興味のある点に向けて画像を探索する。タスク関連情報の完全な知識を用いるベイズ理想探索器(is)は、眼球運動戦略を最適化し、最適な探索性能を達成する。 ISは、人間の眼球運動の最適性を評価する重要なツールとして利用でき、人間の視線探索戦略を改善するためのガイダンスを提供する可能性がある。 Najemnik と Geisler (2005) は空間的 1/f ノイズの背景に対する IS を導出した。対応するテンプレート応答はガウス分布に従い、最適な探索戦略を解析的に決定することができる。しかし、医療画像のようなより現実的で複雑な背景を考えると、ISの計算は難解である。現代の強化学習法は、様々なタスクに対して最適なポリシーを得るためにうまく適用され、背景生成関数の完全な知識を必要とせず、解剖学的背景に適用することができる。重要な第一歩は強化学習法の最適性を検証することである。本研究では, isを近似するqネットワークを用いた強化学習手法について検討する。本稿では,qネットワークに対応する検索戦略がis検索戦略と一致することを示す。本研究は,実解剖学的背景を用いた最適眼球運動計画推定のためのq-networkアプローチによる強化学習の可能性を示す。 Humans process visual information with varying resolution (foveated visual system) and explore images by orienting through eye movements the high-resolution fovea to points of interest. The Bayesian ideal searcher (IS) that employs complete knowledge of task-relevant information optimizes eye movement strategy and achieves the optimal search performance. The IS can be employed as an important tool to evaluate the optimality of human eye movements, and potentially provide guidance to improve human observer visual search strategies. Najemnik and Geisler (2005) derived an IS for backgrounds of spatial 1/f noise. The corresponding template responses follow Gaussian distributions and the optimal search strategy can be analytically determined. However, the computation of the IS can be intractable when considering more realistic and complex backgrounds such as medical images. Modern reinforcement learning methods, successfully applied to obtain optimal policy for a variety of tasks, do not require complete knowledge of the background generating functions and can be potentially applied to anatomical backgrounds. 
An important first step is to validate the optimality of the reinforcement learning method. In this study, we investigate the ability of a reinforcement learning method that employs a Q-network to approximate the IS. We demonstrate that the search strategy corresponding to the Q-network is consistent with the IS search strategy. The findings show the potential of reinforcement learning with a Q-network to estimate optimal eye movement planning with real anatomical backgrounds.	翻訳日:2022-02-01 14:36:51 公開日:2022-01-28
# (参考訳) データセンターにおけるネットワーク負荷分散のためのマルチエージェント強化学習 Multi-Agent Reinforcement Learning for Network Load Balancing in Data Center ( http://arxiv.org/abs/2201.11727v2 ) ライセンス: CC0 1.0	Zhiyuan Yao, Zihan Ding, Thomas Clausen	(参考訳) 本稿では,マルチエージェント強化学習(MARL)手法のための実世界課題であるネットワーク負荷分散問題を提案する。 Weighted-Cost Multi-Path (WCMP)やLocal Shortest Queue (LSQ)のような従来のヒューリスティックなソリューションは、ワークロードの分散や到着率の変化に対して柔軟性が低く、複数のロードバランサ間のバランスが低い。協調的ネットワーク負荷分散タスクはDec-POMDP問題として定式化され、MARL法を自然に誘導する。学習に基づく手法を適用するための現実のギャップを埋めるため、すべての手法は中程度から大規模までのエミュレーションシステム上で直接訓練され、評価される。現実的なテストベッドの実験では、独立的で"利己的"なロードバランシング戦略が必ずしもグローバルな最適戦略ではないことが示され、提案されたMARLソリューションは、異なる現実的な設定よりも優れたパフォーマンスを示している。さらに,ネットワークロードバランシングにおけるMARL手法の潜在的な難しさを解析し,学習者やネットワークコミュニティの関心を引きつけている。 This paper presents the network load balancing problem, a challenging real-world task for multi-agent reinforcement learning (MARL) methods. Traditional heuristic solutions like Weighted-Cost Multi-Path (WCMP) and Local Shortest Queue (LSQ) are less flexible to changing workload distributions and arrival rates, with a poor balance among multiple load balancers. The cooperative network load balancing task is formulated as a Dec-POMDP problem, which naturally induces the MARL methods. To bridge the reality gap for applying learning-based methods, all methods are directly trained and evaluated on an emulation system at moderate to large scale. Experiments on realistic testbeds show that the independent and "selfish" load balancing strategies are not necessarily the globally optimal ones, while the proposed MARL solution has a superior performance over different realistic settings. Additionally, the potential difficulties of MARL methods for network load balancing are analysed, which helps to draw the attention of the learning and network communities to such challenges.	翻訳日:2022-02-01 13:35:55 公開日:2022-01-28
# (参考訳) 低頻度かつ説明可能な特徴を有する白血球白血病の分類 Classification of White Blood Cell Leukemia with Low Number of Interpretable and Explainable Features ( http://arxiv.org/abs/2201.11864v1 ) ライセンス: CC BY 4.0	William Franz Lamberti	(参考訳) 白血球(WBC)白血病は画像ベース分類によって検出される。畳み込みニューラルネットワークは、細胞の画像を悪性または健康に分類するために必要な特徴を学ぶために使用される。しかし、この種のモデルは多数のパラメータを学習する必要があるため、解釈や説明が困難である。説明可能なAI(XAI)は、モデルが意思決定を行う方法に関する洞察を提供することで、この問題を緩和しようとする。そこで,本研究では,説明可能かつ解釈可能な特徴を24個しか用いず,他の手法と比較して約4.38\%の精度で高いXAIモデルを提案する。さらに,本手法は,細胞分類においてどの変数が最も重要なのかを考察する。この洞察は、研究室がWBCを別々に扱うと、様々な指標の重要性が大幅に変化することを示す。分類の重要な特徴を理解することは、医学的画像診断や、科学的な追求のために構築されたAIモデルの理解において不可欠である。 White Blood Cell (WBC) Leukaemia is detected through image-based classification. Convolutional Neural Networks are used to learn the features needed to classify images of cells as malignant or healthy. However, this type of model requires learning a large number of parameters and is difficult to interpret and explain. Explainable AI (XAI) attempts to alleviate this issue by providing insights into how models make decisions. Therefore, we present an XAI model which uses only 24 explainable and interpretable features and is highly competitive to other approaches by outperforming them by about 4.38\%. Further, our approach provides insight into which variables are the most important for the classification of the cells. This insight provides evidence that when labs treat the WBCs differently, the importance of various metrics changes substantially. Understanding the important features for classification is vital in medical imaging diagnosis and, by extension, understanding the AI models built in scientific pursuits.	翻訳日:2022-02-01 09:28:22 公開日:2022-01-28
# (参考訳) FedLite: リソース制約のあるクライアント上でのフェデレーション学習のためのスケーラブルなアプローチ FedLite: A Scalable Approach for Federated Learning on Resource-constrained Clients ( http://arxiv.org/abs/2201.11865v1 ) ライセンス: CC BY 4.0	Jianyu Wang, Hang Qi, Ankit Singh Rawat, Sashank Reddi, Sagar Waghmare, Felix X. Yu, Gauri Joshi	(参考訳) 古典的なフェデレーション学習では、クライアントは、プライベートデータ上の基盤モデルのローカルアップデートをコーディネートサーバに伝えて、全体的なトレーニングに寄与する。しかし、リソースに制約のあるクライアントが大規模な機械学習モデルを学習しようとすると、モデル全体の更新と通信は極めて高価になる。分割学習は、モデルの一部だけがクライアントに保存され、トレーニングされ、残りの大部分がサーバに留まっているような環境で、自然なソリューションを提供する。しかし、分割学習で使用されるモデル分割は、かなりの通信コストをもたらす。本稿では,勾配補正法を併用した新しいクラスタリング方式を用いて,付加的な通信を圧縮することで,この問題に対処する。画像およびテキストベンチマークの広範な実証評価により、提案手法は最大490\times$の通信コストを最小の精度で削減でき、望ましい性能と通信トレードオフを実現できることが示された。 In classical federated learning, the clients contribute to the overall training by communicating local updates for the underlying model on their private data to a coordinating server. However, updating and communicating the entire model becomes prohibitively expensive when resource-constrained clients collectively aim to train a large machine learning model. Split learning provides a natural solution in such a setting, where only a small part of the model is stored and trained on clients while the remaining large part of the model only stays at the servers. However, the model partitioning employed in split learning introduces a significant amount of communication cost. This paper addresses this issue by compressing the additional communication using a novel clustering scheme accompanied by a gradient correction method. Extensive empirical evaluations on image and text benchmarks show that the proposed method can achieve up to $490\times$ communication cost reduction with minimal drop in accuracy, and enables a desirable performance vs. communication trade-off.	翻訳日:2022-02-01 09:12:40 公開日:2022-01-28
# (参考訳) ラベル平滑化を用いた病理組織像分類器の校正 Calibrating Histopathology Image Classifiers using Label Smoothing ( http://arxiv.org/abs/2201.11866v1 ) ライセンス: CC BY 4.0	Jerry Wei and Lorenzo Torresani and Jason Wei and Saeed Hassanpour	(参考訳) 病理組織像の分類は、病理組織像が自然に様々な診断的特徴を示すため、従来の画像分類課題と根本的に異なる。しかし、アノテータの不一致の例は、多くの場合、大多数のラベルに割り当てられるか、病理組織学画像分類器の訓練時に完全に破棄される。この広範にわたる慣行は、しばしば難易度を考慮せず、モデルのキャリブレーションが貧弱な分類器をもたらす。本稿では, 組織像分類器にサンプル難易度に関する帰納バイアスを与えることにより, モデル校正を改善することができるか? 画像毎のアノテータ合意を利用したラベル平滑化手法を提案する。提案手法は単純ではあるが,精度を維持(あるいは改善)しながら,モデルキャリブレーションを大幅に改善していることがわかった。大腸ポリープ分類は消化管病理における一般的な課題でありながら課題であり,本提案の合意対応ラベル平滑化手法は校正誤差を約70%削減する。さらに,アノテータ契約のプロキシとしてモデル信頼性を用いることでキャリブレーションと精度が向上し,複数のアノテータを含まないデータセットは,提案手法によるラベル平滑化手法の恩恵を受けられることが示唆された。キャリブレーション(特に病理組織学的画像解析)の重要性を考えると、提案手法の改善は、他の病理組織学的画像分類タスクにおけるさらなる探索と潜在的な実装に役立つ。 The classification of histopathology images fundamentally differs from traditional image classification tasks because histopathology images naturally exhibit a range of diagnostic features, resulting in a diverse range of annotator agreement levels. However, examples with high annotator disagreement are often either assigned the majority label or discarded entirely when training histopathology image classifiers. This widespread practice often yields classifiers that do not account for example difficulty and exhibit poor model calibration. In this paper, we ask: can we improve model calibration by endowing histopathology image classifiers with inductive biases about example difficulty? We propose several label smoothing methods that utilize per-image annotator agreement. Though our methods are simple, we find that they substantially improve model calibration, while maintaining (or even improving) accuracy. For colorectal polyp classification, a common yet challenging task in gastrointestinal pathology, we find that our proposed agreement-aware label smoothing methods reduce calibration error by almost 70%. 
Moreover, we find that using model confidence as a proxy for annotator agreement also improves calibration and accuracy, suggesting that datasets without multiple annotators can still benefit from label smoothing via our proposed confidence-aware label smoothing methods. Given the importance of calibration (especially in histopathology image analysis), the improvements from our proposed techniques merit further exploration and potential implementation in other histopathology image classification tasks.	翻訳日:2022-02-01 08:47:00 公開日:2022-01-28
# (参考訳) 構造入力による局所潜時空間ベイズ最適化 Local Latent Space Bayesian Optimization over Structured Inputs ( http://arxiv.org/abs/2201.11872v1 ) ライセンス: CC BY-SA 4.0	Natalie Maus, Haydn T. Jones, Juston S. Moore, Matt J. Kusner, John Bradshaw, Jacob R. Gardner	(参考訳) 深層オートエンコーダモデル(英語版)(daes)の潜在空間上のベイズ最適化は、構造化、離散化、列挙困難な探索空間(例えば分子)に対して挑戦的なブラックボックス関数を最適化するための有望な新しいアプローチとして最近登場した。ここで、daeは入力をベイズ最適化ツールがより容易に適用できる連続的潜在空間にマッピングすることで、検索空間を劇的に単純化する。この単純化にもかかわらず、潜在空間は通常高次元のままである。したがって、うまく適合した潜在空間であっても、これらのアプローチは必ずしも完全な解を提供するものではなく、むしろ構造化最適化問題を高次元空間に移すことができる。本稿では,高次元ベイズ最適化における信頼領域の概念を構造化環境に適応させるLOL-BOを提案する。 daeのグローバルエンコーダと信頼領域内のサロゲートモデルのディープカーネルの両方として機能するようにエンコーダを再構成することで、潜在空間における局所最適化の概念を入力空間における局所最適化と一致させる。 LOL-BOは6つの実世界のベンチマークで、最先端の潜在空間ベイズ最適化手法よりも最大20倍の改善を実現し、最適化戦略の改善はより良いDAEモデルの開発と同じくらい重要であることを示した。 Bayesian optimization over the latent spaces of deep autoencoder models (DAEs) has recently emerged as a promising new approach for optimizing challenging black-box functions over structured, discrete, hard-to-enumerate search spaces (e.g., molecules). Here the DAE dramatically simplifies the search space by mapping inputs into a continuous latent space where familiar Bayesian optimization tools can be more readily applied. Despite this simplification, the latent space typically remains high-dimensional. Thus, even with a well-suited latent space, these approaches do not necessarily provide a complete solution, but may rather shift the structured optimization problem to a high-dimensional one. In this paper, we propose LOL-BO, which adapts the notion of trust regions explored in recent work on high-dimensional Bayesian optimization to the structured setting. By reformulating the encoder to function as both an encoder for the DAE globally and as a deep kernel for the surrogate model within a trust region, we better align the notion of local optimization in the latent space with local optimization in the input space. 
LOL-BO achieves as much as a 20-fold improvement over state-of-the-art latent space Bayesian optimization methods across six real-world benchmarks, demonstrating that improvement in optimization strategies is as important as developing better DAE models.	翻訳日:2022-02-01 08:35:16 公開日:2022-01-28
# (参考訳) オートエンコーダアーキテクチャにおける分布データの幾何学的不安定性 Geometric instability of out of distribution data across autoencoder architecture ( http://arxiv.org/abs/2201.11902v1 ) ライセンス: CC BY 4.0	Susama Agarwala, Ben Dees, Corey Lowman	(参考訳) mnistで学習したオートエンコーダの系統が学習した地図を調査し,10種類の分布に応じて画素値のランダム選択により作成した10種類のデータセットについて評価した。具体的には,オートエンコーダの重み行列で定義されるジャコビアンの固有値と評価点について検討する。十分高い潜在次元では、各オートエンコーダは、類似の \emph{generalized characters} としてすべての評価データセットを再構成するが、この再構成された \emph{generalized character} は、オートエンコーダをまたいで変化する。固有値解析により、再構成された画像が分布データセットの全てに対してMNIST文字のように見える場合でも、MNIST文字の潜在表現に近い潜在表現を持つとは限らないことが分かる。いずれにせよ、固有値解析は、分布入力の関数としてのオートエンコーダの幾何的不安定性を、同じ入力の集合上のアーキテクチャ全体にわたって証明した。 We study the map learned by a family of autoencoders trained on MNIST, and evaluated on ten different data sets created by the random selection of pixel values according to ten different distributions. Specifically, we study the eigenvalues of the Jacobians defined by the weight matrices of the autoencoder at each training and evaluation point. For high enough latent dimension, we find that each autoencoder reconstructs all the evaluation data sets as similar \emph{generalized characters}, but that this reconstructed \emph{generalized character} changes across autoencoder. Eigenvalue analysis shows that even when the reconstructed image appears to be an MNIST character for all out of distribution data sets, not all have latent representations that are close to the latent representation of MNIST characters. All told, the eigenvalue analysis demonstrated a great deal of geometric instability of the autoencoder both as a function on out of distribution inputs, and across architectures on the same set of inputs.	翻訳日:2022-02-01 08:12:49 公開日:2022-01-28
# (参考訳) 大規模言語モデルにおける思考プロンプトの連鎖 Chain of Thought Prompting Elicits Reasoning in Large Language Models ( http://arxiv.org/abs/2201.11903v1 ) ライセンス: CC BY 4.0	Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Quoc Le, Denny Zhou	(参考訳) 言語モデルのスケールアップは、様々なNLPタスクのパフォーマンスを確実に向上させたが、現在最大のモデルでさえ、数学語問題、記号操作、コモンセンス推論のような特定の推論タスクに苦戦している。本稿は,質問に回答する際の推論過程を模倣した一連の短い文である,一貫性のある思考列を生成する言語モデルの能力について考察する。実験により、プロンプトによって思考の連鎖を誘導することで、十分に大きな言語モデルが、平らなスケーリング曲線を持つ推論タスクをより良く実行できるようになることが示されている。 Although scaling up language model size has reliably improved performance on a range of NLP tasks, even the largest models currently struggle with certain reasoning tasks such as math word problems, symbolic manipulation, and commonsense reasoning. This paper explores the ability of language models to generate a coherent chain of thought -- a series of short sentences that mimic the reasoning process a person might have when responding to a question. Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks that otherwise have flat scaling curves.	翻訳日:2022-02-01 07:53:50 公開日:2022-01-28
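The prompting idea in the abstract above can be sketched as plain string assembly. This is a minimal illustration, not the authors' prompt format: the exemplar question, its reasoning steps, and the helper name `build_cot_prompt` are all hypothetical.

```python
# Hypothetical exemplar: question, reasoning, and answer are invented for illustration.
EXEMPLARS = [
    {
        "question": "A shop has 3 boxes with 4 pens each and sells 5 pens. "
                    "How many pens are left?",
        "chain_of_thought": "There are 3 * 4 = 12 pens at the start. "
                            "After selling 5, 12 - 5 = 7 remain.",
        "answer": "7",
    }
]


def build_cot_prompt(exemplars, new_question):
    """Assemble a few-shot prompt whose answers spell out intermediate reasoning."""
    parts = []
    for ex in exemplars:
        # Each exemplar shows the reasoning steps before the final answer.
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: {ex['chain_of_thought']} The answer is {ex['answer']}."
        )
    # The new question is left open so the model continues with its own reasoning.
    parts.append(f"Q: {new_question}\nA:")
    return "\n\n".join(parts)


prompt = build_cot_prompt(
    EXEMPLARS,
    "A library has 5 shelves with 6 books each and lends out 8 books. "
    "How many books remain?",
)
print(prompt)
```

The only difference from a standard few-shot prompt is that each exemplar answer contains the intermediate steps rather than the final answer alone.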
# (参考訳) 双対スパイクニューラルネットワークにおける死ニューロンとスパーシティの細線 The fine line between dead neurons and sparsity in binarized spiking neural networks ( http://arxiv.org/abs/2201.11915v1 ) ライセンス: CC BY 4.0	Jason K. Eshraghian, Wei D. Lu	(参考訳) スパイキングニューラルネットワークは、時間領域の情報を符号化したり、より高精度な隠蔽状態の離散化量を処理することによって量子化誤差を補償することができる。理論上、広いダイナミックレンジ状態空間は複数の二項化入力をまとめることを可能にし、個々のニューロンの表現能力を向上させる。これは発射しきい値を増加させることによって達成されるが、過度に高くなり、スパーススパイク活性はスパイク放出しない。本稿では,しきい値のウォームアップ手法として'Threshold annealing'を提案する。複数の層にまたがってスパイクを伝播させることで、ニューロンが火を放つのを防ぎ、その結果、二項化重みを使いながら4つの異なるデータセットに対して高い競争力を発揮することを示す。ソースコードはhttps://github.com/jeshraghian/snn-tha/で入手できる。 Spiking neural networks can compensate for quantization error by encoding information either in the temporal domain, or by processing discretized quantities in hidden states of higher precision. In theory, a wide dynamic range state-space enables multiple binarized inputs to be accumulated together, thus improving the representational capacity of individual neurons. This may be achieved by increasing the firing threshold, but make it too high and sparse spike activity turns into no spike emission. In this paper, we propose the use of `threshold annealing' as a warm-up method for firing thresholds. We show it enables the propagation of spikes across multiple layers where neurons would otherwise cease to fire, and in doing so, achieve highly competitive results on four diverse datasets, despite using binarized weights. Source code is available at https://github.com/jeshraghian/snn-tha/	翻訳日:2022-02-01 07:18:11 公開日:2022-01-28
# (参考訳) バタフライネットワーク上のタスク認識ネットワーク符号化 Task-Aware Network Coding Over Butterfly Network ( http://arxiv.org/abs/2201.11917v1 ) ライセンス: CC BY 4.0	Jiangnan Cheng, Sandeep Chinchali, Ao Tang	(参考訳) ネットワーク符号化により、センサなどの分散情報ソースは、帯域幅制限ネットワークを介して分散受信機にデータを効率よく圧縮し、送信することができる。古典的なネットワーク符号化は主にタスクに依存しない - 受信したデータがどの究極のタスクに使用されるかに関わらず、主に受信側でデータを忠実に再構築することを目的としている。本稿では、分散受信機が機械学習(ML)タスクを介して送信されたデータを渡すタスク駆動型ネットワークコーディング問題を分析し、有能なタスク関連データ表現を送信することで効率を向上する機会を提供する。具体的には、主成分分析(PCA)による損失アナログ圧縮を応用できる実座標空間におけるバタフライネットワーク上のタスク認識ネットワーク符号化問題を定式化する。定式化問題に対する全損失関数に対する下限が与えられ、この下限を達成するために必要な十分な条件も提供される。そこで本研究では,一般のケースで問題を解くためにmlアルゴリズムを導入し,タスク認識型ネットワーク符号化の有効性を実証する。 Network coding allows distributed information sources such as sensors to efficiently compress and transmit data to distributed receivers across a bandwidth-limited network. Classical network coding is largely task-agnostic -- the coding schemes mainly aim to faithfully reconstruct data at the receivers, regardless of what ultimate task the received data is used for. In this paper, we analyze a new task-driven network coding problem, where distributed receivers pass transmitted data through machine learning (ML) tasks, which provides an opportunity to improve efficiency by transmitting salient task-relevant data representations. Specifically, we formulate a task-aware network coding problem over a butterfly network in real-coordinate space, where lossy analog compression through principal component analysis (PCA) can be applied. A lower bound for the total loss function for the formulated problem is given, and necessary and sufficient conditions for achieving this lower bound are also provided. We introduce ML algorithms to solve the problem in the general case, and our evaluation demonstrates the effectiveness of task-aware network coding.	翻訳日:2022-02-01 06:56:39 公開日:2022-01-28
# (参考訳) ヘビーテールマルチアームバンディットのための適応型両世界のベストオブバイザーズアルゴリズム Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits ( http://arxiv.org/abs/2201.11921v1 ) ライセンス: CC BY-SA 4.0	Jiatai Huang, Yan Dai, Longbo Huang	(参考訳) 本稿では,重畳型マルチアーム付きバンディットの概念を敵環境に一般化し,重畳型マルチアーム付きバンディット (mab) に対して頑健な最善のバイザーワールドアルゴリズムを開発し,損失が$\sigma^\alpha$ で区切られた$\alpha$-th ($1<\alpha\le 2$) モーメントを持つ場合,分散は存在しない。具体的には、ヘビーテールパラメータ $\alpha$ と $\sigma$ がエージェントに知られている場合、 \texttt{htinf} は、実際の環境タイプ a-priori を知らずに、確率的環境と敵対的環境の両方に対して最適な後悔を達成する。 alpha,\sigma$ が未知の場合、 \texttt{htinf} は確率ケースでは $\log t$-style instance-dependent regret となり、反対ケースでは $o(t)$ no-regret が保証される。さらに、$\mathcal O(\sigma K^{1-\nicefrac 1\alpha}T^{\nicefrac{1}{\alpha}})$ minimax optimal regret を、$\alpha$ と $\sigma$ について事前の知識を必要とせずに、敵対的設定でも実現できるアルゴリズムである。この結果は、確率的な環境を仮定した既知の後悔(bubeck et al., 2013)と、$\alpha$と$\sigma$の両方が知られている。我々の知る限り、提案した‘texttt{HTINF} アルゴリズムは、最良世界の後悔の保証を初めて享受し、‘texttt{AdaTINF} は $\alpha$ と $\sigma$ の両方に適応できる最初のアルゴリズムであり、古典的な重み付き確率的 MAB 設定と我々の新しい逆数定式化において最適なギャップ非依存性の後悔を達成できる。 In this paper, we generalize the concept of heavy-tailed multi-armed bandits to adversarial environments, and develop robust best-of-both-worlds algorithms for heavy-tailed multi-armed bandits (MAB), where losses have $\alpha$-th ($1<\alpha\le 2$) moments bounded by $\sigma^\alpha$, while the variances may not exist. Specifically, we design an algorithm \texttt{HTINF}, when the heavy-tail parameters $\alpha$ and $\sigma$ are known to the agent, \texttt{HTINF} simultaneously achieves the optimal regret for both stochastic and adversarial environments, without knowing the actual environment type a-priori. When $\alpha,\sigma$ are unknown, \texttt{HTINF} achieves a $\log T$-style instance-dependent regret in stochastic cases and $o(T)$ no-regret guarantee in adversarial cases. 
We further develop an algorithm \texttt{AdaTINF}, achieving $\mathcal O(\sigma K^{1-\nicefrac 1\alpha}T^{\nicefrac{1}{\alpha}})$ minimax optimal regret even in adversarial settings, without prior knowledge on $\alpha$ and $\sigma$. This result matches the known regret lower-bound (Bubeck et al., 2013), which assumed a stochastic environment and $\alpha$ and $\sigma$ are both known. To our knowledge, the proposed \texttt{HTINF} algorithm is the first to enjoy a best-of-both-worlds regret guarantee, and \texttt{AdaTINF} is the first algorithm that can adapt to both $\alpha$ and $\sigma$ to achieve optimal gap-indepedent regret bound in classical heavy-tailed stochastic MAB setting and our novel adversarial formulation.	翻訳日:2022-02-01 06:27:18 公開日:2022-01-28
# (参考訳) 非凸最適化におけるデフレの簡易化とベイズ推論と位相最適化への応用 Simplifying deflation for non-convex optimization with applications in Bayesian inference and topology optimization ( http://arxiv.org/abs/2201.11926v1 ) ライセンス: CC BY 4.0	Mohamed Tarek, Yijiang Huang	(参考訳) 非凸最適化問題は複数の局所最適解を持つ。非凸最適化問題は、多くのアプリケーションでよく見られる。ランダムな再初期化なしに複数の局所最適解を効率的に探索する手法の1つがデフレの概念に依存している。本稿では,非凸最適化と非線形解法におけるデフレの異なる方法について述べる。既存の非線形プログラミング解法や非線形システム解法とともにデフレを可能にするために, 単純で汎用的で斬新なデフレ制約を提案する。提案したデフレ制約と最小距離制約との接続を示す。さらに、デフレ制約の様々なバリエーションとその制限について論じる。最後に、近似ベイズ推定と位相最適化の分野における提案手法の多くの応用について述べる。 Non-convex optimization problems have multiple local optimal solutions. Non-convex optimization problems are commonly found in numerous applications. One of the methods recently proposed to efficiently explore multiple local optimal solutions without random re-initialization relies on the concept of deflation. In this paper, different ways to use deflation in non-convex optimization and nonlinear system solving are discussed. A simple, general and novel deflation constraint is proposed to enable the use of deflation together with existing nonlinear programming solvers or nonlinear system solvers. The connection between the proposed deflation constraint and a minimum distance constraint is presented. Additionally, a number of variations of deflation constraints and their limitations are discussed. Finally, a number of applications of the proposed methodology in the fields of approximate Bayesian inference and topology optimization are presented.	翻訳日:2022-02-01 05:47:25 公開日:2022-01-28
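To make the notion of deflation concrete, here is a minimal sketch of the classic multiplicative deflation operator applied to a scalar root-finding problem. It is not the deflation constraint proposed in the paper above; the function names, the `power` and `shift` parameters, and the test function $f(x) = x^2 - 1$ are assumptions chosen for illustration.

```python
def newton(f, df, x0, tol=1e-10, max_iter=50):
    """Plain Newton iteration for a scalar equation f(x) = 0."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        x = x - fx / df(x)
    raise RuntimeError("Newton iteration did not converge")


def deflate(f, df, root, power=2, shift=1.0):
    """Return g = f * m and its derivative, where m blows up near a known root.

    m(x) = 1 / |x - root|**power + shift, so g has no zero at `root`
    but keeps the other zeros of f.
    """
    def m(x):
        return 1.0 / abs(x - root) ** power + shift

    def dm(x):
        return -power * (x - root) / abs(x - root) ** (power + 2)

    def g(x):
        return f(x) * m(x)

    def dg(x):
        return df(x) * m(x) + f(x) * dm(x)

    return g, dg


f = lambda x: x * x - 1.0      # roots at +1 and -1
df = lambda x: 2.0 * x

first = newton(f, df, x0=2.0)              # converges to the root near +1
g, dg = deflate(f, df, first)
second = newton(g, dg, x0=2.0)             # same initial guess, different root
print(first, second)
```

Starting both runs from the same initial guess shows the point of deflation: the second run is pushed away from the root already found, without any random re-initialization.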
# (参考訳) 高速解釈可能なグリーディツリー和(fig) Fast Interpretable Greedy-Tree Sums (FIGS) ( http://arxiv.org/abs/2201.11931v1 ) ライセンス: CC BY 4.0	Yan Shuo Tan, Chandan Singh, Keyan Nasseri, Abhineet Agarwal, Bin Yu	(参考訳) 現代の機械学習は印象的な予測性能を達成しているが、多くの問題において重要な考慮事項である解釈性を犠牲にすることが多い。本稿では,簡潔なルールモデルに適合するアルゴリズムであるFast Interpretable Greedy-Tree Sums (FIGS)を提案する。具体的には、FIGSはCARTアルゴリズムを一般化し、累積で柔軟な数の木を同時に成長させる。すべての木にまたがる分割の総数は予め定められたしきい値によって制限され、その木のサイズと数の両方が制御される。両者が小さい場合は、簡単に視覚化して手書きで書けるので、高い解釈が可能である。部分的にオラクルの理論的結果は、生成的加法モデルの付加成分を分離することで、figがシングルツリーモデルの重大な弱点を克服し、同じ特徴の繰り返し分裂による冗長性を低減できる可能性を示唆している。さらに、最適木構造へのオラクルアクセスが与えられた場合、C1成分関数の場合、そのような生成モデルに対してL2一般化境界を得る。さまざまな実世界のデータセットにわたる大規模な実験は、FIGSがほんの数分割(例:20未満)に制限された場合、最先端の予測性能(一般的なルールベースのメソッドすべて)を達成することを示している。 FIGSは繰り返し分割を回避でき、しばしば予測性能を犠牲にすることなく、適合した決定木よりも簡潔な決定ルールを提供する。すべてのコードとモデルはGithub \url{https://github.com/csinva/imodels} のフルパッケージでリリースされる。 Modern machine learning has achieved impressive prediction performance, but often sacrifices interpretability, a critical consideration in many problems. Here, we propose Fast Interpretable Greedy-Tree Sums (FIGS), an algorithm for fitting concise rule-based models. Specifically, FIGS generalizes the CART algorithm to simultaneously grow a flexible number of trees in a summation. The total number of splits across all the trees can be restricted by a pre-specified threshold, thereby keeping both the size and number of its trees under control. When both are small, the fitted tree-sum can be easily visualized and written out by hand, making it highly interpretable. A partially oracle theoretical result hints at the potential for FIGS to overcome a key weakness of single-tree models by disentangling additive components of generative additive models, thereby reducing redundancy from repeated splits on the same feature. 
Furthermore, given oracle access to optimal tree structures, we obtain $L^2$ generalization bounds for such generative models in the case of $C^1$ component functions, matching known minimax rates in some cases. Extensive experiments across a wide array of real-world datasets show that FIGS achieves state-of-the-art prediction performance (among all popular rule-based methods) when restricted to just a few splits (e.g., fewer than 20). We find empirically that FIGS is able to avoid repeated splits and often provides more concise decision rules than fitted decision trees, without sacrificing predictive performance. All code and models are released in a full-fledged package on GitHub: \url{https://github.com/csinva/imodels}.	翻訳日:2022-02-01 05:31:36 公開日:2022-01-28
# (参考訳) 周期グラフの深い生成モデル Deep Generative Model for Periodic Graphs ( http://arxiv.org/abs/2201.11932v1 ) ライセンス: CC BY 4.0	Shiyu Wang, Xiaojie Guo, Liang Zhao	(参考訳) 周期グラフは、クリスタルネットやポリゴンメッシュのような繰り返し局所構造からなるグラフである。それらの生成モデリングは、マテリアルデザインやグラフィック合成といった現実世界の応用に大きな可能性を秘めている。古典モデルは、ドメイン固有の事前定義された生成原理(例えば、クリスタルネット設計)に依存するか、または幾何学に基づく所定の規則に従う。近年,深層生成モデルが一般グラフの自動生成に大きな期待を寄せている。しかし、それらの周期グラフへの進歩は、いくつかの重要な課題のために、十分に研究されていない。 1) グラフ周期性を維持すること 2) 地域的及びグローバル的パターンの分離 3)反復パターン学習の効率性。そこで本研究では,周期グラフの深部生成モデルとして,局所およびグローバルグラフパターンの自動学習・解離・生成が可能な周期グラフ分散変分オートエンコーダ(PGD-VAE)を提案する。具体的には,グローバル・パターン・エンコーダとローカル・パターン・エンコーダからなる周期グラフエンコーダを開発し,その表現をグローバル・ローカル・セマンティクスに変換する。次に、局所構造デコーダ、近傍デコーダ、大域構造デコーダからなる新しい周期グラフデコーダと、周期性を保証するそれらの出力のアセンブラを提案する。さらに,同じ局所構造を持つグラフに対して局所意味表現の不変性を確実にする新しいモデル学習目標を設計する。提案手法の有効性を実証するための総合的な実験的評価を行った。 PGD-VAEのコードはhttps://github.com/shi-yu-wang/PGD-VAEで公開されている。 Periodic graphs are graphs consisting of repetitive local structures, such as crystal nets and polygon mesh. Their generative modeling has great potential in real-world applications such as material design and graphics synthesis. Classical models either rely on domain-specific predefined generation principles (e.g., in crystal net design), or follow geometry-based prescribed rules. Recently, deep generative models has shown great promise in automatically generating general graphs. However, their advancement into periodic graphs have not been well explored due to several key challenges in 1) maintaining graph periodicity; 2) disentangling local and global patterns; and 3) efficiency in learning repetitive patterns. To address them, this paper proposes Periodical-Graph Disentangled Variational Auto-encoder (PGD-VAE), a new deep generative models for periodic graphs that can automatically learn, disentangle, and generate local and global graph patterns. 
Specifically, we develop a new periodic graph encoder consisting of a global-pattern encoder and a local-pattern encoder, which disentangles the representation into global and local semantics. We then propose a new periodic graph decoder consisting of a local structure decoder, a neighborhood decoder, and a global structure decoder, as well as an assembler of their outputs that guarantees periodicity. Moreover, we design a new model learning objective that helps ensure the invariance of local-semantic representations for graphs with the same local structure. Comprehensive experimental evaluations have been conducted to demonstrate the effectiveness of the proposed method. The code of the proposed PGD-VAE is available at https://github.com/shi-yu-wang/PGD-VAE.	翻訳日:2022-02-01 04:51:27 公開日:2022-01-28
# (参考訳) より大きな距離を持つとパフォーマンスが悪化する:層利用とモデル一般化の観点から With Greater Distance Comes Worse Performance: On the Perspective of Layer Utilization and Model Generalization ( http://arxiv.org/abs/2201.11939v1 ) ライセンス: CC BY 4.0	James Wang, Cheng-Lin Yang	(参考訳) ディープニューラルネットワークの一般化は、マシンラーニングにおける主要なオープン問題の1つだ。モデル複雑性の厳密な境界を導出することに焦点を当てた以前の理論研究では、ニューラルネットワークがトレーニングサンプル数とニューラルネットワークのサイズの両方に関して二重降下を示すことが示されている。本稿では、ニューラルネットワークの異なる層がモデルにどのように貢献するかを実証的に検討し、初期の層は、トレーニングデータとテストデータの両方で、パフォーマンスに関連する表現を一般的に学習していることを発見した。逆に、より深い層はトレーニングのリスクを最小にし、テストや誤ったラベルデータの一般化に失敗します。さらに、トレーニングされた重みと最終層の初期値との距離が一般化誤差と高い相関を持ち、モデルの過度な適合の指標として機能することを示す。さらに,最終層の重みを再初期化することにより,トレーニング後の正規化を支援する証拠を示す。本研究は,ニューラルネットワークの一般化能力を効率的に推定する手法であり,ニューラルネットワークの内部構造を考慮したより優れた一般化境界への導出を促すことができる。 Generalization of deep neural networks remains one of the main open problems in machine learning. Previous theoretical works focused on deriving tight bounds of model complexity, while empirical works revealed that neural networks exhibit double descent with respect to both training sample counts and the neural network size. In this paper, we empirically examined how different layers of neural networks contribute differently to the model; we found that early layers generally learn representations relevant to performance on both training data and testing data. Contrarily, deeper layers only minimize training risks and fail to generalize well with testing or mislabeled data. We further illustrate the distance of trained weights to its initial value of final layers has high correlation to generalization errors and can serve as an indicator of an overfit of model. Moreover, we show evidence to support post-training regularization by re-initializing weights of final layers. 
Our findings provide an efficient method to estimate the generalization capability of neural networks, and the insight from these quantitative results may inspire the derivation of better generalization bounds that take the internal structure of neural networks into consideration.	翻訳日:2022-02-01 04:29:50 公開日:2022-01-28
# (参考訳) スティル化ニューラルアニメーションのためのワッサースプライン Wassersplines for Stylized Neural Animation ( http://arxiv.org/abs/2201.11940v1 ) ライセンス: CC BY 4.0	Paul Zhang, Dmitriy Smirnov, Justin Solomon	(参考訳) コンピュータ生成アニメーションの多くは、メッシュをリグで操作することで生成される。このアプローチは動物のような関節のある物体をアニメーションするのにうまく機能するが、「レイアとラスト・ドラゴン」のドーンのようなより構造の低い生物をアニメーションするための柔軟性は限られている。連続正規化流と最適輸送の最近の進歩に基づき,非構造密度をアニメーション化する新しい軌道推定法であるwassersplinesを紹介する。鍵となるアイデアは、キーフレーム間の動きを表す神経パラメータの速度場をトレーニングすることだ。トラジェクトリは、キーフレームを速度場にプッシュすることで計算される。追加のwaserstein barycenter補間問題を解き、キーフレームへの厳格な準拠を保証する。我々のツールは、様々なPDEベースの正規化器を通して軌跡をスタイリングし、異なる視覚効果を生み出すことができる。我々は,様々なキーフレーム補間問題に対して,メッシュ化やリギングを伴わずに時間的にコヒーレントなアニメーションを生成するツールを示す。 Much of computer-generated animation is created by manipulating meshes with rigs. While this approach works well for animating articulated objects like animals, it has limited flexibility for animating less structured creatures such as the Drunn in "Raya and the Last Dragon." We introduce Wassersplines, a novel trajectory inference method for animating unstructured densities based on recent advances in continuous normalizing flows and optimal transport. The key idea is to train a neurally-parameterized velocity field that represents the motion between keyframes. Trajectories are then computed by pushing keyframes through the velocity field. We solve an additional Wasserstein barycenter interpolation problem to guarantee strict adherence to keyframes. Our tool can stylize trajectories through a variety of PDE-based regularizers to create different visual effects. We demonstrate our tool on various keyframe interpolation problems to produce temporally-coherent animations without meshing or rigging.	翻訳日:2022-02-01 04:21:15 公開日:2022-01-28
# (参考訳) 複雑力学におけるペアワイズ相互作用の統一 Unifying Pairwise Interactions in Complex Dynamics ( http://arxiv.org/abs/2201.11941v1 ) ライセンス: CC BY 4.0	Oliver M. Cliff, Joseph T. Lizier, Naotsugu Tsuchiya, Ben D. Fulcher	(参考訳) 科学者は複雑なシステムにおけるプロセスのペア間の相互作用を測定するために何百もの技術を開発した。しかし、これらの計算手法は、相関係数から因果推論まで、大きく切り離された異なる定量的理論に依存している。本稿では,ペアインタラクションのための249の統計ライブラリを紹介し,その挙動を実世界およびモデル生成システムから1053個の多変量時系列上で評価する。本分析では,異なる数学的定式化間の新たな共通性に注目し,リッチで学際的な文学の統一像を提供する。そして,各科学の手法を多用することで,与えられた問題に最も適した問題を明らかにすることができ,高い正確性と解釈可能な理解が得られることを示す。我々のフレームワークは拡張可能なオープンソフトウェアで提供されており、数十年の方法論的進歩を統合することで包括的なデータ駆動分析を可能にする。 Scientists have developed hundreds of techniques to measure the interactions between pairs of processes in complex systems. But these computational methods -- from correlation coefficients to causal inference -- rely on distinct quantitative theories that remain largely disconnected. Here we introduce a library of 249 statistics for pairwise interactions and assess their behavior on 1053 multivariate time series from a wide range of real-world and model-generated systems. Our analysis highlights new commonalities between different mathematical formulations, providing a unified picture of a rich, interdisciplinary literature. We then show that leveraging many methods from across science can uncover those most suitable for addressing a given problem, yielding high accuracy and interpretable understanding. Our framework is provided in extendable open software, enabling comprehensive data-driven analysis by integrating decades of methodological advances.	翻訳日:2022-02-01 04:06:35 公開日:2022-01-28
# (参考訳) DICP:ドップラー反復閉点アルゴリズム DICP: Doppler Iterative Closest Point Algorithm ( http://arxiv.org/abs/2201.11944v1 ) ライセンス: CC BY 4.0	Bruno Hexsel, Heethesh Vhavle and Yi Chen	(参考訳) 本稿では,向きの瞬時速度を計測できる距離センサのための点雲登録のための新しいアルゴリズムであるドップラーicpを提案する。既存のICPの変種は、通常、非識別的な特徴を持つシナリオや、廊下、トンネル、高速道路、橋などの反復幾何学構造を持つシナリオにおいて、センサーの運動を正確に見積もることができない。本稿では,各点のドップラー計測とセンサの現在の動き推定との整合性を利用した新しいドップラー速度客観的関数を提案する。我々は,特徴量の多い環境においても,ドップラー速度目標関数と,点雲アライメント問題を十分に制約する幾何学的対象関数を共同で最適化する。さらに、ICP溶液を一般的に分解する動的ターゲットから点を切り離すことにより、アライメントに使用する対応マッチングを改善した。本手法は,実センサから収集したデータとシミュレーションから評価する。その結果,ドップラー速度勾配によって導かれる高速収束の利点を付加することにより,登録精度において大幅な性能向上が得られた。 In this paper, we present a novel algorithm for point cloud registration for range sensors capable of measuring per-return instantaneous radial velocity: Doppler ICP. Existing variants of ICP that solely rely on geometry or other features generally fail to estimate the motion of the sensor correctly in scenarios that have non-distinctive features and/or repetitive geometric structures such as hallways, tunnels, highways, and bridges. We propose a new Doppler velocity objective function that exploits the compatibility of each point's Doppler measurement and the sensor's current motion estimate. We jointly optimize the Doppler velocity objective function and the geometric objective function which sufficiently constrains the point cloud alignment problem even in feature-denied environments. Furthermore, the correspondence matches used for the alignment are improved by pruning away the points from dynamic targets which generally degrade the ICP solution. We evaluate our method on data collected from real sensors and from simulation. Our results show a significant performance improvement in terms of the registration accuracy with the added benefit of faster convergence guided by the Doppler velocity gradients.	翻訳日:2022-02-01 04:05:20 公開日:2022-01-28
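The Doppler term described in the abstract above can be illustrated in isolation. The sketch below is a simplification, not the paper's joint objective: it assumes a static scene, a purely translating sensor, and the sign convention that a static point seen along unit direction `d` returns radial velocity `-d · v`. All variable names and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic static scene: 200 points around a sensor at the origin.
points = rng.uniform(-10.0, 10.0, size=(200, 3))
directions = points / np.linalg.norm(points, axis=1, keepdims=True)

# Assumed ground-truth sensor velocity (m/s) and noisy per-point Doppler readings.
v_true = np.array([1.5, -0.3, 0.1])
doppler = -directions @ v_true + rng.normal(0.0, 0.01, size=200)


def doppler_residuals(v, directions, doppler):
    """Mismatch between measured radial velocity and the one implied by motion v."""
    return doppler + directions @ v


# Minimizing the squared residuals over v is a linear least-squares problem.
v_est, *_ = np.linalg.lstsq(directions, -doppler, rcond=None)

print("estimated velocity:", v_est)
print("residual norm at estimate:",
      np.linalg.norm(doppler_residuals(v_est, directions, doppler)))
```

Because each return constrains the velocity along its own viewing direction, this term stays informative in a featureless corridor, which is where a purely geometric objective tends to be under-constrained.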
# (参考訳) 複数のオプティマスを発見するための近位演算子学習 Learning Proximal Operators to Discover Multiple Optima ( http://arxiv.org/abs/2201.11945v1 ) ライセンス: CC0 1.0	Lingxiao Li, Noam Aigerman, Vladimir G. Kim, Jiajin Li, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon	(参考訳) 非凸最適化問題の複数の解を見つけることは至るところで困難である。一般的な既存の解は、複数のランダムな初期推測からの単一解最適化法を適用するか、アドホックなヒューリスティックを用いた解の近傍で探索する。本研究では,非凸問題群にまたがる近位演算子をエンドツーエンドに学習し,テスト時に見つからない問題に対する複数の解を復元する手法を提案する。本手法は,真理解の監督を必要とせず,目的へのアクセスのみを要求できる。最近の理論的結果を適用することで、弱い凸目標と穏やかな規則性条件の下では、近似作用素の訓練は過度なパラメータ化条件下でグローバルに収束することを示す。さらに,幅広いアプリケーションを含むマルチソリューション最適化のためのベンチマークを示し,その効果を示すために提案手法を評価した。 Finding multiple solutions of non-convex optimization problems is a ubiquitous yet challenging task. Typical existing solutions either apply single-solution optimization methods from multiple random initial guesses or search in the vicinity of found solutions using ad hoc heuristics. We present an end-to-end method to learn the proximal operator across a family of non-convex problems, which can then be used to recover multiple solutions for unseen problems at test time. Our method only requires access to the objectives without needing the supervision of ground truth solutions. Notably, the added proximal regularization term elevates the convexity of our formulation: by applying recent theoretical results, we show that for weakly-convex objectives and under mild regularity conditions, training of the proximal operator converges globally in the over-parameterized setting. We further present a benchmark for multi-solution optimization including a wide range of applications and evaluate our method to demonstrate its effectiveness.	翻訳日:2022-02-01 03:50:47 公開日:2022-01-28
# (参考訳) 多視点学習のための高次相関分析 Higher Order Correlation Analysis for Multi-View Learning ( http://arxiv.org/abs/2201.11949v1 ) ライセンス: CC BY 4.0	Jiawang Nie and Li Wang and Zequn Zheng	(参考訳) マルチビュー学習はデータサイエンスでよく使われる。ペアワイズ相関最大化は、複数の視点のコンセンサスを探求するための古典的なアプローチである。対相関は2つのビューに固有のため、より多くのビューへの拡張は多様化でき、ビュー間の固有の相互接続は一般的に失われる。この問題に対処するため,高次相関を最大化することを提案する。これは多視点データの高次相関テンソルを用いた低階近似問題として定式化することができる。低階近似問題の解法として生成多項式法を用いる。実マルチビューデータにおける数値計算結果から,本手法が従来手法を一貫して上回っていることが分かる。 Multi-view learning is frequently used in data science. The pairwise correlation maximization is a classical approach for exploring the consensus of multiple views. Since the pairwise correlation is inherent for two views, the extensions to more views can be diversified and the intrinsic interconnections among views are generally lost. To address this issue, we propose to maximize higher order correlations. This can be formulated as a low rank approximation problem with the higher order correlation tensor of multi-view data. We use the generating polynomial method to solve the low rank approximation problem. Numerical results on real multi-view data demonstrate that this method consistently outperforms prior existing methods.	翻訳日:2022-02-01 03:19:49 公開日:2022-01-28
# (参考訳) 暗黙的神経表現を用いた時系列異常検出 Time-Series Anomaly Detection with Implicit Neural Representation ( http://arxiv.org/abs/2201.11950v1 ) ライセンス: CC BY 4.0	Kyeong-Joong Jeong, Yong-Min Shin	(参考訳) 多変量時系列データの異常検出は多くの実世界のアプリケーションで必須である。近年,様々な深層学習に基づく手法が時系列異常検出において大幅に改善されている。しかし、既存の方法には、複雑なモデル設計による長いトレーニング時間や、与えられたデータセットの最適なハイパーパラメータ(例えば、スライディングウィンドウの長さ)を見つけるための高価なチューニング手順など、いくつかの制限がある。本稿では,インプシットニューラル表現に基づく異常検出(INRAD)と呼ばれる新しい手法を提案する。具体的には、入力に時間を要し、その時点で対応する値を出力する単純な多層パーセプトロンを訓練する。次に,異常検出のための異常スコアとして表現誤差を利用する。 5つの実世界のデータセットにおける実験により,提案手法が性能,トレーニング速度,ロバスト性において,他の最先端手法よりも優れていることを証明した。 Detecting anomalies in multivariate time-series data is essential in many real-world applications. Recently, various deep learning-based approaches have shown considerable improvements in time-series anomaly detection. However, existing methods still have several limitations, such as long training time due to their complex model designs or costly tuning procedures to find optimal hyperparameters (e.g., sliding window length) for a given dataset. In our paper, we propose a novel method called Implicit Neural Representation-based Anomaly Detection (INRAD). Specifically, we train a simple multi-layer perceptron that takes time as input and outputs corresponding values at that time. Then we utilize the representation error as an anomaly score for detecting anomalies. Experiments on five real-world datasets demonstrate that our proposed method outperforms other state-of-the-art methods in performance, training speed, and robustness.	翻訳日:2022-02-01 03:03:38 公開日:2022-01-28
# (参考訳) 不均一エルドス・レーニランダムグラフのフレシェ平均(または中間値)に対するシャープ閾値 Sharp Threshold for the Frechet Mean (or Median) of Inhomogeneous Erdos-Renyi Random Graphs ( http://arxiv.org/abs/2201.11954v1 ) ライセンス: CC BY 4.0	Francois G. Meyer	(参考訳) 不均質なエルドス・レーニーのランダムグラフのアンサンブルの人口、サンプル、フレシェ平均(または中央値)グラフとは何か? グラフ間の距離を計算するためにハミング距離を使用すると、アンサンブルの期待隣接行列のしきい値化により、不均一なランダムグラフのアンサンブルのフレシェ平均(または中央値)グラフが得られる。また, 人口予測隣接行列をサンプル平均隣接行列に置き換えた場合, サンプル平均(あるいは中央値)についても, 結果が成り立つことを示した。したがって、不均質なエルドス・レーニーのランダムグラフのフレシェ平均(または中央値)グラフは、空グラフか完全グラフのいずれかである鋭い閾値を示す。この新しい理論的な結果には、いくつかの重要な実用的結果があり、例えば、疎不均質なランダムグラフのアンサンブルのフレシェ平均は常に空グラフである。 We address the following foundational question: what is the population, and sample, Frechet mean (or median) graph of an ensemble of inhomogeneous Erdos-Renyi random graphs? We prove that if we use the Hamming distance to compute distances between graphs, then the Frechet mean (or median) graph of an ensemble of inhomogeneous random graphs is obtained by thresholding the expected adjacency matrix of the ensemble. We show that the result also holds for the sample mean (or median) when the population expected adjacency matrix is replaced with the sample mean adjacency matrix. Consequently, the Frechet mean (or median) graph of inhomogeneous Erdos-Renyi random graphs exhibits a sharp threshold: it is either the empty graph, or the complete graph. This novel theoretical result has some significant practical consequences; for instance, the Frechet mean of an ensemble of sparse inhomogeneous random graphs is always the empty graph.	翻訳日:2022-02-01 02:49:00 公開日:2022-01-28
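The thresholding result in the abstract above can be tried numerically. The sketch below covers the sample median under the Hamming distance, where minimizing the total distance reduces to an entrywise majority vote. The threshold of 1/2, the edge probability of 0.1, and all function names are assumptions made for illustration.

```python
import numpy as np


def sample_graph(P, rng):
    """Draw one undirected graph whose edge (i, j) appears with probability P[i, j]."""
    n = P.shape[0]
    upper = np.triu((rng.random((n, n)) < P).astype(int), k=1)
    return upper + upper.T


def median_graph_hamming(graphs):
    """Threshold the sample mean adjacency matrix at 1/2 (entrywise majority vote)."""
    mean_adj = np.mean(graphs, axis=0)
    return (mean_adj > 0.5).astype(int)


def total_hamming(candidate, graphs):
    """Sum of Hamming distances from `candidate` to every graph in the sample."""
    return int(np.abs(graphs - candidate).sum() // 2)  # each edge counted once


rng = np.random.default_rng(0)
n = 8
P = np.full((n, n), 0.1)  # sparse ensemble: every edge probability below 1/2
graphs = np.stack([sample_graph(P, rng) for _ in range(200)])

median_graph = median_graph_hamming(graphs)
print("edges in median graph:", median_graph.sum() // 2)
print("cost of median graph:", total_hamming(median_graph, graphs))
print("cost of first sample:", total_hamming(graphs[0], graphs))
```

With every edge probability below 1/2, the thresholded graph has no edges, which matches the abstract's remark about sparse ensembles.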
# (参考訳) 強化学習による動的時間緩和 Dynamic Temporal Reconciliation by Reinforcement learning ( http://arxiv.org/abs/2201.11964v1 ) ライセンス: CC BY 4.0	Himanshi Charotia, Abhishek Garg, Gaurav Dhama, Naman Maheshwari	(参考訳) 長期および短期の時系列予測に基づくプランニングは、多くの業界で一般的なプラクティスである。この文脈では、時間的集約と和解技術は予測を改善し、モデルの不確実性を低減し、異なる時間的地平をまたいだ一貫性のある予測を提供するのに有用である。しかしながら、これらすべての技術にまたがる前提は、時間階層のすべてのレベルにわたるデータの完全な可用性であるが、これは数学的に便利であるが、ほとんどの場合、低周波データが部分的に完了し、予測中に利用できない。一方、新型コロナウイルスのパンデミックのようなシナリオでは、高周波データが大幅に変化し、この変化は、長期的な状況と大きく異なる予測を改善するために利用することができる。そこで本稿では,マルコフ決定プロセス(MDP)として,低周波予測を高頻度で予測する問題を定式化することにより,プロセスのダイナミクスに関する完全な情報が得られないことを確かめる。これにより、低周波周期が部分的にしか完了していない場合でも、最新のデータに基づいて最適な長期推定を行うことができる。 MDPは、時間差強化学習(TDRL)アプローチを用いて、カスタマイズ可能な動作を用いて、歴史的低周波データにのみ依存するよりも、長期予測を劇的に改善している。この結果は、低周波予測が時間調整文献(低周波予測が信号比よりも低雑音であるという仮定に基づく)で述べたような高周波予測を改善することができる一方で、低周波予測も低周波予測に活用できるという事実も強調している。 Planning based on long and short term time series forecasts is a common practice across many industries. In this context, temporal aggregation and reconciliation techniques have been useful in improving forecasts, reducing model uncertainty, and providing a coherent forecast across different time horizons. However, an underlying assumption spanning all these techniques is the complete availability of data across all levels of the temporal hierarchy, while this offers mathematical convenience but most of the time low frequency data is partially completed and it is not available while forecasting. On the other hand, high frequency data can significantly change in a scenario like the COVID pandemic and this change can be used to improve forecasts that will otherwise significantly diverge from long term actuals. We propose a dynamic reconciliation method whereby we formulate the problem of informing low frequency forecasts based on high frequency actuals as a Markov Decision Process (MDP) allowing for the fact that we do not have complete information about the dynamics of the process. 
This allows us to have the best long-term estimates based on the most recent data available, even if the low frequency cycles have only been partially completed. The MDP has been solved using a Time Differenced Reinforcement Learning (TDRL) approach with customizable actions, and it improves the long-term forecasts dramatically compared to relying solely on historical low frequency data. The result also underscores the fact that, while low frequency forecasts can improve the high frequency forecasts as mentioned in the temporal reconciliation literature (based on the assumption that low frequency forecasts have a lower noise-to-signal ratio), the high frequency forecasts can also be used to inform the low frequency forecasts.	翻訳日:2022-02-01 02:27:01 公開日:2022-01-28
# (参考訳) 非定常目的と制約を持つCMDPの高能率2次元強化学習 Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints ( http://arxiv.org/abs/2201.11965v1 ) ライセンス: CC BY 4.0	Yuhao Ding and Javad Lavaei	(参考訳) 時間変動環境におけるRLの安全性の確保に中心的な役割を果たす非定常的目的と制約を伴うマルコフ決定過程(CMDP)における原始的双対強化学習(RL)について考察する。この問題では、報酬/有効性関数と状態遷移関数の両方が、その累積変動が既知の変動予算を超えない限り、時間とともに任意に変化することが許される。時間変動環境における安全なrlアルゴリズムの設計は、制約違反の低減、安全な探索、非定常性への適応などを統合する必要があるため、特に困難である。そこで本研究では,周期的再スタートに基づく政策改善,二重正規化による2次更新,周期的再スタートに基づく楽観的政策評価という3つのメカニズムを特徴とする,周期的再スタート最適化(PROPD-PPO)アルゴリズムを提案する。本稿では,線形カーネルCMDP関数近似設定と表計算CMDP設定の両方において,提案アルゴリズムに対する動的後悔境界と制約違反境界を確立する。本稿では,非定常cmdpに対して,安全かつ効率的なアルゴリズムを提案する。 We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision processes (CMDPs) with non-stationary objectives and constraints, which play a central role in ensuring the safety of RL in time-varying environments. In this problem, the reward/utility functions and the state transition functions are both allowed to vary arbitrarily over time as long as their cumulative variations do not exceed certain known variation budgets. Designing safe RL algorithms in time-varying environments is particularly challenging because of the need to integrate the constraint violation reduction, safe exploration, and adaptation to the non-stationarity. To this end, we propose a Periodically Restarted Optimistic Primal-Dual Proximal Policy Optimization (PROPD-PPO) algorithm that features three mechanisms: periodic-restart-based policy improvement, dual update with dual regularization, and periodic-restart-based optimistic policy evaluation. We establish a dynamic regret bound and a constraint violation bound for the proposed algorithm in both the linear kernel CMDP function approximation setting and the tabular CMDP setting. This paper provides the first provably efficient algorithm for non-stationary CMDPs with safe exploration.	翻訳日:2022-02-01 02:17:00 公開日:2022-01-28
# (参考訳) 部分微分方程式の学習解演算子に対する擬微分積分演算子 Pseudo-Differential Integral Operator for Learning Solution Operators of Partial Differential Equations ( http://arxiv.org/abs/2201.11967v1 ) ライセンス: CC BY 4.0	Jin Young Shin, Jae Yong Lee, Hyung Ju Hwang	(参考訳) 2つの関数空間間の学習マッピングは、かなりの研究の注目を集めている。しかし、偏微分方程式(PDE)の解演算子を学ぶことは科学計算の課題である。そこで本研究では,微分作用素の一般化であり,ある記号を特徴とする擬似微分作用素に着想を得た新しい擬微分積分作用素(pdio)を提案する。ニューラルネットワークを用いてシンボルをパラメータ化し,ニューラルネットワークに基づくシンボルがスムーズなシンボルクラスに含まれることを示す。その後、PDIO が有界線型作用素であることを証明し、従ってソボレフ空間において連続である。 PDIOとニューラル演算子を組み合わせて擬微分ニューラル演算子(PDNO)を開発し,PDEの非線形解演算子を学習する。提案モデルの有効性を,バーガーズ方程式,ダーシー流,ナビエ・ストークス方程式を用いて実験的に検証した。その結果,提案するpdnoは既存のニューラルオペレータのアプローチに匹敵することがわかった。 Learning mapping between two function spaces has attracted considerable research attention. However, learning the solution operator of partial differential equations (PDEs) remains a challenge in scientific computing. Therefore, in this study, we propose a novel pseudo-differential integral operator (PDIO) inspired by a pseudo-differential operator, which is a generalization of a differential operator and characterized by a certain symbol. We parameterize the symbol by using a neural network and show that the neural-network-based symbol is contained in a smooth symbol class. Subsequently, we prove that the PDIO is a bounded linear operator, and thus is continuous in the Sobolev space. We combine the PDIO with the neural operator to develop a pseudo-differential neural operator (PDNO) to learn the nonlinear solution operator of PDEs. We experimentally validate the effectiveness of the proposed model by using Burgers' equation, Darcy flow, and the Navier-Stokes equation. The results reveal that the proposed PDNO outperforms the existing neural operator approaches in most experiments.	翻訳日:2022-02-01 02:15:30 公開日:2022-01-28
# Training invariances and the low-rank phenomenon: beyond linear networks ( http://arxiv.org/abs/2201.11968v1 ) License: CC BY 4.0	Thien Le, Stefanie Jegelka	The implicit bias induced by the training of neural networks has become a topic of rigorous study. In the limit of gradient flow and gradient descent with appropriate step size, it has been shown that when one trains a deep linear network with logistic or exponential loss on linearly separable data, the weights converge to rank-$1$ matrices. In this paper, we extend this theoretical result to the much wider class of nonlinear ReLU-activated feedforward networks containing fully-connected layers and skip connections. To the best of our knowledge, this is the first time a low-rank phenomenon is proven rigorously for these architectures, and it reflects empirical results in the literature. The proof relies on specific local training invariances, sometimes referred to as alignment, which we show to hold for a wide set of ReLU architectures. Our proof relies on a specific decomposition of the network into a multilinear function and another ReLU network whose weights are constant under a certain parameter directional convergence.	Translated: 2022-02-01 01:57:12 Published: 2022-01-28
# Differential Privacy Guarantees for Stochastic Gradient Langevin Dynamics ( http://arxiv.org/abs/2201.11980v1 ) License: CC BY 4.0	Théo Ryffel, Francis Bach, David Pointcheval	We analyse the privacy leakage of noisy stochastic gradient descent by modeling Rényi divergence dynamics with Langevin diffusions. Inspired by recent work on non-stochastic algorithms, we derive similar desirable properties in the stochastic setting. In particular, we prove that the privacy loss converges exponentially fast for smooth and strongly convex objectives under constant step size, which is a significant improvement over previous DP-SGD analyses. We also extend our analysis to arbitrary sequences of varying step sizes and derive new utility bounds. Last, we propose an implementation, and our experiments show the practical utility of our approach compared to classical DP-SGD libraries.	Translated: 2022-02-01 01:19:00 Published: 2022-01-28
# Computer-aided Recognition and Assessment of a Porous Bioelastomer on Ultrasound Images for Regenerative Medicine Applications ( http://arxiv.org/abs/2201.11987v1 ) License: CC BY 4.0	Dun Wang, Kaixuan Guo, Yanying Zhu, Jia Sun, Aliona Dreglea, Zhengwei You, Jiao Yu	Biodegradable elastic scaffolds have attracted increasing attention in the field of soft tissue repair and tissue engineering. These scaffolds, made of porous bioelastomers, support tissue ingrowth along with their own degradation. It is necessary to develop a computer-aided analysis method based on ultrasound images to identify the degradation performance of the scaffold, not only to obviate the need for destructive testing, but also to monitor the scaffold's degradation and tissue ingrowth over time. It is difficult to extract a continuous and accurate contour of a porous bioelastomer using a single traditional image processing algorithm. This paper proposes a joint algorithm for the bioelastomer's contour detection and a texture feature extraction method for monitoring the degradation behavior of the bioelastomer. The mean-shift clustering method is used to obtain the clustering feature information of the bioelastomer and the native tissue. Then the Otsu image binarization method automatically selects the optimal threshold value to convert the grayscale ultrasound image into a binary image. The Canny edge detector is used to extract the complete contour of the bioelastomer. The first-order and second-order statistical features of texture are extracted. The proposed joint algorithm not only achieves the ideal extraction of the bioelastomer's contours in ultrasound images, but also gives valuable feedback on the degradation behavior of the bioelastomer at the implant site based on the changes in texture characteristics and contour area. The preliminary results of this study suggest that the proposed computer-aided image processing techniques have value and potential for the non-invasive analysis of tissue scaffolds in vivo based on ultrasound images, and may help tissue engineers evaluate the tissue scaffold's degradation and cellular ingrowth progress and improve scaffold designs.	Translated: 2022-02-01 00:40:04 Published: 2022-01-28
# Using Constant Learning Rate of Two Time-Scale Update Rule for Training Generative Adversarial Networks ( http://arxiv.org/abs/2201.11989v1 ) License: CC BY 4.0	Naoki Sato and Hideaki Iiduka	Previous numerical results have shown that a two time-scale update rule (TTUR) using constant learning rates is practically useful for training generative adversarial networks (GANs). Meanwhile, a theoretical analysis of TTUR to find a stationary local Nash equilibrium of a Nash equilibrium problem with two players, a discriminator and a generator, has been given using decaying learning rates. In this paper, we give a theoretical analysis of TTUR using constant learning rates to bridge the gap between theory and practice. In particular, we show that, for TTUR using constant learning rates, the number of steps needed to find a stationary local Nash equilibrium decreases as the batch size increases. We also provide numerical results to support our theoretical analyses.	Translated: 2022-02-01 00:29:29 Published: 2022-01-28
# Deep Networks for Image and Video Super-Resolution ( http://arxiv.org/abs/2201.11996v1 ) License: CC0 1.0	Kuldeep Purohit, Srimanta Mandal, A. N. Rajagopalan	The efficiency of gradient propagation in the intermediate layers of convolutional neural networks is of key importance for the super-resolution task. To this end, we propose a deep architecture for single image super-resolution (SISR), which is built using efficient convolutional units we refer to as mixed-dense connection blocks (MDCB). The design of the MDCB combines the strengths of both residual and dense connection strategies while overcoming their limitations. To enable super-resolution for multiple factors, we propose a scale-recurrent framework which recursively reuses the filters learnt for lower scale factors for higher factors. This leads to improved performance and promotes parametric efficiency for higher factors. We train two versions of our network to enhance complementary image qualities using different loss configurations. We further employ our network for the video super-resolution task, where it learns to aggregate information from multiple frames and maintain spatio-temporal consistency. The proposed networks lead to qualitative and quantitative improvements over state-of-the-art techniques on image and video super-resolution benchmarks.	Translated: 2022-01-31 23:59:41 Published: 2022-01-28
# Image Superresolution using Scale-Recurrent Dense Network ( http://arxiv.org/abs/2201.11998v1 ) License: CC0 1.0	Kuldeep Purohit, Srimanta Mandal, A. N. Rajagopalan	Recent advances in the design of convolutional neural networks (CNNs) have yielded significant improvements in the performance of image super-resolution (SR). The boost in performance can be attributed to the presence of residual or dense connections within the intermediate layers of these networks. The efficient combination of such connections can reduce the number of parameters drastically while maintaining the restoration quality. In this paper, we propose a scale-recurrent SR architecture built upon units containing a series of dense connections within a residual block (Residual Dense Blocks (RDBs)) that allow extraction of abundant local features from the image. Our scale-recurrent design delivers competitive performance for higher scale factors while being parametrically more efficient than current state-of-the-art approaches. To further improve the performance of our network, we employ multiple residual connections in intermediate layers (referred to as Multi-Residual Dense Blocks), which improves gradient propagation in existing layers. Recent works have discovered that conventional loss functions can guide a network to produce results which have high PSNRs but are perceptually inferior. We mitigate this issue by utilizing a Generative Adversarial Network (GAN) based framework and deep feature (VGG) losses to train our network. We experimentally demonstrate that different weighted combinations of the VGG loss and the adversarial loss enable our network outputs to traverse along the perception-distortion curve. The proposed networks perform favorably against existing methods, both perceptually and objectively (PSNR-based), with fewer parameters.	Translated: 2022-01-31 23:36:29 Published: 2022-01-28
# Puppeteer: A Random Forest-based Manager for Hardware Prefetchers across the Memory Hierarchy ( http://arxiv.org/abs/2201.12027v1 ) License: CC BY 4.0	Furkan Eris, Marcia S. Louis, Kubra Eris, Jose L. Abellan, Ajay Joshi	Over the years, processor throughput has steadily increased. However, memory throughput has not increased at the same rate, which has led to the memory wall problem, in turn increasing the gap between effective and theoretical peak processor performance. To cope with this, there has been an abundance of work in the area of data/instruction prefetcher designs. Broadly, prefetchers predict future data/instruction address accesses and proactively fetch data/instructions in the memory hierarchy with the goal of lowering data/instruction access latency. To this end, one or more prefetchers are deployed at each level of the memory hierarchy, but typically each prefetcher is designed in isolation without comprehensively accounting for other prefetchers in the system. As a result, individual prefetchers do not always complement each other, which leads to lower average performance gains and/or many negative outliers. In this work, we propose Puppeteer, a hardware prefetcher manager that uses a suite of random forest regressors to determine at runtime which prefetcher should be ON at each level in the memory hierarchy, such that the prefetchers complement each other and the data/instruction access latency is reduced. Compared to a design with no prefetchers, using Puppeteer we improve IPC by 46.0% in 1 Core (1C), 25.8% in 4 Core (4C), and 11.9% in 8 Core (8C) processors on average across traces generated from SPEC2017, SPEC2006, and Cloud suites with ~10KB overhead. Moreover, we also reduce the number of negative outliers by over 89%, and the performance loss of the worst-case negative outlier from 25% to only 5% compared to the state-of-the-art.	Translated: 2022-01-31 23:25:43 Published: 2022-01-28
# The FreshPRINCE: A Simple Transformation Based Pipeline Time Series Classifier ( http://arxiv.org/abs/2201.12048v1 ) License: CC BY 4.0	Matthew Middlehurst and Anthony Bagnall	There have recently been significant advances in the accuracy of algorithms proposed for time series classification (TSC). However, a question commonly asked by real-world practitioners and data scientists less familiar with the research topic is whether the complexity of the algorithms considered state of the art is really necessary. Often the first approach suggested is a simple pipeline of summary statistics or another time series feature extraction approach such as TSFresh. This is in itself a sensible question: in publications on TSC algorithms generalised for multiple problem types, we rarely see these approaches considered or compared against. We experiment with basic feature extractors using vector-based classifiers shown to be effective with continuous attributes in current state-of-the-art time series classifiers. We test these approaches on the UCR time series dataset archive, looking to see whether the TSC literature has overlooked the effectiveness of these approaches. We find that a pipeline of TSFresh followed by a rotation forest classifier, which we name FreshPRINCE, performs best. It is not state of the art, but it is significantly more accurate than nearest neighbour with dynamic time warping, and represents a reasonable benchmark for future comparison.	Translated: 2022-01-31 22:52:29 Published: 2022-01-28
# Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks ( http://arxiv.org/abs/2201.12052v1 ) License: CC BY 4.0	Bartłomiej Polaczyk and Jacek Cyranka	We study the overparametrization bounds required for the global convergence of the stochastic gradient descent algorithm for a class of one-hidden-layer feed-forward neural networks, considering most of the activation functions used in practice, including ReLU. We improve the existing state-of-the-art results in terms of the required hidden layer width. We introduce a new proof technique combining nonlinear analysis with properties of random initializations of the network. First, we establish the global convergence of continuous solutions of the differential inclusion that is a nonsmooth analogue of the gradient flow for the MSE loss. Second, we provide a technical result (which also works for general approximators) relating solutions of the aforementioned differential inclusion to the (discrete) stochastic gradient descent sequences, hence establishing linear convergence towards zero loss for the stochastic gradient descent iterations.	Translated: 2022-01-31 22:29:09 Published: 2022-01-28
# Learning Summary Statistics for Bayesian Inference with Autoencoders ( http://arxiv.org/abs/2201.12059v1 ) License: CC BY 4.0	Carlo Albert, Simone Ulzega, Firat Ozdemir, Fernando Perez-Cruz, Antonietta Mira	For stochastic models with intractable likelihood functions, approximate Bayesian computation offers a way of approximating the true posterior through repeated comparisons of observations with simulated model outputs in terms of a small set of summary statistics. These statistics need to retain the information that is relevant for constraining the parameters but cancel out the noise. They can thus be seen as thermodynamic state variables for general stochastic models. For many scientific applications, we need strictly more summary statistics than model parameters to reach a satisfactory approximation of the posterior. Therefore, we propose to use the inner dimension of deep-neural-network-based autoencoders as summary statistics. To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information on the noise that has been used to generate the training data. We validate the approach empirically on two types of stochastic models.	Translated: 2022-01-31 21:32:44 Published: 2022-01-28
# DynaMixer: A Vision MLP Architecture with Dynamic Mixing ( http://arxiv.org/abs/2201.12083v1 ) License: CC BY 4.0	Ziyu Wang and Wenhao Jiang and Yiming Zhu and Li Yuan and Yibing Song and Wei Liu	Recently, MLP-like vision models have achieved promising performance on mainstream visual recognition tasks. In contrast with vision transformers and CNNs, the success of MLP-like models shows that simple information fusion operations among tokens and channels can yield good representation power for deep recognition models. However, existing MLP-like models fuse tokens through static fusion operations, lacking adaptability to the contents of the tokens to be mixed. Thus, customary information fusion procedures are not effective enough. To this end, this paper presents an efficient MLP-like network architecture, dubbed DynaMixer, that resorts to dynamic information fusion. Critically, we propose a procedure, on which the DynaMixer model relies, to dynamically generate mixing matrices by leveraging the contents of all the tokens to be mixed. To reduce the time complexity and improve the robustness, a dimensionality reduction technique and a multi-segment fusion mechanism are adopted. Our proposed DynaMixer model (97M parameters) achieves 84.3% top-1 accuracy on the ImageNet-1K dataset without extra training data, performing favorably against the state-of-the-art vision MLP models. When the number of parameters is reduced to 26M, it still achieves 82.7% top-1 accuracy, surpassing the existing MLP-like models with a similar capacity. The implementation of DynaMixer will be made available to the public.	Translated: 2022-01-31 21:19:54 Published: 2022-01-28
# BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation ( http://arxiv.org/abs/2201.12086v1 ) License: CC BY 4.0	Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi	Vision-Language Pre-training (VLP) has advanced the performance of many vision-language tasks. However, most existing pre-trained models only excel in either understanding-based tasks or generation-based tasks. Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision. In this paper, we propose BLIP, a new VLP framework which transfers flexibly to both vision-language understanding and generation tasks. BLIP effectively utilizes the noisy web data by bootstrapping the captions, where a captioner generates synthetic captions and a filter removes the noisy ones. We achieve state-of-the-art results on a wide range of vision-language tasks, such as image-text retrieval (+2.7% in average recall@1), image captioning (+2.8% in CIDEr), and VQA (+1.6% in VQA score). BLIP also demonstrates strong generalization ability when directly transferred to video-language tasks in a zero-shot manner. Code, models, and datasets are released at https://github.com/salesforce/BLIP.	Translated: 2022-01-31 20:55:18 Published: 2022-01-28
# Label uncertainty-guided multi-stream model for disease screening ( http://arxiv.org/abs/2201.12089v1 ) License: CC BY 4.0	Chi Liu, Zongyuan Ge, Mingguang He, Xiaotong Han	The annotation of disease severity for medical image datasets often relies on collaborative decisions from multiple human graders. The intra-observer variability derived from individual differences always persists in this process, yet its influence is often underestimated. In this paper, we cast the intra-observer variability as an uncertainty problem and incorporate the label uncertainty information as guidance into the disease screening model to improve the final decision. The main idea is to divide the images into simple and hard cases using uncertainty information, and then to develop a multi-stream network that deals with the different cases separately. In particular, for hard cases, we strengthen the network's capacity for capturing the correct disease features and resisting the interference of uncertainty. Experiments on a fundus image-based glaucoma screening case study show that the proposed model outperforms several baselines, especially in screening hard cases.	Translated: 2022-01-31 20:32:25 Published: 2022-01-28
# Linear Adversarial Concept Erasure ( http://arxiv.org/abs/2201.12091v1 ) License: CC BY 4.0	Shauli Ravfogel, Michael Twiton, Yoav Goldberg and Ryan Cotterell	Modern neural models trained on textual data rely on pre-trained representations that emerge without direct supervision. As these representations are increasingly being used in real-world applications, the inability to control their content becomes an increasingly important problem. We formulate the problem of identifying and erasing a linear subspace that corresponds to a given concept, in order to prevent linear predictors from recovering the concept. We model this problem as a constrained, linear minimax game, and show that existing solutions are generally not optimal for this task. We derive a closed-form solution for certain objectives, and propose a convex relaxation, R-LACE, that works well for others. When evaluated in the context of binary gender removal, the method recovers a low-dimensional subspace whose removal mitigates bias by intrinsic and extrinsic evaluation. We show that the method, despite being linear, is highly expressive, effectively mitigating bias in deep nonlinear classifiers while maintaining tractability and interpretability.	Translated: 2022-01-31 20:23:13 Published: 2022-01-28
# Detecting Owner-member Relationship with Graph Convolution Network in Fisheye Camera System ( http://arxiv.org/abs/2201.12099v1 ) License: CC BY 4.0	Zizhang Wu, Jason Wang, Tianhao Xu, Fan Wang	The owner-member relationship between wheels and vehicles contributes significantly to the 3D perception of vehicles, especially in embedded environments. However, to leverage this relationship we must face two major challenges: i) traditional IoU-based heuristics have difficulty handling occluded traffic congestion scenarios; ii) achieving an effective and applicable solution in a vehicle-mounted system is difficult. To address these issues, we propose an innovative relationship prediction method, DeepWORD, by designing a graph convolutional network (GCN). Specifically, to improve the information richness, we use feature maps with local correlation as input to the nodes. Subsequently, we introduce a graph attention network (GAT) to dynamically correct the a priori estimation bias. Finally, we designed a large-scale benchmark dataset with annotated owner-member relationships, called WORD. In the experiments, the proposed method achieved state-of-the-art accuracy and real-time performance. The WORD dataset is made publicly available at https://github.com/NamespaceMain/ownermember-relationship-dataset.	Translated: 2022-01-31 19:57:02 Published: 2022-01-28
# Rethinking Attention-Model Explainability through Faithfulness Violation Test ( http://arxiv.org/abs/2201.12114v1 ) License: CC BY 4.0	Yibing Liu, Haoliang Li, Yangyang Guo, Chenqi Kong, Jing Li, Shiqi Wang	Attention mechanisms dominate the explainability of deep models. They produce probability distributions over the input, which are widely regarded as feature-importance indicators. However, in this paper, we find one critical limitation in attention explanations: weakness in identifying the polarity of feature impact. This can be misleading: features with higher attention weights may not faithfully contribute to model predictions; instead, they can impose suppression effects. With this finding, we reflect on the explainability of current attention-based techniques, such as Attentio$\odot$Gradient and LRP-based attention explanations. We first propose an actionable diagnostic methodology (henceforth the faithfulness violation test) to measure the consistency between explanation weights and the impact polarity. Through extensive experiments, we then show that most tested explanation methods are unexpectedly hindered by the faithfulness violation issue, especially raw attention. Empirical analyses of the factors affecting violation issues further provide useful observations for adopting explanation methods in attention models.	Translated: 2022-01-31 19:47:32 Published: 2022-01-28
# DELAUNAY: a dataset of abstract art for psychophysical and machine learning research ( http://arxiv.org/abs/2201.12123v1 ) License: CC BY 4.0	Camille Gontier, Jakob Jordan, Mihai A. Petrovici	Image datasets are commonly used in psychophysical experiments and in machine learning research. Most publicly available datasets consist of images of realistic and natural objects. However, while typical machine learning models lack any domain-specific knowledge about natural objects, humans can leverage prior experience for such data, making comparisons between artificial and natural learning challenging. Here, we introduce DELAUNAY, a dataset of abstract paintings and non-figurative art objects labelled by the artists' names. This dataset provides a middle ground between natural images and artificial patterns and can thus be used in a variety of contexts, for example to investigate the sample efficiency of humans and artificial neural networks. Finally, we train an off-the-shelf convolutional neural network on DELAUNAY, highlighting several of its intriguing features.	Translated: 2022-01-31 19:32:01 Published: 2022-01-28
# (参考訳) 自動過パラメータ最適化問題に対する適応最適化器 Adaptive Optimizer for Automated Hyperparameter Optimization Problem ( http://arxiv.org/abs/2201.12124v1 ) ライセンス: CC BY 4.0	Huayuan Sun	(参考訳) ハイパーパラメータの選択は、機械学習モデルの性能に重大な影響を与える。本稿では,最適化プロセスにおいて適切なアルゴリズムとパラメータを自動的に調整する適応オプティマイザを構築するための汎用フレームワークを提案する。適応最適化手法を検証し、遺伝的アルゴリズムを用いてベイズ最適化器に基づく適応最適化器を構築し、元の最適化器と比較する。特に並列最適化には大きな利点があります。 The choice of hyperparameters has a critical effect on the performance of machine learning models. In this paper, we present a general framework for constructing an adaptive optimizer, which automatically adjusts the algorithm and its parameters during optimization. To examine the method, we produce an example that uses a genetic algorithm to construct an adaptive optimizer based on a Bayesian optimizer, and compare its effectiveness with that of the original optimizer. The adaptive optimizer is especially advantageous in parallel optimization.	翻訳日:2022-01-31 19:22:15 公開日:2022-01-28
# (参考訳) 残留ポリシー勾配法による共通意味強化学習におけるクラス抽象化の活用 Leveraging class abstraction for commonsense reinforcement learning via residual policy gradient methods ( http://arxiv.org/abs/2201.12126v1 ) ライセンス: CC BY 4.0	Niklas H\"opner, Ilaria Tiddi, Herke van Hoof	(参考訳) 知識ベースを活用するために強化学習(RL)エージェントを導入し、経験から学習することで、知識集約ドメインにおいてRLを前進させる。しかし、手動で環境に合わせた知識を活用することは困難であることが証明されている。本稿では,オープンソース知識グラフに存在するサブクラス関係を利用して,特定のオブジェクトを抽象化することを提案する。我々は,クラス階層内の異なる抽象レベルにまたがる知識を統合可能な残留ポリシー勾配法を開発した。提案手法は,コモンセンスゲームにおいて,サンプル効率の向上とオブジェクトの一般化を実現するとともに,抽出したクラス知識の過度なノイズや,クラス構造がほとんどない環境など,障害モードについても検討する。 Enabling reinforcement learning (RL) agents to leverage a knowledge base while learning from experience promises to advance RL in knowledge intensive domains. However, it has proven difficult to leverage knowledge that is not manually tailored to the environment. We propose to use the subclass relationships present in open-source knowledge graphs to abstract away from specific objects. We develop a residual policy gradient method that is able to integrate knowledge across different abstraction levels in the class hierarchy. Our method results in improved sample efficiency and generalisation to unseen objects in commonsense games, but we also investigate failure modes, such as excessive noise in the extracted class knowledge or environments with little class structure.	翻訳日:2022-01-31 19:17:10 公開日:2022-01-28
# (参考訳) 教師付き機械学習における意思決定のための学習曲線 - サーベイ Learning Curves for Decision Making in Supervised Machine Learning -- A Survey ( http://arxiv.org/abs/2201.12150v1 ) ライセンス: CC BY 4.0	Felix Mohr, Jan N. van Rijn	(参考訳) 学習曲線(英: learning curves)とは、機械学習の文脈において、特定の資源(例えば、訓練例の数や訓練イテレーションの数)に対する学習アルゴリズムのパフォーマンスを評価するために採用されている社会科学の概念である。学習曲線は、機械学習のいくつかの文脈において重要な応用であり、最も重要なのは、データ取得の文脈、モデルのトレーニングの早期停止、モデル選択である。例えば、学習曲線をモデル化することで、アルゴリズムとハイパーパラメータの構成が適切な選択の可能性があるかどうかを早期に評価することができ、しばしばアルゴリズムの選択プロセスを高速化することができる。意思決定に学習曲線を使用するための様々なアプローチが提案されている。一部のモデルは、ある予算の特定のアルゴリズムが特定の参照性能を上回るかどうかという二分決定問題に答えるが、より複雑なモデルはアルゴリズムの学習曲線全体を予測する。学習曲線のアプローチを3つの基準で分類するフレームワーク、すなわち、対処する意思決定状況、彼らが回答する本質的学習曲線問題、彼らが使用するリソースの種類に分類する。文献から論文を調査し,この枠組みに分類する。 Learning curves are a concept from social sciences that has been adopted in the context of machine learning to assess the performance of a learning algorithm with respect to a certain resource, e.g. the number of training examples or the number of training iterations. Learning curves have important applications in several contexts of machine learning, most importantly for the context of data acquisition, early stopping of model training and model selection. For example, by modelling the learning curves, one can assess at an early stage whether the algorithm and hyperparameter configuration have the potential to be a suitable choice, often speeding up the algorithm selection process. A variety of approaches has been proposed to use learning curves for decision making. Some models answer the binary decision question of whether a certain algorithm at a certain budget will outperform a certain reference performance, whereas more complex models predict the entire learning curve of an algorithm. We contribute a framework that categorizes learning curve approaches using three criteria: the decision situation that they address, the intrinsic learning curve question that they answer and the type of resources that they use. 
We survey papers from literature and classify them into this framework.	翻訳日:2022-01-31 18:57:19 公開日:2022-01-28
# (参考訳) 分子コンフォメーションの生成的粗粒化 Generative Coarse-Graining of Molecular Conformations ( http://arxiv.org/abs/2201.12176v1 ) ライセンス: CC BY 4.0	Wujie Wang, Minkai Xu, Chen Cai, Benjamin Kurt Miller, Tess Smidt, Yusu Wang, Jian Tang, Rafael G\'omez-Bombarelli	(参考訳) 分子シミュレーションの粗粒化(CG)は、選択された原子を擬似ビーズに分類することで粒子表現を単純化し、シミュレーションを劇的に加速する。しかし,このようなcg処理は情報損失を誘発し,cg座標からの微細粒度 (fg) 座標の復元を精度良く行う。生成モデルと同変ネットワークの最近の進歩に触発されて、バックマッピング変換の重要な確率的性質と幾何的整合性要件を厳密に組み込む新しいモデルを提案する。我々のモデルはFGの不確かさを不変潜在空間にエンコードし、同変畳み込みを通じてFG測度に復号する。この領域の評価を標準化するために、分子動力学軌道に基づく3つの包括的なベンチマークも提供する。広範な実験によって、我々のアプローチは常により現実的な構造を回復し、既存のデータ駆動型メソッドをかなりのマージンで上回っています。 Coarse-graining (CG) of molecular simulations simplifies the particle representation by grouping selected atoms into pseudo-beads and therefore drastically accelerates simulation. However, such CG procedure induces information losses, which makes accurate backmapping, i.e., restoring fine-grained (FG) coordinates from CG coordinates, a long-standing challenge. Inspired by the recent progress in generative models and equivariant networks, we propose a novel model that rigorously embeds the vital probabilistic nature and geometric consistency requirements of the backmapping transformation. Our model encodes the FG uncertainties into an invariant latent space and decodes them back to FG geometries via equivariant convolutions. To standardize the evaluation of this domain, we further provide three comprehensive benchmarks based on molecular dynamics trajectories. Extensive experiments show that our approach always recovers more realistic structures and outperforms existing data-driven methods with a significant margin.	翻訳日:2022-01-31 18:21:06 公開日:2022-01-28
# (参考訳) カーネル空間における逆概念消去 Adversarial Concept Erasure in Kernel Space ( http://arxiv.org/abs/2201.12191v1 ) ライセンス: CC BY 4.0	Shauli Ravfogel and Francisco Vargas and Yoav Goldberg and Ryan Cotterell	(参考訳) テキストデータに対するニューラルモデルの表現空間は、トレーニング中に教師なしの方法で現れる。性別などの人間に解釈可能な概念がどのようにコード化されているかを理解することで、ユーザーはこれらの表現の内容を \emph{control} し、それらに依存するモデルの動作を分析する能力を向上させることができる。制御問題に対する顕著なアプローチの1つは、与えられた概念に対応する表現空間内の線型概念部分空間の同定と除去である。これらは扱いやすく解釈可能であるが、ニューラルネットワークは必ずしも線形部分空間の概念を表すものではない。我々は, [ravfogel et al. 2022] の線形概念除去目的のカーネル化を提案し, ある種の非線形敵が概念を回復する能力に対抗して有効であることを示した。興味深いことに、線形モデルと非線形モデルの間の分割は過度に単純化され、二項性の概念と中性化を考えると、すべての概念に関連する情報を排他的に含む単一のカーネル空間は見つからない。したがって、一度に \emph{all} 非線形敵から保護することは困難である。 The representation space of neural models for textual data emerges in an unsupervised manner during training. Understanding how human-interpretable concepts, such as gender, are encoded in these representations would improve the ability of users to \emph{control} the content of these representations and analyze the working of the models that rely on them. One prominent approach to the control problem is the identification and removal of linear concept subspaces -- subspaces in the representation space that correspond to a given concept. While those are tractable and interpretable, neural networks do not necessarily represent concepts in linear subspaces. We propose a kernelization of the linear concept-removal objective of [Ravfogel et al. 2022], and show that it is effective in guarding against the ability of certain nonlinear adversaries to recover the concept. Interestingly, our findings suggest that the division between linear and nonlinear models is overly simplistic: when considering the concept of binary gender and its neutralization, we do not find a single kernel space that exclusively contains all the concept-related information. It is therefore challenging to protect against \emph{all} nonlinear adversaries at once.	翻訳日:2022-01-31 17:47:09 公開日:2022-01-28
# (参考訳) バリセントリック符号化モデルにおける測度推定 Measure Estimation in the Barycentric Coding Model ( http://arxiv.org/abs/2201.12195v1 ) ライセンス: CC BY 4.0	Matthew Werenski, Ruijie Jiang, Abiy Tasissa, Shuchin Aeron, James M. Murphy	(参考訳) 本稿では、未知の測度が既知の測度の有限集合のワッサーシュタイン2バリーセンタの集合に属すると仮定する、バリー中心符号化モデル(BCM)に基づく測度推定の問題について考察する。このモデルの下で測度を推定することは、未知の重心座標を推定することと同値である。 3つの主要な結果からなるBCMに基づく測度推定のための新しい幾何学的,統計的,および計算的洞察を提供する。最初の主要な結果は、ワッサーシュタイン2空間のリーマン幾何学を利用して、真の基準測度へのアクセスを仮定する二次最適化問題の解として、偏心座標を復元する手順を提供する。本質的な幾何学的洞察は、この二次問題のパラメータは、与えられた測度からBCMを定義する基準測度までの最適変位写像の間の内部積によって決定されるということである。第2の主な結果は、すべての測定値がi.i.d.サンプルによって実証的に観測された場合、bcmにおける座標の解法を確立します。基礎となる測度とその次元の滑らかさによって決定されるこのアルゴリズムの正確な収束率を証明し、その統計的一貫性を保証する。最後に,3つの応用領域において,BCMと関連する評価手順の有用性を実証する。 (i)ガウス測度に対する共分散推定 (ii)画像処理,及び (iii)自然言語処理。 This paper considers the problem of measure estimation under the barycentric coding model (BCM), in which an unknown measure is assumed to belong to the set of Wasserstein-2 barycenters of a finite set of known measures. Estimating a measure under this model is equivalent to estimating the unknown barycentric coordinates. We provide novel geometrical, statistical, and computational insights for measure estimation under the BCM, consisting of three main results. Our first main result leverages the Riemannian geometry of Wasserstein-2 space to provide a procedure for recovering the barycentric coordinates as the solution to a quadratic optimization problem assuming access to the true reference measures. The essential geometric insight is that the parameters of this quadratic problem are determined by inner products between the optimal displacement maps from the given measure to the reference measures defining the BCM. Our second main result then establishes an algorithm for solving for the coordinates in the BCM when all the measures are observed empirically via i.i.d. samples. 
We prove precise rates of convergence for this algorithm -- determined by the smoothness of the underlying measures and their dimensionality -- thereby guaranteeing its statistical consistency. Finally, we demonstrate the utility of the BCM and associated estimation procedures in three application areas: (i) covariance estimation for Gaussian measures; (ii) image processing; and (iii) natural language processing.	翻訳日:2022-01-31 17:21:57 公開日:2022-01-28
# (参考訳) データ非依存関数による暗黙正則化の限界 Limitation of characterizing implicit regularization by data-independent functions ( http://arxiv.org/abs/2201.12198v1 ) ライセンス: CC BY 4.0	Leyang Zhang, Zhi-Qin John Xu, Tao Luo, Yaoyu Zhang	(参考訳) 近年,ニューラルネットワーク(nns)の暗黙的正規化の理解が深層学習理論の中心的な課題となっている。しかし、暗黙の正則化自体は完全に定義されておらず、よく理解されていない。本研究では,暗黙の正規化を数学的に定義し,研究する。重要なのは,データ独立関数による暗黙の正規化を特徴付ける共通アプローチの限界を検討することである。本稿では,2つの動的メカニズム,すなわち2点重なり合い機構を提案する。このメカニズムは,1つの隠れニューロンNNのクラスを生成するための2つのレシピを提供する。その結果,暗黙的正則化の深遠なデータ依存性が示され,将来におけるNNの暗黙的正則化のデータ依存性を詳細に研究するきっかけとなった。 In recent years, understanding the implicit regularization of neural networks (NNs) has become a central task of deep learning theory. However, implicit regularization is in itself not completely defined and well understood. In this work, we make an attempt to mathematically define and study the implicit regularization. Importantly, we explore the limitation of a common approach of characterizing the implicit regularization by data-independent functions. We propose two dynamical mechanisms, i.e., Two-point and One-point Overlapping mechanisms, based on which we provide two recipes for producing classes of one-hidden-neuron NNs that provably cannot be fully characterized by a type of or all data-independent functions. Our results signify the profound data-dependency of implicit regularization in general, inspiring us to study in detail the data-dependency of NN implicit regularization in the future.	翻訳日:2022-01-31 17:20:53 公開日:2022-01-28
# (参考訳) 分割コンカレントアルゴリズムによるパーコレーション逆問題の解法 Solving a percolation inverse problem with a divide-and-concur algorithm ( http://arxiv.org/abs/2201.12222v1 ) ライセンス: CC BY 4.0	Sean Deyo	(参考訳) 我々は,ダイオードネットワークに対するパーコレーション逆問題を提案する。どのノードが電流を相互にパーコレーションできるかという情報があれば,観測電流と一致したダイオードネットワークを構築することができるか? そこで本研究では,この問題の非自明な事例に対する包括的アプローチに対する手法の優位性を実証するため,分割・収束反復予測法を実装した。パーコレーションデータが全て隠されていない場合、問題は最も困難であり、一般に再構成する最も難しいネットワークは、電流が1つのダイオードの追加や削除に最も敏感である場合である。 We present a percolation inverse problem for diode networks: Given information about which pairs of nodes allow current to percolate from one to the other, can one construct a diode network consistent with the observed currents? We implement a divide-and-concur iterative projection method for solving the problem and demonstrate the supremacy of our method over an exhaustive approach for nontrivial instances of the problem. We find that the problem is most difficult when some but not all of the percolation data are hidden, and that the most difficult networks to reconstruct generally are those for which the currents are most sensitive to the addition or removal of a single diode.	翻訳日:2022-01-31 16:49:15 公開日:2022-01-28
# 探索の克服:時間論理の仕様から複雑な環境での深層強化学習 Overcoming Exploration: Deep Reinforcement Learning in Complex Environments from Temporal Logic Specifications ( http://arxiv.org/abs/2201.12231v1 ) ライセンス: Link先を確認	Mingyu Cai, Erfan Aasi, Calin Belta, Cristian-Ioan Vasile	(参考訳) 大規模複雑な環境に展開する未知の連続時間ダイナミクスを持つタスク誘導型ロボットに対して,深層強化学習(drl)アルゴリズムを提案する。リニア時間論理(LTL)は、リッチなロボット仕様を表現するために用いられる。環境問題に対処するため,我々は,未知のロボット力学により計算された幾何学的経路が実現不可能な状態空間に密接な経路計画誘導型報酬スキームを提案する。提案手法は,LTLミッションを分散DRLを用いて解いたサブタスクに分解し,そのサブタスクをDeep Policy Gradientアルゴリズムを用いて並列にトレーニングする。本フレームワークは,大規模複雑な環境下での複雑なミッションをこなすロボットの性能(有効性,効率)を著しく向上させる。 We present a Deep Reinforcement Learning (DRL) algorithm for a task-guided robot with unknown continuous-time dynamics deployed in a large-scale complex environment. Linear Temporal Logic (LTL) is applied to express a rich robotic specification. To overcome the environmental challenge, we propose a novel path planning-guided reward scheme that is dense over the state space, and crucially, robust to infeasibility of computed geometric paths due to the unknown robot dynamics. To facilitate LTL satisfaction, our approach decomposes the LTL mission into sub-tasks that are solved using distributed DRL, where the sub-tasks are trained in parallel, using Deep Policy Gradient algorithms. Our framework is shown to significantly improve performance (effectiveness, efficiency) and exploration of robots tasked with complex missions in large-scale complex environments.	翻訳日:2022-01-31 16:37:55 公開日:2022-01-28
# 二重学習音楽作曲と舞踊振付 Dual Learning Music Composition and Dance Choreography ( http://arxiv.org/abs/2201.11999v1 ) ライセンス: Link先を確認	Shuang Wu, Zhenguang Li, Shijian Lu, Li Cheng	(参考訳) 音楽とダンスは常に人間の活動の柱として共存しており、事実上全ての社会における文化的、社会的、娯楽的な機能に大きく貢献している。音楽とダンスが徐々に2つの独立した分野に体系化されているにもかかわらず、彼らの親密なつながりは否定できないものであり、一方の芸術形態はしばしば他方なしでは不完全に見える。近年の研究では、音楽に基づくダンスシーケンスの生成モデルが研究されている。しかし、与えられたダンスのために作曲する2つの課題は、ほとんど見落とされた。本稿では,両タスクを二重学習方式で協調的にモデル化する新しい拡張法を提案する。 2つのモダリティの双対性を活用するために,機能埋め込みを整列する最適な輸送目標を導入するとともに,全体の整合性を高めるためのサイクル整合損失も導入する。実験結果から,2つの学習フレームワークが個々のタスクパフォーマンスを改善し,生成した楽曲の合成とダンス振付を,条件付き入力に忠実に再現できることが確認された。 Music and dance have always co-existed as pillars of human activities, contributing immensely to the cultural, social, and entertainment functions in virtually all societies. Notwithstanding the gradual systematization of music and dance into two independent disciplines, their intimate connection is undeniable and one art-form often appears incomplete without the other. Recent research works have studied generative models for dance sequences conditioned on music. The dual task of composing music for given dances, however, has been largely overlooked. In this paper, we propose a novel extension, where we jointly model both tasks in a dual learning approach. To leverage the duality of the two modalities, we introduce an optimal transport objective to align feature embeddings, as well as a cycle consistency loss to foster overall consistency. Experimental results demonstrate that our dual learning framework improves individual task performance, delivering generated music compositions and dance choreographs that are realistic and faithful to the conditioned inputs.	翻訳日:2022-01-31 16:37:00 公開日:2022-01-28
# differential privacyのハイブリッドモデルにおける転送学習 Transfer Learning In Differential Privacy's Hybrid-Model ( http://arxiv.org/abs/2201.12018v1 ) ライセンス: Link先を確認	Refael Kohen and Or Sheffet	(参考訳) ディファレンシャルプライバシにおけるハイブリッドモデル(avent et al 2017)は、ローカルモデルの拡張であり、n人のローカルエイジェントに加えて、n人の追加的な個人の繊細な詳細を保持するキュレーターである1人の特別エージェントによって支援される。本稿では,キュレーターデータセットのn個の個体が一般集団(地域エージェント)とは異なる分布から引き出されるハイブリッドモデルにおける機械学習の問題点について考察する。我々は,このトランスファー学習問題に対して,反復サブサンプリングと乗法重みアルゴリズム(bun et al, 2020)の滑らかな変動に基づいて,キュレーターが保持するn個のサンプルの簡約化を用いて,キュレーターモデルdp-learnerをハイブリッドモデル学習者に還元する一般的なスキームを与える。提案手法は, 2つの分布間のカイ二乗発散に依存するサンプル複雑性を有する。プライベートリダクションに必要なサンプルの複雑さに、最悪のケース分析バウンダリを与えます。上記のサンプルの複雑さを減らすため、2つの特定のインスタンスにサンプルの複雑さを劇的に減らすことができる(1つのインスタンスは数学的に分析され、もう1つのインスタンスは経験的に分析される)。 The hybrid-model (Avent et al 2017) in Differential Privacy is an augmentation of the local-model where, in addition to N local-agents, we are assisted by one special agent who is in fact a curator holding the sensitive details of n additional individuals. Here we study the problem of machine learning in the hybrid-model where the n individuals in the curator's dataset are drawn from a different distribution than the one of the general population (the local-agents). We give a general scheme -- Subsample-Test-Reweigh -- for this transfer learning problem, which reduces any curator-model DP-learner to a hybrid-model learner in this setting using iterative subsampling and reweighing of the n examples held by the curator based on a smooth variation of the Multiplicative-Weights algorithm (introduced by Bun et al, 2020). Our scheme has a sample complexity which relies on the chi-squared divergence between the two distributions. We give worst-case analysis bounds on the sample complexity required for our private reduction. Aiming to reduce said sample complexity, we give two specific instances in which our sample complexity can be drastically reduced (one instance is analyzed mathematically, while the other is analyzed empirically) and pose several directions for follow-up work.	翻訳日:2022-01-31 16:36:42 公開日:2022-01-28
# alpa: 分散ディープラーニングのための操作間並列処理の自動化 Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning ( http://arxiv.org/abs/2201.12023v1 ) ライセンス: Link先を確認	Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, Yanping Huang, Yida Wang, Yuanzhong Xu, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica	(参考訳) Alpaは、データ、オペレータ、パイプライン並列性を統一する実行計画を生成することで、大規模なディープラーニング(DL)モデルのモデル並列トレーニングを自動化する。既存のモデル並列トレーニングシステムは、ユーザが手動で並列化計画を作成するか、モデル並列化設定の限られたスペースから自動的にモデルを生成する必要があるが、分散コンピューティングデバイス上で複雑なDLモデルをスケールアウトするのに十分ではない。 Alpaは、大きなDLモデルのトレーニングを、並列化を2つの階層レベルとして見ることによって配布する。これに基づいて、Alpaは大規模なモデル並列実行計画のための新しい階層空間を構築している。 Alpaは複数のコンパイルパスを設計し、各独立した並列処理レベルで最適な並列実行計画を自動的に導出し、分散コンピューティングデバイス上で2レベル並列実行をオーケストレーションする効率的なランタイムを実装している。評価の結果,alpaが設計したモデルでも,ハンドチューニング型モデル並列トレーニングシステムと一致するか,あるいは上回る並列化計画を生成することがわかった。特殊なシステムとは異なり、Alpaは手動設計の計画なしで異質なアーキテクチャやモデルを持つモデルに一般化する。 Alpa automates model-parallel training of large deep learning (DL) models by generating execution plans that unify data, operator, and pipeline parallelism. Existing model-parallel training systems either require users to manually create a parallelization plan or automatically generate one from a limited space of model parallelism configurations, which does not suffice to scale out complex DL models on distributed compute devices. Alpa distributes the training of large DL models by viewing parallelisms as two hierarchical levels: inter-operator and intra-operator parallelisms. Based on it, Alpa constructs a new hierarchical space for massive model-parallel execution plans. Alpa designs a number of compilation passes to automatically derive the optimal parallel execution plan in each independent parallelism level and implements an efficient runtime to orchestrate the two-level parallel execution on distributed compute devices. Our evaluation shows Alpa generates parallelization plans that match or outperform hand-tuned model-parallel training systems even on models they are designed for. 
Unlike specialized systems, Alpa also generalizes to models with heterogeneous architectures and models without manually-designed plans.	翻訳日:2022-01-31 16:34:38 公開日:2022-01-28
# MDCT領域における符号化音声の品質向上のためのDNNベースのポストフィルタ A DNN Based Post-Filter to Enhance the Quality of Coded Speech in MDCT Domain ( http://arxiv.org/abs/2201.12039v1 ) ライセンス: Link先を確認	Kishan Gupta, Srikanth Korse, Bernd Edler, Guillaume Fuchs	(参考訳) 周波数領域処理、特にMDCT(Modified Discrete Cosine Transform)は、オーディオ符号化において最も広く使われている手法である。しかし、低ビットレートでは、特に音声の音声品質は、変換係数を直接コードする利用可能なビットがないため、劇的に劣化する。伝統的に、ポストフィルタは、ソースのa-priori情報と余分な送信パラメータを利用して、符号化された音声のアーティファクトを緩和するために使われてきた。近年、データ駆動のポストフィルタはより良い結果を示しているが、複雑さと遅延が大幅に増大している。本研究では,コーデックのmdctドメイン内で直接動作し,余分な遅延を生じさせないマスクベースのポストフィルタを提案する。実数値マスクは量子化MDCT係数に適用され、比較的軽量な畳み込みエンコーダ・デコーダネットワークから推定される。本手法は,最近標準化された低遅延低複雑コーデック (LC3) 上で16kbpsの最小ビットレートで試験する。目的的および主観的評価は従来のポストフィルタよりもこのアプローチの利点を示し,LC3符号化音声よりも平均10MUSHRA点が向上した。 Frequency domain processing, and in particular the use of Modified Discrete Cosine Transform (MDCT), is the most widespread approach to audio coding. However, at low bitrates, audio quality, especially for speech, degrades drastically due to the lack of available bits to directly code the transform coefficients. Traditionally, post-filtering has been used to mitigate artefacts in the coded speech by exploiting a-priori information of the source and extra transmitted parameters. Recently, data-driven post-filters have shown better results, but at the cost of significant additional complexity and delay. In this work, we propose a mask-based post-filter operating directly in MDCT domain of the codec, inducing no extra delay. The real-valued mask is applied to the quantized MDCT coefficients and is estimated from a relatively lightweight convolutional encoder-decoder network. Our solution is tested on the recently standardized low-delay, low-complexity codec (LC3) at lowest possible bitrate of 16 kbps. Objective and subjective assessments clearly show the advantage of this approach over the conventional post-filter, with an average improvement of 10 MUSHRA points over the LC3 coded speech.	翻訳日:2022-01-31 16:34:18 公開日:2022-01-28
# 物理誘導型ニューラルネットワークによるフィードフォワード制御:トレーニングコスト正規化と最適化初期化 On feedforward control using physics-guided neural networks: Training cost regularization and optimized initialization ( http://arxiv.org/abs/2201.12088v1 ) ライセンス: Link先を確認	Max Bolderman, Mircea Lazar, Hans Butler	(参考訳) モデルベースフィードフォワードコントローラの性能は通常、逆系のダイナミクスモデルの精度によって制限される。物理誘導型ニューラルネットワーク(PGNN)は、同定された逆ダイナミクスの高精度化手法として最近提案されている。しかし、ニューラルネットワークのフレキシブルな性質は、物理モデルと並行して使用すると過パラメータ化を発生させ、トレーニング中にパラメータドリフトが発生する。このドリフトは、物理モデルのパラメータが物理値に対応しない可能性があるため、トレーニングデータに存在しない運用条件にpgnnの脆弱性が増大する。そこで本研究では, 同定された物理パラメータによる正規化法と, 学習収束を改善する最適化トレーニング初期化法を組み合わせることを提案する。正規化PGNNフレームワークは実生活の産業用リニアモータ上で検証され、追跡精度と外挿性が向上する。 Performance of model-based feedforward controllers is typically limited by the accuracy of the inverse system dynamics model. Physics-guided neural networks (PGNN), where a known physical model cooperates in parallel with a neural network, were recently proposed as a method to achieve high accuracy of the identified inverse dynamics. However, the flexible nature of neural networks can create overparameterization when employed in parallel with a physical model, which results in a parameter drift during training. This drift may result in parameters of the physical model not corresponding to their physical values, which increases vulnerability of the PGNN to operating conditions not present in the training data. To address this problem, this paper proposes a regularization method via identified physical parameters, in combination with an optimized training initialization that improves training convergence. The regularized PGNN framework is validated on a real-life industrial linear motor, where it delivers better tracking accuracy and extrapolation.	翻訳日:2022-01-31 16:33:59 公開日:2022-01-28
# バックドアがフロントドアに突っ込む:バックファイアのマルチエージェントバックドア攻撃 Backdoors Stuck At The Frontdoor: Multi-Agent Backdoor Attacks That Backfire ( http://arxiv.org/abs/2201.12211v1 ) ライセンス: Link先を確認	Siddhartha Datta, Nigel Shadbolt	(参考訳) 協調学習とアウトソースデータ収集における悪意あるエージェントは、クリーンモデルのトレーニングを脅かす。攻撃者が訓練中にモデルに毒を塗って標的の誤分類を成功させるバックドア攻撃は、列車時の堅牢性にとって大きな懸念事項である。本稿では,複数の攻撃者が同時に被害者モデルのバックドアを試みるマルチエージェントバックドア攻撃シナリオについて検討する。エージェントが集団攻撃の成功率の低いゲームで一貫したバックファイリング現象が観察される。バックドア攻撃の態様、非協調/協力、共同分散シフト、ゲーム設定の異なるモードを検証し、下位境界での平衡攻撃成功率を返却する。その結果,実践環境におけるバックドア防衛研究の再評価の動機となった。 Malicious agents in collaborative learning and outsourced data collection threaten the training of clean models. Backdoor attacks, where an attacker poisons a model during training to successfully achieve targeted misclassification, are a major concern to train-time robustness. In this paper, we investigate a multi-agent backdoor attack scenario, where multiple attackers attempt to backdoor a victim model simultaneously. A consistent backfiring phenomenon is observed across a wide range of games, where agents suffer from a low collective attack success rate. We examine different modes of backdoor attack configurations, non-cooperation / cooperation, joint distribution shifts, and game setups to return an equilibrium attack success rate at the lower bound. The results motivate the re-evaluation of backdoor defense research for practical environments.	翻訳日:2022-01-31 16:33:43 公開日:2022-01-28
# データ同化を用いたレベルセット法による海洋出口氷河の表面高さと終点位置のシミュレーション Simulating surface height and terminus position for marine outlet glaciers using a level set method with data assimilation ( http://arxiv.org/abs/2201.12235v1 ) ライセンス: Link先を確認	M. Alamgir Hossaina, Sam Pimentel, John M. Stockie	(参考訳) 本研究では,氷面と終端位置観測を数値氷流モデルに統合するデータ同化フレームワークを実装した。このモデルは、よく知られた浅層棚近似 (ssa) とレベルセット法を結合して氷の動きと氷河の幾何学の変化を捉えている。レベルセット法は、海洋出口氷河の発達する氷-大気圏と氷-海の境界を明示的に追跡する。氷の界面を記述するレベルセット関数を更新することにより,氷表面の標高と横方向の氷の深さの観測を統一するために,アンサンブル変換カルマンフィルタを用いる。理想的な海洋性氷河における数値実験は,季節・多年氷河の進行・後退サイクルを追跡するデータ同化手法の有効性を実証するものである。このモデルはまた、グリーンランド氷床の潮流を終止する主要な氷河であるヘルハイム氷河をシミュレートするためにも適用され、最近の急速な後退の歴史を経験した。リモートセンシングされた地表高度プロファイルからの観測を同化することで、移動氷河の終端と氷河の表面変化をより正確に追跡することができる。これらの結果は, 短期氷床力学のより正確な予測を行うためのデータ同化手法の利用を支持する。 We implement a data assimilation framework for integrating ice surface and terminus position observations into a numerical ice-flow model. The model uses the well-known shallow shelf approximation (SSA) coupled to a level set method to capture ice motion and changes in the glacier geometry. The level set method explicitly tracks the evolving ice-atmosphere and ice-ocean boundaries for a marine outlet glacier. We use an Ensemble Transform Kalman Filter to assimilate observations of ice surface elevation and lateral ice extent by updating the level set function that describes the ice interface. Numerical experiments on an idealized marine-terminating glacier demonstrate the effectiveness of our data assimilation approach for tracking seasonal and multi-year glacier advance and retreat cycles. The model is also applied to simulate Helheim Glacier, a major tidewater-terminating glacier of the Greenland Ice Sheet that has experienced a recent history of rapid retreat. By assimilating observations from remotely-sensed surface elevation profiles we are able to more accurately track the migrating glacier terminus and glacier surface changes. 
These results support the use of data assimilation methodologies for obtaining more accurate predictions of short-term ice sheet dynamics.	翻訳日:2022-01-31 16:33:29 公開日:2022-01-28
# 認証強化学習のための共同微分可能最適化と検証 Joint Differentiable Optimization and Verification for Certified Reinforcement Learning ( http://arxiv.org/abs/2201.12243v1 ) ライセンス: Link先を確認	Yixuan Wang, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu	(参考訳) 安全クリティカル制御システムのためのモデルベース強化学習では、学習コントローラの下でシステム特性(例えば、安全性、安定性)を正式に認定することが重要である。しかし、既存の手法は一般に正式な検証を施すため、コントローラが学習されているため、学習と検証を何度も繰り返したとしても、証明書を得るのは難しいことがある。そこで,本稿では,価値関数や証明書から勾配によって微分可能な新しい二段階最適化問題を定式化・解決することにより,強化学習と形式検証を共同で行う枠組みを提案する。 svg(model-based stochastic value gradient)法やppo(model-free proximal policy optimization)法に比べて,バリア関数やリアプノフ関数によるシステム安全性と安定性を確保するための実現可能なコントローラを見つける上で,様々な例で実験を行った。 In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties (e.g., safety, stability) under the learned controller. However, as existing methods typically apply formal verification \emph{after} the controller has been learned, it is sometimes difficult to obtain any certificate, even after many iterations between learning and verification. To address this challenge, we propose a framework that jointly conducts reinforcement learning and formal verification by formulating and solving a novel bilevel optimization problem, which is differentiable by the gradients from the value function and certificates. Experiments on a variety of examples demonstrate the significant advantages of our framework over the model-based stochastic value gradient (SVG) method and the model-free proximal policy optimization (PPO) method in finding feasible controllers with barrier functions and Lyapunov functions that ensure system safety and stability.	翻訳日:2022-01-31 16:33:08 公開日:2022-01-28
# 可変化を伴う適応加速度(Extra-)勾配法 Adaptive Accelerated (Extra-)Gradient Methods with Variance Reduction ( http://arxiv.org/abs/2201.12302v1 ) ライセンス: Link先を確認	Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy L. Nguyen	(参考訳) 本稿では,一般凸の場合に着目した有限サム凸最適化問題について検討する。近年, 分散低減(VR)法とその加速変種の研究は, わくわくする進歩を遂げている。しかし、既存のVRアルゴリズムで使用されるステップサイズは、しばしば未知であり、実際にチューニングを必要とする滑らかさパラメータに依存する。この問題に対処するため,Adaptive Variance Reduced Accelerated Extra-Gradient (AdaVRAE) とAdaptive Variance Reduced Accelerated Gradient (AdaVRAG) の2つの新しい適応VRアルゴリズムを提案する。我々のアルゴリズムは滑らかさパラメータの知識を必要としない。 AdaVRAE は $\mathcal{O}\left(n\log\log n+\sqrt {\frac{n\beta}{\epsilon}}\right)$グルーフ評価と AdaVRAG は $\mathcal{O}\left(n\log\log n+\sqrt {\frac{n\beta\log\beta}{\epsilon}}\right)$グルーフ評価を使用して $\mathcal{O}(\epsilon)$-suboptimal Solution を得る。この結果は、非適応型VR手法の最もよく知られた収束率と一致し、アート適応型VR手法であるAdaSVRGの収束率を改善する。実世界のデータセット実験における従来の手法と比較して,アルゴリズムの優れた性能を示す。 In this paper, we study the finite-sum convex optimization problem focusing on the general convex case. Recently, the study of variance reduced (VR) methods and their accelerated variants has made exciting progress. However, the step size used in the existing VR algorithms typically depends on the smoothness parameter, which is often unknown and requires tuning in practice. To address this problem, we propose two novel adaptive VR algorithms: Adaptive Variance Reduced Accelerated Extra-Gradient (AdaVRAE) and Adaptive Variance Reduced Accelerated Gradient (AdaVRAG). Our algorithms do not require knowledge of the smoothness parameter. AdaVRAE uses $\mathcal{O}\left(n\log\log n+\sqrt{\frac{n\beta}{\epsilon}}\right)$ gradient evaluations and AdaVRAG uses $\mathcal{O}\left(n\log\log n+\sqrt{\frac{n\beta\log\beta}{\epsilon}}\right)$ gradient evaluations to attain an $\mathcal{O}(\epsilon)$-suboptimal solution, where $n$ is the number of functions in the finite sum and $\beta$ is the smoothness parameter. 
This result matches the best-known convergence rate of non-adaptive VR methods and it improves upon the convergence of the state of the art adaptive VR method, AdaSVRG. We demonstrate the superior performance of our algorithms compared with previous methods in experiments on real-world datasets.	翻訳日:2022-01-31 16:32:46 公開日:2022-01-28
# 多数派支援の価格 The Price of Majority Support ( http://arxiv.org/abs/2201.12303v1 ) ライセンス: Link先を確認	Robin Fritsch and Roger Wattenhofer	(参考訳) 我々は,個人集団の意見の相違点を,相互に独立した連立的な話題で発見する問題を考察する。本稿では,結果が多数派支持を必要とすることによる代表性の喪失,すなわち「多数派支持の価格」を定量化する。各個人は、少なくとも、同意しないトピック数で結果に同意した場合、結果をサポートすると仮定される。我々の結果は、トピック別多数決の結果が多数派に支持されないかもしれないというアンスコムのパラドックスを定量化すると見なすこともできる。結果の代表性を測定するために,2つの指標を検討する。まず、できるだけ多くのトピックについて多数派と合意する結果を探します。我々は、この数のトピックで多数派と一致し、多数派が支持する結果が存在することが保証されるような最大数が$\lceil (t+1)/2 \rceil$となることを証明する。第2に、あるトピックに対する投票者の意見が、そのトピックの結果と一致する回数を数えます。ゴールは、最大数のマッチで多数派が支持する結果を見つけることである。我々は,この数字と,過半数の支持を得られないような総合的最適結果の一致数との比を考察する。我々は、多数派支持による結果と、ベスト全体に対するこの一致率が存在することが保証されるような最大比率を見出そうとする。 3つのトピックについて、この比率は$5/6\approx 0.83$である。一般に、$t$ が無限大に近づく傾向にあるような $2\sqrt{6}-4\approx 0.90$ に近い上限を証明する。さらに、より優れた上界と非整合な下界を、関連する範囲で$t$で数値計算する。 We consider the problem of finding a compromise between the opinions of a group of individuals on a number of mutually independent, binary topics. In this paper, we quantify the loss in representativeness that results from requiring the outcome to have majority support, in other words, the "price of majority support". Each individual is assumed to support an outcome if they agree with the outcome on at least as many topics as they disagree on. Our results can also be seen as quantifying Anscombe's paradox, which states that the topic-wise majority outcome may not be supported by a majority. To measure the representativeness of an outcome, we consider two metrics. First, we look for an outcome that agrees with a majority on as many topics as possible. We prove that the maximum number such that there is guaranteed to exist an outcome that agrees with a majority on this number of topics and has majority support, equals $\lceil (t+1)/2 \rceil$ where $t$ is the total number of topics. Second, we count the number of times a voter's opinion on a topic matches the outcome on that topic. The goal is to find the outcome with majority support with the largest number of matches. 
We consider the ratio between this number and the number of matches of the overall best outcome which may not have majority support. We try to find the maximum ratio such that an outcome with majority support and this ratio of matches compared to the overall best is guaranteed to exist. For 3 topics, we show this ratio to be $5/6\approx 0.83$. In general, we prove an upper bound that comes arbitrarily close to $2\sqrt{6}-4\approx 0.90$ as $t$ tends to infinity. Furthermore, we numerically compute a better upper and a non-matching lower bound in the relevant range for $t$.	翻訳日:2022-01-31 16:31:04 公開日:2022-01-28
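The support rule in the abstract above (a voter supports an outcome if they agree with it on at least as many topics as they disagree on) can be illustrated with a small sketch. The five-voter, three-topic profile below is a hypothetical example chosen for illustration, not data from the paper, and the function names are assumptions. It shows a case of Anscombe's paradox: the topic-wise majority outcome is supported by only a minority of voters.

```python
from typing import Sequence, Tuple


def topicwise_majority(opinions: Sequence[Sequence[int]]) -> Tuple[int, ...]:
    """Return the outcome that follows the strict majority on each topic (ties go to 0)."""
    n_voters = len(opinions)
    n_topics = len(opinions[0])
    return tuple(
        1 if 2 * sum(voter[t] for voter in opinions) > n_voters else 0
        for t in range(n_topics)
    )


def count_supporters(opinions: Sequence[Sequence[int]], outcome: Sequence[int]) -> int:
    """Count voters who agree with the outcome on at least as many topics as they disagree on."""
    count = 0
    for voter in opinions:
        agree = sum(1 for v, o in zip(voter, outcome) if v == o)
        disagree = len(outcome) - agree
        if agree >= disagree:
            count += 1
    return count


# Hypothetical profile: 5 voters (rows), 3 binary topics (columns).
OPINIONS = [
    (1, 1, 0),
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (0, 0, 0),
]

outcome = topicwise_majority(OPINIONS)
n_support = count_supporters(OPINIONS, outcome)
print("topic-wise majority outcome:", outcome)
print("supporters:", n_support, "of", len(OPINIONS))
```

In this profile every topic is decided 3 to 2, yet three of the five voters end up on the losing side of two topics each, so the topic-wise outcome lacks majority support.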
# 独立系鎖を持つ$n$プレイヤ確率ゲームにおける定常ナッシュ平衡ポリシの双対ミラー降下による学習 Learning Stationary Nash Equilibrium Policies in $n$-Player Stochastic Games with Independent Chains via Dual Mirror Descent ( http://arxiv.org/abs/2201.12224v1 ) ライセンス: Link先を確認	S. Rasoul Etesami	(参考訳) 我々は$n$プレーヤ確率ゲームのサブクラスについて検討し、プレイヤーはペイオフ関数を介して結合された状態で内部の状態/行動空間を持つ。プレイヤーの内部鎖は独立した遷移確率によって駆動されると仮定される。さらに、プレイヤーは実際の機能ではなく、それぞれのペイオフの実現しか受け取れず、お互いの状態や行動も観察できない。ペイオフ関数の構造に関するいくつかの仮定の下で、双対平均化と双対ミラー降下に基づく効率的な学習アルゴリズムを開発し、ほぼ確実に収束し、あるいは$\epsilon$-Nash均衡ポリシーの集合に期待できる。特に、ゲームパラメーターの観点から多項式的にスケールするイテレートの数の上界を導出して、$\epsilon$-Nash 平衡ポリシーを達成する。マルコフポテンシャルゲームや線形二次確率ゲームに加えて、この研究は、ある仮定の下では、$\epsilon$-Nash平衡ポリシーを見つけるために多項式時間学習アルゴリズムを確実に認める$n$プレイヤ確率ゲームの別の興味深いサブクラスを提供する。 We consider a subclass of $n$-player stochastic games, in which players have their own internal state/action spaces while they are coupled through their payoff functions. It is assumed that players' internal chains are driven by independent transition probabilities. Moreover, players can only receive realizations of their payoffs but not the actual functions, nor can they observe each other's states/actions. Under some assumptions on the structure of the payoff functions, we develop efficient learning algorithms based on Dual Averaging and Dual Mirror Descent, which provably converge almost surely or in expectation to the set of $\epsilon$-Nash equilibrium policies. In particular, we derive upper bounds on the number of iterates that scale polynomially in terms of the game parameters to achieve an $\epsilon$-Nash equilibrium policy. Besides Markov potential games and linear-quadratic stochastic games, this work provides another interesting subclass of $n$-player stochastic games that, under some assumptions, provably admit polynomial-time learning algorithms for finding their $\epsilon$-Nash equilibrium policies.	翻訳日:2022-01-31 16:30:36 公開日:2022-01-28
# (参考訳) 新しいダイナミックキャリブレーションによる予報と観測を組み合わせた短期風速アンサンブル予測のスキル向上 Increasing the skill of short-term wind speed ensemble forecasts combining forecasts and observations via a new dynamic calibration ( http://arxiv.org/abs/2201.12234v1 ) ライセンス: CC BY 4.0	Gabriele Casciaro, Francesco Ferrari, Daniele Lagomarsino Oneto, Andrea Lira-Loarca, Andrea Mazzino	(参考訳) 風力産業で使用される全ての数値気象予測モデルは、解析が利用可能になったら、メインのシナプス時間00,06,12,18 utcから予測を生成する必要がある。 2つの連続するモデル間の6時間の遅延時間は、少なくとも1時間の周波数を持つ新しい正確な予測を提供することで、ギャップを埋めるための戦略を要求する。これは、頻繁で正確で新鮮な情報をトレーダーやシステム規制当局から要求し、継続的に作業戦略を適用するために行われる。本稿では,準実時間観測風速と気象モデル予測を,新しいアンサンブルモデル出力統計(emos)戦略を用いて組み合わせる手法を提案する。本戦略の成功は,2018年と2019年のイタリア上空の観測風速との比較によって評価された。 All numerical weather prediction models used for the wind industry need to produce their forecasts starting from the main synoptic hours 00, 06, 12, and 18 UTC, once the analysis becomes available. The six-hour latency time between two consecutive model runs calls for strategies to fill the gap by providing new accurate predictions having, at least, hourly frequency. This is done to accommodate the request of frequent, accurate and fresh information from traders and system regulators to continuously adapt their work strategies. Here, we propose a strategy where quasi-real time observed wind speed and weather model predictions are combined by means of a novel Ensemble Model Output Statistics (EMOS) strategy. The success of our strategy is measured by comparisons against observed wind speed from SYNOP stations over Italy in the years 2018 and 2019.	翻訳日:2022-01-31 16:29:49 公開日:2022-01-28
# エンドツーエンドコード切り換え自動音声認識における言語コンテキスト混乱の低減 Reducing language context confusion for end-to-end code-switching automatic speech recognition ( http://arxiv.org/abs/2201.12155v1 ) ライセンス: Link先を確認	Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Jianhua Tao, Yu Ting Yeung, Liqun Deng	(参考訳) コードスイッチングは、コミュニケーションプロセスにおける代替言語を扱うことです。コードスイッチングのための訓練用エンドツーエンド(E2E)自動音声認識(ASR)システムは、複数の言語が存在するため、言語コンテキストの混乱によって複雑化するデータが少ないため、難しい問題であることが知られている。本稿では、等価制約理論(EC)に基づくE2E符号スイッチングASRモデルの多言語文脈混乱を低減するための言語関連注意機構を提案する。言語理論では、コードスイッチング文で発生する任意の単言語フラグメントは、一言語文の1つでなければならない。モノリンガルデータとコードスイッチングデータの間にブリッジを確立する。複数の言語のそれぞれの注意を計算することにより、豊かな単言語データから言語知識を効率的に伝達することができる。本手法をasru 2019 mandarin- english code-switching challengeデータセットで評価した。ベースラインモデルと比較して,提案手法は11.37%の相対混合誤差率低減を実現する。 Code-switching is about dealing with alternative languages in the communication process. Training end-to-end (E2E) automatic speech recognition (ASR) systems for code-switching is known to be a challenging problem because of the lack of data compounded by the increased language context confusion due to the presence of more than one language. In this paper, we propose a language-related attention mechanism to reduce multilingual context confusion for the E2E code-switching ASR model based on the Equivalence Constraint Theory (EC). The linguistic theory requires that any monolingual fragment that occurs in the code-switching sentence must occur in one of the monolingual sentences. It establishes a bridge between monolingual data and code-switching data. By calculating the respective attention of multiple languages, our method can efficiently transfer language knowledge from rich monolingual data. We evaluate our method on ASRU 2019 Mandarin-English code-switching challenge dataset. Compared with the baseline model, the proposed method achieves 11.37% relative mix error rate reduction.	翻訳日:2022-01-31 16:12:18 公開日:2022-01-28
# 協調運転自動化のためのインフラストラクチャに基づく物体検出と追跡:調査 Infrastructure-Based Object Detection and Tracking for Cooperative Driving Automation: A Survey ( http://arxiv.org/abs/2201.11871v1 ) ライセンス: Link先を確認	Zhengwei Bai, Guoyuan Wu, Xuewei Qi, Yongkang Liu, Kentaro Oguchi, Matthew J. Barth	(参考訳) オブジェクト検出は、現代交通システムの安全性、モビリティ、持続可能性問題に対処する革命的なソリューションであるCDA(Cooperative Driving Automation)を実現する上で、基本的な役割を果たす。現在のコンピュータビジョン技術は、咬合のないシナリオで十分な物体検出結果を提供できるが、搭載センサーの知覚性能は、範囲や咬合によって必然的に制限される可能性がある。センサ設置のための柔軟な位置とポーズのため、インフラストラクチャベースの検出および追跡システムは、コネクテッドカーの認識能力を高めることができ、すぐに最も人気のある研究トピックの1つとなる。本稿では,インフラに基づく物体検出・追跡システムの研究動向について述べる。各種センサに基づく道路サイドセンシングシステムのアーキテクチャをレビューし,インフラベースのセンシングシステムにおけるワークフローの高レベルな記述を示す。道路サイドセンサと異なる知覚方法論を詳細な文献でレビュー・分析し、特定の方法の低レベルな説明と、インフラストラクチャベースの物体検出と追跡方法の全体像を描くためのデータセットとシミュレータを提供する。議論は、現在の機会、オープンな問題、将来のトレンドを指摘するために行われます。 Object detection plays a fundamental role in enabling Cooperative Driving Automation (CDA), which is regarded as the revolutionary solution to addressing safety, mobility, and sustainability issues of contemporary transportation systems. Although current computer vision technologies could provide satisfactory object detection results in occlusion-free scenarios, the perception performance of onboard sensors could be inevitably limited by the range and occlusion. Owing to flexible position and pose for sensor installation, infrastructure-based detection and tracking systems can enhance the perception capability for connected vehicles and thus quickly become one of the most popular research topics. In this paper, we review the research progress for infrastructure-based object detection and tracking systems. Architectures of roadside perception systems based on different types of sensors are reviewed to show a high-level description of the workflows for infrastructure-based perception systems. 
Roadside sensors and different perception methodologies are reviewed and analyzed with detailed literature to provide a low-level explanation for specific methods followed by Datasets and Simulators to draw an overall landscape of infrastructure-based object detection and tracking methods. Discussions are conducted to point out current opportunities, open problems, and anticipated future trends.	翻訳日:2022-01-31 16:12:03 公開日:2022-01-28
# GAN生成顔画像の一般視品質評価 Generalized Visual Quality Assessment of GAN-Generated Face Images ( http://arxiv.org/abs/2201.11975v1 ) ライセンス: Link先を確認	Yu Tian and Zhangkai Ni and Baoliang Chen and Shiqi Wang and Hanli Wang and Sam Kwong	(参考訳) 近年では、gans(generative adversarial networks)による顔生成への関心が劇的に高まっている。異なるアプリケーションシナリオに対して鮮明な顔画像を生成するために、多くのganアルゴリズムが開発されている。しかし、そのようなGAN生成顔画像(GFI)の自動品質評価にはほとんど貢献していないが、GANモデルで生成されたGFIの一般化と堅牢な品質評価にはほとんど貢献していない。本稿では, GFIの汎用品質評価に向けて, 主観的, 客観的品質を研究するための最初の試みを行う。具体的には、4つのGANアルゴリズムのGFI、画像品質評価(IQA)尺度の擬似ラベル、および主観的テストによる人間の意見スコアからなる大規模データベースを構築した。その後,メタラーニングに基づくGANアルゴリズムを用いて,GFIの正確な品質予測を行うことができる品質評価モデルを開発した。特に、限られたGANアルゴリズムから生まれるGFIのペアから共有知識を学習するために、畳み込みブロック注意(CBA)と顔属性に基づく分析(ABA)モジュールを開発し、学習知識が人間の視覚的知覚と一致することを保証する。大規模実験により,提案モデルは最先端のIQAモデルと比較して性能が向上し,未知のGANアルゴリズムからGFIを評価する際の有効性を維持することができることがわかった。 Recent years have witnessed the dramatically increased interest in face generation with generative adversarial networks (GANs). A number of successful GAN algorithms have been developed to produce vivid face images towards different application scenarios. However, little work has been dedicated to automatic quality assessment of such GAN-generated face images (GFIs), even less have been devoted to generalized and robust quality assessment of GFIs generated with unseen GAN model. Herein, we make the first attempt to study the subjective and objective quality towards generalized quality assessment of GFIs. More specifically, we establish a large-scale database consisting of GFIs from four GAN algorithms, the pseudo labels from image quality assessment (IQA) measures, as well as the human opinion scores via subjective testing. Subsequently, we develop a quality assessment model that is able to deliver accurate quality predictions for GFIs from both available and unseen GAN algorithms based on meta-learning. 
In particular, to learn shared knowledge from GFIs pairs that are born of limited GAN algorithms, we develop the convolutional block attention (CBA) and facial attributes-based analysis (ABA) modules, ensuring that the learned knowledge tends to be consistent with human visual perception. Extensive experiments exhibit that the proposed model achieves better performance compared with the state-of-the-art IQA models, and is capable of retaining the effectiveness when evaluating GFIs from the unseen GAN algorithms.	翻訳日:2022-01-31 16:11:43 公開日:2022-01-28
# 深部畳み込みニューラルネットワークを用いた超音波画像における頸動脈壁セグメンテーション Carotid artery wall segmentation in ultrasound image sequences using a deep convolutional neural network ( http://arxiv.org/abs/2201.12152v1 ) ライセンス: Link先を確認	Nolann Lainé, Guillaume Zahnd, Hervé Liebgott, Maciej Orkisz	(参考訳) 本研究の目的は, 総頸動脈 intima-media complex の経時的超音波像による分画を行い, その厚みを計測することである。拡張されたU-netネットワークに基づく教師付き領域ベースディープラーニングアプローチを含む完全自動領域ベースセグメンテーション手法を提案する。 2人の専門家が注釈付けした2176の画像からなるマルチセンターデータベース上で、5倍のクロスバリデーションを用いてトレーニングと評価を行った。その結果,参照アノテーションと比較して平均絶対差(<120 µm)は,観察者間変動 (180 µm) よりも低かった。 98.7%の成功率、すなわち手動修正を必要とする症例は1.3%に過ぎず、提案手法は堅牢であり、臨床応用に推奨される可能性がある。 The objective of this study is the segmentation of the intima-media complex of the common carotid artery, on longitudinal ultrasound images, to measure its thickness. We propose a fully automatic region-based segmentation method, involving a supervised region-based deep-learning approach based on a dilated U-net network. It was trained and evaluated using a 5-fold cross-validation on a multicenter database composed of 2176 images annotated by two experts. The resulting mean absolute difference (<120 µm) compared to reference annotations was less than the inter-observer variability (180 µm). With a 98.7% success rate, i.e., only 1.3% of cases requiring manual correction, the proposed method has been shown to be robust and thus may be recommended for use in clinical practice.	翻訳日:2022-01-31 16:11:18 公開日:2022-01-28
# VRT:ビデオ再生用トランス VRT: A Video Restoration Transformer ( http://arxiv.org/abs/2201.12288v1 ) ライセンス: Link先を確認	Jingyun Liang and Jiezhang Cao and Yuchen Fan and Kai Zhang and Rakesh Ranjan and Yawei Li and Radu Timofte and Luc Van Gool	(参考訳) ビデオ復元(ビデオスーパーレゾリューション)は、高品質のフレームを低品質のフレームから復元することを目的としている。単一の画像復元とは異なり、ビデオ復元は通常、隣接する複数のビデオフレームの時間的情報を利用する必要がある。既存のディープメソッドは、スライディングウィンドウ戦略やリカレントアーキテクチャを利用して、フレーム毎の復元や長距離モデリング能力の欠如によって制限される。本稿では,並列フレーム予測と長距離時間依存性モデリング機能を備えたビデオ再生変換器(VRT)を提案する。より具体的には、VRTは複数のスケールから構成されており、それぞれが時間的相互自己注意(TMSA)と並列ワープの2種類のモジュールで構成されている。 tmsaは動画を小さなクリップに分割し、相互注意を関節の動きの推定、特徴のアライメント、特徴の融合に応用し、自己注意を特徴抽出に使用する。クロスクリップインタラクションを可能にするために、ビデオシーケンスを他のレイヤ毎にシフトする。また、並列処理は、隣接するフレームからの情報を並列特徴ワープによってさらに融合するために用いられる。ビデオスーパーレゾリューション、ビデオデブロアリング、ビデオデノーミングを含む3つのタスクの実験結果は、VRTが9つのベンチマークデータセットで最先端の手法よりも大きなマージン($2.16dB}$)で優れていることを示した。 Video restoration (e.g., video super-resolution) aims to restore high-quality frames from low-quality frames. Different from single image restoration, video restoration generally requires to utilize temporal information from multiple adjacent but usually misaligned video frames. Existing deep methods generally tackle with this by exploiting a sliding window strategy or a recurrent architecture, which either is restricted by frame-by-frame restoration or lacks long-range modelling ability. In this paper, we propose a Video Restoration Transformer (VRT) with parallel frame prediction and long-range temporal dependency modelling abilities. More specifically, VRT is composed of multiple scales, each of which consists of two kinds of modules: temporal mutual self attention (TMSA) and parallel warping. TMSA divides the video into small clips, on which mutual attention is applied for joint motion estimation, feature alignment and feature fusion, while self attention is used for feature extraction. To enable cross-clip interactions, the video sequence is shifted for every other layer. 
Besides, parallel warping is used to further fuse information from neighboring frames by parallel feature warping. Experimental results on three tasks, including video super-resolution, video deblurring and video denoising, demonstrate that VRT outperforms the state-of-the-art methods by large margins (up to 2.16 dB) on nine benchmark datasets.	翻訳日:2022-01-31 16:11:06 公開日:2022-01-28
# 完全自動NMRタンパク質構造決定のためのディープラーニングの活用 Leveraging deep learning for fully automated NMR protein structure determination ( http://arxiv.org/abs/2201.12041v1 ) ライセンス: Link先を確認	Piotr Klukowski, Roland Riek, Peter Güntert	(参考訳) 核磁気共鳴分光法は、タンパク質データバンクに11800以上のタンパク質構造が蓄積された構造生物学における主要な技術の一つである。 NMRは溶液、生体細胞、固体中の中小タンパク質の構造と動態を解明することができるが、退屈なデータ解析プロセスによって制限されている。通常、NMR測定をタンパク質構造に変えるには、訓練された専門家の手作業が数週間から数ヶ月かかる。このプロセスの自動化は、30年以上前にこの分野で定式化されたオープンな問題です。ここでは、この課題に対処する最初のアプローチを示す。本手法はnmrスペクトルとタンパク質配列のみを入力として使用し,人間の介入なしに構造を厳密に伝達する。 100タンパク質のベンチマーク (1329 2D/3D/4D NMRスペクトル) で、ARTINAは、PDB基準に1.44 Å 中央のRMSDと91.36%の正しいNMR共鳴割り当てを持つ構造を解く能力を示した。 ARTINAは非専門家によって使用することができ、NMRによるタンパク質構造決定の労力を、基本的に試料とスペクトルの測定のために削減することができる。 Nuclear Magnetic Resonance (NMR) spectroscopy is one of the major techniques in structural biology with over 11800 protein structures deposited in the Protein Data Bank. NMR can elucidate structures and dynamics of small and medium size proteins in solution, living cells, and solids, but has been limited by the tedious data analysis process. It typically requires weeks or months of manual work by a trained expert to turn NMR measurements into a protein structure. Automation of this process is an open problem, formulated in the field over 30 years ago. Here, we present the first approach that addresses this challenge. Our method, ARTINA, uses as input only NMR spectra and the protein sequence, delivering a structure strictly without any human intervention. Tested on a 100-protein benchmark (1329 2D/3D/4D NMR spectra), ARTINA demonstrated its ability to solve structures with 1.44 Å median RMSD to the PDB reference and 91.36% correct NMR resonance assignments. ARTINA can be used by non-experts, reducing the effort for a protein structure determination by NMR essentially to the preparation of the sample and the spectra measurements.	翻訳日:2022-01-31 16:08:08 公開日:2022-01-28
# 生成型ガイトネット Generative GaitNet ( http://arxiv.org/abs/2201.12044v1 ) ライセンス: Link先を確認	Jungnam Park, Sehee Min, Phil Sik Chang, Jaedong Lee, Moonseok Park, Jehee Lee	(参考訳) 解剖学と歩行の関係を理解することは、予測歩行シミュレーションの成功の鍵となる。本稿では,304本のHill型筋腱(musculotendons)を用いた全身筋骨格モデルを制御するための,深層強化学習に基づく新しいネットワークアーキテクチャであるGenerative GaitNetを提案する。生成ゲイト(Generative Gait)は、解剖学的条件(例えば、質量分布、体比、骨変形、筋肉の欠損など)の618次元連続領域で学習された人工ニューラルネットワークの訓練済み、未処理のシステムである。事前学習されたゲイトネットは、入力として解剖学と歩行条件を取り、物理に基づくシミュレーションを通じて条件に合った一連の歩行サイクルを生成させる。我々は,実時間物理学に基づくシミュレーションにおいて,Generative GaitNetの有効性と表現力について検討する。 Understanding the relation between anatomy and gait is key to successful predictive gait simulation. In this paper, we present Generative GaitNet, which is a novel network architecture based on deep reinforcement learning for controlling a comprehensive, full-body, musculoskeletal model with 304 Hill-type musculotendons. The Generative Gait is a pre-trained, integrated system of artificial neural networks learned in a 618-dimensional continuous domain of anatomy conditions (e.g., mass distribution, body proportion, bone deformity, and muscle deficits) and gait conditions (e.g., stride and cadence). The pre-trained GaitNet takes anatomy and gait conditions as input and generates a series of gait cycles appropriate to the conditions through physics-based simulation. We will demonstrate the efficacy and expressive power of Generative GaitNet to generate a variety of healthy and pathologic human gaits in real-time physics-based simulation.	翻訳日:2022-01-31 16:07:49 公開日:2022-01-28
# HEAT: ハイパーエッジアテンションネットワーク HEAT: Hyperedge Attention Networks ( http://arxiv.org/abs/2201.12113v1 ) ライセンス: Link先を確認	Dobrik Georgiev, Marc Brockschmidt, Miltiadis Allamanis	(参考訳) 構造化データからの学習は、コア機械学習タスクである。一般的に、そのようなデータはグラフとして表現され、通常はノードのペア間の(型付けされた)バイナリ関係しか考慮しない。これは、高度に構造化されたデータを持つ多くのドメインにとって実質的な制限である。このようなドメインの重要な1つはソースコードであり、ハイパーグラフベースの表現は、セマンティックにリッチで構造化されたコードの性質をよりよく捉えることができる。本稿では,型付きハイパーグラフと資格付きハイパーグラフを表現可能なニューラルモデルであるHEATについて述べる。これは、メッセージパッシングニューラルネットワークとトランスフォーマーの両方の一般化と見なすことができる。本稿では,プログラムのハイパーグラフ表現を用いた知識ベース補完とバグ検出と修復について評価する。どちらの設定でも、強力なベースラインよりも優れており、そのパワーと汎用性を示している。 Learning from structured data is a core machine learning task. Commonly, such data is represented as graphs, which normally only consider (typed) binary relationships between pairs of nodes. This is a substantial limitation for many domains with highly-structured data. One important such domain is source code, where hypergraph-based representations can better capture the semantically rich and structured nature of code. In this work, we present HEAT, a neural model capable of representing typed and qualified hypergraphs, where each hyperedge explicitly qualifies how participating nodes contribute. It can be viewed as a generalization of both message passing neural networks and Transformers. We evaluate HEAT on knowledge base completion and on bug detection and repair using a novel hypergraph representation of programs. In both settings, it outperforms strong baselines, indicating its power and generality.	翻訳日:2022-01-31 16:07:33 公開日:2022-01-28
# (参考訳) risknet: 信頼できない資源のネットワークにおける神経リスク評価 RiskNet: Neural Risk Assessment in Networks of Unreliable Resources ( http://arxiv.org/abs/2201.12263v1 ) ライセンス: CC BY 4.0	Krzysztof Rusek, Piotr Boryło, Piotr Jaglarz, Fabien Geyer, Albert Cabellos, Piotr Chołda	(参考訳) 作業経路とバックアップ経路間で共有されるリソースによって接続が保護される通信ネットワークにおいて、障害によって引き起こされる罰則の分布を予測するグラフニューラルネットワーク(GNN)に基づく手法を提案する。 GNNベースのアルゴリズムは、Barabási-Albertモデルで生成されたランダムグラフでのみ訓練される。しかし, 得られた実験結果から, 既存の様々なトポロジにおいて, ペナルティを正確にモデル化できることが示唆された。 GNNは、研究中のネットワークトポロジの複雑な停止シナリオをシミュレートする必要がない。実際には、設計操作は現代のハードウェアでは4msに制限されている。このようにして、12,000倍以上のスピード改善を達成できます。 We propose a graph neural network (GNN)-based method to predict the distribution of penalties induced by outages in communication networks, where connections are protected by resources shared between working and backup paths. The GNN-based algorithm is trained only with random graphs generated with the Barabási-Albert model. Even so, the obtained test results show that we can precisely model the penalties in a wide range of existing topologies. GNNs eliminate the need to simulate complex outage scenarios for the network topologies under study. In practice, the whole design operation takes at most 4 ms on modern hardware. This way, we can achieve a speedup of more than 12,000 times.	翻訳日:2022-01-31 16:06:03 公開日:2022-01-28
# ニューロンの勾配降下とその近似2次最適化への応用 Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization ( http://arxiv.org/abs/2201.12250v1 ) ライセンス: Link先を確認	Frederik Benzing	(参考訳) 二階オプティマイザはニューラルネットワークのトレーニングを高速化する可能性を持っていると考えられているが、曲率行列の巨大さのため、計算的に扱いやすい近似が必要となる。最も成功した近似の族はクロネッカー因子付きブロック対角曲率推定 (kfac) である。ここでは、事前の作業から得られたツールを組み合わせて、正確な2次更新を評価するとともに、驚くべき結果を得るための注意深いアブレーションを行う: その近似のため、kfacは2次更新と密接に関連しておらず、特に、真の2次更新よりも大幅に優れています。この課題は広く信じられており、なぜKFACがうまく機能するのかという疑問を即座に提起している。我々は、KFACが重みよりもニューロンに勾配降下を行う1次アルゴリズムを近似していることを示し、この問題に答える。最後に、この最適化は計算コストとデータ効率の観点から、KFACよりも良くなることを示す。 Second-order optimizers are thought to hold the potential to speed up neural network training, but due to the enormous size of the curvature matrix, they typically require approximations to be computationally tractable. The most successful family of approximations are Kronecker-Factored, block-diagonal curvature estimates (KFAC). Here, we combine tools from prior work to evaluate exact second-order updates with careful ablations to establish a surprising result: Due to its approximations, KFAC is not closely related to second-order updates, and in particular, it significantly outperforms true second-order updates. This challenges widely held beliefs and immediately raises the question of why KFAC performs so well. We answer this question by showing that KFAC approximates a first-order algorithm, which performs gradient descent on neurons rather than weights. Finally, we show that this optimizer often improves over KFAC in terms of computational cost and data-efficiency.	翻訳日:2022-01-31 15:53:25 公開日:2022-01-28
# 差分プライバシーによるImageNetスケールのトレーニングに向けて Toward Training at ImageNet Scale with Differential Privacy ( http://arxiv.org/abs/2201.12328v1 ) ライセンス: Link先を確認	Alexey Kurakin, Steve Chien, Shuang Song, Roxana Geambasu, Andreas Terzis, Abhradeep Thakurta	(参考訳) 差分プライバシー(DP)は、ニューラルネットワークを含む機械学習(ML)モデルをトレーニングするためのデファクトスタンダードであり、トレーニングセット内の個々のサンプルのプライバシを保証する。 MLモデルを異なるプライバシでトレーニングする方法に関する豊富な文献があるにも関わらず、現実の大規模ニューラルネットワークを適切な精度とプライバシの両方でトレーニングすることは、依然として極めて難しい。そこで私たちは,イメージネット画像分類をMLタスクのポスター例として用いて,DPで正確に解決することが非常に困難である点を調査した。本論文は,我々の取り組みから得た最初の教訓を共有し,他の研究者に大規模にdpトレーニングを探求するよう促し,知らせることを目的としている。 DPトレーニングを高速化する上で有効なアプローチと、DPに向いているトレーニングプロセスのモデルタイプと設定を示す。この方法を組み合わせることで、差分プライバシーを持つResnet-18を47.9%の精度とプライバシパラメータにトレーニングすることができる。$\epsilon = 10, \delta = 10^{-6}$, ImagenetモデルのDP-SGDトレーニングよりも大幅に改善されるが、同じネットワークがプライバシなしで取得できる7,5\%の正確さとは程遠い。私たちはコードをhttps://github.com/google-research/dp-imagenetで共有しています。 Differential privacy (DP) is the de facto standard for training machine learning (ML) models, including neural networks, while ensuring the privacy of individual examples in the training set. Despite a rich literature on how to train ML models with differential privacy, it remains extremely challenging to train real-life, large neural networks with both reasonable accuracy and privacy. We set out to investigate how to do this, using ImageNet image classification as a poster example of an ML task that is very challenging to resolve accurately with DP right now. This paper shares initial lessons from our effort, in the hope that it will inspire and inform other researchers to explore DP training at scale. We show approaches which help to make DP training faster, as well as model types and settings of the training process that tend to work better for DP. 
Combined, the methods we discuss let us train a Resnet-18 with differential privacy to 47.9% accuracy and privacy parameters $\epsilon = 10, \delta = 10^{-6}$, a significant improvement over "naive" DP-SGD training of Imagenet models but a far cry from the $75\%$ accuracy that can be obtained by the same network without privacy. We share our code at https://github.com/google-research/dp-imagenet calling for others to join us in moving the needle further on DP at scale.	翻訳日:2022-01-31 15:53:10 公開日:2022-01-28
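The DP-SGD baseline mentioned in the abstract above is commonly described as clipping each per-example gradient and adding Gaussian noise before averaging. The following is a minimal sketch of that generic update on plain Python lists. It is an illustration under stated assumptions, not the code from the linked repository: the function names, the toy gradients, and the parameter values are all hypothetical, and privacy accounting (computing $\epsilon$ and $\delta$) is omitted entirely.

```python
import math
import random
from typing import List


def clip_to_norm(grad: List[float], max_norm: float) -> List[float]:
    """Scale a gradient down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gradient(
    per_example_grads: List[List[float]],
    max_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_to_norm(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm  # noise scale tied to the clipping bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / len(clipped) for x in noisy]


# Hypothetical toy batch: two examples, two parameters.
toy_grads = [[3.0, 4.0], [0.3, 0.4]]
rng = random.Random(0)

# With noise_multiplier = 0 the result is just the mean of the clipped gradients.
no_noise = dp_sgd_gradient(toy_grads, max_norm=1.0, noise_multiplier=0.0, rng=rng)
with_noise = dp_sgd_gradient(toy_grads, max_norm=1.0, noise_multiplier=1.1, rng=rng)
print("clipped mean:", no_noise)
print("noisy estimate:", with_noise)
```

Clipping bounds the influence of any single example on the update, which is what lets the added noise translate into a privacy guarantee; the accuracy gap reported in the abstract reflects the cost of that noise at ImageNet scale.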
# 地域最適化 Regionalized optimization ( http://arxiv.org/abs/2201.11876v1 ) ライセンス: Link先を確認	Grégoire Sergeant-Perthuis	(参考訳) Yedidia, Freeman, Weiss の参照論文 "Constructing Free Energy Approximations and Generalized Belief Propagation Algorithms" において、一般エネルギーの一般エネルギー(Generalized Bethe Free Energy)と呼ばれる領域ベースの自由エネルギー近似を導入することにより、一般エネルギー伝播の根底にある変動原理が存在することを示した。彼らは一般信条の固定点がこの自由エネルギーの臨界点であることの証明をスケッチし、この証明はペルトルの論文で完成した。本稿では,局所最適化問題のパッチングとして定義される最適化問題のクラスと,アルゴリズムの臨界点と固定点との対応が成立するメッセージパッシングアルゴリズムを特定する。このフレームワークには多くのアプリケーションがあり、そのうちの1つはフィルタデータのためのpcaであり、領域確率に対する確率的互換性の制約を伴うmaxentの領域ベース近似である。このようなアプローチは、マルチモーダルな統合による推論や、複数のビューを持つシーンでの推論に特に当てはまる。 Yedidia, Freeman, Weiss have shown in their reference article, "Constructing Free Energy Approximations and Generalized Belief Propagation Algorithms", that there is a variational principle underlying the General Belief Propagation, by introducing a region-based free energy approximation of the MaxEnt free energy, that we will call the Generalized Bethe free energy. They sketched a proof that fixed points of the General Belief Propagation are critical points of this free energy; this proof was completed in the thesis of Peltre. In this paper we identify a class of optimization problems defined as patching local optimization problems and associated message passing algorithms for which such correspondence between critical points and fixed points of the algorithms holds. This framework has many applications, one of which is a PCA for filtered data and a region-based approximation of MaxEnt with stochastic compatibility constraints on the region probabilities. Such an approach is particularly suited to inference with multimodal integration and inference on scenes with multiple views.	翻訳日:2022-01-31 15:52:22 公開日:2022-01-28
# 二重メタ模倣学習による階層構造伝達 Transfering Hierarchical Structure with Dual Meta Imitation Learning ( http://arxiv.org/abs/2201.11981v1 ) ライセンス: Link先を確認	Chongkai Gao, Yizhou Jiang, Feng Chen	(参考訳) 階層的模倣学習(hil)は、ロボットが長いホリゾンのデモからサブスキルを学ぶ効果的な方法である。しかし、学習された階層構造は、マルチタスクや新しいタスクに転送するメカニズムを欠いているため、新しい状況に直面した時にスクラッチから学ぶ必要がある。モジュラーサブスキルの転送と再構成は階層構造全体の迅速な適応能力を必要とする。本研究では,ハイレベルネットワークとサブスキルをモデルに依存しないメタ学習で反復的にメタ学習する階層的メタ模倣学習法であるDual Meta Imitation Learning (DMIL)を提案する。 DMILは、各サブスキルからのステートアクションペアの可能性をハイレベルネットワーク適応の監督に利用し、適応されたハイレベルネットワークを使用して、サブスキル適応毎に異なるデータセットを決定する。我々は,DMILの反復学習過程の収束を理論的に証明し,DMILと期待最大化アルゴリズムの接続を確立する。実験により,Meta-world ベンチマークによる最先端数発の模倣学習性能と,Kitchen 環境の長期タスクにおける競合結果が得られた。 Hierarchical Imitation Learning (HIL) is an effective way for robots to learn sub-skills from long-horizon unsegmented demonstrations. However, the learned hierarchical structure lacks the mechanism to transfer across multi-tasks or to new tasks, which makes them have to learn from scratch when facing a new situation. Transferring and reorganizing modular sub-skills require fast adaptation ability of the whole hierarchical structure. In this work, we propose Dual Meta Imitation Learning (DMIL), a hierarchical meta imitation learning method where the high-level network and sub-skills are iteratively meta-learned with model-agnostic meta-learning. DMIL uses the likelihood of state-action pairs from each sub-skill as the supervision for the high-level network adaptation, and uses the adapted high-level network to determine a different data set for each sub-skill adaptation. We theoretically prove the convergence of the iterative training process of DMIL and establish the connection between DMIL and the Expectation-Maximization algorithm. Empirically, we achieve state-of-the-art few-shot imitation learning performance on the Meta-world benchmark and competitive results on long-horizon tasks of Kitchen environments.	翻訳日:2022-01-31 15:52:02 公開日:2022-01-28
# エンドツーエンド音声認識のためのニューラルFSTクラス言語モデル Neural-FST Class Language Model for End-to-End Speech Recognition ( http://arxiv.org/abs/2201.11867v1 ) ライセンス: Link先を確認	Antoine Bruguier, Duc Le, Rohit Prabhavalkar, Dangna Li, Zhe Liu, Bo Wang, Eun Chang, Fuchun Peng, Ozlem Kalinli, Michael L. Seltzer	(参考訳) ニューラルネットワーク言語モデル(NNLM)と有限状態トランスデューサ(FST)を数学的に一貫した枠組みで組み合わせた,エンドツーエンド音声認識のためのニューラルFSTクラス言語モデル(NFCLM)を提案する。提案手法は,汎用的な背景テキストをモデル化するバックグラウンドNNLMと,個別FSTとしてモデル化されたドメイン固有エンティティのコレクションを利用する。それぞれの出力トークンはこれらの成分の混合によって生成され、混合重みは個別に訓練された神経決定器で推定される。その結果,NFCLMは単語誤り率においてNNLMを15.8%上回っていることがわかった。 NFCLM は従来の NNLM や FST の浅層核融合と同等の性能を保ちながら、オーバーバイアスや12倍のコンパクトさを保ち、デバイス上での使用に適している。 We propose Neural-FST Class Language Model (NFCLM) for end-to-end speech recognition, a novel method that combines neural network language models (NNLMs) and finite state transducers (FSTs) in a mathematically consistent framework. Our method utilizes a background NNLM which models generic background text together with a collection of domain-specific entities modeled as individual FSTs. Each output token is generated by a mixture of these components; the mixture weights are estimated with a separately trained neural decider. We show that NFCLM significantly outperforms NNLM by 15.8% relative in terms of Word Error Rate. NFCLM achieves similar performance as traditional NNLM and FST shallow fusion while being less prone to overbiasing and 12 times more compact, making it more suitable for on-device usage.	翻訳日:2022-01-31 15:51:41 公開日:2022-01-28
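The abstract above states that each output token is generated by a mixture of components with estimated mixture weights. As a rough illustration of what a mixture of component distributions means, the sketch below combines a background distribution with one entity-list distribution using fixed weights. This is a deliberately simplified, hypothetical example: the vocabularies, probabilities, and weights are invented, the entity component is a plain dictionary rather than an FST, and the weights are constants rather than the output of a trained neural decider.

```python
from typing import Dict, List


def mixture_prob(
    token: str,
    components: List[Dict[str, float]],
    weights: List[float],
) -> float:
    """Probability of a token under a weighted mixture of component distributions."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * dist.get(token, 0.0) for w, dist in zip(weights, components))


# Hypothetical next-token distributions (each sums to 1).
background_lm = {"call": 0.6, "alice": 0.1, "bob": 0.3}  # stand-in for a background NNLM
contact_names = {"alice": 0.5, "bob": 0.5}               # stand-in for an entity FST

# Hypothetical fixed mixture weights (background, contacts).
weights = [0.8, 0.2]

p_alice = mixture_prob("alice", [background_lm, contact_names], weights)
total = sum(
    mixture_prob(tok, [background_lm, contact_names], weights)
    for tok in background_lm
)
print("P(alice) =", p_alice)
print("sum over vocabulary =", total)
```

Because the weights sum to one and each component is a proper distribution, the mixture is itself a proper distribution, which is one way to read the abstract's claim of a mathematically consistent framework.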
# DiffGAN-TTS: 拡散GANを用いた高忠実かつ効率的なテキスト音声合成 DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs ( http://arxiv.org/abs/2201.11972v1 ) ライセンス: Link先を確認	Songxiang Liu, Dan Su, Dong Yu	(参考訳) 拡散確率モデル (DDPM) は、様々な音声合成問題を解くために用いられてきた表現的生成モデルである。しかし,サンプリングコストが高いため,リアルタイム音声処理ではDDPMの使用が困難である。本稿では,DiffGAN-TTSについて紹介する。DiffGAN-TTSは,高忠実で効率的な音声合成を実現する新しいDDPMベースのテキスト音声合成モデルである。 DiffGAN-TTSは拡散生成逆数ネットワーク (GAN) をデノナイズし、デノナイズ分布を近似するために逆向きに訓練された表現モデルを採用する。 DiffGAN-TTSは4ステップで高忠実度音声サンプルを生成可能であることを示す。さらに, 推定を高速化するために, アクティブな浅層拡散機構を提案する。ステージ2で訓練されたDDPMに貴重な事前情報を提供する基本的TSS音響モデルを用いて,2段階のトレーニングスキームを提案する。実験の結果,DiffGAN-TTSは1段階のみの高合成性能が得られることがわかった。 Denoising diffusion probabilistic models (DDPMs) are expressive generative models that have been used to solve a variety of speech synthesis problems. However, because of their high sampling costs, DDPMs are difficult to use in real-time speech processing applications. In this paper, we introduce DiffGAN-TTS, a novel DDPM-based text-to-speech (TTS) model achieving high-fidelity and efficient speech synthesis. DiffGAN-TTS is based on denoising diffusion generative adversarial networks (GANs), which adopt an adversarially-trained expressive model to approximate the denoising distribution. We show with multi-speaker TTS experiments that DiffGAN-TTS can generate high-fidelity speech samples within only 4 denoising steps. We present an active shallow diffusion mechanism to further speed up inference. A two-stage training scheme is proposed, with a basic TTS acoustic model trained at stage one providing valuable prior information for a DDPM trained at stage two. Our experiments show that DiffGAN-TTS can achieve high synthesis performance with only 1 denoising step.	翻訳日:2022-01-31 15:51:24 公開日:2022-01-28
# (参考訳) ermよりも一般化された重み付けが改善しない理由を理解する Understanding Why Generalized Reweighting Does Not Improve Over ERM ( http://arxiv.org/abs/2201.12293v1 ) ライセンス: CC BY 4.0	Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar	(参考訳) 経験的リスク最小化(experimental risk minimization, erm)は、トレーニングとテスト分布が異なる分散シフトに対する非ロバストであることが知られている。重み付けや分散ロバスト最適化(DRO)の変種といった一連のアプローチがこの問題を解決するために提案されている。しかし、最近の一連の研究は、分散シフトを伴う実際のアプリケーションにおいて、これらのアプローチがermを大幅に改善していないことを実証している。この研究の目的は、この興味深い現象の総合的な理論的理解を得ることである。まず、トレーニングサンプルの反復的再重み付けに基づいてモデルパラメータを反復的に更新するアプローチの幅広いカテゴリとして、一般再重み付け(GRW)アルゴリズムのクラスを仮定する。 GRWで過度パラメータ化モデルをトレーニングした場合,得られたモデルはERMで得られたモデルに近いことを示す。また,経験的トレーニング精度に大きく影響しない小さな正規化を加えることは,効果がないことを示す。以上より,grwアプローチの幅広いカテゴリは,分布的にロバストな一般化を実現することができないことを示した。分布的に堅牢な一般化に向けて進むためには、非GRWアプローチを開発するか、あるいはGRWアプローチのクラスに適応した新しい分類/回帰損失関数を考案する必要がある。 Empirical risk minimization (ERM) is known in practice to be non-robust to distributional shift where the training and the test distributions are different. A suite of approaches, such as importance weighting, and variants of distributionally robust optimization (DRO), have been proposed to solve this problem. But a line of recent work has empirically shown that these approaches do not significantly improve over ERM in real applications with distribution shift. The goal of this work is to obtain a comprehensive theoretical understanding of this intriguing phenomenon. We first posit the class of Generalized Reweighting (GRW) algorithms, as a broad category of approaches that iteratively update model parameters based on iterative reweighting of the training samples. We show that when overparameterized models are trained under GRW, the resulting models are close to that obtained by ERM. We also show that adding small regularization which does not greatly affect the empirical training accuracy does not help. Together, our results show that a broad category of what we term GRW approaches are not able to achieve distributionally robust generalization. 
Our work thus has the following sobering takeaway: to make progress towards distributionally robust generalization, we either have to develop non-GRW approaches, or perhaps devise novel classification/regression loss functions that are adapted to the class of GRW approaches.	翻訳日:2022-01-31 15:50:26 公開日:2022-01-28
# コストボリュームに基づくスパース不均質伝播によるステレオマッチング Stereo Matching with Cost Volume based Sparse Disparity Propagation ( http://arxiv.org/abs/2201.11937v1 ) ライセンス: Link先を確認	Wei Xue and Xiaojiang Peng	(参考訳) 両眼立体視にはステレオマッチングが不可欠である。既存の手法は主に、立体マッチングを改善するための単純な不均等写像の融合に焦点を当てている。本稿では,特徴分散伝播と呼ばれる,シンプルながら斬新な手法を提案し,コストの一致量とスパースマッチング特徴点に基づく一般的なステレオマッチングを改善する。具体的には、まず、局所的な特徴マッチングにより、信頼性の高いスパース不一致マップを計算し、その後、マッチングコスト領域において、隣接する画素に信頼性のある不一致を伝播することにより、その不一致マップを洗練する。また,局所的格差領域の勾配および多スケール情報を考慮して,コスト集約ステップがなくてもコストボリュームのロバスト性を保証するad-censusに基づく$\rho$-censusコスト尺度を提案する。ミドルベリーステレオベンチマークv3における広範囲な実験により,提案手法が最先端手法に匹敵する有望な性能を実現することを実証した。 Stereo matching is crucial for binocular stereo vision. Existing methods mainly focus on simple disparity map fusion to improve stereo matching, which require multiple dense or sparse disparity maps. In this paper, we propose a simple yet novel scheme, termed feature disparity propagation, to improve general stereo matching based on matching cost volume and sparse matching feature points. Specifically, our scheme first calculates a reliable sparse disparity map by local feature matching, and then refines the disparity map by propagating reliable disparities to neighboring pixels in the matching cost domain. In addition, considering the gradient and multi-scale information of local disparity regions, we present a $\rho$-Census cost measure based on the well-known AD-Census, which guarantees the robustness of cost volume even without the cost aggregation step. Extensive experiments on Middlebury stereo benchmark V3 demonstrate that our scheme achieves promising performance comparable to state-of-the-art methods.	翻訳日:2022-01-31 15:47:52 公開日:2022-01-28
# 教師なし人物再識別のためのクラスタアンサンブルを用いたハイブリッドコントラスト学習 Hybrid Contrastive Learning with Cluster Ensemble for Unsupervised Person Re-identification ( http://arxiv.org/abs/2201.11995v1 ) ライセンス: Link先を確認	He Sun, Mingkun Li, Chun-Guang Li	(参考訳) unsupervised person re-identification (reid) は、歩行者の問合せ画像とギャラリーセットの画像とを、監督ラベルなしでマッチングすることを目的としている。教師なしのreidに取り組む最も一般的なアプローチは、通常、クラスタリングアルゴリズムを実行して疑似ラベルを生成し、疑似ラベルを利用してディープニューラルネットワークをトレーニングすることです。しかし、疑似ラベルは、クラスタリングアルゴリズムのハイパーパラメータ(s)に対してノイズと感度がある。本稿では,インスタンスレベルのコントラスト損失関数とクラスタレベルのコントラスト損失関数のハイブリッドに基づく教師なしreidのためのハイブリッドコントラスト学習(hcl)手法を提案する。さらに,多粒度クラスタリング型ハイブリッドコントラスト学習(MGCE-HCL)アプローチを提案する。この手法は,擬似正のサンプルペア間の優先情報をマイニングするために,多粒度クラスタリングアンサンブル戦略を採用し,擬似正のサンプルのノイズを許容するための優先重み付きハイブリッドコントラスト損失を定義する。ベンチマークデータセットである Market-1501 と DukeMTMC-reID について広範な実験を行った。提案の有効性を実験的に検証した。 Unsupervised person re-identification (ReID) aims to match a query image of a pedestrian to the images in gallery set without supervision labels. The most popular approaches to tackle unsupervised person ReID are usually performing a clustering algorithm to yield pseudo labels at first and then exploit the pseudo labels to train a deep neural network. However, the pseudo labels are noisy and sensitive to the hyper-parameter(s) in clustering algorithm. In this paper, we propose a Hybrid Contrastive Learning (HCL) approach for unsupervised person ReID, which is based on a hybrid between instance-level and cluster-level contrastive loss functions. Moreover, we present a Multi-Granularity Clustering Ensemble based Hybrid Contrastive Learning (MGCE-HCL) approach, which adopts a multi-granularity clustering ensemble strategy to mine priority information among the pseudo positive sample pairs and defines a priority-weighted hybrid contrastive loss for better tolerating the noises in the pseudo positive samples. We conduct extensive experiments on two benchmark datasets Market-1501 and DukeMTMC-reID. Experimental results validate the effectiveness of our proposals.	
翻訳日:2022-01-31 15:47:32 公開日:2022-01-28
# ぼやけたイメージの展開 Unfolding a blurred image ( http://arxiv.org/abs/2201.12010v1 ) ライセンス: Link先を確認	Kuldeep Purohit, Anshul Shah, A. N. Rajagopalan	(参考訳) 本稿では,1つの動きのぼやけた画像から映像を抽出し,露光時にカメラが保持するシーンの明快な視点を順次再構成することを目的とする。まず,ビデオ再構成のサロゲートタスクを実行する畳み込み再生ビデオオートエンコーダネットワークのトレーニングを通じて,シャープビデオからの映像表現を教師なしで学習する。訓練後、ぼやけた画像のためのモーションエンコーダのガイドトレーニングに使用される。このネットワークは、ぼやけた画像から埋め込み動作情報を抽出し、トレーニングされたリカレントビデオデコーダとともにシャープビデオを生成する。中間的なステップとして,リアルタイムの単一画像の分解と,精度,速度,コンパクト性など,競合するすべての要因に対する性能向上が可能な効率的なアーキテクチャを設計する。実際のシーンと標準データセットに関する実験は、最先端のフレームワークの優位性と、時間的に一貫性のあるシャープフレームの生成能力を示しています。 We present a solution for the goal of extracting a video from a single motion blurred image to sequentially reconstruct the clear views of a scene as beheld by the camera during the time of exposure. We first learn motion representation from sharp videos in an unsupervised manner through training of a convolutional recurrent video autoencoder network that performs a surrogate task of video reconstruction. Once trained, it is employed for guided training of a motion encoder for blurred images. This network extracts embedded motion information from the blurred image to generate a sharp video in conjunction with the trained recurrent video decoder. As an intermediate step, we also design an efficient architecture that enables real-time single image deblurring and outperforms competing methods across all factors: accuracy, speed, and compactness. Experiments on real scenes and standard datasets demonstrate the superiority of our framework over the state-of-the-art and its ability to generate a plausible sequence of temporally consistent sharp frames.	翻訳日:2022-01-31 15:47:11 公開日:2022-01-28
# 注意誘導フレームアソシエーションを用いたRGB-D SLAM RGB-D SLAM Using Attention Guided Frame Association ( http://arxiv.org/abs/2201.12047v1 ) ライセンス: Link先を確認	Ali Caglayan, Nevrez Imamoglu, Oguzhan Guclu, Ali Osman Serhatoglu, Weimin Wang, Ahmet Burak Can, Ryosuke Nakamura	(参考訳) 新たなトピックとしてのディープラーニングモデルは、さまざまな分野で大きな進歩を見せている。特に、クラスアクティベーションマッピング法のような可視化ツールは、畳み込みニューラルネットワーク(CNN)の推論に関する視覚的な説明を提供する。ネットワーク層の勾配を用いることで、特定の画像認識タスク中にネットワークがどこに注意を払っているかを示すことができる。さらに、これらの勾配はcnnの機能と統合でき、シーン内のより一般化されたタスク依存注意(salient)オブジェクトをローカライズすることができる。この進歩にもかかわらず、オブジェクトセマンティクスのcnn表現と統合するために、この勾配(ネットワークの注意)情報はあまり明確には使われていない。これは、空間的に注意すべき物体位置のCNN表現が性能改善につながるような、同時局所化とマッピング(SLAM)のような視覚的タスクに非常に有用である。そこで本研究では,RGB-D屋内SLAMにおけるタスク固有ネットワークアテンションの利用を提案する。そこで我々は,最新のRGB-D屋内SLAM法において,CNN層表現とレイヤワイドオブジェクトアテンション情報(レイヤ勾配)を統合し,フレームアソシエーション性能を向上させる。実験はパフォーマンスを向上して有望な初期結果を示す。 Deep learning models as an emerging topic have shown great progress in various fields. Especially, visualization tools such as class activation mapping methods provided visual explanation on the reasoning of convolutional neural networks (CNNs). By using the gradients of the network layers, it is possible to demonstrate where the networks pay attention during a specific image recognition task. Moreover, these gradients can be integrated with CNN features for localizing more generalized task dependent attentive (salient) objects in scenes. Despite this progress, there is not much explicit usage of this gradient (network attention) information to integrate with CNN representations for object semantics. This can be very useful for visual tasks such as simultaneous localization and mapping (SLAM) where CNN representations of spatially attentive object locations may lead to improved performance. Therefore, in this work, we propose the use of task specific network attention for RGB-D indoor SLAM. 
To do so, we integrate layer-wise object attention information (layer gradients) with CNN layer representations to improve frame association performance in a state-of-the-art RGB-D indoor SLAM method. Experiments show promising initial results with improved performance.	翻訳日:2022-01-31 15:46:56 公開日:2022-01-28
# ビデオ中の偽顔の検出 Detection of fake faces in videos ( http://arxiv.org/abs/2201.12051v1 ) ライセンス: Link先を確認	M. Shamanth, Russel Mathias, Dr Vijayalakshmi MN	(参考訳) ディープ・ラーニング・方法論は、プライバシー、民主主義、国家安全保障に脅威をもたらし、悪意のある活動をさらに増幅するアプリケーションを作成するために用いられてきた。ディープラーニングを利用した最近のアプリケーションの一つが、有名人格の合成ビデオだ。 Forbesによると、GAN(Generative Adversarial Networks)は、毎年急速に成長するフェイクビデオを生成し、Deeptraceとして知られる組織は2018年から2019年にかけて、ディープフェイクを84%増加させたと見積もっている。それらは人間の顔の生成と修正に使われており、既存のフェイクビデオのほとんどは、その推定では96%、一部ではサイバー犯罪の個人性を偽造している。本稿では、利用可能なビデオデータセットを特定し、顔検出にプリトレーニングモデルblazefaceを使用し、データセット上でトレーニングされたresnetおよびxceptionアンサンブルアーキテクチャドニューラルネットワークを使用して、ビデオ中の偽顔検出の目標を達成する。このモデルは損失値とログ損失値よりも最適化され、そのf1スコアで評価される。データのサンプルでは、焦点損失のガンマがハイパーパラメータとなるにつれて、焦点損失がより良い精度、F1のスコアと損失をもたらすことが観察された。これにより、トレーニングサイクルのピーク時のk折りの精度は約91%となり、実際の精度はモデルが崩壊するにつれて経時的に変化する。 Deep learning methodologies have been used to create applications that can cause threats to privacy, democracy and national security and could be used to further amplify malicious activities. One of those deep learning-powered applications in recent times is synthesized videos of famous personalities. According to Forbes, fake videos generated by Generative Adversarial Networks (GANs) are growing exponentially every year, and the organization known as Deeptrace estimated an 84% increase in deepfakes from 2018 to 2019. GANs are used to generate and modify human faces; most existing fake videos, an estimated 96%, are of a prurient non-consensual nature, and some are used to impersonate personalities for cyber crime. In this paper, available video datasets are identified, a pretrained BlazeFace model is used to detect faces, and an ensemble of ResNet and Xception networks is trained on the dataset to detect fake faces in videos. The model is optimized over a loss value and log loss values and evaluated by its F1 score. 
Over a sample of data, it is observed that focal loss provides better accuracy, F1 score and loss as the gamma of the focal loss becomes a hyperparameter. This provides a k-fold accuracy of around 91% at its peak in a training cycle, with the real-world accuracy subject to change over time as the model decays.	翻訳日:2022-01-31 15:46:36 公開日:2022-01-28
# デジタル顔画像操作検出における人の心理的評価 Psychophysical Evaluation of Human Performance in Detecting Digital Face Image Manipulations ( http://arxiv.org/abs/2201.12084v1 ) ライセンス: Link先を確認	Robert Nichols, Christian Rathgeb, Pawel Drozdowski, Christoph Busch	(参考訳) 近年では、国境管理や法執行など、セキュリティ上重要な設定における顔認識技術の導入が増加し、デジタル操作された顔画像に基づいて発行される正統な文書を利用する攻撃に対する顔認識システムの脆弱性にかなりの関心が寄せられている。自動操作と攻撃検出は依然として困難な課題であり、人間の検査者が身元確認を行う従来のプロセスは不可欠である。これらの状況は、操作された顔画像を検出する人間の能力をより深く研究する上で有効であり、この分野での以前の研究は疎く、しばしば特定のシナリオや生体特性にのみ集中している。本研究は、心理物理学の分野から採用されている原則に基づき、webベースの遠隔視覚識別実験を行い、その後、顔の交換、モーフィング、リタッチなど、様々な種類のデジタル操作された顔画像の検出における人間の習熟度を調べることを目的として、学際的な機会について論じる。適切な性能測定値の解析に加えて,検出可能性の指標も検討した。 306個のプロバンドによる実験データによると、検出性能は個体群全体に広く分布しており、特定の種類の顔画像操作の検出は他よりもはるかに困難である。 In recent years, increasing deployment of face recognition technology in security-critical settings, such as border control or law enforcement, has led to considerable interest in the vulnerability of face recognition systems to attacks utilising legitimate documents, which are issued on the basis of digitally manipulated face images. As automated manipulation and attack detection remains a challenging task, conventional processes with human inspectors performing identity verification remain indispensable. These circumstances merit a closer investigation of human capabilities in detecting manipulated face images, as previous work in this field is sparse and often concentrated only on specific scenarios and biometric characteristics. This work introduces a web-based, remote visual discrimination experiment on the basis of principles adopted from the field of psychophysics and subsequently discusses interdisciplinary opportunities with the aim of examining human proficiency in detecting different types of digitally manipulated face images, specifically face swapping, morphing, and retouching. In addition to analysing appropriate performance measures, a possible metric of detectability is explored. 
Experimental data of 306 probands indicate that detection performance is widely distributed across the population and detection of certain types of face image manipulations is much more challenging than others.	翻訳日:2022-01-31 15:46:11 公開日:2022-01-28
# 点クラウド登録のための近傍対応幾何エンコーディングネットワーク Neighborhood-aware Geometric Encoding Network for Point Cloud Registration ( http://arxiv.org/abs/2201.12094v1 ) ライセンス: Link先を確認	Lifa Zhu, Haining Guan, Changwei Lin, Renmin Han	(参考訳) 幾何的特徴の区別は、点雲登録の成功を決定する。しかし、ほとんどの点の雲は部分的に重なり、ノイズによって破損し、識別不能な表面で構成されているため、識別的特徴を抽出することは困難である。本稿では,正確なポイントクラウド登録のためのNighborhood-aware Geometric Encoding Network (NgeNet)を提案する。 NgeNetは幾何学的特徴を考慮に入れた幾何学的ガイド付き符号化モジュール、異なるスケールで意味的にリッチな領域にフォーカスするマルチスケールアーキテクチャ、および適切な近傍サイズを持つ特徴を選択し、特異な特徴を拒絶する一貫した投票戦略を利用する。適応的な近傍点の認識は、投票を伴うマルチスケールアーキテクチャを通して得られる。具体的には、NgeNetの手法はモデルに依存しないため、他のネットワークに容易に移行できる。屋内、屋外、およびオブジェクト中心の合成データセットに関する総合的な実験は、NgeNetが公開された最先端の手法をすべて超越していることを示している。コードはhttps://github.com/zhulf0804/NgeNetで入手できる。 The distinguishing geometric features determine the success of point cloud registration. However, most point clouds are partially overlapping, corrupted by noise, and comprised of indistinguishable surfaces, which makes it a challenge to extract discriminative features. Here, we propose the Neighborhood-aware Geometric Encoding Network (NgeNet) for accurate point cloud registration. NgeNet utilizes a geometric guided encoding module to take geometric characteristics into consideration, a multi-scale architecture to focus on the semantically rich regions in different scales, and a consistent voting strategy to select features with proper neighborhood size and reject the specious features. The awareness of adaptive neighborhood points is obtained through the multi-scale architecture accompanied by voting. Specifically, the proposed techniques in NgeNet are model-agnostic, which could be easily migrated to other networks. Comprehensive experiments on indoor, outdoor and object-centric synthetic datasets demonstrate that NgeNet surpasses all of the published state-of-the-art methods. The code will be available at https://github.com/zhulf0804/NgeNet.	翻訳日:2022-01-31 15:45:48 公開日:2022-01-28
# HSADML:脳腫瘍分類のための超球面角深度に基づく学習 HSADML: Hyper-Sphere Angular Deep Metric based Learning for Brain Tumor Classification ( http://arxiv.org/abs/2201.12269v1 ) ライセンス: Link先を確認	Aman Verma and Vibhav Prakash Singh	(参考訳) 脳腫瘍は脳の領域を貫通する集合細胞の異常な質量である。タイムリーな識別と分類は、医師が適切な治療を行うのに役立つ。しかし、脳腫瘍の分類は、高いイントラクラス類似性と低インタークラス変動のため、かなり複雑である。様々なMRIクラスにおける形態的類似性のため、課題はさらに深まる。この全ては分類モデルの一般化を妨げている。そこで本稿では,SphereFace Loss を用いた深層メトリック学習(DML)を実現する新しいフレームワークであるHSADMLを提案する。 SphereFace lossは、特徴を超球面マニフォールドに埋め込み、埋め込みにマージンを課し、クラス間の差別化性を高める。 SphereFace lossベースのディープメトリック学習を利用することで、同じクラスのサンプルがクラスタ化され、異なるサンプルが引き離される。 k-NN (k=1) を用いた98.69%の検証精度を達成し,98.47%の検証精度が得られたもののクラス間分離性とクラス内近接性が制限された通常のソフトマックス損失トレーニングよりも有意に高い。様々な分類器と損失関数設定による実験的解析は、アプローチの可能性を示唆している。 Brain tumors are abnormal masses of clustered cells penetrating regions of the brain. Their timely identification and classification help doctors to provide appropriate treatment. However, classification of brain tumors is quite intricate because of high intra-class similarity and low inter-class variability. Due to morphological similarity amongst various MRI slices of different classes, the challenge deepens further. All this hampers the generalizability of classification models. To this end, this paper proposes HSADML, a novel framework which enables deep metric learning (DML) using SphereFace loss. SphereFace loss embeds the features into a hyperspheric manifold and then imposes a margin on the embeddings to enhance differentiability between the classes. Utilizing SphereFace-loss-based deep metric learning ensures that samples from the same class are clustered together while different ones are pushed apart. Results reflect the prominence of the approach: the proposed framework achieved a state-of-the-art 98.69% validation accuracy using k-NN (k=1), which is significantly higher than normal SoftMax loss training, which obtains 98.47% validation accuracy but with limited inter-class separability and intra-class closeness. 
Experimental analysis done over various classifiers and loss function settings suggests potential in the approach.	翻訳日:2022-01-31 15:44:35 公開日:2022-01-28
# DAB-DETR: DETRのための動的アンカーボックス DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR ( http://arxiv.org/abs/2201.12329v1 ) ライセンス: Link先を確認	Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi, Hang Su, Jun Zhu, Lei Zhang	(参考訳) 本稿では,DETR(DEtection TRansformer)のための動的アンカーボックスを用いた新しいクエリ定式化を行い,DETRにおけるクエリの役割についてより深く理解する。この新たな定式化は、Transformerデコーダのクエリとしてボックス座標を直接使用し、層ごとに動的に更新する。ボックス座標を用いることで,クエリ・ツー・フィーチャーの類似性を向上し,DETRの遅いトレーニング収束問題を解消するだけでなく,ボックス幅と高さ情報を用いて位置アテンションマップを変調することが可能になる。このような設計により、DETRにおけるクエリは、カスケード方式で層ごとにソフトROIプーリングを行うものとして実装可能であることが明らかになる。その結果、同じ設定下のDETR系検出モデルの中で、MS-COCOベンチマークにおいて最高の性能を達成した(例えば、ResNet50-DC5をバックボーンとして50エポック学習した場合にAP 45.7%)。また,本手法の有効性を検証するため,広範な実験を行った。コードは https://github.com/SlongLiu/DAB-DETR で入手できる。 We present in this paper a novel query formulation using dynamic anchor boxes for DETR (DEtection TRansformer) and offer a deeper understanding of the role of queries in DETR. This new formulation directly uses box coordinates as queries in Transformer decoders and dynamically updates them layer-by-layer. Using box coordinates not only helps exploit explicit positional priors to improve the query-to-feature similarity and eliminate the slow training convergence issue in DETR, but also allows us to modulate the positional attention map using the box width and height information. Such a design makes it clear that queries in DETR can be implemented as performing soft ROI pooling layer-by-layer in a cascade manner. As a result, it leads to the best performance on the MS-COCO benchmark among the DETR-like detection models under the same setting, e.g., AP 45.7% using ResNet50-DC5 as the backbone trained for 50 epochs. We also conducted extensive experiments to confirm our analysis and verify the effectiveness of our methods. Code is available at https://github.com/SlongLiu/DAB-DETR.	翻訳日:2022-01-31 15:44:10 公開日:2022-01-28
# 不完全対称力学に対する近似同値ネットワーク Approximately Equivariant Networks for Imperfectly Symmetric Dynamics ( http://arxiv.org/abs/2201.11969v1 ) ライセンス: Link先を確認	Rui Wang, Robin Walters, Rose Yu	(参考訳) ニューラルネットワークアーキテクチャにインダクティブバイアスとして対称性を組み込むことで、ダイナミクスモデリングにおける一般化、データ効率、物理的一貫性が向上した。 cnnや等変ニューラルネットワークのような手法では、シフト不変性や回転同値性といった対称性を強制するために重み付きを用いる。しかし、物理法則が多くの対称性に従うという事実にもかかわらず、実世界の力学データは、ノイズや不完全データによる厳密な数学的対称性や、基礎となる力学系における対称性の破れの特徴にほとんど準拠しない。対称性の保存に偏りがあるが、厳密に制約されていない、概略同変ネットワークを探索する。等分散制約を緩和することにより、我々のモデルは対称性バイアスのないベースラインと、シミュレーションされた乱流領域と実世界のマルチストリームジェットフローの両方において過度に厳密な対称性を持つベースラインの両方より優れていることが分かる。 Incorporating symmetry as an inductive bias into neural network architecture has led to improvements in generalization, data efficiency, and physical consistency in dynamics modeling. Methods such as CNN or equivariant neural networks use weight tying to enforce symmetries such as shift invariance or rotational equivariance. However, despite the fact that physical laws obey many symmetries, real-world dynamical data rarely conforms to strict mathematical symmetry either due to noisy or incomplete data or to symmetry breaking features in the underlying dynamical system. We explore approximately equivariant networks which are biased towards preserving symmetry but are not strictly constrained to do so. By relaxing equivariance constraints, we find that our models can outperform both baselines with no symmetry bias and baselines with overly strict symmetry in both simulated turbulence domains and real-world multi-stream jet flow.	翻訳日:2022-01-31 15:41:28 公開日:2022-01-28
# グラフニューラルネットワークによる未知の物理系シミュレーションの学習 Learning to Simulate Unseen Physical Systems with Graph Neural Networks ( http://arxiv.org/abs/2201.11976v1 ) ライセンス: Link先を確認	Ce Yang, Weihao Gao, Di Wu, Chong Wang	(参考訳) 物理システムのダイナミクスのシミュレーションは、科学と工学の両方の発展に不可欠である。近年,ニューラルネットワークを用いた物理システムのダイナミクスをシミュレートする学習への関心が高まっている。しかし、既存のアプローチでは、粘度の異なる液体や弾性の異なるエラストマーなど、トレーニングセットにない物質に一般化することができない。本稿では,多種多様なシナリオにおいて異なる物質の物理力学を効率的にモデル化するために,物理量と物質パラメータを組み込んだ機械学習手法であるgraph-based physics engine(gpe)を提案する。 GPEはトレーニングセットにない異なる特性を持つ材料に一般化でき、シングルステップ予測からマルチステップロールアウトシミュレーションまでよく機能することを示した。さらに、モデルに運動量保存の法則を導入することで、学習の効率と安定性が大幅に向上し、トレーニングステップの少ないより良いモデルへの収束が可能になる。 Simulation of the dynamics of physical systems is essential to the development of both science and engineering. Recently there is an increasing interest in learning to simulate the dynamics of physical systems using neural networks. However, existing approaches fail to generalize to physical substances not in the training set, such as liquids with different viscosities or elastomers with different elasticities. Here we present a machine learning method embedded with physical priors and material parameters, which we term as "Graph-based Physics Engine" (GPE), to efficiently model the physical dynamics of different substances in a wide variety of scenarios. We demonstrate that GPE can generalize to materials with different properties not seen in the training set and perform well from single-step predictions to multi-step roll-out simulations. In addition, introducing the law of momentum conservation in the model significantly improves the efficiency and stability of learning, allowing convergence to better models with fewer training steps.	翻訳日:2022-01-31 15:41:09 公開日:2022-01-28
# グラフ上の拡張永続ホモロジーの神経近似 Neural Approximation of Extended Persistent Homology on Graphs ( http://arxiv.org/abs/2201.12032v1 ) ライセンス: Link先を確認	Zuoyu Yan, Tengfei Ma, Liangcai Gao, Zhi Tang, Yusu Wang, Chao Chen	(参考訳) 永続ホモロジーは、位相データ解析において広く用いられる理論である。グラフ学習の文脈では、永続的ホモロジーに基づくトポロジ的特徴は、既存のグラフニューラルネットワーク手法を拡張するために、潜在的に高次構造情報をキャプチャするために使われてきた。しかし、特に学習アプリケーションでは、この計算を何度も行わなければならないため、拡張された永続的ホモロジー要約は、大きくて密度の高いグラフでは遅いままである。近年のニューラルアルゴリズム推論の成功に触発されて,グラフ上の拡張永続化図を計算するための新しい学習法を提案する。提案するニューラルネットワークは,特定のアルゴリズムをシミュレートすることを目的として,新しいグラフに対する拡張永続化図の効率的な計算方法を学ぶ。拡張永続化図と下流グラフ表現学習タスクの近似実験により,本手法の有効性が示された。大規模かつ高密度なグラフでは、計算を100倍近く高速化する。 Persistent homology is a widely used theory in topological data analysis. In the context of graph learning, topological features based on persistent homology have been used to capture potentially high-order structural information so as to augment existing graph neural network methods. However, computing extended persistent homology summaries remains slow for large and dense graphs, especially since in learning applications one has to carry out this computation potentially many times. Inspired by recent success in neural algorithmic reasoning, we propose a novel learning method to compute extended persistence diagrams on graphs. The proposed neural network aims to simulate a specific algorithm and learns to compute extended persistence diagrams for new graphs efficiently. Experiments on approximating extended persistence diagrams and several downstream graph representation learning tasks demonstrate the effectiveness of our method. Our method is also efficient; on large and dense graphs, we accelerate the computation by nearly 100 times.	翻訳日:2022-01-31 15:40:54 公開日:2022-01-28
# 脳波を用いた感情分類のためのAsMapの自動特徴抽出 Automated Feature Extraction on AsMap for Emotion Classification using EEG ( http://arxiv.org/abs/2201.12055v1 ) ライセンス: Link先を確認	Md. Zaved Iqubal Ahmed (1), Nidul Sinha (2) and Souvik Phadikar (2) ((1) Department of Computer Science & Engineering, National Institute of Technology, Silchar, India, (2) Department of Electrical Engineering, National Institute of Technology, Silchar, India)	(参考訳) 脳波を用いた感情認識は感情コンピューティングに関連する課題に対処するために広く研究されている。脳波信号に対する手動特徴抽出法を用いることで,学習モデルによる準最適性能が得られる。自動機能エンジニアリングのためのツールとしてのディープラーニングの進歩に伴い、本研究では、手作業と自動機能抽出のハイブリッドが提案されている。異なる脳領域における非対称性は、脳波信号の微分エントロピー(de)特徴からasmapと呼ばれる2次元ベクトルに捕捉される。これらのasmapは、畳み込みニューラルネットワーク(cnn)モデルを使用して自動的に特徴を抽出するために使用される。提案手法は, RASM, DASM, DCAU などの DE に基づく特徴抽出手法と比較した。クラス数に基づく分類問題に対して,DEAPおよびSEEDデータセットを用いて実験を行った。その結果,提案手法はdeに基づく特徴抽出法よりも高い分類精度が得られることがわかった。 SEEDデータセットを用いた3クラス分類問題において,最高分類精度97.10%を達成した。さらに,本研究では,窓サイズが分類精度に与える影響についても検討した。 Emotion recognition using EEG has been widely studied to address the challenges associated with affective computing. Using manual feature extraction method on EEG signals result in sub-optimal performance by the learning models. With the advancements in deep learning as a tool for automated feature engineering, in this work a hybrid of manual and automatic feature extraction method has been proposed. The asymmetry in the different brain regions are captured in a 2-D vector, termed as AsMap from the differential entropy (DE) features of EEG signals. These AsMaps are then used to extract features automatically using Convolutional Neural Network (CNN) model. The proposed feature extraction method has been compared with DE and other DE-based feature extraction methods such as RASM, DASM and DCAU. Experiments are conducted using DEAP and SEED dataset on different classification problems based on number of classes. 
Results obtained indicate that the proposed feature extraction method yields higher classification accuracy, outperforming the DE-based feature extraction methods. The highest classification accuracy of 97.10% is achieved on the 3-class classification problem using the SEED dataset. Further, the impact of window size on classification accuracy has also been assessed in this work.	翻訳日:2022-01-31 15:37:53 公開日:2022-01-28
# 強化学習のためのマスクベース潜在性再構成 Mask-based Latent Reconstruction for Reinforcement Learning ( http://arxiv.org/abs/2201.12096v1 ) ライセンス: Link先を確認	Tao Yu, Zhizheng Zhang, Cuiling Lan, Zhibo Chen, Yan Lu	(参考訳) 画素からの深部強化学習(RL)では,高い性能を達成するために有効な状態表現の学習が不可欠である。しかし実際には、限られた経験と高次元入力が効果的な表現学習を妨げる。これを解決するために、他の研究分野におけるマスクモデリングの成功を動機として、RLにおける状態表現学習を促進するためにマスクベースの再構築を導入する。具体的には,空間的および時空間的にマスクされた画素を用いた観測から潜在空間の完全な状態表現を予測するための,単純かつ効果的な自己教師あり法であるマスクベース潜時再構成(mlr)を提案する。 MLRは、状態表現を学習する際の文脈情報のより良い利用を可能にし、それらをより情報的にし、RLエージェントの訓練を容易にする。総合的な実験により,MLRはRLの試料効率を大幅に向上し,複数の連続ベンチマーク環境において最先端の試料効率RL法より優れていた。 For deep reinforcement learning (RL) from pixels, learning effective state representations is crucial for achieving high performance. However, in practice, limited experience and high-dimensional input prevent effective representation learning. To address this, motivated by the success of masked modeling in other research fields, we introduce mask-based reconstruction to promote state representation learning in RL. Specifically, we propose a simple yet effective self-supervised method, Mask-based Latent Reconstruction (MLR), to predict the complete state representations in the latent space from the observations with spatially and temporally masked pixels. MLR enables the better use of context information when learning state representations to make them more informative, which facilitates RL agent training. Extensive experiments show that our MLR significantly improves the sample efficiency in RL and outperforms the state-of-the-art sample-efficient RL methods on multiple continuous benchmark environments.	翻訳日:2022-01-31 15:37:37 公開日:2022-01-28
# 定次元潜在空間をもつグラフオートエンコーダ Graph autoencoder with constant dimensional latent space ( http://arxiv.org/abs/2201.12165v1 ) ライセンス: Link先を確認	Adam Małkowski, Jakub Grzechociński, Paweł Wawrzyński	(参考訳) 大きいグラフの一定次元ベクトル(埋め込み)への可逆変換は依然として挑戦である。本稿では、再帰的ニューラルネットワーク(エンコーダとデコーダ)でそれに対処する。エンコーダネットワークは、サブグラフの埋め込みをより大きなサブグラフの埋め込みに変換し、最終的に入力グラフの埋め込みに変換する。デコーダは逆を行う。埋め込みの次元は(サブ)グラフのサイズに関係なく一定である。本稿では,提案するグラフオートエンコーダが何千もの頂点を持つグラフを処理できることをシミュレーション実験により確認する。 Invertible transformation of large graphs into constant dimensional vectors (embeddings) remains a challenge. In this paper we address it with recursive neural networks: an encoder and a decoder. The encoder network transforms embeddings of subgraphs into embeddings of larger subgraphs, and eventually into the embedding of the input graph. The decoder does the opposite. The dimension of the embeddings is constant regardless of the size of the (sub)graphs. Simulation experiments presented in this paper confirm that our proposed graph autoencoder can handle graphs with even thousands of vertices.	翻訳日:2022-01-31 15:37:21 公開日:2022-01-28
# 構成性を考慮したGraph2Seq学習 Compositionality-Aware Graph2Seq Learning ( http://arxiv.org/abs/2201.12178v1 ) ライセンス: Link先を確認	Takeshi D. Itoh and Takatomi Kubo and Kazushi Ikeda	(参考訳) グラフは非常に表現力のあるデータ構造であるが、複雑なグラフからパターンを見つけることはしばしば困難である。したがって、グラフから人間の解釈可能なシーケンスを生成することは、graph2seq学習と呼ばれ、関心を集めている。グラフにおける構成性は、多くのgraph2seqタスクの出力シーケンスにおける構成性に関連付けられることが期待される。したがって、構成性に配慮したGNNアーキテクチャを適用することで、モデルの性能が向上する。本研究では,複数レベルの情報局所性からグラフ表現を集約するマルチレベルアテンションプーリング(MLAP)アーキテクチャを採用する。実世界の例として、極端なソースコード要約タスクを取り上げ、モデルがそのソースコードからプログラム関数の名前を推定する。 MLAPアーキテクチャを持つモデルは,従来の最先端モデルよりも7倍以上少ないパラメータで性能を向上することを示した。 Graphs are a highly expressive data structure, but it is often difficult for humans to find patterns in a complex graph. Hence, generating human-interpretable sequences from graphs, called graph2seq learning, has gained interest. It is expected that the compositionality in a graph can be associated with the compositionality in the output sequence in many graph2seq tasks. Therefore, applying a compositionality-aware GNN architecture would improve the model performance. In this study, we adopt the multi-level attention pooling (MLAP) architecture, which can aggregate graph representations from multiple levels of information localities. As a real-world example, we take up the extreme source code summarization task, where a model estimates the name of a program function from its source code. We demonstrate that the model having the MLAP architecture outperforms the previous state-of-the-art model with more than seven times fewer parameters.	翻訳日:2022-01-31 15:37:12 公開日:2022-01-28
# ニューロカオス学習を用いた因果効果保存と分類 Cause-Effect Preservation and Classification using Neurochaos Learning ( http://arxiv.org/abs/2201.12181v1 ) ライセンス: Link先を確認	Harikrishnan N B, Aditi Kathpalia, Nithin Nagaraj	(参考訳) 観測データからの因果効果の発見は、科学と工学において重要だが困難な問題である。本研究では、シミュレーションデータから原因影響を分類するために、最近提案された脳誘発学習アルゴリズムであるNeurochaos Learning (NL) を用いる。使用されるデータインスタンスは、結合ARプロセス、結合1Dカオススキューテントマップ、結合1Dカオスロジスティックマップ、および現実世界の被食者-捕食者システムから生成される。提案手法は、$0.1$から$0.7$までの結合係数値に対して、5層ディープニューラルネットワークアーキテクチャを一貫して上回る。さらに,結合ARプロセスのためのGranger Causality(GC)と,結合カオスシステムと実世界の被食者-捕食者データセットのためのCompression-Complexity Causality(CCC)を用いて,NLの特徴抽出空間における因果関係の保存について検討した。 NLがカオス変換の下で因果関係を保ち、原因と結果の時系列の分類(転移学習シナリオを含む)をうまく行う能力は、因果的機械学習応用において非常に望ましい。 Discovering cause-effect from observational data is an important but challenging problem in science and engineering. In this work, a recently proposed brain-inspired learning algorithm, namely Neurochaos Learning (NL), is used for the classification of cause-effect from simulated data. The data instances used are generated from coupled AR processes, coupled 1D chaotic skew tent maps, coupled 1D chaotic logistic maps and a real-world prey-predator system. The proposed method consistently outperforms a five-layer Deep Neural Network architecture for coupling coefficient values ranging from $0.1$ to $0.7$. Further, we investigate the preservation of causality in the feature-extracted space of NL using Granger Causality (GC) for coupled AR processes and Compression-Complexity Causality (CCC) for coupled chaotic systems and the real-world prey-predator dataset. This ability of NL to preserve causality under a chaotic transformation and successfully classify cause and effect time series (including a transfer learning scenario) is highly desirable in causal machine learning applications.	翻訳日:2022-01-31 15:36:59 公開日:2022-01-28
# データからfuncta: データポイントは関数であり、それを関数として扱うべきです From data to functa: Your data point is a function and you should treat it like one ( http://arxiv.org/abs/2201.12204v1 ) ライセンス: Link先を確認	Emilien Dupont, Hyunjik Kim, S. M. Ali Eslami, Danilo Rezende, Dan Rosenbaum	(参考訳) ディープラーニングでは、例えばピクセルの2dグリッドのように、離散格子上の世界の測定を表すのが一般的である。しかし、これらの測定で表される基盤となる信号はしばしば連続的であり、例えば画像に表されるシーンなどである。次に、強力な連続的な代替手段として、暗黙の神経表現(入力空間位置の適切な測定値を出力するように訓練された神経関数)を用いてこれらの測定を表現することが挙げられる。この論文では、このアイデアを次のレベルに上げている。これらの関数をデータとして扱う代わりに、ディープラーニングを実行するのに何が必要か? この文脈では、データを functa と呼び、 functa の深層学習のためのフレームワークを提案する。この見解は、データからfunctaへの効率的な変換、functaのコンパクト表現、functaのダウンストリームタスクの効果的解決に関する多くの課題を示している。本稿では,これらの課題を克服するためのレシピを概説し,画像,3次元形状,ニューラル放射場(NeRF),多様体上のデータなど,幅広いデータモダリティに適用する。提案手法は,データモダリティ,特に生成モデル,データ計算,新しいビュー合成,分類の標準的タスクにおいて,様々な魅力的な特性を有することを示す。 It is common practice in deep learning to represent a measurement of the world on a discrete grid, e.g. a 2D grid of pixels. However, the underlying signal represented by these measurements is often continuous, e.g. the scene depicted in an image. A powerful continuous alternative is then to represent these measurements using an implicit neural representation, a neural function trained to output the appropriate measurement value for any input spatial location. In this paper, we take this idea to its next level: what would it take to perform deep learning on these functions instead, treating them as data? In this context we refer to the data as functa, and propose a framework for deep learning on functa. This view presents a number of challenges around efficient conversion from data to functa, compact representation of functa, and effectively solving downstream tasks on functa. We outline a recipe to overcome these challenges and apply it to a wide range of data modalities including images, 3D shapes, neural radiance fields (NeRF) and data on manifolds. 
We demonstrate that this approach has various compelling properties across data modalities, in particular on the canonical tasks of generative modeling, data imputation, novel view synthesis and classification.	翻訳日:2022-01-31 15:36:39 公開日:2022-01-28
# 神経の最適輸送 Neural Optimal Transport ( http://arxiv.org/abs/2201.12220v1 ) ライセンス: Link先を確認	Alexander Korotin, Daniil Selikhanovych, Evgeny Burnaev	(参考訳) 本稿では,最適輸送マップの計算を行うニューラルネットワークに基づく新しいアルゴリズムと,強い輸送コストと弱い輸送コストの計画を提案する。ニューラルネットワークの利用を正当化するために、確率分布間の輸送計画の普遍的な近似であることを示す。我々は,おもちゃの例や未完成画像から画像への翻訳作業において,最適な輸送アルゴリズムの性能を評価する。 We present a novel neural-networks-based algorithm to compute optimal transport maps and plans for strong and weak transport costs. To justify the usage of neural networks, we prove that they are universal approximators of transport plans between probability distributions. We evaluate the performance of our optimal transport algorithm on toy examples and on the unpaired image-to-image style translation task.	翻訳日:2022-01-31 15:36:16 公開日:2022-01-28
# (参考訳) エンタングルド・バイシミュレーションによる制御ポリシーにおける意味的類似性の効率的な埋め込み Efficient Embedding of Semantic Similarity in Control Policies via Entangled Bisimulation ( http://arxiv.org/abs/2201.12300v1 ) ライセンス: CC BY 4.0	Martin Bertran, Walter Talbott, Nitish Srivastava, Joshua Susskind	(参考訳) 視覚的な妨害要素の存在下で視覚入力から一般化可能なポリシーを学ぶことは、強化学習において難しい問題である。近年、この問題に対処する手段としてバイシミュレーション計量への関心が再び高まっている。これらの指標は、原則として、状態間の振る舞いの類似性を測定することによって、無関係な妨害要素に不変な表現を学習するために使用することができる。しかし、連続的な状態・行動のシナリオでは、これらの指標を正確に、偏りなく、スケーラブルに推定することは困難であることが分かっている。本稿では、状態間の距離関数を指定でき、連続的な状態空間と行動空間においてバイアスなしに推定できるバイシミュレーション計量である、エンタングルド・バイシミュレーションを提案する。データ拡張技術の上に追加した場合でも、エンタングルド・バイシミュレーションがDistracting Control Suite(DCS)において従来手法を有意に改善できることを示す。 Learning generalizable policies from visual input in the presence of visual distractions is a challenging problem in reinforcement learning. Recently, there has been renewed interest in bisimulation metrics as a tool to address this issue; these metrics can be used to learn representations that are, in principle, invariant to irrelevant distractions by measuring behavioural similarity between states. An accurate, unbiased, and scalable estimation of these metrics has proved elusive in continuous state and action scenarios. We propose entangled bisimulation, a bisimulation metric that allows the specification of the distance function between states, and can be estimated without bias in continuous state and action spaces. We show how entangled bisimulation can meaningfully improve over previous methods on the Distracting Control Suite (DCS), even when added on top of data augmentation techniques.	翻訳日:2022-01-31 15:34:47 公開日:2022-01-28
# 星時区分:部分ラベルデータを用いた系列分類 Star Temporal Classification: Sequence Classification with Partially Labeled Data ( http://arxiv.org/abs/2201.12208v1 ) ライセンス: Link先を確認	Vineel Pratap, Awni Hannun, Gabriel Synnaeve, Ronan Collobert	(参考訳) ラベル付きおよび未指定の逐次データから学習可能なアルゴリズムを開発した。コネクショニスト時相分類(ctc)のようなほとんどの逐次損失関数は、多くのラベルが欠落した時に崩壊する。この問題は、特別な星のトークンを使用して、トークンが欠落するたびに可能な全てのトークンを含むアライメントを可能にするStar Temporal Classification (STC)によって解決される。我々は、STCを重み付き有限状態トランスデューサ(WFST)の合成として表現し、GTN(WFSTによる自動微分のためのフレームワーク)を用いて勾配を計算する。我々は自動音声認識に関する広範囲な実験を行う。これらの実験により,STCは最大70%のラベルが欠落している場合に,教師付きベースラインの性能を回復できることがわかった。また,手書き認識の実験を行い,この手法が他のシーケンス分類タスクにも容易に適用できることを示す。 We develop an algorithm which can learn from partially labeled and unsegmented sequential data. Most sequential loss functions, such as Connectionist Temporal Classification (CTC), break down when many labels are missing. We address this problem with Star Temporal Classification (STC) which uses a special star token to allow alignments which include all possible tokens whenever a token could be missing. We express STC as the composition of weighted finite-state transducers (WFSTs) and use GTN (a framework for automatic differentiation with WFSTs) to compute gradients. We perform extensive experiments on automatic speech recognition. These experiments show that STC can recover most of the performance of supervised baseline when up to 70% of the labels are missing. We also perform experiments in handwriting recognition to show that our method easily applies to other sequence classification tasks.	翻訳日:2022-01-31 15:02:34 公開日:2022-01-28
# X線異物検出のための深層学習を可能にするトモグラフィーワークフロー A tomographic workflow to enable deep learning for X-ray based foreign object detection ( http://arxiv.org/abs/2201.12184v1 ) ライセンス: Link先を確認	Mathé T. Zeegers, Tristan van Leeuwen, Daniël M. Pelt, Sophia Bethany Coban, Robert van Liere, Kees Joost Batenburg	(参考訳) 製品内の望ましくない('foreign')物体の検出は、生産品質を維持するために多くの業界で一般的な手順である。 X線イメージングは、異物検出のための高速で非侵襲的で広く適用可能な方法である。深層学習は最近、X線画像のパターンを認識する強力なアプローチとして登場し、X線ベースの異物の自動検出を可能にしている。しかし、これらの方法には多数のトレーニング例が必要であり、手動アノテーションは主観的かつ手間のかかる作業である。本研究では,作業量を最小限に抑えながら,異物検出の教師あり学習のための訓練データを生成するためのCTに基づく手法を提案する。提案手法では,少数の代表的な物体をCTスキャンして3Dで再構成する。 CTスキャンデータの一部として取得されたラジオグラフは、機械学習方法の入力として機能する。異物の高品質な正解位置(ground truth)は、正確な3次元再構成とセグメンテーションによって得られる。これらのセグメンテーションボリュームを用いて、仮想投影を作成することで対応する2次元セグメンテーションが得られる。本稿では,従来のラジオグラフィアノテーションと比較して,客観的かつ再現的にトレーニングデータを生成する利点を概説する。また,CT再構成に使用する物体数によって精度がどう変わるかを示す。その結果, 産業環境での適切な検出性能を実現するためには, 比較的少数の代表的な物体(すなわち10未満)しか必要とされないことがわかった。さらに、実際の実験データでは、標準のラジオグラフアノテーションよりも異物検出精度が高いことが示されている。 Detection of unwanted ('foreign') objects within products is a common procedure in many branches of industry for maintaining production quality. X-ray imaging is a fast, non-invasive and widely applicable method for foreign object detection. Deep learning has recently emerged as a powerful approach for recognizing patterns in radiographs (i.e., X-ray images), enabling automated X-ray based foreign object detection. However, these methods require a large number of training examples and manual annotation of these examples is a subjective and laborious task. In this work, we propose a Computed Tomography (CT) based method for producing training data for supervised learning of foreign object detection, with minimal labour requirements. In our approach, a few representative objects are CT scanned and reconstructed in 3D. The radiographs that have been acquired as part of the CT-scan data serve as input for the machine learning method. 
High-quality ground truth locations of the foreign objects are obtained through accurate 3D reconstructions and segmentations. Using these segmented volumes, corresponding 2D segmentations are obtained by creating virtual projections. We outline the benefits of objectively and reproducibly generating training data in this way compared to conventional radiograph annotation. In addition, we show how the accuracy depends on the number of objects used for the CT reconstructions. The results show that in this workflow generally only a relatively small number of representative objects (i.e., fewer than 10) are needed to achieve adequate detection performance in an industrial setting. Moreover, for real experimental data we show that the workflow leads to higher foreign object detection accuracies than with standard radiograph annotation.	翻訳日:2022-01-31 15:01:59 公開日:2022-01-28
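The workflow above obtains 2D segmentations by creating virtual projections of a segmented 3D volume. A minimal sketch of that idea follows, under a strong simplifying assumption: rays are parallel to the first (depth) axis, so a detector pixel is labelled as foreign object if any voxel along its ray is. A real scanner geometry (for example cone-beam) would need a proper forward projector; the function name and data layout are hypothetical.

```python
def project_mask(volume):
    """Collapse a binary segmentation volume[z][y][x] into a 2D mask[y][x].

    Assumes parallel rays along z (an illustrative simplification):
    a pixel is 1 if at least one voxel on its ray is labelled 1.
    """
    depth = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    return [
        [1 if any(volume[z][y][x] for z in range(depth)) else 0
         for x in range(width)]
        for y in range(height)
    ]


# Tiny 2x2x2 volume with a single foreign-object voxel at z=0, y=1, x=1.
volume = [
    [[0, 0], [0, 1]],
    [[0, 0], [0, 0]],
]
mask = project_mask(volume)
```

Each such mask would be paired with the radiograph taken from the same direction to form one training example.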
# 球状CNNに対するMöbius Convolutions Möbius Convolutions for Spherical CNNs ( http://arxiv.org/abs/2201.12212v1 ) ライセンス: Link先を確認	Thomas W. Mitchel, Noam Aigerman, Vladimir G. Kim, Michael Kazhdan	(参考訳) Möbius変換は幾何学と球面画像処理の両方において重要な役割を果たす。それらは2次元曲面の共形自己同型群であり、ホモグラフィの球面における対応物である。ここでは、Möbius畳み込みと呼ぶ新しいMöbius同変球面畳み込み演算子を示し、それとともに、Möbius同変球面CNNの基礎を開発する。我々のアプローチは単純な観察に基づいている: 同変性を達成するためには、近傍のフレームから見た点の位置を変換する低次元部分群のみを考えればよい。大規模なMöbius畳み込みを効率的に計算するために、球面フィルタ上の変換の作用の近似を導出し、高速な球面調和変換を用いてスペクトル領域における畳み込みを計算する。得られたフレームワークはフレキシブルかつ記述的であり、形状分類と画像分割の両タスクにおいて有望な結果を達成し、その有用性を実証する。 Möbius transformations play an important role in both geometry and spherical image processing -- they are the group of conformal automorphisms of 2D surfaces and the spherical equivalent of homographies. Here we present a novel, Möbius-equivariant spherical convolution operator which we call Möbius convolution, and with it, develop the foundations for Möbius-equivariant spherical CNNs. Our approach is based on a simple observation: to achieve equivariance, we only need to consider the lower-dimensional subgroup which transforms the positions of points as seen in the frames of their neighbors. To efficiently compute Möbius convolutions at scale, we derive an approximation of the action of the transformations on spherical filters, allowing us to compute our convolutions in the spectral domain with the fast Spherical Harmonic Transform. The resulting framework is both flexible and descriptive, and we demonstrate its utility by achieving promising results in both shape classification and image segmentation tasks.	翻訳日:2022-01-31 15:01:37 公開日:2022-01-28
# スキップDQと無限時間ニューラルネットワーク(連続DQ)による暗黙と明示的深層学習の混合 Mixing Implicit and Explicit Deep Learning with Skip DEQs and Infinite Time Neural ODEs (Continuous DEQs) ( http://arxiv.org/abs/2201.12240v1 ) ライセンス: Link先を確認	Avik Pal, Alan Edelman, Christopher Rackauckas	(参考訳) Neural ODEsやDeep Equilibrium Models (DEQs)のような暗黙的なディープラーニングアーキテクチャは、そのソリューションプロセスの記述からレイヤの定義を分離する。暗黙の層は、新しいシナリオや入力に自動的に適応する深度などの特徴を許容するが、この適応性は計算コストの予測を困難にする。多くの著者は暗黙的層手法は明示的な層法よりも計算集約的であると指摘している。明示的なレイヤの計算コストを削減しつつ、暗黙的なレイヤの堅牢性を同時に達成する方法はあるのだろうか? そこで我々は,明示的予測と暗黙的補正を同時に学習する暗黙的拡張(imex)層であるskip deqを開発した。この明示的な層をトレーニングすることは自由であり、トレーニング時間を2.5倍、予測時間を3.4倍も短縮する。さらに、時間的逆伝播を必要とせず、標準の神経回路上でのトレーニングコストをパラドックス的に低減する無限時間神経回路の手法を再定義することにより、DECの「単純さ」をさらに増大させる。連続したスキップdeqアーキテクチャが、元のdeqよりも堅牢にトレーニングし、より高速なトレーニングと予測時間を実現する様子を実証する。この写本は、暗黙の深層学習と明示的な深層学習の二分法が両技法の利点を組み合わせていることを示すものである。 Implicit deep learning architectures, like Neural ODEs and Deep Equilibrium Models (DEQs), separate the definition of a layer from the description of its solution process. While implicit layers allow features such as depth to adapt to new scenarios and inputs automatically, this adaptivity makes its computational expense challenging to predict. Numerous authors have noted that implicit layer techniques can be more computationally intensive than explicit layer methods. In this manuscript, we address the question: is there a way to simultaneously achieve the robustness of implicit layers while allowing the reduced computational expense of an explicit layer? To solve this we develop Skip DEQ, an implicit-explicit (IMEX) layer that simultaneously trains an explicit prediction followed by an implicit correction. We show that training this explicit layer is free and even decreases the training time by 2.5x and prediction time by 3.4x. 
We then further increase the "implicitness" of the DEQ by redefining the method in terms of an infinite time neural ODE which paradoxically decreases the training cost over a standard neural ODE by not requiring backpropagation through time. We demonstrate how the resulting Continuous Skip DEQ architecture trains more robustly than the original DEQ while achieving faster training and prediction times. Together, this manuscript shows how bridging the dichotomy of implicit and explicit deep learning can combine the advantages of both techniques.	翻訳日:2022-01-31 15:01:15 公開日:2022-01-28
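The Skip DEQ idea described above, an explicit prediction followed by an implicit correction, can be illustrated on a scalar toy problem. In this sketch the implicit layer is a hand-picked contraction `f`, and `explicit_guess` is a hand-written stand-in for what the paper describes as a trained explicit layer; neither is taken from the paper, and the speedups quoted in the abstract are not reproduced here. The sketch only shows that a good initial guess reduces the number of fixed-point iterations.

```python
import math


def f(z, x):
    """Toy implicit layer (assumed): a contraction in z, so z* = f(z*, x) is unique."""
    return 0.5 * math.tanh(z) + x


def solve_fixed_point(x, z0, tol=1e-10, max_iter=1000):
    """Plain fixed-point iteration; returns the solution and the iteration count."""
    z = z0
    for k in range(1, max_iter + 1):
        z_next = f(z, x)
        if abs(z_next - z) < tol:
            return z_next, k
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


def explicit_guess(x):
    """Hypothetical explicit predictor; stands in for a learned network."""
    return x + 0.5 * math.tanh(x)


x = 1.0
z_plain, n_plain = solve_fixed_point(x, z0=0.0)               # DEQ-style: start from zero
z_skip, n_skip = solve_fixed_point(x, z0=explicit_guess(x))   # Skip-DEQ-style: warm start
```

Both calls reach the same fixed point; the warm-started solve needs fewer iterations because it begins closer to the solution.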
# 胎児超音波画像解析のための深層学習アルゴリズムの検討 A Review on Deep-Learning Algorithms for Fetal Ultrasound-Image Analysis ( http://arxiv.org/abs/2201.12260v1 ) ライセンス: Link先を確認	Maria Chiara Fiorentino and Francesca Pia Villani and Mariachiara Di Cosmo and Emanuele Frontoni and Sara Moccia	(参考訳) 深層学習(DL)アルゴリズムは超音波(US)胎児画像処理の標準となっている。この分野にはすでに多くの調査論文が存在しているが、そのほとんどは、胎児のdlアプリケーションをすべてカバーしていないか、医療画像分析の広い領域に焦点を当てている。本稿は,2017年以降に145の研究論文を出版し,この分野の最新研究を概観する。各論文は方法論とアプリケーションの観点から分析され、コメントされる。私たちは論文を分類した (i)胎児の標準航空機検出 (ii)解剖学的構造解析、及び (iii)バイオメトリパラメータ推定。各カテゴリに対して、主な制限とオープンイシューが提示される。概要表は、異なるアプローチの比較を容易にするために含まれます。アルゴリズムのパフォーマンスを評価するために一般的に使用される公開データセットとパフォーマンスメトリクスも要約されている。本稿では、胎児のUS画像解析におけるDLアルゴリズムの現状と、その研究方法論を実際の臨床実践に翻訳するために現場で働く研究者が取り組まなければならない課題について論じる。 Deep-learning (DL) algorithms are becoming the standard for processing ultrasound (US) fetal images. Despite a large number of survey papers already present in this field, most of them are focusing on a broader area of medical-image analysis or not covering all fetal US DL applications. This paper surveys the most recent work in the field, with a total of 145 research papers published after 2017. Each paper is analyzed and commented on from both the methodology and application perspective. We categorized the papers in (i) fetal standard-plane detection, (ii) anatomical-structure analysis, and (iii) biometry parameter estimation. For each category, main limitations and open issues are presented. Summary tables are included to facilitate the comparison among the different approaches. Publicly-available datasets and performance metrics commonly used to assess algorithm performance are summarized, too. This paper ends with a critical summary of the current state of the art on DL algorithms for fetal US image analysis and a discussion on current challenges that have to be tackled by researchers working in the field to translate the research methodology into the actual clinical practice.	翻訳日:2022-01-31 15:00:49 公開日:2022-01-28
# グローバルコンテキスト埋め込みによるtwitterストリームのエンティティ参照検出の促進 Boosting Entity Mention Detection for Targetted Twitter Streams with Global Contextual Embeddings ( http://arxiv.org/abs/2201.11885v1 ) ライセンス: Link先を確認	Satadisha Saha Bhowmick, Eduard C. Dragut and Weiyi Meng	(参考訳) Twitterのようなマイクロブログサイトは、ユビキタスな情報ソースとして登場した。マイクロブログにおける情報の自動抽出と分析に関する2つの重要なタスクは、エンティティ参照検出(emd)とエンティティ検出(ed)である。最先端のemdシステムは、オフラインの静的データセットでトレーニングすることで、マイクロブログテキストの非文字的性質をモデル化することを目的としている。彼らは、ノイズの多いテキストモデリングとエンティティ抽出のために個々のメッセージから、表面レベルの特徴(正書法、語彙、意味)の組み合わせを抽出する。しかし、マイクロブログストリームの絶え間なく進化する性質を考えると、短いメッセージのさまざまな限られたコンテキストから、すべてのエンティティへの言及を検出することは難しい問題である。そこで本稿では,マイクロブログストリーム上でのEMD学習者実行に適したフレームワークであるEMD Globalizerを提案する。既存のemdシステムによる分離されたマイクロブログメッセージの処理から逸脱し、メッセージの即時コンテキストからの学習知識がエンティティの提案に使用される。 EMDシステムによるエンティティ候補の最初の抽出の後、提案手法は発生源のマイニングを利用して、この最初の検出で見落とされた追加の候補言及を見つける。これらの言及の局所的な文脈表現を集約すると、ストリーム内のエンティティ候補の集団的コンテキストからグローバル埋め込みが引き出される。グローバルな埋め込みは、候補内のエンティティを偽陽性から分離するために使用される。ストリームからの当該エンティティに関するすべての言及は、フレームワークの最終出力で生成される。実験の結果、emd globalizerは、テストしたすべての既存のemdシステム(平均25.61%)の有効性を、計算オーバーヘッドを小さく向上できることがわかった。 Microblogging sites, like Twitter, have emerged as ubiquitous sources of information. Two important tasks related to the automatic extraction and analysis of information in Microblogs are Entity Mention Detection (EMD) and Entity Detection (ED). The state-of-the-art EMD systems aim to model the non-literary nature of microblog text by training upon offline static datasets. They extract a combination of surface-level features -- orthographic, lexical, and semantic -- from individual messages for noisy text modeling and entity extraction. But given the constantly evolving nature of microblog streams, detecting all entity mentions from such varying yet limited context of short messages remains a difficult problem. To this end, we propose a framework named EMD Globalizer, better suited for the execution of EMD learners on microblog streams. 
It deviates from the processing of isolated microblog messages by existing EMD systems, where learned knowledge from the immediate context of a message is used to suggest entities. After an initial extraction of entity candidates by an EMD system, the proposed framework leverages occurrence mining to find additional candidate mentions that are missed during this first detection. Aggregating the local contextual representations of these mentions, a global embedding is drawn from the collective context of an entity candidate within a stream. The global embeddings are then utilized to separate entities within the candidates from false positives. All mentions of said entities from the stream are produced in the framework's final outputs. Our experiments show that EMD Globalizer can enhance the effectiveness of all existing EMD systems that we tested (on average by 25.61%) with a small additional computational overhead.	翻訳日:2022-01-31 14:59:35 公開日:2022-01-28
# DeepSpeed と Megatron を用いた大規模生成言語モデル NLG 530B の訓練 Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model ( http://arxiv.org/abs/2201.11990v1 ) ライセンス: Link先を確認	Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zhang, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, and Bryan Catanzaro	(参考訳) 事前訓練された汎用言語モデルは、ゼロショット、少数ショット、微調整技術を用いて下流タスクに適応することで、様々な自然言語処理領域における最先端の精度を達成することができる。その成功により、これらのモデルのサイズは急速に増加し、そのような大規模モデルのトレーニングを可能にするために高性能なハードウェア、ソフトウェア、アルゴリズム技術が必要となった。 MicrosoftとNVIDIAの共同作業の結果、我々は最大のモノリシックトランスフォーマーベースの言語モデルであるMegatron-Turing NLG 530B(MT-NLG)のトレーニングの詳細を5300億のパラメータで提示した。本稿では,まず,このモデルをdeepspeedとmegatronを用いてトレーニングするための3次元並列化手法とともに,インフラストラクチャに焦点をあてる。次に、トレーニングプロセス、トレーニングコーパスの設計、データキュレーション技術について詳述する。最後に,MT-NLGによる様々な評価結果と,他の興味深い観測結果と新たな特性について考察する。 MT-NLGは、いくつかのNLPベンチマークにおいて、優れたゼロ、ワンショット、少数ショットの学習精度を実現し、新しい最先端結果を確立することを実証する。私たちの貢献は、大規模トレーニングインフラストラクチャ、大規模言語モデル、および自然言語世代の発展に役立ちます。 Pretrained general-purpose language models can achieve state-of-the-art accuracies in various natural language processing domains by adapting to downstream tasks via zero-shot, few-shot and fine-tuning techniques. Because of their success, the size of these models has increased rapidly, requiring high-performance hardware, software, and algorithmic techniques to enable training such large models. As the result of a joint effort between Microsoft and NVIDIA, we present details on the training of the largest monolithic transformer based language model, Megatron-Turing NLG 530B (MT-NLG), with 530 billion parameters. In this paper, we first focus on the infrastructure as well as the 3D parallelism methodology used to train this model using DeepSpeed and Megatron. 
Next, we detail the training process, the design of our training corpus, and our data curation techniques, which we believe are key ingredients in the success of the model. Finally, we discuss various evaluation results, as well as other interesting observations and new properties exhibited by MT-NLG. We demonstrate that MT-NLG achieves superior zero-, one-, and few-shot learning accuracies on several NLP benchmarks and establishes new state-of-the-art results. We believe that our contributions will help further the development of large-scale training infrastructures, large-scale language models, and natural language generation.	翻訳日:2022-01-31 14:59:07 公開日:2022-01-28
# PCL:教師なし文埋め込みのための多言語拡張によるピアコントラスト学習 PCL: Peer-Contrastive Learning with Diverse Augmentations for Unsupervised Sentence Embeddings ( http://arxiv.org/abs/2201.12093v1 ) ライセンス: Link先を確認	Qiyu Wu, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Daxin Jiang	(参考訳) 教師なしの方法で文を埋め込む学習は自然言語処理において基本である。最近の一般的な実践は、教師なしのコントラスト学習と事前訓練された言語モデルを組み合わせることである。それにもかかわらず、既存のアプローチは、通常単調な戦略に依存しているため、バイアスの増大に対する学習の近道が引き起こされ、したがって文埋め込みの品質が損なわれる。率直な解決策は、多元的戦略からより多様なポジティブスに頼ることであるが、オープンな疑問は、様々なポジティブスから教師なしで学ぶ方法でありながら、テキストフィールドの質を均等に増やすことである。 1つの答えとして,多種多様な拡張を伴うペアコントラスト学習(PCL)を提案する。 pclは、教師なし文埋め込みの群レベルでの多様な対比的正と負を構成する。 pclはピアポジティブなコントラストやピアネットワークの協調を行うことができ、独自のアンチバイアス能力と多様な拡張から学ぶ効果的な方法を提供する。 stsベンチマーク実験は,教師なし文埋め込みにおける競合相手に対するpclの有効性を検証する。 Learning sentence embeddings in an unsupervised manner is fundamental in natural language processing. Recent common practice is to couple pre-trained language models with unsupervised contrastive learning, whose success relies on augmenting a sentence with a semantically-close positive instance to construct contrastive pairs. Nonetheless, existing approaches usually depend on a mono-augmenting strategy, which causes learning shortcuts towards the augmenting biases and thus corrupts the quality of sentence embeddings. A straightforward solution is resorting to more diverse positives from a multi-augmenting strategy, while an open question remains about how to unsupervisedly learn from the diverse positives but with uneven augmenting qualities in the text field. As one answer, we propose a novel Peer-Contrastive Learning (PCL) with diverse augmentations. PCL constructs diverse contrastive positives and negatives at the group level for unsupervised sentence embeddings. PCL can perform peer-positive contrast as well as peer-network cooperation, which offers an inherent anti-bias ability and an effective way to learn from diverse augmentations. 
Experiments on STS benchmarks verify the effectiveness of our PCL against its competitors in unsupervised sentence embeddings.	翻訳日:2022-01-31 14:58:43 公開日:2022-01-28
# Protum: "[MASK]"に基づくプロンプトチューニングの新しい方法 Protum: A New Method For Prompt Tuning Based on "[MASK]" ( http://arxiv.org/abs/2201.12109v1 ) ライセンス: Link先を確認	Pan He and Yuxi Chen and Yan Wang and Yanru Zhang	(参考訳) 近年、プロンプトチューニング (Lester et al., 2021) は、事前学習言語モデル (PLM) のパラメータを凍結し、単語の表現のみに依存して下流タスクで顕著な性能を得る、NLPの新たなパラダイムとなりつつある。これは事前学習時のMasked Language Model (MLM) (Devlin et al., 2018) タスクとの一貫性を維持し、微調整中に発生しうる問題を回避する。モデルは文脈と組み合わせてマスクされたトークンを予測するため、当然、"[MASK]"トークンは他のトークンよりも有用な情報を持っていると考えられる。現在のプロンプトチューニング手法では,複数の単語を予測する場合に解答トークンがランダムに構成されるという深刻な問題があるため,バーバライザ(verbalizer)の助けを借りてトークンをラベルにマッピングする必要がある。そこで,本稿では,"[MASK]"トークンの隠れ層が保持する情報を通じて分類タスクを構築し,解答トークンではなくラベルを直接予測する,"[MASK]"に基づく新しいプロンプトチューニング手法Protumを提案する。同時に、"[MASK]"の下の異なる隠れ層が、多くの異なるデータセットで分類モデルにどのように影響するかを調査する。最後に、Protumは継続的な事前学習の後、より少ない時間消費で微調整よりもずっと優れた性能を達成できることがわかった。我々のモデルは,NLPにおける大規模モデルの実用化を促進する。 Recently, prompt tuning (Lester et al., 2021) has gradually become a new paradigm for NLP, which only depends on the representation of the words by freezing the parameters of pre-trained language models (PLMs) to obtain remarkable performance on downstream tasks. It maintains the consistency of the Masked Language Model (MLM) (Devlin et al., 2018) task in the process of pre-training, and avoids some issues that may happen during fine-tuning. Naturally, we consider that the "[MASK]" tokens carry more useful information than other tokens because the model combines with context to predict the masked tokens. Among the current prompt tuning methods, there is a serious problem of random composition of the answer tokens when predicting multiple words, so that they have to map tokens to labels with the help of a verbalizer. 
In response to the above issue, we propose in this paper a new Prompt Tuning based on "[MASK]" (Protum) method, which constructs a classification task through the information carried by the hidden layer of "[MASK]" tokens and then predicts the labels directly rather than the answer tokens. At the same time, we explore how different hidden layers under "[MASK]" affect our classification model on many different data sets. Finally, we find that Protum can achieve much better performance than fine-tuning after continuous pre-training, with less time consumption. Our model facilitates the practical application of large models in NLP.	翻訳日:2022-01-31 14:58:23 公開日:2022-01-28
# エンティティリソースの広範囲化に向けて:多言語に対するデータ効率なアプローチ Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages ( http://arxiv.org/abs/2201.12219v1 ) ライセンス: Link先を確認	Silvia Severini, Ayyoob Imani, Philipp Dufter, Hinrich Schütze	(参考訳) 並列コーパスは、MNE(multilingual named entity)リソース、すなわち複数の言語に翻訳された名前のデータセットを抽出するのに理想的である。並列コーパスからMNEデータセットを抽出する以前の研究では、大きなモノリンガルコーパスや単語アライナーのような、低リソース言語では利用できないか性能の低いリソースが必要だった。我々は、MNEリソースを作成する新しい手法であるCLC-BNを提案し、1000以上の言語からなるコーパスである並列聖書コーパスに適用する。 CLC-BNは、他のバイリンガルリソース、単語アライナー、シードデータを必要とせずに、並列コーパスの統計からニューラル翻字モデルを学習する。実験の結果,CLC-BNは従来手法より明らかに優れていた。我々は1340言語用のMNEリソースをリリースし、知識グラフ増強とバイリンガル語彙誘導という2つの下流タスクでその効果を示す。 Parallel corpora are ideal for extracting a multilingual named entity (MNE) resource, i.e., a dataset of names translated into multiple languages. Prior work on extracting MNE datasets from parallel corpora required resources such as large monolingual corpora or word aligners that are unavailable or perform poorly for underresourced languages. We present CLC-BN, a new method for creating an MNE resource, and apply it to the Parallel Bible Corpus, a corpus of more than 1000 languages. CLC-BN learns a neural transliteration model from parallel-corpus statistics, without requiring any other bilingual resources, word aligners, or seed data. Experimental results show that CLC-BN clearly outperforms prior work. We release an MNE resource for 1340 languages and demonstrate its effectiveness in two downstream tasks: knowledge graph augmentation and bilingual lexicon induction.	翻訳日:2022-01-31 14:57:51 公開日:2022-01-28
# (参考訳) REET:計算病理のロバスト性評価・強化ツールボックス REET: Robustness Evaluation and Enhancement Toolbox for Computational Pathology ( http://arxiv.org/abs/2201.12311v1 ) ライセンス: CC BY 4.0	Alex Foote, Amina Asif, Nasir Rajpoot and Fayyaz Minhas	(参考訳) モチベーション: デジタルスライドスキャナーによる病理学研究室のデジタル化と、客観的組織学的評価のための深層学習アプローチの進歩により、コンピュータ病理学(CPath)の分野で急速に進歩し、医学・薬学研究や臨床ワークフローにも幅広く応用されている。しかし、入力画像の変動に対するCPathモデルのロバスト性の推定は、これらのアプローチの下流での実用性、展開、受容性に大きな影響を及ぼす、オープンな問題である。さらに、このようなモデルの堅牢性を高めるためのドメイン特化戦略の開発も重要である。実装と可用性: 本研究では, 計算病理学応用のための最初のドメイン固有ロバスト性評価および拡張ツールボックス(reet)を提案する。ステンドリング、圧縮、フォーカス、ぼかし、空間分解能の変化、輝度の変化、幾何学的変化、ピクセルレベルの逆摂動といった特殊な画像変換に関して、予測モデルのロバスト性評価を可能にするアルゴリズム戦略のスイートを提供する。さらにreetは、計算病理学におけるディープラーニングパイプラインの効率的で堅牢なトレーニングも可能にする。 REETはPythonで実装されており、以下のURLで利用できる。連絡先: Fayyaz.minhas@warwick.ac.uk Motivation: Digitization of pathology laboratories through digital slide scanners and advances in deep learning approaches for objective histological assessment have resulted in rapid progress in the field of computational pathology (CPath) with wide-ranging applications in medical and pharmaceutical research as well as clinical workflows. However, the estimation of robustness of CPath models to variations in input images is an open problem with a significant impact on the down-stream practical applicability, deployment and acceptability of these approaches. Furthermore, development of domain-specific strategies for enhancement of robustness of such models is of prime importance as well. Implementation and Availability: In this work, we propose the first domain-specific Robustness Evaluation and Enhancement Toolbox (REET) for computational pathology applications. It provides a suite of algorithmic strategies for enabling robustness assessment of predictive models with respect to specialized image transformations such as staining, compression, focusing, blurring, changes in spatial resolution, brightness variations, geometric changes as well as pixel-level adversarial perturbations. 
Furthermore, REET also enables efficient and robust training of deep learning pipelines in computational pathology. REET is implemented in Python and is available at the following URL: https://github.com/alexjfoote/reetoolbox. Contact: Fayyaz.minhas@warwick.ac.uk	翻訳日:2022-01-31 14:56:20 公開日:2022-01-28
# O-ViT:直交型視覚変換器 O-ViT: Orthogonal Vision Transformer ( http://arxiv.org/abs/2201.12133v1 ) ライセンス: Link先を確認	Yanhong Fei, Yingjie Liu, Xian Wei, Mingsong Chen	(参考訳) ViT(Vision Transformer)は、自然言語処理における自己注意機構の素晴らしい成功に触発され、それを画像パッチシーケンスに創造的に適用し、素晴らしいパフォーマンスを実現する。しかし、ViTのスケールされたドット積自己アテンションは、元の特徴空間の構造にスケールの曖昧さをもたらす。この問題に対処するために、幾何学的視点からViTを最適化するOrthogonal Vision Transformer (O-ViT) という新しい手法を提案する。 O-ViT は自己アテンションブロックのパラメータをノルム維持直交多様体上に制限し、特徴空間の幾何学を維持できる。さらに、O-ViTは直交群とそのリー代数の間の全射写像を採用することで、直交制約と安価な最適化オーバーヘッドの両方を実現する。O-ViTの有効性を実証するために画像認識タスクの比較実験を行い、O-ViTがViTの性能を最大3.6%向上させることを示した。 Inspired by the tremendous success of the self-attention mechanism in natural language processing, the Vision Transformer (ViT) creatively applies it to image patch sequences and achieves incredible performance. However, the scaled dot-product self-attention of ViT brings about scale ambiguity to the structure of the original feature space. To address this problem, we propose a novel method named Orthogonal Vision Transformer (O-ViT), to optimize ViT from the geometric perspective. O-ViT limits parameters of self-attention blocks to be on the norm-keeping orthogonal manifold, which can keep the geometry of the feature space. Moreover, O-ViT achieves both orthogonal constraints and cheap optimization overhead by adopting a surjective mapping between the orthogonal group and its Lie algebra. We have conducted comparative experiments on image recognition tasks to demonstrate O-ViT's validity and experiments show that O-ViT can boost the performance of ViT by up to 3.6%.	翻訳日:2022-01-31 14:52:36 公開日:2022-01-28
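The O-ViT abstract mentions a surjective mapping between the orthogonal group and its Lie algebra, which lets unconstrained parameters produce an orthogonal weight matrix. One standard map of this kind is the matrix exponential of a skew-symmetric matrix, which covers the rotation part of the orthogonal group; whether O-ViT uses exactly this map is an assumption here, since the abstract does not say. The sketch below uses a truncated Taylor series in plain Python, which is adequate only for small matrices with moderate entries, and all function names are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def skew_symmetric(params, n):
    """Build A with A^T = -A from n*(n-1)/2 free parameters (a Lie algebra element)."""
    if len(params) != n * (n - 1) // 2:
        raise ValueError("expected n*(n-1)/2 parameters")
    a = [[0.0] * n for _ in range(n)]
    values = iter(params)
    for i in range(n):
        for j in range(i + 1, n):
            v = next(values)
            a[i][j] = v
            a[j][i] = -v
    return a


def expm_taylor(a, terms=40):
    """Truncated Taylor series for exp(A): I + A + A^2/2! + ..."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def orthogonal_from_params(params, n):
    """Map unconstrained parameters to an orthogonal matrix Q = exp(A)."""
    return expm_taylor(skew_symmetric(params, n))


q = orthogonal_from_params([0.3, -0.5, 0.8], 3)
```

Because the optimizer only ever updates the free parameters, the resulting matrix stays orthogonal without any projection step, which is one way to read the abstract's "cheap optimization overhead".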
# 知覚再構成を用いた教師なし単発深度推定 Unsupervised Single-shot Depth Estimation using Perceptual Reconstruction ( http://arxiv.org/abs/2201.12170v1 ) ライセンス: Link先を確認	Christoph Angermann, Matthias Schwab, Markus Haltmeier, Christian Laubichler and Steinbjörn Jónsson	(参考訳) 実物体深度の実時間推定は,3次元再構成,シーン理解,機械部品の状態評価など,様々な自律システムタスクの実行に不可欠なモジュールである。機械学習の過去10年間、コンピュータビジョンタスクへのディープラーニング手法の広範な展開は、単純なRGBモダリティから現実的な深度合成を実現するためのアプローチを生み出してきた。これらのモデルのほとんどは、対の深度データやビデオシーケンスやステレオ画像の可用性に基づいているが、完全な教師なし設定での単視点深度合成の手法はほとんど検討されていない。この研究は、生成ニューラルネットワークの分野における最新の進歩を示し、それらを活用して完全に教師なしの単発深度合成を行う。 RGB-to-depthとdepth-to-RGB転送用の2つのジェネレータを実装し,Wasserstein-1距離と新しい知覚再構成項を用いて同時に最適化した。提案手法が妥当であることを確認するため, 工業用表面深度データと, 体深を記録するテキサス3次元顔認識データベースとSURREALデータセットを用いて, モデルを総合的に評価した。この研究で得られた成功は、実世界のアプリケーションにおける教師なし単発深度推定の可能性を示唆している。 Real-time estimation of actual object depth is a module that is essential to performing various autonomous system tasks such as 3D reconstruction, scene understanding and condition assessment of machinery parts. During the last decade of machine learning, extensive deployment of deep learning methods to computer vision tasks has yielded approaches that succeed in achieving realistic depth synthesis out of a simple RGB modality. While most of these models are based on paired depth data or availability of video sequences and stereo images, methods for single-view depth synthesis in a fully unsupervised setting have hardly been explored. This study presents the most recent advances in the field of generative neural networks, leveraging them to perform fully unsupervised single-shot depth synthesis. Two generators for RGB-to-depth and depth-to-RGB transfer are implemented and simultaneously optimized using the Wasserstein-1 distance and a novel perceptual reconstruction term. To ensure that the proposed method is plausible, we comprehensively evaluate the models using industrial surface depth data as well as the Texas 3D Face Recognition Database and the SURREAL dataset that records body depth. 
The success observed in this study suggests the great potential for unsupervised single-shot depth estimation in real-world applications.	翻訳日:2022-01-31 14:52:15 公開日:2022-01-28
# ラベルが欠落した歴史的文書におけるテキスト行検出を改善するための自己ペース学習 Self-paced learning to improve text row detection in historical documents with missing lables ( http://arxiv.org/abs/2201.12216v1 ) ライセンス: Link先を確認	Mihaela Gaman, Lida Ghadamiyan, Radu Tudor Ionescu, Marius Popescu	(参考訳) 光学文字認識システムの重要な予備ステップは、テキスト行の検出である。ラベルが欠落した歴史的データという条件でこの課題に対処するために,行検出性能を向上させる自己ペース学習アルゴリズムを提案する。正解バウンディングボックスが多いページほどアノテーションが欠けている可能性は低いと推測する。この仮説に基づいて, 正解バウンディングボックスの数に関して訓練例を降順にソートし, それらをk個のバッチに整理する。自己ペース学習法を用いて,k回の反復にわたって行検出器を訓練し,正解アノテーションの少ないバッチを徐々に追加する。各反復において、正解バウンディングボックスと擬似バウンディングボックス(モデル自身によって予測されるバウンディングボックス)を非最大抑制を用いて組み合わせ、得られたアノテーションを次の訓練反復に含める。我々の自己ペース学習戦略は、2つの歴史的文書のデータセットで大きな性能向上をもたらし、YOLOv4の平均精度を一方のデータセットで12%以上、もう一方で39%向上させることを実証した。 An important preliminary step of optical character recognition systems is the detection of text rows. To address this task in the context of historical data with missing labels, we propose a self-paced learning algorithm capable of improving the row detection performance. We conjecture that pages with more ground-truth bounding boxes are less likely to have missing annotations. Based on this hypothesis, we sort the training examples in descending order with respect to the number of ground-truth bounding boxes, and organize them into k batches. Using our self-paced learning method, we train a row detector over k iterations, progressively adding batches with fewer ground-truth annotations. At each iteration, we combine the ground-truth bounding boxes with pseudo-bounding boxes (bounding boxes predicted by the model itself) using non-maximum suppression, and we include the resulting annotations at the next training iteration. We demonstrate that our self-paced learning strategy brings significant performance gains on two data sets of historical documents, improving the average precision of YOLOv4 by more than 12% on one data set and 39% on the other.	翻訳日:2022-01-31 14:51:54 公開日:2022-01-28
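The self-paced procedure above has two mechanical parts that are easy to sketch: ordering pages into k batches by their number of ground-truth boxes, and merging ground-truth boxes with the model's pseudo-boxes through non-maximum suppression. The sketch below is illustrative only. Giving ground-truth boxes a score of 1.0 so that they win over overlapping predictions, the IoU threshold of 0.5, the page dictionary layout, and all function names are assumptions not stated in the abstract; the detector training itself is omitted.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(scored_boxes, iou_threshold=0.5):
    """Greedy non-maximum suppression over (box, score) pairs."""
    kept = []
    for box, score in sorted(scored_boxes, key=lambda bs: bs[1], reverse=True):
        if all(iou(box, other) < iou_threshold for other, _ in kept):
            kept.append((box, score))
    return kept


def merge_annotations(gt_boxes, pseudo_boxes, iou_threshold=0.5):
    """Combine ground-truth boxes with (box, score) predictions via NMS.

    Assumption: ground-truth boxes get score 1.0, so they are kept in
    preference to overlapping pseudo-boxes.
    """
    scored = [(box, 1.0) for box in gt_boxes] + list(pseudo_boxes)
    return [box for box, _ in nms(scored, iou_threshold)]


def make_batches(pages, k):
    """Sort pages by ground-truth box count (descending) and split into up to k batches."""
    if not pages:
        return []
    ordered = sorted(pages, key=lambda p: len(p["gt"]), reverse=True)
    size = -(-len(ordered) // k)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


# One page: the first prediction duplicates the labelled row, the second is a new row.
gt = [(0, 0, 10, 2)]
pseudo = [((0, 0, 10, 2.2), 0.9), ((0, 5, 10, 7), 0.8)]
merged = merge_annotations(gt, pseudo)
```

In a full loop, iteration i would train on batches 1 through i, predict on the next batch, and call `merge_annotations` per page before adding that batch to the training set.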
# 指示的画像検索:ブラックボックス学習をグレーに変える Indicative Image Retrieval: Turning Blackbox Learning into Grey ( http://arxiv.org/abs/2201.11898v1 ) ライセンス: Link先を確認	Xulu Zhang (1), Zhenqun Yang (2), Hao Tian (1), Qing Li (3), Xiaoyong Wei (1 and 3) ((1) Sichuan University, (2) Chinese University of Hong Kong, (3) Hong Kong Polytechnic University)	(参考訳) ディープラーニングは、導入後すぐに画像検索のゲームチェンジャーとなった。画像検索の中核として(表現学習による)特徴抽出を推進し、関連性/マッチング評価は単純な類似度メトリクスへと縮退した。多くのアプリケーションでは、ランク付けされたリストを得るだけでなく、一致の証拠(例えば、医療画像中の標的タンパク質/細胞/病変の位置)を示す必要がある。これは、検索エンジンで一致した単語をハイライトする必要があるのと同様である。しかし、明示的な関連性/マッチングモデリングなしでは、これは実装が容易ではない。深層表現学習モデルは、ブラックボックスの性質のためこの目的には適さない。本稿では、指示的検索の設定のもとで、深層学習時代における関連性/マッチングモデリングの重要性を再考する。本研究は、表現学習を省略し、一致の証拠を直接モデル化することが可能であることを示す。事前学習されたモデルへの依存を取り除くことで、多くの関連する問題(例えば、分類と検索の間のドメインギャップ、畳み込みによる詳細拡散など)を回避している。さらに重要なことに、この研究は、マッチングを明示的にモデル化し、後でバックトラックして一致の証拠の指示を生成できることを示している。これにより深層推論の説明可能性を向上させることができる。本手法はOxford-5kとParis-6kの両方で文献中最高の性能を示し、深層特徴を一切抽出することなく、Oxford-5kで97.77%(Paris-6kで97.81%)の新記録を達成した。 Deep learning became a game changer for image retrieval soon after it was introduced. It promotes feature extraction (by representation learning) as the core of image retrieval, with relevance/matching evaluation degenerating into simple similarity metrics. In many applications, we need the matching evidence to be indicated rather than just a ranked list (e.g., the locations of the target proteins/cells/lesions in medical images), much as matched words need to be highlighted in search engines. However, this is not easy to implement without explicit relevance/matching modeling. Deep representation learning models are not feasible for this because of their blackbox nature. In this paper, we revisit the importance of relevance/matching modeling in the deep learning era with an indicative retrieval setting. The study shows that it is possible to skip the representation learning and model the matching evidence directly.
By removing the dependency on pre-trained models, it avoids many related issues (e.g., the domain gap between classification and retrieval, the detail-diffusion caused by convolution, and so on). More importantly, the study demonstrates that matching can be explicitly modeled and later backtracked to generate indications of the matching evidence, which can improve the explainability of deep inference. Our method obtains the best performance reported in the literature on both Oxford-5k and Paris-6k, setting a new record of 97.77% on Oxford-5k (97.81% on Paris-6k) without extracting any deep features.	翻訳日:2022-01-31 14:51:33 公開日:2022-01-28
# 感情応答検出のためのCAREデータセット The CARE Dataset for Affective Response Detection ( http://arxiv.org/abs/2201.11895v1 ) ライセンス: Link先を確認	Jane A. Yu and Alon Y. Halevy	(参考訳) ソーシャルメディアは、友人や家族とのコミュニケーション、情報とエンターテイメントの消費において、ますます大きな役割を果たしている。したがって、ソーシャルメディア上で投稿の効果的なランキング関数を設計するには、投稿に対する感情的な反応を予測するのが有用である(例えば、ユーザーがユーモア、インスピレーション、怒り、インフォメーションを受けやすいかどうかなど)。感情認識の研究(記事の出版者の影響に焦点を当てている)と同様に、感情的反応を認識する伝統的なアプローチは、トレーニングデータの人間のアノテーションに投資するコストがかかる。そこで我々は,CARE(Common Affective Response Expression, CARE)法を用いて,7つの感情反応に基づいて注釈付き230kのソーシャルメディア投稿のデータセットであるCARE$_{db}$を紹介した。 CARE法は、投稿に反応して投稿されたコメントに存在している信号を活用する手段であり、人間のアノテーションなしで投稿に対する読者の感情反応に関する高精度な証拠を提供する。ヒューマンアノテーションとは異なり、ここで記述したアノテーションプロセスは、特に新しい感情的な反応のために、メソッドのカバレッジを拡大するために繰り返します。本稿では,CAREアノテーションがクラウドソースアノテーションと良好に比較できることを示す実験について述べる。最後に、CARE$_{db}$を使用して、競合するBERTベースのモデルをトレーニングし、感情検出だけでなく、関連するタスクに対するデータセットの有用性を実証する。 Social media plays an increasing role in our communication with friends and family, and our consumption of information and entertainment. Hence, to design effective ranking functions for posts on social media, it would be useful to predict the affective response to a post (e.g., whether the user is likely to be humored, inspired, angered, informed). Similar to work on emotion recognition (which focuses on the affect of the publisher of the post), the traditional approach to recognizing affective response would involve an expensive investment in human annotation of training data. We introduce CARE$_{db}$, a dataset of 230k social media posts annotated according to 7 affective responses using the Common Affective Response Expression (CARE) method. The CARE method is a means of leveraging the signal that is present in comments that are posted in response to a post, providing high-precision evidence about the affective response of the readers to the post without human annotation. 
Unlike human annotation, the annotation process we describe here can be iterated upon to expand the coverage of the method, particularly for new affective responses. We present experiments that demonstrate that the CARE annotations compare favorably with crowd-sourced annotations. Finally, we use CARE$_{db}$ to train competitive BERT-based models for predicting affective response as well as emotion detection, demonstrating the utility of the dataset for related tasks.	翻訳日:2022-01-31 14:51:10 公開日:2022-01-28
# NLPのためのセキュアで効率的なフェデレート学習フレームワーク A Secure and Efficient Federated Learning Framework for NLP ( http://arxiv.org/abs/2201.11934v1 ) ライセンス: Link先を確認	Jieren Deng, Chenghong Wang, Xianrui Meng, Yijue Wang, Ji Li, Sheng Lin, Shuo Han, Fei Miao, Sanguthevar Rajasekaran, Caiwen Ding	(参考訳) 本稿では,安全かつ効率的な連合学習(fl)フレームワークの設計について考察する。既存のソリューションは、信頼できるアグリゲータを含むか、重厚な暗号プリミティブを必要とする。さらに、既存のセキュアなFL設計の多くは、トレーニングプロトコルからクライアントを排除できないという制限的な仮定の下でのみ機能します。これらの問題に対処するために,(1)信頼エンティティの必要性をなくすセキュアで効率的なFLフレームワークSEFLを提案し,(2)既存のFL設計と類似したモデル精度を達成し,(3)クライアントのドロップアウトに対して耐性を持つ。自然言語処理(NLP)タスクに関する広範な実験的研究を通じて,SEFLが既存のFLソリューションと同等の精度を実現し,提案手法により実行時の性能を最大13.7倍に向上させることができることを示した。 In this work, we consider the problem of designing secure and efficient federated learning (FL) frameworks. Existing solutions either involve a trusted aggregator or require heavyweight cryptographic primitives, which degrades performance significantly. Moreover, many existing secure FL designs work only under the restrictive assumption that none of the clients can be dropped out from the training protocol. To tackle these problems, we propose SEFL, a secure and efficient FL framework that (1) eliminates the need for the trusted entities; (2) achieves similar and even better model accuracy compared with existing FL designs; (3) is resilient to client dropouts. Through extensive experimental studies on natural language processing (NLP) tasks, we demonstrate that the SEFL achieves comparable accuracy compared to existing FL solutions, and the proposed pruning technique can improve runtime performance up to 13.7x.	翻訳日:2022-01-31 14:50:45 公開日:2022-01-28
# 音声言語理解におけるセット予測のためのエンドツーエンドモデルの改善 Improving End-to-End Models for Set Prediction in Spoken Language Understanding ( http://arxiv.org/abs/2201.12105v1 ) ライセンス: Link先を確認	Hong-Kwang J. Kuo, Zoltan Tuske, Samuel Thomas, Brian Kingsbury, George Saon	(参考訳) 音声言語理解システム(SLU)の目標は,入力音声信号の意味を決定することである。エンド・ツー・エンド(E2E)音声モデリングの進歩により、動詞の転写よりもはるかに安価に収集できるセマンティック・エンティティのみを訓練できるようになった。我々は、エンティティの順序が未定であるこのセットの予測問題に焦点を当てる。 RNNトランスデューサとアテンションベースエンコーダ-デコーダの2種類のE2Eモデルを用いて,トレーニングエンティティシーケンスを音声順に並べた場合,これらのモデルが最もよく動作することを示す。エンティティ音声の順序が不明な場合、E2E SLUモデルを改善するために、暗黙の注意に基づくアライメント手法とともに、新しいデータ拡張手法を提案する。 F1スコアは、RNN-Tで11%以上増加し、アテンションベースのエンコーダデコーダSLUモデルで約2%増加した。 The goal of spoken language understanding (SLU) systems is to determine the meaning of the input speech signal, unlike speech recognition which aims to produce verbatim transcripts. Advances in end-to-end (E2E) speech modeling have made it possible to train solely on semantic entities, which are far cheaper to collect than verbatim transcripts. We focus on this set prediction problem, where entity order is unspecified. Using two classes of E2E models, RNN transducers and attention based encoder-decoders, we show that these models work best when the training entity sequence is arranged in spoken order. To improve E2E SLU models when entity spoken order is unknown, we propose a novel data augmentation technique along with an implicit attention based alignment method to infer the spoken order. F1 scores significantly increased by more than 11% for RNN-T and about 2% for attention based encoder-decoder SLU models, outperforming previously reported results.	翻訳日:2022-01-31 14:50:30 公開日:2022-01-28
# 安全強化学習のための制約付き変分政策最適化 Constrained Variational Policy Optimization for Safe Reinforcement Learning ( http://arxiv.org/abs/2201.11927v1 ) ライセンス: Link先を確認	Zuxin Liu, Zhepeng Cen, Vladislav Isenbaev, Wei Liu, Zhiwei Steven Wu, Bo Li, Ding Zhao	(参考訳) 安全強化学習(RL)は、安全クリティカルなアプリケーションにデプロイする前に、一定の制約を満たすポリシーを学ぶことを目的としている。一般的な制約付き最適化フレームワークであるprimal-dualは不安定な問題に苦しんでおり、最適性の保証が欠けている。本稿では,新しい確率的推論の観点から問題を克服し,安全政策を学習するための期待最大化方式を提案する。安全なRL問題は分解可能であることを示す。 1)非パラメトリック変分分布をもつ凸最適化位相と 2)教師付き学習段階。最適性と政策改善の安定性を証明し,制約付き変動政策最適化の独特な利点を示す。連続ロボットタスクに関する幅広い実験により,本手法は本手法よりも制約満足度とサンプル効率の点で有意に優れた性能が得られることが示された。 Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying to safety-critical applications. Primal-dual as a prevalent constrained optimization framework suffers from instability issues and lacks optimality guarantees. This paper overcomes the issues from a novel probabilistic inference perspective and proposes an Expectation-Maximization style approach to learn safe policy. We show that the safe RL problem can be decomposed to 1) a convex optimization phase with a non-parametric variational distribution and 2) a supervised learning phase. We show the unique advantages of constrained variational policy optimization by proving its optimality and policy improvement stability. A wide range of experiments on continuous robotic tasks show that the proposed method achieves significantly better performance in terms of constraint satisfaction and sample efficiency than primal-dual baselines.	翻訳日:2022-01-31 14:49:14 公開日:2022-01-28
# FCMNet: マルチエージェントシステムにおけるチームレベル協調のための完全通信メモリネット FCMNet: Full Communication Memory Net for Team-Level Cooperation in Multi-Agent Systems ( http://arxiv.org/abs/2201.11994v1 ) ライセンス: Link先を確認	Yutong Wang and Guillaume Sartoretti	(参考訳) 部分観測可能なマルチエージェントシステムにおける分散協調は、エージェント間の効果的な通信を必要とする。この取り組みを支援するため、本研究は、グローバル通信が利用可能だが信頼性に欠ける可能性があり、そのため微分可能な通信学習手法が使えない問題のクラスに焦点を当てている。我々は、エージェントが a) 効果的なマルチホップ通信プロトコルと b) チームレベルの意思決定を可能にする共通の分散型方策を同時に学習できる、強化学習ベースのアプローチであるFCMNetを導入する。具体的には、エージェント間の通信メッセージとして、複数の方向性リカレントニューラルネットワークの隠れ状態を利用する。単純なマルチホップトポロジーを用いて、各エージェントに、他のすべてのエージェントが逐次的にエンコードした情報を各時間ステップで受信する能力を与え、グローバルな協調性を改善する。共有報酬を伴う挑戦的なStarCraft IIマイクロマネジメントタスク群と、個別報酬を伴う協調的なマルチエージェント経路探索タスクでFCMNetを実証する。比較の結果、FCMNetはすべてのStarCraft IIマイクロマネジメントタスクにおいて最先端の通信ベース強化学習手法を上回り、特定のタスクにおいては価値分解手法も上回ることを示す。さらに、ランダムなメッセージ損失や2値化メッセージ(すなわち微分不可能な通信チャネル)といった現実的な通信障害下でのFCMNetのロバスト性を調査し、様々な実環境条件下でのロボットタスクへのFCMNetの適用可能性を示す。 Decentralized cooperation in partially-observable multi-agent systems requires effective communications among agents. To support this effort, this work focuses on the class of problems where global communications are available but may be unreliable, thus precluding differentiable communication learning methods. We introduce FCMNet, a reinforcement learning based approach that allows agents to simultaneously learn a) an effective multi-hop communications protocol and b) a common, decentralized policy that enables team-level decision-making. Specifically, our proposed method utilizes the hidden states of multiple directional recurrent neural networks as communication messages among agents. Using a simple multi-hop topology, we endow each agent with the ability to receive information sequentially encoded by every other agent at each time step, leading to improved global cooperation.
We demonstrate FCMNet on a challenging set of StarCraft II micromanagement tasks with shared rewards, as well as a collaborative multi-agent pathfinding task with individual rewards. Our comparison results show that FCMNet outperforms state-of-the-art communication-based reinforcement learning methods in all StarCraft II micromanagement tasks, and value decomposition methods in certain tasks. We further investigate the robustness of FCMNet under realistic communication disturbances, such as random message loss or binarized messages (i.e., non-differentiable communication channels), to showcase FCMNet's potential applicability to robotic tasks under a variety of real-world conditions.	翻訳日:2022-01-31 14:49:04 公開日:2022-01-28
# バイオインスパイアされたCortexベースの高速コードブック生成 Bioinspired Cortex-based Fast Codebook Generation ( http://arxiv.org/abs/2201.12322v1 ) ライセンス: Link先を確認	Meric Yucel, Serdar Bagis, Ahmet Sertbas, Mehmet Sarikaya, Burak Berk Ustundag	(参考訳) 人工知能の主な原型は、一般化性能を高めながら時間効率と正確性を促進するアルゴリズムの開発である。機械学習の最近の発展にもかかわらず、重要な制限は初期データから非効率な特徴抽出であり、これは性能最適化に不可欠である。本稿では,脳内の知覚皮質ネットワークに触発された特徴抽出手法を提案する。バイオインスパイアされた皮質と呼ばれるこのアルゴリズムは、圧縮された形式でデータを処理しながら、優れた計算効率でストリーミング信号からの直交的特徴への収束を提供する。本稿では,Birch,GMM,K-meansなどの一般的なクラスタリングアルゴリズムと比較し,人工的な複雑なデータを用いた新しいアルゴリズムの性能を示す。データ処理時間は大幅に短縮されるが、数秒対時間では符号化歪みは、より一般化の基盤となる新しいアルゴリズムで本質的に同じである。ここでは、クラスタリングとベクトル量子化における大脳皮質モデルの優れた性能を示すが、推論、異常検出、大範囲アプリケーションでの分類、例えば金融、サイバーセキュリティ、医療といった機械学習の基本コンポーネントに強力な実装機会を提供する。 A major archetype of artificial intelligence is developing algorithms facilitating temporal efficiency and accuracy while boosting the generalization performance. Even with the latest developments in machine learning, a key limitation has been the inefficient feature extraction from the initial data, which is essential in performance optimization. Here, we introduce a feature extraction method inspired by sensory cortical networks in the brain. Dubbed as bioinspired cortex, the algorithm provides convergence to orthogonal features from streaming signals with superior computational efficiency while processing data in compressed form. We demonstrate the performance of the new algorithm using artificially created complex data by comparing it with the commonly used traditional clustering algorithms, such as Birch, GMM, and K-means. While the data processing time is significantly reduced, seconds versus hours, encoding distortions remain essentially the same in the new algorithm providing a basis for better generalization. 
Although we show herein the superior performance of the cortex model in clustering and vector quantization, it also provides potent implementation opportunities for machine learning fundamental components, such as reasoning, anomaly detection and classification in large scope applications, e.g., finance, cybersecurity, and healthcare.	翻訳日:2022-01-31 14:48:36 公開日:2022-01-28
# 連続行動空間における方策ミラー上昇法の隠れバイアスについて On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces ( http://arxiv.org/abs/2201.12332v1 ) ライセンス: Link先を確認	Amrit Singh Bedi, Souradip Chakraborty, Anjaly Parayil, Brian Sadler, Pratap Tokekar, Alec Koppel	(参考訳) 連続行動空間上での強化学習のためのパラメータ化方策探索に着目する。典型的には、方策に付随するスコア関数が有界であると仮定するが、これはガウス方策でさえ成り立たない。この問題に適切に対処するには、スコア関数が有界となる領域を定量化する探索許容パラメータを導入する必要がある。そうすると、期待される方策勾配ノルムの減衰率に永続的なバイアスが現れ、これは行動空間の半径に反比例する。この隠れたバイアスを軽減するために、有界なスコア関数を持つヘビーテールの方策パラメータ化が用いられるが、アルゴリズム更新の不安定性を引き起こす可能性がある。そこで本研究では、ヘビーテールのパラメータ化下での方策勾配アルゴリズムの収束について検討し、ミラーアセント型更新と勾配追跡を組み合わせることで安定化する手法を提案する。我々の理論的な主な貢献は、このスキームが一定のステップサイズとバッチサイズで収束することを示した点であり、一方、以前の研究ではこれらのパラメータをそれぞれゼロに縮小するか無限大に成長させる必要があった。実験的に、ヘビーテールの方策パラメータ化の下でこのスキームは、標準ベンチマークと比べて様々な設定で報酬の蓄積が改善される。 We focus on parameterized policy search for reinforcement learning over continuous action spaces. Typically, one assumes the score function associated with a policy is bounded, which fails to hold even for Gaussian policies. To properly address this issue, one must introduce an exploration tolerance parameter to quantify the region in which it is bounded. Doing so incurs a persistent bias that appears in the attenuation rate of the expected policy gradient norm, which is inversely proportional to the radius of the action space. To mitigate this hidden bias, heavy-tailed policy parameterizations may be used, which exhibit a bounded score function, but doing so can cause instability in algorithmic updates. To address these issues, in this work, we study the convergence of policy gradient algorithms under heavy-tailed parameterizations, which we propose to stabilize with a combination of mirror ascent-type updates and gradient tracking. Our main theoretical contribution is the establishment that this scheme converges with constant step and batch sizes, whereas prior works require these parameters to respectively shrink to null or grow to infinity.
Experimentally, this scheme under a heavy-tailed policy parameterization yields improved reward accumulation across a variety of settings as compared with standard benchmarks.	翻訳日:2022-01-31 14:48:17 公開日:2022-01-28
# テンソル分解による一貫した協調フィルタリング Consistent Collaborative Filtering via Tensor Decomposition ( http://arxiv.org/abs/2201.11936v1 ) ライセンス: Link先を確認	Shiwen Zhao, Charles Crissman, Guillermo R Sapiro	(参考訳) コラボレーティブフィルタリングは、ユーザのアクティビティを分析し、アイテムのレコメンデーションシステムを構築するためのデファクトスタンダードである。本研究では,暗黙的フィードバックに基づく協調フィルタリングの新しいモデルであるsliced anti-symmetric decomposition (sad)を開発した。ユーザ(ユーザベクター)とアイテム(テムベクター)の潜伏表現を推定する従来の手法とは対照的に、SADはユーザ-テムインタラクションの3方向テンソルビューを使用して、各項目に1つの潜伏ベクトルを導入する。この新たなベクターは、標準ドット製品によって計算されたユーザ-項目の嗜好を一般的な内部製品に拡張し、相対的な嗜好を評価する際にアイテム間の相互作用を生成する。 SADはベクトルが1に崩壊すると、最先端(SOTA)協調フィルタリングモデルに還元されるが、本論文では、その値をデータから推定する。提案したSADモデルは単純で,グループ確率勾配降下法(SGD)アルゴリズムが有効である。我々は,100万以上のユーザ・イテムインタラクションを含むシミュレーションおよび実世界のデータセットにおいて,SADの効率を実証する。 SADを7種類のSOTA協調フィルタリングモデルと比較することにより、SADはパーソナライズされた好みをより一貫して推定できることを示す。 Collaborative filtering is the de facto standard for analyzing users' activities and building recommendation systems for items. In this work we develop Sliced Anti-symmetric Decomposition (SAD), a new model for collaborative filtering based on implicit feedback. In contrast to traditional techniques where a latent representation of users (user vectors) and items (item vectors) are estimated, SAD introduces one additional latent vector to each item, using a novel three-way tensor view of user-item interactions. This new vector extends user-item preferences calculated by standard dot products to general inner products, producing interactions between items when evaluating their relative preferences. SAD reduces to state-of-the-art (SOTA) collaborative filtering models when the vector collapses to one, while in this paper we allow its value to be estimated from data. The proposed SAD model is simple, resulting in an efficient group stochastic gradient descent (SGD) algorithm. We demonstrate the efficiency of SAD in both simulated and real world datasets containing over 1M user-item interactions. 
By comparing SAD with seven alternative SOTA collaborative filtering models, we show that SAD is able to more consistently estimate personalized preferences.	翻訳日:2022-01-31 14:47:57 公開日:2022-01-28
# BCDAG:ガウスDAGのベイズ構造と因果学習のためのRパッケージ BCDAG: An R package for Bayesian structure and Causal learning of Gaussian DAGs ( http://arxiv.org/abs/2201.12003v1 ) ライセンス: Link先を確認	Federico Castelletti and Alessandro Mascaro	(参考訳) 有向非巡回グラフ(英語版)(dags)は多変量設定における変数間の因果関係をモデル化するための強力な枠組みを提供する。この設定では、データからDAG構造を推定する過程を因果構造学習または因果構造発見と呼ぶ。ガウス観測データからベイジアン因果発見と因果効果推定のためのRパッケージであるBCDAGを導入し,カステレッティ&マスカロ (2021) が提案したマルコフ連鎖モンテカルロ (MCMC) 方式を実装した。我々の実装は、観測回数と、DAGが十分にスパースであるたびに、データセット内の変数の数で効率よくスケールする。また、収束診断や後部推論の可視化及び要約のための機能も提供する。本稿では,BCDAGの実装とともに,基礎となる方法論の重要な特徴について述べる。次に,実データとシミュレーションデータの両方において,主な機能とアルゴリズムを説明する。 Directed Acyclic Graphs (DAGs) provide a powerful framework to model causal relationships among variables in multivariate settings; in addition, through the do-calculus theory, they allow for the identification and estimation of causal effects between variables also from pure observational data. In this setting, the process of inferring the DAG structure from the data is referred to as causal structure learning or causal discovery. We introduce BCDAG, an R package for Bayesian causal discovery and causal effect estimation from Gaussian observational data, implementing the Markov chain Monte Carlo (MCMC) scheme proposed by Castelletti & Mascaro (2021). Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, with the number of variables in the dataset. The package also provides functions for convergence diagnostics and for visualizing and summarizing posterior inference. In this paper, we present the key features of the underlying methodology along with its implementation in BCDAG. We then illustrate the main functions and algorithms on both real and simulated datasets.	翻訳日:2022-01-31 14:47:33 公開日:2022-01-28
# コンフォメーション予測による専門家予測の改善 Provably Improving Expert Predictions with Conformal Prediction ( http://arxiv.org/abs/2201.12006v1 ) ライセンス: Link先を確認	Eleni Straitouri and Lequng Wang and Nastaran Okati and Manuel Gomez Rodriguez	(参考訳) 自動意思決定支援システムは、人間の専門家がより効率的に正確にタスクを解決できるようにする。しかし、既存のシステムは一般に専門家に、いつエージェンシーをシステムに割譲するか、いつ独自のエージェンシーを行使するかを理解する必要がある。さらに、専門家がシステムに対する誤った信頼を育むと、パフォーマンスが悪化する可能性がある。この作業では、上記の要件を引き上げ、設計上、専門家がいつパフォーマンスを確実に向上させるかを理解する必要のない自動意思決定支援システムを開発する。この目的のために,マルチクラス分類タスクに着目し,各データサンプルに対してラベルのサブセットを人間エキスパートに推薦するために分類器を使用する自動決定支援システムを検討する。まず,そのようなシステムの設計を共形予測の観点から見ることにより,ラベルの推奨部分集合が真のラベルを含む確率が,ほぼ正確にターゲット確率値に一致することを確かめる。そこで,提案するサブセット内のラベルの予測が極めて良好であるターゲット確率値のセットを特定し,最適に近いターゲット確率値を求めるための効率的な実用的な方法を開発した。合成データと実データを用いた実験により,本システムはより正確な予測を行うことができ,それに依存する分類器の精度にロバストであることが証明された。 Automated decision support systems promise to help human experts solve tasks more efficiently and accurately. However, existing systems typically require experts to understand when to cede agency to the system or when to exercise their own agency. Moreover, if the experts develop a misplaced trust in the system, their performance may worsen. In this work, we lift the above requirement and develop automated decision support systems that, by design, do not require experts to understand when to trust them to provably improve their performance. To this end, we focus on multiclass classification tasks and consider automated decision support systems that, for each data sample, use a classifier to recommend a subset of labels to a human expert. We first show that, by looking at the design of such systems from the perspective of conformal prediction, we can ensure that the probability that the recommended subset of labels contains the true label matches almost exactly a target probability value. 
Then, we identify the set of target probability values under which the human expert is provably better off predicting a label among those in the recommended subset and develop an efficient practical method to find a near-optimal target probability value. Experiments on synthetic and real data demonstrate that our system can help the experts make more accurate predictions and is robust to the accuracy of the classifier it relies on.	翻訳日:2022-01-31 14:47:14 公開日:2022-01-28
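The label-subset recommendation mentioned in this abstract builds on conformal prediction. The sketch below shows generic split conformal prediction for a multiclass classifier, assuming softmax-style probabilities and the common score "one minus the probability of the true label"; it is not the authors' method for choosing the target probability, and the names `conformal_threshold` and `recommend_subset` as well as all numbers are hypothetical.

```python
import math
from typing import List, Sequence


def conformal_threshold(cal_probs: Sequence[Sequence[float]],
                        cal_labels: Sequence[int],
                        alpha: float) -> float:
    """Return the score threshold giving roughly (1 - alpha) coverage on exchangeable data."""
    # Nonconformity score: 1 - probability assigned to the true label.
    scores = sorted(1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return 1.0  # too few calibration points: every label is recommended
    return scores[rank - 1]


def recommend_subset(probs: Sequence[float], q_hat: float) -> List[int]:
    """Labels whose nonconformity score does not exceed the calibrated threshold."""
    return [y for y, p in enumerate(probs) if 1.0 - p <= q_hat]


# Hypothetical calibration data: class probabilities and true labels.
cal_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
cal_labels = [0, 1, 2, 0]
q_hat = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
subset = recommend_subset([0.5, 0.45, 0.05], q_hat)
```

A smaller `alpha` raises the threshold and produces larger recommended subsets; the trade-off between coverage and subset size is what the abstract's "target probability value" controls.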
# ニューラルネットワークの深さとターゲット関数の局所性の間の相互作用 Interplay between depth of neural networks and locality of target functions ( http://arxiv.org/abs/2201.12082v1 ) ライセンス: Link先を確認	Takashi Mori, Masahito Ueda	(参考訳) 過パラメータの深層ニューラルネットワーク(dnn)は、さまざまな機械学習タスクにおいて驚くほど優れた一般化性能を示すことが認識されている。近似理論や統計的学習理論など,様々な観点から深度の利点が研究されてきたが,既存の理論では過パラメータDNNの実証的成功を十分に説明できない。本稿では,対象関数の深さと局所性との間に顕著な相互作用を示す。我々は、k$-local と $k$-global 関数を導入し、深さは局所関数の学習に有用であるが、グローバル関数の学習に不利であることを見出した。この相互作用は、遅延学習システム内で無限に広いニューラルネットワークを記述するニューラルネットワークによって適切にキャプチャされない。 It has been recognized that heavily overparameterized deep neural networks (DNNs) exhibit surprisingly good generalization performance in various machine-learning tasks. Although benefits of depth have been investigated from different perspectives such as the approximation theory and the statistical learning theory, existing theories do not adequately explain the empirical success of overparameterized DNNs. In this work, we report a remarkable interplay between depth and locality of a target function. We introduce $k$-local and $k$-global functions, and find that depth is beneficial for learning local functions but detrimental to learning global functions. This interplay is not properly captured by the neural tangent kernel, which describes an infinitely wide neural network within the lazy learning regime.	翻訳日:2022-01-31 14:46:51 公開日:2022-01-28
# 不完全な測定から学ぶためのサンプリング定理 Sampling Theorems for Learning from Incomplete Measurements ( http://arxiv.org/abs/2201.12151v1 ) ライセンス: Link先を確認	Julián Tachella, Dongdong Chen and Mike Davies	(参考訳) 多くの実世界の環境では、学習に問題を引き起こす可能性のある不完全な測定データのみが利用可能である。固定不完全測定プロセスを用いた信号モデルの教師なし学習は一般に不可能であり、測定演算子のヌルスペースには情報がない。この制限は、複数の演算子の測定によって克服できる。このアイデアは様々な応用でうまく適用されているが、学習条件の正確なキャラクタリゼーションはまだ不足している。本稿では,このギャップを埋めるために,個別計測演算子の数$G$,オペレータあたりの計測回数$m$,モデルの次元$k$,信号の次元$n$の間の相互作用を示す、信号モデルを学ぶための必要十分条件を提示する。特に,各演算子が少なくとも$m>k+n/G$の測定値を得た場合,一般教師なし学習が可能であることを示す。結果は学習アルゴリズムに依存せず,低ランク行列回復からディープニューラルネットワークまで,多岐にわたる実用的なアルゴリズムに影響を与えている。 In many real-world settings, only incomplete measurement data are available which can pose a problem for learning. Unsupervised learning of the signal model using a fixed incomplete measurement process is impossible in general, as there is no information in the nullspace of the measurement operator. This limitation can be overcome by using measurements from multiple operators. While this idea has been successfully applied in various applications, a precise characterization of the conditions for learning is still lacking. In this paper, we fill this gap by presenting necessary and sufficient conditions for learning the signal model which indicate the interplay between the number of distinct measurement operators $G$, the number of measurements per operator $m$, the dimension of the model $k$ and the dimension of the signals $n$. In particular, we show that generically unsupervised learning is possible if each operator obtains at least $m>k+n/G$ measurements. Our results are agnostic of the learning algorithm and have implications in a wide range of practical algorithms, from low-rank matrix recovery to deep neural networks.	翻訳日:2022-01-31 14:46:38 publication_placeholder
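The condition $m>k+n/G$ quoted in this abstract is easy to evaluate for concrete sizes. The helper below is only an arithmetic illustration of that inequality; the function name `min_measurements_per_operator` and the example dimensions are hypothetical, and the other conditions of the paper's theorems are not checked.

```python
def min_measurements_per_operator(k: int, n: int, G: int) -> int:
    """Smallest integer m satisfying the strict inequality m > k + n / G."""
    if G < 1:
        raise ValueError("G must be at least 1")
    # floor(k + n / G) + 1 equals k + n // G + 1 for integer k, n, G.
    return k + n // G + 1


# Hypothetical sizes: signals in R^100 with a model of dimension 10.
m_with_4_operators = min_measurements_per_operator(k=10, n=100, G=4)
m_with_3_operators = min_measurements_per_operator(k=10, n=100, G=3)
m_with_1_operator = min_measurements_per_operator(k=10, n=100, G=1)
```

With a single operator the bound exceeds the signal dimension (111 > 100), which is consistent with the abstract's remark that learning from one fixed incomplete operator is impossible in general.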
# (参考訳) Optimal Transport Tools (OTT): Wasserstein のすべてのもののための JAX ツールボックス Optimal Transport Tools (OTT): A JAX Toolbox for all things Wasserstein ( http://arxiv.org/abs/2201.12324v1 ) ライセンス: CC BY 4.0	Marco Cuturi, Laetitia Meng-Papaxanthos, Yingtao Tian, Charlotte Bunne, Geoff Davis, Olivier Teboul	(参考訳) 最適なトランスポートツール(OTT-JAX)は、ポイントクラウドとヒストグラム間の最適なトランスポート問題を解決するPythonツールボックスである。ツールボックスは、自動およびカスタムのリバースモードの差別化、ベクタライゼーション、ジャスト・イン・タイムのコンパイル、アクセラレータのサポートなど、さまざまなJAX機能をベースにしている。このツールボックスは、正規化OT問題の解法といった基本的な計算や、barycenters、Gromov-Wasserstein、low-rank solvers、凸写像の推定、分位数と階数の微分可能な一般化、ガウス混合物間の近似OTといった、より高度な拡張を扱っている。ツールボックスコードは https://github.com/ott-jax/ott で入手できる。 Optimal transport tools (OTT-JAX) is a Python toolbox that can solve optimal transport problems between point clouds and histograms. The toolbox builds on various JAX features, such as automatic and custom reverse mode differentiation, vectorization, just-in-time compilation and accelerators support. The toolbox covers elementary computations, such as the resolution of the regularized OT problem, and more advanced extensions, such as barycenters, Gromov-Wasserstein, low-rank solvers, estimation of convex maps, differentiable generalizations of quantiles and ranks, and approximate OT between Gaussian mixtures. The toolbox code is available at https://github.com/ott-jax/ott.	翻訳日:2022-01-31 14:45:37 公開日:2022-01-28
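The "regularized OT problem" named in this abstract is commonly solved with Sinkhorn iterations. The sketch below is a plain-Python version for two small histograms, written only to illustrate the idea; it does not use or reproduce the OTT-JAX API, and the function name `sinkhorn_plan`, the cost matrix, and the value of `epsilon` are assumptions chosen for the example.

```python
import math
from typing import List, Sequence


def sinkhorn_plan(a: Sequence[float], b: Sequence[float],
                  cost: Sequence[Sequence[float]],
                  epsilon: float = 0.5, iterations: int = 200) -> List[List[float]]:
    """Entropy-regularized transport plan between histograms a and b."""
    rows, cols = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(cols)] for i in range(rows)]
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(rows)) for j in range(cols)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(cols)] for i in range(rows)]


# Hypothetical two-bin histograms with a 0/1 ground cost.
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn_plan(a, b, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
```

After convergence the row sums of `plan` approximate `a` and the column sums approximate `b`; smaller `epsilon` values bring the plan closer to unregularized optimal transport but need more iterations and can underflow in this naive form.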
# 3D-FlowNet:3次元表現を用いたイベントベース光フロー推定 3D-FlowNet: Event-based optical flow estimation with 3D representation ( http://arxiv.org/abs/2201.12265v1 ) ライセンス: Link先を確認	Haixin Sun, Minh-Quan Dao, Vincent Fremont	(参考訳) イベントベースのカメラは、低い照明条件下での自動運転車のナビゲーション中の高速モーション検出などの重要なタスクのために、フレームベースのカメラの制限を克服することができる。イベントカメラの高時間分解能と高ダイナミックレンジにより、速い動きと極端な光のシナリオで作業することができる。しかし、Deep Neural Networksのような従来のコンピュータビジョン手法は、非同期で離散的なイベントデータを扱うには適していない。さらに、イベントデータに対する従来の2Dエンコーディング表現手法は、時間分解能を犠牲にする。本稿では,まず,事象の時間分布をよりよく保存するために,それを3次元に拡張して2次元符号化表現を改善する。次に,3次元入力表現を処理し,新たな符号化手法に従って光フロー推定を出力するネットワークアーキテクチャである3D-FlowNetを提案する。イベントベースカメラのラベル付きデータセットの欠如を補うために、セルフ教師付きトレーニング戦略が採用されている。最後に,提案ネットワークをmvsec(multi-vehicle stereo event camera)データセットを用いてトレーニングし,評価する。その結果、私たちの3D-FlowNetは、トレーニングエポックの少ない最先端のアプローチ(Spike-FlowNetの100に対して30)よりも優れています。 Event-based cameras can overpass frame-based cameras limitations for important tasks such as high-speed motion detection during self-driving cars navigation in low illumination conditions. The event cameras' high temporal resolution and high dynamic range, allow them to work in fast motion and extreme light scenarios. However, conventional computer vision methods, such as Deep Neural Networks, are not well adapted to work with event data as they are asynchronous and discrete. Moreover, the traditional 2D-encoding representation methods for event data, sacrifice the time resolution. In this paper, we first improve the 2D-encoding representation by expanding it into three dimensions to better preserve the temporal distribution of the events. We then propose 3D-FlowNet, a novel network architecture that can process the 3D input representation and output optical flow estimations according to the new encoding methods. A self-supervised training strategy is adopted to compensate the lack of labeled datasets for the event-based camera. Finally, the proposed network is trained and evaluated with the Multi-Vehicle Stereo Event Camera (MVSEC) dataset. 
The results show that our 3D-FlowNet outperforms state-of-the-art approaches with fewer training epochs (30, compared to 100 for Spike-FlowNet).	翻訳日:2022-01-31 14:38:04 公開日:2022-01-28
# ニューロモルフィック転倒検出と行動認識データセットのベンチマーク標準ビジョンモデル Benchmarking Conventional Vision Models on Neuromorphic Fall Detection and Action Recognition Dataset ( http://arxiv.org/abs/2201.12285v1 ) ライセンス: Link先を確認	Karthik Sivarama Krishnan and Koushik Sivarama Krishnan	(参考訳) ニューロモルフィックな視覚ベースのセンサーは近年、低消費電力で時空間イベントをキャプチャする能力で人気が高まっている。これらのセンサーは、記録されている被写体のプライバシーを守るのに役立つ従来のカメラのイベントやスパイクを記録する。これらのイベントはピクセル毎の輝度変化としてキャプチャされ、出力データストリームは時間、位置、ピクセルの強度変化情報でエンコードされる。本稿では,ニューロモルフィックな人間の行動認識と転倒検出データセットに関する,微調整された従来の視覚モデルの性能を評価・評価する。ダイナミックビジョンセンシングカメラからの時空間イベントストリームは、標準シーケンス画像フレームに符号化される。これらのビデオフレームは、従来のディープラーニングベースのアーキテクチャのベンチマークに使用される。提案手法では,DVS-R2+1D,DVS-CSN,DVS-C2D,DVS-SlowFast,DVS-X3D,DVS-MViTと命名した。これらのモデルの性能を比較すると、現在の最先端のMViTベースのアーキテクチャDVS-MViTは0.958の精度とF-1スコアの0.958の精度で他のモデルよりも優れています。 2つ目はDVS-C2Dで、精度0.916、F-1スコア0.916である。第3と第4はDVS-R2+1DとDVS-SlowFastで、精度は0.875と0.833とF-1スコアは0.875と0.861である。 DVS-CSNとDVS-X3Dは0.708と0.625で、F1スコアは0.722と0.625である。 Neuromorphic vision-based sensors are gaining popularity in recent years with their ability to capture Spatio-temporal events with low power sensing. These sensors record events or spikes over traditional cameras which helps in preserving the privacy of the subject being recorded. These events are captured as per-pixel brightness changes and the output data stream is encoded with time, location, and pixel intensity change information. This paper proposes and benchmarks the performance of fine-tuned conventional vision models on neuromorphic human action recognition and fall detection datasets. The Spatio-temporal event streams from the Dynamic Vision Sensing cameras are encoded into a standard sequence image frames. These video frames are used for benchmarking conventional deep learning-based architectures. In this proposed approach, we fine-tuned the state-of-the-art vision models for this Dynamic Vision Sensing (DVS) application and named these models as DVS-R2+1D, DVS-CSN, DVS-C2D, DVS-SlowFast, DVS-X3D, and DVS-MViT. 
Comparing the performance of these models, we see that the current state-of-the-art MViT-based architecture, DVS-MViT, outperforms all the other models with an accuracy of 0.958 and an F1 score of 0.958. The second best is DVS-C2D, with an accuracy of 0.916 and an F1 score of 0.916. Third and fourth are DVS-R2+1D and DVS-SlowFast, with accuracies of 0.875 and 0.833 and F1 scores of 0.875 and 0.861, respectively. DVS-CSN and DVS-X3D were the lowest-performing models, with accuracies of 0.708 and 0.625 and F1 scores of 0.722 and 0.625, respectively.	翻訳日:2022-01-31 14:37:43 公開日:2022-01-28
# コーディネートドメインエンコーダとペア分類器による複数ソースドメイン適応 Multiple-Source Domain Adaptation via Coordinated Domain Encoders and Paired Classifiers ( http://arxiv.org/abs/2201.11870v1 ) ライセンス: Link先を確認	Payam Karisani	(参考訳) ドメインシフト下でのテキスト分類のための新しいマルチソース教師なしモデルを提案する。我々のモデルは文書表現における更新率を利用してドメインエンコーダを動的に統合する。また、ソース分類器のペア化のために、ターゲット領域のエラー率を推定するために確率的ヒューリスティックを用いる。我々のヒューリスティックは、対象特徴空間におけるデータ変換コストと分類器の精度を利用する。我々は,本アルゴリズムの有効性を評価するために,ドメイン適応の現実シナリオを用いた。また,事前学習された多層トランスフォーマを文書エンコーダとして使用し,事前学習によってドメイン適応モデルによる改善が達成可能かどうかを実証した。実験では、この設定で私たちのモデルが最もパフォーマンスの高いアプローチであることを証明します。 We present a novel multiple-source unsupervised model for text classification under domain shift. Our model exploits the update rates in document representations to dynamically integrate domain encoders. It also employs a probabilistic heuristic to infer the error rate in the target domain in order to pair source classifiers. Our heuristic exploits data transformation cost and the classifier accuracy in the target feature space. We have used real world scenarios of Domain Adaptation to evaluate the efficacy of our algorithm. We also used pretrained multi-layer transformers as the document encoder in the experiments to demonstrate whether the improvement achieved by domain adaptation models can be delivered by out-of-the-box language model pretraining. The experiments testify that our model is the top performing approach in this setting.	翻訳日:2022-01-31 14:37:02 公開日:2022-01-28
# 自然言語生成のための生成協調ネットワーク Generative Cooperative Networks for Natural Language Generation ( http://arxiv.org/abs/2201.12320v1 ) ライセンス: Link先を確認	Sylvain Lamprier and Thomas Scialom and Antoine Chaffin and Vincent Claveau and Ewa Kijak and Jacopo Staiano and Benjamin Piwowarski	(参考訳) GAN(Generative Adversarial Networks)は、特に画像生成の分野で、多くの連続生成タスクにおいて大きな成功を収めている。しかし、言語のような離散出力の場合、ガンの最適化は多くの不安定性を持つ未解決問題であり、判別器出力からジェネレータパラメータへの勾配を適切にバックプロパゲーションできない。言い換えると、ジェネレータネットワークを強化学習を通じて学習し、識別器信号を報酬として利用するが、そのような技術は報酬の移動と勾配問題に悩まされる。最後に、直接の最大様相のアプローチに比べれば、しばしば短くなる。本稿では,識別器アーキテクチャを協調的に使用する生成協調ネットワークと,手元のタスクに対して現実的なテキストのサンプルを出力する生成ポリシーを提案する。提案手法ではコンバージェンスを理論的に保証し、2つの主要なnlgタスクにおける最先端の成果を実証的に達成するために,様々な効率的な復号手法を検討する。 Generative Adversarial Networks (GANs) have known a tremendous success for many continuous generation tasks, especially in the field of image generation. However, for discrete outputs such as language, optimizing GANs remains an open problem with many instabilities, as no gradient can be properly back-propagated from the discriminator output to the generator parameters. An alternative is to learn the generator network via reinforcement learning, using the discriminator signal as a reward, but such a technique suffers from moving rewards and vanishing gradient problems. Finally, it often falls short compared to direct maximum-likelihood approaches. In this paper, we introduce Generative Cooperative Networks, in which the discriminator architecture is cooperatively used along with the generation policy to output samples of realistic texts for the task at hand. We give theoretical guarantees of convergence for our approach, and study various efficient decoding schemes to empirically achieve state-of-the-art results in two main NLG tasks.	翻訳日:2022-01-31 14:36:51 公開日:2022-01-28
# フェデレーション学習のための勾配マスク平均化 Gradient Masked Averaging for Federated Learning ( http://arxiv.org/abs/2201.11986v1 ) ライセンス: Link先を確認	Irene Tenison, Sai Aravind Sreeramadas, Vaikkunth Mugunthan, Edouard Oyallon, Eugene Belilovsky, Irina Rish	(参考訳) フェデレートラーニング(Federated Learning)は、異種データを持つ多数のクライアントが互いにデータを共有することなく、統一されたグローバルモデルの学習をコーディネートできるようにする、新たなパラダイムである。標準的なフェデレーション学習アルゴリズムは、サーバーのグローバルモデルを近似するためにモデルパラメータや勾配更新の平均化を伴う。しかし、不均一な設定における平均化は、情報損失をもたらし、支配的なクライアントによって引き起こされるバイアスによる一般化の低下につながる。 FL設定のように、非i.dデータセットをより一般化するためには、クライアント間で異なる急激なメカニズムを無視しながら、一定である不変なメカニズムを学習することに集中すべきである、という仮説を立てる。本研究では,分散型学習のための勾配マスク型平均化手法を,クライアント更新の標準平均化に代わるものとして提案する。このクライアント更新集約技術は、既存のほとんどのフェデレーションアルゴリズムのドロップイン代替として適用することができる。分散性,実世界性,分散性(最悪の場合として)のテストデータセットを備えた複数のflアルゴリズムに対して,勾配マスクによる広範囲な実験を行い,特にヘテロジニアスクライアントの場合において一貫した改善を提供することを示す。 Federated learning is an emerging paradigm that permits a large number of clients with heterogeneous data to coordinate learning of a unified global model without the need to share data amongst each other. Standard federated learning algorithms involve averaging of model parameters or gradient updates to approximate the global model at the server. However, in heterogeneous settings averaging can result in information loss and lead to poor generalization due to the bias induced by dominant clients. We hypothesize that to generalize better across non-i.i.d datasets as in FL settings, the algorithms should focus on learning the invariant mechanism that is constant while ignoring spurious mechanisms that differ across clients. Inspired from recent work in the Out-of-Distribution literature, we propose a gradient masked averaging approach for federated learning as an alternative to the standard averaging of client updates. This client update aggregation technique can be adapted as a drop-in replacement in most existing federated algorithms. 
We perform extensive experiments with the gradient masked approach on multiple FL algorithms with in-distribution, real-world, and out-of-distribution (as the worst-case scenario) test datasets and show that it provides consistent improvements, particularly in the case of heterogeneous clients.	翻訳日:2022-01-31 14:35:05 公開日:2022-01-28
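The abstract above describes replacing plain averaging of client updates with a gradient masked average but does not spell out the mask. The following is a minimal sketch under stated assumptions, not the authors' implementation: the mask is assumed to be a per-coordinate sign-agreement score, and `masked_average` and `threshold` are hypothetical names chosen for illustration.

```python
def masked_average(client_updates, threshold=0.6):
    """Average client updates, zeroing coordinates whose signs disagree.

    client_updates: list of equal-length lists of floats, one per client.
    threshold: hypothetical minimum agreement score in [0, 1] (an assumption).
    """
    n_clients = len(client_updates)
    dim = len(client_updates[0])
    result = []
    for j in range(dim):
        column = [update[j] for update in client_updates]
        mean = sum(column) / n_clients
        # Agreement score: |mean of signs|, equal to 1.0 when all clients agree.
        signs = [(v > 0) - (v < 0) for v in column]
        agreement = abs(sum(signs)) / n_clients
        # Keep the averaged coordinate only if enough clients agree on its sign.
        result.append(mean if agreement >= threshold else 0.0)
    return result


updates = [
    [0.5, -0.2, 0.1],
    [0.3, 0.4, 0.2],
    [0.4, -0.1, 0.3],
]
masked = masked_average(updates)
```

Here the second coordinate is dropped because the clients disagree on its direction, while the first and third keep their plain average. Because only the aggregation step changes, a rule of this shape could in principle be swapped into an existing federated averaging loop, which is the drop-in property the abstract claims.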
# 局所不変説明:局所不変学習による安定・一方向説明に向けて Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning ( http://arxiv.org/abs/2201.12143v1 ) ライセンス: Link先を確認	Amit Dhurandhar, Karthikeyan Ramamurthy, Kartik Ahuja and Vijay Arya	(参考訳) ローカル解釈可能なモデル非依存説明(lime)メソッドは、例ごとにブラックボックスモデルを説明するために使われる最も一般的な方法の1つである。多くの変種が提案されているが、安定で直感的な高忠実度説明を生成する簡単な方法を提供するものはほとんどない。本研究では,不変リスク最小化(IRM)原理に着想を得たモデル非依存的局所的説明法を提案する。本手法は,理論上,ブラックボックス関数の勾配が説明したい例の局所性において突然符号が変化するような特徴を解消する傾向が強いことを理論的に示すゲーム理論定式化に基づいているが,他の場合ではより慎重であり,より保守的な(特徴)属性を選択する。実験では, ランダムな摂動を用いて生成した近傍における説明の質が, LIMEよりも優れており, また, データ多様体からサンプリングしたリアルな隣人を用いた他の手法に匹敵する場合もある。これは、写実的な隣人を作るか、説明を投影するために多様体を学ぶことは通常高価であるか、あるいは不可能であるかもしれないことを考慮すれば望ましい。さらに,本アルゴリズムは訓練が簡単かつ効率的であり,最近の研究で見られるような(部分的な)因果グラフなどのサイド情報にアクセスせずに,ブラックボックスの局所的な決定に対する安定した入力特徴を確認できる。 Locally interpretable model agnostic explanations (LIME) method is one of the most popular methods used to explain black-box models at a per example level. Although many variants have been proposed, few provide a simple way to produce high fidelity explanations that are also stable and intuitive. In this work, we provide a novel perspective by proposing a model agnostic local explanation method inspired by the invariant risk minimization (IRM) principle -- originally proposed for (global) out-of-distribution generalization -- to provide such high fidelity explanations that are also stable and unidirectional across nearby examples. Our method is based on a game theoretic formulation where we theoretically show that our approach has a strong tendency to eliminate features where the gradient of the black-box function abruptly changes sign in the locality of the example we want to explain, while in other cases it is more careful and will choose a more conservative (feature) attribution, a behavior which can be highly desirable for recourse. 
Empirically, we show on tabular, image and text data that the quality of our explanations with neighborhoods formed using random perturbations is much better than that of LIME and in some cases even comparable to that of other methods that use realistic neighbors sampled from the data manifold. This is desirable given that learning a manifold to either create realistic neighbors or to project explanations is typically expensive or may even be impossible. Moreover, our algorithm is simple and efficient to train, and can ascertain stable input features for local decisions of a black-box without access to side information such as a (partial) causal graph, as has been seen in some recent works.	翻訳日:2022-01-31 14:34:41 公開日:2022-01-28
# 離散マルコフ決定過程における安全政策改善アプローチ Safe Policy Improvement Approaches on Discrete Markov Decision Processes ( http://arxiv.org/abs/2201.12175v1 ) ライセンス: Link先を確認	Philipp Scholl, Felix Dietrich, Clemens Otte, Steffen Udluft	(参考訳) 安全政策改善(SPI)は、学習方針が与えられた基準方針とほぼ同等であることを示すことを目的としている。 NadjahiらによるSoft Baseline Bootstrapping (Soft-SPIBB)によるSPI上に構築し、それらのアプローチにおける理論的問題を特定し、補正された理論を提案し、有限マルコフ決定過程(MDP)上で確実に安全な新しいアルゴリズムを導出する。さらに、2つの異なるベンチマークで、多くの最先端SPIアルゴリズムの中で最高の性能を示すヒューリスティックアルゴリズムを提供する。さらに,spiアルゴリズムの分類法を導入し,spiアルゴリズムの2つのクラスの興味深い特性を実証的に示した。 Safe Policy Improvement (SPI) aims at provable guarantees that a learned policy is at least approximately as good as a given baseline policy. Building on SPI with Soft Baseline Bootstrapping (Soft-SPIBB) by Nadjahi et al., we identify theoretical issues in their approach, provide a corrected theory, and derive a new algorithm that is provably safe on finite Markov Decision Processes (MDP). Additionally, we provide a heuristic algorithm that exhibits the best performance among many state of the art SPI algorithms on two different benchmarks. Furthermore, we introduce a taxonomy of SPI algorithms and empirically show an interesting property of two classes of SPI algorithms: while the mean performance of algorithms that incorporate the uncertainty as a penalty on the action-value is higher, actively restricting the set of policies more consistently produces good policies and is, thus, safer.	翻訳日:2022-01-31 14:34:09 公開日:2022-01-28
# 欠測データを伴う楕円分布の混合に対するロバストかつ柔軟なEMアルゴリズム A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data ( http://arxiv.org/abs/2201.12020v1 ) ライセンス: Link先を確認	Florian Mouret, Alexandre Hippert-Ferrer, Frédéric Pascal, Jean-Yves Tourneret	(参考訳) 本稿では,ノイズを含む非ガウスデータに対する欠測データ補完の問題に取り組む。古典的な補完法であるガウス混合モデルの期待最大化(EM)アルゴリズムは、k近傍法や連鎖方程式による多重代入に基づく他の一般的なアプローチと比較して興味深い性質を示している。しかし、ガウス混合モデルは不均一なデータに対して頑健でないことが知られており、データが外れ値によって汚染されたり、非ガウス分布から来る場合、推定性能が低下する可能性がある。この問題を克服するために, 欠測データを扱える利点を持つ、楕円分布の混合に対する新たな期待最大化アルゴリズムを検討する。楕円分布の混合に付随する完全データ尤度は、その条件分布によりEMフレームワークによく適合しており、この条件分布はスチューデント分布であることが示される。合成データの実験的結果は,提案アルゴリズムが外れ値に対して頑健であり,非ガウスデータで使用可能であることを示す。さらに、実世界のデータセットで実施された実験は、このアルゴリズムが他の古典的補完法と比較して非常に競争力があることを示している。 This paper tackles the problem of missing data imputation for noisy and non-Gaussian data. A classical imputation method, the Expectation Maximization (EM) algorithm for Gaussian mixture models, has shown interesting properties when compared to other popular approaches such as those based on k-nearest neighbors or on multiple imputations by chained equations. However, Gaussian mixture models are known not to be robust to heterogeneous data, which can lead to poor estimation performance when the data is contaminated by outliers or comes from a non-Gaussian distribution. To overcome this issue, a new expectation maximization algorithm is investigated for mixtures of elliptical distributions with the nice property of handling potential missing data. The complete-data likelihood associated with mixtures of elliptical distributions is well adapted to the EM framework thanks to its conditional distribution, which is shown to be a Student distribution. Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data. 
Furthermore, experiments conducted on real-world datasets show that this algorithm is very competitive when compared to other classical imputation methods.	翻訳日:2022-01-31 14:33:50 公開日:2022-01-28
# 埋め込みラプラシアン距離によるマルチスケールグラフの比較 Multiscale Graph Comparison via the Embedded Laplacian Distance ( http://arxiv.org/abs/2201.12064v1 ) ライセンス: Link先を確認	Edric Tam, David Dunson	(参考訳) 異なるサイズのグラフを比較するための単純かつ高速な手法を提案する。既存のアプローチは、しばしば同じ頂点数のグラフの比較に制限されるか、計算量の面でスケールしない。我々は,サイズが大きく異なりうるグラフを比較するために,埋め込みラプラシアン距離(ELD)を提案する。我々のアプローチはまず、グラフ構造を尊重する共通の低次元ラプラシアン埋め込み空間にグラフを投影する。これにより、問題はユークリッド空間内の点群を比較する問題に帰着する。距離は自然なスライスWasserstein手法によって効率的に計算できる。 ELDは擬距離であり、グラフ同型の下で不変であることを示す。スペクトルグラフ理論のツールを用いて,ELDの直観的な解釈を与える。シミュレーションデータと実データの両方を用いて, ELD アプローチの有効性を広範に検証した。結果は良好である。 We introduce a simple and fast method for comparing graphs of different sizes. Existing approaches are often either limited to comparing graphs with the same number of vertices or are computationally unscalable. We propose the Embedded Laplacian Distance (ELD) for comparing graphs of potentially vastly different sizes. Our approach first projects the graphs onto a common, low-dimensional Laplacian embedding space that respects graphical structure. This reduces the problem to that of comparing point clouds in a Euclidean space. A distance can then be computed efficiently via a natural sliced Wasserstein approach. We show that the ELD is a pseudo-metric and is invariant under graph isomorphism. We provide intuitive interpretations of the ELD using tools from spectral graph theory. We test the efficacy of the ELD approach extensively on both simulated and real data. Results obtained are excellent.	翻訳日:2022-01-31 14:33:31 公開日:2022-01-28
# ループ内のドメインエキスパートによる近似ベイズ計算 Approximate Bayesian Computation with Domain Expert in the Loop ( http://arxiv.org/abs/2201.12090v1 ) ライセンス: Link先を確認	Ayush Bharti, Louis Filstroff, Samuel Kaski	(参考訳) 近似ベイズ計算(ABC: Approximate Bayesian computation)は、扱いにくい尤度関数を持つモデルに対して広く使われている尤度フリー推論法である。 ABC法は通常、観測データとシミュレーションデータの要約統計量を比較することに頼っているため、統計量の選択は極めて重要である。この選択は、情報の損失と次元削減の間のトレードオフを伴い、しばしばドメイン知識に基づいて決定される。しかし、適切な統計量を手作業で設計し選択することは、複数の試行錯誤のステップを伴う面倒な作業である。本研究では,ABCの統計量選択のためのアクティブラーニング手法を導入し,ドメインエキスパートの作業量を大幅に削減する。専門家を巻き込むことで、既存の次元削減法とは異なり、誤特定されたモデルを扱うことができる。さらに,シミュレーション予算が制限された場合,既存手法よりも事後推定が優れていることを実験的に示す。 Approximate Bayesian computation (ABC) is a popular likelihood-free inference method for models with intractable likelihood functions. As ABC methods usually rely on comparing summary statistics of observed and simulated data, the choice of the statistics is crucial. This choice involves a trade-off between loss of information and dimensionality reduction, and is often determined based on domain knowledge. However, handcrafting and selecting suitable statistics is a laborious task involving multiple trial-and-error steps. In this work, we introduce an active learning method for ABC statistics selection which reduces the domain expert's work considerably. By involving the experts, we are able to handle misspecified models, unlike the existing dimension reduction methods. Moreover, empirical results show better posterior estimates than with existing methods, when the simulation budget is limited.	翻訳日:2022-01-31 14:31:53 publ公開日:2022-01-28
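To make the role of the summary statistic concrete, here is a minimal rejection-ABC sketch. It illustrates plain ABC only, not the active learning method of the paper; the Gaussian simulator, the uniform prior, the sample-mean statistic and the tolerance are all assumptions picked for the example.

```python
import random


def simulate(theta, n, rng):
    """Toy simulator: n draws from a Gaussian with mean theta and unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    """Summary statistic: the sample mean (a hand-picked, illustrative choice)."""
    return sum(data) / len(data)


def rejection_abc(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies within tolerance of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # assumed prior: Uniform(-5, 5)
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 50, rng)
posterior_sample = rejection_abc(observed, 5000, 0.2, rng)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

The likelihood is never evaluated: the only link between data and parameters is the comparison of `summary` values, which is why a poorly chosen statistic directly degrades the posterior and why the abstract treats statistic selection as the central problem.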
# 分子最適化法とバイアス還元評価法のinsilico評価におけるバイアス Biases in In Silico Evaluation of Molecular Optimization Methods and Bias-Reduced Evaluation Methodology ( http://arxiv.org/abs/2201.12163v1 ) ライセンス: Link先を確認	Hiroshi Kajino, Kohei Miyaguchi, Takayuki Osogami	(参考訳) 分子最適化法におけるシリカ評価手法に興味がある。分子のサンプルとその性質を考慮に入れれば、ターゲットの性質に対して最適化された分子を見つけることができるエージェントを訓練するだけでなく、その性能も評価したい。一般的なプラクティスは、サンプルのターゲットプロパティの予測器をトレーニングし、エージェントのトレーニングと評価の両方に使用することである。この評価器は2つのバイアスを負う可能性がある。1つは予測器の誤特定と、もう1つはトレーニングや評価に同じサンプルを再利用することによるものである。各バイアスに対するバイアス低減手法を包括的に検討し,その効果を実証的に検討した。 We are interested in in silico evaluation methodology for molecular optimization methods. Given a sample of molecules and their properties of our interest, we wish not only to train an agent that can find molecules optimized with respect to the target property but also to evaluate its performance. A common practice is to train a predictor of the target property on the sample and use it for both training and evaluating the agent. We show that this evaluator potentially suffers from two biases; one is due to misspecification of the predictor and the other to reusing the same sample for training and evaluation. We discuss bias reduction methods for each of the biases comprehensively, and empirically investigate their effectiveness.	翻訳日:2022-01-31 14:31:42 公開日:2022-01-28
# ベイセンター推定のためのwasserstein反復ネットワーク Wasserstein Iterative Networks for Barycenter Estimation ( http://arxiv.org/abs/2201.12245v1 ) ライセンス: Link先を確認	Alexander Korotin, Vage Egiazarian, Lingxiao Li, Evgeny Burnaev	(参考訳) ワッサーシュタインのバリセンターは、幾何学的に意味のある方法で確率測度の平均を表す能力によって人気を博している。本稿では,連続測度のwasserstein-2重心を生成モデルを用いて近似するアルゴリズムを提案する。従来のアプローチでは、バイアスを導入する正規化(エントロピー/クワッドラティック)や、大規模なタスクには不十分な入力凸ニューラルネットワークに依存していた。対照的に,本アルゴリズムではバイアスは導入せず,任意のニューラルネットワークを用いることができる。さらに、有名人の顔のデータセットに基づいて、FIDなどの生成モデルの標準指標を用いて、バリセンタアルゴリズムの定量的評価に使用できるAve, celeba!データセットを構築する。 Wasserstein barycenters have become popular due to their ability to represent the average of probability measures in a geometrically meaningful way. In this paper, we present an algorithm to approximate the Wasserstein-2 barycenters of continuous measures via a generative model. Previous approaches rely on regularization (entropic/quadratic) which introduces bias or on input convex neural networks which are not expressive enough for large-scale tasks. In contrast, our algorithm does not introduce bias and allows using arbitrary neural networks. In addition, based on the celebrity faces dataset, we construct Ave, celeba! dataset which can be used for quantitative evaluation of barycenter algorithms by using standard metrics of generative models such as FID.	翻訳日:2022-01-31 14:31:29 公開日:2022-01-28
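As a point of reference for what a Wasserstein-2 barycenter is, the one-dimensional case has a closed form: the barycenter's quantile function is the weighted average of the input quantile functions. The sketch below uses that fact for empirical measures with equal sample counts. It is not the paper's algorithm, which targets continuous high-dimensional measures with generative models; `barycenter_1d` is a hypothetical helper name.

```python
def barycenter_1d(samples, weights=None):
    """W2 barycenter of 1D empirical measures that all have the same sample count.

    In one dimension, sorting each sample set and averaging the order
    statistics position by position gives the barycenter's samples.
    """
    k = len(samples)
    if weights is None:
        weights = [1.0 / k] * k  # uniform weights by default
    sorted_samples = [sorted(s) for s in samples]
    n = len(sorted_samples[0])
    return [
        sum(w * s[i] for w, s in zip(weights, sorted_samples))
        for i in range(n)
    ]


a = [0.0, 1.0, 2.0]
b = [10.0, 12.0, 11.0]
bary = barycenter_1d([a, b])
```

The result sits halfway between the two inputs while keeping their shape, which is the geometrically meaningful averaging the abstract refers to; a pointwise mixture of the two measures would instead be bimodal.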
# 教師なし領域適応のためのラベルなしデータによる特徴のシャッフル強化 Shuffle Augmentation of Features from Unlabeled Data for Unsupervised Domain Adaptation ( http://arxiv.org/abs/2201.11963v1 ) ライセンス: Link先を確認	Changwei Xu, Jianfei Yang, Haoran Tang, Han Zou, Cheng Lu, Tianshuo Zhang	(参考訳) 対象サンプルのラベルが利用できない転写学習の分野であるUnsupervised Domain Adaptation (UDA) は, 近年, 逆学習モデルの助けを借りて, 広く研究・開発されている。既存のUDAアルゴリズムは、ニューラルネットワークを誘導して転送可能で識別可能な特徴を抽出するが、分類器はラベル付きソースデータの監督下でのみ訓練される。ソースドメインとターゲットドメインの区別が避けられないため、分類器はターゲットの分類境界をほとんど認識できない。本稿では,新たなUDAフレームワークであるShuffle Augmentation of Features (SAF)を提案する。 SAFはターゲットサンプルから学習し、クラス認識対象の特徴を適応的に蒸留し、クラス境界を見つけるために暗黙的に分類器を誘導する。広範な実験によって実証されたSAFモジュールは、既存のUDAモデルに組み込むことができ、性能改善を実現している。 Unsupervised Domain Adaptation (UDA), a branch of transfer learning where labels for target samples are unavailable, has been widely researched and developed in recent years with the help of adversarially trained models. Although existing UDA algorithms are able to guide neural networks to extract transferable and discriminative features, classifiers are merely trained under the supervision of labeled source data. Given the inevitable discrepancy between source and target domains, the classifiers can hardly be aware of the target classification boundaries. In this paper, Shuffle Augmentation of Features (SAF), a novel UDA framework, is proposed to address the problem by providing the classifier with supervisory signals from target feature representations. SAF learns from the target samples, adaptively distills class-aware target features, and implicitly guides the classifier to find comprehensive class borders. Demonstrated by extensive experiments, the SAF module can be integrated into any existing adversarial UDA models to achieve performance improvements.	翻訳日:2022-01-31 14:30:59 公開日:2022-01-28
# 一度だけカットする: 1回のカットでデータ拡張を増やす You Only Cut Once: Boosting Data Augmentation with a Single Cut ( http://arxiv.org/abs/2201.12078v1 ) ライセンス: Link先を確認	Junlin Han, Pengfei Fang, Weihao Li, Jie Hong, Mohammad Ali Armin, Ian Reid, Lars Petersson, Hongdong Li	(参考訳) データ拡張を行うためのYOCO(You Only Cut Once)を提案する。 YOCOは1つの画像を2つのピースに分割し、各ピース内で個別にデータ拡張を行う。 YOCOを適用することで、サンプルあたりの増補の多様性が向上し、ニューラルネットワークが部分的な情報からオブジェクトを認識することを奨励する。 YOCOはパラメータフリーで使いやすく、ほとんどすべての拡張を無償で行うことができる。その効果を評価するために徹底的な実験が行われている。我々はまず、YOCOが様々なデータ拡張、ニューラルネットワークアーキテクチャにシームレスに適用できることを実証し、CIFARとImageNetの分類タスクのパフォーマンス向上をもたらし、時には従来の画像レベルの拡張よりも大きなマージンを上回ります。さらに,複数のダウンストリームタスクにより良い転送が可能な,より強力な表現に向けて,YOCOによる事前学習の対照的なメリットを示す。最後に、複数のYOCOの変種を調査し、各設定の性能を実証的に分析する。コードはGitHubで入手できる。 We present You Only Cut Once (YOCO) for performing data augmentations. YOCO cuts one image into two pieces and performs data augmentations individually within each piece. Applying YOCO improves the diversity of the augmentation per sample and encourages neural networks to recognize objects from partial information. YOCO enjoys the properties of parameter-free, easy usage, and boosting almost all augmentations for free. Thorough experiments are conducted to evaluate its effectiveness. We first demonstrate that YOCO can be seamlessly applied to varying data augmentations, neural network architectures, and brings performance gains on CIFAR and ImageNet classification tasks, sometimes surpassing conventional image-level augmentation by large margins. Moreover, we show YOCO benefits contrastive pre-training toward a more powerful representation that can be better transferred to multiple downstream tasks. Finally, we study a number of variants of YOCO and empirically analyze the performance for respective settings. Code is available at GitHub.	翻訳日:2022-01-31 14:30:42 公開日:2022-01-28
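The cut-then-augment step described above can be sketched in a few lines. This is an illustration under assumptions, not the released code: the cut is assumed to be at the midpoint, the orientation is assumed to be chosen at random with equal probability, and images are plain lists of rows. `hflip` stands in for any augmentation.

```python
import random


def hflip(img):
    """Horizontal flip of an image stored as a list of rows."""
    return [row[::-1] for row in img]


def yoco(img, augment, rng):
    """Cut img once (along width or height), augment each piece, then re-join."""
    h, w = len(img), len(img[0])
    if rng.random() < 0.5:
        # Cut along the width: left and right halves.
        left = [row[: w // 2] for row in img]
        right = [row[w // 2:] for row in img]
        left, right = augment(left), augment(right)
        return [l + r for l, r in zip(left, right)]
    # Cut along the height: top and bottom halves.
    top, bottom = img[: h // 2], img[h // 2:]
    return augment(top) + augment(bottom)


img = [[1, 2, 3, 4], [5, 6, 7, 8]]
out = yoco(img, hflip, random.Random(0))
```

With a randomized `augment`, each half receives its own independent draw, which is where the extra per-sample diversity mentioned in the abstract would come from. The output keeps the input shape, so the wrapper adds no parameters to the training pipeline.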
# (参考訳) 自然言語によるテキスト分布の違いの要約 Summarizing Differences between Text Distributions with Natural Language ( http://arxiv.org/abs/2201.12323v1 ) ライセンス: CC BY 4.0	Ruiqi Zhong, Charlie Snell, Dan Klein, Jacob Steinhardt	(参考訳) 2つのテキストの分布はどのように異なるのか? パターンの発見には、何百ものサンプルを退屈に読み込む必要があるからだ。 2つの分布 $d_{0}$ と $d_{1}$ が与えられたとき、我々はより頻繁に$d_{1}$、例えば "is military-related" で真となる記述を探す。この問題に対処するために、gpt-3を微調整して、プロンプトで記述する: "[samples of $d_{0}$] + [samples of $d_{1}$] + それらの間の差は ______ である。次に、学習した検証器でより大きなサンプルのセットを保持する頻度をチェックすることで、記述を再評価します。一方, GPT-3 Curie (13B) は人間のアノテーションに類似した記述しか生成しないのに対して, GPT-3 Curie (13B) は微調整と再ランクで61%, GPT-3 Davinci (175B) を用いたベストシステムは76%であった。本稿では,分散シフトの記述,データセットのショートカットのデバッグ,未知タスクの要約,テキストクラスタのラベル付け,自動生成した記述に基づく分析を行う。 How do two distributions of texts differ? Humans are slow at answering this, since discovering patterns might require tediously reading through hundreds of samples. We propose to automatically summarize the differences by "learning a natural language hypothesis": given two distributions $D_{0}$ and $D_{1}$, we search for a description that is more often true for $D_{1}$, e.g., "is military-related." To tackle this problem, we fine-tune GPT-3 to propose descriptions with the prompt: "[samples of $D_{0}$] + [samples of $D_{1}$] + the difference between them is _____". We then re-rank the descriptions by checking how often they hold on a larger set of samples with a learned verifier. On a benchmark of 54 real-world binary classification tasks, while GPT-3 Curie (13B) only generates a description similar to human annotation 7% of the time, the performance reaches 61% with fine-tuning and re-ranking, and our best system using GPT-3 Davinci (175B) reaches 76%. We apply our system to describe distribution shifts, debug dataset shortcuts, summarize unknown tasks, and label text clusters, and present analyses based on automatically generated descriptions.	翻訳日:2022-01-31 14:28:41 公開日:2022-01-28
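The re-ranking step above, checking how often each candidate description holds on a larger sample, can be sketched as follows. The scoring rule (hold rate on $D_{1}$ minus hold rate on $D_{0}$) is an assumption made for illustration, and `keyword_verifier` is a toy stand-in for the learned verifier, not the authors' model.

```python
def rerank(descriptions, d0, d1, verifier):
    """Sort candidate descriptions by how much more often they hold on d1 than on d0."""
    def score(desc):
        rate1 = sum(verifier(desc, x) for x in d1) / len(d1)
        rate0 = sum(verifier(desc, x) for x in d0) / len(d0)
        return rate1 - rate0
    return sorted(descriptions, key=score, reverse=True)


def keyword_verifier(desc, text):
    """Toy verifier: treats the description as a keyword and checks substring presence."""
    return desc in text.lower()


d0 = ["the cat sat on the mat", "stocks fell today", "army wins the cup final"]
d1 = ["the army advanced north", "troops and army units regrouped", "the navy sailed"]
ranked = rerank(["the", "army", "stocks"], d0, d1, keyword_verifier)
```

A description that is common in both distributions ("the") scores near zero and is ranked below one that separates them ("army"), which is the behaviour the proposal-then-verify pipeline needs from its second stage.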
# wikipediaはオフラインの強化学習に役立つか? Can Wikipedia Help Offline Reinforcement Learning? ( http://arxiv.org/abs/2201.12122v1 ) ライセンス: Link先を確認	Machel Reid, Yutaro Yamada, Shixiang Shane Gu	(参考訳) 大規模オフザシェルフデータセットの欠如と、異なる環境間の転送可能性のばらつきのため、微調整強化学習(RL)モデルは困難である。最近の研究は、Transformerアーキテクチャの導入により、シーケンスモデリングの観点から、オフラインのRLに取り組むことに注目している。しかし、モデルをスクラッチからトレーニングすると、収束速度が遅くなる。本稿では、この強化学習をシーケンスモデリングとして活用し、オフラインRLタスク(制御、ゲーム)を微調整した場合に、他のドメイン(ビジョン、言語)における事前訓練されたシーケンスモデルの転送可能性を検討する。この目的のために、これらのドメイン間の転送を改善する手法も提案する。結果は,各種環境における収束速度と報酬の両面において一貫したパフォーマンス向上を示し,トレーニングを3～6倍に加速し,WikipediaとGPT2言語モデルを用いた各種タスクにおける最先端のパフォーマンスを達成する。この作業が、汎用シーケンスモデリング技術とrlの事前学習モデルを活用する可能性に光を当てるだけでなく、まったく異なるドメインのジェネレーティブモデリングタスク間の知識共有に関する今後の作業を促すことを期待しています。 Fine-tuning reinforcement learning (RL) models has been challenging because of a lack of large scale off-the-shelf datasets as well as high variance in transferability among different environments. Recent work has looked at tackling offline RL from the perspective of sequence modeling with improved results as result of the introduction of the Transformer architecture. However, when the model is trained from scratch, it suffers from slow convergence speeds. In this paper, we look to take advantage of this formulation of reinforcement learning as sequence modeling and investigate the transferability of pre-trained sequence models on other domains (vision, language) when finetuned on offline RL tasks (control, games). To this end, we also propose techniques to improve transfer between these domains. Results show consistent performance gains in terms of both convergence speed and reward on a variety of environments, accelerating training by 3-6x and achieving state-of-the-art performance in a variety of tasks using Wikipedia-pretrained and GPT2 language models. 
We hope that this work not only sheds light on the potential of leveraging generic sequence modeling techniques and pre-trained models for RL, but also inspires future work on sharing knowledge between generative modeling tasks of completely different domains.	翻訳日:2022-01-31 13:58:00 公開日:2022-01-28
# 説明可能な人工知能を用いた自動設計評価における特徴可視化 Feature Visualization within an Automated Design Assessment leveraging Explainable Artificial Intelligence Methods ( http://arxiv.org/abs/2201.12107v1 ) ライセンス: Link先を確認	Raoul Sch\"onhof and Artem Werner and Jannes Elstner and Boldizsar Zopcsak and Ramez Awad and Marco Huber	(参考訳) 製造プロセスの自動化だけでなく、自動化手順自体の自動化も、自動化研究に益々関係している。この文脈では、3次元CADデータから駆動されるディープラーニングシステムによって主に活用される自動能力評価が提案されている。現在の評価システムはCADデータを抽象的な特徴(例えば、バルクグッズから部品を自動的に分離する機能やグリップ面の存在など)で評価することができる。それでも彼らはブラックボックスシステムの要因に悩まされており、評価を学習し、容易に生成することができるが、システムの決定の理由に関する幾何学的な指標は持っていない。説明可能なAI(xAI)手法を利用することで、ブラックボックスを開こうとする。説明可能なai手法は、ニューラルネットワークが特定のタスクをうまく学習したかどうかを判断したり、入力のどの特徴が敵の攻撃につながるかを分析するために使われています。これらの方法は、与えられた入力からパターンを分析し、ネットワーク出力に与える影響を分析することによって、ニューラルネットワークへのさらなる洞察を導き出すことを目的としている。 NeuroCADプロジェクト内では、ある抽象的特徴に関連する幾何学的特徴を特定するためにxAIメソッドが使用される。本研究の中で、感度分析(sa)、層間関係伝播(lrp)、勾配重み付けクラス活性化マッピング(gradle-weighted class activation mapping:grad-cam)、局所解釈可能なモデル非依存説明(lime)がニューロカド環境に実装されており、cadモデルを評価するだけでなく、ネットワーク決定に関連する特徴を特定することができる。中期的には、製品デザイナがアセンブリプロセスに関してモデルを最適化するための関心領域を特定できるかもしれません。 Not only automation of manufacturing processes but also automation of automation procedures itself become increasingly relevant to automation research. In this context, automated capability assessment, mainly leveraged by deep learning systems driven from 3D CAD data, have been presented. Current assessment systems may be able to assess CAD data with regards to abstract features, e.g. the ability to automatically separate components from bulk goods, or the presence of gripping surfaces. Nevertheless, they suffer from the factor of black box systems, where an assessment can be learned and generated easily, but without any geometrical indicator about the reasons of the system's decision. By utilizing explainable AI (xAI) methods, we attempt to open up the black box. 
Explainable AI methods have been used to assess whether a neural network has successfully learned a given task or to analyze which features of an input might lead to an adversarial attack. These methods aim to derive additional insights into a neural network by analyzing patterns in a given input and their impact on the network output. Within the NeuroCAD Project, xAI methods are used to identify geometrical features which are associated with a certain abstract feature. Within this work, a sensitivity analysis (SA), the layer-wise relevance propagation (LRP), the Gradient-weighted Class Activation Mapping (Grad-CAM) method as well as the Local Interpretable Model-Agnostic Explanations (LIME) have been implemented in the NeuroCAD environment, making it possible not only to assess CAD models but also to identify features which have been relevant for the network decision. In the medium term, this might make it possible to identify regions of interest, supporting product designers in optimizing their models with regard to assembly processes.	翻訳日:2022-01-31 13:57:42 公開日:2022-01-28
# Plug & Playアタック:ロバストでフレキシブルなモデルインバージョンアタックを目指す Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks ( http://arxiv.org/abs/2201.12179v1 ) ライセンス: Link先を確認	Lukas Struppek, Dominik Hintersdorf, Antonio De Almeida Correia, Antonia Adler, Kristian Kersting	(参考訳) モデルインバージョンアタック(MIA)は、モデルの学習知識を活用して、ターゲット分類器のトレーニングデータからクラスワイズ特性を反映した合成画像を作成することを目的としている。従来の研究では、特定のターゲットモデルに適合した画像先行として、GAN(Generative Adversarial Network)を用いた生成MIAを開発した。これにより、攻撃は時間とリソースを消費し、柔軟性がなく、データセット間の分散シフトに影響を受けやすい。これらの欠点を克服するために、ターゲットモデルと画像間の依存性を緩和し、訓練された単一のGANを使用することで、小さな攻撃調整だけで広範囲のターゲットを攻撃できるPlug & Play Attacksを提案する。さらに, 従来の手法では有意な結果が得られなかったのに対して, 事前学習型GANでも強力なMIAが実現可能であることを示す。我々は,プラグイン・アンド・プレイ・アタックの堅牢性と柔軟性の向上と,クラス特性に敏感な高品質な画像を作成する能力を確認した。 Model inversion attacks (MIAs) aim to create synthetic images that reflect the class-wise characteristics from a target classifier's training data by exploiting the model's learned knowledge. Previous research has developed generative MIAs using generative adversarial networks (GANs) as image priors that are tailored to a specific target model. This makes the attacks time- and resource-consuming, inflexible, and susceptible to distributional shifts between datasets. To overcome these drawbacks, we present Plug & Play Attacks that loosen the dependency between the target model and image prior and enable the use of a single trained GAN to attack a broad range of targets with only minor attack adjustments needed. Moreover, we show that powerful MIAs are possible even with publicly available pre-trained GANs and under strong distributional shifts, whereas previous approaches fail to produce meaningful results. Our extensive evaluation confirms the improved robustness and flexibility of Plug & Play Attacks and their ability to create high-quality images revealing sensitive class characteristics.	翻訳日:2022-01-31 13:57:11 公開日:2022-01-28
# 共通破壊に対する3次元点雲認識のロバスト性評価 Benchmarking Robustness of 3D Point Cloud Recognition Against Common Corruptions ( http://arxiv.org/abs/2201.12296v1 ) ライセンス: Link先を確認	Jiachen Sun, Qingzhao Zhang, Bhavya Kailkhura, Zhiding Yu, Chaowei Xiao, and Z. Morley Mao	(参考訳) 3dポイントクラウドデータ上のディープニューラルネットワークは、現実世界、特に安全クリティカルなアプリケーションで広く使われている。しかし、汚職に対する頑丈さは研究されていない。本稿では,15の共通および現実的な腐敗からなる3次元点雲の破壊堅牢性に関する最初の総合的なベンチマークであるModelNet40-Cを提案する。評価の結果,モデルNet40 とモデルNet40-C では,最先端モデル (SOTA) では大きな差がみられた。このギャップを減らすために,多種多様な拡張およびテスト時間適応戦略を評価し,PointCutMix-RとTENTを組み合わせた簡易かつ効果的な手法を提案する。我々は、ポイントクラウド認識における汚職の堅牢性に関する将来の研究に対する重要な洞察を多数特定する。例えば、適切なトレーニングレシピを持つTransformerベースのアーキテクチャは、強力な堅牢性を実現しています。詳細な分析が、3d point cloudドメインにおける堅牢なトレーニング戦略やアーキテクチャ設計の開発を動機付けることを期待しています。私たちのコードベースとデータセットはhttps://github.com/jiachens/ModelNet40-Cに含まれています。 Deep neural networks on 3D point cloud data have been widely used in the real world, especially in safety-critical applications. However, their robustness against corruptions is less studied. In this paper, we present ModelNet40-C, the first comprehensive benchmark on 3D point cloud corruption robustness, consisting of 15 common and realistic corruptions. Our evaluation shows a significant gap between the performances on ModelNet40 and ModelNet40-C for state-of-the-art (SOTA) models. To reduce the gap, we propose a simple but effective method by combining PointCutMix-R and TENT after evaluating a wide range of augmentation and test-time adaptation strategies. We identify a number of critical insights for future studies on corruption robustness in point cloud recognition. For instance, we unveil that Transformer-based architectures with proper training recipes achieve the strongest robustness. We hope our in-depth analysis will motivate the development of robust training strategies or architecture designs in the 3D point cloud domain. Our codebase and dataset are included in https://github.com/jiachens/ModelNet40-C	翻訳日:2022-01-31 13:56:55 公開日:2022-01-28
# (参考訳) 医用画像セグメンテーションのためのクラス認識型敵対的生成トランスフォーマー Class-Aware Generative Adversarial Transformers for Medical Image Segmentation ( http://arxiv.org/abs/2201.10737v2 ) ライセンス: CC BY 4.0	Chenyu You, Ruihan Zhao, Fenglin Liu, Sandeep Chinchali, Ufuk Topcu, Lawrence Staib, James S. Duncan	(参考訳) トランスフォーマーは、医用画像解析領域における長距離依存関係のモデリングにおいて著しい進歩を遂げてきた。しかし、現在のトランスフォーマーベースのモデルにはいくつかの欠点がある。(1)既存手法は単純なトークン化方式のために画像の重要な特徴を捉えられない。(2)単一スケールの特徴表現しか考慮しないため、情報損失が生じる。(3)豊富な意味的文脈や解剖学的テクスチャを考慮しないため、モデルが生成するセグメンテーションラベルマップの精度が十分でない。本稿では、医用画像セグメンテーションのための新しいタイプの敵対的生成トランスフォーマーであるCA-GANformerを提案する。まず、ピラミッド構造を利用してマルチスケール表現を構築し、マルチスケールの変動に対処する。次に、意味構造を持つ物体の識別的領域をよりよく学習するために、新しいクラス認識型トランスフォーマーモジュールを設計する。最後に、セグメンテーション精度を高めるとともに、トランスフォーマーベースの識別器が高レベルの意味的に相関した内容と低レベルの解剖学的特徴を捉えられるようにする敵対的訓練戦略を利用する。実験の結果、CA-GANformerは3つのベンチマークで従来の最先端のトランスフォーマーベース手法を大幅に上回り、従来モデルに対してDiceで2.54%-5.88%の絶対的な改善を得た。さらに、定性的な実験によってモデルの内部動作がより詳細に示され、透明性向上における課題が明らかになるとともに、転移学習が性能を大きく向上させ、訓練に必要な医用画像データセットの規模を削減できることが示された。これにより、CA-GANformerは下流の医用画像解析タスクの強力な出発点となる。コードとモデルは一般公開される予定である。 Transformers have made remarkable progress towards modeling long-range dependencies within the medical image analysis domain. However, current transformer-based models suffer from several disadvantages: (1) existing methods fail to capture the important features of the images due to the naive tokenization scheme; (2) the models suffer from information loss because they only consider single-scale feature representations; and (3) the segmentation label maps generated by the models are not accurate enough without considering rich semantic contexts and anatomical textures. In this work, we present CA-GANformer, a novel type of generative adversarial transformers, for medical image segmentation. First, we take advantage of the pyramid structure to construct multi-scale representations and handle multi-scale variations. We then design a novel class-aware transformer module to better learn the discriminative regions of objects with semantic structures. Lastly, we utilize an adversarial training strategy that boosts segmentation accuracy and correspondingly allows a transformer-based discriminator to capture high-level semantically correlated contents and low-level anatomical features. Our experiments demonstrate that CA-GANformer dramatically outperforms previous state-of-the-art transformer-based approaches on three benchmarks, obtaining 2.54%-5.88% absolute improvements in Dice over previous models. Further qualitative experiments provide a more detailed picture of the model's inner workings, shed light on the challenges in improved transparency, and demonstrate that transfer learning can greatly improve performance and reduce the size of medical image datasets in training, making CA-GANformer a strong starting point for downstream medical image analysis tasks. Codes and models will be available to the public.	翻訳日:2022-01-31 13:24:30 公開日:2022-01-28
# (参考訳) vision checklist: 画像モデルのテスト可能なエラー解析に向けて - システム設計者がモデルの能力に疑問を呈するのに役立つ Vision Checklist: Towards Testable Error Analysis of Image Models to Help System Designers Interrogate Model Capabilities ( http://arxiv.org/abs/2201.11674v2 ) ライセンス: CC BY 4.0	Xin Du, Benedicte Legastelois, Bhargavi Ganesh, Ajitha Rajan, Hana Chockler, Vaishak Belle, Stuart Anderson, Subramanian Ramamoorthy	(参考訳) 視覚トランスフォーマーなどの最近のモデルや、vggやresnetといったcnnベースのモデルの成功により、画像認識タスクに大規模な事前訓練済みモデルを使用することが増えている。ベンチマークタスクにおけるこれらのモデルの高精度さは、自動運転や医療診断のような安全クリティカルなアプリケーションを含む、多くのドメインで実用化されている。広く使われているにもかかわらず、画像モデルは運用環境の変化に弱いことが示され、その堅牢性に疑問が呈されている。設計者が安全性と堅牢性を理解し、保証するために、これらのモデルの能力を体系的に特徴付け、定量化する手法が緊急に必要である。本稿では,システム設計者がロバスト性評価に使用できるレポートを作成するために,モデルの能力を問うことを目的としたフレームワークであるvision checklistを提案する。このフレームワークは、異なるタイプのテストサンプルを生成するために基礎となるデータに適用できる一連の摂動操作を提案する。摂動は運用環境の潜在的な変化を反映し、厳密な量から質的な性質まで様々な特性を問う。我々のフレームワークは、Tinyimagenet、CIFAR10、CIFAR100、Camelyon17のような複数のデータセットと、ViTやResnetのようなモデルで評価されている。われわれのvision checklistは、モデルカードのコンセプトに組み込むことのできる、特定の評価セットを提案している。私たちのチェックリストのようなロバストネス評価は、視覚認識モジュールの将来の安全性評価に不可欠であり、これらのシステムの認証に関わるデザイナー、デプロイ者、規制官を含む幅広い利害関係者に役立ちます。 Vision Checklistのソースコードは一般に公開されている。 Using large pre-trained models for image recognition tasks is becoming increasingly common owing to the well acknowledged success of recent models like vision transformers and other CNN-based models like VGG and Resnet. The high accuracy of these models on benchmark tasks has translated into their practical use across many domains including safety-critical applications like autonomous driving and medical diagnostics. Despite their widespread use, image models have been shown to be fragile to changes in the operating environment, bringing their robustness into question. There is an urgent need for methods that systematically characterise and quantify the capabilities of these models to help designers understand and provide guarantees about their safety and robustness. 
In this paper, we propose Vision Checklist, a framework aimed at interrogating the capabilities of a model in order to produce a report that can be used by a system designer for robustness evaluations. This framework proposes a set of perturbation operations that can be applied on the underlying data to generate test samples of different types. The perturbations reflect potential changes in operating environments, and interrogate various properties ranging from the strictly quantitative to more qualitative. Our framework is evaluated on multiple datasets like Tinyimagenet, CIFAR10, CIFAR100 and Camelyon17 and for models like ViT and Resnet. Our Vision Checklist proposes a specific set of evaluations that can be integrated into the previously proposed concept of a model card. Robustness evaluations like our checklist will be crucial in future safety evaluations of visual perception modules, and be useful for a wide range of stakeholders including designers, deployers, and regulators involved in the certification of these systems. Source code of Vision Checklist would be open for public use.	翻訳日:2022-01-31 12:58:08 公開日:2022-01-28
# (参考訳) 複数インスタンス学習におけるモデル非依存解釈可能性 Model Agnostic Interpretability for Multiple Instance Learning ( http://arxiv.org/abs/2201.11701v2 ) ライセンス: CC BY 4.0	Joseph Early, Christine Evers and Sarvapali Ramchurn	(参考訳) 複数のインスタンス学習(mil:multiple instance learning)では、モデルは、各バッグに単一のラベルのみを提供する、インスタンスの袋を使ってトレーニングされる。バッグラベルは、しばしばバッグ内の一握りのキーインスタンスによってのみ決定されるため、分類器が意思決定に使用する情報を理解するのが困難である。本研究では,MILモデルを解釈するための重要な要件を確立する。次に、これらの要件を満たすモデルに依存しないアプローチをいくつか開発します。提案手法は,複数のデータセット上の既存の解釈可能なMILモデルと比較し,解釈可能性の精度を最大30%向上させる。また、インスタンス間の相互作用を識別し、より大きなデータセットにスケールする手法の能力を検証し、実世界の問題への適用性を向上させる。 In Multiple Instance Learning (MIL), models are trained using bags of instances, where only a single label is provided for each bag. A bag label is often only determined by a handful of key instances within a bag, making it difficult to interpret what information a classifier is using to make decisions. In this work, we establish the key requirements for interpreting MIL models. We then go on to develop several model-agnostic approaches that meet these requirements. Our methods are compared against existing inherently interpretable MIL models on several datasets, and achieve an increase in interpretability accuracy of up to 30%. We also examine the ability of the methods to identify interactions between instances and scale to larger datasets, improving their applicability to real-world problems.	翻訳日:2022-01-31 12:43:20 公開日:2022-01-28
# DiscoScore: BERT と談話コヒーレンスによるテキスト生成の評価 DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence ( http://arxiv.org/abs/2201.11176v2 ) ライセンス: Link先を確認	Wei Zhao, Michael Strube, Steffen Eger	(参考訳) 近年、文間の相互依存のモデル化など、談話コヒーレンスの観点からテキスト生成システムを設計することへの関心が高まっている。それにもかかわらず、最近のBERTベースの評価指標はコヒーレンスを認識できず、システム出力の非コヒーレントな要素にペナルティを与えることができない。本研究では、センタリング理論に基づき、BERTを用いて談話コヒーレンスを様々な観点からモデル化する、パラメータ化された談話指標DiscoScoreを導入する。実験では、DiscoScoreや一般的なコヒーレンスモデルを含む16個の非談話・談話指標を、要約と文書レベル機械翻訳(MT)で評価した。その結果、(i)BERTベースの指標の大部分は、10年前に考案された初期の談話指標よりも、人手で評価されたコヒーレンスとの相関がはるかに低いこと、(ii)最近の最先端指標であるBARTScoreはシステムレベルで用いると弱いこと、が分かった。システムは通常この方法で比較されるため、後者は特に問題である。対照的にDiscoScoreは、コヒーレンスだけでなく事実的一貫性やその他の側面においても人間の評価と強いシステムレベルの相関を達成し、BARTScoreを平均で10相関ポイント以上上回る。さらに、DiscoScoreの理解を目的として、評価指標における談話コヒーレンスの重要性の根拠を示し、ある変種が他の変種より優れている理由を説明する。コードは \url{https://github.com/AIPHES/DiscoScore} で利用可能である。 Recently, there has been a growing interest in designing text generation systems from a discourse coherence perspective, e.g., modeling the interdependence between sentences. Still, recent BERT-based evaluation metrics cannot recognize coherence and fail to punish incoherent elements in system outputs. In this work, we introduce DiscoScore, a parametrized discourse metric, which uses BERT to model discourse coherence from different perspectives, driven by Centering theory. Our experiments encompass 16 non-discourse and discourse metrics, including DiscoScore and popular coherence models, evaluated on summarization and document-level machine translation (MT). We find that (i) the majority of BERT-based metrics correlate much worse with human rated coherence than early discourse metrics, invented a decade ago; (ii) the recent state-of-the-art BARTScore is weak when operated at system level -- which is particularly problematic as systems are typically compared in this manner. DiscoScore, in contrast, achieves strong system-level correlation with human ratings, not only in coherence but also in factual consistency and other aspects, and surpasses BARTScore by over 10 correlation points on average. Further, aiming to understand DiscoScore, we provide justifications to the importance of discourse coherence for evaluation metrics, and explain the superiority of one variant over another. Our code is available at \url{https://github.com/AIPHES/DiscoScore}.	翻訳日:2022-01-31 12:17:30 公開日:2022-01-28
# 言語間自動音声認識による音声インベントリの発見 Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition ( http://arxiv.org/abs/2201.11207v2 ) ライセンス: Link先を確認	Piotr \.Zelasko, Siyuan Feng, Laureano Moro Velazquez, Ali Abavisani, Saurabhchand Bhati, Odette Scharenborg, Mark Hasegawa-Johnson, Najim Dehak	(参考訳) データ取得のコストが高いため、自動音声認識(ASR)モデルの訓練は、文字体系を持たない言語や単音(phone)インベントリが未知の言語を含む、現存する大半の言語で困難である。これまでの研究では、こうした低リソース言語向けのASRシステムを構築するために、多言語学習、転移学習、ゼロショット学習が検討されてきた。複数言語のリソースをプールすることが有用であることは示されているが、訓練時に未見の言語へASRモデルを適用して成功した例はまだない。既知の言語から未見の言語へASRを適応させるうえで重要なステップは、未見言語の単音インベントリの作成である。我々の研究の最終的な目標は、訓練時に未見の言語の単音インベントリを、その言語に関する知識を一切用いずに教師なしで構築することである。本稿では、1)さまざまな要因(モデルアーキテクチャ、音素配列モデル、音声表現の種類)が未知言語の単音認識に与える影響を調査し、2)単音インベントリ自動作成の限界と今後の改善点を理解するために、どの単音が言語間でうまく転移し、どの単音が転移しないかの分析を提供し、3)未見言語の単音インベントリを教師なしで構築するための複数の手法を提示する。そのために、音声学的に多様な13言語を対象に、単言語、多言語、言語間の実験といくつかの詳細な分析を行った。その結果、言語横断的によく認識される普遍的な単音トークン(IPA記号)が多数見つかった。結果の詳細な分析から、固有の音、類似した音、声調言語が、音声インベントリの発見において依然として大きな課題であると結論づける。 The high cost of data acquisition makes Automatic Speech Recognition (ASR) model training problematic for most existing languages, including languages that do not even have a written script, or for which the phone inventories remain unknown. Past works explored multilingual training, transfer learning, as well as zero-shot learning in order to build ASR systems for these low-resource languages. While it has been shown that the pooling of resources from multiple languages is helpful, we have not yet seen a successful application of an ASR model to a language unseen during training. A crucial step in the adaptation of ASR from seen to unseen languages is the creation of the phone inventory of the unseen language. The ultimate goal of our work is to build the phone inventory of a language unseen during training in an unsupervised way without any knowledge about the language. In this paper, we 1) investigate the influence of different factors (i.e., model architecture, phonotactic model, type of speech representation) on phone recognition in an unknown language; 2) provide an analysis of which phones transfer well across languages and which do not in order to understand the limitations of and areas for further improvement for automatic phone inventory creation; and 3) present different methods to build a phone inventory of an unseen language in an unsupervised way. To that end, we conducted mono-, multi-, and crosslingual experiments on a set of 13 phonetically diverse languages and several in-depth analyses. We found a number of universal phone tokens (IPA symbols) that are well-recognized cross-linguistically. Through a detailed analysis of results, we conclude that unique sounds, similar sounds, and tone languages remain a major challenge for phonetic inventory discovery.	翻訳日:2022-01-31 12:17:02 公開日:2022-01-28
# コールドスタート推薦のためのスパース性正則化 Sparsity Regularization For Cold-Start Recommendation ( http://arxiv.org/abs/2201.10711v3 ) ライセンス: Link先を確認	Aksheshkumar Ajaykumar Shah and Hemanth Venkateswara	(参考訳) 近年、コールドスタート推薦問題に敵対的生成ネットワーク(GAN)が適用されているが、これらのモデルの訓練性能は、ウォームユーザの購買行動が極めてスパースであることによって妨げられている。本稿では、ユーザの人口統計情報とユーザの嗜好を組み合わせた、ユーザベクトルの新しい表現を導入する。これにより、モデルは協調フィルタリングとコンテンツベース推薦を併用するハイブリッドシステムとなる。本システムは、二値のユーザ-商品インタラクション(暗黙的フィードバック)ではなく、重み付きのユーザ-商品嗜好(明示的フィードバック)を用いてユーザの購買行動をモデル化する。これを用いて、スパースなユーザ購買行動を活用し、訓練の安定性を確保しつつウォームユーザへの過学習を回避する、コールドスタート推薦のための新しいスパース敵対的モデルSRLGANを開発する。SRLGANを2つの一般的なデータセットで評価し、最先端の結果を示す。 Recently, Generative Adversarial Networks (GANs) have been applied to the problem of Cold-Start Recommendation, but the training performance of these models is hampered by the extreme sparsity in warm user purchase behavior. In this paper we introduce a novel representation for user-vectors by combining user demographics and user preferences, making the model a hybrid system which uses Collaborative Filtering and Content Based Recommendation. Our system models user purchase behavior using weighted user-product preferences (explicit feedback) rather than binary user-product interactions (implicit feedback). Using this we develop a novel sparse adversarial model, SRLGAN, for Cold-Start Recommendation leveraging the sparse user-purchase behavior which ensures training stability and avoids over-fitting on warm users. We evaluate the SRLGAN on two popular datasets and demonstrate state-of-the-art results.	翻訳日:2022-01-31 12:16:37 公開日:2022-01-28

# 小さな葉/花弁を持つ大きな星/バラの余剰次元

Large Star/Rose Extra Dimension with Small Leaves/Petals ( http://arxiv.org/abs/2001.07102v5 )

ライセンス: Link先を確認

Florian Nortier

(参考訳) 本稿では、多数の同一の葉/花弁を持つ星/バラグラフ上に、1つの大きな余剰次元(LED)をコンパクト化することを提案する。5次元プランクスケールは$\Lambda_P^{(5)} \sim \mathcal{O}(1)$ TeVに選ぶことができ、ゲージ階層性問題を解く道筋を与える。葉/花弁の長さスケールは$\mathcal{O}(1/\Lambda_{EW})$であり($\Lambda_{EW} \sim 100$ GeVは弱スケール)、従来のLEDモデルのように大きな幾何学的階層を安定化させる必要がない。SMの4次元場は、星/バラグラフの中心頂点にある3-ブレーン上に局在する。我々は、TeVスケールより上の強結合重力現象の領域の下に、微弱に結合した弱スケールのカルツァ・クライン(KK)重力子の塔が現れることを予言する。さらに、軽いディラックニュートリノを生成するLED機構を本設定で再定式化する。ここで右巻きニュートリノは、バルク内を伝播するゲージ一重項フェルミオンのKKモードである。多数のKK重力子とKKニュートリノは重力相互作用のみを行い、隠れたセクターを構成する。

In this paper, we propose to compactify a single large extra dimension (LED) on a star/rose graph with a large number of identical leaves/petals. The 5D Planck scale can be chosen to be $\Lambda_P^{(5)} \sim \mathcal{O}(1)$ TeV which can provide a path to solve the gauge hierarchy problem. The leaf/petal length scale is of $\mathcal{O}(1/\Lambda_{EW})$, where $\Lambda_{EW} \sim 100$ GeV is the weak scale, without the large geometrical hierarchy of the traditional LED models to stabilize. The 4D fields of the SM are localized on a 3-brane at the central vertex of the star/rose graph. We predict a tower of feebly coupled weak scale Kaluza-Klein (KK) gravitons below a regime of strongly coupled gravitational phenomena above the TeV scale. Moreover, we reformulate in our setup the LED mechanism to generate light Dirac neutrinos, where the right-handed neutrinos are KK-modes of gauge singlet fermions propagating in the bulk. A large number of KK-gravitons and KK-neutrinos interact only gravitationally, and thus constitute a hidden sector.

翻訳日:2023-06-07 06:19:38 公開日:2022-01-28

# 二重確率作用素と量子確率変数

Bistochastic operators and quantum random variables ( http://arxiv.org/abs/2005.00005v2 )

ライセンス: Link先を確認

Sarah Plosker and Christopher Ramsey

(参考訳) 局所コンパクトハウスドルフ空間$X$のボレル集合上に作用し、(無限次元でもよい)ヒルベルト空間$\mathcal H$上のすべての有界作用素のなす代数$\mathcal B(\mathcal H)$に値をとる正作用素値測度$\nu$が与えられたとき、正の量子確率変数である$\nu$-可積分関数$X\rightarrow \mathcal B(\mathcal H)$を考えることができる。そのような関数の張る空間上に半ノルムを定義し、その商としてバナッハ空間を得る。この空間に作用する二重確率作用素を考え、これらの作用素に関して量子確率変数のマジョライゼーションを定義する。古典的なマジョライゼーション理論と同様に、この文脈でのマジョライゼーションを、ある種の凸関数すべてにわたる不等式と関係づける。古典的な設定とは異なり、連続性と収束性の問題が全体を通じて生じる。

Given a positive operator-valued measure $\nu$ acting on the Borel sets of a locally compact Hausdorff space $X$, with outcomes in the algebra $\mathcal B(\mathcal H)$ of all bounded operators on a (possibly infinite-dimensional) Hilbert space $\mathcal H$, one can consider $\nu$-integrable functions $X\rightarrow \mathcal B(\mathcal H)$ that are positive quantum random variables. We define a seminorm on the span of such functions which in the quotient leads to a Banach space. We consider bistochastic operators acting on this space and majorization of quantum random variables is then defined with respect to these operators. As in classical majorization theory, we relate majorization in this context to an inequality involving all possible convex functions of a certain type. Unlike the classical setting, continuity and convergence issues arise throughout the work.

翻訳日:2023-05-21 17:04:51 公開日:2022-01-28

# 量子アンサンブルのゲスワーク

Guesswork of a quantum ensemble ( http://arxiv.org/abs/2012.09350v2 )

ライセンス: Link先を確認

Michele Dall'Arno, Francesco Buscemi, Takeshi Koshiba

(参考訳) 量子アンサンブルのゲスワークは、1回に1つの状態しか問い合わせられない場合に、アンサンブルの状態を正しく言い当てるために平均して必要となる推測回数の最小値を定量化する。ここでは、有限個の条件のもとでゲスワーク問題の解析解を導出する。これには、一様確率分布を持つ任意の量子ビットアンサンブルに対する解析解が含まれる。具体例として、任意の量子ビット正多角形アンサンブルおよび正多面体アンサンブルのゲスワークを計算する。

The guesswork of a quantum ensemble quantifies the minimum number of guesses needed in average to correctly guess the state of the ensemble, when only one state can be queried at a time. Here, we derive analytical solutions of the guesswork problem subject to a finite set of conditions, including the analytical solution for any qubit ensemble with uniform probability distribution. As explicit examples, we compute the guesswork for any qubit regular polygonal and polyhedral ensemble.

翻訳日:2023-04-20 08:43:25 公開日:2022-01-28

# 空間自己相関の存在下でのバイアスの検出

Detecting Bias in the Presence of Spatial Autocorrelation ( http://arxiv.org/abs/2101.01703v3 )

ライセンス: Link先を確認

Subhabrata Majumdar, Cheryl Flynn, Ritwik Mitra

(参考訳) 実用上きわめて重要であるにもかかわらず、現在のアルゴリズム的公平性の文献には、空間データのバイアス問題を評価または緩和する際に、背後にある地理的依存性を考慮するための技術的手法が欠けている。本稿では、空間的応用におけるバイアスの研究に着手し、この種の定量的手法の定式化に向けた第一歩を踏み出す。空間データ応用におけるバイアスは、背後にある空間的自己相関によってしばしば交絡される。我々は、この効果の存在と強さを検出する仮説検定手法を提案し、さらに空間フィルタリングに基づくアプローチでこの効果を補正することで、既存のバイアス検出指標を適用できるようにする。提案手法を実データおよび合成データを用いた数値実験で評価し、背後の空間構造に起因する複数種類の交絡効果が存在する場合でも、検定手法が第二種の過誤を低く抑え、第一種の過誤を名目水準に保つことを示す。

In spite of considerable practical importance, current algorithmic fairness literature lacks technical methods to account for underlying geographic dependency while evaluating or mitigating bias issues for spatial data. We initiate the study of bias in spatial applications in this paper, taking the first step towards formalizing this line of quantitative methods. Bias in spatial data applications often gets confounded by underlying spatial autocorrelation. We propose hypothesis testing methodology to detect the presence and strength of this effect, then account for it by using a spatial filtering-based approach -- in order to enable application of existing bias detection metrics. We evaluate our proposed methodology through numerical experiments on real and synthetic datasets, demonstrating that in the presence of several types of confounding effects due to the underlying spatial structure our testing methods perform well in maintaining low type-II errors and nominal type-I errors.

翻訳日:2023-04-17 19:48:34 公開日:2022-01-28

# 位相空間における操作的理論:調和振動子のトイモデル

Operational Theories in Phase Space: Toy Model for the Harmonic Oscillator ( http://arxiv.org/abs/2101.08323v2 )

ライセンス: Link先を確認

Martin Pl\'avala, Matthias Kleinmann

(参考訳) 位置と運動量に依存するエネルギー観測量を含む一般確率論の構成方法を示す。この構成は古典論および量子論と整合しており、位置、運動量、エネルギーの確率分布などの物理的予測を可能にする。古典的でも量子的でもない調和振動子のトイモデルを定式化することで、この構成を例証する。このモデルは、離散的なエネルギースペクトル、位置と運動量がともに確定した基底状態、非正のウィグナー関数を持つ固有状態、およびトンネル特性を持つ状態を備えている。このトイモデルは、操作的理論が物理理論を定式化するための有効な代替アプローチになりうることを示している。

We show how to construct general probabilistic theories that contain an energy observable dependent on position and momentum. The construction is in accordance with classical and quantum theory and allows for physical predictions, such as the probability distribution for position, momentum and energy. We demonstrate the construction by formulating a toy model for the harmonic oscillator that is neither classical nor quantum. The model features a discrete energy spectrum, a ground state with sharp position and momentum, an eigenstate with non-positive Wigner function as well as a state that has tunneling properties. The toy model demonstrates that operational theories can be a viable alternative approach for formulating physical theories.

翻訳日:2023-04-14 11:07:57 公開日:2022-01-28

# 量子リピータセグメントの絡み合い接続のためのメモリ拡張スケーリングの実験的検討

Experimental demonstration of memory-enhanced scaling for entanglement connection of quantum repeater segments ( http://arxiv.org/abs/2101.08541v3 )

ライセンス: Link先を確認

Yunfei Pu, Sheng Zhang, Yukai Wu, Nan Jiang, Wei Chang, Chang Li and Luming Duan

(参考訳) 量子リピータプロトコルは、長距離量子通信と大規模量子ネットワークを実現するための有望なアプローチである。量子リピータプロトコルの鍵となる考え方は、長寿命の量子メモリを用いて、異なるリピータセグメント間の効率的なエンタングルメント接続を多項式スケーリングで実現することである。本稿では、数十ミリ秒の記憶時間を持つ2つの原子量子メモリを用い、オンデマンドのエンタングルメントスワッピングによって2つの量子リピータセグメントを効率的に接続する実験を報告する。メモリによる増強により、エンタングルメント接続の成功レートにおいて、スケーリングが変化する加速が実証される。効率的なメモリ増強スケーリングによる2つの量子リピータセグメントのエンタングルメント接続の実験的実現は、量子リピータプロトコルの重要な利点を実証するものであり、将来の大規模量子ネットワークに向けた礎となる。

The quantum repeater protocol is a promising approach to implement long-distance quantum communication and large-scale quantum networks. A key idea of the quantum repeater protocol is to use long-lived quantum memories to achieve efficient entanglement connection between different repeater segments with a polynomial scaling. Here we report an experiment which realizes efficient connection of two quantum repeater segments via on-demand entanglement swapping by the use of two atomic quantum memories with storage time of tens of milliseconds. With the memory enhancement, scaling-changing acceleration is demonstrated in the rate for a successful entanglement connection. The experimental realization of entanglement connection of two quantum repeater segments with an efficient memory-enhanced scaling demonstrates a key advantage of the quantum repeater protocol, which makes a cornerstone towards future large-scale quantum networks.

翻訳日:2023-04-14 08:37:03 公開日:2022-01-28

# 高次元ガウスボゾンサンプリングによる量子計算の優位性

Quantum Computational Advantage via High-Dimensional Gaussian Boson Sampling ( http://arxiv.org/abs/2102.12474v3 )

ライセンス: Link先を確認

Abhinav Deshpande, Arthur Mehta, Trevor Vincent, Nicolas Quesada, Marcel Hinsche, Marios Ioannou, Lars Madsen, Jonathan Lavoie, Haoyu Qi, Jens Eisert, Dominik Hangleiter, Bill Fefferman, Ish Dhand

(参考訳) フォトニクスは、明確に定義された計算タスクで最も強力な古典スーパーコンピュータを上回ることにより量子計算優位性(QCA)を実証するための有望なプラットフォームである。この期待にもかかわらず、既存の提案と実証は課題に直面している。実験面では、ガウスボソンサンプリング(GBS)の現在の実装はプログラマビリティを欠くか、損失率が許容できないほど高い。理論面では、GBSの古典的困難性に対する厳密な証拠が比較的不足している。本研究では、理論的証拠と実験的見通しの両方を改善する。我々は、QCAに関する最も強力な理論的提案に匹敵する、GBSの困難性の証拠を与える。また、高次元GBSと呼ぶ新しいQCAアーキテクチャを提案する。これはプログラマブルであり、少数の光学部品を用いて低損失で実装できる。GBSをシミュレートする特定のアルゴリズムが、控えめなシステムサイズの高次元GBS実験に性能で及ばないことを示す。このように本研究は、プログラマブルなフォトニックプロセッサによるQCAの実証への道を開く。

Photonics is a promising platform for demonstrating a quantum computational advantage (QCA) by outperforming the most powerful classical supercomputers on a well-defined computational task. Despite this promise, existing proposals and demonstrations face challenges. Experimentally, current implementations of Gaussian boson sampling (GBS) lack programmability or have prohibitive loss rates. Theoretically, there is a comparative lack of rigorous evidence for the classical hardness of GBS. In this work, we make progress in improving both the theoretical evidence and experimental prospects. We provide evidence for the hardness of GBS, comparable to the strongest theoretical proposals for QCA. We also propose a new QCA architecture we call high-dimensional GBS, which is programmable and can be implemented with low loss using few optical components. We show that particular algorithms for simulating GBS are outperformed by high-dimensional GBS experiments at modest system sizes. This work thus opens the path to demonstrating QCA with programmable photonic processors.

翻訳日:2023-04-10 00:57:51 公開日:2022-01-28

# コヒーレントイジングマシンによる基底状態および低エネルギーのイジングスピン配置の効率的なサンプリング

Efficient sampling of ground and low-energy Ising spin configurations with a coherent Ising machine ( http://arxiv.org/abs/2103.05629v2 )

ライセンス: Link先を確認

Edwin Ng, Tatsuhiro Onodera, Satoshi Kako, Peter L. McMahon, Hideo Mabuchi, Yoshihisa Yamamoto

(参考訳) 量子ノイズの存在下での測定フィードバック型コヒーレントイジングマシン(MFB-CIM)の非線形確率ダイナミクスを利用して、イジングモデルの縮退した基底状態および低エネルギーのスピン配置をサンプリングできることを示す。我々は、システムの閾値およびそれ以上で現れる非線形ダイナミクスを忠実に捉える、MFB-CIMの一般的な離散時間ガウス状態モデルを定式化する。このモデルは、量子ノイズを無視する平均場モデルと、長い光子寿命を仮定する連続時間モデルの双方の限界を克服する。このモデルの数値シミュレーションから、MFB-CIMを光子寿命が短い(すなわち共振器フィネスが低い)量子ノイズ支配領域で動作させると、システムのホモダインモニタリングによって低エネルギーのイジングスピン配置のサンプルを効率的に生成でき、サンプリングに要するラウンドトリップ数は、確立された高フィネス連続時間モデルが示唆するよりもはるかに少ないことが分かる。サンプリング性能は、パラメトリック駆動をオフにする、あるいはその符号を完全に反転させることに対して頑健であるか、むしろ向上するが、光学非線形性がない場合には性能が著しく低下する。二値符号のエッジ重みを持つMAX-CUT問題のクラスでは、すべての縮退を含めて第一励起イジングエネルギーまでの全スピン配置を完全にサンプリングするのに十分なラウンドトリップ数は$1.08^N$でスケールする。問題サイズ$N = 100$で、インスタンスごとに数十個(中央値20)のそのような所望の配置がある場合、十分なサンプリング時間の中央値は$6\times10^6$ラウンドトリップであった。繰り返しレート10 GHzのMFB-CIMの実験的実装では、これは実時間で60 msのサンプリング時間に相当する。

We show that the nonlinear stochastic dynamics of a measurement-feedback-based coherent Ising machine (MFB-CIM) in the presence of quantum noise can be exploited to sample degenerate ground and low-energy spin configurations of the Ising model. We formulate a general discrete-time Gaussian-state model of the MFB-CIM which faithfully captures the nonlinear dynamics present at and above system threshold. This model overcomes the limitations of both mean-field models, which neglect quantum noise, and continuous-time models, which assume long photon lifetimes. Numerical simulations of our model show that when the MFB-CIM is operated in a quantum-noise-dominated regime with short photon lifetimes (i.e., low cavity finesse), homodyne monitoring of the system can efficiently produce samples of low-energy Ising spin configurations, requiring many fewer roundtrips to sample than suggested by established high-finesse, continuous-time models. We find that sampling performance is robust to, or even improved by, turning off or altogether reversing the sign of the parametric drive, but performance is critically reduced in the absence of optical nonlinearity. For the class of MAX-CUT problems with binary-signed edge weights, the number of roundtrips sufficient to fully sample all spin configurations up to the first-excited Ising energy, including all degeneracies, scales as $1.08^N$. At a problem size of $N = 100$ with a few dozen (median of 20) such desired configurations per instance, we have found median sufficient sampling times of $6\times10^6$ roundtrips; in an experimental implementation of an MFB-CIM with a 10 GHz repetition rate, this corresponds to a wall-clock sampling time of 60 ms.

翻訳日:2023-04-08 15:40:15 公開日:2022-01-28

# 超低メカニカル散逸による階層的引張構造

Hierarchical tensile structures with ultralow mechanical dissipation ( http://arxiv.org/abs/2103.09785v3 )

ライセンス: Link先を確認

Mohammad J. Bereyhi, Alberto Beccari, Robin Groth, Sergey A. Fedorov, Amirali Arabmoheghi, Tobias J. Kippenberg, Nils J. Engelsen

(参考訳) 構造的階層は無数の生物系に見られ、エッフェル塔から光共振器に至るまで人工構造物を改良してきた。階層的メタマテリアルは、複数のサイズスケールにわたる構造を利用して、構成材料とは著しく異なりうる、新しく非常に望ましい特性を実現する。剛性が静的な張力によって与えられる機械共振器では、構造的階層により、非従来型のソフトクランピングを通じて基本モードの散逸を超低レベルまで低減できる。本稿では、窒化ケイ素ナノメカニカル共振器に階層的設計を適用し、周波数107 kHzで$10^9$に達するQ値を持つ二分木形状の共振器を実現し、浮揚粒子のパラメータ領域に到達した。共振器の熱雑音で制限された力感度は、室温で$740\ \mathrm{zN/\sqrt{Hz}}$、6 Kで$\mathrm{90\ zN/\sqrt{Hz}}$に達し、力顕微鏡に現在用いられている最先端のカンチレバーを上回る。また、二分木共振器の自己相似構造が、フラクタル幾何の特徴である分数スペクトル次元をもたらすことも見いだした。さらに、階層的設計原理を2次元トランポリン膜に拡張できることを示し、ファブリ・ペロー共振器での干渉計による位置測定に適した超低散逸膜を作製した。階層的ナノメカニカル共振器は、低散逸が最重要であり基本モードでの動作がしばしば有利となる、力センシング、信号変換、量子オプトメカニクスに新たな道を開く。

Structural hierarchy is found in myriad biological systems and has improved man-made structures ranging from the Eiffel tower to optical cavities. Hierarchical metamaterials utilize structure at multiple size scales to realize new and highly desirable properties which can be strikingly different from those of the constituent materials. In mechanical resonators whose rigidity is provided by static tension, structural hierarchy can reduce the dissipation of the fundamental mode to ultralow levels due to an unconventional form of soft clamping. Here, we apply hierarchical design to silicon nitride nanomechanical resonators and realize binary tree-shaped resonators with quality factors as high as $10^9$ at 107 kHz frequency, reaching the parameter regime of levitated particles. The resonators' thermal-noise-limited force sensitivities reach $740\ \mathrm{zN/\sqrt{Hz}}$ at room temperature and $\mathrm{90\ zN/\sqrt{Hz}}$ at 6 K, surpassing state-of-the-art cantilevers currently used for force microscopy. We also find that the self-similar structure of binary tree resonators results in fractional spectral dimensions, which is characteristic of fractal geometries. Moreover, we show that the hierarchical design principles can be extended to 2D trampoline membranes, and we fabricate ultralow dissipation membranes suitable for interferometric position measurements in Fabry-P\'erot cavities. Hierarchical nanomechanical resonators open new avenues in force sensing, signal transduction and quantum optomechanics, where low dissipation is paramount and operation with the fundamental mode is often advantageous.

翻訳日:2023-04-07 21:11:33 公開日:2022-01-28

# シリコン中の3量子ビットドナー量子プロセッサの精密トモグラフィー

Precision tomography of a three-qubit donor quantum processor in silicon ( http://arxiv.org/abs/2106.03082v3 )

ライセンス: Link先を確認

Mateusz T. M\k{a}dzik, Serwan Asaad, Akram Youssry, Benjamin Joecker, Kenneth M. Rudinger, Erik Nielsen, Kevin C. Young, Timothy J. Proctor, Andrew D. Baczewski, Arne Laucht, Vivien Schmitt, Fay E. Hudson, Kohei M. Itoh, Alexander M. Jakob, Brett C. Johnson, David N. Jamieson, Andrew S. Dzurak, Christopher Ferrie, Robin Blume-Kohout and Andrea Morello

(参考訳) 核スピンは、卓越した量子コヒーレンスと原子スケールのフットプリントを持つことから、量子情報処理のために最初に検討された物理プラットフォームの一つであった。しかし、スケーラブルなデバイス内で核スピン量子ビットを結びつける方法と、フォールトトレラントな量子計算を維持するのに十分な忠実度を持つ多量子ビット演算とを兼ね備えた手法がなかったため、量子コンピューティングにおけるその潜在能力はまだ十分に発揮されていない。ここでは、シリコンナノエレクトロニクスデバイスにおいて、イオン注入された一対の31Pドナー原子核を用いた普遍的な量子論理演算を実証する。共有された電子スピンに幾何学的位相を与えることで核スピンの2量子ビット制御Zゲートを実現し、これを用いて最大94.2(2.7)%の忠実度を持つエンタングルしたベル状態を生成する。量子演算はゲートセットトモグラフィー(GST)によって精密に評価され、1量子ビット平均ゲート忠実度は最大99.95(2)%、2量子ビット平均ゲート忠実度は99.37(11)%、2量子ビットの状態準備/測定忠実度は98.95(4)%であった。これら3つの指標は、シリコン中の核スピンがフォールトトレラント量子プロセッサで要求される性能に近づいていることを示している。次に、92.5(1.0)%の忠実度でグリーンバーガー・ホーン・ツァイリンガー3量子ビット状態を生成することにより、2つの原子核と共有電子の間のエンタングルメントを実証する。半導体中の電子スピン量子ビットは、他の電子とさらに結合させたり、異なる場所へ物理的に輸送したりできるため、これらの結果は、ドナーの核スピンと電子スピンを用いたスケーラブルな量子情報処理への有効な道筋を確立する。

Nuclear spins were among the first physical platforms to be considered for quantum information processing, because of their exceptional quantum coherence and atomic-scale footprint. However, their full potential for quantum computing has not yet been realized, due to the lack of methods to link nuclear qubits within a scalable device combined with multi-qubit operations with sufficient fidelity to sustain fault-tolerant quantum computation. Here we demonstrate universal quantum logic operations using a pair of ion-implanted 31P donor nuclei in a silicon nanoelectronic device. A nuclear two-qubit controlled-Z gate is obtained by imparting a geometric phase to a shared electron spin, and used to prepare entangled Bell states with fidelities up to 94.2(2.7)%. The quantum operations are precisely characterised using gate set tomography (GST), yielding one-qubit average gate fidelities up to 99.95(2)%, two-qubit average gate fidelity of 99.37(11)% and two-qubit preparation/measurement fidelities of 98.95(4)%. These three metrics indicate that nuclear spins in silicon are approaching the performance demanded in fault-tolerant quantum processors. We then demonstrate entanglement between the two nuclei and the shared electron by producing a Greenberger-Horne-Zeilinger three-qubit state with 92.5(1.0)% fidelity. Since electron spin qubits in semiconductors can be further coupled to other electrons or physically shuttled across different locations, these results establish a viable route for scalable quantum information processing using donor nuclear and electron spins.

翻訳日:2023-03-27 11:40:07 公開日:2022-01-28

# 確率的ニューラルネットワークに対する自己組織化臨界性のウィッテン型位相的場の理論

Witten-type topological field theory of self-organized criticality for stochastic neural networks ( http://arxiv.org/abs/2106.10851v2 )

ライセンス: Link先を確認

Jian Zhai, Chaojun Yu, You Zhai

(参考訳) 確率的ニューラルネットワークに対する自己組織化臨界性(SOC)のウィッテン型位相的場の理論(W-TFT)を研究する。ニューラルネットワークに対する一般の確率微分方程式(SDE)のパリージ・スーラス・ウー量子化、拡散系のベッキ・ルエ・ストラ・チューチン(BRST)対称性、自発的破れとSDEの定常状態をつなぐインスタントンとの関係、および擬超対称な確率的ニューラルネットワークであるための必要十分条件を得る。神経雪崩が皮質の情報処理と記憶のメカニズムであり \cite{Beggs}\cite{Plenz1}\cite{Plenz2}、確率的ニューラルネットワークのモデル \cite{Dayan} が正しく、さらにSOC系がBRST対称性の自発的に破れたW-TFTとみなせると仮定する。このとき、確率的ニューラルネットワークのモデルから、神経雪崩と自発的に破れたBRST対称性が再現されるはずである。我々は、ドリフト係数の発散が小さく非定数であれば、確率的ニューラルネットワークのモデルはBRST対称であることを見いだした。すなわち、脳の神経回路網系のSOCがBRST対称性の自発的に破れたW-TFTとみなせるならば、神経科学で広く用いられている確率的ニューラルネットワークの一般的なモデル \cite{Dayan} は、SOCを記述するには不十分である。一方、フォッカー・プランク方程式を用いて、確率的ニューラルネットワークに定常確率分布が存在するための、拡散に関する十分条件を示す。神経回路網の発火率のリズムはこの過程から生じ、同時にいくつかの生物学的法則は保存される。

We study the Witten-type topological field theory(W-TFT) of self-organized criticality(SOC) for stochastic neural networks. The Parisi-Sourlas-Wu quantization of general stochastic differential equations (SDEs) for neural networks, the Becchi-Rouet-Stora-Tyutin(BRST)-symmetry of the diffusion system and the relation between spontaneous breaking and instantons connecting steady states of the SDEs, as well as the sufficient and necessary condition on pseudo-supersymmetric stochastic neural networks are obtained. Suppose neuronal avalanche is a mechanism of cortical information processing and storage \cite{Beggs}\cite{Plenz1}\cite{Plenz2} and the model of stochastic neural networks\cite{Dayan} is correct, as well as the SOC system can be looked upon as a W-TFT with spontaneously broken BRST symmetry. Then we should recover the neuronal avalanches and spontaneously broken BRST symmetry from the model of stochastic neural networks. We find that, provided the divergence of drift coefficients is small and non-constant, the model of stochastic neural networks is BRST symmetric. That is, if the SOC of brain neural networks system can be looked upon as a W-TFT with spontaneously broken BRST symmetry, then the general model of stochastic neural networks which be extensively used in neuroscience \cite{Dayan} is not enough to describe the SOC. On the other hand, using the Fokker-Planck equation, we show the sufficient condition on diffusion so that there exists a steady state probability distribution for the stochastic neural networks. Rhythms of the firing rates of the neuronal networks arise from the process, meanwhile some biological laws are conserved.

翻訳日:2023-03-25 23:28:25 公開日:2022-01-28

# リニア応答を超える高速周期運転時の加熱速度

Heating Rates under Fast Periodic Driving beyond Linear Response ( http://arxiv.org/abs/2107.12587v2 )

ライセンス: Link先を確認

Takashi Mori

(参考訳) 周期駆動下での加熱は一般的な非平衡現象であり、非平衡統計物理学では定量的に正確な加熱速度を導出することは難しい問題である。本研究では,古典多体系および量子多体系において,高速かつ強い周期駆動下での加熱速度の簡単な公式を提供する。この公式の背景にある重要な考え方は、マイクロモーション作用素の高周波膨張の切り離しによって見いだされる回転フレームに移動し、線形応答理論を適用することで、時間依存型ハミルトニアンを構成することである。特定の古典モデルや量子モデルでは、高周波膨張の2次切断は線形応答系を超えて定量的に正確な加熱速度をもたらすことが確認されている。その結果, 加熱ダイナミクスに関する情報は, 高周波膨張の最初の数個の項でエンコードされるが, 加熱はしばしば高周波膨張の漸近的発散挙動と関連していることがわかった。

Heating under periodic driving is a generic nonequilibrium phenomenon, and it is a challenging problem in nonequilibrium statistical physics to derive a quantitatively accurate heating rate. In this work, we provide a simple formula for the heating rate under fast and strong periodic driving in classical and quantum many-body systems. The key idea behind the formula is constructing a time-dependent dressed Hamiltonian by moving to a rotating frame, which is found by a truncation of the high-frequency expansion of the micromotion operator, and applying the linear-response theory. It is confirmed for specific classical and quantum models that the second-order truncation of the high-frequency expansion yields quantitatively accurate heating rates beyond the linear-response regime. Our result implies that the information on heating dynamics is encoded in the first few terms of the high-frequency expansion, although heating is often associated with an asymptotically divergent behavior of the high-frequency expansion.

翻訳日:2023-03-20 19:28:28 公開日:2022-01-28

# 導波路結合型ナノキャビティにおける強キラル光マター相互作用

Engineering strong chiral light-matter interactions in a waveguide-coupled nanocavity ( http://arxiv.org/abs/2108.01462v3 )

ライセンス: Link先を確認

D. Hallett, A. P. Foster, D. M. Whittaker, M. S. Skolnick, L. R. Wilson

(参考訳) スピン依存、指向性光-物質相互作用はキラル量子ネットワークの基礎を形成する。固体状態では、量子エミッタは一般にスピン依存のハンドネスを持つ円偏光遷移を持つ。スピン依存キラルカップリングは導波路結合型ナノキャビティにそのようなエミッタを埋め込むことにより実現可能であることを数値的に示す。キラルな挙動は、2つの単一モード出力導波路に結合する際のキャビティモード間の方向依存性による干渉によって生じる。特に、実験的な現実的な空洞設計は、ほぼ均一なキラルコントラスト、効率的な(\beta > 0.95$)導波路結合、光-物質相互作用強度(Purcell factor $F_P > 70$)を同時にサポートする。これらのパラメータを組み合わせることで、ナノフォトニック回路に統合可能な高コヒーレントなスピン光子インタフェースの開発が可能になる。

Spin-dependent, directional light-matter interactions form the basis of chiral quantum networks. In the solid state, quantum emitters commonly possess circularly polarised optical transitions with spin-dependent handedness. We demonstrate numerically that spin-dependent chiral coupling can be realised by embedding such an emitter in a waveguide-coupled nanocavity, which supports two near-degenerate, orthogonally-polarised cavity modes. The chiral behaviour arises due to direction-dependent interference between the cavity modes upon coupling to two single-mode output waveguides. Notably, an experimentally realistic cavity design simultaneously supports near-unity chiral contrast, efficient ($\beta > 0.95$) waveguide coupling and enhanced light-matter interaction strength (Purcell factor $F_P > 70$). In combination, these parameters could enable the development of highly coherent spin-photon interfaces, ready for integration into nanophotonic circuits.

翻訳日:2023-03-20 00:55:59 公開日:2022-01-28

# 単一領域ボース縮合磁力計は帯域当たりのエネルギー分解能を$\hbar$以下で達成する

Single-domain Bose condensate magnetometer achieves energy resolution per bandwidth below $\hbar$ ( http://arxiv.org/abs/2108.11716v2 )

ライセンス: Link先を確認

Silvana Palacios Alvarez, Pau Gomez, Simon Coop, Roberto Zamora-Zamora, Chiara Mazzinghi and Morgan W. Mitchell

(参考訳) 本稿では,帯域当たりのエネルギー分解能が $E_R < \hbar$ である磁気センサについて述べる。非破壊的なファラデー回転プローブにより検出される $^{87}\mathrm{Rb}$ の単一ドメインスピノル Bose-Einstein 凝縮体が,体積 $V=1091(30)~\mu\mathrm{m}^3$ を $3.5~\mathrm{s}$ 間測定して $72(8)~\mathrm{fT}$ の単ショットdc磁気感度を達成し,したがって $E_R = 0.075(16)~\hbar$ となることを示す。本研究では, 凝縮体積, スピンコヒーレンス時間, 読み出しノイズを実験的に測定し, 3+1次元平均場シミュレーションに裏付けられた位相空間法を用いてスピンノイズを計算する。スピンノイズへの寄与には、一体および三体損失と、強磁性接触相互作用と二次ゼーマンシフトの競合による投影雑音分布のせん断が含まれる。それでも、単一ドメインの超低温二体相互作用が完全にコヒーレントであるため、このシステムは、従来のスピン歳差運動型センサーにエネルギー分解能の限界を課すコヒーレンスと密度のトレードオフを回避できる。他のボース凝縮アルカリ、特に反強磁性の $^{23}\mathrm{Na}$ は、この方法のエネルギー分解能をさらに向上できると予測する。

We present a magnetic sensor with energy resolution per bandwidth $E_R < \hbar$. We show how a $^{87}\mathrm{Rb}$ single domain spinor Bose-Einstein condensate, detected by non-destructive Faraday-rotation probing, achieves single shot dc magnetic sensitivity of $72(8)~\mathrm{fT}$ measuring a volume $V= 1091(30)~\mu\mathrm{m}^3$ for $3.5~\mathrm{s}$, and thus $E_R = 0.075(16)~\hbar$. We measure experimentally the condensate volume, spin coherence time, and readout noise, and use phase-space methods, backed by 3+1D mean-field simulations, to compute the spin noise. Contributions to the spin noise include one-body and three-body losses and shearing of the projection noise distribution, due to competition of ferromagnetic contact interactions and quadratic Zeeman shifts. Nonetheless, the fully-coherent nature of the single-domain, ultracold two-body interactions allows the system to escape the coherence vs.~density trade-off that imposes an energy resolution limit on traditional spin-precession sensors. We predict that other Bose-condensed alkalis, especially the antiferromagnetic $^{23}\mathrm{Na}$, can further improve the energy resolution of this method.

翻訳日:2023-03-17 03:19:00 公開日:2022-01-28
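The figure $E_R = 0.075(16)~\hbar$ quoted in the abstract above can be cross-checked with a short calculation. The sketch below assumes the commonly used definition of energy resolution per bandwidth, $E_R = \langle\delta B^2\rangle V T / (2\mu_0)$; this formula is not stated in the abstract itself and is an assumption here, as are the function and constant names.

```python
import math

# Physical constants (SI units).
MU_0 = 4e-7 * math.pi      # vacuum permeability, approximate classical value
HBAR = 1.054571817e-34     # reduced Planck constant, J*s


def energy_resolution_per_bandwidth(delta_b_tesla, volume_m3, duration_s):
    """Assumed definition: E_R = dB^2 * V * T / (2 * mu_0), in J*s."""
    return delta_b_tesla ** 2 * volume_m3 * duration_s / (2 * MU_0)


# Values taken from the abstract: 72 fT, 1091 um^3, 3.5 s.
e_r = energy_resolution_per_bandwidth(
    delta_b_tesla=72e-15,
    volume_m3=1091e-18,    # 1 um^3 = 1e-18 m^3
    duration_s=3.5,
)
ratio = e_r / HBAR
print(f"E_R is about {ratio:.3f} hbar")
```

Under that assumed definition, the central values from the abstract give a ratio close to the reported $0.075~\hbar$; the quoted uncertainty is not propagated in this sketch.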

# オンサイト相互作用を持つ周期駆動格子モデルにおけるトポロジカル2粒子ダイナミクス

Topological two-particle dynamics in a periodically driven lattice model with on-site interactions ( http://arxiv.org/abs/2109.05220v2 )

ライセンス: Link先を確認

Anna Berti and Iacopo Carusotto

(参考訳) 本研究では,オンサイト相互作用と適切に設計された時間依存ホッピングを持つ格子モデルにおいて,2粒子束縛状態のロバストなトポロジカルダイナミクスを観測するための現実的なプロトコルを開発する。このFloquetスキームは、既存のデジタル量子コンピュータプラットフォーム上で現実的に実装することができる。2つの独立粒子のトポロジカルな単粒子ダイナミクスとの顕著な違いと、2つの構成粒子間のエンタングルメントの明瞭なシグネチャを強調する。

We develop a realistic protocol to observe a robust topological dynamics of two-particle bound states in a lattice model with on-site interactions and suitably designed time-dependent hoppings. This Floquet scheme can be realistically implemented on existing digital quantum computer platforms. Marked differences from the topological single-particle dynamics of two independent particles and clear signatures of the entanglement between the two constituent particles are highlighted.

翻訳日:2023-03-15 11:28:12 公開日:2022-01-28

# 測定誘起相転移における波動関数の多重性を超えた普遍的挙動

Universal behavior beyond multifractality of wave-functions at measurement-induced phase transitions ( http://arxiv.org/abs/2109.06882v3 )

ライセンス: Link先を確認

Piotr Sierant, Xhek Turkeshi

(参考訳) 本研究では,1次元量子回路の多体波動関数の構造を局所的測定により検討する。参加エントロピーのシステムサイズ依存性の先行項は、非ゼロの測定速度における波動関数のモデル依存マルチフラクタルスケーリングを示す。サブリード項は、測定誘起相転移に関する普遍的な情報を含み、次数パラメータの役割を担い、誤差補正位相では定数非ゼロであり、量子ゼノ相では消失する。本研究では,様々な量子多体系のロバストな数値的証明を提供し,この振る舞いを2次元における古典的統計モデルの分割関数の観点でエントロピーを表わす解析的解釈を提供する。

We investigate the structure of many-body wave functions of 1D quantum circuits with local measurements employing the participation entropies. The leading term in system size dependence of participation entropy indicates a model dependent multifractal scaling of the wave-functions at any non-zero measurement rate. The sub-leading term contains universal information about measurement-induced phase transitions and plays the role of an order parameter, being constant non-zero in the error correcting phase and vanishing in the quantum Zeno phase. We provide robust numerical evidence investigating a variety of quantum many-body systems, and provide an analytical interpretation of this behavior expressing the participation entropy in terms of partition functions of classical statistical models in 2D.

翻訳日:2023-03-15 02:53:51 公開日:2022-01-28

# 炭素中性データセンターの体系的調査に向けて

Towards a Systematic Survey for Carbon Neutral Data Centers ( http://arxiv.org/abs/2110.09284v3 )

ライセンス: Link先を確認

Zhiwei Cao, Xin Zhou, Han Hu, Zhi Wang, Yonggang Wen

(参考訳) データセンターは大量のエネルギー消費のために炭素集約型企業であり、データセンター産業は2030年までに世界の二酸化炭素排出量の8%を占めると推定されている。しかし、データセンターの二酸化炭素排出量を削減または中和するための技術と政策の手段は、いずれも完全には調査されていない。このギャップを埋めるため,本稿では,政策手段と技術的方法論の両方を考慮したカーボンニュートラルデータセンターのロードマップを提案する。まず、データセンターのカーボンフットプリントと、炭素排出の主な発生源に関するいくつかの洞察を提示する。その後、主要なグローバルクラウドプロバイダのカーボンニュートラル計画を議論し、この方向の現在の産業的取り組みを要約する。続いて、コスト効率の高い方法でデータセンターの炭素排出量を相殺する方法を説明するための政策手段として、炭素市場を紹介する。技術面では、再生可能エネルギーの普及、エネルギー効率の向上、エネルギー循環の促進を同時に進めることにより、カーボンニュートラルデータセンターを実現することを提案する。続いて、これら3つのトピックに関する既存技術を包括的にレビューする。これに基づいて、カーボンニュートラルに向けた多角的アプローチを構想し、このソリューションを実現するために、デジタルツインを活用した産業用人工知能(AI)フレームワークを提案する。さらに,このような枠組みを構築する上での3つの重要な科学的課題について論じる。最後に、このフレームワークのいくつかの応用例を示し、その大きな可能性を示す。

Data centers are carbon-intensive enterprises due to their massive energy consumption, and it is estimated that the data center industry will account for 8\% of global carbon emissions by 2030. However, neither technological nor policy instruments for reducing or even neutralizing data center carbon emissions have been thoroughly investigated. To bridge this gap, this survey paper proposes a roadmap towards carbon-neutral data centers that takes into account both policy instruments and technological methodologies. We begin by presenting the carbon footprint of data centers, as well as some insights into the major sources of carbon emissions. Following that, carbon neutrality plans for major global cloud providers are discussed to summarize current industrial efforts in this direction. Next, we introduce the carbon market as a policy instrument to explain how to offset data center carbon emissions in a cost-efficient manner. On the technological front, we propose achieving carbon-neutral data centers by increasing renewable energy penetration, improving energy efficiency, and boosting energy circulation simultaneously. A comprehensive review of existing technologies on these three topics follows. Based on this, a multi-pronged approach towards carbon neutrality is envisioned and a digital twin-powered industrial artificial intelligence (AI) framework is proposed to make this solution a reality. Furthermore, three key scientific challenges for putting such a framework in place are discussed. Finally, several applications for this framework are presented to demonstrate its enormous potential.

翻訳日:2023-03-11 09:52:59 公開日:2022-01-28

# wikipediaにおける科学の地図

A Map of Science in Wikipedia ( http://arxiv.org/abs/2110.13790v2 )

ライセンス: Link先を確認

Puyu Yang and Giovanni Colavizza

(参考訳) 近年、インターネットの普及が急速に進み、科学情報への便利で安価なアクセスが可能になっている。世界最大の百科事典の1つであるウィキペディアは、この点において参考となり、学者から広く注目を集めている。しかし、ウィキペディアの内容を支える科学資料の明確な理解は、いまだ解明されていない。本研究では,ウィキペディアの記事と科学雑誌記事の関係を地図化するために,ウィキペディアからの引用のオープンデータセットを利用する。ウィキペディアから引用されたほとんどの雑誌記事はSTEM分野、特に生物学と医学(引用の$47.6$\%、引用記事の$46.1$\%)に属する。さらに、ウィキペディアの伝記はSTEM分野と人文科学、特に歴史を結びつける上で重要な役割を果たしている。これらの結果は、ウィキペディアの科学的情報源への依存と、知識ブローカーとしての一般への役割の理解に寄与する。

In recent decades, the rapid growth of Internet adoption is offering opportunities for convenient and inexpensive access to scientific information. Wikipedia, one of the largest encyclopedias worldwide, has become a reference in this respect, and has attracted widespread attention from scholars. However, a clear understanding of the scientific sources underpinning Wikipedia's contents remains elusive. In this work, we rely on an open dataset of citations from Wikipedia to map the relationship between Wikipedia articles and scientific journal articles. We find that most journal articles cited from Wikipedia belong to STEM fields, in particular biology and medicine ($47.6$\% of citations; $46.1$\% of cited articles). Furthermore, Wikipedia's biographies play an important role in connecting STEM fields with the humanities, especially history. These results contribute to our understanding of Wikipedia's reliance on scientific sources, and its role as knowledge broker to the public.

翻訳日:2023-03-10 05:32:38 公開日:2022-01-28

# カルノー限界を超える有限時間量子計測冷却

Finite-time quantum measurement cooling beyond the Carnot limit ( http://arxiv.org/abs/2111.12467v2 )

ライセンス: Link先を確認

Tong Fu, Jianying Du, Jingyi Chen, Jincan Chen, Chikako Uchiyama, Shanhe Su

(参考訳) 侵襲的な測定が冷却サイクルを駆動する動力を与える、測定ベース量子クーラーの有限時間サイクルモデルを提案する。このようなクーラーは、マクスウェルの悪魔の代替的な思考実験と見なすことができる。測定フィードバック情報により、仕事の入力なしで低温熱浴から高温熱浴へ熱を移動させることができ、さらに最大成績係数をカルノー限界より大きくすることさえできる。この一見逆説的な結果が熱力学の法則に違反しない理由は、相互情報量を含む一般化されたクラウジウスの不等式を導出することで明確に説明できる。

We propose a finite-time cycle model of a measurement-based quantum cooler, in which invasive measurement provides the power to drive the cooling cycle. Such a cooler may be regarded as an alternative thought experiment to Maxwell's demon. The measurement-feedback information is capable of moving heat from the cold bath to the hot bath without any work input, and can even make the maximum coefficient of performance larger than the Carnot limit. The reason that this seemingly paradoxical result does not violate the laws of thermodynamics can be clearly explained through the derivation of a generalized Clausius inequality that includes the mutual information.

翻訳日:2023-03-07 00:06:47 公開日:2022-01-28

# 一般化回転対称性で保護される非エルミート$C_{NH} = 2$チャーン絶縁体

Non-Hermitian $C_{NH} = 2$ Chern insulator protected by generalized rotational symmetry ( http://arxiv.org/abs/2111.12573v2 )

ライセンス: Link先を確認

Kai Chen and Alexander B. Khanikaev

(参考訳) 空間の回転とエルミート共役を伴う一般化回転対称性によって保護される非エルミート位相系を提案する。この系は、非相反ホッピングを持つ強束縛モデルにより記述され、ギャップのあるトポロジカル相において2対のギャップ内エッジモードをホストし、非エルミート(NH)チャーン数$C_{NH}=2$で特徴づけられる。非エルミートチャーン数の量子化は、系の一般化回転対称性 $\hat{H}^{\dagger}=\hat{U}\hat{H}\hat{U}^{\dagger}$ によって保護されることを示す。我々の発見は、トポロジカル不変量の大きな値によって特徴づけられ、複数のギャップ内エッジ状態をホストする、新しい非エルミートトポロジカルシステムへの道を開くものであり、トポロジカルに堅牢な多重化に利用できる。

We propose a non-Hermitian topological system protected by the generalized rotational symmetry which invokes rotation in space and Hermitian conjugation. The system, described by the tight-binding model with nonreciprocal hopping, is found to host two pairs of in-gap edge modes in the gapped topological phase and is characterized by the non-Hermitian (NH) Chern number $C_{NH}=2$. The quantization of the non-Hermitian Chern number is shown to be protected by the generalized rotational symmetry $\hat{H}^{\dagger}=\hat{U}\hat{H}\hat{U}^{\dagger}$ of the system. Our finding paves the way towards novel non-Hermitian topological systems characterized by large values of topological invariants and hosting multiple in-gap edge states, which can be used for topologically resilient multiplexing.

翻訳日:2023-03-06 23:57:31 公開日:2022-01-28

# qubitノイズデコンボリューション

Qubit noise deconvolution ( http://arxiv.org/abs/2112.03043v2 )

ライセンス: Link先を確認

Stefano Mangini, Lorenzo Maccone, Chiara Macchiavello

(参考訳) 量子ビットシステム上で任意の測定を行う際に,広帯域ノイズを除去するノイズデコンボリューション手法を提案する。特に、最も一般的な単一キュービットノイズチャネルの逆写像を導出し、データ処理ステップで利用して、既知の雑音を受けるキュービットシステムで評価された可観測物のノイズフリー推定値を得る。本稿では,総称パウリチャネルのデコンボリューションに対するシミュレーション結果と,リゲッティ量子ハードウェア上で発生するデコヒーレンスノイズのデコンボリューションの実験的証拠を提供するために,ノイズ特性が正確であることを保証するための自己矛盾チェックを示す。

We present a noise deconvolution technique to remove a wide class of noises when performing arbitrary measurements on qubit systems. In particular, we derive the inverse map of the most common single qubit noisy channels and exploit it at the data processing step to obtain noise-free estimates of observables evaluated on a qubit system subject to known noise. We illustrate a self-consistency check to ensure that the noise characterization is accurate providing simulation results for the deconvolution of a generic Pauli channel, as well as experimental evidence of the deconvolution of decoherence noise occurring on Rigetti quantum hardware.

翻訳日:2023-03-05 10:05:39 公開日:2022-01-28
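To make the idea of inverting a known noise channel at the data-processing step concrete, the following is a minimal sketch for a single-qubit Pauli channel $\rho \mapsto (1-p_x-p_y-p_z)\rho + p_x X\rho X + p_y Y\rho Y + p_z Z\rho Z$. It is an illustration under that assumed parametrization, not the authors' implementation; the function names and the example probabilities are hypothetical. Such a channel rescales the expectation values of $X$, $Y$, $Z$ by fixed factors, so noise-free estimates follow by dividing them out.

```python
def pauli_channel_factors(p_x, p_y, p_z):
    """Factors by which a Pauli channel shrinks <X>, <Y>, <Z>.

    Each Pauli observable commutes with its own error and anticommutes
    with the other two, which flips the sign of its expectation value.
    """
    return (
        1.0 - 2.0 * (p_y + p_z),  # <X> is flipped by Y and Z errors
        1.0 - 2.0 * (p_x + p_z),  # <Y> is flipped by X and Z errors
        1.0 - 2.0 * (p_x + p_y),  # <Z> is flipped by X and Y errors
    )


def deconvolve(noisy_expectations, p_x, p_y, p_z):
    """Recover noise-free <X>, <Y>, <Z> from noisy ones (inverse map)."""
    factors = pauli_channel_factors(p_x, p_y, p_z)
    if any(abs(f) < 1e-12 for f in factors):
        raise ValueError("channel is not invertible for these probabilities")
    return tuple(v / f for v, f in zip(noisy_expectations, factors))


# Hypothetical example: ideal Bloch vector and known error probabilities.
ideal = (0.6, 0.0, 0.8)
probs = (0.05, 0.02, 0.10)
noisy = tuple(v * f for v, f in zip(ideal, pauli_channel_factors(*probs)))
recovered = deconvolve(noisy, *probs)
print(recovered)
```

With finite measurement statistics, dividing by factors smaller than one also amplifies the statistical error of the estimates, which this sketch does not model.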

# 3量子ドットスピン量子ビットにおける高速で高忠実な状態形成と測定

Fast and high-fidelity state preparation and measurement in triple-quantum-dot spin qubits ( http://arxiv.org/abs/2112.09801v2 )

ライセンス: Link先を確認

Jacob Z. Blumoff, Andrew S. Pan, Tyler E. Keating, Reed W. Andrews, David W. Barnes, Teresa L. Brecht, Edward T. Croke, Larken E. Euliss, Jacob A. Fast, Clayton A. C. Jackson, Aaron M. Jones, Joseph Kerckhoff, Robert K. Lanza, Kate Raach, Bryan J. Thomas, Roland Velunta, Aaron J. Weinstein, Thaddeus D. Ladd, Kevin Eng, Matthew G. Borselli, Andrew T. Hunter, and Matthew T. Rakher

(参考訳) 交換専用Si/SiGe三重量子ドット量子ビットにおける高速で高忠実な状態準備と測定を実証する。高速な測定積分($980$ ns)と初期化($\approx 300$ ns)の操作は、全電気的なベースバンド制御で実行される。我々は,交換専用量子ビットの文脈で開発されたがより広く適用可能な、リークに敏感な初期化・測定の結合指標を強調し,$2.5\pm0.5\times 10^{-3}$の不忠実度を報告する。この結果は、高いバレー分裂を持つヘテロ構造、2-to-3電子電荷境界における初期化、およびスピン-電荷変換中の$T_1$の慎重な評価と緩和によって実現される。最終的な忠実度は同程度に重要な複数の要因によって制限されており、忠実度と速度のさらなる改善への明確な道筋を特定する。観測された単一量子ビットのランダム化ベンチマークのエラー率$1.7\times 10^{-3}$と合わせて、本研究は、スケーラブルな量子情報処理に有望な忠実度と所要時間でのSi/SiGe三重ドット量子ビットの初期化、制御、測定を示す。

We demonstrate rapid, high-fidelity state preparation and measurement in exchange-only Si/SiGe triple-quantum-dot qubits. Fast measurement integration ($980$ ns) and initialization ($\approx 300$ ns) operations are performed with all-electrical, baseband control. We emphasize a leakage-sensitive joint initialization and measurement metric, developed in the context of exchange-only qubits but applicable more broadly, and report an infidelity of $2.5\pm0.5\times 10^{-3}$. This result is enabled by a high-valley-splitting heterostructure, initialization at the 2-to-3 electron charge boundary, and careful assessment and mitigation of $T_1$ during spin-to-charge conversion. The ultimate fidelity is limited by a number of comparably-important factors, and we identify clear paths towards further improved fidelity and speed. Along with an observed single-qubit randomized benchmarking error rate of $1.7\times 10^{-3}$, this work demonstrates initialization, control, and measurement of Si/SiGe triple-dot qubits at fidelities and durations which are promising for scalable quantum information processing.

翻訳日:2023-03-04 06:51:59 公開日:2022-01-28

# 超短パルスおよび高強度レーザーパルスによる原子イオン化におけるトンネルの役割

The role of tunneling in the ionization of atoms by ultrashort and intense laser pulses ( http://arxiv.org/abs/2112.14336v2 )

ライセンス: Link先を確認

Gabriel M. Lando

(参考訳) 超短パルスかつ高強度のレーザーパルスによる原子のイオン化において、ケルディシュパラメータが1より小さいにもかかわらず、古典的に許容される輸送が量子トンネルと競合することを示す。これは、厳密な確率密度を、Truncated Wigner Approximation を用いた純粋に古典的な伝播から得られる確率密度と比較することによって行われる。古典輸送は、軌道を核から遠ざけることができるだけでなく、実験で現在使われている強度に対して量子的なものと同程度のイオン化確率を与えることもできる。本研究の結果は,強電場物理における半古典的ステップモデルの概念的修正から、アト秒時計(attoclock)実験におけるトンネル時間測定に関する進行中の議論まで、幅広い含意を持つ。

Classically allowed transport is shown to compete with quantum tunneling during the ionization of atoms by ultrashort and intense laser pulses, despite Keldysh parameters smaller than unity. This is done by comparing exact probability densities with the ones obtained from purely classical propagation using the Truncated Wigner Approximation. Not only is classical transport capable of moving trajectories away from the core, but it can also furnish ionization probabilities of the same order as the quantum ones for intensities currently employed in experiments. Our results have implications ranging from a conceptual correction to semiclassical step models in strong-field physics to the ongoing debate about tunneling time measurements in attoclock experiments.

翻訳日:2023-03-02 23:42:30 公開日:2022-01-28

# 量子多重アクセスチャネル上のプライベート古典的通信

Private Classical Communication over Quantum Multiple-Access Channels ( http://arxiv.org/abs/2201.11899v1 )

ライセンス: Link先を確認

Remi A. Chou

(参考訳) 量子多重アクセスチャネル上でのプライベート古典通信について検討する。任意の数の送信機に対して、容量領域の正規化表現を導出する。分解可能なチャネルの場合、最善の達成可能な和率に対する単一レター式を確立し、この量もまた分解可能な量子多重アクセスチャネル上の量子通信における最良の達成可能な和率に対応することを証明します。達成可能性の結果として、信頼性とプライバシーの制約を分離し、それぞれ、量子側情報と普遍ハッシュによるソースコーディングによって処理する。したがって、検討中のマルチユーザコーディング問題は、ポイント・ツー・ポイントのコーディング技術でのみ扱うことができる。独立利害の副産物として、我々は、達成可能な結果におけるプライバシを保証する量子側情報に対する分散剰余ハッシュ補題を導出する。

We study private classical communication over quantum multiple-access channels. For an arbitrary number of transmitters, we derive a regularized expression of the capacity region. In the case of degradable channels, we establish a single-letter expression for the best achievable sum-rate and prove that this quantity also corresponds to the best achievable sum-rate for quantum communication over degradable quantum multiple-access channels. In our achievability result, we decouple the reliability and privacy constraints, which are handled via source coding with quantum side information and universal hashing, respectively. Hence, we also establish that the multi-user coding problem under consideration can be handled solely via point-to-point coding techniques. As a by-product of independent interest, we derive a distributed leftover hash lemma against quantum side information that ensures privacy in our achievability result.

翻訳日:2023-02-27 16:19:13 公開日:2022-01-28

# Pseudo-Hermiticityにより保護されたユニタリ散乱

Unitary Scattering Protected by Pseudo-Hermiticity ( http://arxiv.org/abs/2201.11894v1 )

ライセンス: Link先を確認

L. Jin

(参考訳) エルミート系はユニタリ散乱を持つが、ハーミート性はユニタリ散乱には不要であるが、非ハーミート性の影響下での散乱は主に非ユニタリ散乱である。ここでは、ユニタリ散乱がある種の擬ハーミティティーによって保護され、非ハーミティティーの次数の影響を受けないことを証明する。エネルギー保存は散乱過程において破れ、散乱後に回復する。接続点のみを含む擬エルミート散乱中心のサブシステムはエルミートである。これらの発見はユニタリ散乱、擬エルミティシティ、エネルギー保存に関する基本的な知見を提供し、非エルミティアン系における光伝播、メソスコピック電子輸送、量子干渉に有望である。

The Hermitian systems possess unitary scattering; however, the Hermiticity is unnecessary for a unitary scattering although the scattering under the influence of non-Hermiticity is mostly non-unitary. Here we prove that the unitary scattering is protected by certain type of pseudo-Hermiticity and unaffected by the degree of non-Hermiticity. The energy conservation is violated in the scattering process and recovers after scattering. The subsystem of the pseudo-Hermitian scattering center including only the connection sites is Hermitian. These findings provide fundamental insights on the unitary scattering, pseudo-Hermiticity, and energy conservation; and are promising for the light propagation, mesoscopic electron transport, and quantum interference in the non-Hermitian systems.

翻訳日:2023-02-27 16:18:59 公開日:2022-01-28

# 量子鍵分布のための統合室温単一光子源

Integrated Room Temperature Single Photon Source for Quantum Key Distribution ( http://arxiv.org/abs/2201.11882v1 )

ライセンス: Link先を確認

Helen Zhi Jie Zeng, Minh Anh Phan Ngyuen, Xiaoyu Ai, Adam Bennet, Alexander Solnstev, Arne Laucht, Ali Al-Juboori, Milos Toth, Rich Mildren, Robert Malaney, and Igor Aharonovich

(参考訳) 室温で動作可能な高純度単一光子源(SPS)は、量子フォトニクスや量子鍵分布を含む無数のアプリケーションにとって非常に望ましい。本研究では、六方晶窒化ホウ素(hBN)の原子欠陥と固体浸漬レンズ(SIL)を統合した超高輝度固体SPSを実現する。SILはソース効率を6倍に向上させ、統合システムは室温で毎秒1000万個以上の単一光子を生成することができる。この結果は、量子通信プロトコルにおけるSPSの実用化に有望である。

High-purity single photon sources (SPS) that can operate at room temperature are highly desirable for a myriad of applications, including quantum photonics and quantum key distribution. In this work, we realise an ultra-bright solid-state SPS based on an atomic defect in hexagonal boron nitride (hBN) integrated with a solid immersion lens (SIL). The SIL increases the source efficiency by a factor of six, and the integrated system is capable of producing over ten million single photons per second at room temperature. Our results are promising for practical applications of SPS in quantum communication protocols.

翻訳日:2023-02-27 16:18:30 公開日:2022-01-28

# 従来型結合クラスタとユニタリ結合クラスタとの演算子関係

Operator relationship between conventional coupled cluster and unitary coupled cluster ( http://arxiv.org/abs/2201.11881v1 )

ライセンス: Link先を確認

James K. Freericks

(参考訳) 化学コミュニティは、特に量子コンピュータ上で量子化学を実行することに関心があるため、単一参照システムにおいて、従来と一元結合クラスタアンサッツの正確な関係を求めてきた。本研究では、指数的不等式とアダマール補題によって与えられた演算子操作を、ユニタリ結合クラスター近似の因子化形式と従来の結合クラスター近似の因子化形式とを関連付ける方法を示す(一部の振幅は演算子値であり、他の項に可換ではないため、因子化形式が必要である)。トロッター積公式を用いることで、分解された形式をユニタリ結合クラスター ansatz の標準形式に関連付けることができる。結合クラスタ近似の分解形式の演算子依存は、さらに高階演算子を必要とするために除去され、最終的に従来の結合クラスタが生成される。このアプローチの代数的操作は手作業で行うのが難しいが、十分に小さなシステムのためにコンピュータ上で自動化することができる。

The chemistry community has long sought the exact relationship between the conventional and the unitary coupled cluster ansatz for a single-reference system, especially given the interest in performing quantum chemistry on quantum computers. In this work, we show how one can use the operator manipulations given by the exponential disentangling identity and the Hadamard lemma to relate the factorized form of the unitary coupled-cluster approximation to a factorized form of the conventional coupled cluster approximation (the factorized form is required, because some amplitudes are operator-valued and do not commute with other terms). By employing the Trotter product formula, one can then relate the factorized form to the standard form of the unitary coupled cluster ansatz. The operator dependence of the factorized form of the coupled cluster approximation can also be removed at the expense of requiring even more higher-rank operators, finally yielding the conventional coupled cluster. The algebraic manipulations of this approach are daunting to carry out by hand, but can be automated on a computer for small enough systems.

翻訳日:2023-02-27 16:18:21 公開日:2022-01-28

# 古典情報伝達の熱力学的基準

Thermodynamic Criterion of Transmitting Classical Information ( http://arxiv.org/abs/2201.12110v1 )

ライセンス: Link先を確認

Chung-Yun Hsieh

(参考訳) 古典情報伝達の熱力学的基準とは何か? 任意に与えられた超チャネルのクラスによって補助される1ショット古典キャパシティ上の熱力学上および下界を証明した。これらの境界は、送信チャネルによって維持される古典的相関から抽出可能な研究によって与えられ、選択されたスーパーチャネルのクラスに依存する追加の熱力学的制約を受ける。これは、ワンショット方式で、古典情報の$n$ビットをチャネルを介して送信する物理メッセージは、保守された古典的相関から抽出可能な$n\times k_BT\ln2$と等価であり、その結果、古典的情報の伝達に必要な熱力学的基準を明らかにする。この結果は漸近理論にまで拡張でき、ホールボ=シュマハ=ウェストモアランドの定理に熱力学的意味を与えることができる。最後に,作業抽出はチャネルの資源理論と密接に関連していることを示す。この課題を定量的に解くために, 作業抽出タスクはダイナミクスの一般的な資源を目の当たりにすることができることを示し, 広い範囲のチャネル資源を初めて熱力学的に解釈する。以上の知見は,コミュニケーションと熱力学の間に明らかなつながりをもたらし,その相互作用から新たな物理メッセージを発見する可能性を示す。

What is the thermodynamic criterion of transmitting classical information? We prove thermodynamic upper and lower bounds on the one-shot classical capacity assisted by an arbitrarily given class of superchannels. These bounds are given by the extractable work from classical correlation maintained by the transmission channel, subject to additional thermodynamic constraints depending on the chosen class of superchannels. It provides the physical message that, in the one-shot regime, transmitting $n$ bits of classical information through a channel is equivalent to $n\times k_BT\ln2$ extractable work from the maintained classical correlation, consequently revealing the thermodynamic criterion that is necessary to transmit classical information. This result can be further extended to the asymptotic regime, providing thermodynamic meanings for Holevo-Schumacher-Westmoreland Theorem. Finally, our study suggests that work extraction is closely related to resource theories of channels. To quantitatively address this question, we show that work extraction tasks can witness general resources of dynamics, providing the first thermodynamic interpretation of a broad class of channel resources. Our findings provide explicit connections between communication and thermodynamics, demonstrating the possibility of discovering new physical messages from their interplay.

翻訳日:2023-02-27 16:13:54 公開日:2022-01-28
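The scale of the quantity $n\times k_BT\ln2$ mentioned in the abstract above can be illustrated numerically. The sketch below only evaluates that expression; the temperature of 300 K and the function name are assumptions chosen for illustration, and the code does not reproduce the bounds proved in the paper.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact in the 2019 SI)


def work_for_bits(n_bits, temperature_kelvin):
    """Evaluate n * k_B * T * ln(2), in joules."""
    return n_bits * K_B * temperature_kelvin * math.log(2)


# Hypothetical example: one bit and one kilobyte at room temperature.
work_one_bit = work_for_bits(1, 300.0)
work_one_kilobyte = work_for_bits(8 * 1024, 300.0)
print(f"1 bit at 300 K:   {work_one_bit:.3e} J")
print(f"1 KiB at 300 K:   {work_one_kilobyte:.3e} J")
```

At 300 K a single bit corresponds to roughly $2.9\times10^{-21}$ J, and the value scales linearly in both $n$ and $T$.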

# 量子マイクロ波フォトニクスの実証-原理実証

A proof-of-principle demonstration of quantum microwave photonics ( http://arxiv.org/abs/2201.12106v1 )

ライセンス: Link先を確認

Yaqing Jin, Ye Yang, Huibo Hong, Xiao Xiang, Runai Quan, Tao Liu, Shougang Zhang, Ninghua Zhu, Ming Li, and Ruifang Dong

(参考訳) マイクロ波フォトニクスの急速な発展により、商業的重要性の多くの応用へと発展し、出現するボトルネックを取り除くことが重要となる。例えば、マイクロ波フォトニクスのメインブランチとして、無線オーバーファイバー技術は高帯域幅、低損失、長距離伝搬能力を提供し、通信から無線ネットワークまで幅広い応用を促進する。光キャリアとして超短パルスを用いると、さらに大きな容量が与えられる。しかし、超短パルスの広い帯域幅は、高周波RF信号のファイバ分散に対する深刻な脆弱性をもたらす。光キャリアとして時間エネルギーの絡み合った二光子源と単一光子検出技術を組み合わせた量子マイクロ波フォトニクス法の提案と実証を行った。その結果,超短パルスキャリアによる分散に強い耐性を持つ非局所RF信号変調を実現するだけでなく,分散からRF信号を効果的に抽出する機構を提供することがわかった。さらに,非局所変調RF信号と蒸留RF信号の両方のスプリアスフリーダイナミックレンジが大幅に改善された。超弱検出と低タイミング単一光子検出による高速処理の利点により、量子マイクロ波フォトニクス法は現代の通信やネットワークにおいて新たな可能性を開く。

With the rapid development of microwave photonics, which has expanded to numerous applications of commercial importance, eliminating the emerging bottlenecks becomes of vital importance. For example, as the main branch of microwave photonics, radio-over-fiber technology provides high bandwidth, low-loss, and long-distance propagation capability, facilitating wide applications ranging from telecommunication to wireless networks. With ultrashort pulses as the optical carrier, huge capacity is further endowed. However, the wide bandwidth of ultrashort pulses results in the severe vulnerability of high-frequency RF signals to fiber dispersion. With a time-energy entangled biphoton source as the optical carrier and combined with the single-photon detection technique, a quantum microwave photonics method is proposed and demonstrated experimentally. The results show that it not only realizes unprecedented nonlocal RF signal modulation with strong resistance to the dispersion associated with ultrashort pulse carriers but provides an alternative mechanism to effectively distill the RF signal out from the dispersion. Furthermore, the spurious-free dynamic range of both the nonlocally modulated and distilled RF signals has been significantly improved. With the ultra-weak detection and high-speed processing advantages endowed by the low-timing-jitter single-photon detection, the quantum microwave photonics method opens up new possibilities in modern communication and networks.

翻訳日:2023-02-27 16:13:32 公開日:2022-01-28

# 1-2-3 量子ソフトウェア実験の再現性

1-2-3 Reproducibility for Quantum Software Experiments ( http://arxiv.org/abs/2201.12031v1 )

ライセンス: Link先を確認

Wolfgang Mauerer and Stefanie Scherzinger

(参考訳) 様々な科学分野が再現性危機に直面している。量子ソフトウェア工学が新興分野であるためには、最初から適切な再現性工学に重点を置くことが不可欠である。しかし、複製パッケージの提供はほとんど普遍的に欠落している。このようなパッケージの作り方に関する実践的なアドバイスは、コンピュータサイエンス以外のバックグラウンドを持つ研究者から多くの貢献を受けている分野において、特に不幸である。本稿では,量子ソフトウェア実験における再現性工学への1-2-3～アプローチを提案することで,この不足を是正する方法について議論する。これらは、プロジェクト固有の研究成果物(ソースコード、測定データ、構成データ)のみに基づいて、専門的および学習的な社会の要求を満たすように設計されており、研究者による時間的投資をほとんど必要としない。我々の方式は、量子プロセッサ自体がもはやアクセスできない場合でも、長期的トレーサビリティを確認する。技術的バーを劇的に下げることで、量子ソフトウェア実験における複製パッケージの増殖を促進し、非CS研究者の分野への参加を容易にする。

Various fields of science face a reproducibility crisis. For quantum software engineering as an emerging field, it is therefore imminent to focus on proper reproducibility engineering from the start. Yet the provision of reproduction packages is almost universally lacking. Actionable advice on how to build such packages is rare, particularly unfortunate in a field with many contributions from researchers with backgrounds outside computer science. In this article, we argue how to rectify this deficiency by proposing a 1-2-3~approach to reproducibility engineering for quantum software experiments: Using a meta-generation mechanism, we generate DOI-safe, long-term functioning and dependency-free reproduction packages. They are designed to satisfy the requirements of professional and learned societies solely on the basis of project-specific research artefacts (source code, measurement and configuration data), and require little temporal investment by researchers. Our scheme ascertains long-term traceability even when the quantum processor itself is no longer accessible. By drastically lowering the technical bar, we foster the proliferation of reproduction packages in quantum software experiments and ease the inclusion of non-CS researchers entering the field.

翻訳日:2023-02-27 16:12:18 公開日:2022-01-28

# 標準量子アニールはデコヒーレンスで断熱逆アニールより優れる

Standard quantum annealing outperforms adiabatic reverse annealing with decoherence ( http://arxiv.org/abs/2201.11997v1 )

ライセンス: Link先を確認

Gianluca Passarelli, Ka-Wa Yip, Daniel A. Lidar, Procolo Lucignano

(参考訳) オープンシステムにおけるAdiabatic reverse annealing(ARA)について検討した。閉系(単位)設定では、このアニーリングプロトコルは選択されたモデルの1次量子相転移を回避し、アルゴリズムの初期状態がターゲットモデルとハミング距離に近いことを条件として、標準量子アニーリングと比較して指数的なスピードアップをもたらす。ここで、デコヒーレンスは、この結論を著しく修正できることを示す: 断熱マスター方程式のアプローチを用いて、独立かつ集合的デファスメントの下で$p=3$の強磁性(p$-spin)モデルのダイナミクスをシミュレートする。いずれのデコヒーレンスモデルにおいても、オープンシステムaraの性能は、ユニタリシステムよりも初期状態の選択に対する感受性が低く、最も顕著なのは、オープンシステムaraが標準量子アニーリングに比べてソリューションアドバンテージの時間を失うことである。これらの結果は、ARAが単独の戦略として、標準の「前方」量子アニールを実験的に上回ることは不可能であり、現実的でノイズの多い環境でのARAの利点を実現するためにはエラー軽減戦略が必要であることを示唆している。

We study adiabatic reverse annealing (ARA) in an open system. In the closed system (unitary) setting, this annealing protocol allows avoidance of first-order quantum phase transitions of selected models, resulting in an exponential speedup compared with standard quantum annealing, provided that the initial state of the algorithm is close in Hamming distance to the target one. Here, we show that decoherence can significantly modify this conclusion: by resorting to the adiabatic master equation approach, we simulate the dynamics of the ferromagnetic $p$-spin model with $p=3$ under independent and collective dephasing. For both models of decoherence, we show that the performance of open system ARA is far less sensitive to the choice of the initial state than its unitary counterpart, and, most significantly, that open system ARA by and large loses its time to solution advantage compared to standard quantum annealing. These results suggest that as a stand-alone strategy, ARA is unlikely to experimentally outperform standard "forward" quantum annealing, and that error mitigation strategies will likely be required in order to realize the benefits of ARA in realistic, noisy settings.

翻訳日:2023-02-27 16:12:00 公開日:2022-01-28

# 2次解析によるBennett-Brassard 1984プロトコルにおける2塩基間の最適比

Optimum ratio between two bases in Bennett-Brassard 1984 protocol with second order analysis ( http://arxiv.org/abs/2201.11960v1 )

ライセンス: Link先を確認

Masahito Hayashi

(参考訳) ベネット・ブラッサード 1984 (bb84) プロトコルでは,コヒーレント攻撃時の生成鍵の長さに対する2次拡張を用いて,2つのベース,ビットベース,位相ベースの選択比率を最適化する。この最適化は、ベースの不一致による送信ビットの損失と、位相ベースにおける誤差率の推定誤差とのトレードオフに対処する。次に、第2次漸近性を有する生成鍵の最適比と最適長さを求める。驚くべきことに、2次の順序は$n^{3/4}$であり、これは従来の設定では$n$が量子通信の数であるとき、$n^{1/2}$よりもはるかに大きい。この事実は、我々の設定が従来の問題よりも2階解析においてはるかに重要であることを示している。この重要性を説明するために,第2次補正の効果を数値的にプロットする。

In the Bennett-Brassard 1984 (BB84) protocol, we optimize the ratio of the choice of two bases, the bit basis and the phase basis, by using the second-order expansion for the length of the generated keys under the coherent attack. This optimization addresses the trade-off between the loss of transmitted bits due to the disagreement of their bases and the estimation error of the error rate in the phase basis. Then, we derive the optimum ratio and the optimum length of the generated keys with the second-order asymptotics. Surprisingly, the second-order term is of order $n^{3/4}$, which is much larger than the second-order term of order $n^{1/2}$ in the conventional setting, where $n$ is the number of quantum communications. This fact shows that our setting has much larger importance for the second-order analysis than the conventional problem. To illustrate this importance, we numerically plot the effect of the second-order correction.

翻訳日:2023-02-27 16:11:37 公開日:2022-01-28

# 遠距離量子メモリの絡み合い

Entangling metropolitan-distance separated quantum memories ( http://arxiv.org/abs/2201.11953v1 )

ライセンス: Link先を確認

Xi-Yu Luo, Yong Yu, Jian-Long Liu, Ming-Yang Zheng, Chao-Yang Wang, Bin Wang, Jun Li, Xiao Jiang, Xiu-Ping Xie, Qiang Zhang, Xiao-Hui Bao, Jian-Wei Pan

(参考訳) 量子インターネットは、すべての量子リソースを接続するという約束を与え、ローカライズされたシナリオをはるかに超えるアプリケーションを可能にする。プロトタイプは、絡み合って分離された量子記憶のネットワークである。従来は距離が限られていた。本稿では,2つの原子量子メモリ間の遠隔絡み合いを,大都市圏で直接12.5km間隔で物理的に分離した。原子-光子結合を1つのノードに生成し、光子を第2ノードに送信して記憶する。 20.5kmのフィールド展開ファイバによる低損失伝送を周波数ダウンコンバージョンとアップコンバージョンを用いて活用する。最終的なメモリ・メモリの絡み合いは、光子を回収することで90%の忠実さが証明される。我々の実験は、実用的なシナリオで量子ネットワークアプリケーションを研究する方法である。

The quantum internet promises to connect all quantum resources, and it will enable applications far beyond a localized scenario. A prototype is a network of quantum memories that are entangled and well separated. Previous realizations are limited in distance. In this paper, we report the establishment of remote entanglement between two atomic quantum memories physically separated by 12.5 km directly in a metropolitan area. We create atom-photon entanglement in one node and send the photon to a second node for storage. We harness low-loss transmission through a field-deployed fiber of 20.5 km by making use of frequency down-conversion and up-conversion. The final memory-memory entanglement is verified, by retrieving the stored states as photons, to have a fidelity of 90%. Our experiment paves the way to study quantum network applications in a practical scenario.

翻訳日:2023-02-27 16:11:22 公開日:2022-01-28

# トラップイオン量子コンピュータ上のマルチラウンドQAOAおよび高度なミキサー

Multi-round QAOA and advanced mixers on a trapped-ion quantum computer ( http://arxiv.org/abs/2201.12335v1 )

ライセンス: Link先を確認

Yingyue Zhu, Zewen Zhang, Bhuvanesh Sundar, Alaina M. Green, C. Huerta Alderete, Nhung H. Nguyen, Kaden R. A. Hazzard, Norbert M. Linke

(参考訳) グラフ上の組合せ最適化問題は、科学と工学に幅広い応用がある。量子近似最適化アルゴリズム(Quantum Approximate Optimization Algorithm, QAOA)は、変分回路の複数ラウンドを適用して量子コンピュータ上でこれらの問題を解く方法である。しかし、QAOAの実際の応用を制限するいくつかの課題が存在する。本稿では、複数の任意のグラフ上の複数の問題に対するラウンド数によってqaoa結果が改善するトラップイオン量子コンピュータについて述べる。また,任意の重みを持つ最適解をサンプリングできる高度な混合ハミルトニアンを示す。結果は,実世界の問題に量子アルゴリズムを適用するための一歩である。

Combinatorial optimization problems on graphs have broad applications in science and engineering. The Quantum Approximate Optimization Algorithm (QAOA) is a method to solve these problems on a quantum computer by applying multiple rounds of variational circuits. However, there exist several challenges limiting the real-world applications of QAOA. In this paper, we demonstrate on a trapped-ion quantum computer that QAOA results improve with the number of rounds for multiple problems on several arbitrary graphs. We also demonstrate an advanced mixing Hamiltonian that allows sampling of all optimal solutions with predetermined weights. Our results are a step towards applying quantum algorithms to real-world problems.

翻訳日:2023-02-27 16:04:08 公開日:2022-01-28

# 散逸工学による非ユニタリゲート操作

Nonunitary Gate Operations by Dissipation Engineering ( http://arxiv.org/abs/2201.12330v1 )

ライセンス: Link先を確認

E. Zapusek, A. Javadi, F. Reiter

(参考訳) 無可逆論理はユニタリ量子進化と相反する。そのような操作を古典的な測定でエミュレートすることは、外乱と高いリソース要求をもたらす可能性がある。これらの制限を克服するために, 不可逆ゲート操作に必要な非単位進化を実現するために, 散逸を利用するプロトコルを提案する。崩壊する新たな励起状態を用いて、最小の安定ヒルベルト空間上で所望のゲート演算を行う効果的な崩壊過程を設計する。これらは、測定を必要とせず、決定論的かつ自律的に動作する。我々は、OR、NOR、XORゲートなどの古典論理演算を考慮に入れたアプローチを例証する。実験的な実現に向けて、量子ドットの実装の可能性について議論する。本研究では,非可逆論理演算を現実的な量子システム上で効率的に行うことができ,非単体進化を得るためには散逸工学が不可欠であることを示す。提案したオペレーションは、量子エンジニアのツールボックスを拡張し、NISQアルゴリズムと量子機械学習に有望な応用をもたらす。

Irreversible logic is at odds with unitary quantum evolution. Emulating such operations by classical measurements can result in disturbances and high resource demands. To overcome these limitations, we propose protocols that harness dissipation to realize the nonunitary evolution required for irreversible gate operations. Using additional excited states subject to decay, we engineer effective decay processes that perform the desired gate operations on the smallest stable Hilbert space. These operate deterministically and in an autonomous fashion, without the need for measurements. We exemplify our approach considering several classical logic operations, such as the OR, NOR, and XOR gates. Towards experimental realization, we discuss a possible implementation in quantum dots. Our study shows that irreversible logic operations can be efficiently performed on realistic quantum systems and that dissipation engineering is an essential tool for obtaining nonunitary evolutions. The proposed operations expand the quantum engineers' toolbox and have promising applications in NISQ algorithms and quantum machine learning.

翻訳日:2023-02-27 16:03:57 公開日:2022-01-28

# 量子後連想記憶

A Post-Quantum Associative Memory ( http://arxiv.org/abs/2201.12305v1 )

ライセンス: Link先を確認

Ludovico Lami, Daniel Goldwater, Gerardo Adesso

(参考訳) 連想記憶(Associative memory)は、部分的な開示だけで完全に検索できる情報を記憶する装置である。我々は、いくつかの基本的な操作公理を満たす物理理論の最も一般的なクラスである一般確率論(GPT)の枠組みの中で、連想記憶のトイモデルとその究極的な限界を調べる。具体的には、任意の$N$個が完全に識別可能であるような$2^m$個の状態を収容するために、GPTの次元がどれだけ大きくなければならないかを問う。Danzer と Grünbaum による古い結果を用いて、$N=2$のときこの問いに対する最適な答えは$m+1$であり、理論が古典的または量子的であることを要求した場合の$O(2^m)$と対照的であることを証明する。これは、GPTが古典理論と量子理論の両方を指数関数的に上回るタスクの例を与える。$N\geq 3$の場合の同じ問題は未解決のままである。

Associative memories are devices storing information that can be fully retrieved given partial disclosure of it. We examine a toy model of associative memory and the ultimate limitations it is subjected to within the framework of general probabilistic theories (GPTs), which represent the most general class of physical theories satisfying some basic operational axioms. We ask ourselves how large the dimension of a GPT should be so that it can accommodate $2^m$ states with the property that any $N$ of them are perfectly distinguishable. Invoking an old result by Danzer and Grünbaum, we prove that when $N=2$ the optimal answer to this question is $m+1$, to be compared with $O(2^m)$ when the theory is required to be either classical or quantum. This yields an example of a task where GPTs outperform both classical and quantum theory exponentially. The same problem for $N\geq 3$ is left open.

翻訳日:2023-02-27 16:03:31 公開日:2022-01-28

# 宇宙の絡み合いに対する幾何学的補正

Geometric corrections to cosmological entanglement ( http://arxiv.org/abs/2201.12299v1 )

ライセンス: Link先を確認

Alessio Belfiglio, Orlando Luongo, Stefano Mancini

(参考訳) 均質および等方的宇宙背景上の不均質摂動による絡み合い生成について検討し、量子効果と幾何効果の相互作用が、均質なシナリオに関して絡み合いエントロピーに関連があることを示した。そのため、共形結合したスカラー場に注目し、スカラー粒子の幾何的生成が絡み合うかについて議論する。摂動的に、第一階ではエントロピー補正の振動を見出すが、第二階では下層幾何が絡み合い生成のモード混合を誘導する。したがって,幾何学的貢献のみによる絡み合いを定量化し,これまでの結果と比較した。ダークマター候補として解釈された幾何学的(準)粒子による幾何学的寄与を特徴付ける。

We investigate entanglement production by inhomogeneous perturbations over a homogeneous and isotropic cosmic background, demonstrating that the interplay between quantum and geometric effects can have relevant consequences on entanglement entropy, with respect to homogeneous scenarios. To do so, we focus on a conformally coupled scalar field and discuss how geometric production of scalar particles leads to entanglement. Perturbatively, at first order we find oscillations in entropy correction, whereas at second order the underlying geometry induces mode-mixing on entanglement production. We thus quantify entanglement solely due to geometrical contribution and compare our outcomes with previous findings. We characterize the geometric contribution through geometric (quasi)-particles, interpreted as dark matter candidates.

翻訳日:2023-02-27 16:03:12 公開日:2022-01-28

# 監視量子回路における絡み合いダイナミクスの3次元解法

Three-fold way of entanglement dynamics in monitored quantum circuits ( http://arxiv.org/abs/2201.12259v1 )

ライセンス: Link先を確認

Tara Kalsi, Alessandro Romito, Henning Schomerus

(参考訳) ダイソンの3つの円形アンサンブル(円形ユニタリ,直交,シンプレクティックアンサンブル,CUE,COE,CSE)上に構築された量子回路における測定誘起エンタングルメント遷移について検討する。局所ランダムユニタリゲートの交互に発展する一次元回路の確立したモデルと、測定速度が増加するにつれて広範囲から集中的な絡み合いスケーリングへの遷移を示すことで、キューから引き出すゲートに対して可変速度で行う射影計測を活用した。このケースをCOEとCSEと対比することにより、ゲートによる局所的な絡み合い発生と測定による絡み合い低減との相互作用の洞察を得る。このために,各ゲートが異なるアンサンブルで生成する絡み合いに対する解析的ランダム行列結果と,完全量子回路に対する数値結果を組み合わせた。これらの考察は、カルタンのKAK分解の本質を捉えた特性エンタングルメント行列の観点で統計エンタングリングパワーの効率的な言い換え、CSEに関連する反対称行列の固有値統計に対する一般的な結果を含む。

We investigate the measurement-induced entanglement transition in quantum circuits built upon Dyson's three circular ensembles (circular unitary, orthogonal, and symplectic ensembles; CUE, COE and CSE). We utilise the established model of a one-dimensional circuit evolving under alternating local random unitary gates and projective measurements performed with tunable rate, which for gates drawn from the CUE is known to display a transition from extensive to intensive entanglement scaling as the measurement rate is increased. By contrasting this case to the COE and CSE, we obtain insights into the interplay between the local entanglement generation by the gates and the entanglement reduction by the measurements. For this, we combine exact analytical random-matrix results for the entanglement generated by the individual gates in the different ensembles, and numerical results for the complete quantum circuit. These considerations include an efficient rephrasing of the statistical entangling power in terms of a characteristic entanglement matrix capturing the essence of Cartan's KAK decomposition, and a general result for the eigenvalue statistics of antisymmetric matrices associated with the CSE.

翻訳日:2023-02-27 16:02:58 公開日:2022-01-28

# 非線形キックド・マッハ・ツェンダー干渉計を用いた量子計測

Quantum metrology with a non-linear kicked Mach-Zehnder interferometer ( http://arxiv.org/abs/2201.12255v1 )

ライセンス: Link先を確認

Sabrina Müller and Daniel Braun

(参考訳) 位相シフト器に加えて非線形素子を含むマッハ・ツェンダー干渉計の感度について検討した。キャビティまたは光が何度も横切るループに両方の要素を含めることで、干渉計の非線形キックバージョンが生まれる。本研究では, 位相シフト, キック強度, 最大平均光子数, および初期コヒーレント状態における光子損失による減衰の関数としての感度について検討した。減衰したハイゼンベルクに制限された感度のスケーリングを消すためには、スクイーズが全光子数を支配している場合に生じる。最小から中程度の減衰率では、非線形キックは単位時間当たりの量子フィッシャー情報によって測定される感度をかなり高めることができる。

We study the sensitivity of a Mach-Zehnder interferometer that contains, in addition to the phase shifter, a non-linear element. By including both elements in a cavity or a loop that the light traverses many times, a non-linear kicked version of the interferometer arises. We study its sensitivity as a function of the phase shift, the kicking strength, the maximally reached average number of photons, and damping due to photon loss for an initial coherent state. We find that for vanishing damping, Heisenberg-limited scaling of the sensitivity arises if squeezing dominates the total photon number. For small to moderate damping rates, the non-linear kicks can considerably increase the sensitivity as measured by the quantum Fisher information per unit time.

翻訳日:2023-02-27 16:02:36 公開日:2022-01-28

# 量子イマジナリー時間進化によるMaxCutの解法

Solving MaxCut with Quantum Imaginary Time Evolution ( http://arxiv.org/abs/2201.12221v1 )

ライセンス: Link先を確認

Rizwanul Alam, George Siopsis, Rebekah Herrman, James Ostrowski, Phillip Lotshaw, Travis Humble

(参考訳) 量子イマジナリー時間発展(qite)に基づくマックスカット問題を効率的に解く手法を提案する。ユニタリ更新には線形Ansatzを使用し、絡み合いを伴わない初期状態とする。この手法を頂点数 |V| = 4,6,8,10 のグラフに適用し、平均解が最大MaxCut 解の 100%, 99%, 98%, 97% となることを示す。与えられたグラフと2つのエッジを持つグラフを補間する仮想時間依存ハミルトニアン補間を用いることで、修正アルゴリズムは最大8頂点のグラフと約100個の10頂点グラフのランダムサンプルに対して最大解に100%の精度で収束することを示した。この改良された手法は頂点数の多項式であるオーバーヘッドを持つ。

We introduce a method to solve the MaxCut problem efficiently based on quantum imaginary time evolution (QITE). We employ a linear Ansatz for unitary updates and an initial state that involve no entanglement. We apply the method to graphs with number of vertices |V| = 4,6,8,10, and show that after ten QITE steps, the average solution is 100%, 99%, 98%, 97%, respectively, of the maximum MaxCut solution. By employing an imaginary-time-dependent Hamiltonian interpolating between a given graph and a subgraph with two edges excised, we show that the modified algorithm has a 100% performance converging to the maximum solution of the MaxCut problem for all graphs up to eight vertices as well as about 100 random samples of ten-vertex graphs. This improved method has an overhead which is polynomial in the number of vertices.
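
The percentages quoted above compare the cut that was found with the true maximum cut. As a minimal classical sketch of that figure of merit (this is not the QITE algorithm of the paper; the function names and the 4-cycle example graph are illustrative assumptions), the maximum cut of a small graph can be found by brute force and used to score a candidate partition:

```python
from itertools import product


def cut_value(edges, assignment):
    """Count the edges whose endpoints fall on opposite sides of the partition."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def max_cut(num_vertices, edges):
    """Brute-force maximum cut: try every 0/1 labelling of the vertices."""
    best = 0
    for bits in product((0, 1), repeat=num_vertices):
        best = max(best, cut_value(edges, bits))
    return best


def approximation_ratio(num_vertices, edges, assignment):
    """Cut value of a candidate partition divided by the maximum cut."""
    return cut_value(edges, assignment) / max_cut(num_vertices, edges)


# Hypothetical example graph: a cycle on four vertices.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
ratio = approximation_ratio(4, square, (0, 0, 1, 1))
```

The enumeration grows as 2^|V|, so this check is only practical for small graphs such as the 4- to 10-vertex instances mentioned in the abstract.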

翻訳日:2023-02-27 16:02:22 公開日:2022-01-28

# 熱力学過程における量子コヒーレンスの役割

The roles of quantum coherence in thermodynamic processes ( http://arxiv.org/abs/2201.12202v1 )

ライセンス: Link先を確認

Jingyi Chen, Guozhen Su, Jincan Chen, and Shanhe Su

(参考訳) 2つの異なる固有基底ベクトルの重ね合わせに付随する量子コヒーレンスは熱力学において必須であると考えられている。系密度作用素とハミルトニアンの基底ベクトルの拡張として観測因子を記述することにより、コヒーレント因子を決定することができる。スピンの偏差や光子の自発的放出といった有限時間熱力学過程におけるコヒーレンスの役割を明らかにする。その結果,スピン沈降と自然放出過程における熱は,主にコヒーレンスによって生成されることがわかった。

Quantum coherence associated with the superpositions of two different sets of eigenbasis vectors has been regarded as essential in thermodynamics. It is found that coherent factors can be determined by writing observables as an expansion in the basis vectors of the systemic density operator and Hamiltonian. We reveal the roles of coherence in finite-time thermodynamic processes, such as the spin precession and the spontaneous emission of a photon. Results show that the work in the spin precession and the heat in the spontaneous emission process are mainly generated by coherence.

翻訳日:2023-02-27 16:02:05 公開日:2022-01-28

# UofA-Truth at Factify 2022: トランスフォーマーと転移学習に基づくマルチモーダルファクトチェック

UofA-Truth at Factify 2022 : Transformer And Transfer Learning Based Multi-Modal Fact-Checking ( http://arxiv.org/abs/2203.07990v1 )

ライセンス: Link先を確認

Abhishek Dhankar, Osmar R. Zaïane and Francois Bolduc

(参考訳) 特にテキスト、画像、ビデオ、音声を通じて情報を伝達する複数のモードを考える場合、偽ニュースを特定することは非常に難しい作業である。我々は,De-Factify@AAAI2022におけるFACTIFY共有タスクにおいて,複数のモーダルニュースソース(テキストや画像を含む)の自動誤報・誤報検出の問題に,単純かつ効果的に対処する試みを行った。私たちのモデルはF1重み付けスコア74.807%を生成しました。本稿では,共有タスクを行うためのアプローチについて説明する。

Identifying fake news is a very difficult task, especially when considering the multiple modes of conveying information through text, image, video and/or audio. We attempted to tackle the problem of automated misinformation/disinformation detection in multi-modal news sources (including text and images) through our simple, yet effective, approach in the FACTIFY shared task at De-Factify@AAAI2022. Our model produced an F1-weighted score of 74.807%, which was the fourth best out of all the submissions. In this paper we will explain our approach to undertake the shared task.

翻訳日:2023-02-27 15:56:01 公開日:2022-01-28

# 2次元Tiny外窓を用いた3次元量子力学の二重量子化

A Double Quantization for 3d Quantum Mechanics with 2d Tiny Extra Window ( http://arxiv.org/abs/2202.00539v1 )

ライセンス: Link先を確認

Zahra Ghahreman, Mehdi Dehghani, Majid Monemzadeh

(参考訳) 我々は、検出しようとする粒子の既存のコンパクト余剰次元の仮説に基づいて量子力学を構築する。確率関数を導入することにより、粒子の外部2dウィンドウへの遷移を表現する。この関数の一般的な性質について検討し、余分な窓への粒子発生のための長さスケールが与えられる。多様な視点から考えると、新しい長さスケールはプランク定数のほかに別の量子化のための別の量子基準となる。第二級制約系の正準量子化は、所望の量子力学を構築するための方法であり、その中に確率関数が第二級制約の構造に入る。これは、余剰次元の現象を効果的に3次元量子力学にインポートする。この効果的な二重量子論のいくつかの側面が述べられており、場の理論的な視界とは対照的に、余剰次元の量子を機械的に経験することに焦点を当てている。特に、線形微分方程式を解くためのフロベニウス法則を用いて、自由粒子の波動関数とスペクトルの解を作ろうとする。この文脈では、余剰次元の長さスケールは3次元空間に接続された小さな余剰ウィンドウの境界における波動方程式の特異性を特徴付ける。

We construct a quantum mechanics based on the hypothesis that compact extra dimensions exist and are accessible to a particle that probes them. By introducing a probability function, we describe the transition of the particle to the extra 2d window. The general properties of this function are examined, and a length scale for the occurrence of the particle in the extra window is given. From a different viewpoint, we consider that the new length scale acts as a further quantum criterion, giving another quantization besides the one set by the Planck constant. Canonical quantization of second-class constrained systems is our method for constructing the desired quantum mechanics, in which the probability function enters the structure of the second-class constraints. This effectively imports the phenomenon of the extra dimension into 3d quantum mechanics. Some aspects of this effective double quantum theory are mentioned, which may be investigated further in order to probe the extra dimension quantum mechanically, in contrast to field-theoretic approaches. In particular, we attempt to construct solutions for the wave function and spectrum of the free particle using the Frobenius method for solving linear differential equations. In this context, the length scale of the extra dimension characterizes the singularity of the wave equation at the boundary where the tiny extra window is connected to 3d space.

翻訳日:2023-02-27 15:55:33 公開日:2022-01-28

# 結合線形ポテンシャル上の準安定状態の非断熱崩壊

Nonadiabatic decay of metastable states on coupled linear potentials ( http://arxiv.org/abs/2201.12388v1 )

ライセンス: Link先を確認

Alisher Duspayev, Ansh Shah, Georg Raithel

(参考訳) 反対の傾きを持つ準位対の回避交差は、量子粒子の外部自由度に対するポテンシャルエネルギー曲線を形成しうる。本研究では、このような回避交差上の準安定状態(MSAC)の非断熱崩壊を、透熱(diabatic)表現と断熱表現を用いて調べる。系は単一のスケールされた断熱性パラメータ$V$によって記述される。時間に依存しない2成分Schrödinger方程式を両表現で解き、MSACの非断熱寿命を波動関数のフラックス計算とBreit-Wigner公式から決定することで、各MSACについて4つの寿命値を得る。また、両表現で時間依存Schrödinger方程式を解き、波動関数の減衰からMSACの寿命を導出する。MSAC寿命の6つの非摂動的な値の組はよく一致し、これらのアプローチの妥当性を裏付ける。断熱性パラメータ$V$が約10倍に増加すると、MSACの性質はかろうじて安定な状態から高度に安定な状態へと移行し、寿命は約10桁増加する。いくつかの領域における寿命の$\nu$依存性について議論する。時間依存摂動論は非摂動的な結果から$\lesssim 30\%$ずれた近似的な寿命を与えるのに対し、半古典的なLandau-Zenerトンネル方程式に基づく予測は、調べた$V$と$\nu$の範囲で最大20倍ずれることがわかった。これらの結果は、交差し結合したポテンシャルエネルギー曲線上に量子状態を持つ多くの原子系および分子系に関連する。

Avoided crossings of level pairs with opposite slopes can form potential energy curves for the external degree of freedom of quantum particles. We investigate nonadiabatic decay of metastable states on such avoided crossings (MSACs) using diabatic and adiabatic representations. The system is described by a single scaled adiabaticity parameter, $V$. The time-independent two-component Schrödinger equation is solved in both representations, and the nonadiabatic lifetimes of MSACs are determined from a wave-function flux calculation and from the Breit-Wigner formula, leading to four lifetime values for each MSAC. We also solve the time-dependent Schrödinger equation in both pictures and derive the MSAC lifetimes from wave-function decay. The sets of six non-perturbative values for the MSAC lifetimes agree well, validating the approaches. As the adiabaticity parameter $V$ is increased by about a factor of ten, the MSAC character transitions from marginally to highly stable, with the lifetimes increasing by about ten orders of magnitude. The $\nu$-dependence of the lifetimes in several regimes is discussed. Time-dependent perturbation theory is found to yield approximate lifetimes that deviate by $\lesssim 30\%$ from the non-perturbative results, while predictions based on the semi-classical Landau-Zener tunneling equation are found to be up to a factor of twenty off, over the ranges of $V$ and $\nu$ studied. The results are relevant to numerous atomic and molecular systems with quantum states on intersecting, coupled potential energy curves.

翻訳日:2023-02-27 15:54:59 公開日:2022-01-28

# 開群系におけるデコヒーレンスフリー部分空間の断熱制御

Adiabatic Control of Decoherence-Free-Subspaces in an Open Collective System ( http://arxiv.org/abs/2201.12379v1 )

ライセンス: Link先を確認

Jarrod T. Reilly, Simon B. Jäger, John Cooper, Murray J. Holland

(参考訳) 本稿では,脱コヒーレンスフリー部分空間 (DFS) を用いた原子アンサンブルの消散キャビティ内での制御手法を提案する。我々は,アンサンブルの放射振幅に分解的に干渉するキャビティにフィールドを注入することにより,システムのリンドブラッドジャンプ演算子の固有状態を設計できる。従来の断熱的 DFS 提案とは対照的に,提案手法は集団的デコヒーレンスの存在下で DFS を作成する。したがって、量子情報科学やメトロロジーに利用される多粒子の絡み合いを持つ状態を設計することができる。さらに,いわゆるアディアバティック基準から得られるダイアバティック進化の知識を利用する,より最適化された運転方式を示す。これにより、アンサンブル内の原子数に依存しない時間スケールで、非常に高い忠実度を持つ所望の状態へと進化することができる。 DFS固有状態を理論的に設計することにより、この手法は、散逸のみを用いて所望の状態に減衰する従来のスキームよりも高速な状態準備を可能にする。

We propose a method to adiabatically control an atomic ensemble using a decoherence-free subspace (DFS) within a dissipative cavity. We can engineer a specific eigenstate of the system's Lindblad jump operators by injecting a field into the cavity which destructively interferes with the emission amplitude of the ensemble. In contrast to previous adiabatic DFS proposals, our scheme creates a DFS in the presence of collective decoherence. We therefore have the ability to engineer states that have a high degree of multi-particle entanglement, which may be exploited for quantum information science or metrology. We further demonstrate a more optimized driving scheme that utilizes the knowledge of possible diabatic evolution gained from the so-called adiabatic criteria. This allows us to evolve to a desired state with exceptionally high fidelity on a time scale that does not depend on the number of atoms in the ensemble. By engineering the DFS eigenstate adiabatically, our method allows for faster state preparation than previous schemes that rely on damping into a desired state solely using dissipation.

翻訳日:2023-02-27 15:54:30 公開日:2022-01-28

# $\mathbb{Z}_2$対称性を持つ巡回アーベル格子ゲージ理論の電磁双対性

Electric-magnetic duality of $\mathbb{Z}_2$ symmetry enriched cyclic Abelian lattice gauge theory ( http://arxiv.org/abs/2201.12361v1 )

ライセンス: Link先を確認

Zhian Jia, Dagomir Kaszlikowski

(参考訳) キタエフの量子二重モデルはディクグラフ-ウィッテン位相量子場理論(tqft)の格子ゲージ理論による実現であり、その位相的に保護された基底状態空間は位相量子計算と位相量子記憶に広く応用されている。我々は、圏的枠組みにおける巡回アーベル群のモデルの一般化である $\mathbb{z}_2$ 対称性を調べ、明示的なハミルトニアン構成を示す。このモデルは、$\mathbb{Z}_2$対称性リッチトポロジカル位相(SET)の格子実現を提供する。我々は、電磁(EM)双対性対称性が特別な場合である位相のカテゴリー対称性について詳細に論じる。対称性欠陥の側面を, UBFC ($G$-crossed Unitary Braided fusion category) を用いて検討した。また, 対応するいずれの凝縮も決定し, ギャップ付き境界と境界バルク双対性についても検討した。そして、これらのSET相に対するEM双対性の明示的な格子実現を慎重に構築する。最後に、トポロジカル量子計算とトポロジカルメモリ理論におけるそれらの可能性について論じる。

Kitaev's quantum double model is a lattice gauge theoretic realization of Dijkgraaf-Witten topological quantum field theory (TQFT); its topologically protected ground state space has broad applications in topological quantum computation and topological quantum memory. We investigate the $\mathbb{Z}_2$ symmetry enriched generalization of the model for the cyclic Abelian group in a categorical framework and present an explicit Hamiltonian construction. This model provides a lattice realization of the $\mathbb{Z}_2$ symmetry enriched topological (SET) phase. We discuss in detail the categorical symmetry of the phase, for which the electric-magnetic (EM) duality symmetry is a special case. The aspects of symmetry defects are investigated using the $G$-crossed unitary braided fusion category (UBFC). By determining the corresponding anyon condensation, the gapped boundaries and boundary-bulk duality are also investigated. Then we carefully construct the explicit lattice realization of EM duality for these SET phases. Finally, their potential applications in topological quantum computation and topological memory theories are discussed.

翻訳日:2023-02-27 15:53:52 公開日:2022-01-28

# 配車サービス企業の最近の参入後のミュンヘン市における交通モード選好の調査

Exploring Preferences for Transportation Modes in the City of Munich after the Recent Incorporation of Ride-Hailing Companies ( http://arxiv.org/abs/2201.13284v1 )

ライセンス: Link先を確認

Maged Shoman and Ana Tsui Moreno

(参考訳) 近年のライドシェアリング(RH)企業の成長は、多くの点で都市移動に影響を与えている。このようなサービスのメリットに関する広範な主張にもかかわらず、この話題に関する限定的な研究が進められている。本稿では、ミュンヘン交通利用者のrhサービスに対する支払い意欲を評価する。 RH企業から直接データを取得することの難しさに気付き、前述した選好調査が設計された。データセットには500人の通勤者からの回答が含まれている。 RHサービスとそれに似たモード(オートとトランジット)を用いた8kmの旅行シナリオにおけるソシオドモグラフィー特性,現在の旅行行動および交通モードの嗜好を収集した。所得グループ間でrhサービスを使用するための時間とコスト係数を推定するために多項ロジットモデルが用いられ、rhの時間価値(vot)を推定するために使用された。モデルの結果、rhサービスの人気は18歳から39歳までで、自動車の数が少ない大きな世帯や世帯が多かった。高い収入グループは、RHサービスの使用に対してより多くのお金を払っている。ミュンヘン市におけるRHサービスのモーダルスプリットへの影響を検討するため、インクリメンタルロジットを用いた既存のネストロジットモード選択モデルにRHを新しいモードとして組み込んだ。旅行時間、旅行費、VOTは、通勤者がRHと最も近いモードであるメトロを選択する際の選択肢として用いられた。 20のシナリオを4つの異なる混雑レベルと4つの価格レベルで評価し、許容されるコストと時間的トレードオフに対応するために需要を反映した。

The growth of ride-hailing (RH) companies over the past few years has affected urban mobility in numerous ways. Despite widespread claims about the benefits of such services, limited research has been conducted on the topic. This paper assesses the willingness of Munich transportation users to pay for RH services. Given the difficulty of obtaining data directly from RH companies, a stated preference survey was designed. The dataset includes responses from 500 commuters. Sociodemographic attributes, current travel behavior and transportation mode preference in an 8 km trip scenario using the RH service and its similar modes (auto and transit) were collected. A multinomial logit model was used to estimate the time and cost coefficients for using RH services across income groups, which were then used to estimate the value of time (VOT) for RH. The model results indicate that RH services are popular among those aged 18 to 39, larger households and households with fewer autos. Higher income groups are also willing to pay more for using RH services. To examine the impact of RH services on modal split in the city of Munich, we incorporated RH as a new mode into an existing nested logit mode choice model using an incremental logit. Travel time, travel cost and VOT were used as measures for the choice commuters make when choosing between RH and its closest mode, metro. A total of 20 scenarios were evaluated at four different congestion levels and four price levels to reflect the demand in response to acceptable costs and time tradeoffs.
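
For a multinomial logit model with a utility that is linear in travel time and cost, the value of time is commonly taken as the ratio of the time coefficient to the cost coefficient. The sketch below illustrates that ratio together with the logit choice probabilities. The coefficient values, mode names and units (minutes, euros) are hypothetical and are not taken from the survey.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities from a dict of mode -> utility."""
    # Subtract the largest utility before exponentiating for numerical stability.
    largest = max(utilities.values())
    weights = {mode: math.exp(u - largest) for mode, u in utilities.items()}
    total = sum(weights.values())
    return {mode: w / total for mode, w in weights.items()}


def value_of_time(beta_time_per_min, beta_cost_per_eur):
    """Value of time in euros per hour: ratio of time to cost coefficient."""
    return 60.0 * beta_time_per_min / beta_cost_per_eur


# Hypothetical coefficients (both negative: more time or cost lowers utility).
BETA_TIME = -0.05  # utility per minute
BETA_COST = -0.20  # utility per euro


def utility(minutes, euros):
    """Linear utility of a trip with the hypothetical coefficients above."""
    return BETA_TIME * minutes + BETA_COST * euros


# Hypothetical trip attributes for two modes on the same journey.
shares = mnl_probabilities({
    "ride_hailing": utility(minutes=20, euros=12.0),
    "metro": utility(minutes=30, euros=3.0),
})
vot = value_of_time(BETA_TIME, BETA_COST)
```

With these illustrative numbers the cheaper, slower mode has the higher utility and therefore the larger predicted share. Separate coefficients per income group would give one value of time per group.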

翻訳日:2023-02-19 14:34:56 公開日:2022-01-28

# 統計的匿名性:ユーザーを再識別せずに再識別リスクを定量化する

Statistical anonymity: Quantifying reidentification risks without reidentifying users ( http://arxiv.org/abs/2201.12306v1 )

ライセンス: Link先を確認

Gecia Bravo-Hermsdorff, Robert Busa-Fekete, Lee M. Gunderson, Andrés Muñoz Medina, Umar Syed

(参考訳) データ匿名化は、参加者の再識別を防ぐためのプライバシ保護データリリースに対するアプローチであり、ノイズの多いデータを許容できないアプリケーションにおいて、差分プライバシーに対する重要な代替手段である。リリースデータに$k$-匿名化を強制する既存のアルゴリズムは、匿名化を実行するキュレーターが元のデータに完全にアクセスしたと仮定している。このアクセスを制限する理由は、望ましくないものから実現不可能なものまで様々である。本稿は,k$-匿名性の統計的概念を維持しつつ,キュレーターに置かれる信頼を減らすための,目的,メトリクス,プロトコル,拡張のアイデアを探求する。このようなフレームワークの主な目的として,信頼(キュレーターに提供する情報量)とプライバシ(参加者の匿名性)を提案する。我々は、これらの目標を達成することを目的としたプロトコルのクラスを説明し、プロセスで新たなプライバシー指標を提案し、関連する境界を証明する。最後に、中央キュレーターの必要性を完全に排除するこの作業の自然な拡張について論じる。

Data anonymization is an approach to privacy-preserving data release aimed at preventing participant reidentification, and it is an important alternative to differential privacy in applications that cannot tolerate noisy data. Existing algorithms for enforcing $k$-anonymity in the released data assume that the curator performing the anonymization has complete access to the original data. Reasons for limiting this access range from undesirability to complete infeasibility. This paper explores ideas -- objectives, metrics, protocols, and extensions -- for reducing the trust that must be placed in the curator, while still maintaining a statistical notion of $k$-anonymity. We suggest trust (amount of information provided to the curator) and privacy (anonymity of the participants) as the primary objectives of such a framework. We describe a class of protocols aimed at achieving these goals, proposing new metrics of privacy in the process, and proving related bounds. We conclude by discussing a natural extension of this work that completely removes the need for a central curator.
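
As background for the abstract above, the standard (non-statistical) notion of $k$-anonymity requires every combination of quasi-identifier values in the released table to be shared by at least $k$ records. The following minimal sketch computes the largest such $k$ for a table. It assumes a curator with full access to the data, which is exactly the assumption the paper seeks to relax; the column names and records are hypothetical.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the largest k for which the table is k-anonymous.

    Each record is a dict; records that agree on all quasi-identifier
    columns form one group, and k is the size of the smallest group.
    """
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical released table with generalized age and postcode columns.
released = [
    {"age": "30-39", "postcode": "803**", "diagnosis": "A"},
    {"age": "30-39", "postcode": "803**", "diagnosis": "B"},
    {"age": "40-49", "postcode": "805**", "diagnosis": "A"},
    {"age": "40-49", "postcode": "805**", "diagnosis": "C"},
    {"age": "40-49", "postcode": "805**", "diagnosis": "B"},
]
k = k_anonymity(released, ["age", "postcode"])
```

Here the smallest group has two records, so the table is 2-anonymous with respect to age and postcode.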

翻訳日:2023-02-19 14:33:27 公開日:2022-01-28

# TikTokのパーソナライズ要因に関する実証的研究

An Empirical Investigation of Personalization Factors on TikTok ( http://arxiv.org/abs/2201.12271v1 )

ライセンス: Link先を確認

Maximilian Boeker, Aleksandra Urman

(参考訳) tiktokは現在、急速に成長しているソーシャルメディアプラットフォームであり、月間アクティブユーザー数は10億人を超えている。 TikTokのアルゴリズムがプラットフォームの成功とコンテンツの配布に重要であるにもかかわらず、アルゴリズムの実証的な分析はほとんど行われていない。私たちの研究は、この研究ギャップを埋めるための基礎を築いた。当社が開発したカスタムアルゴリズムを用いたsock-puppet監査手法を用いて,tiktok,フォロー機能,いいね!機能へのアクセスに使用される言語とロケーションの効果と,ユーザが特定の投稿を長く見ることによって推奨されるコンテンツがどう変化するかをテストおよび分析した。テストされたすべての要素がTikTokユーザに推奨されるコンテンツに影響を与える証拠を提供する。さらに,フォロー機能の影響が最も強く,追従機能やビデオ視聴率が高いことがわかった。また,tiktokにおけるフィルタバブルの形成と問題コンテンツの拡散の文脈において,本研究の意義について考察する。

TikTok currently is the fastest growing social media platform with over 1 billion active monthly users of which the majority is from generation Z. Arguably, its most important success driver is its recommendation system. Despite the importance of TikTok's algorithm to the platform's success and content distribution, little work has been done on the empirical analysis of the algorithm. Our work lays the foundation to fill this research gap. Using a sock-puppet audit methodology with a custom algorithm developed by us, we tested and analysed the effect of the language and location used to access TikTok, follow- and like-feature, as well as how the recommended content changes as a user watches certain posts longer than others. We provide evidence that all the tested factors influence the content recommended to TikTok users. Further, we identified that the follow-feature has the strongest influence, followed by the like-feature and video view rate. We also discuss the implications of our findings in the context of the formation of filter bubbles on TikTok and the proliferation of problematic content.

翻訳日:2023-02-19 14:33:10 公開日:2022-01-28

# 正当な決定支援とは何か?

What is Legitimate Decision Support? ( http://arxiv.org/abs/2201.12071v1 )

ライセンス: Link先を確認

Yves Meinard, Alexis Tsoukiàs

(参考訳) 意思決定支援(英: decision support)とは、利用可能な理論知識と経験的データに基づいて、問題に直面した意思決定者への勧告を提供する科学と関連する実践である。この活動は数学的な問題の解決とアルゴリズムの認識に関係していると見なされることが多いが、本質的には経験的かつ社会的に枠づけられた活動であり、クライアントとアナリスト、そしてそれらと関係する第三者が重要な役割を果たす。 80年代以降、この意思決定支援の側面である妥当性と正当性を分析するための文献を2つの概念で構成してきた。妥当性はクライアントとアナリストの相互作用に焦点が当てられているが、正当性は、組織的状況、全体的な問題状況、環境、文化、歴史といった、より広い視点を指す。その重要性にも拘わらず、この概念は決定支援の文献にふさわしい関心を受けていない。本論文は,このギャップを埋めることを目的とする。そこで我々は,意思決定支援の文脈において有効な正当性の概念について,他の分野の文献を精査する。本稿では,本論文で見いだされた関連貢献を包含して,意思決定支援の文脈に適応した正当性に関する一般的な理論を提案する。この一般的な理論によれば、正当な意思決定支援介入とは、決定支援提供者が2つの条件を満たす正当化を行うものである。 (i)決定支援提供者の仲介者(有効性条件)を効果的に説得し、 (ii)できるだけ多様で多様な反論(真面目な条件)の活発な解明を中心に組織されている。その概念的単純さにもかかわらず、この意味で理解されている正当性は非常に厳密な要件であり、我々が主張する野心的な研究の道を開く。

Decision support is the science and associated practice that consist in providing recommendations to decision makers facing problems, based on available theoretical knowledge and empirical data. Although this activity is often seen as being concerned with solving mathematical problems and conceiving algorithms, it is essentially an empirical and socially framed activity, where interactions between clients and analysts, and between them and concerned third parties, play a crucial role. Since the 80s, two concepts have structured the literature devoted to analysing this aspect of decision support: validity and legitimacy. Whereas validity is focused on the interactions between the client and the analyst, legitimacy refers to the broader picture: the organisational context, the overall problem situation, the environment, culture, history. Despite its importance, this concept has not received the attention it deserves in the literature in decision support. The present paper aims at filling this gap. For that purpose, we review the literature in other disciplines relevant to elaborate a concept of legitimacy useful in decision support contexts. Based on this review, we propose a general theory of legitimacy, adapted to decision support contexts, encompassing the relevant contributions we found in the literature. According to this general theory, a legitimate decision support intervention is one for which the decision support provider produces a justification that satisfies two conditions: (i) it effectively convinces the decision support provider's interlocutors (effectiveness condition) and (ii) it is organised around the active elicitation of as many and as diverse counterarguments as possible (truthfulness condition). Despite its conceptual simplicity, legitimacy, understood in this sense, is a very exacting requirement, opening ambitious research avenues that we delineate.

翻訳日:2023-02-19 14:32:51 公開日:2022-01-28

# マルチエージェント強化学習における異種エージェントのパラメータ共有

Parameter Sharing For Heterogeneous Agents in Multi-Agent Reinforcement Learning ( http://arxiv.org/abs/2005.13625v7 )

ライセンス: Link先を確認

J. K. Terry, Nathaniel Grammel, Sanghyun Son, Benjamin Black

(参考訳) パラメータ共有は、各エージェントが独立して、すべてのポリシー間で完全に共有されたパラメータを持つポリシーを学習するものである。残念ながら、すべてのエージェントが同じポリシーネットワークを共有しているので、異なるポリシーやタスクを学べません。この問題は、観察にエージェント特異的な指標信号を加えることで実験的に回避され、「エージェント表示」と呼ばれる。エージェント表示は制限されているが、修正なしでは、アクション空間や観測空間が不均一な環境にパラメータ共有を適用することはできない。この研究はエージェント指示の概念を形式化し、それが最適ポリシーへの収束を可能にすることを初めて証明する。次に,不均一な観測と行動空間における学習へのパラメータ共有の拡張手法を正式に導入し,これらの手法が最適ポリシーへの収束を可能にすることを示す。最後に,関数を経験的に導入する方法を実験的に検証し,多数の異なるエージェント表示方式のグラフィカルな観測空間に対する経験的有効性について検討した。

Parameter sharing, where each agent independently learns a policy with fully shared parameters between all policies, is a popular baseline method for multi-agent deep reinforcement learning. Unfortunately, since all agents share the same policy network, they cannot learn different policies or tasks. This issue has been circumvented experimentally by adding an agent-specific indicator signal to observations, which we term "agent indication." Agent indication is limited, however, in that without modification it does not allow parameter sharing to be applied to environments where the action spaces and/or observation spaces are heterogeneous. This work formalizes the notion of agent indication and proves that it enables convergence to optimal policies for the first time. Next, we formally introduce methods to extend parameter sharing to learning in heterogeneous observation and action spaces, and prove that these methods allow for convergence to optimal policies. Finally, we experimentally confirm that the methods we introduce function empirically, and conduct a wide array of experiments studying the empirical efficacy of many different agent indication schemes for graphical observation spaces.
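
The idea of agent indication described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes observations are flat vectors, that heterogeneous observation sizes are reconciled by zero-padding, and that the indicator is a one-hot vector. The function names and the agent names `archer` and `knight` are hypothetical.

```python
# Sketch: pad heterogeneous observations to a common length and append a
# one-hot agent indicator, so a single shared policy network can tell the
# agents apart. All names are illustrative assumptions.

def pad_observation(obs, target_len, fill=0.0):
    """Right-pad an observation vector with `fill` up to target_len."""
    if len(obs) > target_len:
        raise ValueError("observation longer than target length")
    return list(obs) + [fill] * (target_len - len(obs))


def one_hot(index, size):
    """One-hot vector that identifies the agent at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def indicate(obs_by_agent):
    """Map {agent: obs} to {agent: padded obs followed by the indicator}."""
    names = sorted(obs_by_agent)
    max_len = max(len(o) for o in obs_by_agent.values())
    return {
        name: pad_observation(obs_by_agent[name], max_len) + one_hot(i, len(names))
        for i, name in enumerate(names)
    }


observations = {"archer": [0.2, 0.7, 0.1], "knight": [0.5]}
shared_inputs = indicate(observations)
```

Every entry of `shared_inputs` now has the same length, so all agents can feed the same network while remaining distinguishable by the trailing indicator.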

翻訳日:2022-11-28 07:53:25 公開日:2022-01-28

# 分散トレーニングにおける最適複雑性

Optimal Complexity in Decentralized Training ( http://arxiv.org/abs/2006.08085v4 )

ライセンス: Link先を確認

Yucheng Lu, Christopher De Sa

(参考訳) 分散化は、並列機械学習システムをスケールアップする有望な方法である。本稿では、確率的非凸設定において、そのような手法の反復複雑性の厳密な下限を提供する。我々の下限は、D-PSGDのような多くの既存の分散トレーニングアルゴリズムの既知の収束率の理論的ギャップを明らかにしている。我々は、この下限がきつく達成可能であることを構築によって証明する。この知見に動機づけられて,我々はさらに,対数ギャップだけで下限を達成する,実用的なゴシップ型分散アルゴリズムであるdetagを提案する。経験的に,画像分類タスクにおけるdetagと他の分散アルゴリズムを比較し,detagがベースライン,特に非シャッフルデータやスパースネットワークよりも高速に収束することを示す。

Decentralization is a promising method of scaling up parallel machine learning systems. In this paper, we provide a tight lower bound on the iteration complexity for such methods in a stochastic non-convex setting. Our lower bound reveals a theoretical gap in known convergence rates of many existing decentralized training algorithms, such as D-PSGD. We prove by construction that this lower bound is tight and achievable. Motivated by our insights, we further propose DeTAG, a practical gossip-style decentralized algorithm that achieves the lower bound with only a logarithmic gap. Empirically, we compare DeTAG with other decentralized algorithms on image classification tasks, and we show DeTAG enjoys faster convergence compared to baselines, especially on unshuffled data and in sparse networks.
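
To make "gossip-style" concrete, the sketch below shows plain gossip averaging on a ring of workers. It is not DeTAG itself, whose details are not given in the abstract. The ring topology, the uniform 1/3 mixing weights, and the function names are assumptions chosen for illustration.

```python
# Sketch: gossip averaging on a ring. Each worker repeatedly replaces its
# value with the average of itself and its two neighbours. The global mean
# is preserved and every worker converges to it without a central server.

def ring_gossip_matrix(n):
    """Symmetric, doubly stochastic mixing matrix for a ring of n >= 3 workers."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            w[i][j % n] = 1.0 / 3.0
    return w


def gossip_step(values, w):
    """One communication round: each worker mixes its neighbours' values."""
    n = len(values)
    return [sum(w[i][j] * values[j] for j in range(n)) for i in range(n)]


values = [4.0, 0.0, 0.0, 0.0]   # e.g. one scalar parameter per worker
w = ring_gossip_matrix(4)
for _ in range(20):
    values = gossip_step(values, w)
```

In decentralized SGD variants, a step of this kind is typically interleaved with local gradient updates. Sparser graphs mix more slowly, which is one reason network sparsity matters for convergence.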

翻訳日:2022-11-21 02:49:07 公開日:2022-01-28

# バイナリ分類のための量子判別器

Quantum Discriminator for Binary Classification ( http://arxiv.org/abs/2009.01235v3 )

ライセンス: Link先を確認

Prasanna Date and Wyatt Smith

(参考訳) 量子コンピュータは、高次元空間において比較的迅速に動作できるユニークな能力を持っている。本研究では,量子コンピュータが高次元空間で動作する能力を活用する量子識別器(Quantum Discriminator)と呼ばれる新しい量子機械学習モデルを提案する。量子判別器は、O(N logN)時間で量子古典ハイブリッドアルゴリズムを用いて訓練され、線形時間で普遍量子コンピュータ上で推論を行う。量子判別器は、ゼロ状態に初期化された予測キュービットと共に所定のデータムから抽出されたバイナリ特徴を入力とし、予測ラベルを出力する。我々は、irisデータセット上での性能を分析し、量子判別器が99%の精度が得られることを示す。

Quantum computers have the unique ability to operate relatively quickly in high-dimensional spaces; this is thought to give them a competitive advantage over classical computers. In this work, we propose a novel quantum machine learning model called the Quantum Discriminator, which leverages the ability of quantum computers to operate in high-dimensional spaces. The quantum discriminator is trained using a quantum-classical hybrid algorithm in O(N log N) time, and inference is performed on a universal quantum computer in linear time. The quantum discriminator takes as input the binary features extracted from a given datum along with a prediction qubit initialized to the zero state and outputs the predicted label. We analyze its performance on the Iris data set and show that the quantum discriminator can attain 99% accuracy in simulation.

翻訳日:2022-10-22 18:52:13 公開日:2022-01-28

# 交互K平均によるビクラスタリング

Biclustering with Alternating K-Means ( http://arxiv.org/abs/2009.04550v3 )

ライセンス: Link先を確認

Nicolas Fraiman, Zichao Li

(参考訳) ビクラスタリングは、データマトリックスの行と列を、サブグループ内の行と列が同様のパターンを示すように、異なるサブグループに同時にクラスタ化するタスクである。本稿では,ブロック対角二クラスターの生成事例について考察する。我々は,経験的クラスタリングリスクを最小限に抑えるというアイデアに基づいて,ビクラスタリング問題の新たな定式化を行う。経験的クラスタリングリスクに関して一貫性のある結果を開発し,証明する。最適化問題は本質的に組合せ的であるため、大域的な最小値の探索は計算的に難解である。そこで本研究では,カラムと行間のk-meansクラスタリングアルゴリズムの適応バージョンを交互に使用することにより,局所最小値を求める,シンプルで斬新なアルゴリズムを提案する。我々は,シミュレーションデータと実世界の遺伝子発現データセットを用いて,アルゴリズムの性能を他のビクラスタリング手法と比較した。その結果,本アルゴリズムは,データ中の有意義な構造を検知し,様々な設定や状況において競合する2クラスタリング手法より優れていることを示す。

Biclustering is the task of simultaneously clustering the rows and columns of the data matrix into different subgroups such that the rows and columns within a subgroup exhibit similar patterns. In this paper, we consider the case of producing block-diagonal biclusters. We provide a new formulation of the biclustering problem based on the idea of minimizing the empirical clustering risk. We develop and prove a consistency result with respect to the empirical clustering risk. Since the optimization problem is combinatorial in nature, finding the global minimum is computationally intractable. In light of this fact, we propose a simple and novel algorithm that finds a local minimum by alternating the use of an adapted version of the k-means clustering algorithm between columns and rows. We evaluate and compare the performance of our algorithm to other related biclustering methods on both simulated data and real-world gene expression data sets. The results demonstrate that our algorithm is able to detect meaningful structures in the data and outperform other competing biclustering methods in various settings and situations.

翻訳日:2022-10-20 08:56:48 公開日:2022-01-28

# 衝突バイアスによる分類器フェアネスの評価

Assessing Classifier Fairness with Collider Bias ( http://arxiv.org/abs/2010.03933v2 )

ライセンス: Link先を確認

Zhenlong Xu (1), Ziqi Xu (1), Jixue Liu (1), Debo Cheng (1), Jiuyong Li (1), Lin Liu (1), Ke Wang (2) ((1) STEM, University of South Australia, Adelaide, Australia, (2) Simon Fraser University, Burnaby, Canada) Ziqi Xu and Zhenlong Xu contributed equally to this paper

(参考訳) 日々の意思決定プロセスにおける機械学習技術の適用が増えているため、アルゴリズムによる意思決定の公平性が懸念されている。本稿では, 公平性評価に拍車をかける衝突型バイアスの問題と, 衝突型バイアスを回避する公平性評価を導くための定理を考案する。監査機関が訓練した分類器を監査する実世界の応用について検討する。本研究では, 開発した定理を用いて非バイアス評価アルゴリズムを提案する。実験およびシミュレーションにより, 提案手法は, 評価において有意な衝突バイアスを低減し, 訓練された分類器の監査に有望であることが示された。

The increasing application of machine learning techniques in everyday decision-making processes has brought concerns about the fairness of algorithmic decision-making. This paper concerns the problem of collider bias which produces spurious associations in fairness assessment and develops theorems to guide fairness assessment avoiding the collider bias. We consider a real-world application of auditing a trained classifier by an audit agency. We propose an unbiased assessment algorithm by utilising the developed theorems to reduce collider biases in the assessment. Experiments and simulations show the proposed algorithm reduces collider biases significantly in the assessment and is promising in auditing trained classifiers.

翻訳日:2022-10-09 11:23:08 公開日:2022-01-28

# 検索:バイリンガル辞書がニューラルマシン翻訳を改善する

Look It Up: Bilingual Dictionaries Improve Neural Machine Translation ( http://arxiv.org/abs/2010.05997v2 )

ライセンス: Link先を確認

Xing Jie Zhong and David Chiang

(参考訳) ニューラルマシン翻訳(nmt)の品質は向上しているが、希少な単語が問題となっている。人間にとって、レアワード問題の解法は、長い間辞書であったが、辞書は直接NMTに組み込むことはできない。本稿では,辞書の定義をレアな単語に"アタッチ"する新しい手法について述べる。二言語辞書を用いて最大1.8 bleuの改善を示す。

Despite advances in neural machine translation (NMT) quality, rare words continue to be problematic. For humans, the solution to the rare-word problem has long been dictionaries, but dictionaries cannot be straightforwardly incorporated into NMT. In this paper, we describe a new method for "attaching" dictionary definitions to rare words so that the network can learn the best way to use them. We demonstrate improvements of up to 1.8 BLEU using bilingual dictionaries.

翻訳日:2022-10-08 05:21:16 公開日:2022-01-28

# コードとテキストによる計算可能科学モデルの自動作成と人力支援

Automated Creation and Human-assisted Curation of Computable Scientific Models from Code and Text ( http://arxiv.org/abs/2202.13739v1 )

ライセンス: Link先を確認

Varish Mulwad, Andrew Crapo, Vijay S. Kumar, James Jobin, Alfredo Gabaldon, Nurali Virani, Sharad Dixit, Narendra Joshi

(参考訳) 科学モデルは複雑なシステムの振る舞いをよりよく理解し予測するための鍵を握る。科学的モデルの最も包括的な表現は、そのユーザビリティを支える重要な仮定やパラメータを含むが、通常は関連するソースコードやドキュメントに埋め込まれ、様々な(潜在的に時代遅れの)プログラミングプラクティスや言語が用いられる。ドメインの専門家は、コードに精通していない場合、科学的モデルの実装を完全に理解することができません。さらに、急速な研究と開発イテレーションは、絶え間なく進化する科学モデルコードベースに追いつくのを難しくします。これらの課題に対処するため、我々は、関連するインラインコメントや外部文書の文脈でモデルのコードを解析する計算可能な科学モデルの知識グラフの自動作成と人力によるキュレーションシステムを開発する。本システムでは,知識駆動型およびデータ駆動型アプローチを用いて,コードや方程式からテキスト文書から,ドメイン用語を用いた意味論的注釈モデルまで,関連する概念を識別・抽出する。これらのモデルは実行可能なpython関数に変換され、さらに複雑なワークフローに構成され、異なる形式のドメイン駆動質問に答えることができる。我々は、nasaのhypersonic aerospaces webサイトから派生したコードと関連するテキストのデータセットを用いて実験結果を示す。

Scientific models hold the key to better understanding and predicting the behavior of complex systems. The most comprehensive manifestation of a scientific model, including crucial assumptions and parameters that underpin its usability, is usually embedded in associated source code and documentation, which may employ a variety of (potentially outdated) programming practices and languages. Domain experts cannot gain a complete understanding of the implementation of a scientific model if they are not familiar with the code. Furthermore, rapid research and development iterations make it challenging to keep up with constantly evolving scientific model codebases. To address these challenges, we develop a system for the automated creation and human-assisted curation of a knowledge graph of computable scientific models that analyzes a model's code in the context of any associated inline comments and external documentation. Our system uses knowledge-driven as well as data-driven approaches to identify and extract relevant concepts from code and equations from textual documents to semantically annotate models using domain terminology. These models are converted into executable Python functions and then can further be composed into complex workflows to answer different forms of domain-driven questions. We present experimental results obtained using a dataset of code and associated text derived from NASA's Hypersonic Aerodynamics website.

翻訳日:2022-03-06 13:08:17 公開日:2022-01-28

# ディープラーニングを活用した自動能力評価

Towards automated Capability Assessment leveraging Deep Learning ( http://arxiv.org/abs/2202.04051v1 )

ライセンス: Link先を確認

Raoul Schönhof and Manuel Fechter

(参考訳) 製造業における経済効率の向上を目指して、自動化の度合いの向上が鍵となる。しかしながら、専用プロセスのための自動組立ソリューションの技術的実現可能性を評価することは困難であり、与えられた製品部品の形状によってしばしば決定される。自動化の実現可能性に関する決定的な基準は、単一部分の分離と分離、最終位置でのコンポーネントの自己調整の能力である。この実現可能性を評価するために,Fraunhofer 研究者によるアンケートに基づく評価手法を開発した。しかし、結果は、単一のエンジニアが評価を行うという暗黙の知識と経験に強く依存する。本稿では,voxelizationを用いた評価を自動化するソフトウェアツールneurocadを提案する。この手法によりCADファイルに基づくディープラーニングにより,抽象的かつ生産的なジオメトリ機能の評価が可能となる。

Aiming for a higher economic efficiency in manufacturing, an increased degree of automation is a key enabler. However, assessing the technical feasibility of an automated assembly solution for a dedicated process is difficult and often determined by the geometry of the given product parts. Among others, decisive criteria for automation feasibility are the ability to separate and isolate single parts and the capability of component self-alignment in the final position. To assess the feasibility, a questionnaire-based evaluation scheme has been developed and applied by Fraunhofer researchers. However, the results strongly depend on the implicit knowledge and experience of the single engineer performing the assessment. This paper presents NeuroCAD, a software tool that automates the assessment using voxelization techniques. The approach enables the assessment of abstract and production-relevant geometric features through deep learning based on CAD files.

翻訳日:2022-02-13 14:54:28 公開日:2022-01-28

# (参考訳) 画像分類のための低ランク特徴に基づく二重変換行列学習

Low-rank features based double transformation matrices learning for image classification ( http://arxiv.org/abs/2201.12351v1 )

ライセンス: CC BY 4.0

Yu-Hong Cai, Xiao-Jun Wu, Zhe Chen

(参考訳) 線形回帰は分類タスクで広く使われている教師付き手法である。分類タスクに線形回帰を適用するために,回帰目標を緩和する手法を提案した。しかし、この手法に基づく手法は、データに含まれる複雑な情報によって単一の変換行列の圧力を無視する。この場合、単一の変換行列はフレキシブルな射影を提供するには厳密すぎるため、変換行列に緩和を導入する必要がある。本稿では,潜在低ランク特徴抽出に基づく二重変換行列学習手法を提案する。中心となる考え方は、緩和のために二重変換行列を使い、学習した主特徴と正則な特徴を2方向からラベル空間に共同投影し、単一の変換行列の圧力を共有することである。まず、低ランク特徴を潜在低ランク表現(latlrr)法により学習し、2方向から元のデータを処理する。このプロセスでは、スパースノイズも分離され、射影学習に対する干渉がある程度軽減される。そして、2つの変換行列を導入して2つの特徴を別々に処理し、分類に有用な情報を抽出する。最後に、2つの変換行列は代替最適化法により容易に得ることができる。このような処理により,サンプル中に大量の冗長情報が含まれている場合でも,分類が容易な投影結果を得ることができる。複数のデータセットに対する実験は、特に複雑なシナリオにおいて、分類のためのアプローチの有効性を示す。

Linear regression is a supervised method that has been widely used in classification tasks. In order to apply linear regression to classification tasks, a technique for relaxing regression targets was proposed. However, methods based on this technique ignore the pressure placed on a single transformation matrix by the complex information contained in the data. A single transformation matrix in this case is too strict to provide a flexible projection, so it is necessary to relax the transformation matrix as well. This paper proposes a double transformation matrices learning method based on latent low-rank feature extraction. The core idea is to use double transformation matrices for relaxation and to jointly project the learned principal and salient features from two directions into the label space, which shares the pressure otherwise borne by a single transformation matrix. First, the low-rank features are learned by the latent low-rank representation (LatLRR) method, which processes the original data from two directions. In this process, sparse noise is also separated, which alleviates its interference with projection learning to some extent. Then, two transformation matrices are introduced to process the two features separately, and the information useful for classification is extracted. Finally, the two transformation matrices can be easily obtained by alternating optimization. Through such processing, even when the samples contain a large amount of redundant information, our method can still obtain projection results that are easy to classify. Experiments on multiple data sets demonstrate the effectiveness of our approach for classification, especially in complex scenarios.

翻訳日:2022-02-05 09:03:18 公開日:2022-01-28

# (参考訳) LULC画像解析のための3次元可視化と空間データマイニング

3D Visualization and Spatial Data Mining for Analysis of LULC Images ( http://arxiv.org/abs/2202.00123v1 )

ライセンス: CC BY 4.0

B. G. Kodge

(参考訳) 本研究では,3次元可視化における土地利用土地被覆(LUCL)画像解析のための新しいツールの開発を試みた。本研究は主に高分解能lc衛星画像の空間データマイニング技術を用いて行う。特徴空間の可視化は、画像データのパターンの探索と分類過程と関連する不確実性に関する洞察を可能にする。視覚的データマイニングは、ユーザーが分類プロセスに関与し、結果に対する自信を高め、理解することができるため、画像分類に付加価値を提供する。本研究では,lucl衛星画像の視覚データマイニング(vdm)のための画像分割,k-meansクラスタリング,および3次元可視化ツールの試作を行った。この体積に基づく表現は、特徴空間を球面またはボクセルに分割する。可視化ツールは,インド・マハラシュトラ州ラトゥル地区の高解像度LULC画像の分類研究において,サンプルデータとして用いられている。

The present study is an attempt to create a new tool for the analysis of Land Use Land Cover (LULC) images in 3D visualization. This study mainly uses spatial data mining techniques on high-resolution LULC satellite imagery. Visualization of feature space allows exploration of patterns in the image data and insight into the classification process and related uncertainty. Visual Data Mining provides added value to image classifications, as the user can be involved in the classification process, providing increased confidence in and understanding of the results. In this study, we present a prototype of an image segmentation, K-Means clustering and 3D visualization tool for visual data mining (VDM) of LULC satellite imagery into volume visualization. This volume-based representation divides feature space into spheres or voxels. The visualization tool is showcased in a classification study in which high-resolution LULC imagery of Latur district (Maharashtra state, India) is used as sample data.

翻訳日:2022-02-05 09:01:17 公開日:2022-01-28

# (参考訳) テキスト分類のための集合型アクティブラーニングとそのオンラインソーシャルメディアへの応用

Dominant Set-based Active Learning for Text Classification and its Application to Online Social Media ( http://arxiv.org/abs/2202.00540v1 )

ライセンス: CC BY 4.0

Toktam A. Oghaz, Ivan Garibay

(参考訳) オンラインソーシャルメディアにおける自然言語処理(NLP)の最近の進歩は、明らかに大規模なデータセットに負っている。しかし、大量のテキストデータポイント(例えばツイート)のラベル付け、保存、処理は依然として困難である。それに加えて、ヘイトスピーチ検出などのアプリケーションでは、攻撃的コンテンツを含む十分に大きなデータセットをラベル付けすることは、人間のアノテータに対して精神的および感情的に課税することができる。したがって、ラベル付きデータポイントを著しく少ないものにできるNLP手法は非常に興味深い。本稿では,最小のアノテーションコストで大規模未ラベルコーパスのトレーニングに使用できる,プールベースのアクティブラーニング手法を提案する。そこで我々は,局所クラスタ群を特徴空間に配置する手法を提案する。これらの集合はデータの最大結合構造を表す。すると、支配的な集合のどれにも属さないサンプルは、局所クラスタの境界を表すため、モデルのトレーニングに使用されるように選択され、分類することがより困難になる。提案手法は,データセットに依存しないパラメータを持たず,完全なトレーニングデータと同等の分類精度をほぼ達成でき,データポイントも大幅に少ない。さらに,本手法は,最先端のアクティブ学習戦略と比較して高い性能を実現する。さらに,提案アルゴリズムは,不確実性に基づくスコアなどの従来のアクティブな学習スコアを選択基準に組み込むことができる。異なるデータセットと異なるニューラルネットワークアーキテクチャを用いて,本手法の有効性を示す。

Recent advances in natural language processing (NLP) in online social media are evidently owed to large-scale datasets. However, labeling, storing, and processing a large number of textual data points, e.g., tweets, has remained challenging. On top of that, in applications such as hate speech detection, labeling a sufficiently large dataset containing offensive content can be mentally and emotionally taxing for human annotators. Thus, NLP methods that can make the best use of significantly fewer labeled data points are of great interest. In this paper, we present a novel pool-based active learning method that can be used for training on a large unlabeled corpus with minimum annotation cost. For that, we propose to find the dominant sets of local clusters in the feature space. These sets represent maximally cohesive structures in the data. Then, the samples that do not belong to any of the dominant sets are selected to be used to train the model, as they represent the boundaries of the local clusters and are more challenging to classify. Our proposed method does not have any parameters to be tuned, making it dataset-independent, and it can approximately achieve the same classification accuracy as full training data, with significantly fewer data points. Additionally, our method achieves a higher performance in comparison to the state-of-the-art active learning strategies. Furthermore, our proposed algorithm is able to incorporate conventional active learning scores, such as uncertainty-based scores, into its selection criteria. We show the effectiveness of our method on different datasets and using different neural network architectures.
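
The selection rule above (label the samples that fall outside the dominant sets) can be illustrated with a small sketch. It assumes that a dominant set is extracted with discrete replicator dynamics on a pairwise similarity matrix, which is the standard technique for dominant sets. Whether the paper uses exactly this procedure is not stated in the abstract. The toy similarity values, the support threshold, and the function names are made up for illustration, and only a single dominant set is extracted.

```python
# Sketch: extract one dominant set with replicator dynamics, then pick the
# samples outside it as candidates for annotation.

def replicator_dynamics(similarity, iterations=200):
    """Iterate x_i <- x_i * (A x)_i / (x^T A x) from the uniform distribution."""
    n = len(similarity)
    x = [1.0 / n] * n
    for _ in range(iterations):
        payoff = [sum(similarity[i][j] * x[j] for j in range(n)) for i in range(n)]
        average = sum(x[i] * payoff[i] for i in range(n))
        if average == 0.0:
            break
        x = [x[i] * payoff[i] / average for i in range(n)]
    return x


def dominant_set(similarity, threshold=1e-4):
    """Indices whose weight stays above the threshold (the support of x)."""
    weights = replicator_dynamics(similarity)
    return [i for i, w in enumerate(weights) if w > threshold]


# Samples 0-2 form a tight cluster; sample 3 is only weakly similar to them.
similarity = [
    [0.0, 0.9, 0.9, 0.1],
    [0.9, 0.0, 0.9, 0.1],
    [0.9, 0.9, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
cluster = dominant_set(similarity)
to_label = [i for i in range(len(similarity)) if i not in cluster]
```

Here the weakly connected sample ends up in `to_label`, matching the intuition that points outside cohesive clusters are the informative ones to annotate.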

翻訳日:2022-02-05 08:55:14 公開日:2022-01-28

# (参考訳) プライベート(ディープ)学習におけるバウンディングトレーニングデータ再構成

Bounding Training Data Reconstruction in Private (Deep) Learning ( http://arxiv.org/abs/2201.12383v1 )

ライセンス: CC BY 4.0

Chuan Guo, Brian Karrer, Kamalika Chaudhuri, Laurens van der Maaten

(参考訳) 差分プライバシーは、MLにおけるデータ漏洩を防ぐデファクト方法として広く受け入れられており、従来の知恵は、プライバシ攻撃に対する強力な保護を提供することを示している。しかし、既存のDPのセマンティックな保証は、相手の能力を過大評価する可能性があり、メンバーシップステータス自体が非感受性である場合には適用できないメンバーシップ推論に焦点を当てている。本稿では,形式的脅威モデルの下でのトレーニングデータ再構成攻撃に対するDP機構の最初のセマンティック保証を導出する。我々は,renyi differential privacyとfisher information leakという2つの異なるプライバシー会計手法が,データ復元攻撃に対して強い意味的保護を提供することを示した。

Differential privacy is widely accepted as the de facto method for preventing data leakage in ML, and conventional wisdom suggests that it offers strong protection against privacy attacks. However, existing semantic guarantees for DP focus on membership inference, which may overestimate the adversary's capabilities and is not applicable when membership status itself is non-sensitive. In this paper, we derive the first semantic guarantees for DP mechanisms against training data reconstruction attacks under a formal threat model. We show that two distinct privacy accounting methods -- Renyi differential privacy and Fisher information leakage -- both offer strong semantic protection against data reconstruction attacks.

翻訳日:2022-02-05 08:41:42 公開日:2022-01-28

# (参考訳) 加齢黄斑変性診断のための機械学習アルゴリズムの開発

Developing a Machine-Learning Algorithm to Diagnose Age-Related Macular Degeneration ( http://arxiv.org/abs/2201.12384v1 )

ライセンス: CC BY 4.0

Ananya Dua, Pham Hung Minh, Sajid Fahmid, Shikhar Gupta, Sophia Zheng, Vanessa Moyo, Yanran Elisa Xue

(参考訳) 現在、40歳以上の1200万人以上が眼疾患を患っている。最も一般的には、高齢の患者は加齢に伴う黄斑変性(網膜の劣化による中心視のぼやけを引き起こす眼疾患)の影響を受けやすい。前者は、複雑で高価な画像ソフトウェアでしか検出できず、目視検査が行われ、未治療の眼疾患を持つかなりの集団を残し、完全な視力喪失のリスクを負っている。眼疾患に対する機械学習アルゴリズムの使用が提案されている。しかしながら、これらのモデルの開発は、モデル性能を最大化するための適切なモデルとトレーニングパラメータに関する理解の欠如によって制限される。本研究では,n が 0, -1, -2, ... -6 である場合の学習速度 1 * 10^n の6つのモデルを生成し,各モデルに対する f1 スコアを算出した。分析の結果、サンプルの不均衡は機械学習モデルのトレーニングにおいて重要な課題であり、モデル予測性能の真の改善とはならない、トレーニングコストの騙し込みの改善をもたらす可能性があることが示された。この病気の幅広い影響と悪影響を考慮すると、我々は同じことを処理するための機械学習アルゴリズムを開発した。 5000人以上の患者による眼疾患データセットと、その感染した目の画像に基づいて、我々のモデルを訓練した。将来的には、このモデルが特に未資源の地域で広く使われ、眼疾患の診断や人間性の改善に活用されることを願っています。

Today, more than 12 million people over the age of 40 suffer from ocular diseases. Most commonly, older patients are susceptible to age-related macular degeneration, an eye disease that causes blurring of the central vision due to the deterioration of the retina. The disease can only be detected through complex and expensive imaging software, notably a visual field test; this leaves a significant population with untreated eye disease and holds them at risk for complete vision loss. The use of machine learning algorithms has been proposed for treating eye disease. However, the development of these models is limited by a lack of understanding regarding appropriate model and training parameters to maximize model performance. In our study, we address these points by generating 6 models, each with a learning rate of 1 * 10^n where n is 0, -1, -2, ... -6, and calculating an F1 score for each of the models. Our analysis shows that sample imbalance is a key challenge in the training of machine learning models and can result in deceptive improvements in training cost that do not translate to true improvements in model predictive performance. Considering the wide-ranging impact of the disease and its adverse effects, we developed a machine learning algorithm to address it. We trained our model on varying eye disease datasets consisting of over 5000 patients and the pictures of their infected eyes. In the future, we hope this model is used extensively, especially in areas that are under-resourced, to better diagnose eye disease and improve well-being for humanity.
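
The point about sample imbalance can be shown with a small worked example. The numbers below are invented for illustration (95 healthy and 5 diseased cases) and are not taken from the paper. The sketch only demonstrates why an F1 score is more telling than accuracy on imbalanced data.

```python
# Sketch: accuracy vs. F1 on an imbalanced toy data set (labels: 1 = diseased).

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator


y_true = [0] * 95 + [1] * 5
always_healthy = [0] * 100                      # never predicts disease
detector = [1] * 6 + [0] * 89 + [1] * 4 + [0]   # 6 false alarms, 4 of 5 caught

acc_trivial, f1_trivial = accuracy(y_true, always_healthy), f1_score(y_true, always_healthy)
acc_detector, f1_detector = accuracy(y_true, detector), f1_score(y_true, detector)
```

The trivial model has the higher accuracy (0.95 against 0.93) but an F1 of zero, while the detector that actually finds diseased cases has an F1 of 8/15.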

翻訳日:2022-02-05 07:53:49 公開日:2022-01-28

# (参考訳) マルチモーダル心内画像分割のための非教師なし領域適応

Few-shot Unsupervised Domain Adaptation for Multi-modal Cardiac Image Segmentation ( http://arxiv.org/abs/2201.12386v1 )

ライセンス: CC BY 4.0

Mingxuan Gu, Sulaiman Vesal, Ronak Kosti, Andreas Maier

(参考訳) 非教師なしドメイン適応(UDA)手法は、ラベル付けされていないターゲットドメインとラベル付けされたソースドメインデータを使用することで、ソースとターゲットドメイン間のギャップを減らすことを目的としている。これにより、新しいドメインに対するUDAメソッドの開発が制限される。本稿では,1つの未ラベル患者サンプルのみを利用できる現実的なシナリオにおいて,UDAの可能性を探る。これをマイショット非教師なしドメイン適応(fuda)と呼ぶ。まず、ソース画像からターゲットスタイルの画像を生成し、ランダム適応インスタンス正規化(rain)のある単一のターゲット患者から多様なターゲットスタイルを探索する。そして、生成された対象画像に教師付きでセグメント化ネットワークを訓練する。実験の結果,FUDAはベースラインに比べて目標領域でのDiceスコアの0.33向上し,より厳密なワンショット設定でDiceスコアの0.28向上を達成できた。私たちのコードは https://github.com/MingxuanGu/Few-shot-UDA で利用可能です。

Unsupervised domain adaptation (UDA) methods intend to reduce the gap between source and target domains by using unlabeled target domain and labeled source domain data. However, in the medical domain, target domain data may not always be easily available, and acquiring new samples is generally time-consuming. This restricts the development of UDA methods for new domains. In this paper, we explore the potential of UDA in a more challenging yet realistic scenario where only one unlabeled target patient sample is available. We call it Few-shot Unsupervised Domain Adaptation (FUDA). We first generate target-style images from source images and explore diverse target styles from a single target patient with Random Adaptive Instance Normalization (RAIN). Then, a segmentation network is trained in a supervised manner with the generated target images. Our experiments demonstrate that FUDA improves the segmentation performance by 0.33 of Dice score on the target domain compared with the baseline, and it also gives 0.28 of Dice score improvement in a more rigorous one-shot setting. Our code is available at https://github.com/MingxuanGu/Few-shot-UDA.
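
The style-transfer step builds on adaptive instance normalization (AdaIN), which can be sketched for a single feature channel as follows. This shows plain AdaIN with fixed style statistics only. How RAIN samples or perturbs those statistics is not described in the abstract, so that part is left out. The function names and the toy values are assumptions.

```python
# Sketch: adaptive instance normalization on one flattened feature channel.
# The content features are whitened with their own statistics and then
# rescaled to the mean and standard deviation of the target style.
import math


def mean_std(values, eps=1e-5):
    """Mean and (eps-stabilised) standard deviation of a list of floats."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance + eps)


def adain(content, style_mean, style_std):
    """Return content features re-normalised to the given style statistics."""
    mean, std = mean_std(content)
    return [style_std * (v - mean) / std + style_mean for v in content]


content_channel = [1.0, 2.0, 3.0, 4.0]
stylized = adain(content_channel, style_mean=10.0, style_std=2.0)
stylized_mean, stylized_std = mean_std(stylized)
```

After the transform, the channel keeps the relative ordering of the content features but carries the style's mean and, up to the small `eps` term, its standard deviation.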

翻訳日:2022-02-05 07:49:36 公開日:2022-01-28

# (参考訳) DoubleU-Net++:Vertebraeセグメンテーションのための爆発的マルチスケール機能を備えたアーキテクチャ

DoubleU-Net++: Architecture with Exploit Multiscale Features for Vertebrae Segmentation ( http://arxiv.org/abs/2201.12389v1 )

ライセンス: CC BY 4.0

Simindokht Jahangard, Mahdi Bonyani, Abbas Khosravi

(参考訳) 脊椎の正確な分節は、外科医を支援する様々な医学的応用(例えば遠隔手術)において重要な前提条件である。ディープニューラルネットワークの開発が成功した後、最近の研究は脊椎分節の本質的な規則に焦点を当てている。以前の作業には多数のパラメータが含まれており、セグメンテーションは1つのビューに制限されている。 DoubleU-Netに触発されてDoubleU-Net++と呼ばれる新しいモデルを提案し、DensNetを特徴抽出モジュールとして、CBAM(Convolutional Block Attention on Module)から特別注意モジュールを、抽出機能を改善するためにPraamid Squeeze Attention(PSA)モジュールを採用する。我々はverse2020とxvertsegデータセットの3つの異なるビュー(sagittal, coronal, axial)で提案モデルを評価する。最新の研究と比較すると,我々のアーキテクチャはより高速に訓練され,評価として高い精度,リコール,およびf1-score(4～6%)を達成し,またverse2020データセットでは矢状図が94%以上,コロナビューが94%以上,軸線ビューが93%以上となった。また,xvertsegデータセットでは,矢状視では97%,コロナ視では93%,軸視では96%以上の精度,リコール,f1-scoreを達成している。

Accurate segmentation of the vertebrae is an important prerequisite in various medical applications (e.g., telesurgery) to assist surgeons. Following the successful development of deep neural networks, recent studies have focused on the essential role of vertebral segmentation. Prior works contain a large number of parameters, and their segmentation is restricted to only one view. Inspired by DoubleU-Net, we propose a novel model named DoubleU-Net++, in which DenseNet as a feature extractor, a special attention module from the Convolutional Block Attention Module (CBAM), and a Pyramid Squeeze Attention (PSA) module are employed to improve the extracted features. We evaluate our proposed model on three different views (sagittal, coronal, and axial) of the VerSe2020 and xVertSeg datasets. Compared with state-of-the-art studies, our architecture trains faster and achieves higher precision, recall, and F1-score (improved by 4-6%). On the VerSe2020 dataset, the results are above 94% for the sagittal view, above 94% for the coronal view, and above 93% for the axial view. On the xVertSeg dataset, we achieve precision, recall, and F1-score above 97% for the sagittal view, above 93% for the coronal view, and above 96% for the axial view.

翻訳日:2022-02-05 07:44:29 公開日:2022-01-28

# (参考訳) compound-protein相互作用予測法の性能評価に関する知見

Insights into performance evaluation of compound-protein interaction prediction methods ( http://arxiv.org/abs/2202.00001v1 )

ライセンス: CC BY-SA 4.0

Adiba Yaseen (1), Imran Amin (2), Naeem Akhter (1), Asa Ben-Hur (3) and Fayyaz Minhas (4) ((1) Department of Computer and Information Sciences (DCIS), Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad, Pakistan, (2) National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan, (3) Department of Computer Science, Colorado State University, Fort Collins, USA, (4) Tissue Image Analytics Centre, Department of Computer Science, University of Warwick, Coventry, UK)

(参考訳) モチベーション: 複合タンパク質相互作用(CPI)の機械学習による予測は, 薬物設計, スクリーニング, 再資源化研究において重要であり, 湿式ラボアッセイの効率性と費用対効果を向上させることができる。近年,cpi予測因子を報告する多くの研究論文が公表されているが,モデル性能の楽観的評価に繋がる実験設計の問題点が数多く報告されている。結果:本論文では,既存の研究で見落としているCPI予測器の一般化に影響を及ぼすいくつかの重要な要因について分析する。クロスバリデーションにおけるトレーニングとテスト例の類似性 2. 実験的に検証された否定例がない場合に、否定例を生成するための戦略。 3. 評価プロトコルと性能指標の選択と大規模複合ライブラリのスクリーニングにおけるCPI予測器の現実利用との整合性既存の最先端手法(CPI-NN)とカーネルベースのアプローチの両方を用いて、CPI予測器の予測性能の評価には、トレーニングとテスト例の類似性について慎重に検討する必要があることが分かった。また、訓練や性能評価のための合成陰性例生成のためのランダムペアリングは、既存の研究で使われているより洗練された戦略と比較して、より一般化された性能を持つモデルに結果をもたらすことを示した。さらに、カーネルベースのアプローチは、そのシンプルな設計にもかかわらず、CPI-NNの予測性能を上回ることが判明した。提案したモデルを用いてSARS-CoV-2 SpikeやHuman ACE2などのタンパク質の複合スクリーニングを行い,そのトップヒットを裏付ける強い証拠を見出した。可用性: https://github.com/adibayaseen/HKRCPI Contact: Fayyaz.minhas@warwick.ac.uk

Motivation: Machine learning based prediction of compound-protein interactions (CPIs) is important for drug design, screening and repurposing studies and can improve the efficiency and cost-effectiveness of wet lab assays. Despite the publication of many research papers reporting CPI predictors in recent years, we have observed a number of fundamental issues in experiment design that lead to overly optimistic estimates of model performance. Results: In this paper, we analyze the impact of several important factors affecting generalization performance of CPI predictors that are overlooked in existing work: 1. Similarity between training and test examples in cross-validation. 2. The strategy for generating negative examples, in the absence of experimentally verified negative examples. 3. Choice of evaluation protocols and performance metrics and their alignment with real-world use of CPI predictors in screening large compound libraries. Using both an existing state-of-the-art method (CPI-NN) and a proposed kernel based approach, we have found that assessment of predictive performance of CPI predictors requires careful control over similarity between training and test examples. We also show that random pairing for generating synthetic negative examples for training and performance evaluation results in models with better generalization performance in comparison to more sophisticated strategies used in existing studies. Furthermore, we have found that our kernel based approach, despite its simple design, exceeds the prediction performance of CPI-NN. We have used the proposed model for compound screening of several proteins including SARS-CoV-2 Spike and Human ACE2 proteins and found strong evidence in support of its top hits. Availability: Code and raw experimental results available at https://github.com/adibayaseen/HKRCPI Contact: Fayyaz.minhas@warwick.ac.uk

翻訳日:2022-02-05 07:35:36 公開日:2022-01-28

# (参考訳) Syfer: プライベートデータリリースのための神経障害

Syfer: Neural Obfuscation for Private Data Release ( http://arxiv.org/abs/2201.12406v1 )

ライセンス: CC BY 4.0

Adam Yala, Victor Quach, Homa Esfahanizadeh, Rafael G. L. D'Oliveira, Ken R. Duffy, Muriel Médard, Tommi S. Jaakkola, Regina Barzilay

(参考訳) プライバシと予測ユーティリティのバランスは、医療におけるマシンラーニングの中心的な課題である。本稿では,再同定攻撃から保護する神経難読化法syferを開発した。 syferはトレーニングされた層をランダムなニューラルネットワークで構成し、元のデータ(例えばx線)をエンコードすると同時に、エンコードされたデータから診断を予測する能力を維持する。エンコーダのランダム性は、データ所有者のプライベートキーとして振る舞う。 1つの画像(ゲスワーク)を再特定するのに必要な攻撃者の数として、プライバシーを定量化する。推測作業を推定するためのコントラスト学習アルゴリズムを提案する。 DP画像などの差分的プライベートな手法が,実用性を著しく損なうことなく,プライバシを獲得できることを実証的に示す。対照的に、Syferはユーティリティを保ちながら強力なプライバシーを実現している。例えば、DP-image、Syfer、およびオリジナルのデータで構築されたX線分類器は平均AUCを0.53、0.78、0.86とする。

Balancing privacy and predictive utility remains a central challenge for machine learning in healthcare. In this paper, we develop Syfer, a neural obfuscation method to protect against re-identification attacks. Syfer composes trained layers with random neural networks to encode the original data (e.g., X-rays) while maintaining the ability to predict diagnoses from the encoded data. The randomness in the encoder acts as the private key for the data owner. We quantify privacy as the number of attacker guesses required to re-identify a single image (guesswork). We propose a contrastive learning algorithm to estimate guesswork. We show empirically that differentially private methods, such as DP-Image, obtain privacy at a significant loss of utility. In contrast, Syfer achieves strong privacy while preserving utility. For example, X-ray classifiers built with DP-Image, Syfer, and original data achieve average AUCs of 0.53, 0.78, and 0.86, respectively.

翻訳日:2022-02-05 07:19:26 公開日:2022-01-28

# (参考訳) モバイル介入のためのネットワークレストレストレストマルチアームバンディット

Networked Restless Multi-Armed Bandits for Mobile Interventions ( http://arxiv.org/abs/2201.12408v1 )

ライセンス: CC BY 4.0

Han-Ching Ou, Christoph Siebenbrunner, Jackson Killian, Meredith B Brooks, David Kempe, Yevgeniy Vorobeychik, Milind Tambe

(参考訳) 幅広い種類のモバイル介入問題に動機づけられ,ネットワーク効果を持つレストレス・マルチアーム・バンディット(rmabs)を提案し,検討した。我々のモデルでは、アームは部分的にリチャージされ、グラフを介して接続されているため、一方のアームを引くことで隣接するアームの状態も改善され、ネットワーク効果のない完全リチャージバンディットの設定が大幅に拡張される。モバイル介入では、ネットワーク効果は通常の人口移動(家と仕事の通勤など)によって生じることがある。 RMABのネットワーク効果は,既存の解法では考慮されていない強い報酬結合をもたらすことを示す。本稿では,ネットワーク化RMABに対する新しい解法を提案し,介入効果の構造に対する自然な仮定の下で生じる凹凸特性を利用する。理想化された環境でのアプローチの最適性に十分な条件を提供し,実世界グラフを用いた3つのモバイル介入領域における最先端のベースラインを経験的に上回っていることを示す。

Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects. In our model, arms are partially recharging and connected through a graph, so that pulling one arm also improves the state of neighboring arms, significantly extending the previously studied setting of fully recharging bandits with no network effects. In mobile interventions, network effects may arise due to regular population movements (such as commuting between home and work). We show that network effects in RMABs induce strong reward coupling that is not accounted for by existing solution methods. We propose a new solution approach for networked RMABs, exploiting concavity properties which arise under natural assumptions on the structure of intervention effects. We provide sufficient conditions for optimality of our approach in idealized settings and demonstrate that it empirically outperforms state-of-the-art baselines in three mobile intervention domains using real-world graphs.

翻訳日:2022-02-05 06:53:05 公開日:2022-01-28

# (参考訳) 社会会話におけるエンティティ中心コンテキスト追跡への統一的アプローチ

A Unified Approach to Entity-Centric Context Tracking in Social Conversations ( http://arxiv.org/abs/2201.12409v1 )

ライセンス: CC BY 4.0

Ulrich R\"uckert, Srinivas Sunkara, Abhinav Rastogi, Sushant Prakash, Pranav Khaitan

(参考訳) 人間と人間の会話では、コンテキストトラッキングは重要なエンティティを識別し、その特性と関係を追跡する。これはスロットタグ、コア参照解決、複数の参照の解決、エンティティリンクなど、いくつかのサブタスクを含む難しい問題である。本稿では,これまで述べたエンティティ参照,それらの特性,それらの関係を含むエンティティリポジトリによって,会話コンテキストを表現したエンドツーエンドモデリングタスクとして,この問題にアプローチする。リポジトリはターンバイターンで更新されるため、長い会話であっても、トレーニングと推論が計算的に効率的になる。本稿は,この枠組みを2つの方法で検討するための基礎研究を行う。まず、人間と位置アノテーションによるコンテキスト追跡のための、大規模な人間と人間の会話コーパスであるcontrackをリリースする。平均11.8ターン、5.8エンティティ、15.2参照を持つ7000以上の会話を含んでいる。次に、コンテキストトラッキングのためのニューラルネットワークアーキテクチャをオープンソース化します。最後に、このネットワークをサブタスクの最先端のアプローチと比較し、関連するトレードオフに関する結果を報告します。

In human-human conversations, Context Tracking deals with identifying important entities and keeping track of their properties and relationships. This is a challenging problem that encompasses several subtasks such as slot tagging, coreference resolution, resolving plural mentions and entity linking. We approach this problem as an end-to-end modeling task where the conversational context is represented by an entity repository containing the entity references mentioned so far, their properties and the relationships between them. The repository is updated turn-by-turn, thus making training and inference computationally efficient even for long conversations. This paper lays the groundwork for an investigation of this framework in two ways. First, we release Contrack, a large-scale human-human conversation corpus for context tracking with people and location annotations. It contains over 7000 conversations with an average of 11.8 turns, 5.8 entities and 15.2 references per conversation. Second, we open-source a neural network architecture for context tracking. Finally, we compare this network to state-of-the-art approaches for the subtasks it subsumes and report results on the involved tradeoffs.

翻訳日:2022-02-05 06:36:19 公開日:2022-01-28

# (参考訳) どんな変分オートエンコーダでも任意の条件付けができる

Any Variational Autoencoder Can Do Arbitrary Conditioning ( http://arxiv.org/abs/2201.12414v1 )

ライセンス: CC BY 4.0

Ryan R. Strauss, Junier B. Oliva

(参考訳) 任意条件付けは教師なし学習において重要な問題であり、ここでは条件密度$p(\mathbf{x}_u \mid \mathbf{x}_o)$をモデル化する。しかし、密度推定の大多数は、特徴間の重要な条件依存が不透明な共同分布 $p(\mathbf{x})$ をモデル化することのみに焦点を当てている。本稿では,任意の変分オートエンコーダ(VAE)をVAE自体を変更することなく任意の条件付けを行うことのできる,シンプルで汎用的なフレームワークであるPosterior Matchingを提案する。後方マッチングは、既存のvaeに基づくジョイント密度推定法に応用され、任意の条件付けに対する以前のアプローチが要求する特殊モデルを回避している。 Posterior Matchingは、様々なタスクに対する現在の最先端メソッドに匹敵する、あるいは優れているパフォーマンスを実現する。

Arbitrary conditioning is an important problem in unsupervised learning, where we seek to model the conditional densities $p(\mathbf{x}_u \mid \mathbf{x}_o)$ that underlie some data, for all possible non-intersecting subsets $o, u \subset \{1, \dots , d\}$. However, the vast majority of density estimation only focuses on modeling the joint distribution $p(\mathbf{x})$, in which important conditional dependencies between features are opaque. We propose a simple and general framework, coined Posterior Matching, that enables any Variational Autoencoder (VAE) to perform arbitrary conditioning, without modification to the VAE itself. Posterior Matching applies to the numerous existing VAE-based approaches to joint density estimation, thereby circumventing the specialized models required by previous approaches to arbitrary conditioning. We find that Posterior Matching achieves performance that is comparable to or better than current state-of-the-art methods for a variety of tasks.
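One way to see why a latent-variable model can in principle support arbitrary conditioning is the identity below. It is a sketch under the assumption, not stated in the abstract, that the decoder factorizes over features given the latent variable $\mathbf{z}$, so that $\mathbf{x}_u$ and $\mathbf{x}_o$ are conditionally independent given $\mathbf{z}$:

$$
p(\mathbf{x}_u \mid \mathbf{x}_o)
= \int p(\mathbf{x}_u \mid \mathbf{z}, \mathbf{x}_o)\, p(\mathbf{z} \mid \mathbf{x}_o)\, d\mathbf{z}
= \int p(\mathbf{x}_u \mid \mathbf{z})\, p(\mathbf{z} \mid \mathbf{x}_o)\, d\mathbf{z}.
$$

Under that assumption, the existing decoder already supplies $p(\mathbf{x}_u \mid \mathbf{z})$, and what remains is an approximation of the partially observed posterior $p(\mathbf{z} \mid \mathbf{x}_o)$, which could be learned separately without changing the VAE.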

翻訳日:2022-02-04 13:25:17 公開日:2022-01-28

# (参考訳) なぜ君を信頼すべきなのか、ベルマン? Bellman Errorは価値エラーの少ない代替品

Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error ( http://arxiv.org/abs/2201.12417v1 )

ライセンス: CC BY 4.0

Scott Fujimoto, David Meger, Doina Precup, Ofir Nachum, Shixiang Shane Gu

(参考訳) 本研究では,ベルマン方程式を数値予測精度の代用目的として利用することを検討した。ベルマン方程式はすべての状態-作用対上の真の値関数によって一意に解かれるが、ベルマン誤差(方程式の両側の違い)は値関数の精度の指標として不十分である。特に, 1) ベルマン方程式の両辺のキャンセルにより, ベルマン誤差の大きさは, すべての状態-作用対を考慮に入れた場合でも, 真の値関数との距離と弱い関係しかなく, 2) 有限データ状態においては, ベルマン方程式は無限に多くの準最適解によって正確に満たされることを示す。これは、値関数の精度を向上することなくベルマン誤差を最小化できることを意味する。これらの現象を、一連の命題、例示的なトイ例、標準ベンチマークドメインにおける経験的分析を通じて実証する。

In this work, we study the use of the Bellman equation as a surrogate objective for value prediction accuracy. While the Bellman equation is uniquely solved by the true value function over all state-action pairs, we find that the Bellman error (the difference between both sides of the equation) is a poor proxy for the accuracy of the value function. In particular, we show that (1) due to cancellations from both sides of the Bellman equation, the magnitude of the Bellman error is only weakly related to the distance to the true value function, even when considering all state-action pairs, and (2) in the finite data regime, the Bellman equation can be satisfied exactly by infinitely many suboptimal solutions. This means that the Bellman error can be minimized without improving the accuracy of the value function. We demonstrate these phenomena through a series of propositions, illustrative toy examples, and empirical analysis in standard benchmark domains.
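Point (2) of the abstract, that finite data admits infinitely many value functions with zero Bellman error, can be reproduced on a toy problem. The two-state MDP, the single observed transition, and all names below are assumptions chosen for illustration and do not come from the paper.

```python
GAMMA = 0.9

# Assumed true MDP: s0 -> s1 with reward 0, then s1 -> s1 with reward 1 forever.
true_value = {"s1": 1.0 / (1.0 - GAMMA)}
true_value["s0"] = 0.0 + GAMMA * true_value["s1"]

# Finite dataset: only the s0 -> s1 transition was ever observed.
dataset = [("s0", 0.0, "s1")]


def bellman_error(value, data, gamma=GAMMA):
    """Mean absolute Bellman residual over the observed transitions."""
    return sum(abs(value[s] - (r + gamma * value[s2])) for s, r, s2 in data) / len(data)


def value_error(value, reference):
    """Mean absolute distance to the true value function."""
    return sum(abs(value[s] - reference[s]) for s in reference) / len(reference)


def consistent_candidate(c, gamma=GAMMA):
    """Any choice of V(s1) = c yields zero Bellman error on the dataset."""
    return {"s1": c, "s0": gamma * c}


for c in (0.0, 10.0, 100.0):
    v = consistent_candidate(c)
    print(c, bellman_error(v, dataset), value_error(v, true_value))
```

Every candidate has zero Bellman error on the observed data, yet the value error changes with `c`, so minimizing the former says little about the latter in this setting.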

翻訳日:2022-02-04 13:02:47 公開日:2022-01-28

# (参考訳) FastFlows: 分子グラフ生成のためのフローベースモデル

FastFlows: Flow-Based Models for Molecular Graph Generation ( http://arxiv.org/abs/2201.12419v1 )

ライセンス: CC BY 4.0

Nathan C. Frey, Vijay Gadepally, Bharath Ramsundar

(参考訳) 本稿では, 正規化フローモデル, SELF参照組込み文字列, 小分子を効率的に生成する多目的最適化を用いたフレームワークを提案する。最初のトレーニングセットは100個の小さな分子で、FastFlowsは数秒で何千もの化学的に有効な分子を生成する。効率的なサンプリングのため、サブ構造フィルターは不合理なモーティーを持つ化合物を除去するために必要に応じて適用することができる。薬物類似性, 合成アクセシビリティ, 合成複雑性の計算が容易で学習可能なメトリクスを用いて, マルチオブジェクト最適化を行い, 高速な仮想スクリーニング環境でのFastFlowsの動作を実証する。我々のモデルは自己回帰型分子生成モデルよりもはるかにシンプルで訓練が容易であり、薬物様合成可能な分子の高速な生成と同定を可能にする。

We propose a framework using normalizing-flow based models, SELF-Referencing Embedded Strings, and multi-objective optimization that efficiently generates small molecules. With an initial training set of only 100 small molecules, FastFlows generates thousands of chemically valid molecules in seconds. Because of the efficient sampling, substructure filters can be applied as desired to eliminate compounds with unreasonable moieties. Using easily computable and learned metrics for druglikeness, synthetic accessibility, and synthetic complexity, we perform a multi-objective optimization to demonstrate how FastFlows functions in a high-throughput virtual screening context. Our model is significantly simpler and easier to train than autoregressive molecular generative models, and enables fast generation and identification of druglike, synthesizable molecules.

翻訳日:2022-02-04 12:24:56 公開日:2022-01-28

# (参考訳) 効率的な分散ディープラーニングのためのリソース利用ベンチマーク

Benchmarking Resource Usage for Efficient Distributed Deep Learning ( http://arxiv.org/abs/2201.12423v1 )

ライセンス: CC BY 4.0

Nathan C. Frey, Baolin Li, Joseph McDonald, Dan Zhao, Michael Jones, David Bestor, Devesh Tiwari, Vijay Gadepally, Siddharth Samsi

(参考訳) ディープラーニング(DL)ワークフローは、はるかに大きな利益を達成するために、計算とエネルギーの予算を継続的に増やすことを要求する。ニューラルネットワークの検索、ハイパーパラメータスイープ、ラピッドプロトタイピングは大量のリソースを消費し、リソース制約のある研究者が大規模なモデルの実験を行なわず、環境への影響も大きい。そのため、ディープニューラルネットワーク(DNN)とトレーニングの違いが、計算資源とエネルギー資源の増大をどのように活用するかを理解することが不可欠である。本稿では,最大424のグラフィックス処理ユニット(GPU)上で,さまざまなドメイン/タスク(自然言語処理,コンピュータビジョン,化学)を表すディープネットワークの配列を3,400以上の実験を行った。実験では,計算資源特性と電力利用やgpuクロックレート制限などの省エネ機構を系統的に変化させ,各代表モデルが様々な資源・エネルギー制約条件下で提示するトレードオフやスケーリング行動の把握と説明を行う。トレーニング時間が利用可能な計算資源とエネルギー制約によってどのようにスケールするかを記述する、パワーローモデルに適合します。これらの知見は,各種ディープラーニングタスク/ワークフローのエネルギー消費を,トレーニングへの影響を最小限に抑えて選択的に削減し,資源利用の最適化において,高性能コンピューティングプロバイダに情報提供と指導を支援することを期待する。

Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. Neural architecture searches, hyperparameter sweeps, and rapid prototyping consume immense resources that can prevent resource-constrained researchers from experimenting with large models and carry considerable environmental impact. As such, it becomes essential to understand how different deep neural networks (DNNs) and training leverage increasing compute and energy resources -- especially specialized computationally-intensive models across different domains and applications. In this paper, we conduct over 3,400 experiments training an array of deep networks representing various domains/tasks -- natural language processing, computer vision, and chemistry -- on up to 424 graphics processing units (GPUs). During training, our experiments systematically vary compute resource characteristics and energy-saving mechanisms such as power utilization and GPU clock rate limits to capture and illustrate the different trade-offs and scaling behaviors each representative model exhibits under various resource and energy-constrained regimes. We fit power law models that describe how training time scales with available compute resources and energy constraints. We anticipate that these findings will help inform and guide high-performance computing providers in optimizing resource utilization, by selectively reducing energy consumption for different deep learning tasks/workflows with minimal impact on training.
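Fitting a power law of the form $y = a x^{b}$ to scaling measurements can be done with ordinary least squares in log-log space. The sketch below uses synthetic numbers; the function name, the GPU counts, and the training times are made up for illustration and are not results from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b via linear regression on (log x, log y)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                       # exponent (slope in log-log space)
    a = math.exp(mean_y - b * mean_x)   # prefactor (exp of the intercept)
    return a, b


# Synthetic, noise-free data: training time falls as 100 * gpus**-0.8.
gpus = [1, 2, 4, 8, 16]
hours = [100.0 * g ** -0.8 for g in gpus]
a, b = fit_power_law(gpus, hours)
```

An exponent of -1 would mean perfect linear speedup; values closer to zero indicate diminishing returns from adding more GPUs.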

翻訳日:2022-02-04 11:36:10 公開日:2022-01-28

# (参考訳) CoordX: 分割型MLPアーキテクチャによる暗黙のニューラル表現の高速化

CoordX: Accelerating Implicit Neural Representation with a Split MLP Architecture ( http://arxiv.org/abs/2201.12425v1 )

ライセンス: CC BY 4.0

Ruofan Liang, Hongyi Sun, Nandita Vijaykumar

(参考訳) 多層パーセプトロン(MLP)を用いた暗黙的神経表現は、近年、新しいビュー合成や3Dオブジェクト表現やレンダリングなど、様々なタスクで注目されている。しかし、これらの表現の重大な課題は、画像、ビデオ、または3Dオブジェクトを学習し、表現するために、多数の入力座標に対するMLPによるトレーニングと推論の両方が大量の計算と長い処理時間を必要とすることである。本研究では,新たな分割型MLPアーキテクチャであるCoordXを提案することにより,暗黙的ニューラル表現のための座標ベースMLPの推論と訓練を高速化することを目的とする。 CoordXでは、初期層を分割して入力座標の各次元を個別に学習する。中間の特徴は最後の層によって融合され、対応する座標点で学習信号を生成する。これにより、必要な計算量が大幅に削減され、トレーニングや推論のスピードアップが大きくなり、ベースラインMLPと同じような精度が達成される。このアプローチは、元の信号の分解である最初の学習機能を目標とし、学習した信号を生成するためにそれらを融合させる。提案アーキテクチャは,多くの暗黙的ニューラル表現タスクにおいて,メモリオーバーヘッドを伴わずに利用できる。画像,映像,3次元形状表現およびレンダリングタスクのベースラインモデルと比較して,最大2.92倍の高速化を示す。

Implicit neural representations with multi-layer perceptrons (MLPs) have recently gained prominence for a wide variety of tasks such as novel view synthesis and 3D object representation and rendering. However, a significant challenge with these representations is that both training and inference with an MLP over a large number of input coordinates to learn and represent an image, video, or 3D object, require large amounts of computation and incur long processing times. In this work, we aim to accelerate inference and training of coordinate-based MLPs for implicit neural representations by proposing a new split MLP architecture, CoordX. With CoordX, the initial layers are split to learn each dimension of the input coordinates separately. The intermediate features are then fused by the last layers to generate the learned signal at the corresponding coordinate point. This significantly reduces the amount of computation required and leads to large speedups in training and inference, while achieving similar accuracy as the baseline MLP. This approach thus aims at first learning functions that are a decomposition of the original signal and then fusing them to generate the learned signal. Our proposed architecture can be generally used for many implicit neural representation tasks with no additional memory overheads. We demonstrate a speedup of up to 2.92x compared to the baseline model for image, video, and 3D shape representation and rendering tasks.
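The split architecture can be sketched for a 2D grid as follows. This is an illustrative toy, not the CoordX implementation: the layer sizes, the random weights standing in for trained parameters, the sine activation, and fusion by element-wise product are all assumptions.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random dense layer as (weights, biases); a stand-in for trained parameters."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def dense(layer, x, activation=math.sin):
    """Apply one dense layer followed by an element-wise activation."""
    weights, biases = layer
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


rng = random.Random(0)
HIDDEN = 8
branch_x = make_layer(1, HIDDEN, rng)  # sees only the x coordinate
branch_y = make_layer(1, HIDDEN, rng)  # sees only the y coordinate
head = make_layer(HIDDEN, 1, rng)      # shared layer applied after fusion


def split_mlp_grid(xs, ys):
    """Evaluate the split MLP on the full grid xs x ys."""
    # Each branch runs once per coordinate value: len(xs) + len(ys) evaluations,
    # instead of len(xs) * len(ys) for a network that takes (x, y) jointly.
    feats_x = [dense(branch_x, [x]) for x in xs]
    feats_y = [dense(branch_y, [y]) for y in ys]
    grid = []
    for fy in feats_y:
        row = []
        for fx in feats_x:
            fused = [a * b for a, b in zip(fx, fy)]  # assumed fusion rule
            row.append(dense(head, fused, activation=lambda v: v)[0])
        grid.append(row)
    return grid


grid = split_mlp_grid([0.0, 0.5, 1.0], [0.0, 1.0])
```

For an image with H rows and W columns, the early layers in this sketch process H + W inputs rather than H * W, which is where the savings described in the abstract would come from.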

翻訳日:2022-02-04 11:15:28 公開日:2022-01-28

# (参考訳) FedGCN:グラフ畳み込みネットワークのフェデレーショントレーニングにおける収束とコミュニケーションのトレードオフ

FedGCN: Convergence and Communication Tradeoffs in Federated Training of Graph Convolutional Networks ( http://arxiv.org/abs/2201.12433v1 )

ライセンス: CC BY 4.0

Yuhang Yao, Carlee Joe-Wong

(参考訳) グラフデータセットのトレーニングモデルのための分散メソッドは、グラフデータセットのサイズと、ソーシャルネットワークのようなグラフィカルデータのプライベートな性質により、最近人気が高まっている。しかし、このデータのグラフィカルな構造は、異なる学習クライアント間で疎結合に分割できないことを意味しており、クライアント間の重要な通信オーバーヘッドまたはトレーニング方法で利用可能な情報の損失につながる。フェデレーショングラフ畳み込みネットワーク(Federated Graph Convolutional Network, FedGCN)を導入し, フェデレーション学習を用いてGCNモデルを最適収束率と通信コストで訓練する。各イテレーションでクライアント間の通信を必要とする以前の方法と比較して、federcnはクライアントデータのプライバシを保持し、最初のステップで通信のみを必要とするため、通信コストを大幅に削減し、コンバージェンスレートを高速化する。我々は、FedGCNの収束率と異なるデータ分布下での通信コストのトレードオフを理論的に分析し、一般的なフレームワークを導入して、すべてのエッジ補完に基づくGCNトレーニングアルゴリズムを解析することができる。実験により,本アルゴリズムの有効性を実証し,理論解析を検証した。

Distributed methods for training models on graph datasets have recently grown in popularity, due to the size of graph datasets as well as the private nature of graphical data like social networks. However, the graphical structure of this data means that it cannot be disjointly partitioned between different learning clients, leading to either significant communication overhead between clients or a loss of information available to the training method. We introduce Federated Graph Convolutional Network (FedGCN), which uses federated learning to train GCN models with optimized convergence rate and communication cost. Compared to prior methods that require communication among clients at each iteration, FedGCN preserves the privacy of client data and only needs communication at the initial step, which greatly reduces communication cost and speeds up the convergence rate. We theoretically analyze the tradeoff between FedGCN's convergence rate and communication cost under different data distributions, introducing a general framework that can be used to analyze all edge-completion-based GCN training algorithms. Experimental results demonstrate the effectiveness of our algorithm and validate our theoretical analysis.
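The idea of communicating only once, before training starts, can be sketched as a single round in which clients send partial sums of neighbor features and a server adds them up. The protocol details here are assumptions for illustration (scalar features, a trusted aggregating server, clients that know the edges touching their nodes) and should not be read as the exact FedGCN procedure.

```python
from collections import defaultdict


def client_partial_sums(local_features, edges):
    """For each node adjacent to a locally stored node, accumulate local features."""
    partial = defaultdict(float)
    for u, v in edges:
        if u in local_features:
            partial[v] += local_features[u]
        if v in local_features:
            partial[u] += local_features[v]
    return dict(partial)


def server_aggregate(all_partials):
    """Sum the clients' partial contributions into full neighbor sums per node."""
    total = defaultdict(float)
    for partial in all_partials:
        for node, value in partial.items():
            total[node] += value
    return dict(total)


# Hypothetical undirected path graph 0-1-2-3 split across two clients.
edges = [(0, 1), (1, 2), (2, 3)]
client_a = {0: 1.0, 1: 2.0}
client_b = {2: 3.0, 3: 4.0}

neighbor_sum = server_aggregate(
    [client_partial_sums(client_a, edges), client_partial_sums(client_b, edges)]
)
```

After this one exchange, each client holds the aggregated neighbor information for its own nodes, including contributions across the cut edge (1, 2), so later training rounds in this sketch would not need further feature communication.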

翻訳日:2022-02-04 10:49:59 公開日:2022-01-28

# (参考訳) 事前学習型言語モデルを用いた常識知識推論と生成:調査

Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey ( http://arxiv.org/abs/2201.12438v1 )

ライセンス: CC BY 4.0

Prajjwal Bhargava, Vincent Ng

(参考訳) 常識知識の獲得と推論は伝統的に知識表現と推論コミュニティの中核的な研究テーマであったが、近年は自然言語処理コミュニティにおいて、事前訓練されたモデルを開発し、新しく設計された様々な常識知識の推論と生成タスクに対処する能力をテストすることへの関心が高まっている。本稿では,これらの課題に関する調査を行い,これらの課題により明らかになった常識推論と生成のための最先端の事前学習モデルの強みと弱みについて考察し,今後の研究の方向性を考察する。

While commonsense knowledge acquisition and reasoning has traditionally been a core research topic in the knowledge representation and reasoning community, recent years have seen a surge of interest in the natural language processing community in developing pre-trained models and testing their ability to address a variety of newly designed commonsense knowledge reasoning and generation tasks. This paper presents a survey of these tasks, discusses the strengths and weaknesses of state-of-the-art pre-trained models for commonsense reasoning and generation as revealed by these tasks, and reflects on future research directions.

翻訳日:2022-02-04 10:30:29 公開日:2022-01-28

# (参考訳) 分布シフトによるモデル精度の検証

Certifying Model Accuracy under Distribution Shifts ( http://arxiv.org/abs/2201.12440v1 )

ライセンス: CC BY 4.0

Aounon Kumar, Alexander Levine, Tom Goldstein and Soheil Feizi

(参考訳) 機械学習における認証された堅牢性は主に、データ分散の各点に対する固定攻撃予算による入力の逆摂動に焦点を当てている。本研究では,データ分布の有界wassersteinシフト下でのモデルの精度について,証明可能なロバスト性を保証する。変換空間内のモデルの入力をランダム化する単純な手続きは、変換の下での分布シフトに対して確実に頑健であることを示す。提案手法により, datum 特有の摂動径は入力分布の異なる点にまたがって変化し, 固定サイズの摂動も含むことができる。我々の証明は、ワッサーシュタイン球内における入力分布の(自然あるいは逆)シフトに対するモデルの性能に関する保証された低い境界を生成する。この技術を応用します一色シフト、色シフト、明るさ及び彩度の変化等の画像の自然(非逆変換)に対する堅牢性を証明すること。 (ii)入力分布の逆流に対するロバスト性を証明すること、及び (3) モデルトレーニングに干渉する有害ないわゆる「未学習」データセットで訓練されたモデルの性能について、証明可能な下限(硬度結果)を示す。

Certified robustness in machine learning has primarily focused on adversarial perturbations of the input with a fixed attack budget for each point in the data distribution. In this work, we present provable robustness guarantees on the accuracy of a model under bounded Wasserstein shifts of the data distribution. We show that a simple procedure that randomizes the input of the model within a transformation space is provably robust to distributional shifts under the transformation. Our framework allows the datum-specific perturbation size to vary across different points in the input distribution and is general enough to include fixed-sized perturbations as well. Our certificates produce guaranteed lower bounds on the performance of the model for any (natural or adversarial) shift of the input distribution within a Wasserstein ball around the original distribution. We apply our technique to: (i) certify robustness against natural (non-adversarial) transformations of images such as color shifts, hue shifts and changes in brightness and saturation, (ii) certify robustness against adversarial shifts of the input distribution, and (iii) show provable lower bounds (hardness results) on the performance of models trained on so-called "unlearnable" datasets that have been poisoned to interfere with model training.

翻訳日:2022-02-04 10:12:54 公開日:2022-01-28

# 効率的なポリシー空間対応 oracle

Efficient Policy Space Response Oracles ( http://arxiv.org/abs/2202.00633v1 )

ライセンス: Link先を確認

Ming Zhou, Jingxiao Chen, Ying Wen, Weinan Zhang, Yaodong Yang, Yong Yu

(参考訳) ポリシー空間応答 oracle method (psro)は、2人プレイのゼロサムゲームにおけるnash均衡に対する一般的な解決策を提供するが、(1)シミュレーションによって現在の人口を一貫して評価することによる計算効率の非効率、(2)各イテレーションにおける固定されたメタストラテジーに対する最善の反応を学ぶことによる探索効率の非効率の2つの問題に苦しむ。本稿では,上記の2つのステップの効率を大幅に向上させるEPSRO(Efficient PSRO)を提案する。我々の開発の中心は、制限なし(URR)ゲームにおけるミニマックス最適化の導入されたサブルーチンである。各ステップでURRを解くことで、現在のゲームを評価し、ゲームシミュレーションを必要とせずに、1回のフォワードパスでベストレスポンスを計算することができる。理論的には、ESPROの解法が、攻撃性に対する単調な改善をもたらすことを証明している。さらに、ESPROの望ましい性質は、並列化可能であり、行動多様性を誘導する政策空間の効率的な探索を可能にすることである。我々は,EPSROを3種類のゲームでテストし,壁面時間における50倍の高速化,10倍のデータ効率,および既存のKuhnおよびLeduc PokerゲームにおけるPSRO手法と同様のエクスプロイザビリティを報告した。

Policy Space Response Oracle method (PSRO) provides a general solution to Nash equilibrium in two-player zero-sum games but suffers from two problems: (1) the computational inefficiency due to consistently evaluating current populations by simulations; and (2) the exploration inefficiency due to learning best responses against a fixed meta-strategy at each iteration. In this work, we propose Efficient PSRO (EPSRO) that largely improves the efficiency of the above two steps. Central to our development is the newly-introduced subroutine of minimax optimization on unrestricted-restricted (URR) games. By solving URR at each step, one can evaluate the current game and compute the best response in one forward pass with no need for game simulations. Theoretically, we prove that the solution procedures of EPSRO offer a monotonic improvement on exploitability. Moreover, a desirable property of EPSRO is that it is parallelizable, which allows for efficient exploration in the policy space that induces behavioral diversity. We test EPSRO on three classes of games and report a 50x speedup in wall-time, 10x data efficiency, and similar exploitability as existing PSRO methods on Kuhn and Leduc Poker games.

翻訳日:2022-02-02 15:38:17 公開日:2022-01-28

# 3次元CADモデルから標準と特徴を認識するニューラルネットワークの開発

Development of a neural network to recognize standards and features from 3D CAD models ( http://arxiv.org/abs/2202.00573v1 )

ライセンス: Link先を確認

Alexander Neb and Iyed Briki and Raoul Schoenhof

(参考訳) この研究の焦点は、3dcadモデルから直接標準や機能を認識することである。このため、ニューラルネットワークは9種類の機械要素を認識するように訓練された。 DIN EN ISO 8676以降の六角形ネジのように、ある部分を標準として特定した後、アプリケーションプログラミングインタフェース(API)を介してCADシステムの幾何学的情報にアクセスする。 APIでは,その部分を適切に記述するために必要な情報を検索する。この情報に基づく標準化部品を詳細に認識し、さらに情報を補うことができる。

The focus of this work is to recognize standards and further features directly from 3D CAD models. For this reason, a neural network was trained to recognize nine classes of machine elements. After the system has identified a part as a standard part, such as a hexagon head screw according to DIN EN ISO 8676, it accesses the geometrical information of the CAD system via the Application Programming Interface (API). In the API, the system searches for the information necessary to describe the part appropriately. Based on this information, standardized parts can be recognized in detail and supplemented with further information.

翻訳日:2022-02-02 13:34:21 公開日:2022-01-28

# Dynamic-VAEによる電気自動車バッテリーの故障検出

Detecting Electric Vehicle Battery Failure via Dynamic-VAE ( http://arxiv.org/abs/2201.12358v1 )

ライセンス: Link先を確認

Haowei He, Jingzhao Zhang, Yanan Wang, Shaobo Huang, Chen Wang, Yang Zhang, Dongxu Guo, Guannan He, Minggao Ouyang

(参考訳) 本稿では,ディープラーニングモデルによってバックアップされたバッテリ故障検出パイプラインについて述べる。まず、数百台の車両のバッテリー充電データを含む、大規模な電気自動車(EV)バッテリーデータセットを紹介します。次に,バッテリ故障検出を異常検出問題として定式化し,動的システムと変分オートエンコーダに基づく動的VAEという新しいアルゴリズムを提案する。提案アルゴリズムの性能を,提案したデータセットのベースラインに対して検証し,動的VAEの有効性を実証した。

In this note, we describe a battery failure detection pipeline backed up by deep learning models. We first introduce a large-scale electric vehicle (EV) battery dataset including cleaned battery-charging data from hundreds of vehicles. We then formulate battery failure detection as an outlier detection problem, and propose a new algorithm named Dynamic-VAE based on dynamical systems and variational autoencoders. We validate the performance of our proposed algorithm against several baselines on our released dataset and demonstrate the effectiveness of Dynamic-VAE.

翻訳日:2022-02-01 20:02:37 公開日:2022-01-28

# 攻撃グラフを用いた強化学習による濾過経路の検出

Discovering Exfiltration Paths Using Reinforcement Learning with Attack Graphs ( http://arxiv.org/abs/2201.12416v1 )

ライセンス: Link先を確認

Tyler Cody, Abdul Rahman, Christopher Redino, Lanxiao Huang, Ryan Clark, Akshay Kakkar, Deepak Kushwaha, Paul Park, Peter Beling, Edward Bowen

(参考訳) 強化学習 (Reinforcement Learning, RL) は, 攻撃グラフやサイバー地形とともに, 企業ネットワークにおけるデータ流出の最適な経路を決定するための報酬と状態を開発するために用いられる。この研究は以前のクラウンジュエリー(CJ)識別に基づいており、敵が近くのCJやホストを妥協する最適な経路を計算することの目標に焦点を当てている。この作業は、データが盗まれ、ネットワークから静かに流出しなければならないという仮定に基づいて、以前のCJアプローチを逆転させる。 RLは、敵が検出を減らしたいと願う経路の識別に基づいて報酬関数の開発を支援するために利用される。その結果,大規模ネットワーク環境における有望な性能が示された。

Reinforcement learning (RL), in conjunction with attack graphs and cyber terrain, is used to develop reward and state associated with determination of optimal paths for exfiltration of data in enterprise networks. This work builds on previous crown jewels (CJ) identification that focused on the target goal of computing optimal paths that adversaries may traverse toward compromising CJs or hosts within their proximity. This work inverts the previous CJ approach based on the assumption that data has been stolen and now must be quietly exfiltrated from the network. RL is utilized to support the development of a reward function based on the identification of those paths where adversaries desire reduced detection. Results demonstrate promising performance for a sizable network environment.

翻訳日:2022-02-01 20:02:27 公開日:2022-01-28

# 学習最適化のための簡易ガード

A Simple Guard for Learned Optimizers ( http://arxiv.org/abs/2201.12426v1 )

ライセンス: Link先を確認

Isabeau Pr\'emont-Schwarz, Jaroslav V\'itk\r{u}, Jan Feyereisl

(参考訳) 学習したコンポーネントの傾向が最終的に手作りバージョンを上回り続けるなら、学習したオプティマイザは最終的にはSGDやAdamのような手作りのオプティマイザを上回る。しかし、たとえ学習したオプティマイザ(L2Os)が最終的に手作りのものよりも優れているとしても、それらは証明可能な収束性はなく、分布に失敗する可能性がある。これらの質問はここで取り上げている。現在、学習オプティマイザは、学習の開始時に、一般的な手作りのオプティマイザ(勾配降下など)をしばしば上回っていますが、一般的には、ジェネリックアルゴリズムが進歩を続けながら、学習したアルゴリズムをaesopのtortoiseとして上回っており、hareを上回っており、そうでない。 L2Osはまた、分布から一般化するのが難しい。 (heaton et al., 2020)は、学習したオプティマイザをジェネリックな学習アルゴリズムで保護し、2つのアルゴリズムを条件付きで切り替えることで、結果として得られるアルゴリズムが確実に収束するように保護されたl2o(gl2o)を提案した。 L2O(Los-Guarded L2O)と呼ばれる新しいセーフガード型L2O(Los-Guarded L2O)を提案する。ガード機構は、両オプティマイザの期待損失値のみに基づいて決定する。さらに,lgl2o の収束保証の理論的証明と gl2o や他のベースラインと比較し,l2o と sgd を最もよく組み合わせ,実際 gl2o よりも収束することを示す実験結果を示す。

If the trend of learned components eventually outperforming their hand-crafted versions continues, learned optimizers will eventually outperform hand-crafted optimizers like SGD or Adam. Even if learned optimizers (L2Os) eventually outpace hand-crafted ones in practice, however, they are still not provably convergent and might fail out of distribution. These are the issues addressed here. Currently, learned optimizers frequently outperform generic hand-crafted optimizers (such as gradient descent) at the beginning of learning, but they generally plateau after some time, while the generic algorithms continue to make progress and often overtake the learned algorithm, much as Aesop's tortoise overtakes the hare. L2Os also still have a difficult time generalizing out of distribution. Heaton et al. (2020) proposed Safeguarded L2O (GL2O), which can take a learned optimizer and safeguard it with a generic learning algorithm so that, by conditionally switching between the two, the resulting algorithm is provably convergent. We propose a new class of Safeguarded L2O, called Loss-Guarded L2O (LGL2O), which is both conceptually simpler and computationally less expensive. The guarding mechanism decides solely based on the expected future loss value of both optimizers. Furthermore, we show a theoretical proof of LGL2O's convergence guarantee and empirical results comparing it to GL2O and other baselines, showing that it combines the best of both L2O and SGD and in practice converges much better than GL2O.
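A loss-based guard of the kind described above can be sketched on a one-dimensional quadratic. This is a simplification under stated assumptions: the "learned" optimizer is a hand-written stand-in that is fast at first and erratic later, and the guard compares the exact loss after one candidate step rather than an expected future loss. None of the names or constants come from the paper.

```python
def loss(w):
    """Toy objective with its minimum at w = 3."""
    return (w - 3.0) ** 2


def grad(w):
    return 2.0 * (w - 3.0)


def sgd_step(w, lr=0.1):
    """Plain gradient descent step, used as the generic fallback optimizer."""
    return w - lr * grad(w)


def learned_step(w, step_index):
    """Hypothetical learned optimizer: aggressive early, then it drifts."""
    if step_index < 3:
        return w - 0.45 * grad(w)
    return w + 0.5  # ignores the gradient, mimicking a failure mode


def loss_guarded_step(w, step_index):
    """Take whichever candidate update gives the lower loss."""
    candidate_learned = learned_step(w, step_index)
    candidate_sgd = sgd_step(w)
    if loss(candidate_learned) <= loss(candidate_sgd):
        return candidate_learned, "learned"
    return candidate_sgd, "sgd"


w = 0.0
choices = []
for t in range(10):
    w, chosen = loss_guarded_step(w, t)
    choices.append(chosen)
```

In this run the guard follows the learned optimizer while it makes fast progress and falls back to gradient descent once the learned updates start to increase the loss.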

翻訳日:2022-02-01 20:02:15 公開日:2022-01-28

# 情報選択システムのためのトップKランキング深層帯域

Top-K Ranking Deep Contextual Bandits for Information Selection Systems ( http://arxiv.org/abs/2201.13287v1 )

ライセンス: Link先を確認

Jade Freeman and Michael Rawson

(参考訳) 今日の技術環境では、情報は豊富で、動的で、自然に異質である。情報の自動フィルタリングと優先順位付けは、その情報が目標に向かって実質的な価値を付加するかどうかの区別に基づいている。コンテキスト型マルチアームバンディットは、ユーザの関心や関連性に応じてコンテンツをフィルタリングし優先順位付けするために広く使用されている。 Learn-to-Rankテクニックはアイテムの関連ランキングを最適化し、コンテンツの選択を可能にする。本稿では,文脈的マルチアームバンディットフレームワークに基づくトップKランキングに対する新しいアプローチを提案する。確率的報酬関数をニューラルネットワークでモデル化し,非線形近似により報酬と文脈の関係を学習する。本手法を実証し,シミュレーションシナリオにおける実世界のデータセットを用いて実験結果から学習性能を評価する。実験の結果、この手法は報酬構造と高次元の文脈特徴の複雑さの下でうまく機能することが示された。

In today's technology environment, information is abundant, dynamic, and heterogeneous in nature. Automated filtering and prioritization of information is based on the distinction between whether the information adds substantial value toward one's goal or not. Contextual multi-armed bandits have been widely used for learning to filter contents and prioritize according to user interest or relevance. Learning-to-rank techniques optimize the relevance ranking of items, allowing the contents to be selected accordingly. We propose a novel approach to top-K rankings under the contextual multi-armed bandit framework. We model the stochastic reward function with a neural network to allow non-linear approximation to learn the relationship between rewards and contexts. We demonstrate the approach and evaluate the performance of learning from the experiments using real-world data sets in simulated scenarios. Empirical results show that this approach performs well under the complexity of a reward structure and high-dimensional contextual features.

翻訳日:2022-02-01 18:21:10 公開日:2022-01-28

# 注意重み付きイベントベース埋め込みを用いた自動音声キャプション

Automatic Audio Captioning using Attention weighted Event based Embeddings ( http://arxiv.org/abs/2201.12352v1 )

ライセンス: Link先を確認

Swapnil Bhosale, Rupayan Chakraborty, Sunil Kumar Kopparapu

(参考訳) 自動音声キャプション(automatic audio captioning, aac)は、音声を自然言語に翻訳し、音声イベント、イベントのソース、それらの関係を記述するタスクである。現在、aacデータセットの限られたサンプルは、転送学習とオーディオイベント検出(aed)を親タスクとして組み込むトレンドを設定している。本稿では,aacのための軽量(学習パラメータが小さい)bi-lstmリカレント層を有するエンコーダ・デコーダアーキテクチャを提案する。その結果,時間的注意と拡張技術を組み合わせた効率的なaed埋め込み抽出器は,計算集約的なアーキテクチャで既存の文献を上回ることができることがわかった。さらに,このモデルの一部として生成した不均一な注意重み付き符号化が,各トークンの生成中にオーディオの特定の部分をデコーダが見渡すことができることを示す。

Automatic Audio Captioning (AAC) refers to the task of translating audio into natural language that describes the audio events, the sources of the events and their relationships. The limited number of samples in current AAC datasets has set a trend of incorporating transfer learning with Audio Event Detection (AED) as a parent task. In this direction, we propose an encoder-decoder architecture with lightweight (i.e. with fewer learnable parameters) Bi-LSTM recurrent layers for AAC and compare the performance of two state-of-the-art pre-trained AED models as embedding extractors. Our results show that an efficient AED-based embedding extractor combined with temporal attention and augmentation techniques is able to surpass existing literature with computationally intensive architectures. Further, we provide evidence that the non-uniform attention-weighted encoding generated as a part of our model helps the decoder glance over specific sections of the audio while generating each token.
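Temporal attention over frame-level embeddings can be sketched as a softmax-weighted average. The dot-product scoring, the function names, and the toy two-dimensional embeddings below are assumptions for illustration; the abstract does not specify the attention form used in the model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_embeddings, query):
    """Weight each audio frame by its dot-product similarity to the query.

    Returns the attention weights and the weighted-average context vector.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frame_embeddings]
    weights = softmax(scores)
    dim = len(frame_embeddings[0])
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frame_embeddings))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical embeddings for three audio frames and a decoder-state query.
frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]
weights, context = attention_pool(frames, query)
```

The weights are non-uniform: the frame most aligned with the query dominates the context vector, which is the mechanism that would let a decoder focus on specific sections of the audio for each generated token.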

翻訳日:2022-02-01 18:19:40 公開日:2022-01-28

# ロボット操作のためのタスク焦点Few-Shotオブジェクト検出

Task-Focused Few-Shot Object Detection for Robot Manipulation ( http://arxiv.org/abs/2201.12437v1 )

ライセンス: Link先を確認

Brent Griffin

(参考訳) 本稿では,新しい物体の移動ロボット操作における検出による問題に対処する。我々のアプローチでは、現実のタスクから学習する補完関数として視覚と制御を用いる。検出のみに基づく操作手法を開発し,タスク中心の少数ショット検出を導入し,新しいオブジェクトや設定を学習する。少数ショットオブジェクト検出の現在のパラダイムは、既存のアノテーション付き例を使用する。対照的に、このパラダイムは、特定の下流タスク(例えば、深さ推定と把握)の性能を向上させるアクティブデータ収集とアノテーション選択を用いて拡張する。数ショット学習へのインタラクティブなアプローチの実験では、ロボットを訓練して、検出からオブジェクトを直接操作する(ClickBot)。 clickbotはアノテーションの1クリックでビジュアルサーボ制御を学び、クラッターや他の設定で新しいオブジェクトを把握し、既存のビジュアルサーボ制御と深さ推定ベンチマークで最先端の結果を得る。最後に、将来の研究をサポートするタスク中心の少数ショットオブジェクト検出ベンチマークを確立します。

This paper addresses the problem of mobile robot manipulation of novel objects via detection. Our approach uses vision and control as complementary functions that learn from real-world tasks. We develop a manipulation method based solely on detection then introduce task-focused few-shot object detection to learn new objects and settings. The current paradigm for few-shot object detection uses existing annotated examples. In contrast, we extend this paradigm by using active data collection and annotation selection that improves performance for specific downstream tasks (e.g., depth estimation and grasping). In experiments for our interactive approach to few-shot learning, we train a robot to manipulate objects directly from detection (ClickBot). ClickBot learns visual servo control from a single click of annotation, grasps novel objects in clutter and other settings, and achieves state-of-the-art results on an existing visual servo control and depth estimation benchmark. Finally, we establish a task-focused few-shot object detection benchmark to support future research: https://github.com/griffbr/TFOD.

翻訳日:2022-02-01 18:03:44 公開日:2022-01-28

# 物理インフォームドニューラルネットワークによる複数の電気解剖学的マップからの心臓線維配向の学習

Physics-informed neural networks to learn cardiac fiber orientation from multiple electroanatomical maps ( http://arxiv.org/abs/2201.12362v1 )

ライセンス: Link先を確認

Carlos Ruiz Herrera, Thomas Grandits, Gernot Plank, Paris Perdikaris, Francisco Sahli Costabal and Simone Pezzuto

(参考訳) 本研究では, 複数のカテーテル記録からヒト心房の心線維構造をin-vivoで推定するfibernetを提案する。心臓線維は心臓の電気機械機能において中心的な役割を担っているが、生体内決定が困難であり、それゆえ、既存の心臓モデルにおいて真に患者特異的であることは稀である。逆問題は、スパース活性化マップの集合から心臓伝播モデルの伝導速度テンソルを特定することである。局所繊維角を含む伝導速度テンソルの全ての成分を同時に同定し, 合成2次元および3次元例, 拡散テンソル繊維, 患者特有の場合についてfibernetを広範囲にテストした。 3つの地図は繊維を正確に捉えるのに十分であり、ノイズの予測にも十分であることを示す。地図が少なければ、正規化の役割は顕著になる。さらに, 適応モデルにより, 目に見えないアクティベーションマップを頑健に再現できることを示す。 FiberNetはパーソナライズされた医療のための患者固有のモデルを作成するのに役立つことを期待しています。

We propose FiberNet, a method to estimate in-vivo the cardiac fiber architecture of the human atria from multiple catheter recordings of the electrical activation. Cardiac fibers play a central role in the electro-mechanical function of the heart, yet they are difficult to determine in-vivo, and hence rarely truly patient-specific in existing cardiac models. FiberNet learns the fibers arrangement by solving an inverse problem with physics-informed neural networks. The inverse problem amounts to identifying the conduction velocity tensor of a cardiac propagation model from a set of sparse activation maps. The use of multiple maps enables the simultaneous identification of all the components of the conduction velocity tensor, including the local fiber angle. We extensively test FiberNet on synthetic 2-D and 3-D examples, diffusion tensor fibers, and a patient-specific case. We show that 3 maps are sufficient to accurately capture the fibers, also in the presence of noise. With fewer maps, the role of regularization becomes prominent. Moreover, we show that the fitted model can robustly reproduce unseen activation maps. We envision that FiberNet will help the creation of patient-specific models for personalized medicine. The full code is available at http://github.com/fsahli/FiberNet.

翻訳日:2022-02-01 17:59:06 公開日:2022-01-28

# Electra: 条件付き生成モデルに基づく述語対応クエリ近似

Electra: Conditional Generative Model based Predicate-Aware Query Approximation ( http://arxiv.org/abs/2201.12420v1 )

ライセンス: Link先を確認

Nikhil Sheoran, Subrata Mitra, Vibhor Porwal, Siddharth Ghetia, Jatin Varshney, Tung Mai, Anup Rao, Vikas Maddukuri

(参考訳) Approximate Query Processing(AQP)の目標は、クエリをコスト的に集約する上で、非常に高速だが“十分正確な”結果を提供することで、大規模なデータセットのインタラクティブな探索におけるユーザエクスペリエンスを向上させることだ。最近提案された機械学習ベースのaqp技術は、クエリの実行が従来のデータベースクラスタでのクエリ処理と比較してモデル推論のみを伴うため、非常に低いレイテンシを提供することができる。しかし、フィルタ述語(WHERE節)の数が増加すると、近似誤差はこれらの手法で著しく増加する。アナリストは洞察の発見に多くの述語を使ったクエリを使うことが多い。したがって、アナリストが誤った結論を出すのを防ぐためには、低い近似誤差を維持することが重要である。本稿では,より少ない近似誤差で多数の述語を用いた分析式クエリに応答できる述語認識型AQPシステムであるELECTRAを提案する。 electraは条件付き生成モデルを使用して、データの条件付き分布を学習し、実行時に小さな(約1000行)だが代表的なサンプルを生成し、クエリを実行して近似結果を計算する。実世界の3つのデータセットに対する4つの異なるベースラインによる評価の結果,ELECTRAはベースラインと比較して多数の述語に対して低いAQP誤差を提供することがわかった。

The goal of Approximate Query Processing (AQP) is to provide very fast but "accurate enough" results for costly aggregate queries, thereby improving user experience in interactive exploration of large datasets. Recently proposed machine-learning-based AQP techniques can provide very low latency, as query execution only involves model inference rather than traditional query processing on database clusters. However, as the number of filtering predicates (WHERE clauses) increases, the approximation error of these methods increases significantly. Analysts often use queries with a large number of predicates for insight discovery. Thus, maintaining low approximation error is important to prevent analysts from drawing misleading conclusions. In this paper, we propose ELECTRA, a predicate-aware AQP system that can answer analytics-style queries with a large number of predicates with much smaller approximation errors. ELECTRA uses a conditional generative model that learns the conditional distribution of the data and at runtime generates a small (~1000 rows) but representative sample, on which the query is executed to compute the approximate result. Our evaluations with four different baselines on three real-world datasets show that ELECTRA provides lower AQP error for a large number of predicates compared to baselines.

翻訳日:2022-02-01 17:56:46 公開日:2022-01-28

# コミュニケーション構造を考慮した協調ゲームによるグラフレベルの予測

Explaining Graph-level Predictions with Communication Structure-Aware Cooperative Games ( http://arxiv.org/abs/2201.12380v1 )

ライセンス: Link先を確認

Shichang Zhang, Neil Shah, Yozen Liu, Yizhou Sun

(参考訳) 機械学習モデルによる予測を説明することが重要であり、興味を惹きつけている。協調ゲーム理論のShapley値は、特に画像、テキスト、表データ、および最近のグラフ上のグラフニューラルネットワーク(GNN)において、予測に対する特徴的重要性を計算するための主要なアプローチとして提案されている。本研究では,グラフ記述におけるShapley値の妥当性を再検討し,グラフレベルの予測において最も重要な部分グラフと構成ノードを特定する。我々は、Shapley値がグラフデータに対する非理想的な選択であると仮定する。本稿では、重要なグラフ構造情報を利用して説明を改善するグラフ構造対応eXplanation(GStarX)法を提案する。具体的には,HN値と呼ばれる協調ゲーム理論から,新たな構造認識値に基づくスコアリング関数を提案する。ノードの重要性を評価する場合、HN値はグラフ構造を利用して、GNNのメッセージパッシングに似た近隣ノード間の協調余剰を属性とし、ノードの重要度スコアはノードの特徴の重要性だけでなく構造的役割も反映する。我々はGstarXが定性的に直感的に説明し、化学グラフ特性予測とテキストグラフの感情分類に基づく強いベースラインを定量的に改善することを示した。

Explaining predictions made by machine learning models is important and has attracted increasing interest. The Shapley value from cooperative game theory has been proposed as a prime approach to compute feature importances towards predictions, especially for images, text, tabular data, and recently graph neural networks (GNNs) on graphs. In this work, we revisit the appropriateness of the Shapley value for graph explanation, where the task is to identify the most important subgraph and constituent nodes for graph-level predictions. We argue that the Shapley value is a non-ideal choice for graph data because it is by definition not structure-aware. We propose a Graph Structure-aware eXplanation (GStarX) method to leverage the critical graph structure information to improve the explanation. Specifically, we propose a scoring function based on a new structure-aware value from cooperative game theory called the HN value. When used to score node importance, the HN value utilizes graph structures to attribute cooperation surplus between neighbor nodes, resembling message passing in GNNs, so that node importance scores reflect not only the node feature importance, but also the structural roles. We demonstrate that GStarX produces qualitatively more intuitive explanations, and quantitatively improves over strong baselines on chemical graph property prediction and text graph sentiment classification.
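
For reference, the plain Shapley value that this abstract revisits can be computed exactly on a tiny game. The sketch below is an illustration under stated assumptions only: the three-node path graph `EDGES` and the value function `toy_value` are hypothetical, and the code does not implement the HN value or GStarX.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


# Hypothetical toy game on the path graph a - b - c:
# a coalition earns 1 for every edge whose two endpoints are both present.
EDGES = [("a", "b"), ("b", "c")]


def toy_value(coalition):
    return float(sum(1 for u, v in EDGES if u in coalition and v in coalition))


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

The average runs over every ordering of the players, including orderings that pass through the disconnected coalition {a, c}. That is one way to read the abstract's remark that the definition itself does not look at graph structure; any structure has to come from the value function.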

翻訳日:2022-02-01 16:56:41 公開日:2022-01-28

# 状態マージによるRNNからの有限オートマタ抽出

Extracting Finite Automata from RNNs Using State Merging ( http://arxiv.org/abs/2201.12451v1 )

ライセンス: Link先を確認

William Merrill and Nikolaos Tsilivis

(参考訳) blackbox recurrent neural network(rnn)の振る舞いを解釈する一つの方法は、より解釈可能な離散計算モデルからその振る舞いをキャプチャする有限状態機械のように抽出することである。本研究では,rnnから有限オートマトンを抽出する新しい手法を提案する。提案手法の有効性をTomita言語ベンチマークで検証したところ,ベンチマーク中のすべての言語でトレーニングされたRNNから忠実なオートマトンを抽出できることがわかった。抽出性能は抽出プロセス中に提供されるデータ数と、ターゲット言語を完全に学習した後、RNNモデルが追加のエポックのために訓練されているかどうかによって支援される。我々はこの現象を解析するためにこの手法を用い,RNNの内部状態空間の圧縮につながるため,収束を超えたトレーニングが有用であることを確認した。そこで本研究では,RNNモデルの解釈可能性と解析に本手法をどのように利用できるかを示す。

One way to interpret the behavior of a blackbox recurrent neural network (RNN) is to extract from it a more interpretable discrete computational model, like a finite state machine, that captures its behavior. In this work, we propose a new method for extracting finite automata from RNNs inspired by the state merging paradigm from grammatical inference. We demonstrate the effectiveness of our method on the Tomita languages benchmark, where we find that it is able to extract faithful automata from RNNs trained on all languages in the benchmark. We find that extraction performance is aided by the amount of data provided during the extraction process, as well as, curiously, whether the RNN model is trained for additional epochs after perfectly learning its target language. We use our method to analyze this phenomenon, finding that training beyond convergence is useful because it leads to compression of the internal state space of the RNN. This finding demonstrates how our method can be used for interpretability and analysis of trained RNN models.

翻訳日:2022-02-01 16:56:21 公開日:2022-01-28

# タンパク質構造表現学習のための重み付きニューラルネットワーク

Directed Weight Neural Networks for Protein Structure Representation Learning ( http://arxiv.org/abs/2201.13299v1 )

ライセンス: Link先を確認

Jiahan Li, Shitong Luo, Congyue Deng, Chaoran Cheng, Jiaqi Guan, Leonidas Guibas, Jian Peng, Jianzhu Ma

(参考訳) タンパク質は、特定の3D構造に折り畳んで生物学的機能を発揮する。タンパク質構造を正確にモデル化するには、全体的な幾何学的トポロジーとアミノ酸間の局所的細粒度関係(例えば、側鎖ねじれ角とアミノ酸間配向)の両方を慎重に考慮すべきである。本研究では,異なるアミノ酸間の幾何学的関係をよりよく捉えるために,有向重みニューラルネットワークを提案する。スカラーから3次元指向ベクトルへの単一重み拡張により,新しいフレームワークは,古典的および so(3)-表現的特徴の両方に対する幾何学的操作の豊富な集合をサポートし,その上にアミノ酸情報を処理するためのパーセプトロンユニットを構築する。さらに,有向重みを既存のグラフニューラルネットワークに接続するためのタンパク質に同変メッセージパッシングパラダイムを導入し,グローバルスケールでのSO(3)-等価性の維持に優れた長所を示す。実験により,従来のニューラルネットワークや(グローバルに)等価ネットワークと比較して,幾何学的関係を表現する上で,ネットワークの表現性が著しく向上することが示された。また、タンパク質3D構造に関連する様々な計算生物学応用の最先端性能も達成している。

A protein performs biological functions by folding to a particular 3D structure. To accurately model the protein structures, both the overall geometric topology and local fine-grained relations between amino acids (e.g. side-chain torsion angles and inter-amino-acid orientations) should be carefully considered. In this work, we propose the Directed Weight Neural Network for better capturing geometric relations among different amino acids. Extending a single weight from a scalar to a 3D directed vector, our new framework supports a rich set of geometric operations on both classical and SO(3)-representation features, on top of which we construct a perceptron unit for processing amino-acid information. In addition, we introduce an equivariant message passing paradigm on proteins for plugging the directed weight perceptrons into existing Graph Neural Networks, showing superior versatility in maintaining SO(3)-equivariance at the global scale. Experiments show that our network has remarkably better expressiveness in representing geometric relations in comparison to classical neural networks and the (globally) equivariant networks. It also achieves state-of-the-art performance on various computational biology applications related to protein 3D structures.

翻訳日:2022-02-01 16:18:28 公開日:2022-01-28

# 組合せインタラクションテストを用いた機械学習の系統的トレーニングとテスト

Systematic Training and Testing for Machine Learning Using Combinatorial Interaction Testing ( http://arxiv.org/abs/2201.12428v1 )

ライセンス: Link先を確認

Tyler Cody, Erin Lanus, Daniel D. Doyle, Laura Freeman

(参考訳) 本稿では,機械学習モデルにおけるテストおよびトレーニングセットの選択と特徴付けに,組合せカバレッジの体系的利用を示す。提案した研究は、機械学習で使用されるデータを特徴付けるために、ソフトウェアテストの欠陥を特定するためにうまく活用されている組合せ相互作用テストに適応する。 mnist手書き桁データを使用して、コンビネートカバレッジが、マシンラーニングモデルのパフォーマンスを強調するテストセットの選択、堅牢なモデルパフォーマンスにつながるトレーニングセットの選択、新しいドメインへの微調整モデルのためのデータの選択に使用できることを実証する。したがって、結果は機械学習のトレーニングとテストのための総合的なアプローチとして組み合わせカバレッジを実証する。本稿では、ニューラルネットワークの内部におけるカバレッジの利用に注目した先行研究とは対照的に、入力と出力から得られる単純な特徴のカバレッジについて考察する。そこで本稿では,機械学習モデルに対するテストおよびトレーニングセットのサプライヤーが,モデル自体に対する知的財産権を持っていない場合について論じる。最後に、組み合わせカバレッジに対する事前の批判に対処し、機械学習アプリケーションにおけるカバレッジメトリクスの使用を推奨する反論を提供する。

This paper demonstrates the systematic use of combinatorial coverage for selecting and characterizing test and training sets for machine learning models. The presented work adapts combinatorial interaction testing, which has been successfully leveraged in identifying faults in software testing, to characterize data used in machine learning. The MNIST hand-written digits data is used to demonstrate that combinatorial coverage can be used to select test sets that stress machine learning model performance, to select training sets that lead to robust model performance, and to select data for fine-tuning models to new domains. Thus, the results posit combinatorial coverage as a holistic approach to training and testing for machine learning. In contrast to prior work, which has focused on the use of coverage with regard to the internals of neural networks, this paper considers coverage over simple features derived from inputs and outputs. Thus, this paper addresses the case where the supplier of test and training sets for machine learning models does not have intellectual property rights to the models themselves. Finally, the paper addresses prior criticism of combinatorial coverage and provides a rebuttal which advocates the use of coverage metrics in machine learning applications.
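
To make the notion of combinatorial coverage concrete, here is a minimal sketch that measures which fraction of all t-way feature-value combinations appears in a data set. It assumes discrete features with known finite domains; the function name, the three binary features, and the sample rows are hypothetical, and the paper's exact metric and feature choice may differ.

```python
from itertools import combinations, product


def combinatorial_coverage(rows, domains, t=2):
    """Fraction of all t-way feature-value combinations that occur in rows."""
    covered = 0
    possible = 0
    for idxs in combinations(range(len(domains)), t):
        # Every value tuple that this group of t features could take.
        all_combos = set(product(*(domains[i] for i in idxs)))
        # The value tuples that actually occur in the data.
        seen = {tuple(row[i] for i in idxs) for row in rows}
        covered += len(seen & all_combos)
        possible += len(all_combos)
    return covered / possible


# Hypothetical data: three binary features.
DOMAINS = [(0, 1), (0, 1), (0, 1)]
FULL_PAIRWISE = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

print(combinatorial_coverage(FULL_PAIRWISE, DOMAINS, t=2))      # 1.0
print(combinatorial_coverage(FULL_PAIRWISE, DOMAINS, t=3))      # 0.5
print(combinatorial_coverage(FULL_PAIRWISE[:2], DOMAINS, t=2))  # 0.5
```

The four rows cover every pairwise combination while containing only half of the possible three-way combinations, which illustrates how a small set can be complete at one interaction strength and incomplete at the next.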

翻訳日:2022-02-01 16:17:54 公開日:2022-01-28

# マルチエージェント制御における後悔最小化手法

A Regret Minimization Approach to Multi-Agent Control ( http://arxiv.org/abs/2201.13288v1 )

ライセンス: Link先を確認

Udaya Ghai, Udari Madhushani, Naomi Leonard, Elad Hazan

(参考訳) 本研究では,動的システムのマルチエージェント制御の問題点について考察する。本研究は,中央集権的な事前計算を行なわない最適制御に焦点をあて,安定化制御のみを備えた異なるエージェントに対する適応制御ポリシーを提案する。我々は、任意の(標準的な)後悔の少ない制御方法を分散アルゴリズムに還元する。この削減により、得られた分散アルゴリズムは、最適な事前計算された共同ポリシに対して、後悔の少ないことが保証される。提案手法は,オンライン凸最適化をマルチエージェント設定に一般化し,非定型制御からの最近のツールを適用することを含む。本手法は過度に作動する航空機のモデルを用いて実験的に評価する。分散手法は, 障害に対して頑健であり, ダイナミックスにおける逆摂動に対して頑健であることを示す。

We study the problem of multi-agent control of a dynamical system with known dynamics and adversarial disturbances. Our study focuses on optimal control without centralized precomputed policies, but rather with adaptive control policies for the different agents that are only equipped with a stabilizing controller. We give a reduction from any (standard) regret minimizing control method to a distributed algorithm. The reduction guarantees that the resulting distributed algorithm has low regret relative to the optimal precomputed joint policy. Our methodology involves generalizing online convex optimization to a multi-agent setting and applying recent tools from nonstochastic control derived for a single agent. We empirically evaluate our method on a model of an overactuated aircraft. We show that the distributed method is robust to failure and to adversarial perturbations in the dynamics.

翻訳日:2022-02-01 16:15:22 公開日:2022-01-28

# シーケンス生成によるスキーマフリー依存パーシング

Schema-Free Dependency Parsing via Sequence Generation ( http://arxiv.org/abs/2201.12407v1 )

ライセンス: Link先を確認

Boda Lin, Zijun Yao, Jiaxin Shi, Shulin Cao, Binghao Tang, Si Li, Yong Luo, Juanzi Li, Lei Hou

(参考訳) 依存関係解析は、文の構文依存構造や意味依存構造を抽出することを目的としている。既存の方法は、普遍性を欠いたり、補助デコーダに強く依存する欠点を負う。これらの欠点を解消するために、補助構造や解析アルゴリズムを使わずに事前学習された言語モデル(PLM)のみを利用することで、シーケンス生成(SG)DPSGを介して、普遍的でスキーマなしの依存性解析(DP)を実現することを提案する。まず、解析構造をシーケンスに変換するための異なるシリアライズ設計戦略を検討する。次に、依存ユニットを設計し、これらのユニットをDPSGのシーケンスにまとめる。シーケンス生成の柔軟性が高いため、DPSGは単一のモデルを用いて構文DPと意味DPの両方を実現できる。特定のスキーマを示すプレフィックスをシーケンスと結合することで、DPSGはマルチスキーマ解析を達成できます。 dpsgの有効性は,ptb,codt,sdp15,semeval16など,広く使用されているdpベンチマークを用いて実証した。 DPSGは、CODTとSemEval16におけるすべてのベンチマークのファーストレベルメソッドと、最先端(SOTA)のパフォーマンスで同等の結果を得る。本稿ではDPSGが新たな解析パラダイムとなる可能性を実証する。私たちは受け入れ次第コードを公開します。

Dependency parsing aims to extract the syntactic or semantic dependency structure of sentences. Existing methods suffer from the drawbacks of lacking universality or relying heavily on an auxiliary decoder. To remedy these drawbacks, we propose to achieve universal and schema-free Dependency Parsing (DP) via Sequence Generation (SG), termed DPSG, by utilizing only the pre-trained language model (PLM) without any auxiliary structures or parsing algorithms. We first explore different serialization design strategies for converting parsing structures into sequences. Then we design dependency units and concatenate these units into the sequence for DPSG. Thanks to the high flexibility of sequence generation, our DPSG can achieve both syntactic DP and semantic DP using a single model. By concatenating a prefix that indicates the specific schema with the sequence, our DPSG can even accomplish multi-schemata parsing. The effectiveness of our DPSG is demonstrated by experiments on widely used DP benchmarks, i.e., PTB, CODT, SDP15, and SemEval16. DPSG achieves results comparable to the first-tier methods on all the benchmarks and even state-of-the-art (SOTA) performance on CODT and SemEval16. This paper demonstrates that our DPSG has the potential to be a new parsing paradigm. We will release our code upon acceptance.

翻訳日:2022-02-01 16:10:49 公開日:2022-01-28

# 物理エンコード学習による希少データからの非線形pdesの検出

Discovering Nonlinear PDEs from Scarce Data with Physics-encoded Learning ( http://arxiv.org/abs/2201.12354v1 )

ライセンス: Link先を確認

Chengping Rao, Pu Ren, Yang Liu, Hao Sun

(参考訳) 複雑な物理現象を支配する偏微分方程式(pdes)を発見するために、実験的な測定値を活用することへの関心が高まっている。過去の研究はデータ駆動型PDE発見において大きな成功を収めてきたが、低品質の測定データを扱う場合、既存の手法の堅牢性は保証できない。この課題を克服するために,不足・雑音データから時空間PDEを発見するための物理符号化離散学習フレームワークを提案する。まず、1)表現能力に柔軟でありながら事前の物理知識(PDE用語、仮定されたPDE構造、初期/境界条件など)を符号化し、高忠実度データを正確に再構成し、(2)再構成されたデータでスパースレグレッションを行い、PDEの明示的な形式を特定する、新しいディープ・畳み込み・リカレント・ネットワークを導入する。本手法を非線形PDEシステムで検証する。ベースラインモデルに対する提案手法の有効性と優位性を示す。

There has been growing interest in leveraging experimental measurements to discover the underlying partial differential equations (PDEs) that govern complex physical phenomena. Although past research attempts have achieved great success in data-driven PDE discovery, the robustness of the existing methods cannot be guaranteed when dealing with low-quality measurement data. To overcome this challenge, we propose a novel physics-encoded discrete learning framework for discovering spatiotemporal PDEs from scarce and noisy data. The general idea is to (1) first introduce a novel deep convolutional-recurrent network, which can encode prior physics knowledge (e.g., known PDE terms, assumed PDE structure, initial/boundary conditions, etc.) while remaining flexible in representation capability, to accurately reconstruct high-fidelity data, and (2) perform sparse regression with the reconstructed data to identify the explicit form of the governing PDEs. We validate our method on three nonlinear PDE systems. The effectiveness and superiority of the proposed method over baseline models are demonstrated.

翻訳日:2022-02-01 15:40:02 公開日:2022-01-28

# 安全編集者政策による安全強化学習に向けて

Towards Safe Reinforcement Learning with a Safety Editor Policy ( http://arxiv.org/abs/2201.12427v1 )

ライセンス: Link先を確認

Haonan Yu, Wei Xu, Haichao Zhang

(参考訳) 制約を満たすとともに実用性を最大化する安全強化学習(RL)問題を考察する。我々は、安全概念の事前知識や事前訓練を前提としないので、漸近的制約満足度に興味を持っている。この研究で一般的なアプローチは、ラグランジアン法とモデルなしRLアルゴリズムを組み合わせることで、制約報酬の重み付けを動的に調整することである。効用と制約報酬の衝突に対処するための単一のポリシーに依存しており、しばしば困難である。安全層設計(dalal et al., 2018)に着想を得た我々は、ユーティリティ最大化ポリシーによって出力される潜在的安全でないアクションを安全なものに変換する安全エディタポリシーを別々に学ぶことを提案する。安全編集者は、編集前後のアクションの実用Q値のヒンジ損失を最小限に抑えつつ、制約報酬を最大化するように訓練される。厳格な制約しきい値を持つ12のカスタムセーフティジム(ray et al., 2019)と2つのセーフレーシングタスクにおいて,本手法は制約に準拠しながら優れた実用性能を示す。アブレーション研究は、我々の2つの政治デザインが重要であることを示している。典型的な単一政治アプローチのモデル容量を2倍にするだけでは、同等の結果にはならない。特定の状況ではQヒンジ損失も重要であり、通常のL2距離に置き換えるには失敗する可能性がある。

We consider the safe reinforcement learning (RL) problem of maximizing utility while satisfying provided constraints. Since we do not assume any prior knowledge or pre-training of the safety concept, we are interested in asymptotic constraint satisfaction. A popular approach in this line of research is to combine the Lagrangian method with a model-free RL algorithm to adjust the weight of the constraint reward dynamically. It relies on a single policy to handle the conflict between utility and constraint rewards, which is often challenging. Inspired by the safety layer design (Dalal et al., 2018), we propose to separately learn a safety editor policy that transforms potentially unsafe actions output by a utility maximizer policy into safe ones. The safety editor is trained to maximize the constraint reward while minimizing a hinge loss of the utility Q values of actions before and after the edit. On 12 custom Safety Gym (Ray et al., 2019) tasks and 2 safe racing tasks with very harsh constraint thresholds, our approach demonstrates outstanding utility performance while complying with the constraints. Ablation studies reveal that our two-policy design is critical. Simply doubling the model capacity of typical single-policy approaches will not lead to comparable results. The Q hinge loss is also important in certain circumstances, and replacing it with the usual L2 distance could fail badly.

翻訳日:2022-02-01 15:39:42 公開日:2022-01-28

# エントロピー・リワード(実践)は必要か?

Do You Need the Entropy Reward (in Practice)? ( http://arxiv.org/abs/2201.12434v1 )

ライセンス: Link先を確認

Haonan Yu, Haichao Zhang, Wei Xu

(参考訳) 最大エントロピー(MaxEnt) RLは、元のタスク報酬とエントロピー報酬の組み合わせを最大化する。エントロピーによって課される規則化は、政策改善と政策評価の両方において、共に良好な探索、訓練の収束、学習した政策の堅牢性に寄与していると考えられている。本稿では,MaxEnt RLの代表者であるソフトアクター・クリティック(SAC)に対する様々なアブレーション研究を行い,エントロピーを本質的な報酬としてより深く考察する。以上の結果から,エントロピー報酬は政策評価に留意して適用すべきである。一方、エントロピー報酬は他の固有の報酬と同様に、適切に管理されていない場合、メインタスク報酬を曖昧にすることができる。特にエピソード的マルコフ決定過程(MDP)におけるエントロピー報酬(entropy reward)の失敗事例を同定し,政策が過度に楽観的あるいは悲観的になる可能性を示唆した。一方,本研究は,エントロピー正規化を政策改善にのみ用いることは,政策改善と政策評価の両方で使用するよりも,同等あるいはそれ以上のパフォーマンスと堅牢性をもたらすことを示した。これらの観測に基づいて、エントロピー報酬をゼロ平均(SACZero)に正規化するか、あるいはより実用的な結果を得るために政策評価(SACLite)から単に取り除くことを推奨する。

Maximum entropy (MaxEnt) RL maximizes a combination of the original task reward and an entropy reward. It is believed that the regularization imposed by entropy, on both policy improvement and policy evaluation, together contributes to good exploration, training convergence, and robustness of learned policies. This paper takes a closer look at entropy as an intrinsic reward, by conducting various ablation studies on soft actor-critic (SAC), a popular representative of MaxEnt RL. Our findings reveal that in general, entropy rewards should be applied with caution to policy evaluation. On one hand, the entropy reward, like any other intrinsic reward, could obscure the main task reward if it is not properly managed. We identify some failure cases of the entropy reward especially in episodic Markov decision processes (MDPs), where it could cause the policy to be overly optimistic or pessimistic. On the other hand, our large-scale empirical study shows that using entropy regularization alone in policy improvement, leads to comparable or even better performance and robustness than using it in both policy improvement and policy evaluation. Based on these observations, we recommend either normalizing the entropy reward to a zero mean (SACZero), or simply removing it from policy evaluation (SACLite) for better practical results.
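
The difference between using and dropping the entropy reward in policy evaluation can be seen in the critic's bootstrapped target. The sketch below uses the standard SAC target and reads "removing it from policy evaluation" as dropping the entropy term from that target; this reading, the function name, and the numbers are assumptions made for illustration, and the zero-mean normalization (SACZero) is not sketched.

```python
def td_target(reward, next_q, next_log_prob, gamma=0.99, alpha=0.2,
              use_entropy=True, done=False):
    """Critic target for one transition.

    With use_entropy=True this is the usual soft target
    r + gamma * (Q(s', a') - alpha * log pi(a'|s')).
    With use_entropy=False the entropy bonus is left out of the target
    (an assumed reading of the SACLite recommendation).
    """
    if done:
        # No bootstrapping past a terminal state.
        return reward
    bonus = -alpha * next_log_prob if use_entropy else 0.0
    return reward + gamma * (next_q + bonus)


# Hypothetical transition: reward 1, next-state Q of 2, log-probability -1.
with_entropy = td_target(1.0, 2.0, -1.0, gamma=0.5, alpha=0.2)
without_entropy = td_target(1.0, 2.0, -1.0, gamma=0.5, alpha=0.2,
                            use_entropy=False)
print(with_entropy, without_entropy)
```

In both variants the actor can still be trained with an entropy term; only the value target changes, which matches the abstract's distinction between policy improvement and policy evaluation.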

翻訳日:2022-02-01 15:38:01 公開日:2022-01-28

# Any-Play: ゼロショットコーディネーションに固有の拡張

Any-Play: An Intrinsic Augmentation for Zero-Shot Coordination ( http://arxiv.org/abs/2201.12436v1 )

ライセンス: Link先を確認

Keane Lucas and Ross E. Allen

(参考訳) 協調作業における人間または超人的能力を持つ協調人工知能は、機械学習研究のフロンティアに立っている。先行研究は、セルフプレイ(一緒に訓練されたエージェントで構成されるチーム)とクロスプレイ(同じアルゴリズムを使用して独立して訓練されたエージェントのチーム)の制限パラダイムの下で、協調aiのパフォーマンスを評価する傾向があった。最近の研究によると、これらの狭い設定に最適化されたaiは、現実世界で望ましくない協力者になる可能性がある。我々は、エージェント間のアルゴリズム的類似性を仮定することなく、実験プール内の他のすべてのエージェントとの協調性能の評価を行う、アルゴリズム間クロスプレイと呼ばれる協調AIを評価するための代替基準を定式化する。このパラダイムでは、Other-Play や Off-Belief Learning といった既存の最先端の協調型AIアルゴリズムが低パフォーマンスであることを示す。本稿では,ゼロショットコーディネーション(ZSC)のための多様性に基づく固有報酬のマルチエージェント拡張であるAny-Play学習拡張を提案する。本研究では,Any-Play学習をSAD(Simplified Action Decoder)に適用し,コラボレーションカードゲーム「はなび」の最先端性能を示す。

Cooperative artificial intelligence with human or superhuman proficiency in collaborative tasks stands at the frontier of machine learning research. Prior work has tended to evaluate cooperative AI performance under the restrictive paradigms of self-play (teams composed of agents trained together) and cross-play (teams of agents trained independently but using the same algorithm). Recent work has indicated that AI optimized for these narrow settings may make for undesirable collaborators in the real world. We formalize an alternative criterion for evaluating cooperative AI, referred to as inter-algorithm cross-play, where agents are evaluated on teaming performance with all other agents within an experiment pool with no assumption of algorithmic similarities between agents. We show that existing state-of-the-art cooperative AI algorithms, such as Other-Play and Off-Belief Learning, under-perform in this paradigm. We propose the Any-Play learning augmentation -- a multi-agent extension of diversity-based intrinsic rewards for zero-shot coordination (ZSC) -- for generalizing self-play-based algorithms to the inter-algorithm cross-play setting. We apply the Any-Play learning augmentation to the Simplified Action Decoder (SAD) and demonstrate state-of-the-art performance in the collaborative card game Hanabi.

翻訳日:2022-02-01 15:37:34 公開日:2022-01-28

# Automaton-augmented Retrievalを用いたニューロシンボリック言語モデリング

Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval ( http://arxiv.org/abs/2201.12431v1 )

ライセンス: Link先を確認

Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig

(参考訳) 検索型言語モデル(R-LM)は、標準言語モデル(LM)とテスト時に外部データストアから取得した例を組み合わせることで、自然言語テキストの確率をモデル化する。効果的ではあるが、実際にこれらのモデルを使用する際の大きなボトルネックは計算コストのかかるデータストア検索である。本稿では,(1)エントリの"状態"へのクラスタリングと(2)前のエントリからの状態遷移に基づいて,データストア検索を近似したRetoMaton --検索オートマトンを提案する。これにより、データストアをフラットリストとして表現するのではなく、データストア上に構築された重み付き有限オートマトンが実現される。オートマトンの作成は監視されず、RetoMatonはオリジナルのトレーニングコーパスまたは他のドメインから、任意のテキストコレクションから構築することができる。このオートマトンをLM推論と並行して推論時にトラバースすることは、その難易度を減少させるか、またはkNN-LM(Khandelwal et al., 2020)上で最も近い隣人探索の83%を、難易度を損なうことなく節約する。

Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time. While effective, a major bottleneck of using these models in practice is the computationally costly datastore search, which can be performed as frequently as every time step. In this paper, we present RetoMaton -- retrieval automaton -- which approximates the datastore search, based on (1) clustering of entries into "states", and (2) state transitions from previous entries. This effectively results in a weighted finite automaton built on top of the datastore, instead of representing the datastore as a flat list. The creation of the automaton is unsupervised, and a RetoMaton can be constructed from any text collection: either the original training corpus or from another domain. Traversing this automaton at inference time, in parallel to the LM inference, reduces its perplexity, or alternatively saves up to 83% of the nearest neighbor searches over kNN-LM (Khandelwal et al., 2020), without hurting perplexity.
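
For context, the kNN-LM baseline cited in this abstract mixes the language model's next-token distribution with a distribution built from retrieved datastore neighbors. The sketch below shows only that interpolation step; the function names, the toy distances, and the weight `lam` are hypothetical, and the automaton that RetoMaton uses to avoid most searches is not implemented here.

```python
import math
from collections import defaultdict


def knn_distribution(neighbors, temperature=1.0):
    """Turn retrieved (distance, next_token) pairs into a token distribution."""
    weights = [math.exp(-dist / temperature) for dist, _ in neighbors]
    total = sum(weights)
    probs = defaultdict(float)
    for weight, (_, token) in zip(weights, neighbors):
        # Neighbors that share a next token pool their probability mass.
        probs[token] += weight / total
    return dict(probs)


def interpolate(p_lm, p_knn, lam=0.25):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm."""
    vocab = set(p_lm) | set(p_knn)
    return {tok: lam * p_knn.get(tok, 0.0) + (1 - lam) * p_lm.get(tok, 0.0)
            for tok in vocab}


# Hypothetical example: two equally distant neighbors, a two-token LM output.
p_knn = knn_distribution([(0.0, "a"), (0.0, "b")])
p_mix = interpolate({"a": 0.8, "c": 0.2}, p_knn, lam=0.5)
print(p_mix)
```

Producing `neighbors` requires a nearest-neighbor search at every time step in the baseline, which is the cost the abstract identifies as the bottleneck.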

翻訳日:2022-02-01 15:20:15 公開日:2022-01-28

# 善の逆例:不均衡学習を指導する逆例

Adversarial Examples for Good: Adversarial Examples Guided Imbalanced Learning ( http://arxiv.org/abs/2201.12356v1 )

ライセンス: Link先を確認

Jie Zhang, Lei Zhang, Gang Li, Chao Wu

(参考訳) 逆の例は、攻撃者がモデルをミスさせるよう設計した機械学習モデルの入力である。本稿では,不均衡学習の性能向上のために,逆例も有効に活用できることを実証する。我々は,不均衡なデータの扱い方について,GAE(Guiding Adversarial Examples)によるトレーニングによってバイアス付き決定境界を調整するという,新たな視点を提供する。本手法は,少数クラスの精度を効果的に向上し,多数派の精度を損なうことができる。いくつかのベンチマークデータセットにおいて,提案手法は最先端手法に匹敵することを示す。最善の知識として、我々は、逆の例で不均衡な学習を扱う最初の人です。

Adversarial examples are inputs for machine learning models that have been designed by attackers to cause the model to make mistakes. In this paper, we demonstrate that adversarial examples can also be utilized for good to improve the performance of imbalanced learning. We provide a new perspective on how to deal with imbalanced data: adjust the biased decision boundary by training with Guiding Adversarial Examples (GAEs). Our method can effectively increase the accuracy of minority classes while sacrificing little accuracy on majority classes. We empirically show, on several benchmark datasets, that our proposed method is comparable to the state-of-the-art method. To the best of our knowledge, we are the first to deal with imbalanced learning with adversarial examples.

翻訳日:2022-02-01 15:19:04 公開日:2022-01-28

# 適応型ルックアヘッドによる計画と学習

Planning and Learning with Adaptive Lookahead ( http://arxiv.org/abs/2201.12403v1 )

ライセンス: Link先を確認

Aviv Rosenberg and Assaf Hallak and Shie Mannor and Gal Chechik and Gal Dalal

(参考訳) 古典的ポリシーイテレーション(PI)アルゴリズムは、強欲な一段階の政策改善と政策評価を交互に行う。近年の文献では、複数ステップのルックアヘッド政策の改善は、イテレーション毎の複雑さの増加を犠牲にして、収束率の向上につながることが示されている。しかし、アルゴリズムを実行する前に、何が最良の固定された視線地平線であるかを知ることはできない。さらに、与えられた走行ごとに、地平線より大きい視線を使うのは、しばしば無駄である。本研究では,多段階の地平線を状態と推定値の関数として動的に適応する手法として,初めて提案する。 2つのPI変種を考案し、イテレーション数とイテレーション毎の計算複雑性のトレードオフを分析する。第1の変種は所望の縮小係数を目的とし、文単位の複雑さを最小化する。第2の変種は1イテレーションあたりの計算複雑性を入力とし、全体の収縮係数を最小化する。次に、適応木探索地平線を持つ対応するDQNに基づくアルゴリズムを考案する。また、オンライン学習の新たな拡張として、奥行き値関数推定器(per-deepth value function estimator)も含んでいる。最後に, 迷路環境およびアタリにおける適応型ルックアヘッド法の有効性を実証した。

The classical Policy Iteration (PI) algorithm alternates between greedy one-step policy improvement and policy evaluation. Recent literature shows that multi-step lookahead policy improvement leads to a better convergence rate at the expense of increased complexity per iteration. However, prior to running the algorithm, one cannot tell what the best fixed lookahead horizon is. Moreover, within a given run, using a lookahead horizon larger than one is often wasteful. In this work, we propose for the first time to dynamically adapt the multi-step lookahead horizon as a function of the state and of the value estimate. We devise two PI variants and analyze the trade-off between iteration count and computational complexity per iteration. The first variant takes the desired contraction factor as the objective and minimizes the per-iteration complexity. The second variant takes as input the computational complexity per iteration and minimizes the overall contraction factor. We then devise a corresponding DQN-based algorithm with an adaptive tree search horizon. We also include a novel enhancement for on-policy learning: a per-depth value function estimator. Lastly, we demonstrate the efficacy of our adaptive lookahead method in a maze environment and in Atari.
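
The effect of the lookahead horizon on the improvement step can be sketched on a tabular MDP with a fixed horizon. Everything in the example is an assumption for illustration: the three-state MDP, the function names, and the fixed `horizon` argument. The per-state adaptive choice of horizon proposed in the paper is not implemented.

```python
def q_value(values, transitions, rewards, gamma, s, a):
    """One-step return of action a in state s, bootstrapping from values."""
    return rewards[s][a] + gamma * sum(p * values[s2]
                                       for p, s2 in transitions[s][a])


def bellman_backup(values, transitions, rewards, gamma):
    """Apply one Bellman optimality backup to a tabular value estimate."""
    return [max(q_value(values, transitions, rewards, gamma, s, a)
                for a in range(len(transitions[s])))
            for s in range(len(transitions))]


def lookahead_greedy_policy(values, transitions, rewards, gamma, horizon):
    """Greedy policy after looking `horizon` steps ahead (horizon >= 1)."""
    for _ in range(horizon - 1):
        values = bellman_backup(values, transitions, rewards, gamma)
    policy = []
    for s in range(len(transitions)):
        actions = range(len(transitions[s]))
        policy.append(max(actions, key=lambda a: q_value(
            values, transitions, rewards, gamma, s, a)))
    return policy


# Hypothetical MDP: transitions[s][a] is a list of (probability, next_state).
TRANSITIONS = [
    [[(1.0, 0)], [(1.0, 1)]],  # state 0: action 0 stays, action 1 moves on
    [[(1.0, 2)]],              # state 1: single action, leads to state 2
    [[(1.0, 2)]],              # state 2: absorbing
]
REWARDS = [[1.0, 0.0], [10.0], [0.0]]
ZERO_VALUES = [0.0, 0.0, 0.0]

one_step = lookahead_greedy_policy(ZERO_VALUES, TRANSITIONS, REWARDS, 0.9, 1)
two_step = lookahead_greedy_policy(ZERO_VALUES, TRANSITIONS, REWARDS, 0.9, 2)
print(one_step, two_step)
```

In state 0 the one-step policy takes the small immediate reward, while the two-step policy gives it up to reach the larger reward one step later. Each extra step of lookahead costs one more backup over the state space, which is the per-iteration complexity the abstract trades against convergence rate.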

翻訳日:2022-02-01 15:18:54 公開日:2022-01-28

# 抽象的視覚推論のための深層学習法:レイブンの進行行列に関する調査

Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices ( http://arxiv.org/abs/2201.12382v1 )

ライセンス: Link先を確認

Mikołaj Małkiński and Jacek Mańdziuk

(参考訳) 抽象視覚推論(AVR)ドメインは、特定のシーンに存在するエンティティ間の関係を推論する能力を必要とする問題を解決する。人間は一般に「自然」な方法でAVRタスクを解くが、従来の経験がなくてもこのような問題は現在の機械学習システムでは難しいことが証明されている。本稿では,AVR問題に対するディープラーニング手法の適用の最近の進歩を,機械学習研究のプロキシとして要約する。我々は、最も一般的なタイプのAVRタスク(Raven's Progressive Matrices (RPM))に焦点を当て、RPMを解決するために適用される学習方法と深層ニューラルネットワークの包括的なレビューとRPMベンチマークセットを提供する。 RPMを解くための最先端のアプローチのパフォーマンス分析は、この分野の現在と将来のトレンドに関する特定の洞察と発言の定式化につながる。本論文は,rpm研究の発見から実世界の問題がどのように恩恵を受けるかを示すことで結論づける。

The abstract visual reasoning (AVR) domain encompasses problems whose solution requires the ability to reason about relations among entities present in a given scene. While humans generally solve AVR tasks in a "natural" way, even without prior experience, this type of problem has proven difficult for current machine learning systems. The paper summarises recent progress in applying deep learning methods to solving AVR problems, as a proxy for studying machine intelligence. We focus on the most common type of AVR task -- the Raven's Progressive Matrices (RPMs) -- and provide a comprehensive review of the learning methods and deep neural models applied to solve RPMs, as well as the RPM benchmark sets. Performance analysis of the state-of-the-art approaches to solving RPMs leads to the formulation of certain insights and remarks on the current and future trends in this area. We conclude the paper by demonstrating how real-world problems can benefit from the discoveries of RPM studies.

翻訳日:2022-02-01 14:38:51 公開日:2022-01-28

# 動的雑音の背景における視覚探索戦略の最適化のための深部q学習法

A deep Q-learning method for optimizing visual search strategies in backgrounds of dynamic noise ( http://arxiv.org/abs/2201.12385v1 )

ライセンス: Link先を確認

Weimin Zhou, Miguel P. Eckstein

(参考訳) 人間は様々な解像度で視覚情報を処理し(探索された視覚システム)、目の動きを通して高解像度の焦点を興味のある点に向けて画像を探索する。タスク関連情報の完全な知識を用いるベイズ理想探索器(is)は、眼球運動戦略を最適化し、最適な探索性能を達成する。 ISは、人間の眼球運動の最適性を評価する重要なツールとして利用でき、人間の視線探索戦略を改善するためのガイダンスを提供する可能性がある。 Najemnik と Geisler (2005) は空間的 1/f ノイズの背景に対する IS を導出した。対応するテンプレート応答はガウス分布に従い、最適な探索戦略を解析的に決定することができる。しかし、医療画像のようなより現実的で複雑な背景を考えると、ISの計算は難解である。現代の強化学習法は、様々なタスクに対して最適なポリシーを得るためにうまく適用され、背景生成関数の完全な知識を必要とせず、解剖学的背景に適用することができる。重要な第一歩は強化学習法の最適性を検証することである。本研究では, isを近似するqネットワークを用いた強化学習手法について検討する。本稿では,qネットワークに対応する検索戦略がis検索戦略と一致することを示す。本研究は,実解剖学的背景を用いた最適眼球運動計画推定のためのq-networkアプローチによる強化学習の可能性を示す。

Humans process visual information with varying resolution (foveated visual system) and explore images by orienting the high-resolution fovea to points of interest through eye movements. The Bayesian ideal searcher (IS), which employs complete knowledge of task-relevant information, optimizes eye movement strategy and achieves the optimal search performance. The IS can be employed as an important tool to evaluate the optimality of human eye movements, and can potentially provide guidance to improve human observer visual search strategies. Najemnik and Geisler (2005) derived an IS for backgrounds of spatial 1/f noise. The corresponding template responses follow Gaussian distributions and the optimal search strategy can be analytically determined. However, the computation of the IS can be intractable when considering more realistic and complex backgrounds such as medical images. Modern reinforcement learning methods, successfully applied to obtain optimal policies for a variety of tasks, do not require complete knowledge of the background generating functions and can potentially be applied to anatomical backgrounds. An important first step is to validate the optimality of the reinforcement learning method. In this study, we investigate the ability of a reinforcement learning method that employs a Q-network to approximate the IS. We demonstrate that the search strategy corresponding to the Q-network is consistent with the IS search strategy. The findings show the potential of reinforcement learning with a Q-network to estimate optimal eye movement planning with real anatomical backgrounds.

翻訳日:2022-02-01 14:36:51 公開日:2022-01-28

# (参考訳) データセンターにおけるネットワーク負荷分散のためのマルチエージェント強化学習

Multi-Agent Reinforcement Learning for Network Load Balancing in Data Center ( http://arxiv.org/abs/2201.11727v2 )

ライセンス: CC0 1.0

Zhiyuan Yao, Zihan Ding, Thomas Clausen

(参考訳) 本稿では,マルチエージェント強化学習(marl)手法のための実世界課題であるネットワーク負荷分散問題を提案する。 Weighted-Cost Multi-Path (WCMP)やLocal Shortest Queue (LSQ)のような従来のヒューリスティックなソリューションは、ワークロードの分散や到着率の変化に対して柔軟性が低く、複数のロードバランサ間のバランスが低い。協調的ネットワーク負荷分散タスクはDec-POMDP問題として定式化され、MARL法を自然に誘導する。学習に基づく手法を適用するための現実のギャップを埋めるため、すべての手法は中程度から大規模までのエミュレーションシステム上で直接訓練され、評価される。現実的なテストベッドの実験では、独立的で"利己的"なロードバランシング戦略が必ずしもグローバルな最適戦略ではないことが示され、提案されたMARLソリューションは、異なる現実的な設定よりも優れたパフォーマンスを示している。さらに,ネットワークロードバランシングにおけるmarl手法の潜在的な難しさを解析し,学習者やネットワークコミュニティの関心を引きつけている。

This paper presents the network load balancing problem, a challenging real-world task for multi-agent reinforcement learning (MARL) methods. Traditional heuristic solutions like Weighted-Cost Multi-Path (WCMP) and Local Shortest Queue (LSQ) are less flexible to the changing workload distributions and arrival rates, with a poor balance among multiple load balancers. The cooperative network load balancing task is formulated as a Dec-POMDP problem, which naturally induces the MARL methods. To bridge the reality gap for applying learning-based methods, all methods are directly trained and evaluated on an emulation system at moderate to large scale. Experiments on realistic testbeds show that the independent and "selfish" load balancing strategies are not necessarily the globally optimal ones, while the proposed MARL solution has a superior performance over different realistic settings. Additionally, the potential difficulties of MARL methods for network load balancing are analysed, which helps to draw the attention of the learning and network communities to such challenges.

翻訳日:2022-02-01 13:35:55 公開日:2022-01-28

# (参考訳) 少数の解釈可能かつ説明可能な特徴による白血球白血病の分類

Classification of White Blood Cell Leukemia with Low Number of Interpretable and Explainable Features ( http://arxiv.org/abs/2201.11864v1 )

ライセンス: CC BY 4.0

William Franz Lamberti

(参考訳) 白血球(WBC)白血病は画像ベース分類によって検出される。畳み込みニューラルネットワークは、細胞の画像を悪性または健康に分類するために必要な特徴を学ぶために使用される。しかし、この種のモデルは多数のパラメータを学習する必要があり、解釈や説明が困難である。説明可能なAI(XAI)は、モデルが意思決定を行う方法に関する洞察を提供することで、この問題を緩和しようとする。そこで本研究では,説明可能かつ解釈可能な特徴を24個しか用いず,他の手法を約4.38%上回る高い競争力を持つXAIモデルを提案する。さらに,本手法は,細胞の分類においてどの変数が最も重要であるかについての洞察を与える。この洞察は、研究室によってWBCの扱いが異なると、様々な指標の重要性が大幅に変化することを示す証拠となる。分類に重要な特徴を理解することは、医用画像診断において、ひいては科学研究のために構築されたAIモデルの理解において不可欠である。

White Blood Cell (WBC) Leukaemia is detected through image-based classification. Convolutional Neural Networks are used to learn the features needed to classify images of cells as malignant or healthy. However, this type of model requires learning a large number of parameters and is difficult to interpret and explain. Explainable AI (XAI) attempts to alleviate this issue by providing insights into how models make decisions. Therefore, we present an XAI model which uses only 24 explainable and interpretable features and is highly competitive with other approaches, outperforming them by about 4.38%. Further, our approach provides insight into which variables are the most important for the classification of the cells. This insight provides evidence that when labs treat the WBCs differently, the importance of various metrics changes substantially. Understanding the important features for classification is vital in medical imaging diagnosis and, by extension, understanding the AI models built in scientific pursuits.

翻訳日:2022-02-01 09:28:22 公開日:2022-01-28

# (参考訳) FedLite: リソース制約のあるクライアント上でのフェデレーション学習のためのスケーラブルなアプローチ

FedLite: A Scalable Approach for Federated Learning on Resource-constrained Clients ( http://arxiv.org/abs/2201.11865v1 )

ライセンス: CC BY 4.0

Jianyu Wang, Hang Qi, Ankit Singh Rawat, Sashank Reddi, Sagar Waghmare, Felix X. Yu, Gauri Joshi

(参考訳) 古典的なフェデレーション学習では、クライアントは、プライベートデータ上の基盤モデルのローカルアップデートをコーディネートサーバに伝えて、全体的なトレーニングに寄与する。しかし、リソースに制約のあるクライアントが大規模な機械学習モデルを学習しようとすると、モデル全体の更新と通信は極めて高価になる。分割学習は、モデルの一部だけがクライアントに保存され、トレーニングされ、残りの大部分がサーバに留まっているような環境で、自然なソリューションを提供する。しかし、分割学習で使用されるモデル分割は、かなりの通信コストをもたらす。本稿では,勾配補正法を併用した新しいクラスタリング方式を用いて,付加的な通信を圧縮することで,この問題に対処する。画像およびテキストベンチマークの広範な実証評価により、提案手法は精度の低下を最小限に抑えつつ最大$490\times$の通信コスト削減を達成でき、望ましい性能と通信のトレードオフを実現できることが示された。

In classical federated learning, the clients contribute to the overall training by communicating local updates for the underlying model on their private data to a coordinating server. However, updating and communicating the entire model becomes prohibitively expensive when resource-constrained clients collectively aim to train a large machine learning model. Split learning provides a natural solution in such a setting, where only a small part of the model is stored and trained on clients while the remaining large part of the model only stays at the servers. However, the model partitioning employed in split learning introduces a significant amount of communication cost. This paper addresses this issue by compressing the additional communication using a novel clustering scheme accompanied by a gradient correction method. Extensive empirical evaluations on image and text benchmarks show that the proposed method can achieve up to $490\times$ communication cost reduction with minimal drop in accuracy, and enables a desirable performance vs. communication trade-off.

翻訳日:2022-02-01 09:12:40 公開日:2022-01-28

# (参考訳) ラベル平滑化を用いた病理組織像分類器の校正

Calibrating Histopathology Image Classifiers using Label Smoothing ( http://arxiv.org/abs/2201.11866v1 )

ライセンス: CC BY 4.0

Jerry Wei and Lorenzo Torresani and Jason Wei and Saeed Hassanpour

(参考訳) 病理組織像の分類は、病理組織像が自然に様々な診断的特徴を示すため、従来の画像分類課題と根本的に異なる。しかし、アノテータの不一致の例は、多くの場合、大多数のラベルに割り当てられるか、病理組織学画像分類器の訓練時に完全に破棄される。この広範にわたる慣行は、しばしば難易度を考慮せず、モデルのキャリブレーションが貧弱な分類器をもたらす。本稿では, 組織像分類器にサンプル難易度に関する帰納バイアスを与えることにより, モデル校正を改善することができるか? 画像毎のアノテータ合意を利用したラベル平滑化手法を提案する。提案手法は単純ではあるが,精度を維持(あるいは改善)しながら,モデルキャリブレーションを大幅に改善していることがわかった。大腸ポリープ分類は消化管病理における一般的な課題でありながら課題であり,本提案の合意対応ラベル平滑化手法は校正誤差を約70%削減する。さらに,アノテータ契約のプロキシとしてモデル信頼性を用いることでキャリブレーションと精度が向上し,複数のアノテータを含まないデータセットは,提案手法によるラベル平滑化手法の恩恵を受けられることが示唆された。キャリブレーション(特に病理組織学的画像解析)の重要性を考えると、提案手法の改善は、他の病理組織学的画像分類タスクにおけるさらなる探索と潜在的な実装に役立つ。

The classification of histopathology images fundamentally differs from traditional image classification tasks because histopathology images naturally exhibit a range of diagnostic features, resulting in a diverse range of annotator agreement levels. However, examples with high annotator disagreement are often either assigned the majority label or discarded entirely when training histopathology image classifiers. This widespread practice often yields classifiers that do not account for example difficulty and exhibit poor model calibration. In this paper, we ask: can we improve model calibration by endowing histopathology image classifiers with inductive biases about example difficulty? We propose several label smoothing methods that utilize per-image annotator agreement. Though our methods are simple, we find that they substantially improve model calibration, while maintaining (or even improving) accuracy. For colorectal polyp classification, a common yet challenging task in gastrointestinal pathology, we find that our proposed agreement-aware label smoothing methods reduce calibration error by almost 70%. Moreover, we find that using model confidence as a proxy for annotator agreement also improves calibration and accuracy, suggesting that datasets without multiple annotators can still benefit from our proposed label smoothing methods via our proposed confidence-aware label smoothing methods. Given the importance of calibration (especially in histopathology image analysis), the improvements from our proposed techniques merit further exploration and potential implementation in other histopathology image classification tasks.
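
As a rough illustration of the idea of agreement-aware label smoothing, the sketch below turns per-image annotator votes into a soft training target. The function name `agreement_aware_target`, the `max_smoothing` parameter, and the linear rule linking disagreement to smoothing strength are assumptions made for illustration; the abstract does not state the exact formulas used in the paper.

```python
from collections import Counter


def agreement_aware_target(votes, num_classes, max_smoothing=0.3):
    """Build a soft label from annotator votes (hypothetical helper).

    votes: list of integer class ids, one per annotator.
    The majority class keeps most of the probability mass; the amount
    moved to the other classes grows with annotator disagreement.
    """
    counts = Counter(votes)
    majority, majority_count = counts.most_common(1)[0]
    agreement = majority_count / len(votes)        # 1.0 means unanimous
    smoothing = max_smoothing * (1.0 - agreement)  # assumption: linear rule
    # Spread the smoothed mass uniformly over the non-majority classes.
    target = [smoothing / (num_classes - 1)] * num_classes
    target[majority] = 1.0 - smoothing
    return target


unanimous = agreement_aware_target([2, 2, 2], num_classes=4)
split = agreement_aware_target([2, 2, 1], num_classes=4)
```

Under this rule a unanimously labelled image keeps a one-hot target, while an image with a 2-to-1 vote gets a softer target, so the classifier is not pushed toward full confidence on contested examples.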

翻訳日:2022-02-01 08:47:00 公開日:2022-01-28

# (参考訳) 構造入力による局所潜時空間ベイズ最適化

Local Latent Space Bayesian Optimization over Structured Inputs ( http://arxiv.org/abs/2201.11872v1 )

ライセンス: CC BY-SA 4.0

Natalie Maus, Haydn T. Jones, Juston S. Moore, Matt J. Kusner, John Bradshaw, Jacob R. Gardner

(参考訳) 深層オートエンコーダモデル(英語版)(daes)の潜在空間上のベイズ最適化は、構造化、離散化、列挙困難な探索空間(例えば分子)に対して挑戦的なブラックボックス関数を最適化するための有望な新しいアプローチとして最近登場した。ここで、daeは入力をベイズ最適化ツールがより容易に適用できる連続的潜在空間にマッピングすることで、検索空間を劇的に単純化する。この単純化にもかかわらず、潜在空間は通常高次元のままである。したがって、うまく適合した潜在空間であっても、これらのアプローチは必ずしも完全な解を提供するものではなく、むしろ構造化最適化問題を高次元空間に移すことができる。本稿では,高次元ベイズ最適化における信頼領域の概念を構造化環境に適応させるLOL-BOを提案する。 daeのグローバルエンコーダと信頼領域内のサロゲートモデルのディープカーネルの両方として機能するようにエンコーダを再構成することで、潜在空間における局所最適化の概念を入力空間における局所最適化と一致させる。 LOL-BOは6つの実世界のベンチマークで、最先端の潜在空間ベイズ最適化手法よりも最大20倍の改善を実現し、最適化戦略の改善はより良いDAEモデルの開発と同じくらい重要であることを示した。

Bayesian optimization over the latent spaces of deep autoencoder models (DAEs) has recently emerged as a promising new approach for optimizing challenging black-box functions over structured, discrete, hard-to-enumerate search spaces (e.g., molecules). Here the DAE dramatically simplifies the search space by mapping inputs into a continuous latent space where familiar Bayesian optimization tools can be more readily applied. Despite this simplification, the latent space typically remains high-dimensional. Thus, even with a well-suited latent space, these approaches do not necessarily provide a complete solution, but may rather shift the structured optimization problem to a high-dimensional one. In this paper, we propose LOL-BO, which adapts the notion of trust regions explored in recent work on high-dimensional Bayesian optimization to the structured setting. By reformulating the encoder to function as both an encoder for the DAE globally and as a deep kernel for the surrogate model within a trust region, we better align the notion of local optimization in the latent space with local optimization in the input space. LOL-BO achieves as much as 20 times improvement over state-of-the-art latent space Bayesian optimization methods across six real-world benchmarks, demonstrating that improvement in optimization strategies is as important as developing better DAE models.

翻訳日:2022-02-01 08:35:16 公開日:2022-01-28

# (参考訳) オートエンコーダアーキテクチャにおける分布データの幾何学的不安定性

Geometric instability of out of distribution data across autoencoder architecture ( http://arxiv.org/abs/2201.11902v1 )

ライセンス: CC BY 4.0

Susama Agarwala, Ben Dees, Corey Lowman

(参考訳) mnistで学習したオートエンコーダの系統が学習した地図を調査し,10種類の分布に応じて画素値のランダム選択により作成した10種類のデータセットについて評価した。具体的には,オートエンコーダの重み行列で定義されるジャコビアンの固有値と評価点について検討する。十分高い潜在次元では、各オートエンコーダは、類似の \emph{generalized characters} としてすべての評価データセットを再構成するが、この再構成された \emph{generalized character} は、オートエンコーダをまたいで変化する。固有値解析により、再構成された画像が分布データセットの全てに対してMNIST文字のように見える場合でも、MNIST文字の潜在表現に近い潜在表現を持つとは限らないことが分かる。いずれにせよ、固有値解析は、分布入力の関数としてのオートエンコーダの幾何的不安定性を、同じ入力の集合上のアーキテクチャ全体にわたって証明した。

We study the map learned by a family of autoencoders trained on MNIST, and evaluated on ten different data sets created by the random selection of pixel values according to ten different distributions. Specifically, we study the eigenvalues of the Jacobians defined by the weight matrices of the autoencoder at each training and evaluation point. For high enough latent dimension, we find that each autoencoder reconstructs all the evaluation data sets as similar \emph{generalized characters}, but that this reconstructed \emph{generalized character} changes across autoencoders. Eigenvalue analysis shows that even when the reconstructed image appears to be an MNIST character for all out-of-distribution data sets, not all have latent representations that are close to the latent representation of MNIST characters. All told, the eigenvalue analysis demonstrates a great deal of geometric instability of the autoencoder both as a function on out-of-distribution inputs, and across architectures on the same set of inputs.

翻訳日:2022-02-01 08:12:49 公開日:2022-01-28

# (参考訳) 思考の連鎖プロンプトは大規模言語モデルの推論を引き出す

Chain of Thought Prompting Elicits Reasoning in Large Language Models ( http://arxiv.org/abs/2201.11903v1 )

ライセンス: CC BY 4.0

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Quoc Le, Denny Zhou

(参考訳) 言語モデルのスケールアップは、様々なNLPタスクのパフォーマンスを確実に向上させたが、現在最大のモデルでさえ、数学語問題、記号操作、コモンセンス推論のような特定の推論タスクに苦戦している。本稿は,質問に回答する際の推論過程を模倣した一連の短い文である,一貫性のある思考列を生成する言語モデルの能力について考察する。実験により、プロンプトによって思考の連鎖を誘導することで、十分に大きな言語モデルが、平らなスケーリング曲線を持つ推論タスクをより良く実行できるようになることが示されている。

Although scaling up language model size has reliably improved performance on a range of NLP tasks, even the largest models currently struggle with certain reasoning tasks such as math word problems, symbolic manipulation, and commonsense reasoning. This paper explores the ability of language models to generate a coherent chain of thought -- a series of short sentences that mimic the reasoning process a person might have when responding to a question. Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks that otherwise have flat scaling curves.
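
To make the prompting format concrete, here is a minimal sketch of how a few-shot prompt with worked reasoning might be assembled. The helper `build_cot_prompt`, the "Q:/A:" layout, and the arithmetic exemplar are illustrative assumptions, not material taken from the paper.

```python
def build_cot_prompt(exemplars, question):
    """Assemble a few-shot prompt whose exemplars spell out their reasoning.

    exemplars: list of dicts with 'question', 'reasoning', and 'answer' keys
    (a hypothetical structure chosen for this sketch).
    """
    parts = []
    for ex in exemplars:
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: {ex['reasoning']} The answer is {ex['answer']}."
        )
    # The final question is left open so the model continues with its own reasoning.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


exemplars = [{
    "question": "A shelf holds 3 boxes with 4 pens each. How many pens are there?",
    "reasoning": "Each box has 4 pens and there are 3 boxes, so 3 * 4 = 12.",
    "answer": "12",
}]
prompt = build_cot_prompt(
    exemplars,
    "Tom has 5 apples and buys 2 bags of 3 apples. How many apples does he have?",
)
```

The only difference from a standard few-shot prompt is the `reasoning` text placed before each answer; the model is expected to imitate that pattern when completing the last line.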

翻訳日:2022-02-01 07:53:50 公開日:2022-01-28

# (参考訳) 二値化スパイキングニューラルネットワークにおける死んだニューロンとスパース性の紙一重の差

The fine line between dead neurons and sparsity in binarized spiking neural networks ( http://arxiv.org/abs/2201.11915v1 )

ライセンス: CC BY 4.0

Jason K. Eshraghian, Wei D. Lu

(参考訳) スパイキングニューラルネットワークは、時間領域に情報を符号化するか、より高精度な隠れ状態で離散化された量を処理することによって、量子化誤差を補償することができる。理論上、広いダイナミックレンジを持つ状態空間は複数の二値化入力を蓄積することを可能にし、個々のニューロンの表現能力を向上させる。これは発火しきい値を上げることで達成できるが、しきい値を高くしすぎると、スパースなスパイク活動はスパイクが全く発生しない状態になってしまう。本稿では,発火しきい値のウォームアップ手法として「しきい値アニーリング(threshold annealing)」を提案する。この手法により、本来であればニューロンが発火を停止してしまう複数の層にわたってスパイクを伝播させることが可能となり、二値化重みを用いているにもかかわらず、4つの多様なデータセットで高い競争力を持つ結果が得られることを示す。ソースコードはhttps://github.com/jeshraghian/snn-tha/で入手できる。

Spiking neural networks can compensate for quantization error by encoding information either in the temporal domain, or by processing discretized quantities in hidden states of higher precision. In theory, a wide dynamic range state-space enables multiple binarized inputs to be accumulated together, thus improving the representational capacity of individual neurons. This may be achieved by increasing the firing threshold, but make it too high and sparse spike activity turns into no spike emission. In this paper, we propose the use of 'threshold annealing' as a warm-up method for firing thresholds. We show it enables the propagation of spikes across multiple layers where neurons would otherwise cease to fire, and in doing so, achieves highly competitive results on four diverse datasets, despite using binarized weights. Source code is available at https://github.com/jeshraghian/snn-tha/
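
The effect of warming up a firing threshold can be seen on a single leaky integrate-and-fire neuron. The sketch below is a toy simulation only: the exponential schedule in `annealed_threshold`, its parameter values, and the reset-by-subtraction neuron are assumptions for illustration, and no weight training is modelled.

```python
import math


def annealed_threshold(step, theta_start=0.25, theta_final=5.0, tau=20.0):
    """Threshold that relaxes from theta_start toward theta_final (assumed schedule)."""
    return theta_final - (theta_final - theta_start) * math.exp(-step / tau)


def count_spikes(inputs, thresholds, beta=0.9):
    """Leaky integrate-and-fire neuron with reset-by-subtraction."""
    mem, spikes = 0.0, 0
    for x, theta in zip(inputs, thresholds):
        mem = beta * mem + x      # leaky integration of the input current
        if mem >= theta:          # fire when the membrane reaches the threshold
            spikes += 1
            mem -= theta
    return spikes


steps = 50
inputs = [0.3] * steps  # weak constant drive; membrane saturates below 3.0

# A fixed high threshold is never reached, so the neuron stays silent.
fixed = count_spikes(inputs, [5.0] * steps)
# Starting low lets spikes pass early, before the threshold has grown.
warm = count_spikes(inputs, [annealed_threshold(t) for t in range(steps)])
```

With the fixed threshold the neuron is effectively dead, whereas the annealed schedule lets it emit spikes during the early steps. In actual training those early spikes would give downstream layers something to learn from while the threshold rises.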

翻訳日:2022-02-01 07:18:11 公開日:2022-01-28

# (参考訳) バタフライネットワーク上のタスク認識ネットワーク符号化

Task-Aware Network Coding Over Butterfly Network ( http://arxiv.org/abs/2201.11917v1 )

ライセンス: CC BY 4.0

Jiangnan Cheng, Sandeep Chinchali, Ao Tang

(参考訳) ネットワーク符号化により、センサなどの分散情報ソースは、帯域幅制限ネットワークを介して分散受信機にデータを効率よく圧縮し、送信することができる。古典的なネットワーク符号化は主にタスクに依存しない - 受信したデータがどの究極のタスクに使用されるかに関わらず、主に受信側でデータを忠実に再構築することを目的としている。本稿では、分散受信機が機械学習(ML)タスクを介して送信されたデータを渡すタスク駆動型ネットワークコーディング問題を分析し、有能なタスク関連データ表現を送信することで効率を向上する機会を提供する。具体的には、主成分分析(PCA)による損失アナログ圧縮を応用できる実座標空間におけるバタフライネットワーク上のタスク認識ネットワーク符号化問題を定式化する。定式化問題に対する全損失関数に対する下限が与えられ、この下限を達成するために必要な十分な条件も提供される。そこで本研究では,一般のケースで問題を解くためにmlアルゴリズムを導入し,タスク認識型ネットワーク符号化の有効性を実証する。

Network coding allows distributed information sources such as sensors to efficiently compress and transmit data to distributed receivers across a bandwidth-limited network. Classical network coding is largely task-agnostic -- the coding schemes mainly aim to faithfully reconstruct data at the receivers, regardless of what ultimate task the received data is used for. In this paper, we analyze a new task-driven network coding problem, where distributed receivers pass transmitted data through machine learning (ML) tasks, which provides an opportunity to improve efficiency by transmitting salient task-relevant data representations. Specifically, we formulate a task-aware network coding problem over a butterfly network in real-coordinate space, where lossy analog compression through principal component analysis (PCA) can be applied. A lower bound for the total loss function for the formulated problem is given, and necessary and sufficient conditions for achieving this lower bound are also provided. We introduce ML algorithms to solve the problem in the general case, and our evaluation demonstrates the effectiveness of task-aware network coding.

翻訳日:2022-02-01 06:56:39 公開日:2022-01-28

# (参考訳) ヘビーテール多腕バンディットのための適応的Best-of-Both-Worldsアルゴリズム

Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits ( http://arxiv.org/abs/2201.11921v1 )

ライセンス: CC BY-SA 4.0

Jiatai Huang, Yan Dai, Longbo Huang

(参考訳) 本稿では,ヘビーテール多腕バンディットの概念を敵対的環境に一般化し,ヘビーテール多腕バンディット (MAB) に対する頑健なbest-of-both-worldsアルゴリズムを開発する。ここで損失は $\sigma^\alpha$ で抑えられた $\alpha$ 次 ($1<\alpha\le 2$) モーメントを持つが,分散は存在しない場合がある。具体的には,アルゴリズム \texttt{HTINF} を設計する。ヘビーテールパラメータ $\alpha$ と $\sigma$ がエージェントに既知の場合,\texttt{HTINF} は実際の環境タイプを事前に知ることなく,確率的環境と敵対的環境の両方に対して同時に最適なリグレットを達成する。$\alpha,\sigma$ が未知の場合,\texttt{HTINF} は確率的ケースでは $\log T$ 型のインスタンス依存リグレットを,敵対的ケースでは $o(T)$ のno-regret保証を達成する。さらに,$\alpha$ と $\sigma$ に関する事前知識なしに,敵対的設定でも $\mathcal O(\sigma K^{1-\nicefrac 1\alpha}T^{\nicefrac{1}{\alpha}})$ のミニマックス最適リグレットを達成するアルゴリズム \texttt{AdaTINF} を開発する。この結果は,確率的環境を仮定し $\alpha$ と $\sigma$ の両方が既知である場合の既知のリグレット下界 (Bubeck et al., 2013) に一致する。我々の知る限り,提案する \texttt{HTINF} アルゴリズムはbest-of-both-worldsのリグレット保証を持つ初めてのアルゴリズムであり,\texttt{AdaTINF} は $\alpha$ と $\sigma$ の両方に適応して,古典的なヘビーテール確率的MAB設定と我々の新しい敵対的定式化において最適なギャップ非依存リグレット上界を達成できる初めてのアルゴリズムである。

In this paper, we generalize the concept of heavy-tailed multi-armed bandits to adversarial environments, and develop robust best-of-both-worlds algorithms for heavy-tailed multi-armed bandits (MAB), where losses have $\alpha$-th ($1<\alpha\le 2$) moments bounded by $\sigma^\alpha$, while the variances may not exist. Specifically, we design an algorithm \texttt{HTINF}: when the heavy-tail parameters $\alpha$ and $\sigma$ are known to the agent, \texttt{HTINF} simultaneously achieves the optimal regret for both stochastic and adversarial environments, without knowing the actual environment type a priori. When $\alpha,\sigma$ are unknown, \texttt{HTINF} achieves a $\log T$-style instance-dependent regret in stochastic cases and an $o(T)$ no-regret guarantee in adversarial cases. We further develop an algorithm \texttt{AdaTINF}, achieving $\mathcal O(\sigma K^{1-\nicefrac 1\alpha}T^{\nicefrac{1}{\alpha}})$ minimax optimal regret even in adversarial settings, without prior knowledge of $\alpha$ and $\sigma$. This result matches the known regret lower bound (Bubeck et al., 2013), which assumes a stochastic environment in which $\alpha$ and $\sigma$ are both known. To our knowledge, the proposed \texttt{HTINF} algorithm is the first to enjoy a best-of-both-worlds regret guarantee, and \texttt{AdaTINF} is the first algorithm that can adapt to both $\alpha$ and $\sigma$ to achieve the optimal gap-independent regret bound in the classical heavy-tailed stochastic MAB setting and in our novel adversarial formulation.

翻訳日:2022-02-01 06:27:18 公開日:2022-01-28

# (参考訳) 非凸最適化におけるデフレの簡易化とベイズ推論と位相最適化への応用

Simplifying deflation for non-convex optimization with applications in Bayesian inference and topology optimization ( http://arxiv.org/abs/2201.11926v1 )

ライセンス: CC BY 4.0

Mohamed Tarek, Yijiang Huang

(参考訳) 非凸最適化問題は複数の局所最適解を持つ。非凸最適化問題は、多くのアプリケーションでよく見られる。ランダムな再初期化なしに複数の局所最適解を効率的に探索する手法の1つがデフレの概念に依存している。本稿では,非凸最適化と非線形解法におけるデフレの異なる方法について述べる。既存の非線形プログラミング解法や非線形システム解法とともにデフレを可能にするために, 単純で汎用的で斬新なデフレ制約を提案する。提案したデフレ制約と最小距離制約との接続を示す。さらに、デフレ制約の様々なバリエーションとその制限について論じる。最後に、近似ベイズ推定と位相最適化の分野における提案手法の多くの応用について述べる。

Non-convex optimization problems have multiple local optimal solutions. Non-convex optimization problems are commonly found in numerous applications. One of the methods recently proposed to efficiently explore multiple local optimal solutions without random re-initialization relies on the concept of deflation. In this paper, different ways to use deflation in non-convex optimization and nonlinear system solving are discussed. A simple, general and novel deflation constraint is proposed to enable the use of deflation together with existing nonlinear programming solvers or nonlinear system solvers. The connection between the proposed deflation constraint and a minimum distance constraint is presented. Additionally, a number of variations of deflation constraints and their limitations are discussed. Finally, a number of applications of the proposed methodology in the fields of approximate Bayesian inference and topology optimization are presented.
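
For readers unfamiliar with deflation, the sketch below shows the classic multiplicative form used in nonlinear system solving, applied to a scalar equation. It is not the deflation constraint proposed in the paper, whose exact form the abstract does not give; the function name `deflated_newton` and the `power` and `shift` parameters are assumptions for this example.

```python
def deflated_newton(f, df, x0, known_roots, power=2, shift=1.0,
                    tol=1e-12, max_iter=50):
    """Newton's method on g(x) = f(x) * m(x).

    The deflation factor m(x) = prod_r (1 / |x - r|**power + shift) blows up
    at every root r already found, so the iteration is pushed away from them.
    """
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        m, dm = 1.0, 0.0
        for r in known_roots:
            d = x - r
            term = 1.0 / abs(d) ** power + shift
            dterm = -power * d / abs(d) ** (power + 2)
            dm = dm * term + m * dterm   # product rule for m'(x)
            m *= term
        g = fx * m
        dg = df(x) * m + fx * dm
        x -= g / dg
    raise RuntimeError("Newton iteration did not converge")


f = lambda x: x * x - 1.0
df = lambda x: 2.0 * x

# Same starting point both times; only the list of deflated roots differs.
first = deflated_newton(f, df, 2.0, [])
second = deflated_newton(f, df, 2.0, [first])
```

Starting from the same initial guess, the plain run converges to the root near 1 and the deflated run is steered to the root near -1, which is the behaviour that makes random re-initialization unnecessary.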

翻訳日:2022-02-01 05:47:25 公開日:2022-01-28

# (参考訳) 高速解釈可能なグリーディツリー和(fig)

Fast Interpretable Greedy-Tree Sums (FIGS) ( http://arxiv.org/abs/2201.11931v1 )

ライセンス: CC BY 4.0

Yan Shuo Tan, Chandan Singh, Keyan Nasseri, Abhineet Agarwal, Bin Yu

(参考訳) 現代の機械学習は印象的な予測性能を達成しているが、多くの問題において重要な考慮事項である解釈性を犠牲にすることが多い。本稿では,簡潔なルールモデルに適合するアルゴリズムであるFast Interpretable Greedy-Tree Sums (FIGS)を提案する。具体的には、FIGSはCARTアルゴリズムを一般化し、累積で柔軟な数の木を同時に成長させる。すべての木にまたがる分割の総数は予め定められたしきい値によって制限され、その木のサイズと数の両方が制御される。両者が小さい場合は、簡単に視覚化して手書きで書けるので、高い解釈が可能である。部分的にオラクルの理論的結果は、生成的加法モデルの付加成分を分離することで、figがシングルツリーモデルの重大な弱点を克服し、同じ特徴の繰り返し分裂による冗長性を低減できる可能性を示唆している。さらに、最適木構造へのオラクルアクセスが与えられた場合、C1成分関数の場合、そのような生成モデルに対してL2一般化境界を得る。さまざまな実世界のデータセットにわたる大規模な実験は、FIGSがほんの数分割(例:20未満)に制限された場合、最先端の予測性能(一般的なルールベースのメソッドすべて)を達成することを示している。 FIGSは繰り返し分割を回避でき、しばしば予測性能を犠牲にすることなく、適合した決定木よりも簡潔な決定ルールを提供する。すべてのコードとモデルはGithub \url{https://github.com/csinva/imodels} のフルパッケージでリリースされる。

Modern machine learning has achieved impressive prediction performance, but often sacrifices interpretability, a critical consideration in many problems. Here, we propose Fast Interpretable Greedy-Tree Sums (FIGS), an algorithm for fitting concise rule-based models. Specifically, FIGS generalizes the CART algorithm to simultaneously grow a flexible number of trees in a summation. The total number of splits across all the trees can be restricted by a pre-specified threshold, thereby keeping both the size and number of its trees under control. When both are small, the fitted tree-sum can be easily visualized and written out by hand, making it highly interpretable. A partially oracle theoretical result hints at the potential for FIGS to overcome a key weakness of single-tree models by disentangling additive components of generative additive models, thereby reducing redundancy from repeated splits on the same feature. Furthermore, given oracle access to optimal tree structures, we obtain L2 generalization bounds for such generative models in the case of C1 component functions, matching known minimax rates in some cases. Extensive experiments across a wide array of real-world datasets show that FIGS achieves state-of-the-art prediction performance (among all popular rule-based methods) when restricted to just a few splits (e.g. fewer than 20). We find empirically that FIGS is able to avoid repeated splits, and often provides more concise decision rules than fitted decision trees, without sacrificing predictive performance. All code and models are released in a full-fledged package on GitHub: https://github.com/csinva/imodels

翻訳日:2022-02-01 05:31:36 公開日:2022-01-28

# (参考訳) 周期グラフの深い生成モデル

Deep Generative Model for Periodic Graphs ( http://arxiv.org/abs/2201.11932v1 )

ライセンス: CC BY 4.0

Shiyu Wang, Xiaojie Guo, Liang Zhao

(参考訳) 周期グラフは、クリスタルネットやポリゴンメッシュのような繰り返し局所構造からなるグラフである。それらの生成モデリングは、マテリアルデザインやグラフィック合成といった現実世界の応用に大きな可能性を秘めている。古典モデルは、ドメイン固有の事前定義された生成原理(例えば、クリスタルネット設計)に依存するか、または幾何学に基づく所定の規則に従う。近年,深層生成モデルが一般グラフの自動生成に大きな期待を寄せている。しかし、それらの周期グラフへの進歩は、いくつかの重要な課題のために、十分に研究されていない。 1) グラフ周期性を維持すること 2) 地域的及びグローバル的パターンの分離 3)反復パターン学習の効率性。そこで本研究では,周期グラフの深部生成モデルとして,局所およびグローバルグラフパターンの自動学習・解離・生成が可能な周期グラフ分散変分オートエンコーダ(PGD-VAE)を提案する。具体的には,グローバル・パターン・エンコーダとローカル・パターン・エンコーダからなる周期グラフエンコーダを開発し,その表現をグローバル・ローカル・セマンティクスに変換する。次に、局所構造デコーダ、近傍デコーダ、大域構造デコーダからなる新しい周期グラフデコーダと、周期性を保証するそれらの出力のアセンブラを提案する。さらに,同じ局所構造を持つグラフに対して局所意味表現の不変性を確実にする新しいモデル学習目標を設計する。提案手法の有効性を実証するための総合的な実験的評価を行った。 PGD-VAEのコードはhttps://github.com/shi-yu-wang/PGD-VAEで公開されている。

Periodic graphs are graphs consisting of repetitive local structures, such as crystal nets and polygon meshes. Their generative modeling has great potential in real-world applications such as material design and graphics synthesis. Classical models either rely on domain-specific predefined generation principles (e.g., in crystal net design), or follow geometry-based prescribed rules. Recently, deep generative models have shown great promise in automatically generating general graphs. However, their advancement into periodic graphs has not been well explored due to several key challenges in 1) maintaining graph periodicity; 2) disentangling local and global patterns; and 3) efficiency in learning repetitive patterns. To address them, this paper proposes Periodical-Graph Disentangled Variational Auto-encoder (PGD-VAE), a new deep generative model for periodic graphs that can automatically learn, disentangle, and generate local and global graph patterns. Specifically, we develop a new periodic graph encoder consisting of a global-pattern encoder and a local-pattern encoder that ensures the representation is disentangled into global and local semantics. We then propose a new periodic graph decoder consisting of a local structure decoder, a neighborhood decoder, and a global structure decoder, as well as an assembler of their outputs that guarantees periodicity. Moreover, we design a new model learning objective that helps ensure the invariance of local-semantic representations for the graphs with the same local structure. Comprehensive experimental evaluations have been conducted to demonstrate the effectiveness of the proposed method. The code of the proposed PGD-VAE is available at https://github.com/shi-yu-wang/PGD-VAE.

翻訳日:2022-02-01 04:51:27 公開日:2022-01-28

# (参考訳) より大きな距離を持つとパフォーマンスが悪化する:層利用とモデル一般化の観点から

With Greater Distance Comes Worse Performance: On the Perspective of Layer Utilization and Model Generalization ( http://arxiv.org/abs/2201.11939v1 )

ライセンス: CC BY 4.0

James Wang, Cheng-Lin Yang

(参考訳) ディープニューラルネットワークの一般化は、マシンラーニングにおける主要なオープン問題の1つだ。モデル複雑性の厳密な境界を導出することに焦点を当てた以前の理論研究では、ニューラルネットワークがトレーニングサンプル数とニューラルネットワークのサイズの両方に関して二重降下を示すことが示されている。本稿では、ニューラルネットワークの異なる層がモデルにどのように貢献するかを実証的に検討し、初期の層は、トレーニングデータとテストデータの両方で、パフォーマンスに関連する表現を一般的に学習していることを発見した。逆に、より深い層はトレーニングのリスクを最小にし、テストや誤ったラベルデータの一般化に失敗します。さらに、トレーニングされた重みと最終層の初期値との距離が一般化誤差と高い相関を持ち、モデルの過度な適合の指標として機能することを示す。さらに,最終層の重みを再初期化することにより,トレーニング後の正規化を支援する証拠を示す。本研究は,ニューラルネットワークの一般化能力を効率的に推定する手法であり,ニューラルネットワークの内部構造を考慮したより優れた一般化境界への導出を促すことができる。

Generalization of deep neural networks remains one of the main open problems in machine learning. Previous theoretical works focused on deriving tight bounds of model complexity, while empirical works revealed that neural networks exhibit double descent with respect to both training sample counts and the neural network size. In this paper, we empirically examine how different layers of neural networks contribute differently to the model; we find that early layers generally learn representations relevant to performance on both training data and testing data. In contrast, deeper layers only minimize training risks and fail to generalize well with testing or mislabeled data. We further illustrate that the distance between the trained weights of the final layers and their initial values is highly correlated with generalization error and can serve as an indicator of model overfitting. Moreover, we show evidence to support post-training regularization by re-initializing weights of final layers. Our findings provide an efficient method to estimate the generalization capability of neural networks, and the insight of those quantitative results may inspire the derivation of better generalization bounds that take the internal structure of neural networks into consideration.

翻訳日:2022-02-01 04:29:50 公開日:2022-01-28

# (参考訳) スタイライズされたニューラルアニメーションのためのWassersplines

Wassersplines for Stylized Neural Animation ( http://arxiv.org/abs/2201.11940v1 )

ライセンス: CC BY 4.0

Paul Zhang, Dmitriy Smirnov, Justin Solomon

(参考訳) コンピュータ生成アニメーションの多くは、メッシュをリグで操作することで生成される。このアプローチは動物のような関節のある物体をアニメーション化するのにうまく機能するが、「Raya and the Last Dragon」のDrunnのような構造の乏しい生物をアニメーション化するための柔軟性は限られている。連続正規化フローと最適輸送の最近の進歩に基づき,非構造的な密度をアニメーション化する新しい軌道推定法であるWassersplinesを紹介する。鍵となるアイデアは、キーフレーム間の動きを表す、ニューラルネットワークでパラメータ化された速度場を訓練することである。軌道は、キーフレームを速度場に沿って押し出すことで計算される。さらに、Wasserstein重心補間問題を解くことで、キーフレームへの厳密な一致を保証する。我々のツールは、様々なPDEベースの正則化項を通して軌道をスタイライズし、異なる視覚効果を生み出すことができる。様々なキーフレーム補間問題において、メッシュ化やリギングを行わずに時間的に一貫したアニメーションを生成できることを示す。

Much of computer-generated animation is created by manipulating meshes with rigs. While this approach works well for animating articulated objects like animals, it has limited flexibility for animating less structured creatures such as the Drunn in "Raya and the Last Dragon." We introduce Wassersplines, a novel trajectory inference method for animating unstructured densities based on recent advances in continuous normalizing flows and optimal transport. The key idea is to train a neurally-parameterized velocity field that represents the motion between keyframes. Trajectories are then computed by pushing keyframes through the velocity field. We solve an additional Wasserstein barycenter interpolation problem to guarantee strict adherence to keyframes. Our tool can stylize trajectories through a variety of PDE-based regularizers to create different visual effects. We demonstrate our tool on various keyframe interpolation problems to produce temporally-coherent animations without meshing or rigging.

翻訳日:2022-02-01 04:21:15 公開日:2022-01-28

# (参考訳) 複雑力学におけるペアワイズ相互作用の統一

Unifying Pairwise Interactions in Complex Dynamics ( http://arxiv.org/abs/2201.11941v1 )

ライセンス: CC BY 4.0

Oliver M. Cliff, Joseph T. Lizier, Naotsugu Tsuchiya, Ben D. Fulcher

(参考訳) 科学者は複雑なシステムにおけるプロセスのペア間の相互作用を測定するために何百もの技術を開発した。しかし、これらの計算手法は、相関係数から因果推論まで、大きく切り離された異なる定量的理論に依存している。本稿では,ペアインタラクションのための249の統計ライブラリを紹介し,その挙動を実世界およびモデル生成システムから1053個の多変量時系列上で評価する。本分析では,異なる数学的定式化間の新たな共通性に注目し,リッチで学際的な文学の統一像を提供する。そして,各科学の手法を多用することで,与えられた問題に最も適した問題を明らかにすることができ,高い正確性と解釈可能な理解が得られることを示す。我々のフレームワークは拡張可能なオープンソフトウェアで提供されており、数十年の方法論的進歩を統合することで包括的なデータ駆動分析を可能にする。

Scientists have developed hundreds of techniques to measure the interactions between pairs of processes in complex systems. But these computational methods -- from correlation coefficients to causal inference -- rely on distinct quantitative theories that remain largely disconnected. Here we introduce a library of 249 statistics for pairwise interactions and assess their behavior on 1053 multivariate time series from a wide range of real-world and model-generated systems. Our analysis highlights new commonalities between different mathematical formulations, providing a unified picture of a rich, interdisciplinary literature. We then show that leveraging many methods from across science can uncover those most suitable for addressing a given problem, yielding high accuracy and interpretable understanding. Our framework is provided in extendable open software, enabling comprehensive data-driven analysis by integrating decades of methodological advances.

翻訳日:2022-02-01 04:06:35 公開日:2022-01-28

# (参考訳) DICP:ドップラー反復閉点アルゴリズム

DICP: Doppler Iterative Closest Point Algorithm ( http://arxiv.org/abs/2201.11944v1 )

ライセンス: CC BY 4.0

Bruno Hexsel, Heethesh Vhavle and Yi Chen

(参考訳) 本稿では,向きの瞬時速度を計測できる距離センサのための点雲登録のための新しいアルゴリズムであるドップラーicpを提案する。既存のICPの変種は、通常、非識別的な特徴を持つシナリオや、廊下、トンネル、高速道路、橋などの反復幾何学構造を持つシナリオにおいて、センサーの運動を正確に見積もることができない。本稿では,各点のドップラー計測とセンサの現在の動き推定との整合性を利用した新しいドップラー速度客観的関数を提案する。我々は,特徴量の多い環境においても,ドップラー速度目標関数と,点雲アライメント問題を十分に制約する幾何学的対象関数を共同で最適化する。さらに、ICP溶液を一般的に分解する動的ターゲットから点を切り離すことにより、アライメントに使用する対応マッチングを改善した。本手法は,実センサから収集したデータとシミュレーションから評価する。その結果,ドップラー速度勾配によって導かれる高速収束の利点を付加することにより,登録精度において大幅な性能向上が得られた。

In this paper, we present a novel algorithm for point cloud registration for range sensors capable of measuring per-return instantaneous radial velocity: Doppler ICP. Existing variants of ICP that solely rely on geometry or other features generally fail to estimate the motion of the sensor correctly in scenarios that have non-distinctive features and/or repetitive geometric structures such as hallways, tunnels, highways, and bridges. We propose a new Doppler velocity objective function that exploits the compatibility of each point's Doppler measurement and the sensor's current motion estimate. We jointly optimize the Doppler velocity objective function and the geometric objective function which sufficiently constrains the point cloud alignment problem even in feature-denied environments. Furthermore, the correspondence matches used for the alignment are improved by pruning away the points from dynamic targets which generally degrade the ICP solution. We evaluate our method on data collected from real sensors and from simulation. Our results show a significant performance improvement in terms of the registration accuracy with the added benefit of faster convergence guided by the Doppler velocity gradients.
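The compatibility between a point's Doppler measurement and the sensor's motion estimate can be sketched as a per-point residual. The version below assumes a static scene point, a purely translating sensor, and the sign convention that an approaching point has negative radial velocity; rotation, sensor mounting offsets, and the paper's exact objective are not modelled. The pruning threshold is likewise a hypothetical value.

```python
import math


def doppler_residual(point, measured_radial_velocity, sensor_velocity):
    """Measured minus predicted radial velocity for a static point.

    `point` is the return's position in the sensor frame and
    `sensor_velocity` is the current translational velocity estimate.
    """
    norm = math.sqrt(sum(c * c for c in point))
    direction = [c / norm for c in point]  # unit vector from sensor to point
    # A static point appears to move opposite to the sensor's own motion.
    predicted = -sum(d * v for d, v in zip(direction, sensor_velocity))
    return measured_radial_velocity - predicted


def is_dynamic(residual, threshold=0.5):
    """Flag returns whose residual is too large to come from a static point."""
    return abs(residual) > threshold


velocity_estimate = (2.0, 0.0, 0.0)  # sensor moving forward at 2 m/s
static_residual = doppler_residual((10.0, 0.0, 0.0), -2.0, velocity_estimate)
moving_residual = doppler_residual((10.0, 0.0, 0.0), 3.0, velocity_estimate)
```

Summing squared residuals over all points would give a Doppler term that can be minimized alongside a geometric ICP term, and returns flagged by `is_dynamic` can be dropped before matching.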

翻訳日:2022-02-01 04:05:20 公開日:2022-01-28

# (参考訳) 複数のオプティマスを発見するための近位演算子学習

Learning Proximal Operators to Discover Multiple Optima ( http://arxiv.org/abs/2201.11945v1 )

ライセンス: CC0 1.0

Lingxiao Li, Noam Aigerman, Vladimir G. Kim, Jiajin Li, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon

(参考訳) 非凸最適化問題の複数の解を見つけることは至るところで困難である。一般的な既存の解は、複数のランダムな初期推測からの単一解最適化法を適用するか、アドホックなヒューリスティックを用いた解の近傍で探索する。本研究では,非凸問題群にまたがる近位演算子をエンドツーエンドに学習し,テスト時に見つからない問題に対する複数の解を復元する手法を提案する。本手法は,真理解の監督を必要とせず,目的へのアクセスのみを要求できる。最近の理論的結果を適用することで、弱い凸目標と穏やかな規則性条件の下では、近似作用素の訓練は過度なパラメータ化条件下でグローバルに収束することを示す。さらに,幅広いアプリケーションを含むマルチソリューション最適化のためのベンチマークを示し,その効果を示すために提案手法を評価した。

Finding multiple solutions of non-convex optimization problems is a ubiquitous yet challenging task. Typical existing solutions either apply single-solution optimization methods from multiple random initial guesses or search in the vicinity of found solutions using ad hoc heuristics. We present an end-to-end method to learn the proximal operator across a family of non-convex problems, which can then be used to recover multiple solutions for unseen problems at test time. Our method only requires access to the objectives without needing the supervision of ground truth solutions. Notably, the added proximal regularization term elevates the convexity of our formulation: by applying recent theoretical results, we show that for weakly-convex objectives and under mild regularity conditions, training of the proximal operator converges globally in the over-parameterized setting. We further present a benchmark for multi-solution optimization including a wide range of applications and evaluate our method to demonstrate its effectiveness.
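For reference, the standard definition of the proximal operator, and the reason the added quadratic term helps with convexity, can be written as follows. The symbols $\lambda$ (proximal step) and $\rho$ (weak-convexity constant) are notation chosen here, not taken from the abstract.

```latex
\operatorname{prox}_{\lambda f}(x)
  \;=\; \arg\min_{y}\; \Big\{ f(y) + \frac{1}{2\lambda}\,\lVert y - x \rVert_2^2 \Big\}.

% If f is \rho-weakly convex, i.e. f(y) + \tfrac{\rho}{2}\lVert y \rVert_2^2 is convex,
% then the objective inside the braces is convex whenever 1/\lambda \ge \rho
% and strongly convex whenever 1/\lambda > \rho.
```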

翻訳日:2022-02-01 03:50:47 公開日:2022-01-28

# (参考訳) 多視点学習のための高次相関分析

Higher Order Correlation Analysis for Multi-View Learning ( http://arxiv.org/abs/2201.11949v1 )

ライセンス: CC BY 4.0

Jiawang Nie and Li Wang and Zequn Zheng

(参考訳) マルチビュー学習はデータサイエンスでよく使われる。ペアワイズ相関最大化は、複数の視点のコンセンサスを探求するための古典的なアプローチである。対相関は2つのビューに固有のため、より多くのビューへの拡張は多様化でき、ビュー間の固有の相互接続は一般的に失われる。この問題に対処するため,高次相関を最大化することを提案する。これは多視点データの高次相関テンソルを用いた低階近似問題として定式化することができる。低階近似問題の解法として生成多項式法を用いる。実マルチビューデータにおける数値計算結果から,本手法が従来手法を一貫して上回っていることが分かる。

Multi-view learning is frequently used in data science. Pairwise correlation maximization is a classical approach for exploring the consensus of multiple views. Since pairwise correlation is inherently defined for two views, its extensions to more views can take many different forms, and the intrinsic interconnections among views are generally lost. To address this issue, we propose to maximize higher order correlations. This can be formulated as a low rank approximation problem with the higher order correlation tensor of multi-view data. We use the generating polynomial method to solve the low rank approximation problem. Numerical results on real multi-view data demonstrate that this method consistently outperforms existing methods.

翻訳日:2022-02-01 03:19:49 公開日:2022-01-28

# (参考訳) 暗黙的神経表現を用いた時系列異常検出

Time-Series Anomaly Detection with Implicit Neural Representation ( http://arxiv.org/abs/2201.11950v1 )

ライセンス: CC BY 4.0

Kyeong-Joong Jeong, Yong-Min Shin

(参考訳) 多変量時系列データの異常検出は多くの実世界のアプリケーションで必須である。近年,様々な深層学習に基づく手法が時系列異常検出において大幅に改善されている。しかし、既存の方法には、複雑なモデル設計による長いトレーニング時間や、与えられたデータセットの最適なハイパーパラメータ(例えば、スライディングウィンドウの長さ)を見つけるための高価なチューニング手順など、いくつかの制限がある。本稿では,インプシットニューラル表現に基づく異常検出(INRAD)と呼ばれる新しい手法を提案する。具体的には、入力に時間を要し、その時点で対応する値を出力する単純な多層パーセプトロンを訓練する。次に,異常検出のための異常スコアとして表現誤差を利用する。 5つの実世界のデータセットにおける実験により,提案手法が性能,トレーニング速度,ロバスト性において,他の最先端手法よりも優れていることを証明した。

Detecting anomalies in multivariate time-series data is essential in many real-world applications. Recently, various deep learning-based approaches have shown considerable improvements in time-series anomaly detection. However, existing methods still have several limitations, such as long training time due to their complex model designs or costly tuning procedures to find optimal hyperparameters (e.g., sliding window length) for a given dataset. In our paper, we propose a novel method called Implicit Neural Representation-based Anomaly Detection (INRAD). Specifically, we train a simple multi-layer perceptron that takes time as input and outputs corresponding values at that time. Then we utilize the representation error as an anomaly score for detecting anomalies. Experiments on five real-world datasets demonstrate that our proposed method outperforms other state-of-the-art methods in performance, training speed, and robustness.
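The scoring step (representation error used as the anomaly score) can be sketched without a neural network. In the example below a closed-form straight-line fit stands in for the multi-layer perceptron, purely to keep the code short and deterministic; the data and the injected anomaly are made up for illustration.

```python
def fit_line(times, values):
    """Least-squares line; a stand-in for the time -> value MLP."""
    n = len(times)
    mean_t, mean_v = sum(times) / n, sum(values) / n
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope /= sum((t - mean_t) ** 2 for t in times)
    return lambda t: mean_v + slope * (t - mean_t)


def anomaly_scores(times, values, representation):
    """Absolute representation error at each time step."""
    return [abs(v - representation(t)) for t, v in zip(times, values)]


times = list(range(10))
values = [2.0 * t + 1.0 for t in times]
values[6] += 8.0  # inject a single anomalous reading

representation = fit_line(times, values)
scores = anomaly_scores(times, values, representation)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

Any model that maps a timestamp to a predicted value can be passed as `representation`, so a trained MLP would slot in without changing `anomaly_scores`.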

翻訳日:2022-02-01 03:03:38 公開日:2022-01-28

# (参考訳) 不均一エルドス・レーニランダムグラフのフレシェ平均(または中間値)に対するシャープ閾値

Sharp Threshold for the Frechet Mean (or Median) of Inhomogeneous Erdos-Renyi Random Graphs ( http://arxiv.org/abs/2201.11954v1 )

ライセンス: CC BY 4.0

Francois G. Meyer

(参考訳) 不均質なエルドス・レーニーのランダムグラフのアンサンブルの人口、サンプル、フレシェ平均(または中央値)グラフとは何か? グラフ間の距離を計算するためにハミング距離を使用すると、アンサンブルの期待隣接行列のしきい値化により、不均一なランダムグラフのアンサンブルのフレシェ平均(または中央値)グラフが得られる。また, 人口予測隣接行列をサンプル平均隣接行列に置き換えた場合, サンプル平均(あるいは中央値)についても, 結果が成り立つことを示した。したがって、不均質なエルドス・レーニーのランダムグラフのフレシェ平均(または中央値)グラフは、空グラフか完全グラフのいずれかである鋭い閾値を示す。この新しい理論的な結果には、いくつかの重要な実用的結果があり、例えば、疎不均質なランダムグラフのアンサンブルのフレシェ平均は常に空グラフである。

We address the following foundational question: what is the population, and sample, Frechet mean (or median) graph of an ensemble of inhomogeneous Erdos-Renyi random graphs? We prove that if we use the Hamming distance to compute distances between graphs, then the Frechet mean (or median) graph of an ensemble of inhomogeneous random graphs is obtained by thresholding the expected adjacency matrix of the ensemble. We show that the result also holds for the sample mean (or median) when the population expected adjacency matrix is replaced with the sample mean adjacency matrix. Consequently, the Frechet mean (or median) graph of inhomogeneous Erdos-Renyi random graphs exhibits a sharp threshold: it is either the empty graph, or the complete graph. This novel theoretical result has some significant practical consequences; for instance, the Frechet mean of an ensemble of sparse inhomogeneous random graphs is always the empty graph.
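A minimal sketch of the sample version of this result: average the adjacency matrices and threshold the result. The cut-off of 1/2 is an assumption made here (the abstract does not state the value); it is the value obtained by minimizing the expected Hamming distance edge by edge, since including edge (i, j) costs 1 - p_ij in expectation and excluding it costs p_ij.

```python
def sample_frechet_mean_graph(adjacency_samples, threshold=0.5):
    """Threshold the sample mean adjacency matrix entry by entry."""
    n_graphs = len(adjacency_samples)
    n = len(adjacency_samples[0])
    mean = [
        [sum(a[i][j] for a in adjacency_samples) / n_graphs for j in range(n)]
        for i in range(n)
    ]
    return [[1 if mean[i][j] > threshold else 0 for j in range(n)] for i in range(n)]


# Three sampled graphs on three nodes: edge (0, 1) is always present,
# edges (0, 2) and (1, 2) appear in only one sample each.
samples = [
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
]
mean_graph = sample_frechet_mean_graph(samples)
```

With every edge probability below the cut-off, the same function returns the empty graph, which matches the abstract's remark about sparse ensembles.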

翻訳日:2022-02-01 02:49:00 公開日:2022-01-28

# (参考訳) 強化学習による動的時間緩和

Dynamic Temporal Reconciliation by Reinforcement learning ( http://arxiv.org/abs/2201.11964v1 )

ライセンス: CC BY 4.0

Himanshi Charotia, Abhishek Garg, Gaurav Dhama, Naman Maheshwari

(参考訳) 長期および短期の時系列予測に基づくプランニングは、多くの業界で一般的なプラクティスである。この文脈では、時間的集約と和解技術は予測を改善し、モデルの不確実性を低減し、異なる時間的地平をまたいだ一貫性のある予測を提供するのに有用である。しかしながら、これらすべての技術にまたがる前提は、時間階層のすべてのレベルにわたるデータの完全な可用性であるが、これは数学的に便利であるが、ほとんどの場合、低周波データが部分的に完了し、予測中に利用できない。一方、新型コロナウイルスのパンデミックのようなシナリオでは、高周波データが大幅に変化し、この変化は、長期的な状況と大きく異なる予測を改善するために利用することができる。そこで本稿では,マルコフ決定プロセス(MDP)として,低周波予測を高頻度で予測する問題を定式化することにより,プロセスのダイナミクスに関する完全な情報が得られないことを確かめる。これにより、低周波周期が部分的にしか完了していない場合でも、最新のデータに基づいて最適な長期推定を行うことができる。 MDPは、時間差強化学習(TDRL)アプローチを用いて、カスタマイズ可能な動作を用いて、歴史的低周波データにのみ依存するよりも、長期予測を劇的に改善している。この結果は、低周波予測が時間調整文献(低周波予測が信号比よりも低雑音であるという仮定に基づく)で述べたような高周波予測を改善することができる一方で、低周波予測も低周波予測に活用できるという事実も強調している。

Planning based on long and short term time series forecasts is a common practice across many industries. In this context, temporal aggregation and reconciliation techniques have been useful in improving forecasts, reducing model uncertainty, and providing a coherent forecast across different time horizons. However, an underlying assumption spanning all these techniques is the complete availability of data across all levels of the temporal hierarchy. While this assumption offers mathematical convenience, most of the time the low frequency data is only partially complete and is not available at forecasting time. On the other hand, high frequency data can change significantly in a scenario like the COVID pandemic, and this change can be used to improve forecasts that would otherwise diverge significantly from long term actuals. We propose a dynamic reconciliation method in which we formulate the problem of informing low frequency forecasts based on high frequency actuals as a Markov Decision Process (MDP), allowing for the fact that we do not have complete information about the dynamics of the process. This allows us to have the best long term estimates based on the most recent data available even if the low frequency cycles have only been partially completed. The MDP is solved using a Time Differenced Reinforcement learning (TDRL) approach with customizable actions, and it improves the long term forecasts dramatically compared to relying solely on historical low frequency data. The result also underscores that, while low frequency forecasts can improve the high frequency forecasts as mentioned in the temporal reconciliation literature (based on the assumption that low frequency forecasts have a lower noise to signal ratio), the high frequency forecasts can also be used to inform the low frequency forecasts.

翻訳日:2022-02-01 02:27:01 公開日:2022-01-28

# (参考訳) 非定常目的と制約を持つCMDPの高能率2次元強化学習

Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints ( http://arxiv.org/abs/2201.11965v1 )

ライセンス: CC BY 4.0

Yuhao Ding and Javad Lavaei

(参考訳) 時間変動環境におけるRLの安全性の確保に中心的な役割を果たす非定常的目的と制約を伴うマルコフ決定過程(CMDP)における原始的双対強化学習(RL)について考察する。この問題では、報酬/有効性関数と状態遷移関数の両方が、その累積変動が既知の変動予算を超えない限り、時間とともに任意に変化することが許される。時間変動環境における安全なrlアルゴリズムの設計は、制約違反の低減、安全な探索、非定常性への適応などを統合する必要があるため、特に困難である。そこで本研究では,周期的再スタートに基づく政策改善,二重正規化による2次更新,周期的再スタートに基づく楽観的政策評価という3つのメカニズムを特徴とする,周期的再スタート最適化(PROPD-PPO)アルゴリズムを提案する。本稿では,線形カーネルCMDP関数近似設定と表計算CMDP設定の両方において,提案アルゴリズムに対する動的後悔境界と制約違反境界を確立する。本稿では,非定常cmdpに対して,安全かつ効率的なアルゴリズムを提案する。

We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision processes (CMDPs) with non-stationary objectives and constraints, which play a central role in ensuring the safety of RL in time-varying environments. In this problem, the reward/utility functions and the state transition functions are both allowed to vary arbitrarily over time as long as their cumulative variations do not exceed certain known variation budgets. Designing safe RL algorithms in time-varying environments is particularly challenging because of the need to integrate the constraint violation reduction, safe exploration, and adaptation to the non-stationarity. To this end, we propose a Periodically Restarted Optimistic Primal-Dual Proximal Policy Optimization (PROPD-PPO) algorithm that features three mechanisms: periodic-restart-based policy improvement, dual update with dual regularization, and periodic-restart-based optimistic policy evaluation. We establish a dynamic regret bound and a constraint violation bound for the proposed algorithm in both the linear kernel CMDP function approximation setting and the tabular CMDP setting. This paper provides the first provably efficient algorithm for non-stationary CMDPs with safe exploration.

翻訳日:2022-02-01 02:17:00 公開日:2022-01-28

# (参考訳) 部分微分方程式の学習解演算子に対する擬微分積分演算子

Pseudo-Differential Integral Operator for Learning Solution Operators of Partial Differential Equations ( http://arxiv.org/abs/2201.11967v1 )

ライセンス: CC BY 4.0

Jin Young Shin, Jae Yong Lee, Hyung Ju Hwang

(参考訳) 2つの関数空間間の学習マッピングは、かなりの研究の注目を集めている。しかし、偏微分方程式(PDE)の解演算子を学ぶことは科学計算の課題である。そこで本研究では,微分作用素の一般化であり,ある記号を特徴とする擬似微分作用素に着想を得た新しい擬微分積分作用素(pdio)を提案する。ニューラルネットワークを用いてシンボルをパラメータ化し,ニューラルネットワークに基づくシンボルがスムーズなシンボルクラスに含まれることを示す。その後、PDIO が有界線型作用素であることを証明し、従ってソボレフ空間において連続である。 PDIOとニューラル演算子を組み合わせて擬微分ニューラル演算子(PDNO)を開発し,PDEの非線形解演算子を学習する。提案モデルの有効性を,バーガーズ方程式,ダーシー流,ナビエ・ストークス方程式を用いて実験的に検証した。その結果,提案するpdnoは既存のニューラルオペレータのアプローチに匹敵することがわかった。

Learning mapping between two function spaces has attracted considerable research attention. However, learning the solution operator of partial differential equations (PDEs) remains a challenge in scientific computing. Therefore, in this study, we propose a novel pseudo-differential integral operator (PDIO) inspired by a pseudo-differential operator, which is a generalization of a differential operator and characterized by a certain symbol. We parameterize the symbol by using a neural network and show that the neural-network-based symbol is contained in a smooth symbol class. Subsequently, we prove that the PDIO is a bounded linear operator, and thus is continuous in the Sobolev space. We combine the PDIO with the neural operator to develop a pseudo-differential neural operator (PDNO) to learn the nonlinear solution operator of PDEs. We experimentally validate the effectiveness of the proposed model by using Burgers' equation, Darcy flow, and the Navier-Stokes equation. The results reveal that the proposed PDNO outperforms the existing neural operator approaches in most experiments.
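For readers unfamiliar with the term, the textbook form of a pseudo-differential operator with symbol $a(x,\xi)$ is shown below, using the Fourier convention $\hat f(\xi) = \int f(x)\, e^{-2\pi i x\cdot\xi}\, dx$. The notation $a_\theta$ for a neural-network symbol is introduced here for illustration and is not taken from the abstract.

```latex
(T_a f)(x) \;=\; \int_{\mathbb{R}^n} a(x,\xi)\, \hat f(\xi)\, e^{2\pi i\, x\cdot\xi}\, d\xi .

% A differential operator \sum_{\alpha} c_\alpha(x)\,\partial^\alpha is the special case
% in which the symbol is a polynomial in \xi:
a(x,\xi) \;=\; \sum_{\alpha} c_\alpha(x)\,(2\pi i\,\xi)^{\alpha}.

% Replacing a(x,\xi) by a learned function a_\theta(x,\xi) gives an operator of the
% kind the abstract calls a PDIO.
```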

翻訳日:2022-02-01 02:15:30 公開日:2022-01-28

# (参考訳) 訓練不変性と低ランク現象--線形ネットワークを超えて

Training invariances and the low-rank phenomenon: beyond linear networks ( http://arxiv.org/abs/2201.11968v1 )

ライセンス: CC BY 4.0

Thien Le, Stefanie Jegelka

(参考訳) ニューラルネットワークのトレーニングによって引き起こされる暗黙のバイアスは、厳密な研究の対象となっている。勾配流の限界と適切なステップサイズでの勾配降下では、線形分離可能なデータ上で対数的あるいは指数的損失を持つ深い線形ネットワークを訓練すると、重みはランク1ドルの行列に収束することが示されている。本稿では、この理論結果を、完全に接続された層とスキップ接続を含むより広範な非線形ReLU活性化フィードフォワードネットワークに拡張する。私たちの知る限りでは、これらのアーキテクチャで低ランクな現象が厳格に証明されたのはこれが初めてであり、文学における実証的な結果を反映している。この証明は、特定の局所的なトレーニング不変性に依存しており、これはアライメントと呼ばれることがある。我々の証明は、あるパラメータの方向収束の下で重みが一定である多重線型関数とReLUネットワークへのネットワークの特定の分解に依存している。

The implicit bias induced by the training of neural networks has become a topic of rigorous study. In the limit of gradient flow and gradient descent with appropriate step size, it has been shown that when one trains a deep linear network with logistic or exponential loss on linearly separable data, the weights converge to rank-$1$ matrices. In this paper, we extend this theoretical result to the much wider class of nonlinear ReLU-activated feedforward networks containing fully-connected layers and skip connections. To the best of our knowledge, this is the first time a low-rank phenomenon is proven rigorously for these architectures, and it reflects empirical results in the literature. The proof relies on specific local training invariances, sometimes referred to as alignment, which we show to hold for a wide set of ReLU architectures. Our proof relies on a specific decomposition of the network into a multilinear function and another ReLU network whose weights are constant under a certain parameter directional convergence.

翻訳日:2022-02-01 01:57:12 公開日:2022-01-28

# (参考訳) 確率勾配ランゲヴィンダイナミクスのための微分プライバシー保証

Differential Privacy Guarantees for Stochastic Gradient Langevin Dynamics ( http://arxiv.org/abs/2201.11980v1 )

ライセンス: CC BY 4.0

Théo Ryffel, Francis Bach, David Pointcheval

(参考訳) ランジュバン拡散を伴うr\'enyiダイバージェンスダイナミクスのモデル化により,ノイズの確率的勾配降下のプライバシーリークを解析した。非確率的アルゴリズムに関する最近の研究に触発されて、確率的設定における同様の望ましい性質を導出する。特に,従来のdp-sgd分析よりも大幅に改善した,滑らかで強い凸目標に対して,プライバシ損失が指数関数的に高速に収束することを示す。また,様々なステップサイズの任意のシーケンスに解析を拡張し,新たなユーティリティ境界を導出する。最後に,従来のDP-SGDライブラリと比較して,本手法の実用性を示す実装を提案する。

We analyse the privacy leakage of noisy stochastic gradient descent by modeling Rényi divergence dynamics with Langevin diffusions. Inspired by recent work on non-stochastic algorithms, we derive similar desirable properties in the stochastic setting. In particular, we prove that the privacy loss converges exponentially fast for smooth and strongly convex objectives under constant step size, which is a significant improvement over previous DP-SGD analyses. We also extend our analysis to arbitrary sequences of varying step sizes and derive new utility bounds. Last, we propose an implementation and our experiments show the practical utility of our approach compared to classical DP-SGD libraries.
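The object being analysed, one step of noisy stochastic gradient descent, can be sketched as below. Per-example clipping is an assumption borrowed from common DP-SGD practice to bound each example's influence; the abstract does not say how sensitivity is controlled, and the parameter names are illustrative.

```python
import math
import random


def noisy_sgd_step(weights, per_example_grads, lr, clip_norm, noise_std, rng):
    """One noisy SGD step: clip each gradient, sum, add Gaussian noise, average."""
    clipped_sum = [0.0] * len(weights)
    for grad in per_example_grads:
        norm = math.sqrt(sum(c * c for c in grad))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for k, c in enumerate(grad):
            clipped_sum[k] += scale * c
    batch_size = len(per_example_grads)
    # Noise scale is proportional to the clipping norm.
    noisy_mean = [
        (s + rng.gauss(0.0, noise_std * clip_norm)) / batch_size for s in clipped_sum
    ]
    return [w - lr * g for w, g in zip(weights, noisy_mean)]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.6, 0.8]]  # first gradient has norm 5 and gets clipped
noiseless = noisy_sgd_step([0.0, 0.0], grads, lr=0.5, clip_norm=1.0, noise_std=0.0, rng=rng)
private = noisy_sgd_step([0.0, 0.0], grads, lr=0.5, clip_norm=1.0, noise_std=1.0, rng=rng)
```

Setting `noise_std=0.0` recovers ordinary clipped SGD, which makes it easy to see what the added noise changes.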

翻訳日:2022-02-01 01:19:00 公開日:2022-01-28

# (参考訳) 再生医療用超音波画像を用いた多孔性バイオエラストマーのコンピュータ支援認識と評価

Computer-aided Recognition and Assessment of a Porous Bioelastomer on Ultrasound Images for Regenerative Medicine Applications ( http://arxiv.org/abs/2201.11987v1 )

ライセンス: CC BY 4.0

Dun Wang, Kaixuan Guo, Yanying Zhu, Jia Sun, Aliona Dreglea, Zhengwei You, Jiao Yu

(参考訳) 生分解性弾性足場は軟組織修復や組織工学の分野でますます注目を集めている。多孔質のバイオエラストマーからなるこれらの足場は、組織の成長とそれ自身の分解をサポートする。超音波画像に基づくコンピュータ支援分析手法を開発し, 足場の劣化性能を把握し, 破壊試験を行う必要をなくすだけでなく, 足場の劣化や組織成長を経時的に監視するためにも必要である。多孔質バイオエラストマーの連続的かつ正確な輪郭を抽出するために、単一の伝統的な画像処理アルゴリズムを用いるのは難しい。本稿では,生体エラストマーの輪郭検出のためのジョイントアルゴリズムと,生体エラストマーの劣化挙動を監視するテクスチャ特徴抽出法を提案する。平均シフトクラスタリング法は、生体エラストマーおよび生体組織のクラスタリング特徴情報を得るために用いられる。そして、大津画像2値化方法は、最適な閾値を自動的に選択してグレースケール超音波画像を2値画像に変換する。カニーエッジ検出器は完全なバイオエラストマーの輪郭を抽出するために用いられる。テクスチャの1次および2次統計特徴を抽出する。提案手法は, 超音波画像中の生体エラストマーの輪郭を理想的に抽出するだけでなく, テクスチャ特性と輪郭面積の変化に基づき, インプラント部位における生体エラストマーの劣化挙動に対する貴重なフィードバックを与える。本研究の予備的な結果から, 提案したコンピュータ支援画像処理技術は, 生体内超音波画像を用いた非侵襲的組織足場解析に有用であり, 組織足場劣化と細胞成長の進展を評価し, 足場設計の改善に役立つ可能性が示唆された。

Biodegradable elastic scaffolds have attracted increasing attention in the field of soft tissue repair and tissue engineering. These scaffolds, made of porous bioelastomers, support tissue ingrowth along with their own degradation. It is necessary to develop a computer-aided analysis method based on ultrasound images to identify the degradation performance of the scaffold, not only to obviate the need for destructive testing, but also to monitor the scaffold's degradation and tissue ingrowth over time. It is difficult to extract a continuous and accurate contour of a porous bioelastomer using a single traditional image processing algorithm. This paper proposes a joint algorithm for the bioelastomer's contour detection and a texture feature extraction method for monitoring the degradation behavior of the bioelastomer. The mean-shift clustering method is used to obtain the bioelastomer's and native tissue's clustering feature information. Then the Otsu image binarization method automatically selects the optimal threshold value to convert the grayscale ultrasound image into a binary image. The Canny edge detector is used to extract the complete bioelastomer's contour. The first-order and second-order statistical features of texture are extracted. The proposed joint algorithm not only achieves the ideal extraction of the bioelastomer's contours in ultrasound images, but also gives valuable feedback on the degradation behavior of the bioelastomer at the implant site based on the changes in texture characteristics and contour area. The preliminary results of this study suggest that the proposed computer-aided image processing techniques have value and potential for the non-invasive analysis of tissue scaffolds in vivo based on ultrasound images, and may help tissue engineers evaluate the tissue scaffold's degradation and cellular ingrowth progress and improve scaffold designs.
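Of the three stages named above, the Otsu binarization step is the easiest to show in a few lines. The sketch below picks the threshold that maximizes between-class variance over a flat list of 8-bit grey levels; the pixel values are invented, and the mean-shift and Canny stages are not implemented here.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level that maximizes between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already background
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_threshold, best_variance = t, variance
    return best_threshold


def binarize(pixels, threshold):
    """Map pixels above the threshold to 1 and the rest to 0."""
    return [1 if p > threshold else 0 for p in pixels]


# Toy "image": dark tissue around 10, bright scaffold around 200.
pixels = [10, 12, 11, 13, 200, 205, 198, 202]
threshold = otsu_threshold(pixels)
binary = binarize(pixels, threshold)
```

In a full pipeline the binary mask would then be passed to an edge detector to trace the contour.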

翻訳日:2022-02-01 00:40:04 公開日:2022-01-28

# (参考訳) 2つの時間スケール更新規則の定学習率を用いた生成逆数ネットワークの訓練

Using Constant Learning Rate of Two Time-Scale Update Rule for Training Generative Adversarial Networks ( http://arxiv.org/abs/2201.11989v1 )

ライセンス: CC BY 4.0

Naoki Sato and Hideaki Iiduka

(参考訳) 従来,一定の学習率を用いた2つの時間スケール更新ルール(TTUR)が,GAN(Generative Adversarial Network)のトレーニングに有用であった。一方、TTURの理論的解析により、2人のプレイヤー(判別器とジェネレータ)とのナッシュ平衡問題の定常局所ナッシュ平衡が崩壊する学習率を用いて与えられる。本稿では,一定の学習率を用いてTTURの理論解析を行い,理論と実践のギャップを埋める。特に,tturでは定常学習率を用いて,バッチサイズが増加するにつれて定常局所ナッシュ平衡を求めるために必要なステップ数が減少することを示す。また,理論解析を支援する数値計算結果も提供する。

Previous numerical results have shown that a two time-scale update rule (TTUR) using constant learning rates is practically useful for training generative adversarial networks (GANs). Meanwhile, a theoretical analysis of TTUR to find a stationary local Nash equilibrium of a Nash equilibrium problem with two players, a discriminator and a generator, has been given using decaying learning rates. In this paper, we give a theoretical analysis of TTUR using constant learning rates to bridge the gap between theory and practice. In particular, we show that, for TTUR using constant learning rates, the number of steps needed to find a stationary local Nash equilibrium decreases as the batch size increases. We also provide numerical results to support our theoretical analysis.
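The update rule itself is simple to sketch: the two players take simultaneous gradient steps with two different constant learning rates. The toy game below, f(x, y) = 0.5x² + xy - 0.5y² with the "generator" minimizing over x and the "discriminator" maximizing over y, is an assumption chosen because its only equilibrium is (0, 0); it uses exact gradients, so it says nothing about the batch-size result.

```python
def ttur(steps, lr_generator, lr_discriminator, x0=1.0, y0=1.0):
    """Simultaneous gradient descent-ascent with two constant learning rates."""
    x, y = x0, y0
    for _ in range(steps):
        grad_x = x + y  # df/dx for f(x, y) = 0.5*x**2 + x*y - 0.5*y**2
        grad_y = x - y  # df/dy for the same f
        # Generator descends on x, discriminator ascends on y, simultaneously.
        x, y = x - lr_generator * grad_x, y + lr_discriminator * grad_y
    return x, y


x_final, y_final = ttur(steps=300, lr_generator=0.05, lr_discriminator=0.2)
```

Here the discriminator's rate is four times the generator's, mirroring the common practice of giving the two networks different time scales.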

翻訳日:2022-02-01 00:29:29 公開日:2022-01-28

# (参考訳) 画像とビデオの超解像のためのディープネットワーク

Deep Networks for Image and Video Super-Resolution ( http://arxiv.org/abs/2201.11996v1 )

ライセンス: CC0 1.0

Kuldeep Purohit, Srimanta Mandal, A. N. Rajagopalan

(参考訳) 畳み込みニューラルネットワークの中間層における勾配伝播の効率は,超解像処理において重要である。そこで本研究では,MDCB(Mixed-dense connection block)と呼ぶ効率的な畳み込みユニットを用いて構築した,単一画像超解像(SISR)の深層構造を提案する。 MDCBの設計は、その限界を克服しつつ、残差と密接な接続戦略の強さを組み合わせている。複数因子に対する超解像を実現するために,高次因子に対する低次因子に対して学習したフィルタを再活用するスケール・リカレント・フレームワークを提案する。これにより性能が向上し、より高い因子に対するパラメトリック効率が向上する。ネットワークの2つのバージョンをトレーニングし、異なる損失構成を用いて補完的な画像品質を向上させる。ネットワークは,複数のフレームから情報を集約し,時空間的一貫性を維持する。提案したネットワークは、画像およびビデオ超解像ベンチマークにおける最先端技術に対する質的かつ定量的な改善をもたらす。

Efficient gradient propagation in the intermediate layers of convolutional neural networks is of key importance for the super-resolution task. To this end, we propose a deep architecture for single image super-resolution (SISR), which is built using efficient convolutional units we refer to as mixed-dense connection blocks (MDCB). The design of MDCB combines the strengths of both residual and dense connection strategies, while overcoming their limitations. To enable super-resolution for multiple factors, we propose a scale-recurrent framework which recursively reuses the filters learnt for lower scale factors for higher factors. This leads to improved performance and promotes parametric efficiency for higher factors. We train two versions of our network to enhance complementary image qualities using different loss configurations. We further employ our network for the video super-resolution task, where our network learns to aggregate information from multiple frames and maintain spatio-temporal consistency. The proposed networks lead to qualitative and quantitative improvements over state-of-the-art techniques on image and video super-resolution benchmarks.

翻訳日:2022-01-31 23:59:41 公開日:2022-01-28

# (参考訳) スケールリカレントDense Networkを用いた画像超解像

Image Superresolution using Scale-Recurrent Dense Network ( http://arxiv.org/abs/2201.11998v1 )

ライセンス: CC0 1.0

Kuldeep Purohit, Srimanta Mandal, A. N. Rajagopalan

(参考訳) 畳み込みニューラルネットワーク(CNN)の設計の最近の進歩は、画像超解像(SR)の性能に大きな改善をもたらした。性能の向上は、これらのネットワークの中間層に残留または密接な接続が存在することに起因する。このような接続の効率的な組み合わせは、修復品質を維持しながらパラメータの数を劇的に削減することができる。本稿では,残差ブロック (residual dense blocks (rdbs)) 内の一連の高密度接続を含むユニット上に,画像から豊富な局所的特徴を抽出するためのスケールリカレントsrアーキテクチャを提案する。我々のスケールリカレント設計は、現在の最先端のアプローチに比べてパラメトリックに効率的でありながら、より高いスケール要因の競合性能を提供する。ネットワークの性能をさらに向上するため,中間層では複数の残差接続(Multi-Residual Dense Blocks)を採用し,既存の層では勾配伝搬を改善する。最近の研究で、従来の損失関数は、高いPSNRを持つが知覚的に劣る結果を生み出すためにネットワークを導くことができることがわかった。我々はGAN(Generative Adversarial Network)ベースのフレームワークとVGG(Deep Feature)の損失を利用してネットワークをトレーニングすることでこの問題を軽減する。実験により,VGG損失と対向損失の重み付けの組み合わせの違いが,ネットワーク出力を知覚歪曲線に沿って横切ることを実証した。提案したネットワークは,より少ないパラメータで知覚的かつ客観的に(PSNRベース)既存の手法に対して良好に動作する。

Recent advances in the design of convolutional neural networks (CNNs) have yielded significant improvements in the performance of image super-resolution (SR). The boost in performance can be attributed to the presence of residual or dense connections within the intermediate layers of these networks. The efficient combination of such connections can reduce the number of parameters drastically while maintaining the restoration quality. In this paper, we propose a scale-recurrent SR architecture built upon units containing a series of dense connections within a residual block (Residual Dense Blocks (RDBs)) that allow extraction of abundant local features from the image. Our scale-recurrent design delivers competitive performance for higher scale factors while being parametrically more efficient than current state-of-the-art approaches. To further improve the performance of our network, we employ multiple residual connections in intermediate layers (referred to as Multi-Residual Dense Blocks), which improves gradient propagation in existing layers. Recent works have discovered that conventional loss functions can guide a network to produce results which have high PSNRs but are perceptually inferior. We mitigate this issue by utilizing a Generative Adversarial Network (GAN) based framework and deep feature (VGG) losses to train our network. We experimentally demonstrate that different weighted combinations of the VGG loss and the adversarial loss enable our network outputs to traverse along the perception-distortion curve. The proposed networks perform favorably against existing methods, both perceptually and objectively (PSNR-based), with fewer parameters.

翻訳日:2022-01-31 23:36:29 公開日:2022-01-28

# (参考訳) Puppeteer: メモリ階層を越えたハードウェアプリフェッチのためのランダムフォレストベースのマネージャ

Puppeteer: A Random Forest-based Manager for Hardware Prefetchers across the Memory Hierarchy ( http://arxiv.org/abs/2201.12027v1 )

ライセンス: CC BY 4.0

Furkan Eris, Marcia S. Louis, Kubra Eris, Jose L. Abellan, Ajay Joshi

(参考訳) 長年にわたり、プロセッサのスループットは着実に向上した。しかし、メモリスループットは同じ速度では向上せず、結果としてメモリウォールの問題が発生し、効率と理論上のピークプロセッサ性能のギャップが増大した。これに対処するため、データ/インストラクションプリフェッチャー設計の領域では、多くの作業が行われている。プリフェッチは、将来のデータ/インストラクションアドレスアクセスを予測し、データ/インストラクションアクセスレイテンシの低下を目標として、メモリ階層内のデータ/インストラクションを積極的にフェッチする。この目的のために、1つ以上のプリフェッチがメモリ階層の各レベルでデプロイされるが、通常、各プリフェッチはシステム内の他のプリフェッチを包括的に考慮することなく、独立して設計される。その結果、個々のプリフェッチが常に補完するとは限らないため、平均的なパフォーマンス向上や、あるいは多くの負のアウトリーチにつながる。本稿では,ハードウェアプリフェッチマネージャであるpuppeteerを提案する。このpuppeteerは,ランダムフォレストレグレプタのスイートを使用して,プリフェッチが相互補完し,データ/インストラクションアクセスレイテンシを低減するように,メモリ階層の各レベルにおいてプリフェッチがどのレベルにあるべきかを実行時に判断する。 Puppeteer では 1 Core (1C) で 46.0%、 4 Core (4C) で 25.8%、および 8 Core (8C) プロセッサで 11.9% の改善を行い、SPEC2017, SPEC2006, Cloud Suites から生成される平均10KB のオーバヘッドを持つ。さらに,負の外れ値の数が89%以上減少し,最悪の場合の負の外れ値のパフォーマンスが25%から5%に低下した。

Over the years, processor throughput has steadily increased. However, memory throughput has not increased at the same rate, which has led to the memory wall problem, in turn increasing the gap between effective and theoretical peak processor performance. To cope with this, there has been an abundance of work in the area of data/instruction prefetcher designs. Broadly, prefetchers predict future data/instruction address accesses and proactively fetch data/instructions in the memory hierarchy with the goal of lowering data/instruction access latency. To this end, one or more prefetchers are deployed at each level of the memory hierarchy, but typically each prefetcher is designed in isolation without comprehensively accounting for the other prefetchers in the system. As a result, individual prefetchers do not always complement each other, which leads to lower average performance gains and/or many negative outliers. In this work, we propose Puppeteer, a hardware prefetcher manager that uses a suite of random forest regressors to determine at runtime which prefetcher should be ON at each level in the memory hierarchy, such that the prefetchers complement each other and the data/instruction access latency is reduced. Compared to a design with no prefetchers, using Puppeteer we improve IPC by 46.0% in 1 Core (1C), 25.8% in 4 Core (4C), and 11.9% in 8 Core (8C) processors on average across traces generated from SPEC2017, SPEC2006, and Cloud suites with ~10KB overhead. Moreover, we also reduce the number of negative outliers by over 89%, and the performance loss of the worst-case negative outlier from 25% to only 5% compared to the state-of-the-art.

翻訳日:2022-01-31 23:25:43 公開日:2022-01-28

# (参考訳) FreshPRINCE: 簡単な変換ベースのパイプライン時系列分類器

The FreshPRINCE: A Simple Transformation Based Pipeline Time Series Classifier ( http://arxiv.org/abs/2201.12048v1 )

ライセンス: CC BY 4.0

Matthew Middlehurst and Anthony Bagnall

(参考訳) 近年,時系列分類(TSC)のために提案されるアルゴリズムの精度が著しく向上している。しかし、実際の実践者やデータサイエンティストが研究のトピックに詳しくない質問としてよく聞かれるのは、アルゴリズムの複雑さが最先端にあると考えるかどうかだ。最初に提案されたアプローチは、単純な要約統計のパイプラインやTSFreshのような時系列の特徴抽出アプローチであり、それ自体は理にかなった問題であり、複数の問題タイプに一般化されたTSCアルゴリズムの出版物では、これらのアプローチが考慮または比較されることはめったにない。ベクトルベース分類器を用いて,現在最先端の時系列分類器の連続特性に有効であることを示す。これらのアプローチをucr時系列データセットアーカイブでテストし、tsc文献がこれらのアプローチの有効性を見落としているかどうかを確認した。 TSFreshのパイプラインに続いて, FreshPRINCEと呼ばれる回転森林分類器が最適であることがわかった。最先端の技術ではないが、動的に経時的に振る舞う隣人よりもかなり正確であり、将来の比較のための合理的なベンチマークである。

There have recently been significant advances in the accuracy of algorithms proposed for time series classification (TSC). However, a question commonly asked by real world practitioners and data scientists less familiar with the research topic is whether the complexity of the algorithms considered state of the art is really necessary. Often the first approach suggested is a simple pipeline of summary statistics or another time series feature extraction approach such as TSFresh. This is in itself a sensible question: in publications on TSC algorithms generalised for multiple problem types, we rarely see these approaches considered or compared against. We experiment with basic feature extractors using vector based classifiers shown to be effective with continuous attributes in current state-of-the-art time series classifiers. We test these approaches on the UCR time series dataset archive, looking to see if the TSC literature has overlooked the effectiveness of these approaches. We find that a pipeline of TSFresh followed by a rotation forest classifier, which we name FreshPRINCE, performs best. It is not state of the art, but it is significantly more accurate than nearest neighbour with dynamic time warping, and represents a reasonable benchmark for future comparison.

翻訳日:2022-01-31 22:52:29 公開日:2022-01-28

# (参考訳) 浅層ニューラルネットワークにおける確率勾配降下のグローバル収束のためのオーバーパラメータ境界の改善

Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks ( http://arxiv.org/abs/2201.12052v1 )

ライセンス: CC BY 4.0

Bartłomiej Polaczyk and Jacek Cyranka

(参考訳) 隠れ層フィードフォワードニューラルネットワークのクラスに対する確率勾配降下アルゴリズムのグローバル収束に必要な過パラメトリゼーション境界について検討し、ReLUを含む実際に用いられる活性化関数のほとんどを考慮して検討した。必要な層幅を隠蔽することで既存の最先端結果を改善する。本稿では,非線形解析とネットワークのランダム初期化特性を組み合わせた新しい証明手法を提案する。まず、MSE損失に対する勾配流の非滑らかな類似である微分包含の連続解のグローバル収束を確立する。第2に,上記の微分包含の解を(離散)確率的勾配降下列に関連付ける技術的結果(一般近似子に対しても動作する)を提供し,確率的勾配降下反復においてゼロ損失に向かう線形収束を確立する。

We study the overparametrization bounds required for the global convergence of stochastic gradient descent algorithm for a class of one hidden layer feed-forward neural networks, considering most of the activation functions used in practice, including ReLU. We improve the existing state-of-the-art results in terms of the required hidden layer width. We introduce a new proof technique combining nonlinear analysis with properties of random initializations of the network. First, we establish the global convergence of continuous solutions of the differential inclusion being a nonsmooth analogue of the gradient flow for the MSE loss. Second, we provide a technical result (working also for general approximators) relating solutions of the aforementioned differential inclusion to the (discrete) stochastic gradient descent sequences, hence establishing linear convergence towards zero loss for the stochastic gradient descent iterations.

翻訳日:2022-01-31 22:29:09 公開日:2022-01-28

# (参考訳) 自動エンコーダを用いたベイズ推論のための学習概要統計

Learning Summary Statistics for Bayesian Inference with Autoencoders ( http://arxiv.org/abs/2201.12059v1 )

ライセンス: CC BY 4.0

Carlo Albert, Simone Ulzega, Firat Ozdemir, Fernando Perez-Cruz, Antonietta Mira

(参考訳) 難解な確率関数を持つ確率モデルに対して、近似ベイズ計算は、シミュレーションされたモデル出力と観測の繰り返し比較を通じて、小さな要約統計量の組で真の後部を近似する方法を提供する。これらの統計は、パラメータを制約するがノイズをキャンセルするための情報を保持する必要がある。したがって、一般の確率モデルでは熱力学的状態変数と見なすことができる。多くの科学的応用において、後部の十分な近似に到達するためにはモデルパラメータよりも厳密な要約統計が必要である。そこで我々は,ディープニューラルネットワークに基づくオートエンコーダの内部次元を要約統計として利用する。パラメータ関連情報を全て符号化するエンコーダのインセンティブを作成するため、トレーニングデータを生成するために使用したノイズに関する明示的または暗黙的な情報にデコーダがアクセスする。このアプローチを2種類の確率モデルで実証的に検証する。

For stochastic models with intractable likelihood functions, approximate Bayesian computation offers a way of approximating the true posterior through repeated comparisons of observations with simulated model outputs in terms of a small set of summary statistics. These statistics need to retain the information that is relevant for constraining the parameters but cancel out the noise. They can thus be seen as thermodynamic state variables, for general stochastic models. For many scientific applications, we need strictly more summary statistics than model parameters to reach a satisfactory approximation of the posterior. Therefore, we propose to use the inner dimension of deep neural network based Autoencoders as summary statistics. To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information on the noise that has been used to generate the training data. We validate the approach empirically on two types of stochastic models.
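
The comparison loop that approximate Bayesian computation relies on can be sketched as a rejection sampler. This is a minimal illustration under stated assumptions: the toy Gaussian model, the hand-picked summary statistic (the sample mean), the uniform prior and the tolerance `eps` are all hypothetical choices made here, whereas the paper learns the summary statistics with an autoencoder.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy stochastic model: n Gaussian draws with mean theta, unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    """Hand-picked summary statistic (assumption: the sample mean)."""
    return statistics.fmean(data)


def abc_rejection(observed, prior_sampler, n_draws, eps, rng):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= eps:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 50, rng)
posterior = abc_rejection(observed, lambda r: r.uniform(-5.0, 5.0), 5000, 0.1, rng)
```

The accepted values in `posterior` approximate the posterior of `theta`; a summary that discarded parameter-relevant information would widen this approximation, which is the motivation for learning the statistics.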

翻訳日:2022-01-31 21:32:44 公開日:2022-01-28

# (参考訳) DynaMixer:動的ミキシングを備えたビジョンMLPアーキテクチャ

DynaMixer: A Vision MLP Architecture with Dynamic Mixing ( http://arxiv.org/abs/2201.12083v1 )

ライセンス: CC BY 4.0

Ziyu Wang and Wenhao Jiang and Yiming Zhu and Li Yuan and Yibing Song and Wei Liu

(参考訳) 近年,MLPのような視覚モデルが主流の視覚認識タスクにおいて有望な性能を達成している。視覚トランスフォーマーやcnnとは対照的に、mlpライクなモデルの成功は、トークンとチャネル間の単純な情報融合操作が深い認識モデルに優れた表現力をもたらすことを示している。しかし、既存のMLPのようなモデルは、トークンを静的融合操作を通じて融合させ、混在するトークンの内容への適応性に欠ける。したがって、慣用的な情報融合手順は不十分である。そこで本稿では,動的情報融合を利用して,DynaMixerと呼ばれる効率的なMLP型ネットワークアーキテクチャを提案する。本稿では,DynaMixerモデルが依存する手法を提案し,混合する全てのトークンの内容を活用することで,混合行列を動的に生成する。時間の複雑さを低減し、ロバスト性を向上させるため、寸法低減技術と多セグメント融合機構を採用する。提案したDynaMixerモデル (97Mパラメータ) は,ImageNet-1Kデータセットの84.3%のTop-1精度を実現する。パラメータ数が26Mに減少しても82.7%のtop-1精度を達成し、同様の能力を持つ既存のmlpライクなモデルを上回る。 DynaMixerの実装は一般公開される予定だ。

Recently, MLP-like vision models have achieved promising performances on mainstream visual recognition tasks. In contrast with vision transformers and CNNs, the success of MLP-like models shows that simple information fusion operations among tokens and channels can yield a good representation power for deep recognition models. However, existing MLP-like models fuse tokens through static fusion operations, lacking adaptability to the contents of the tokens to be mixed. Thus, customary information fusion procedures are not effective enough. To this end, this paper presents an efficient MLP-like network architecture, dubbed DynaMixer, resorting to dynamic information fusion. Critically, we propose a procedure, on which the DynaMixer model relies, to dynamically generate mixing matrices by leveraging the contents of all the tokens to be mixed. To reduce the time complexity and improve the robustness, a dimensionality reduction technique and a multi-segment fusion mechanism are adopted. Our proposed DynaMixer model (97M parameters) achieves 84.3% top-1 accuracy on the ImageNet-1K dataset without extra training data, performing favorably against the state-of-the-art vision MLP models. When the number of parameters is reduced to 26M, it still achieves 82.7% top-1 accuracy, surpassing the existing MLP-like models with a similar capacity. The implementation of DynaMixer will be made available to the public.
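
The idea of generating a mixing matrix from the tokens themselves, instead of using a fixed one, can be sketched as follows. This is a simplified pure-Python illustration and not the released DynaMixer code: the weight names `w_reduce` and `w_generate`, the tiny sizes, the random initialisation and the row-wise softmax are assumptions made here, and the multi-segment fusion mechanism is omitted.

```python
import math
import random


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_mix(tokens, w_reduce, w_generate):
    """Mix N tokens with an N x N matrix computed from their own contents."""
    n = len(tokens)
    reduced = matmul(tokens, w_reduce)             # N x d (dimensionality reduction)
    flat = [[v for row in reduced for v in row]]   # 1 x (N*d)
    logits = matmul(flat, w_generate)[0]           # N*N mixing logits
    mixing = [softmax(logits[i * n:(i + 1) * n]) for i in range(n)]
    return matmul(mixing, tokens), mixing


rng = random.Random(0)
N, D, d = 3, 4, 2  # tokens, channels, reduced channels (toy sizes)
tokens = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]
w_reduce = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(D)]
w_generate = [[rng.gauss(0, 1) for _ in range(N * N)] for _ in range(N * d)]
mixed, mixing = dynamic_mix(tokens, w_reduce, w_generate)
```

A static token-mixing MLP would use the same `mixing` matrix for every input; here it is recomputed from `tokens` on each call.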

翻訳日:2022-01-31 21:19:54 公開日:2022-01-28

# (参考訳) BLIP:Unified Vision-Language Understanding and Generationのためのブートストラップ言語画像事前学習

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation ( http://arxiv.org/abs/2201.12086v1 )

ライセンス: CC BY 4.0

Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi

(参考訳) Vision-Language Pre-Training (VLP)は多くの視覚言語タスクのパフォーマンスを向上した。しかし、既存のトレーニング済みモデルのほとんどは、理解ベースタスクまたは生成ベースタスクにのみ優れている。さらに、Webから収集されたノイズの多い画像とテキストのペアでデータセットをスケールアップすることで、パフォーマンスが大幅に向上した。本稿では,視覚言語理解と生成の両方に柔軟に変換可能な新しいVLPフレームワークBLIPを提案する。 blipは、キャプションをブートストラップし、キャプションが合成キャプションを生成し、フィルタが騒がしいキャプションを取り除くことで、ノイズの多いwebデータを効果的に活用する。画像テキスト検索(平均リコール@1で+2.7%)、画像キャプション(CIDErで+2.8%)、VQA(VQAで+1.6%)など、幅広い視覚言語タスクにおける最先端の成果を得た。 BLIPはまた、ゼロショット方式で直接ビデオ言語タスクに移行する際に、強力な一般化能力を示す。コード、モデル、データセットはhttps://github.com/salesforce/BLIPで公開されている。

Vision-Language Pre-training (VLP) has advanced the performance for many vision-language tasks. However, most existing pre-trained models only excel in either understanding-based tasks or generation-based tasks. Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision. In this paper, we propose BLIP, a new VLP framework which transfers flexibly to both vision-language understanding and generation tasks. BLIP effectively utilizes the noisy web data by bootstrapping the captions, where a captioner generates synthetic captions and a filter removes the noisy ones. We achieve state-of-the-art results on a wide range of vision-language tasks, such as image-text retrieval (+2.7% in average recall@1), image captioning (+2.8% in CIDEr), and VQA (+1.6% in VQA score). BLIP also demonstrates strong generalization ability when directly transferred to video-language tasks in a zero-shot manner. Code, models, and datasets are released at https://github.com/salesforce/BLIP.

翻訳日:2022-01-31 20:55:18 公開日:2022-01-28

# (参考訳) 疾患スクリーニングのためのラベル不確実性誘導マルチストリームモデル

Label uncertainty-guided multi-stream model for disease screening ( http://arxiv.org/abs/2201.12089v1 )

ライセンス: CC BY 4.0

Chi Liu, Zongyuan Ge, Mingguang He, Xiaotong Han

(参考訳) 医学画像データセットに対する病気の重症度のアノテーションは、しばしば複数のヒトグレーダーからの協調的な決定に依存している。個々の差異に由来するオブザーバ内変動は、常にこのプロセスで持続するが、影響は過小評価されることが多い。本稿では,不確実性問題としてオブザーバ内変動性(intra-observer variability)を取り上げ,そのラベル不確実性情報を疾患スクリーニングモデルに導入し,最終決定を改善する。主な考え方は、画像を不確実性情報によって単純で難しいケースに分割し、異なるケースを別々に扱うマルチストリームネットワークを開発することである。特に難しい場合は,適切な疾患の特徴を把握し,不確実性の干渉に抵抗するネットワークの能力を強化する。眼底画像を用いた緑内障スクリーニング実験では,提案モデルがいくつかのベースライン,特に難検例よりも優れていた。

The annotation of disease severity for medical image datasets often relies on collaborative decisions from multiple human graders. The intra-observer variability derived from individual differences always persists in this process, yet the influence is often underestimated. In this paper, we cast the intra-observer variability as an uncertainty problem and incorporate the label uncertainty information as guidance into the disease screening model to improve the final decision. The main idea is dividing the images into simple and hard cases by uncertainty information, and then developing a multi-stream network to deal with different cases separately. Particularly, for hard cases, we strengthen the network's capacity in capturing the correct disease features and resisting the interference of uncertainty. Experiments on a fundus image-based glaucoma screening case study show that the proposed model outperforms several baselines, especially in screening hard cases.

翻訳日:2022-01-31 20:32:25 公開日:2022-01-28

# (参考訳) 線形反転概念消去

Linear Adversarial Concept Erasure ( http://arxiv.org/abs/2201.12091v1 )

ライセンス: CC BY 4.0

Shauli Ravfogel, Michael Twiton, Yoav Goldberg and Ryan Cotterell

(参考訳) テキストデータに基づいてトレーニングされた現代のニューラルモデルは、直接の監督なしに現れる事前訓練された表現に依存している。これらの表現が現実のアプリケーションで使われるようになるにつれて、それらのコンテンツが制御できないことがますます重要な問題になっている。線形予測器が概念を回復するのを防ぐために、与えられた概念に対応する線形部分空間の同定と消去の問題を定式化する。我々は、この問題を制約付き線形ミニマックスゲームとしてモデル化し、既存のソリューションが一般にこのタスクに最適でないことを示す。我々は,ある目的に対する閉形式解を導出し,他の目的にうまく機能する凸緩和 R-LACE を提案する。二元性除去の文脈で評価すると、本手法は、内在的および外在的評価によりバイアスを緩和する低次元部分空間を回復する。線形であるにもかかわらず、この手法は、トラクタビリティと解釈可能性を維持しつつ、深い非線形分類器のバイアスを効果的に軽減する。

Modern neural models trained on textual data rely on pre-trained representations that emerge without direct supervision. As these representations are increasingly being used in real-world applications, the inability to control their content becomes an increasingly important problem. We formulate the problem of identifying and erasing a linear subspace that corresponds to a given concept, in order to prevent linear predictors from recovering the concept. We model this problem as a constrained, linear minimax game, and show that existing solutions are generally not optimal for this task. We derive a closed-form solution for certain objectives, and propose a convex relaxation, R-LACE, that works well for others. When evaluated in the context of binary gender removal, the method recovers a low-dimensional subspace whose removal mitigates bias by intrinsic and extrinsic evaluation. We show that the method -- despite being linear -- is highly expressive, effectively mitigating bias in deep nonlinear classifiers while maintaining tractability and interpretability.
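
What "erasing a linear subspace" does to a representation can be illustrated for the rank-1 case, where the concept direction `w` is assumed to be already known. This sketch only shows the projection step; it does not implement the minimax game or the R-LACE relaxation that the paper uses to find the subspace, and the helper names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def erase_direction(x, w):
    """Project x onto the orthogonal complement of w: x - (x.w / w.w) * w."""
    scale = dot(x, w) / dot(w, w)
    return [xi - scale * wi for xi, wi in zip(x, w)]


x = [3.0, 4.0, 5.0]   # a representation vector
w = [0.0, 0.0, 2.0]   # assumed concept direction
x_clean = erase_direction(x, w)
```

After the projection, any linear predictor of the form `dot(x_clean, c * w)` outputs zero, so this particular direction carries no linearly recoverable signal.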

翻訳日:2022-01-31 20:23:13 公開日:2022-01-28

# (参考訳) 魚眼カメラシステムにおけるグラフ畳み込みネットワークによるオーナー・メンバー関係の検出

Detecting Owner-member Relationship with Graph Convolution Network in Fisheye Camera System ( http://arxiv.org/abs/2201.12099v1 )

ライセンス: CC BY 4.0

Zizhang Wu, Jason Wang, Tianhao Xu, Fan Wang

(参考訳) 車輪と車両の所有者とメンバーの関係は、特に組込み環境での車両の3D知覚に大きく貢献する。しかし、この関係を利用するには、2つの大きな課題に直面する必要がある。 i) 従来のiouベースのヒューリスティックは、交通渋滞シナリオの処理が困難である。 ii) 車両搭載システムにおけるソリューションの有効性及び適用性は困難である。これらの問題に対処するために,グラフ畳み込みネットワーク(GCN)を設計し,新しい関係予測手法であるDeepWORDを提案する。具体的には、情報豊かさを向上させるために、ノードへの入力として局所相関を持つ特徴マップを用いる。次に,先行推定バイアスを動的に補正するグラフアテンションネットワーク(GAT)を導入する。最後に、WORDと呼ばれる注釈付きオーナシップを持つ大規模ベンチマークとしてデータセットを設計した。実験の結果,提案手法は最先端の精度と実時間性能を達成した。 WORDデータセットはhttps://github.com/NamespaceMain/ownermember-relationship-datasetで公開されている。

The owner-member relationship between wheels and vehicles contributes significantly to the 3D perception of vehicles, especially in embedded environments. However, to leverage this relationship we must face two major challenges: i) Traditional IoU-based heuristics have difficulty handling occluded traffic congestion scenarios. ii) It is difficult to make the solution effective and applicable in a vehicle-mounted system. To address these issues, we propose an innovative relationship prediction method, DeepWORD, by designing a graph convolutional network (GCN). Specifically, to improve the information richness, we use feature maps with local correlation as input to the nodes. Subsequently, we introduce a graph attention network (GAT) to dynamically correct the a priori estimation bias. Finally, we design a dataset, called WORD, as a large-scale benchmark with annotated owner-member relationships. Experiments show that the proposed method achieves state-of-the-art accuracy and real-time performance. The WORD dataset is made publicly available at https://github.com/NamespaceMain/ownermember-relationship-dataset.

翻訳日:2022-01-31 19:57:02 公開日:2022-01-28

# (参考訳) 忠実性侵害テストによる注意モデル説明可能性の再検討

Rethinking Attention-Model Explainability through Faithfulness Violation Test ( http://arxiv.org/abs/2201.12114v1 )

ライセンス: CC BY 4.0

Yibing Liu, Haoliang Li, Yangyang Guo, Chenqi Kong, Jing Li, Shiqi Wang

(参考訳) 注意機構は深層モデルの説明可能性を支配している。それらは入力上の確率分布を生成し、特徴重要度指標として広く見なされている。しかし,本論文では,機能的影響の極性を識別する弱さという,注意的説明に重要な制限がある。注意重みが高い機能はモデル予測に忠実に寄与しないかもしれないし、代わりに抑制効果を課すことができる。本稿では,現在の注意に基づく手法であるattentio$\odot$gradient や lrp-based attention descriptions について考察する。本稿ではまず,説明重みと衝突極性との整合性を測定するための実用的な診断手法を提案する。広範な実験を通じ,ほとんどの説明手法が不意に忠実性侵害問題,特に生の注意によって妨げられていることを示した。違反問題に影響を及ぼす要因に関する実証分析は、注意モデルにおける説明法の適用に有用である。

Attention mechanisms are dominating the explainability of deep models. They produce probability distributions over the input, which are widely deemed as feature-importance indicators. However, in this paper, we find one critical limitation in attention explanations: weakness in identifying the polarity of feature impact. This would be somehow misleading -- features with higher attention weights may not faithfully contribute to model predictions; instead, they can impose suppression effects. With this finding, we reflect on the explainability of current attention-based techniques, such as Attentio$\odot$Gradient and LRP-based attention explanations. We first propose an actionable diagnostic methodology (henceforth faithfulness violation test) to measure the consistency between explanation weights and the impact polarity. Through the extensive experiments, we then show that most tested explanation methods are unexpectedly hindered by the faithfulness violation issue, especially the raw attention. Empirical analyses on the factors affecting violation issues further provide useful observations for adopting explanation methods in attention models.
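
The consistency check between explanation weights and impact polarity can be sketched roughly as follows. This is an illustration under assumptions, not the paper's exact protocol: impact is taken here to be the change in model output when a feature is replaced by a baseline value, only the top-weighted feature is inspected, and the names `impact`, `violates_faithfulness` and `toy_model` are hypothetical.

```python
def impact(model, x, i, baseline=0.0):
    """Change in model output when feature i is replaced by a baseline value."""
    masked = list(x)
    masked[i] = baseline
    return model(x) - model(masked)


def violates_faithfulness(model, x, weights):
    """True if the sign of the top explanation weight disagrees with its impact polarity."""
    top = max(range(len(x)), key=lambda i: abs(weights[i]))
    return weights[top] * impact(model, x, top) < 0


def toy_model(x):
    # Feature 1 suppresses the output even though it may receive high attention.
    return 1.0 * x[0] - 2.0 * x[1]


x = [1.0, 1.0]
raw_attention = [0.2, 0.8]      # non-negative, like attention weights
signed_weights = [0.2, -0.8]    # an explanation that encodes polarity
```

Raw attention weights are non-negative, so they cannot signal a suppressing feature; this is the polarity weakness the abstract describes.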

翻訳日:2022-01-31 19:47:32 公開日:2022-01-28

# (参考訳) DELAUNAY:精神物理学と機械学習研究のための抽象芸術のデータセット

DELAUNAY: a dataset of abstract art for psychophysical and machine learning research ( http://arxiv.org/abs/2201.12123v1 )

ライセンス: CC BY 4.0

Camille Gontier, Jakob Jordan, Mihai A. Petrovici

(参考訳) 画像データセットは、心理物理学実験や機械学習研究で一般的に使用される。ほとんどの公開データセットは、現実的で自然なオブジェクトのイメージで構成されている。しかし、典型的な機械学習モデルには自然オブジェクトに関するドメイン固有の知識が欠けているが、人間はそのようなデータに対して事前の経験を活用でき、人工学習と自然学習の比較が難しい。本稿では,抽象絵画のデータセットであるDELAUNAYについて紹介する。このデータセットは、自然画像と人工パターンの中間層を提供しており、例えば人間や人工ニューラルネットワークのサンプル効率を調べるために、さまざまなコンテキストで使用することができる。最後に、DELAUNAYで市販の畳み込みニューラルネットワークをトレーニングし、いくつかの興味深い特徴を強調します。

Image datasets are commonly used in psychophysical experiments and in machine learning research. Most publicly available datasets are comprised of images of realistic and natural objects. However, while typical machine learning models lack any domain specific knowledge about natural objects, humans can leverage prior experience for such data, making comparisons between artificial and natural learning challenging. Here, we introduce DELAUNAY, a dataset of abstract paintings and non-figurative art objects labelled by the artists' names. This dataset provides a middle ground between natural images and artificial patterns and can thus be used in a variety of contexts, for example to investigate the sample efficiency of humans and artificial neural networks. Finally, we train an off-the-shelf convolutional neural network on DELAUNAY, highlighting several of its intriguing features.

翻訳日:2022-01-31 19:32:01 公開日:2022-01-28

# (参考訳) 自動過パラメータ最適化問題に対する適応最適化器

Adaptive Optimizer for Automated Hyperparameter Optimization Problem ( http://arxiv.org/abs/2201.12124v1 )

ライセンス: CC BY 4.0

Huayuan Sun

(参考訳) ハイパーパラメータの選択は、機械学習モデルの性能に重大な影響を与える。本稿では,最適化プロセスにおいて適切なアルゴリズムとパラメータを自動的に調整する適応オプティマイザを構築するための汎用フレームワークを提案する。適応最適化手法を検証し、遺伝的アルゴリズムを用いてベイズ最適化器に基づく適応最適化器を構築し、元の最適化器と比較する。特に並列最適化には大きな利点があります。

The choice of hyperparameters has a critical effect on the performance of machine learning models. In this paper, we present a general framework for constructing an adaptive optimizer, which automatically adjusts the algorithm and its parameters during the optimization process. To examine the adaptive optimizer method, we give an example that uses a genetic algorithm to construct an adaptive optimizer based on a Bayesian optimizer, and we compare its effectiveness with that of the original optimizer. It has particularly great advantages in parallel optimization.

翻訳日:2022-01-31 19:22:15 公開日:2022-01-28

# (参考訳) 残留ポリシー勾配法による共通意味強化学習におけるクラス抽象化の活用

Leveraging class abstraction for commonsense reinforcement learning via residual policy gradient methods ( http://arxiv.org/abs/2201.12126v1 )

ライセンス: CC BY 4.0

Niklas Höpner, Ilaria Tiddi, Herke van Hoof

(参考訳) 知識ベースを活用するために強化学習(RL)エージェントを導入し、経験から学習することで、知識集約ドメインにおいてRLを前進させる。しかし、手動で環境に合わせた知識を活用することは困難であることが証明されている。本稿では,オープンソース知識グラフに存在するサブクラス関係を利用して,特定のオブジェクトを抽象化することを提案する。我々は,クラス階層内の異なる抽象レベルにまたがる知識を統合可能な残留ポリシー勾配法を開発した。提案手法は,コモンセンスゲームにおいて,サンプル効率の向上とオブジェクトの一般化を実現するとともに,抽出したクラス知識の過度なノイズや,クラス構造がほとんどない環境など,障害モードについても検討する。

Enabling reinforcement learning (RL) agents to leverage a knowledge base while learning from experience promises to advance RL in knowledge intensive domains. However, it has proven difficult to leverage knowledge that is not manually tailored to the environment. We propose to use the subclass relationships present in open-source knowledge graphs to abstract away from specific objects. We develop a residual policy gradient method that is able to integrate knowledge across different abstraction levels in the class hierarchy. Our method results in improved sample efficiency and generalisation to unseen objects in commonsense games, but we also investigate failure modes, such as excessive noise in the extracted class knowledge or environments with little class structure.

翻訳日:2022-01-31 19:17:10 公開日:2022-01-28

# (参考訳) 教師付き機械学習における意思決定のための学習曲線 - サーベイ

Learning Curves for Decision Making in Supervised Machine Learning -- A Survey ( http://arxiv.org/abs/2201.12150v1 )

ライセンス: CC BY 4.0

Felix Mohr, Jan N. van Rijn

(参考訳) 学習曲線(英: learning curves)とは、機械学習の文脈において、特定の資源(例えば、訓練例の数や訓練イテレーションの数)に対する学習アルゴリズムのパフォーマンスを評価するために採用されている社会科学の概念である。学習曲線は、機械学習のいくつかの文脈において重要な応用であり、最も重要なのは、データ取得の文脈、モデルのトレーニングの早期停止、モデル選択である。例えば、学習曲線をモデル化することで、アルゴリズムとハイパーパラメータの構成が適切な選択の可能性があるかどうかを早期に評価することができ、しばしばアルゴリズムの選択プロセスを高速化することができる。意思決定に学習曲線を使用するための様々なアプローチが提案されている。一部のモデルは、ある予算の特定のアルゴリズムが特定の参照性能を上回るかどうかという二分決定問題に答えるが、より複雑なモデルはアルゴリズムの学習曲線全体を予測する。学習曲線のアプローチを3つの基準で分類するフレームワーク、すなわち、対処する意思決定状況、彼らが回答する本質的学習曲線問題、彼らが使用するリソースの種類に分類する。文献から論文を調査し,この枠組みに分類する。

Learning curves are a concept from social sciences that has been adopted in the context of machine learning to assess the performance of a learning algorithm with respect to a certain resource, e.g. the number of training examples or the number of training iterations. Learning curves have important applications in several contexts of machine learning, most importantly for the context of data acquisition, early stopping of model training and model selection. For example, by modelling the learning curves, one can assess at an early stage whether the algorithm and hyperparameter configuration have the potential to be a suitable choice, often speeding up the algorithm selection process. A variety of approaches has been proposed to use learning curves for decision making. Some models answer the binary decision question of whether a certain algorithm at a certain budget will outperform a certain reference performance, whereas more complex models predict the entire learning curve of an algorithm. We contribute a framework that categorizes learning curve approaches using three criteria: the decision situation that they address, the intrinsic learning curve question that they answer and the type of resources that they use. We survey papers from literature and classify them into this framework.
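
The binary decision question mentioned above (will an algorithm reach a reference performance at a given budget?) can be sketched with a simple extrapolation. The inverse power law `error(n) = a * n**(-b)` used here is one common parametric family and is an assumption of this sketch, not a method the survey singles out; the function names are hypothetical.

```python
import math


def fit_power_law(sizes, errors):
    """Least-squares fit of error = a * n**(-b) in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return math.exp(intercept), -slope


def predict_error(a, b, n):
    return a * n ** (-b)


def worth_training(sizes, errors, budget, target_error):
    """Binary decision: is the extrapolated error at `budget` below the target?"""
    a, b = fit_power_law(sizes, errors)
    return predict_error(a, b, budget) <= target_error


sizes = [100, 200, 400, 800]
errors = [2.0 * n ** -0.5 for n in sizes]   # synthetic curve for illustration
```

With a fit in hand, a configuration whose extrapolated curve misses the target can be discarded after only a few small training runs.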

翻訳日:2022-01-31 18:57:19 公開日:2022-01-28

# (参考訳) 分子コンフォメーションの生成的粗粒化

Generative Coarse-Graining of Molecular Conformations ( http://arxiv.org/abs/2201.12176v1 )

ライセンス: CC BY 4.0

Wujie Wang, Minkai Xu, Chen Cai, Benjamin Kurt Miller, Tess Smidt, Yusu Wang, Jian Tang, Rafael Gómez-Bombarelli

(参考訳) 分子シミュレーションの粗粒化(CG)は、選択された原子を擬似ビーズに分類することで粒子表現を単純化し、シミュレーションを劇的に加速する。しかし,このようなcg処理は情報損失を誘発し,cg座標からの微細粒度 (fg) 座標の復元を精度良く行う。生成モデルと同変ネットワークの最近の進歩に触発されて、バックマッピング変換の重要な確率的性質と幾何的整合性要件を厳密に組み込む新しいモデルを提案する。我々のモデルはFGの不確かさを不変潜在空間にエンコードし、同変畳み込みを通じてFG測度に復号する。この領域の評価を標準化するために、分子動力学軌道に基づく3つの包括的なベンチマークも提供する。広範な実験によって、我々のアプローチは常により現実的な構造を回復し、既存のデータ駆動型メソッドをかなりのマージンで上回っています。

Coarse-graining (CG) of molecular simulations simplifies the particle representation by grouping selected atoms into pseudo-beads and therefore drastically accelerates simulation. However, such a CG procedure induces information loss, which makes accurate backmapping, i.e., restoring fine-grained (FG) coordinates from CG coordinates, a long-standing challenge. Inspired by the recent progress in generative models and equivariant networks, we propose a novel model that rigorously embeds the vital probabilistic nature and geometric consistency requirements of the backmapping transformation. Our model encodes the FG uncertainties into an invariant latent space and decodes them back to FG geometries via equivariant convolutions. To standardize the evaluation of this domain, we further provide three comprehensive benchmarks based on molecular dynamics trajectories. Extensive experiments show that our approach always recovers more realistic structures and outperforms existing data-driven methods by a significant margin.
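
The forward coarse-graining step, and why backmapping is ill-posed, can be illustrated with a small sketch. Placing each bead at the mass-weighted centre of its atoms is a common convention assumed here; the paper's mapping may differ, and the function name and toy coordinates are hypothetical.

```python
def coarse_grain(coords, masses, bead_assignment):
    """Map fine-grained atom coordinates to bead positions (mass-weighted centres)."""
    n_beads = max(bead_assignment) + 1
    totals = [[0.0, 0.0, 0.0] for _ in range(n_beads)]
    bead_mass = [0.0] * n_beads
    for xyz, m, bead in zip(coords, masses, bead_assignment):
        bead_mass[bead] += m
        for k in range(3):
            totals[bead][k] += m * xyz[k]
    return [[c / bead_mass[b] for c in totals[b]] for b in range(n_beads)]


masses = [1.0, 1.0, 2.0]
assignment = [0, 0, 1]   # atoms 0 and 1 form bead 0, atom 2 forms bead 1
fg_a = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
fg_b = [[0.5, 0.5, 0.0], [1.5, -0.5, 0.0], [0.0, 4.0, 0.0]]
cg_a = coarse_grain(fg_a, masses, assignment)
cg_b = coarse_grain(fg_b, masses, assignment)
```

The two distinct fine-grained geometries `fg_a` and `fg_b` collapse to the same beads, so backmapping has to choose among many valid answers, which is why a generative (probabilistic) model is a natural fit.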

翻訳日:2022-01-31 18:21:06 公開日:2022-01-28

# (参考訳) カーネル空間における逆概念消去

Adversarial Concept Erasure in Kernel Space ( http://arxiv.org/abs/2201.12191v1 )

ライセンス: CC BY 4.0

Shauli Ravfogel and Francisco Vargas and Yoav Goldberg and Ryan Cotterell

(参考訳) テキストデータに対するニューラルモデルの表現空間は、トレーニング中に教師なしの方法で現れる。性別などの人間に解釈可能な概念がどのようにコード化されているかを理解することで、ユーザーはこれらの表現の内容を制御し、それらに依存するモデルの動作を分析する能力を向上させることができる。制御問題に対する顕著なアプローチの1つは、与えられた概念に対応する表現空間内の線型概念部分空間の同定と除去である。これらは扱いやすく解釈可能であるが、ニューラルネットワークは必ずしも線形部分空間の概念を表すものではない。我々は, [Ravfogel et al. 2022] の線形概念除去目的のカーネル化を提案し, ある種の非線形敵が概念を回復する能力に対抗して有効であることを示した。興味深いことに、線形モデルと非線形モデルの間の分割は過度に単純化され、二項性の概念と中性化を考えると、すべての概念に関連する情報を排他的に含む単一のカーネル空間は見つからない。したがって、一度にすべての非線形敵から保護することは困難である。

The representation space of neural models for textual data emerges in an unsupervised manner during training. Understanding how human-interpretable concepts, such as gender, are encoded in these representations would improve the ability of users to control the content of these representations and analyze the working of the models that rely on them. One prominent approach to the control problem is the identification and removal of linear concept subspaces -- subspaces in the representation space that correspond to a given concept. While those are tractable and interpretable, neural networks do not necessarily represent concepts in linear subspaces. We propose a kernelization of the linear concept-removal objective of [Ravfogel et al. 2022], and show that it is effective in guarding against the ability of certain nonlinear adversaries to recover the concept. Interestingly, our findings suggest that the division between linear and nonlinear models is overly simplistic: when considering the concept of binary gender and its neutralization, we do not find a single kernel space that exclusively contains all the concept-related information. It is therefore challenging to protect against all nonlinear adversaries at once.

翻訳日:2022-01-31 17:47:09 公開日:2022-01-28

# (参考訳) バリセントリック符号化モデルにおける測度推定

Measure Estimation in the Barycentric Coding Model ( http://arxiv.org/abs/2201.12195v1 )

ライセンス: CC BY 4.0

Matthew Werenski, Ruijie Jiang, Abiy Tasissa, Shuchin Aeron, James M. Murphy

(参考訳) 本稿では、未知の測度が既知の測度の有限集合のワッサーシュタイン2バリーセンタの集合に属すると仮定する、バリー中心符号化モデル(BCM)に基づく測度推定の問題について考察する。このモデルの下で測度を推定することは、未知の重心座標を推定することと同値である。 3つの主要な結果からなるBCMに基づく測度推定のための新しい幾何学的,統計的,および計算的洞察を提供する。最初の主要な結果は、ワッサーシュタイン2空間のリーマン幾何学を利用して、真の基準測度へのアクセスを仮定する二次最適化問題の解として、偏心座標を復元する手順を提供する。本質的な幾何学的洞察は、この二次問題のパラメータは、与えられた測度からBCMを定義する基準測度までの最適変位写像の間の内部積によって決定されるということである。第2の主な結果は、すべての測定値がi.i.d.サンプルによって実証的に観測された場合、bcmにおける座標の解法を確立します。基礎となる測度とその次元の滑らかさによって決定されるこのアルゴリズムの正確な収束率を証明し、その統計的一貫性を保証する。最後に,3つの応用領域において,BCMと関連する評価手順の有用性を実証する。 (i)ガウス測度に対する共分散推定 (ii)画像処理,及び (iii)自然言語処理。

This paper considers the problem of measure estimation under the barycentric coding model (BCM), in which an unknown measure is assumed to belong to the set of Wasserstein-2 barycenters of a finite set of known measures. Estimating a measure under this model is equivalent to estimating the unknown barycentric coordinates. We provide novel geometrical, statistical, and computational insights for measure estimation under the BCM, consisting of three main results. Our first main result leverages the Riemannian geometry of Wasserstein-2 space to provide a procedure for recovering the barycentric coordinates as the solution to a quadratic optimization problem assuming access to the true reference measures. The essential geometric insight is that the parameters of this quadratic problem are determined by inner products between the optimal displacement maps from the given measure to the reference measures defining the BCM. Our second main result then establishes an algorithm for solving for the coordinates in the BCM when all the measures are observed empirically via i.i.d. samples. We prove precise rates of convergence for this algorithm -- determined by the smoothness of the underlying measures and their dimensionality -- thereby guaranteeing its statistical consistency. Finally, we demonstrate the utility of the BCM and associated estimation procedures in three application areas: (i) covariance estimation for Gaussian measures; (ii) image processing; and (iii) natural language processing.
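
The quadratic problem built from inner products of displacement maps can be worked through in the simplest setting: one-dimensional Gaussians with two reference measures. This special case is an assumption chosen here for illustration. For 1-D Gaussians the optimal map from N(m0, s0^2) to N(mi, si^2) is affine, which gives the closed-form inner product used below; the function names are hypothetical.

```python
def displacement_gram(ref_a, ref_b, target):
    """<T_a - Id, T_b - Id> in L2(target) for 1-D Gaussians given as (mean, std)."""
    (ma, sa), (mb, sb), (m0, s0) = ref_a, ref_b, target
    return (ma - m0) * (mb - m0) + (sa - s0) * (sb - s0)


def recover_weight(ref1, ref2, target):
    """Weight t on ref1 minimising the quadratic form over lambda = (t, 1 - t)."""
    a11 = displacement_gram(ref1, ref1, target)
    a22 = displacement_gram(ref2, ref2, target)
    a12 = displacement_gram(ref1, ref2, target)
    denom = a11 - 2.0 * a12 + a22
    if denom == 0.0:
        raise ValueError("reference measures coincide; weight is not identifiable")
    t = (a22 - a12) / denom
    return min(1.0, max(0.0, t))   # keep the coordinate on the simplex


ref1, ref2 = (0.0, 1.0), (4.0, 3.0)
# Barycenter with weights (0.25, 0.75): means and stds average linearly in 1-D.
target = (0.25 * 0.0 + 0.75 * 4.0, 0.25 * 1.0 + 0.75 * 3.0)
t = recover_weight(ref1, ref2, target)
```

When the target really is a barycenter of the references, the weighted displacements cancel and the quadratic form reaches zero at the true coordinates, which is what the recovery exploits.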

翻訳日:2022-01-31 17:21:57 公開日:2022-01-28

# (参考訳) データ非依存関数による暗黙正則化の限界

Limitation of characterizing implicit regularization by data-independent functions ( http://arxiv.org/abs/2201.12198v1 )

ライセンス: CC BY 4.0

Leyang Zhang, Zhi-Qin John Xu, Tao Luo, Yaoyu Zhang

(参考訳) 近年,ニューラルネットワーク(nns)の暗黙的正規化の理解が深層学習理論の中心的な課題となっている。しかし、暗黙の正則化自体は完全に定義されておらず、よく理解されていない。本研究では,暗黙の正規化を数学的に定義し,研究する。重要なのは,データ独立関数による暗黙の正規化を特徴付ける共通アプローチの限界を検討することである。本稿では,2つの動的メカニズム,すなわち2点重なり合い機構を提案する。このメカニズムは,1つの隠れニューロンNNのクラスを生成するための2つのレシピを提供する。その結果,暗黙的正則化の深遠なデータ依存性が示され,将来におけるNNの暗黙的正則化のデータ依存性を詳細に研究するきっかけとなった。

In recent years, understanding the implicit regularization of neural networks (NNs) has become a central task of deep learning theory. However, implicit regularization is in itself not completely defined and well understood. In this work, we make an attempt to mathematically define and study the implicit regularization. Importantly, we explore the limitation of a common approach of characterizing the implicit regularization by data-independent functions. We propose two dynamical mechanisms, i.e., Two-point and One-point Overlapping mechanisms, based on which we provide two recipes for producing classes of one-hidden-neuron NNs that provably cannot be fully characterized by a type of or all data-independent functions. Our results signify the profound data-dependency of implicit regularization in general, inspiring us to study in detail the data-dependency of NN implicit regularization in the future.

翻訳日:2022-01-31 17:20:53 公開日:2022-01-28

# (参考訳) 分割コンカレントアルゴリズムによるパーコレーション逆問題の解法

Solving a percolation inverse problem with a divide-and-concur algorithm ( http://arxiv.org/abs/2201.12222v1 )

ライセンス: CC BY 4.0

Sean Deyo

(参考訳) 我々は,ダイオードネットワークに対するパーコレーション逆問題を提案する。どのノードが電流を相互にパーコレーションできるかという情報があれば,観測電流と一致したダイオードネットワークを構築することができるか? そこで本研究では,この問題の非自明な事例に対する包括的アプローチに対する手法の優位性を実証するため,分割・収束反復予測法を実装した。パーコレーションデータが全て隠されていない場合、問題は最も困難であり、一般に再構成する最も難しいネットワークは、電流が1つのダイオードの追加や削除に最も敏感である場合である。

We present a percolation inverse problem for diode networks: Given information about which pairs of nodes allow current to percolate from one to the other, can one construct a diode network consistent with the observed currents? We implement a divide-and-concur iterative projection method for solving the problem and demonstrate the supremacy of our method over an exhaustive approach for nontrivial instances of the problem. We find that the problem is most difficult when some but not all of the percolation data are hidden, and that the most difficult networks to reconstruct generally are those for which the currents are most sensitive to the addition or removal of a single diode.
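
The forward direction of this problem, checking whether a candidate diode network agrees with the observed percolation data, is easy to sketch. The code below treats diodes as directed edges and uses a depth-first search; it does not implement the divide-and-concur solver, and the function names and the split into observed "yes" and "no" pairs are assumptions of this illustration.

```python
def percolation_pairs(n_nodes, diodes):
    """All ordered pairs (a, b), a != b, where current can flow from a to b."""
    adj = {i: [] for i in range(n_nodes)}
    for src, dst in diodes:
        adj[src].append(dst)
    pairs = set()
    for start in range(n_nodes):
        seen = {start}
        stack = [start]
        while stack:                      # depth-first search from `start`
            node = stack.pop()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        pairs.update((start, other) for other in seen if other != start)
    return pairs


def is_consistent(n_nodes, diodes, observed_yes, observed_no):
    """Check a candidate network against known percolating / non-percolating pairs."""
    pairs = percolation_pairs(n_nodes, diodes)
    return observed_yes <= pairs and not (observed_no & pairs)


candidate = [(0, 1), (1, 2)]   # diode 0 -> 1 and diode 1 -> 2
```

Pairs that appear in neither observed set play the role of hidden data: any candidate is allowed to percolate or not percolate there.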

翻訳日:2022-01-31 16:49:15 公開日:2022-01-28

# 探索の克服:時間論理の仕様から複雑な環境での深層強化学習

Overcoming Exploration: Deep Reinforcement Learning in Complex Environments from Temporal Logic Specifications ( http://arxiv.org/abs/2201.12231v1 )

ライセンス: Link先を確認

Mingyu Cai, Erfan Aasi, Calin Belta, Cristian-Ioan Vasile

(参考訳) 大規模複雑な環境に展開する未知の連続時間ダイナミクスを持つタスク誘導型ロボットに対して,深層強化学習(drl)アルゴリズムを提案する。リニア時間論理(LTL)は、リッチなロボット仕様を表現するために用いられる。環境問題に対処するため,我々は,未知のロボット力学により計算された幾何学的経路が実現不可能な状態空間に密接な経路計画誘導型報酬スキームを提案する。提案手法は,LTLミッションを分散DRLを用いて解いたサブタスクに分解し,そのサブタスクをDeep Policy Gradientアルゴリズムを用いて並列にトレーニングする。本フレームワークは,大規模複雑な環境下での複雑なミッションをこなすロボットの性能(有効性,効率)を著しく向上させる。

We present a Deep Reinforcement Learning (DRL) algorithm for a task-guided robot with unknown continuous-time dynamics deployed in a large-scale complex environment. Linear Temporal Logic (LTL) is applied to express a rich robotic specification. To overcome the environmental challenge, we propose a novel path planning-guided reward scheme that is dense over the state space, and crucially, robust to infeasibility of computed geometric paths due to the unknown robot dynamics. To facilitate LTL satisfaction, our approach decomposes the LTL mission into sub-tasks that are solved using distributed DRL, where the sub-tasks are trained in parallel, using Deep Policy Gradient algorithms. Our framework is shown to significantly improve performance (effectiveness, efficiency) and exploration of robots tasked with complex missions in large-scale complex environments.

翻訳日:2022-01-31 16:37:55 公開日:2022-01-28

# 二重学習音楽作曲と舞踊振付

Dual Learning Music Composition and Dance Choreography ( http://arxiv.org/abs/2201.11999v1 )

ライセンス: Link先を確認

Shuang Wu, Zhenguang Li, Shijian Lu, Li Cheng

(参考訳) 音楽とダンスは常に人間の活動の柱として共存しており、事実上全ての社会における文化的、社会的、娯楽的な機能に大きく貢献している。音楽とダンスが徐々に2つの独立した分野に体系化されているにもかかわらず、彼らの親密なつながりは否定できないものであり、一方の芸術形態はしばしば他方なしでは不完全に見える。近年の研究では、音楽に基づくダンスシーケンスの生成モデルが研究されている。しかし、与えられたダンスのために作曲する2つの課題は、ほとんど見落とされた。本稿では,両タスクを二重学習方式で協調的にモデル化する新しい拡張法を提案する。 2つのモダリティの双対性を活用するために,機能埋め込みを整列する最適な輸送目標を導入するとともに,全体の整合性を高めるためのサイクル整合損失も導入する。実験結果から,2つの学習フレームワークが個々のタスクパフォーマンスを改善し,生成した楽曲の合成とダンス振付を,条件付き入力に忠実に再現できることが確認された。

Music and dance have always co-existed as pillars of human activities, contributing immensely to the cultural, social, and entertainment functions in virtually all societies. Notwithstanding the gradual systematization of music and dance into two independent disciplines, their intimate connection is undeniable and one art-form often appears incomplete without the other. Recent research works have studied generative models for dance sequences conditioned on music. The dual task of composing music for given dances, however, has been largely overlooked. In this paper, we propose a novel extension, where we jointly model both tasks in a dual learning approach. To leverage the duality of the two modalities, we introduce an optimal transport objective to align feature embeddings, as well as a cycle consistency loss to foster overall consistency. Experimental results demonstrate that our dual learning framework improves individual task performance, delivering generated music compositions and dance choreographs that are realistic and faithful to the conditioned inputs.

翻訳日:2022-01-31 16:37:00 公開日:2022-01-28

# differential privacyのハイブリッドモデルにおける転送学習

Transfer Learning In Differential Privacy's Hybrid-Model ( http://arxiv.org/abs/2201.12018v1 )

ライセンス: Link先を確認

Refael Kohen and Or Sheffet

(参考訳) ディファレンシャルプライバシにおけるハイブリッドモデル(avent et al 2017)は、ローカルモデルの拡張であり、n人のローカルエイジェントに加えて、n人の追加的な個人の繊細な詳細を保持するキュレーターである1人の特別エージェントによって支援される。本稿では,キュレーターデータセットのn個の個体が一般集団(地域エージェント)とは異なる分布から引き出されるハイブリッドモデルにおける機械学習の問題点について考察する。我々は,このトランスファー学習問題に対して,反復サブサンプリングと乗法重みアルゴリズム(bun et al, 2020)の滑らかな変動に基づいて,キュレーターが保持するn個のサンプルの簡約化を用いて,キュレーターモデルdp-learnerをハイブリッドモデル学習者に還元する一般的なスキームを与える。提案手法は, 2つの分布間のカイ二乗発散に依存するサンプル複雑性を有する。プライベートリダクションに必要なサンプルの複雑さに、最悪のケース分析バウンダリを与えます。上記のサンプルの複雑さを減らすため、2つの特定のインスタンスにサンプルの複雑さを劇的に減らすことができる(1つのインスタンスは数学的に分析され、もう1つのインスタンスは経験的に分析される)。

The hybrid-model (Avent et al 2017) in Differential Privacy is an augmentation of the local-model where in addition to N local-agents we are assisted by one special agent who is in fact a curator holding the sensitive details of n additional individuals. Here we study the problem of machine learning in the hybrid-model where the n individuals in the curator's dataset are drawn from a different distribution than the one of the general population (the local-agents). We give a general scheme -- Subsample-Test-Reweigh -- for this transfer learning problem, which reduces any curator-model DP-learner to a hybrid-model learner in this setting using iterative subsampling and reweighing of the n examples held by the curator based on a smooth variation of the Multiplicative-Weights algorithm (introduced by Bun et al, 2020). Our scheme has a sample complexity which relies on the chi-squared divergence between the two distributions. We give worst-case analysis bounds on the sample complexity required for our private reduction. Aiming to reduce said sample complexity, we give two specific instances in which our sample complexity can be drastically reduced (one instance is analyzed mathematically, while the other is analyzed empirically) and pose several directions for follow-up work.

翻訳日:2022-01-31 16:36:42 公開日:2022-01-28

# alpa: 分散ディープラーニングのための操作間並列処理の自動化

Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning ( http://arxiv.org/abs/2201.12023v1 )

ライセンス: Link先を確認

Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, Yanping Huang, Yida Wang, Yuanzhong Xu, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica

(参考訳) Alpaは、データ、オペレータ、パイプライン並列性を統一する実行計画を生成することで、大規模なディープラーニング(DL)モデルのモデル並列トレーニングを自動化する。既存のモデル並列トレーニングシステムは、ユーザが手動で並列化計画を作成するか、モデル並列化設定の限られたスペースから自動的にモデルを生成する必要があるが、分散コンピューティングデバイス上で複雑なDLモデルをスケールアウトするのに十分ではない。 Alpaは、大きなDLモデルのトレーニングを、並列化を2つの階層レベルとして見ることによって配布する。これに基づいて、Alpaは大規模なモデル並列実行計画のための新しい階層空間を構築している。 Alpaは複数のコンパイルパスを設計し、各独立した並列処理レベルで最適な並列実行計画を自動的に導出し、分散コンピューティングデバイス上で2レベル並列実行をオーケストレーションする効率的なランタイムを実装している。評価の結果,alpaが設計したモデルでも,ハンドチューニング型モデル並列トレーニングシステムと一致するか,あるいは上回る並列化計画を生成することがわかった。特殊なシステムとは異なり、Alpaは手動設計の計画なしで異質なアーキテクチャやモデルを持つモデルに一般化する。

Alpa automates model-parallel training of large deep learning (DL) models by generating execution plans that unify data, operator, and pipeline parallelism. Existing model-parallel training systems either require users to manually create a parallelization plan or automatically generate one from a limited space of model parallelism configurations, which does not suffice to scale out complex DL models on distributed compute devices. Alpa distributes the training of large DL models by viewing parallelisms as two hierarchical levels: inter-operator and intra-operator parallelisms. Based on it, Alpa constructs a new hierarchical space for massive model-parallel execution plans. Alpa designs a number of compilation passes to automatically derive the optimal parallel execution plan in each independent parallelism level and implements an efficient runtime to orchestrate the two-level parallel execution on distributed compute devices. Our evaluation shows Alpa generates parallelization plans that match or outperform hand-tuned model-parallel training systems even on models they are designed for. Unlike specialized systems, Alpa also generalizes to models with heterogeneous architectures and models without manually-designed plans.

翻訳日:2022-01-31 16:34:38 公開日:2022-01-28

# MDCT領域における符号化音声の品質向上のためのDNNベースのポストフィルタ

A DNN Based Post-Filter to Enhance the Quality of Coded Speech in MDCT Domain ( http://arxiv.org/abs/2201.12039v1 )

ライセンス: Link先を確認

Kishan Gupta, Srikanth Korse, Bernd Edler, Guillaume Fuchs

(参考訳) 周波数領域処理、特にMDCT(Modified Discrete Cosine Transform)は、オーディオ符号化において最も広く使われている手法である。しかし、低ビットレートでは、特に音声の音声品質は、変換係数を直接コードする利用可能なビットがないため、劇的に劣化する。伝統的に、ポストフィルタは、ソースのa-priori情報と余分な送信パラメータを利用して、符号化された音声のアーティファクトを緩和するために使われてきた。近年、データ駆動のポストフィルタはより良い結果を示しているが、複雑さと遅延が大幅に増大している。本研究では,コーデックのmdctドメイン内で直接動作し,余分な遅延を生じさせないマスクベースのポストフィルタを提案する。実数値マスクは量子化MDCT係数に適用され、比較的軽量な畳み込みエンコーダ・デコーダネットワークから推定される。本手法は,最近標準化された低遅延低複雑コーデック (LC3) 上で16kbpsの最小ビットレートで試験する。目的的および主観的評価は従来のポストフィルタよりもこのアプローチの利点を示し,LC3符号化音声よりも平均10MUSHRA点が向上した。

Frequency domain processing, and in particular the use of the Modified Discrete Cosine Transform (MDCT), is the most widespread approach to audio coding. However, at low bitrates, audio quality, especially for speech, degrades drastically due to the lack of available bits to directly code the transform coefficients. Traditionally, post-filtering has been used to mitigate artefacts in the coded speech by exploiting a-priori information of the source and extra transmitted parameters. Recently, data-driven post-filters have shown better results, but at the cost of significant additional complexity and delay. In this work, we propose a mask-based post-filter operating directly in the MDCT domain of the codec, inducing no extra delay. The real-valued mask is applied to the quantized MDCT coefficients and is estimated from a relatively lightweight convolutional encoder-decoder network. Our solution is tested on the recently standardized low-delay, low-complexity codec (LC3) at the lowest possible bitrate of 16 kbps. Objective and subjective assessments clearly show the advantage of this approach over the conventional post-filter, with an average improvement of 10 MUSHRA points over the LC3 coded speech.

翻訳日:2022-01-31 16:34:18 公開日:2022-01-28

# 物理誘導型ニューラルネットワークによるフィードフォワード制御:トレーニングコスト正規化と最適化初期化

On feedforward control using physics-guided neural networks: Training cost regularization and optimized initialization ( http://arxiv.org/abs/2201.12088v1 )

ライセンス: Link先を確認

Max Bolderman, Mircea Lazar, Hans Butler

(参考訳) モデルベースフィードフォワードコントローラの性能は通常、逆系のダイナミクスモデルの精度によって制限される。物理誘導型ニューラルネットワーク(PGNN)は、同定された逆ダイナミクスの高精度化手法として最近提案されている。しかし、ニューラルネットワークのフレキシブルな性質は、物理モデルと並行して使用すると過パラメータ化を発生させ、トレーニング中にパラメータドリフトが発生する。このドリフトは、物理モデルのパラメータが物理値に対応しない可能性があるため、トレーニングデータに存在しない運用条件にpgnnの脆弱性が増大する。そこで本研究では, 同定された物理パラメータによる正規化法と, 学習収束を改善する最適化トレーニング初期化法を組み合わせることを提案する。正規化PGNNフレームワークは実生活の産業用リニアモータ上で検証され、追跡精度と外挿性が向上する。

Performance of model-based feedforward controllers is typically limited by the accuracy of the inverse system dynamics model. Physics-guided neural networks (PGNN), where a known physical model cooperates in parallel with a neural network, were recently proposed as a method to achieve high accuracy of the identified inverse dynamics. However, the flexible nature of neural networks can create overparameterization when employed in parallel with a physical model, which results in a parameter drift during training. This drift may result in parameters of the physical model not corresponding to their physical values, which increases vulnerability of the PGNN to operating conditions not present in the training data. To address this problem, this paper proposes a regularization method via identified physical parameters, in combination with an optimized training initialization that improves training convergence. The regularized PGNN framework is validated on a real-life industrial linear motor, where it delivers better tracking accuracy and extrapolation.

翻訳日:2022-01-31 16:33:59 公開日:2022-01-28

# バックドアがフロントドアに突っ込む:バックファイアのマルチエージェントバックドア攻撃

Backdoors Stuck At The Frontdoor: Multi-Agent Backdoor Attacks That Backfire ( http://arxiv.org/abs/2201.12211v1 )

ライセンス: Link先を確認

Siddhartha Datta, Nigel Shadbolt

(参考訳) 協調学習とアウトソースデータ収集における悪意あるエージェントは、クリーンモデルのトレーニングを脅かす。攻撃者が訓練中にモデルに毒を塗って標的の誤分類を成功させるバックドア攻撃は、列車時の堅牢性にとって大きな懸念事項である。本稿では,複数の攻撃者が同時に被害者モデルのバックドアを試みるマルチエージェントバックドア攻撃シナリオについて検討する。エージェントが集団攻撃の成功率の低いゲームで一貫したバックファイリング現象が観察される。バックドア攻撃の態様、非協調/協力、共同分散シフト、ゲーム設定の異なるモードを検証し、下位境界での平衡攻撃成功率を返却する。その結果,実践環境におけるバックドア防衛研究の再評価の動機となった。

Malicious agents in collaborative learning and outsourced data collection threaten the training of clean models. Backdoor attacks, where an attacker poisons a model during training to successfully achieve targeted misclassification, are a major concern to train-time robustness. In this paper, we investigate a multi-agent backdoor attack scenario, where multiple attackers attempt to backdoor a victim model simultaneously. A consistent backfiring phenomenon is observed across a wide range of games, where agents suffer from a low collective attack success rate. We examine different modes of backdoor attack configurations, non-cooperation / cooperation, joint distribution shifts, and game setups to return an equilibrium attack success rate at the lower bound. The results motivate the re-evaluation of backdoor defense research for practical environments.

翻訳日:2022-01-31 16:33:43 公開日:2022-01-28

# データ同化を用いたレベルセット法による海洋出口氷河の表面高さと終点位置のシミュレーション

Simulating surface height and terminus position for marine outlet glaciers using a level set method with data assimilation ( http://arxiv.org/abs/2201.12235v1 )

ライセンス: Link先を確認

M. Alamgir Hossain, Sam Pimentel, John M. Stockie

(参考訳) 本研究では,氷面と終端位置観測を数値氷流モデルに統合するデータ同化フレームワークを実装した。このモデルは、よく知られた浅層棚近似 (ssa) とレベルセット法を結合して氷の動きと氷河の幾何学の変化を捉えている。レベルセット法は、海洋出口氷河の発達する氷-大気圏と氷-海の境界を明示的に追跡する。氷の界面を記述するレベルセット関数を更新することにより,氷表面の標高と横方向の氷の深さの観測を統一するために,アンサンブル変換カルマンフィルタを用いる。理想的な海洋性氷河における数値実験は,季節・多年氷河の進行・後退サイクルを追跡するデータ同化手法の有効性を実証するものである。このモデルはまた、グリーンランド氷床の潮流を終止する主要な氷河であるヘルハイム氷河をシミュレートするためにも適用され、最近の急速な後退の歴史を経験した。リモートセンシングされた地表高度プロファイルからの観測を同化することで、移動氷河の終端と氷河の表面変化をより正確に追跡することができる。これらの結果は, 短期氷床力学のより正確な予測を行うためのデータ同化手法の利用を支持する。

We implement a data assimilation framework for integrating ice surface and terminus position observations into a numerical ice-flow model. The model uses the well-known shallow shelf approximation (SSA) coupled to a level set method to capture ice motion and changes in the glacier geometry. The level set method explicitly tracks the evolving ice-atmosphere and ice-ocean boundaries for a marine outlet glacier. We use an Ensemble Transform Kalman Filter to assimilate observations of ice surface elevation and lateral ice extent by updating the level set function that describes the ice interface. Numerical experiments on an idealized marine-terminating glacier demonstrate the effectiveness of our data assimilation approach for tracking seasonal and multi-year glacier advance and retreat cycles. The model is also applied to simulate Helheim Glacier, a major tidewater-terminating glacier of the Greenland Ice Sheet that has experienced a recent history of rapid retreat. By assimilating observations from remotely-sensed surface elevation profiles we are able to more accurately track the migrating glacier terminus and glacier surface changes. These results support the use of data assimilation methodologies for obtaining more accurate predictions of short-term ice sheet dynamics.

翻訳日:2022-01-31 16:33:29 公開日:2022-01-28

# 認証強化学習のための共同微分可能最適化と検証

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning ( http://arxiv.org/abs/2201.12243v1 )

ライセンス: Link先を確認

Yixuan Wang, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu

(参考訳) 安全クリティカル制御システムのためのモデルベース強化学習では、学習コントローラの下でシステム特性(例えば、安全性、安定性)を正式に認定することが重要である。しかし、既存の手法は一般に正式な検証を施すため、コントローラが学習されているため、学習と検証を何度も繰り返したとしても、証明書を得るのは難しいことがある。そこで,本稿では,価値関数や証明書から勾配によって微分可能な新しい二段階最適化問題を定式化・解決することにより,強化学習と形式検証を共同で行う枠組みを提案する。 svg(model-based stochastic value gradient)法やppo(model-free proximal policy optimization)法に比べて,バリア関数やリアプノフ関数によるシステム安全性と安定性を確保するための実現可能なコントローラを見つける上で,様々な例で実験を行った。

In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties (e.g., safety, stability) under the learned controller. However, as existing methods typically apply formal verification after the controller has been learned, it is sometimes difficult to obtain any certificate, even after many iterations between learning and verification. To address this challenge, we propose a framework that jointly conducts reinforcement learning and formal verification by formulating and solving a novel bilevel optimization problem, which is differentiable by the gradients from the value function and certificates. Experiments on a variety of examples demonstrate the significant advantages of our framework over the model-based stochastic value gradient (SVG) method and the model-free proximal policy optimization (PPO) method in finding feasible controllers with barrier functions and Lyapunov functions that ensure system safety and stability.

翻訳日:2022-01-31 16:33:08 公開日:2022-01-28

# 分散低減を伴う適応的加速(Extra-)勾配法

Adaptive Accelerated (Extra-)Gradient Methods with Variance Reduction ( http://arxiv.org/abs/2201.12302v1 )

ライセンス: Link先を確認

Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy L. Nguyen

(参考訳) 本稿では,一般凸の場合に着目した有限和凸最適化問題について検討する。近年, 分散低減(VR)法とその加速変種の研究は, 目覚ましい進歩を遂げている。しかし、既存のVRアルゴリズムで使用されるステップサイズは通常、滑らかさパラメータに依存しており、このパラメータは多くの場合未知で、実際にはチューニングを必要とする。この問題に対処するため,Adaptive Variance Reduced Accelerated Extra-Gradient (AdaVRAE) とAdaptive Variance Reduced Accelerated Gradient (AdaVRAG) の2つの新しい適応VRアルゴリズムを提案する。我々のアルゴリズムは滑らかさパラメータの知識を必要としない。 AdaVRAE は $\mathcal{O}\left(n\log\log n+\sqrt{\frac{n\beta}{\epsilon}}\right)$ 回の勾配評価,AdaVRAG は $\mathcal{O}\left(n\log\log n+\sqrt{\frac{n\beta\log\beta}{\epsilon}}\right)$ 回の勾配評価で $\mathcal{O}(\epsilon)$-準最適解を得る。ここで $n$ は有限和に含まれる関数の数,$\beta$ は滑らかさパラメータである。この結果は、非適応型VR手法の最もよく知られた収束率と一致し、最先端の適応型VR手法であるAdaSVRGの収束率を改善する。実世界のデータセットを用いた実験において,従来の手法と比較して本アルゴリズムの優れた性能を示す。

In this paper, we study the finite-sum convex optimization problem focusing on the general convex case. Recently, the study of variance reduced (VR) methods and their accelerated variants has made exciting progress. However, the step size used in the existing VR algorithms typically depends on the smoothness parameter, which is often unknown and requires tuning in practice. To address this problem, we propose two novel adaptive VR algorithms: Adaptive Variance Reduced Accelerated Extra-Gradient (AdaVRAE) and Adaptive Variance Reduced Accelerated Gradient (AdaVRAG). Our algorithms do not require knowledge of the smoothness parameter. AdaVRAE uses $\mathcal{O}\left(n\log\log n+\sqrt{\frac{n\beta}{\epsilon}}\right)$ gradient evaluations and AdaVRAG uses $\mathcal{O}\left(n\log\log n+\sqrt{\frac{n\beta\log\beta}{\epsilon}}\right)$ gradient evaluations to attain an $\mathcal{O}(\epsilon)$-suboptimal solution, where $n$ is the number of functions in the finite sum and $\beta$ is the smoothness parameter. This result matches the best-known convergence rate of non-adaptive VR methods and improves upon the convergence rate of the state-of-the-art adaptive VR method, AdaSVRG. We demonstrate the superior performance of our algorithms compared with previous methods in experiments on real-world datasets.

翻訳日:2022-01-31 16:32:46 公開日:2022-01-28

# 多数派支援の価格

The Price of Majority Support ( http://arxiv.org/abs/2201.12303v1 )

ライセンス: Link先を確認

Robin Fritsch and Roger Wattenhofer

(参考訳) 我々は,相互に独立した複数の二値トピックについて,個人集団の意見の妥協点を見つける問題を考察する。本稿では,結果が多数派支持を必要とすることによる代表性の喪失,すなわち「多数派支持の価格」を定量化する。各個人は、結果に同意するトピック数が同意しないトピック数以上である場合に、その結果を支持すると仮定する。我々の結果は、トピック別多数決の結果が多数派に支持されないかもしれないというアンスコムのパラドックスを定量化するものと見なすこともできる。結果の代表性を測定するために,2つの指標を検討する。まず、できるだけ多くのトピックについて多数派と一致する結果を探す。我々は、この数のトピックで多数派と一致し、かつ多数派が支持する結果が存在することが保証されるような最大数が $\lceil (t+1)/2 \rceil$ となることを証明する。ここで $t$ はトピックの総数である。第2に、あるトピックに対する投票者の意見が、そのトピックの結果と一致する回数を数える。目標は、多数派が支持する結果のうち一致数が最大のものを見つけることである。我々は,この数と,多数派の支持を得られないかもしれない全体最適な結果の一致数との比を考察する。我々は、多数派に支持され、かつ全体最適に対してこの比率の一致数を持つ結果が存在することが保証されるような最大比率を見出そうとする。 3つのトピックについて、この比率は $5/6\approx 0.83$ であることを示す。一般に、$t$ が無限大に近づくにつれて $2\sqrt{6}-4\approx 0.90$ に任意に近づく上界を証明する。さらに、より優れた上界と、それとは一致しない下界を、関連する範囲の $t$ について数値計算する。

We consider the problem of finding a compromise between the opinions of a group of individuals on a number of mutually independent, binary topics. In this paper, we quantify the loss in representativeness that results from requiring the outcome to have majority support, in other words, the "price of majority support". Each individual is assumed to support an outcome if they agree with the outcome on at least as many topics as they disagree on. Our results can also be seen as quantifying Anscombe's paradox, which states that the topic-wise majority outcome may not be supported by a majority. To measure the representativeness of an outcome, we consider two metrics. First, we look for an outcome that agrees with a majority on as many topics as possible. We prove that the maximum number such that there is guaranteed to exist an outcome that agrees with a majority on this number of topics and has majority support equals $\lceil (t+1)/2 \rceil$, where $t$ is the total number of topics. Second, we count the number of times a voter's opinion on a topic matches the outcome on that topic. The goal is to find the outcome with majority support with the largest number of matches. We consider the ratio between this number and the number of matches of the overall best outcome, which may not have majority support. We try to find the maximum ratio such that an outcome with majority support and this ratio of matches compared to the overall best is guaranteed to exist. For 3 topics, we show this ratio to be $5/6\approx 0.83$. In general, we prove an upper bound that comes arbitrarily close to $2\sqrt{6}-4\approx 0.90$ as $t$ tends to infinity. Furthermore, we numerically compute a better upper and a non-matching lower bound in the relevant range for $t$.
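To make the support rule and Anscombe's paradox concrete, the sketch below builds a small profile of five voters and three binary topics in which the topic-wise majority outcome is supported by only two voters. It then searches all outcomes for the one with majority support that agrees with the topic-wise majority on the most topics. The profile, the function names, and the reading of "majority" as a strict majority of voters are assumptions chosen for illustration; this is a brute-force check on one example, not the paper's proof technique.

```python
from itertools import product


def topic_wise_majority(profile):
    """Outcome obtained by taking the majority opinion on each topic separately."""
    n_voters = len(profile)
    n_topics = len(profile[0])
    return tuple(
        1 if 2 * sum(voter[j] for voter in profile) > n_voters else 0
        for j in range(n_topics)
    )


def supports(opinion, outcome):
    """A voter supports an outcome if they agree on at least as many topics as they disagree on."""
    agree = sum(a == b for a, b in zip(opinion, outcome))
    disagree = len(outcome) - agree
    return agree >= disagree


def has_majority_support(profile, outcome):
    """True if a strict majority of voters supports the outcome (assumed reading of 'majority')."""
    supporters = sum(supports(voter, outcome) for voter in profile)
    return 2 * supporters > len(profile)


# Hypothetical profile: 5 voters, 3 binary topics (1 = yes, 0 = no).
profile = [
    (1, 0, 0),
    (0, 1, 0),
    (0, 0, 1),
    (1, 1, 1),
    (1, 1, 1),
]

majority = topic_wise_majority(profile)                   # (1, 1, 1)
paradox = not has_majority_support(profile, majority)     # True: only 2 of 5 voters support it

# Among outcomes with majority support, find the best agreement with the topic-wise majority.
best_agreement = max(
    sum(a == b for a, b in zip(outcome, majority))
    for outcome in product((0, 1), repeat=len(majority))
    if has_majority_support(profile, outcome)
)
print(majority, paradox, best_agreement)  # (1, 1, 1) True 2
```

In this example the best supported outcome agrees with the topic-wise majority on 2 of the 3 topics, which is consistent with the guarantee $\lceil (t+1)/2 \rceil = 2$ for $t = 3$ stated in the abstract.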

翻訳日:2022-01-31 16:31:04 公開日:2022-01-28

# 独立系鎖を持つnドルプレイヤ確率ゲームにおける定性ナッシュ平衡ポリシのデュアルミラーディフレッシュによる学習

Learning Stationary Nash Equilibrium Policies in $n$-Player Stochastic Games with Independent Chains via Dual Mirror Descent ( http://arxiv.org/abs/2201.12224v1 )

ライセンス: Link先を確認

S. Rasoul Etesami

(参考訳) 我々は$n$プレーヤ確率ゲームのサブクラスについて検討し、プレイヤーはペイオフ関数を介して結合された状態で内部の状態/行動空間を持つ。プレイヤーの内部鎖は独立した遷移確率によって駆動されると仮定される。さらに、プレイヤーは実際の機能ではなく、それぞれのペイオフの実現しか受け取れず、お互いの状態や行動も観察できない。ペイオフ関数の構造に関するいくつかの仮定の下で、双対平均化と双対ミラー降下に基づく効率的な学習アルゴリズムを開発し、ほぼ確実に収束し、あるいは$\epsilon$-nash均衡ポリシーの集合に期待できる。特に、ゲームパラメーターの観点から多項式的にスケールするイテレートの数の上界を導出して、$\epsilon$-nash 平衡ポリシーを達成する。マルコフポテンシャルゲームや線型四進確率ゲームに加えて、この研究は、ある仮定の下では、$\epsilon$-Nash平衡ポリシーを見つけるために多項式時間学習アルゴリズムを確実に認める$n$プレイヤ確率ゲームの別の興味深いサブクラスを提供する。

We consider a subclass of $n$-player stochastic games, in which players have their own internal state/action spaces while they are coupled through their payoff functions. It is assumed that players' internal chains are driven by independent transition probabilities. Moreover, players can only receive realizations of their payoffs but not the actual functions, nor can they observe each other's states/actions. Under some assumptions on the structure of the payoff functions, we develop efficient learning algorithms based on Dual Averaging and Dual Mirror Descent, which provably converge almost surely or in expectation to the set of $\epsilon$-Nash equilibrium policies. In particular, we derive upper bounds on the number of iterates that scale polynomially in terms of the game parameters to achieve an $\epsilon$-Nash equilibrium policy. Besides Markov potential games and linear-quadratic stochastic games, this work provides another interesting subclass of $n$-player stochastic games that, under some assumptions, provably admit polynomial-time learning algorithms for finding their $\epsilon$-Nash equilibrium policies.

翻訳日:2022-01-31 16:30:36 公開日:2022-01-28

# (参考訳) 新しいダイナミックキャリブレーションによる予報と観測を組み合わせた短期風速アンサンブル予測のスキル向上

Increasing the skill of short-term wind speed ensemble forecasts combining forecasts and observations via a new dynamic calibration ( http://arxiv.org/abs/2201.12234v1 )

ライセンス: CC BY 4.0

Gabriele Casciaro, Francesco Ferrari, Daniele Lagomarsino Oneto, Andrea Lira-Loarca, Andrea Mazzino

(参考訳) 風力産業で使用される全ての数値気象予測モデルは、解析が利用可能になったら、メインのシナプス時間00,06,12,18 utcから予測を生成する必要がある。 2つの連続するモデル間の6時間の遅延時間は、少なくとも1時間の周波数を持つ新しい正確な予測を提供することで、ギャップを埋めるための戦略を要求する。これは、頻繁で正確で新鮮な情報をトレーダーやシステム規制当局から要求し、継続的に作業戦略を適用するために行われる。本稿では,準実時間観測風速と気象モデル予測を,新しいアンサンブルモデル出力統計(emos)戦略を用いて組み合わせる手法を提案する。本戦略の成功は,2018年と2019年のイタリア上空の観測風速との比較によって評価された。

All numerical weather prediction models used for the wind industry need to produce their forecasts starting from the main synoptic hours 00, 06, 12, and 18 UTC, once the analysis becomes available. The six-hour latency time between two consecutive model runs calls for strategies to fill the gap by providing new accurate predictions having, at least, hourly frequency. This is done to accommodate the request of frequent, accurate and fresh information from traders and system regulators to continuously adapt their work strategies. Here, we propose a strategy where quasi-real time observed wind speed and weather model predictions are combined by means of a novel Ensemble Model Output Statistics (EMOS) strategy. The success of our strategy is measured by comparisons against observed wind speed from SYNOP stations over Italy in the years 2018 and 2019.

翻訳日:2022-01-31 16:29:49 公開日:2022-01-28

# エンドツーエンドコード切り換え自動音声認識における言語コンテキスト混乱の低減

Reducing language context confusion for end-to-end code-switching automatic speech recognition ( http://arxiv.org/abs/2201.12155v1 )

ライセンス: Link先を確認

Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Jianhua Tao, Yu Ting Yeung, Liqun Deng

(参考訳) コードスイッチングは、コミュニケーションプロセスにおける代替言語を扱うことです。コードスイッチングのための訓練用エンドツーエンド(E2E)自動音声認識(ASR)システムは、複数の言語が存在するため、言語コンテキストの混乱によって複雑化するデータが少ないため、難しい問題であることが知られている。本稿では、等価制約理論(EC)に基づくE2E符号スイッチングASRモデルの多言語文脈混乱を低減するための言語関連注意機構を提案する。言語理論では、コードスイッチング文で発生する任意の単言語フラグメントは、一言語文の1つでなければならない。モノリンガルデータとコードスイッチングデータの間にブリッジを確立する。複数の言語のそれぞれの注意を計算することにより、豊かな単言語データから言語知識を効率的に伝達することができる。本手法をASRU 2019 Mandarin-English code-switching challengeデータセットで評価した。ベースラインモデルと比較して,提案手法は11.37%の相対混合誤差率低減を実現する。

Code-switching is about dealing with alternative languages in the communication process. Training end-to-end (E2E) automatic speech recognition (ASR) systems for code-switching is known to be a challenging problem because of the lack of data compounded by the increased language context confusion due to the presence of more than one language. In this paper, we propose a language-related attention mechanism to reduce multilingual context confusion for the E2E code-switching ASR model based on the Equivalence Constraint Theory (EC). The linguistic theory requires that any monolingual fragment that occurs in the code-switching sentence must occur in one of the monolingual sentences. It establishes a bridge between monolingual data and code-switching data. By calculating the respective attention of multiple languages, our method can efficiently transfer language knowledge from rich monolingual data. We evaluate our method on ASRU 2019 Mandarin-English code-switching challenge dataset. Compared with the baseline model, the proposed method achieves 11.37% relative mix error rate reduction.

翻訳日:2022-01-31 16:12:18 公開日:2022-01-28

# 協調運転自動化のためのインフラストラクチャに基づく物体検出と追跡:調査

Infrastructure-Based Object Detection and Tracking for Cooperative Driving Automation: A Survey ( http://arxiv.org/abs/2201.11871v1 )

ライセンス: Link先を確認

Zhengwei Bai, Guoyuan Wu, Xuewei Qi, Yongkang Liu, Kentaro Oguchi, Matthew J. Barth

(参考訳) オブジェクト検出は、現代交通システムの安全性、モビリティ、持続可能性問題に対処する革命的なソリューションであるCDA(Cooperative Driving Automation)を実現する上で、基本的な役割を果たす。現在のコンピュータビジョン技術は、咬合のないシナリオで十分な物体検出結果を提供できるが、搭載センサーの知覚性能は、範囲や咬合によって必然的に制限される可能性がある。センサ設置のための柔軟な位置とポーズのため、インフラストラクチャベースの検出および追跡システムは、コネクテッドカーの認識能力を高めることができ、すぐに最も人気のある研究トピックの1つとなる。本稿では,インフラに基づく物体検出・追跡システムの研究動向について述べる。各種センサに基づく道路サイドセンシングシステムのアーキテクチャをレビューし,インフラベースのセンシングシステムにおけるワークフローの高レベルな記述を示す。道路サイドセンサと異なる知覚方法論を詳細な文献でレビュー・分析し、特定の方法の低レベルな説明と、インフラストラクチャベースの物体検出と追跡方法の全体像を描くためのデータセットとシミュレータを提供する。議論は、現在の機会、オープンな問題、将来のトレンドを指摘するために行われます。

Object detection plays a fundamental role in enabling Cooperative Driving Automation (CDA), which is regarded as the revolutionary solution to addressing safety, mobility, and sustainability issues of contemporary transportation systems. Although current computer vision technologies could provide satisfactory object detection results in occlusion-free scenarios, the perception performance of onboard sensors could be inevitably limited by the range and occlusion. Owing to flexible position and pose for sensor installation, infrastructure-based detection and tracking systems can enhance the perception capability for connected vehicles and thus quickly become one of the most popular research topics. In this paper, we review the research progress for infrastructure-based object detection and tracking systems. Architectures of roadside perception systems based on different types of sensors are reviewed to show a high-level description of the workflows for infrastructure-based perception systems. Roadside sensors and different perception methodologies are reviewed and analyzed with detailed literature to provide a low-level explanation for specific methods followed by Datasets and Simulators to draw an overall landscape of infrastructure-based object detection and tracking methods. Discussions are conducted to point out current opportunities, open problems, and anticipated future trends.

翻訳日:2022-01-31 16:12:03 公開日:2022-01-28

# GAN生成顔画像の一般視品質評価

Generalized Visual Quality Assessment of GAN-Generated Face Images ( http://arxiv.org/abs/2201.11975v1 )

ライセンス: Link先を確認

Yu Tian and Zhangkai Ni and Baoliang Chen and Shiqi Wang and Hanli Wang and Sam Kwong

(参考訳) 近年では、gans(generative adversarial networks)による顔生成への関心が劇的に高まっている。異なるアプリケーションシナリオに対して鮮明な顔画像を生成するために、多くのganアルゴリズムが開発されている。しかし、そのようなGAN生成顔画像(GFI)の自動品質評価にはほとんど貢献していないが、GANモデルで生成されたGFIの一般化と堅牢な品質評価にはほとんど貢献していない。本稿では, GFIの汎用品質評価に向けて, 主観的, 客観的品質を研究するための最初の試みを行う。具体的には、4つのGANアルゴリズムのGFI、画像品質評価(IQA)尺度の擬似ラベル、および主観的テストによる人間の意見スコアからなる大規模データベースを構築した。その後,メタラーニングに基づくGANアルゴリズムを用いて,GFIの正確な品質予測を行うことができる品質評価モデルを開発した。特に、限られたGANアルゴリズムから生まれるGFIのペアから共有知識を学習するために、畳み込みブロック注意(CBA)と顔属性に基づく分析(ABA)モジュールを開発し、学習知識が人間の視覚的知覚と一致することを保証する。大規模実験により,提案モデルは最先端のIQAモデルと比較して性能が向上し,未知のGANアルゴリズムからGFIを評価する際の有効性を維持することができることがわかった。

Recent years have witnessed dramatically increased interest in face generation with generative adversarial networks (GANs). A number of successful GAN algorithms have been developed to produce vivid face images for different application scenarios. However, little work has been dedicated to automatic quality assessment of such GAN-generated face images (GFIs), and even less has been devoted to generalized and robust quality assessment of GFIs generated with unseen GAN models. Herein, we make the first attempt to study the subjective and objective quality towards generalized quality assessment of GFIs. More specifically, we establish a large-scale database consisting of GFIs from four GAN algorithms, the pseudo labels from image quality assessment (IQA) measures, as well as the human opinion scores via subjective testing. Subsequently, we develop a quality assessment model that is able to deliver accurate quality predictions for GFIs from both available and unseen GAN algorithms based on meta-learning. In particular, to learn shared knowledge from GFI pairs that are born of limited GAN algorithms, we develop the convolutional block attention (CBA) and facial attributes-based analysis (ABA) modules, ensuring that the learned knowledge tends to be consistent with human visual perception. Extensive experiments show that the proposed model achieves better performance compared with the state-of-the-art IQA models, and is capable of retaining its effectiveness when evaluating GFIs from unseen GAN algorithms.

翻訳日:2022-01-31 16:11:43 公開日:2022-01-28

# 深部畳み込みニューラルネットワークを用いた超音波画像における頸動脈壁セグメンテーション

Carotid artery wall segmentation in ultrasound image sequences using a deep convolutional neural network ( http://arxiv.org/abs/2201.12152v1 )

ライセンス: Link先を確認

Nolann Lainé, Guillaume Zahnd, Hervé Liebgott, Maciej Orkisz

(参考訳) 本研究の目的は, 総頸動脈の intima-media complex を縦断超音波画像上でセグメンテーションし, その厚みを計測することである。拡張(dilated)U-netネットワークに基づく教師付き領域ベースディープラーニングアプローチを含む完全自動領域ベースセグメンテーション手法を提案する。 2人の専門家が注釈付けした2176の画像からなるマルチセンターデータベース上で、5分割交差検証を用いてトレーニングと評価を行った。その結果,参照アノテーションと比較した平均絶対差(<120 µm)は,観察者間変動 (180 µm) よりも小さかった。 98.7%の成功率、すなわち手動修正を必要とする症例は1.3%に過ぎず、提案手法は堅牢であり、臨床応用に推奨される可能性がある。

The objective of this study is the segmentation of the intima-media complex of the common carotid artery, on longitudinal ultrasound images, to measure its thickness. We propose a fully automatic region-based segmentation method, involving a supervised region-based deep-learning approach based on a dilated U-net network. It was trained and evaluated using a 5-fold cross-validation on a multicenter database composed of 2176 images annotated by two experts. The resulting mean absolute difference (<120 µm) compared to reference annotations was less than the inter-observer variability (180 µm). With a 98.7% success rate, i.e., only 1.3% of cases requiring manual correction, the proposed method has been shown to be robust and thus may be recommended for use in clinical practice.

翻訳日:2022-01-31 16:11:18 公開日:2022-01-28

# VRT: ビデオ復元トランスフォーマー

VRT: A Video Restoration Transformer ( http://arxiv.org/abs/2201.12288v1 )

ライセンス: Link先を確認

Jingyun Liang and Jiezhang Cao and Yuchen Fan and Kai Zhang and Rakesh Ranjan and Yawei Li and Radu Timofte and Luc Van Gool

(参考訳) ビデオ復元(例: ビデオ超解像)は、低品質のフレームから高品質のフレームを復元することを目的としている。単一の画像復元とは異なり、ビデオ復元は通常、隣接しているが多くの場合位置がずれている複数のビデオフレームの時間的情報を利用する必要がある。既存のディープメソッドは、スライディングウィンドウ戦略やリカレントアーキテクチャを利用してこれに対処するが、フレーム毎の復元に制限されるか、長距離モデリング能力を欠いている。本稿では,並列フレーム予測と長距離時間依存性モデリング機能を備えたビデオ復元トランスフォーマー(VRT)を提案する。より具体的には、VRTは複数のスケールから構成されており、それぞれが時間的相互自己注意(TMSA)と並列ワープの2種類のモジュールで構成されている。 TMSAは動画を小さなクリップに分割し、相互注意を動き推定・特徴アライメント・特徴融合の同時処理に適用し、自己注意を特徴抽出に使用する。クリップ間のインタラクションを可能にするために、ビデオシーケンスを1層おきにシフトする。また、並列ワープは、隣接するフレームからの情報を並列特徴ワープによってさらに融合するために用いられる。ビデオ超解像、ビデオデブラーリング、ビデオデノイジングを含む3つのタスクの実験結果は、VRTが9つのベンチマークデータセットで最先端の手法よりも大きなマージン(最大2.16dB)で優れていることを示した。

Video restoration (e.g., video super-resolution) aims to restore high-quality frames from low-quality frames. Unlike single image restoration, video restoration generally requires utilizing temporal information from multiple adjacent but usually misaligned video frames. Existing deep methods generally tackle this by exploiting a sliding window strategy or a recurrent architecture, which is either restricted by frame-by-frame restoration or lacks long-range modelling ability. In this paper, we propose a Video Restoration Transformer (VRT) with parallel frame prediction and long-range temporal dependency modelling abilities. More specifically, VRT is composed of multiple scales, each of which consists of two kinds of modules: temporal mutual self attention (TMSA) and parallel warping. TMSA divides the video into small clips, on which mutual attention is applied for joint motion estimation, feature alignment and feature fusion, while self attention is used for feature extraction. To enable cross-clip interactions, the video sequence is shifted for every other layer. In addition, parallel warping is used to further fuse information from neighboring frames by parallel feature warping. Experimental results on three tasks, including video super-resolution, video deblurring and video denoising, demonstrate that VRT outperforms the state-of-the-art methods by large margins (up to 2.16 dB) on nine benchmark datasets.
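The clip partitioning and the shift applied on every other layer can be sketched as follows. This is a minimal illustration on frame indices only, with no attention computation. The function names, the clip size of 2, the shift of half a clip, and the use of a cyclic shift are all assumptions for illustration; the abstract states only that the video is divided into small clips and that the sequence is shifted for every other layer.

```python
def partition_into_clips(frames, clip_size, shift=0):
    """Cyclically shift the frame sequence by `shift`, then split it into consecutive clips."""
    n = len(frames)
    if n % clip_size != 0:
        raise ValueError("number of frames must be divisible by clip_size")
    shift %= n
    shifted = frames[shift:] + frames[:shift]
    return [shifted[i:i + clip_size] for i in range(0, n, clip_size)]


def clips_for_layer(frames, layer_index, clip_size=2):
    """Use an unshifted partition on even layers and a half-clip shift on odd layers (assumed scheme)."""
    shift = clip_size // 2 if layer_index % 2 == 1 else 0
    return partition_into_clips(frames, clip_size, shift)


frames = list(range(6))  # frame indices 0..5 stand in for per-frame features

print(clips_for_layer(frames, 0))  # [[0, 1], [2, 3], [4, 5]]
print(clips_for_layer(frames, 1))  # [[1, 2], [3, 4], [5, 0]]
```

With the unshifted partition alone, frames 1 and 2 never share a clip; alternating with the shifted partition places them together on the next layer, which is how information can propagate across clip boundaries as layers are stacked.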

翻訳日:2022-01-31 16:11:06 公開日:2022-01-28

# 完全自動NMRタンパク質構造決定のためのディープラーニングの活用

Leveraging deep learning for fully automated NMR protein structure determination ( http://arxiv.org/abs/2201.12041v1 )

ライセンス: Link先を確認

Piotr Klukowski, Roland Riek, Peter Güntert

(参考訳) 核磁気共鳴(NMR)分光法は、タンパク質データバンクに11800以上のタンパク質構造が登録されている、構造生物学における主要な技術の一つである。 NMRは溶液、生体細胞、固体中の中小タンパク質の構造と動態を解明することができるが、退屈なデータ解析プロセスによって制限されている。通常、NMR測定をタンパク質構造に変えるには、訓練された専門家の手作業が数週間から数ヶ月かかる。このプロセスの自動化は、30年以上前にこの分野で定式化された未解決の問題である。ここでは、この課題に対処する最初のアプローチを示す。本手法ARTINAはNMRスペクトルとタンパク質配列のみを入力として使用し,人間の介入を一切必要とせずに構造を出力する。 100タンパク質のベンチマーク (1329 2D/3D/4D NMRスペクトル) で、ARTINAは、PDB参照構造に対するRMSD中央値 1.44 Å,NMR共鳴帰属の正解率 91.36% で構造を解く能力を示した。 ARTINAは非専門家でも使用することができ、NMRによるタンパク質構造決定の労力を、基本的に試料の調製とスペクトルの測定だけに削減することができる。

Nuclear Magnetic Resonance (NMR) spectroscopy is one of the major techniques in structural biology with over 11800 protein structures deposited in the Protein Data Bank. NMR can elucidate structures and dynamics of small and medium size proteins in solution, living cells, and solids, but has been limited by the tedious data analysis process. It typically requires weeks or months of manual work by a trained expert to turn NMR measurements into a protein structure. Automation of this process is an open problem, formulated in the field over 30 years ago. Here, we present the first approach that addresses this challenge. Our method, ARTINA, uses as input only NMR spectra and the protein sequence, delivering a structure strictly without any human intervention. Tested on a 100-protein benchmark (1329 2D/3D/4D NMR spectra), ARTINA demonstrated its ability to solve structures with 1.44 Å median RMSD to the PDB reference and 91.36% correct NMR resonance assignments. ARTINA can be used by non-experts, reducing the effort for a protein structure determination by NMR essentially to the preparation of the sample and the spectra measurements.

翻訳日:2022-01-31 16:08:08 公開日:2022-01-28

# 生成型ガイトネット

Generative GaitNet ( http://arxiv.org/abs/2201.12044v1 )

ライセンス: Link先を確認

Jungnam Park, Sehee Min, Phil Sik Chang, Jaedong Lee, Moonseok Park, Jehee Lee

(参考訳) 解剖学的構造と歩行の関係を理解することは、予測歩行シミュレーションの成功の鍵となる。本稿では,304本のHill型筋腱を持つ包括的な全身筋骨格モデルを制御するための,深層強化学習に基づく新しいネットワークアーキテクチャであるGenerative GaitNetを提案する。 Generative Gaitは、解剖学的条件(例えば、質量分布、体比、骨変形、筋肉の欠損など)と歩行条件(例えば、歩幅やケイデンス)からなる618次元連続領域で学習された、人工ニューラルネットワークの事前学習済みの統合システムである。事前学習されたGaitNetは、入力として解剖学的条件と歩行条件を取り、物理に基づくシミュレーションを通じて条件に合った一連の歩行サイクルを生成する。我々は,実時間の物理ベースシミュレーションにおいて,健常および病的なさまざまなヒトの歩行を生成するGenerative GaitNetの有効性と表現力を示す。

Understanding the relation between anatomy and gait is key to successful predictive gait simulation. In this paper, we present Generative GaitNet, which is a novel network architecture based on deep reinforcement learning for controlling a comprehensive, full-body, musculoskeletal model with 304 Hill-type musculotendons. The Generative Gait is a pre-trained, integrated system of artificial neural networks learned in a 618-dimensional continuous domain of anatomy conditions (e.g., mass distribution, body proportion, bone deformity, and muscle deficits) and gait conditions (e.g., stride and cadence). The pre-trained GaitNet takes anatomy and gait conditions as input and generates a series of gait cycles appropriate to the conditions through physics-based simulation. We will demonstrate the efficacy and expressive power of Generative GaitNet to generate a variety of healthy and pathologic human gaits in real-time physics-based simulation.

翻訳日:2022-01-31 16:07:49 公開日:2022-01-28

# HEAT: ハイパーエッジアテンションネットワーク

HEAT: Hyperedge Attention Networks ( http://arxiv.org/abs/2201.12113v1 )

ライセンス: Link先を確認

Dobrik Georgiev, Marc Brockschmidt, Miltiadis Allamanis

(参考訳) 構造化データからの学習は、コア機械学習タスクである。一般的に、そのようなデータはグラフとして表現され、通常はノードのペア間の(型付けされた)バイナリ関係しか考慮しない。これは、高度に構造化されたデータを持つ多くのドメインにとって実質的な制限である。このようなドメインの重要な1つはソースコードであり、ハイパーグラフベースの表現は、セマンティックにリッチで構造化されたコードの性質をよりよく捉えることができる。本稿では,型付きハイパーグラフと資格付きハイパーグラフを表現可能なニューラルモデルであるHEATについて述べる。これは、メッセージパッシングニューラルネットワークとトランスフォーマーの両方の一般化と見なすことができる。本稿では,プログラムのハイパーグラフ表現を用いた知識ベース補完とバグ検出と修復について評価する。どちらの設定でも、強力なベースラインよりも優れており、そのパワーと汎用性を示している。

Learning from structured data is a core machine learning task. Commonly, such data is represented as graphs, which normally only consider (typed) binary relationships between pairs of nodes. This is a substantial limitation for many domains with highly-structured data. One important such domain is source code, where hypergraph-based representations can better capture the semantically rich and structured nature of code. In this work, we present HEAT, a neural model capable of representing typed and qualified hypergraphs, where each hyperedge explicitly qualifies how participating nodes contribute. It can be viewed as a generalization of both message passing neural networks and Transformers. We evaluate HEAT on knowledge base completion and on bug detection and repair using a novel hypergraph representation of programs. In both settings, it outperforms strong baselines, indicating its power and generality.
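
As a rough illustration of what a "typed and qualified" hypergraph could look like as a data structure, the sketch below stores each hyperedge as an edge type plus a set of (qualifier, node) pairs. The class names, the `add_edge` helper, and the encoding of a function call are all hypothetical and chosen for illustration; they are not taken from the HEAT paper or its code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hyperedge:
    """A typed hyperedge whose members are tagged with a qualifier (role)."""
    edge_type: str
    members: tuple  # tuple of (qualifier, node) pairs


@dataclass
class Hypergraph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, edge_type, **qualified_nodes):
        # Each keyword is a qualifier; its value is the participating node.
        members = tuple(sorted(qualified_nodes.items()))
        self.nodes.update(qualified_nodes.values())
        self.edges.append(Hyperedge(edge_type, members))

    def incident(self, node):
        """Yield (edge_type, qualifier) for every hyperedge touching node."""
        for edge in self.edges:
            for qualifier, member in edge.members:
                if member == node:
                    yield edge.edge_type, qualifier


# Hypothetical encoding of `z = foo(x, y)`-style program facts.
g = Hypergraph()
g.add_edge("call", callee="foo", arg0="x", arg1="y")
g.add_edge("assign", target="z", value="x")
```

A binary-edge graph would need three separate edges (and lose the grouping) to express the single `call` hyperedge above, which is the limitation the abstract points to.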

翻訳日:2022-01-31 16:07:33 公開日:2022-01-28

# (参考訳) risknet: 信頼できない資源のネットワークにおける神経リスク評価

RiskNet: Neural Risk Assessment in Networks of Unreliable Resources ( http://arxiv.org/abs/2201.12263v1 )

ライセンス: CC BY 4.0

Krzysztof Rusek, Piotr Boryło, Piotr Jaglarz, Fabien Geyer, Albert Cabellos, Piotr Chołda

(参考訳) 作業経路とバックアップ経路間で共有されるリソースによって接続が保護される通信ネットワークにおいて、障害によって引き起こされる罰則の分布を予測するグラフニューラルネットワーク(GNN)に基づく手法を提案する。 GNNベースのアルゴリズムは、Barab\'asi-Albertモデルで生成されたランダムグラフでのみ訓練される。しかし, 得られた実験結果から, 既存の様々なトポロジにおいて, ペナルティを正確にモデル化できることが示唆された。 GNNは、研究中のネットワークトポロジの複雑な停止シナリオをシミュレートする必要がない。実際には、設計操作は現代のハードウェアでは4msに制限されている。このようにして、12,000回以上のスピード改善を達成できます。

We propose a graph neural network (GNN)-based method to predict the distribution of penalties induced by outages in communication networks, where connections are protected by resources shared between working and backup paths. The GNN-based algorithm is trained only with random graphs generated with the Barabási-Albert model. Even so, the obtained test results show that we can precisely model the penalties in a wide range of existing topologies. GNNs eliminate the need to simulate complex outage scenarios for the network topologies under study. In practice, the whole design operation takes at most 4 ms on modern hardware. This way, we can achieve a speedup of over 12,000 times.
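
The training graphs mentioned above come from the Barabási-Albert model, in which each new node attaches to existing nodes with probability proportional to their degree. The following is a minimal standard-library sketch of that generative process; the function name, the star-shaped seed graph, and the parameters are illustrative assumptions and do not reproduce the authors' data pipeline.

```python
import random


def barabasi_albert(n, m, seed=0):
    """Return an edge list for an n-node preferential-attachment graph."""
    rng = random.Random(seed)
    # Seed graph: a star on m + 1 nodes, so every node has degree >= 1.
    edges = [(0, j) for j in range(1, m + 1)]
    # Each node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` selects a node with probability ~ its degree.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges
```

For example, `barabasi_albert(50, 2)` yields a 50-node graph in which every node added after the seed contributes exactly two edges.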

翻訳日:2022-01-31 16:06:03 公開日:2022-01-28

# ニューロンの勾配降下とその近似2次最適化への応用

Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization ( http://arxiv.org/abs/2201.12250v1 )

ライセンス: Link先を確認

Frederik Benzing

(参考訳) 二階オプティマイザはニューラルネットワークのトレーニングを高速化する可能性を持っていると考えられているが、曲率行列の巨大さのため、計算的に扱いやすい近似が必要となる。最も成功した近似の族はクロネッカー因子付きブロック対角曲率推定 (kfac) である。ここでは、事前の作業から得られたツールを組み合わせて、正確な2次更新を評価するとともに、驚くべき結果を得るための注意深いアブレーションを行う: その近似のため、kfacは2次更新と密接に関連しておらず、特に、真の2次更新よりも大幅に優れています。この課題は広く信じられており、なぜKFACがうまく機能するのかという疑問を即座に提起している。我々は、KFACが重みよりもニューロンに勾配降下を行う1次アルゴリズムを近似していることを示し、この問題に答える。最後に、この最適化は計算コストとデータ効率の観点から、KFACよりも良くなることを示す。

Second-order optimizers are thought to hold the potential to speed up neural network training, but due to the enormous size of the curvature matrix, they typically require approximations to be computationally tractable. The most successful family of approximations are Kronecker-Factored, block-diagonal curvature estimates (KFAC). Here, we combine tools from prior work to evaluate exact second-order updates with careful ablations to establish a surprising result: Due to its approximations, KFAC is not closely related to second-order updates, and in particular, it significantly outperforms true second-order updates. This challenges widely held beliefs and immediately raises the question of why KFAC performs so well. We answer this question by showing that KFAC approximates a first-order algorithm, which performs gradient descent on neurons rather than weights. Finally, we show that this optimizer often improves over KFAC in terms of computational cost and data-efficiency.
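
For readers unfamiliar with the Kronecker-factored approximation, the standard form for a single fully connected layer can be written as follows. The notation (layer input $a$, back-propagated output gradient $g$, weight matrix $W$ of shape out by in, column-stacking $\operatorname{vec}$) is an assumption made here for illustration and may differ from the paper's.

```latex
% Gradient of one layer: \nabla_W L = g a^\top, so vec(\nabla_W L) = a \otimes g.
F = \mathbb{E}\!\left[(a \otimes g)(a \otimes g)^\top\right]
  = \mathbb{E}\!\left[a a^\top \otimes g g^\top\right]
  \approx \underbrace{\mathbb{E}[a a^\top]}_{A} \otimes \underbrace{\mathbb{E}[g g^\top]}_{G}

% The Kronecker structure makes the preconditioned update cheap to compute:
(A \otimes G)^{-1} \operatorname{vec}(\nabla_W L)
  = \operatorname{vec}\!\left(G^{-1} \, \nabla_W L \, A^{-1}\right)
```

The second identity is why only the two small factors $A$ and $G$ need to be inverted, rather than the full curvature block.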

翻訳日:2022-01-31 15:53:25 公開日:2022-01-28

# 差分プライバシーによるImageNetスケールのトレーニングに向けて

Toward Training at ImageNet Scale with Differential Privacy ( http://arxiv.org/abs/2201.12328v1 )

ライセンス: Link先を確認

Alexey Kurakin, Steve Chien, Shuang Song, Roxana Geambasu, Andreas Terzis, Abhradeep Thakurta

(参考訳) 差分プライバシー(DP)は、ニューラルネットワークを含む機械学習(ML)モデルをトレーニングするためのデファクトスタンダードであり、トレーニングセット内の個々のサンプルのプライバシを保証する。 MLモデルを異なるプライバシでトレーニングする方法に関する豊富な文献があるにも関わらず、現実の大規模ニューラルネットワークを適切な精度とプライバシの両方でトレーニングすることは、依然として極めて難しい。そこで私たちは,イメージネット画像分類をMLタスクのポスター例として用いて,DPで正確に解決することが非常に困難である点を調査した。本論文は,我々の取り組みから得た最初の教訓を共有し,他の研究者に大規模にdpトレーニングを探求するよう促し,知らせることを目的としている。 DPトレーニングを高速化する上で有効なアプローチと、DPに向いているトレーニングプロセスのモデルタイプと設定を示す。この方法を組み合わせることで、差分プライバシーを持つResnet-18を47.9%の精度とプライバシパラメータにトレーニングすることができる。$\epsilon = 10, \delta = 10^{-6}$, ImagenetモデルのDP-SGDトレーニングよりも大幅に改善されるが、同じネットワークがプライバシなしで取得できる7,5\%の正確さとは程遠い。私たちはコードをhttps://github.com/google-research/dp-imagenetで共有しています。

Differential privacy (DP) is the de facto standard for training machine learning (ML) models, including neural networks, while ensuring the privacy of individual examples in the training set. Despite a rich literature on how to train ML models with differential privacy, it remains extremely challenging to train real-life, large neural networks with both reasonable accuracy and privacy. We set out to investigate how to do this, using ImageNet image classification as a poster example of an ML task that is very challenging to resolve accurately with DP right now. This paper shares initial lessons from our effort, in the hope that it will inspire and inform other researchers to explore DP training at scale. We show approaches which help to make DP training faster, as well as model types and settings of the training process that tend to work better for DP. Combined, the methods we discuss let us train a ResNet-18 with differential privacy to 47.9% accuracy and privacy parameters $\epsilon = 10, \delta = 10^{-6}$, a significant improvement over "naive" DP-SGD training of ImageNet models but a far cry from the $75\%$ accuracy that can be obtained by the same network without privacy. We share our code at https://github.com/google-research/dp-imagenet, calling for others to join us in moving the needle further on DP at scale.
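
The DP-SGD baseline named in the abstract is commonly described as clipping each per-example gradient and then adding Gaussian noise to the sum. The sketch below shows that aggregation step with plain Python lists. The function and parameter names are illustrative assumptions, privacy accounting for $\epsilon$ and $\delta$ is omitted, and this is not the code from the linked repository.

```python
import math
import random


def dp_sgd_aggregate(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each gradient to L2 norm <= clip_norm, sum, add noise, average."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        # Scale down only gradients whose norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, x in enumerate(grad):
            total[i] += x * scale
    # Noise standard deviation is tied to the clipping bound (sensitivity).
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]
```

Clipping bounds the influence of any single example, which is what lets the added noise translate into a privacy guarantee; it is also a common explanation for the accuracy gap the abstract reports.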

翻訳日:2022-01-31 15:53:10 公開日:2022-01-28

# 地域最適化

Regionalized optimization ( http://arxiv.org/abs/2201.11876v1 )

ライセンス: Link先を確認

Grégoire Sergeant-Perthuis

(参考訳) Yedidia, Freeman, Weiss の参照論文 "Constructing Free Energy Approximations and Generalized Belief Propagation Algorithms" において、一般エネルギーの一般エネルギー(Generalized Bethe Free Energy)と呼ばれる領域ベースの自由エネルギー近似を導入することにより、一般エネルギー伝播の根底にある変動原理が存在することを示した。彼らは一般信条の固定点がこの自由エネルギーの臨界点であることの証明をスケッチし、この証明はペルトルの論文で完成した。本稿では,局所最適化問題のパッチングとして定義される最適化問題のクラスと,アルゴリズムの臨界点と固定点との対応が成立するメッセージパッシングアルゴリズムを特定する。このフレームワークには多くのアプリケーションがあり、そのうちの1つはフィルタデータのためのpcaであり、領域確率に対する確率的互換性の制約を伴うmaxentの領域ベース近似である。このようなアプローチは、マルチモーダルな統合による推論や、複数のビューを持つシーンでの推論に特に当てはまる。

Yedidia, Freeman, and Weiss showed in their reference article, "Constructing Free Energy Approximations and Generalized Belief Propagation Algorithms", that there is a variational principle underlying Generalized Belief Propagation, by introducing a region-based free energy approximation of the MaxEnt free energy, which we will call the Generalized Bethe free energy. They sketched a proof that fixed points of Generalized Belief Propagation are critical points of this free energy; this proof was completed in the thesis of Peltre. In this paper we identify a class of optimization problems, defined by patching together local optimization problems, and associated message passing algorithms for which such a correspondence between critical points and fixed points of the algorithms holds. This framework has many applications, including a PCA for filtered data and a region-based approximation of MaxEnt with stochastic compatibility constraints on the region probabilities. Such an approach is particularly suited to inference with multimodal integration and to inference on scenes with multiple views.

翻訳日:2022-01-31 15:52:22 公開日:2022-01-28

# 二重メタ模倣学習による階層構造伝達

Transfering Hierarchical Structure with Dual Meta Imitation Learning ( http://arxiv.org/abs/2201.11981v1 )

ライセンス: Link先を確認

Chongkai Gao, Yizhou Jiang, Feng Chen

(参考訳) 階層的模倣学習(hil)は、ロボットが長いホリゾンのデモからサブスキルを学ぶ効果的な方法である。しかし、学習された階層構造は、マルチタスクや新しいタスクに転送するメカニズムを欠いているため、新しい状況に直面した時にスクラッチから学ぶ必要がある。モジュラーサブスキルの転送と再構成は階層構造全体の迅速な適応能力を必要とする。本研究では,ハイレベルネットワークとサブスキルをモデルに依存しないメタ学習で反復的にメタ学習する階層的メタ模倣学習法であるDual Meta Imitation Learning (DMIL)を提案する。 DMILは、各サブスキルからのステートアクションペアの可能性をハイレベルネットワーク適応の監督に利用し、適応されたハイレベルネットワークを使用して、サブスキル適応毎に異なるデータセットを決定する。我々は,DMILの反復学習過程の収束を理論的に証明し,DMILと期待最大化アルゴリズムの接続を確立する。実験により,Meta-world \cite{metaworld} ベンチマークによる最先端数発の模倣学習性能と,Kitchen 環境の長期タスクにおける競合結果が得られた。

Hierarchical Imitation Learning (HIL) is an effective way for robots to learn sub-skills from long-horizon unsegmented demonstrations. However, the learned hierarchical structure lacks a mechanism to transfer across multiple tasks or to new tasks, so it has to be learned from scratch when facing a new situation. Transferring and reorganizing modular sub-skills requires fast adaptation of the whole hierarchical structure. In this work, we propose Dual Meta Imitation Learning (DMIL), a hierarchical meta imitation learning method where the high-level network and sub-skills are iteratively meta-learned with model-agnostic meta-learning. DMIL uses the likelihood of state-action pairs from each sub-skill as the supervision for the high-level network adaptation, and uses the adapted high-level network to determine a different data set for each sub-skill adaptation. We theoretically prove the convergence of the iterative training process of DMIL and establish the connection between DMIL and the Expectation-Maximization algorithm. Empirically, we achieve state-of-the-art few-shot imitation learning performance on the Meta-world \cite{metaworld} benchmark and competitive results on long-horizon tasks of Kitchen environments.

翻訳日:2022-01-31 15:52:02 公開日:2022-01-28

# エンドツーエンド音声認識のためのニューラルFSTクラス言語モデル

Neural-FST Class Language Model for End-to-End Speech Recognition ( http://arxiv.org/abs/2201.11867v1 )

ライセンス: Link先を確認

Antoine Bruguier, Duc Le, Rohit Prabhavalkar, Dangna Li, Zhe Liu, Bo Wang, Eun Chang, Fuchun Peng, Ozlem Kalinli, Michael L. Seltzer

(参考訳) ニューラルネットワーク言語モデル(NNLM)と有限状態トランスデューサ(FST)を数学的に一貫した枠組みで組み合わせた,エンドツーエンド音声認識のためのニューラルFSTクラス言語モデル(NFCLM)を提案する。提案手法は,汎用的な背景テキストをモデル化するバックグラウンドNNLMと,個別FSTとしてモデル化されたドメイン固有エンティティのコレクションを利用する。それぞれの出力トークンはこれらの成分の混合によって生成され、混合重みは個別に訓練された神経決定器で推定される。その結果,NFCLMは単語誤り率においてNNLMを15.8%上回っていることがわかった。 NFCLM は従来の NNLM や FST の浅層核融合と同等の性能を保ちながら、オーバーバイアスや12倍のコンパクトさを保ち、デバイス上での使用に適している。

We propose the Neural-FST Class Language Model (NFCLM) for end-to-end speech recognition, a novel method that combines neural network language models (NNLMs) and finite state transducers (FSTs) in a mathematically consistent framework. Our method utilizes a background NNLM which models generic background text together with a collection of domain-specific entities modeled as individual FSTs. Each output token is generated by a mixture of these components; the mixture weights are estimated with a separately trained neural decider. We show that NFCLM significantly outperforms NNLM by 15.8% relative in terms of Word Error Rate. NFCLM achieves performance similar to traditional NNLM and FST shallow fusion while being less prone to overbiasing and 12 times more compact, making it more suitable for on-device usage.
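
The statement that each output token is generated by a mixture of components can be illustrated with a toy calculation. In the sketch below every component is a plain dictionary of token probabilities and the mixture weights are fixed numbers. In the paper the components are an NNLM and FSTs and the weights come from a neural decider, so the names and the flat-dictionary simplification here are assumptions for illustration only.

```python
def mixture_prob(token, components, weights):
    """P(token) = sum_k weights[k] * P_k(token)."""
    return sum(w * dist.get(token, 0.0) for w, dist in zip(weights, components))


# Hypothetical components: generic background text and a contact-name class.
background = {"call": 0.6, "play": 0.4}
contacts = {"alice": 0.5, "bob": 0.5}
weights = [0.7, 0.3]  # would be predicted per step by the decider

p_bob = mixture_prob("bob", [background, contacts], weights)
```

Because the weights sum to one and each component is normalized, the mixture is itself a valid distribution over the combined vocabulary.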

翻訳日:2022-01-31 15:51:41 公開日:2022-01-28

# DiffGAN-TTS: 拡散GANを用いた高忠実かつ効率的なテキスト音声合成

DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs ( http://arxiv.org/abs/2201.11972v1 )

ライセンス: Link先を確認

Songxiang Liu, Dan Su, Dong Yu

(参考訳) 拡散確率モデル (DDPM) は、様々な音声合成問題を解くために用いられてきた表現的生成モデルである。しかし,サンプリングコストが高いため,リアルタイム音声処理ではDDPMの使用が困難である。本稿では,DiffGAN-TTSについて紹介する。DiffGAN-TTSは,高忠実で効率的な音声合成を実現する新しいDDPMベースのテキスト音声合成モデルである。 DiffGAN-TTSは拡散生成逆数ネットワーク (GAN) をデノナイズし、デノナイズ分布を近似するために逆向きに訓練された表現モデルを採用する。 DiffGAN-TTSは4ステップで高忠実度音声サンプルを生成可能であることを示す。さらに, 推定を高速化するために, アクティブな浅層拡散機構を提案する。ステージ2で訓練されたDDPMに貴重な事前情報を提供する基本的TSS音響モデルを用いて,2段階のトレーニングスキームを提案する。実験の結果,DiffGAN-TTSは1段階のみの高合成性能が得られることがわかった。

Denoising diffusion probabilistic models (DDPMs) are expressive generative models that have been used to solve a variety of speech synthesis problems. However, because of their high sampling costs, DDPMs are difficult to use in real-time speech processing applications. In this paper, we introduce DiffGAN-TTS, a novel DDPM-based text-to-speech (TTS) model achieving high-fidelity and efficient speech synthesis. DiffGAN-TTS is based on denoising diffusion generative adversarial networks (GANs), which adopt an adversarially-trained expressive model to approximate the denoising distribution. We show with multi-speaker TTS experiments that DiffGAN-TTS can generate high-fidelity speech samples within only 4 denoising steps. We present an active shallow diffusion mechanism to further speed up inference. A two-stage training scheme is proposed, with a basic TTS acoustic model trained at stage one providing valuable prior information for a DDPM trained at stage two. Our experiments show that DiffGAN-TTS can achieve high synthesis performance with only 1 denoising step.
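
To make the sampling-cost argument concrete, the standard DDPM forward process and the denoising-diffusion-GAN parameterization of the reverse step can be written as below. The symbols ($\beta_t$, $\bar\alpha_t$, generator $G_\theta$, latent $z$) follow common usage in the diffusion literature and are assumptions here; conditioning on text and speaker is omitted, so this is not the exact DiffGAN-TTS formulation.

```latex
% Forward (noising) process with variance schedule \beta_1, \dots, \beta_T:
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar\alpha_t}\, x_0,\ (1 - \bar\alpha_t) I\right),
\qquad \bar\alpha_t = \prod_{s=1}^{t} (1 - \beta_s)

% Reverse step modeled implicitly by an adversarially trained generator:
p_\theta(x_{t-1} \mid x_t) = \int p_\theta(x_0 \mid x_t)\, q(x_{t-1} \mid x_t, x_0)\, dx_0,
\qquad x_0 = G_\theta(x_t, z, t)
```

With few, large steps the true denoising distribution is no longer close to Gaussian, which is the usual motivation for replacing a Gaussian reverse step with an expressive GAN-based one.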

翻訳日:2022-01-31 15:51:24 公開日:2022-01-28

# (参考訳) ermよりも一般化された重み付けが改善しない理由を理解する

Understanding Why Generalized Reweighting Does Not Improve Over ERM ( http://arxiv.org/abs/2201.12293v1 )

ライセンス: CC BY 4.0

Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar

(参考訳) 経験的リスク最小化(experimental risk minimization, erm)は、トレーニングとテスト分布が異なる分散シフトに対する非ロバストであることが知られている。重み付けや分散ロバスト最適化(DRO)の変種といった一連のアプローチがこの問題を解決するために提案されている。しかし、最近の一連の研究は、分散シフトを伴う実際のアプリケーションにおいて、これらのアプローチがermを大幅に改善していないことを実証している。この研究の目的は、この興味深い現象の総合的な理論的理解を得ることである。まず、トレーニングサンプルの反復的再重み付けに基づいてモデルパラメータを反復的に更新するアプローチの幅広いカテゴリとして、一般再重み付け(GRW)アルゴリズムのクラスを仮定する。 GRWで過度パラメータ化モデルをトレーニングした場合,得られたモデルはERMで得られたモデルに近いことを示す。また,経験的トレーニング精度に大きく影響しない小さな正規化を加えることは,効果がないことを示す。以上より,grwアプローチの幅広いカテゴリは,分布的にロバストな一般化を実現することができないことを示した。分布的に堅牢な一般化に向けて進むためには、非GRWアプローチを開発するか、あるいはGRWアプローチのクラスに適応した新しい分類/回帰損失関数を考案する必要がある。

Empirical risk minimization (ERM) is known in practice to be non-robust to distributional shift, where the training and the test distributions are different. A suite of approaches, such as importance weighting and variants of distributionally robust optimization (DRO), have been proposed to solve this problem. But a line of recent work has empirically shown that these approaches do not significantly improve over ERM in real applications with distribution shift. The goal of this work is to obtain a comprehensive theoretical understanding of this intriguing phenomenon. We first posit the class of Generalized Reweighting (GRW) algorithms, as a broad category of approaches that iteratively update model parameters based on iterative reweighting of the training samples. We show that when overparameterized models are trained under GRW, the resulting models are close to those obtained by ERM. We also show that adding small regularization which does not greatly affect the empirical training accuracy does not help. Together, our results show that a broad category of what we term GRW approaches are not able to achieve distributionally robust generalization. Our work thus has the following sobering takeaway: to make progress towards distributionally robust generalization, we either have to develop non-GRW approaches, or perhaps devise novel classification/regression loss functions that are adapted to the class of GRW approaches.
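
One way to write the reweighting scheme described above is as a gradient step on a weighted empirical risk whose sample weights may change at every iteration. The notation below (weights $q_i^{(t)}$, step size $\eta$, per-sample loss $\ell$) is assumed for illustration and may not match the paper's exact definition.

```latex
\theta^{(t+1)} = \theta^{(t)} - \eta \sum_{i=1}^{n} q_i^{(t)}\,
  \nabla_\theta\, \ell\!\left(f_{\theta^{(t)}}(x_i),\, y_i\right),
\qquad q_i^{(t)} \ge 0,\quad \sum_{i=1}^{n} q_i^{(t)} = 1

% ERM is the special case of uniform, static weights:
q_i^{(t)} = \tfrac{1}{n} \quad \text{for all } i, t
```

Importance weighting corresponds to fixed non-uniform weights, while DRO-style methods adapt the weights toward high-loss samples or groups during training.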

翻訳日:2022-01-31 15:50:26 公開日:2022-01-28

# コストボリュームに基づくスパース不均質伝播によるステレオマッチング

Stereo Matching with Cost Volume based Sparse Disparity Propagation ( http://arxiv.org/abs/2201.11937v1 )

ライセンス: Link先を確認

Wei Xue and Xiaojiang Peng

(参考訳) 両眼立体視にはステレオマッチングが不可欠である。既存の手法は主に、立体マッチングを改善するための単純な不均等写像の融合に焦点を当てている。本稿では,特徴分散伝播と呼ばれる,シンプルながら斬新な手法を提案し,コストの一致量とスパースマッチング特徴点に基づく一般的なステレオマッチングを改善する。具体的には、まず、局所的な特徴マッチングにより、信頼性の高いスパース不一致マップを計算し、その後、マッチングコスト領域において、隣接する画素に信頼性のある不一致を伝播することにより、その不一致マップを洗練する。また,局所的格差領域の勾配および多スケール情報を考慮して,コスト集約ステップがなくてもコストボリュームのロバスト性を保証するad-censusに基づく$\rho$-censusコスト尺度を提案する。ミドルベリーステレオベンチマークv3における広範囲な実験により,提案手法が最先端手法に匹敵する有望な性能を実現することを実証した。

Stereo matching is crucial for binocular stereo vision. Existing methods mainly focus on simple disparity map fusion to improve stereo matching, which requires multiple dense or sparse disparity maps. In this paper, we propose a simple yet novel scheme, termed feature disparity propagation, to improve general stereo matching based on matching cost volume and sparse matching feature points. Specifically, our scheme first calculates a reliable sparse disparity map by local feature matching, and then refines the disparity map by propagating reliable disparities to neighboring pixels in the matching cost domain. In addition, considering the gradient and multi-scale information of local disparity regions, we present a $\rho$-Census cost measure based on the well-known AD-Census, which guarantees the robustness of cost volume even without the cost aggregation step. Extensive experiments on the Middlebury stereo benchmark V3 demonstrate that our scheme achieves promising performance comparable to state-of-the-art methods.
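
The census part of a census-based matching cost can be sketched in a few lines: each pixel is described by a bit string recording which neighbors are darker than it, and two pixels are compared by Hamming distance. The code below shows only this classic 3x3 census cost on nested lists. It is not the paper's $\rho$-Census measure, whose gradient and multi-scale terms are not specified in the abstract, and the function names are illustrative.

```python
def census(img, r, c):
    """3x3 census signature of pixel (r, c): one bit per neighbor."""
    center = img[r][c]
    bits = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            # Bit is 1 when the neighbor is darker than the center pixel.
            bits = (bits << 1) | (1 if img[r + dr][c + dc] < center else 0)
    return bits


def census_cost(left, right, r, c, d):
    """Hamming distance between left (r, c) and right (r, c - d)."""
    return bin(census(left, r, c) ^ census(right, r, c - d)).count("1")
```

A cost volume is obtained by evaluating `census_cost` for every pixel and every candidate disparity `d`; sparse, reliable disparities could then be propagated to neighbors by preferring low-cost entries.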

翻訳日:2022-01-31 15:47:52 公開日:2022-01-28

# 教師なし人物再識別のためのクラスタアンサンブルを用いたハイブリッドコントラスト学習

Hybrid Contrastive Learning with Cluster Ensemble for Unsupervised Person Re-identification ( http://arxiv.org/abs/2201.11995v1 )

ライセンス: Link先を確認

He Sun, Mingkun Li, Chun-Guang Li

(参考訳) unsupervised person re-identification (reid) は、歩行者の問合せ画像とギャラリーセットの画像とを、監督ラベルなしでマッチングすることを目的としている。教師なしのreidに取り組む最も一般的なアプローチは、通常、クラスタリングアルゴリズムを実行して疑似ラベルを生成し、疑似ラベルを利用してディープニューラルネットワークをトレーニングすることです。しかし、疑似ラベルは、クラスタリングアルゴリズムのハイパーパラメータ(s)に対してノイズと感度がある。本稿では,インスタンスレベルのコントラスト損失関数とクラスタレベルのコントラスト損失関数のハイブリッドに基づく教師なしreidのためのハイブリッドコントラスト学習(hcl)手法を提案する。さらに,多粒度クラスタリング型ハイブリッドコントラスト学習(MGCE-HCL)アプローチを提案する。この手法は,擬似正のサンプルペア間の優先情報をマイニングするために,多粒度クラスタリングアンサンブル戦略を採用し,擬似正のサンプルのノイズを許容するための優先重み付きハイブリッドコントラスト損失を定義する。ベンチマークデータセットである Market-1501 と DukeMTMC-reID について広範な実験を行った。提案の有効性を実験的に検証した。

Unsupervised person re-identification (ReID) aims to match a query image of a pedestrian to the images in the gallery set without supervision labels. The most popular approaches to unsupervised person ReID usually first run a clustering algorithm to yield pseudo labels and then exploit the pseudo labels to train a deep neural network. However, the pseudo labels are noisy and sensitive to the hyper-parameter(s) of the clustering algorithm. In this paper, we propose a Hybrid Contrastive Learning (HCL) approach for unsupervised person ReID, which is based on a hybrid between instance-level and cluster-level contrastive loss functions. Moreover, we present a Multi-Granularity Clustering Ensemble based Hybrid Contrastive Learning (MGCE-HCL) approach, which adopts a multi-granularity clustering ensemble strategy to mine priority information among the pseudo positive sample pairs and defines a priority-weighted hybrid contrastive loss for better tolerating the noise in the pseudo positive samples. We conduct extensive experiments on two benchmark datasets, Market-1501 and DukeMTMC-reID. Experimental results validate the effectiveness of our proposals.
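
An instance-level contrastive loss is often implemented in the InfoNCE form: the query is pulled toward one positive embedding and pushed away from a set of negatives through a softmax over cosine similarities. The sketch below assumes that form with a temperature parameter. The paper's hybrid and priority-weighted losses are not reproduced, and all names here are illustrative.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(query, positive, negatives, temperature=0.1):
    """-log softmax score of the positive among positive + negatives."""
    sims = [cosine(query, positive)] + [cosine(query, n) for n in negatives]
    logits = [s / temperature for s in sims]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]
```

A cluster-level variant would compare the query against cluster centroids instead of individual samples, using the pseudo labels to decide which centroid counts as the positive.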

翻訳日:2022-01-31 15:47:32 公開日:2022-01-28

# ぼやけたイメージの展開

Unfolding a blurred image ( http://arxiv.org/abs/2201.12010v1 )

ライセンス: Link先を確認

Kuldeep Purohit, Anshul Shah, A. N. Rajagopalan

(参考訳) 本稿では,1つの動きのぼやけた画像から映像を抽出し,露光時にカメラが保持するシーンの明快な視点を順次再構成することを目的とする。まず,ビデオ再構成のサロゲートタスクを実行する畳み込み再生ビデオオートエンコーダネットワークのトレーニングを通じて,シャープビデオからの映像表現を教師なしで学習する。訓練後、ぼやけた画像のためのモーションエンコーダのガイドトレーニングに使用される。このネットワークは、ぼやけた画像から埋め込み動作情報を抽出し、トレーニングされたリカレントビデオデコーダとともにシャープビデオを生成する。中間的なステップとして,リアルタイムの単一画像の分解と,精度,速度,コンパクト性など,競合するすべての要因に対する性能向上が可能な効率的なアーキテクチャを設計する。実際のシーンと標準データセットに関する実験は、最先端のフレームワークの優位性と、時間的に一貫性のあるシャープフレームの生成能力を示しています。

We present a solution for the goal of extracting a video from a single motion blurred image to sequentially reconstruct the clear views of a scene as beheld by the camera during the time of exposure. We first learn motion representation from sharp videos in an unsupervised manner through training of a convolutional recurrent video autoencoder network that performs a surrogate task of video reconstruction. Once trained, it is employed for guided training of a motion encoder for blurred images. This network extracts embedded motion information from the blurred image to generate a sharp video in conjunction with the trained recurrent video decoder. As an intermediate step, we also design an efficient architecture that enables real-time single image deblurring and outperforms competing methods across all factors: accuracy, speed, and compactness. Experiments on real scenes and standard datasets demonstrate the superiority of our framework over the state-of-the-art and its ability to generate a plausible sequence of temporally consistent sharp frames.

翻訳日:2022-01-31 15:47:11 公開日:2022-01-28

# 注意誘導フレームアソシエーションを用いたRGB-D SLAM

RGB-D SLAM Using Attention Guided Frame Association ( http://arxiv.org/abs/2201.12047v1 )

ライセンス: Link先を確認

Ali Caglayan, Nevrez Imamoglu, Oguzhan Guclu, Ali Osman Serhatoglu, Weimin Wang, Ahmet Burak Can, Ryosuke Nakamura

(参考訳) 新たなトピックとしてのディープラーニングモデルは、さまざまな分野で大きな進歩を見せている。特に、クラスアクティベーションマッピング法のような可視化ツールは、畳み込みニューラルネットワーク(CNN)の推論に関する視覚的な説明を提供する。ネットワーク層の勾配を用いることで、特定の画像認識タスク中にネットワークがどこに注意を払っているかを示すことができる。さらに、これらの勾配はcnnの機能と統合でき、シーン内のより一般化されたタスク依存注意(salient)オブジェクトをローカライズすることができる。この進歩にもかかわらず、オブジェクトセマンティクスのcnn表現と統合するために、この勾配(ネットワークの注意)情報はあまり明確には使われていない。これは、空間的に注意すべき物体位置のCNN表現が性能改善につながるような、同時局所化とマッピング(SLAM)のような視覚的タスクに非常に有用である。そこで本研究では,RGB-D屋内SLAMにおけるタスク固有ネットワークアテンションの利用を提案する。そこで我々は,最新のRGB-D屋内SLAM法において,CNN層表現とレイヤワイドオブジェクトアテンション情報(レイヤ勾配)を統合し,フレームアソシエーション性能を向上させる。実験はパフォーマンスを向上して有望な初期結果を示す。

Deep learning models as an emerging topic have shown great progress in various fields. Especially, visualization tools such as class activation mapping methods provided visual explanation on the reasoning of convolutional neural networks (CNNs). By using the gradients of the network layers, it is possible to demonstrate where the networks pay attention during a specific image recognition task. Moreover, these gradients can be integrated with CNN features for localizing more generalized task dependent attentive (salient) objects in scenes. Despite this progress, there is not much explicit usage of this gradient (network attention) information to integrate with CNN representations for object semantics. This can be very useful for visual tasks such as simultaneous localization and mapping (SLAM) where CNN representations of spatially attentive object locations may lead to improved performance. Therefore, in this work, we propose the use of task specific network attention for RGB-D indoor SLAM. To do so, we integrate layer-wise object attention information (layer gradients) with CNN layer representations to improve frame association performance in a state-of-the-art RGB-D indoor SLAM method. Experiments show promising initial results with improved performance.

翻訳日:2022-01-31 15:46:56 公開日:2022-01-28

# ビデオ中の偽顔の検出

Detection of fake faces in videos ( http://arxiv.org/abs/2201.12051v1 )

ライセンス: Link先を確認

M. Shamanth, Russel Mathias, Dr Vijayalakshmi MN

(参考訳) ディープ・ラーニング・方法論は、プライバシー、民主主義、国家安全保障に脅威をもたらし、悪意のある活動をさらに増幅するアプリケーションを作成するために用いられてきた。ディープラーニングを利用した最近のアプリケーションの一つが、有名人格の合成ビデオだ。 Forbesによると、GAN(Generative Adversarial Networks)は、毎年急速に成長するフェイクビデオを生成し、Deeptraceとして知られる組織は2018年から2019年にかけて、ディープフェイクを84%増加させたと見積もっている。それらは人間の顔の生成と修正に使われており、既存のフェイクビデオのほとんどは、その推定では96%、一部ではサイバー犯罪の個人性を偽造している。本稿では、利用可能なビデオデータセットを特定し、顔検出にプリトレーニングモデルblazefaceを使用し、データセット上でトレーニングされたresnetおよびxceptionアンサンブルアーキテクチャドニューラルネットワークを使用して、ビデオ中の偽顔検出の目標を達成する。このモデルは損失値とログ損失値よりも最適化され、そのf1スコアで評価される。データのサンプルでは、焦点損失のガンマがハイパーパラメータとなるにつれて、焦点損失がより良い精度、F1のスコアと損失をもたらすことが観察された。これにより、トレーニングサイクルのピーク時のk折りの精度は約91%となり、実際の精度はモデルが崩壊するにつれて経時的に変化する。

Deep learning methodologies have been used to create applications that can threaten privacy, democracy, and national security and could be used to further amplify malicious activities. One such deep learning-powered application in recent times is synthesized videos of famous personalities. According to Forbes, fake videos generated by Generative Adversarial Networks (GANs) are growing exponentially every year, and the organization Deeptrace estimated an 84% increase in deepfakes from 2018 to 2019. GANs are used to generate and modify human faces; most existing fake videos, an estimated 96%, are of a prurient, non-consensual nature, and some impersonate personalities for cyber crime. In this paper, available video datasets are identified, a pretrained BlazeFace model is used to detect faces, and an ensemble of ResNet and Xception networks is trained on the dataset to detect fake faces in videos. The model is optimized with respect to loss and log-loss values and evaluated by its F1 score. On a sample of the data, focal loss is observed to provide better accuracy, F1 score, and loss when the gamma of the focal loss is treated as a hyperparameter. This yields a k-fold accuracy of around 91% at its peak in a training cycle, with real-world accuracy subject to change over time as the model decays.
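
The focal loss with its gamma hyperparameter, as referred to above, has a simple closed form for binary classification. The sketch below uses the standard definition without the optional class-balancing weight; the function name and default value are illustrative assumptions rather than the authors' training code.

```python
import math


def focal_loss(p, y, gamma=2.0):
    """Binary focal loss for predicted probability p of class 1, label y."""
    p_t = p if y == 1 else 1.0 - p
    # (1 - p_t) ** gamma down-weights examples that are already easy.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

With `gamma=0` this reduces to ordinary cross-entropy, so tuning gamma controls how strongly training concentrates on hard, misclassified faces.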

翻訳日:2022-01-31 15:46:36 公開日:2022-01-28

# デジタル顔画像操作検出における人の心理的評価

Psychophysical Evaluation of Human Performance in Detecting Digital Face Image Manipulations ( http://arxiv.org/abs/2201.12084v1 )

ライセンス: Link先を確認

Robert Nichols, Christian Rathgeb, Pawel Drozdowski, Christoph Busch

(参考訳) 近年では、国境管理や法執行など、セキュリティ上重要な設定における顔認識技術の導入が増加し、デジタル操作された顔画像に基づいて発行される正統な文書を利用する攻撃に対する顔認識システムの脆弱性にかなりの関心が寄せられている。自動操作と攻撃検出は依然として困難な課題であり、人間の検査者が身元確認を行う従来のプロセスは不可欠である。これらの状況は、操作された顔画像を検出する人間の能力をより深く研究する上で有効であり、この分野での以前の研究は疎く、しばしば特定のシナリオや生体特性にのみ集中している。本研究は、心理物理学の分野から採用されている原則に基づき、webベースの遠隔視覚識別実験を行い、その後、顔の交換、モーフィング、リタッチなど、様々な種類のデジタル操作された顔画像の検出における人間の習熟度を調べることを目的として、学際的な機会について論じる。適切な性能測定値の解析に加えて,検出可能性の指標も検討した。 306個のプロバンドによる実験データによると、検出性能は個体群全体に広く分布しており、特定の種類の顔画像操作の検出は他よりもはるかに困難である。

In recent years, increasing deployment of face recognition technology in security-critical settings, such as border control or law enforcement, has led to considerable interest in the vulnerability of face recognition systems to attacks utilising legitimate documents, which are issued on the basis of digitally manipulated face images. As automated manipulation and attack detection remains a challenging task, conventional processes with human inspectors performing identity verification remain indispensable. These circumstances merit a closer investigation of human capabilities in detecting manipulated face images, as previous work in this field is sparse and often concentrated only on specific scenarios and biometric characteristics. This work introduces a web-based, remote visual discrimination experiment on the basis of principles adopted from the field of psychophysics and subsequently discusses interdisciplinary opportunities with the aim of examining human proficiency in detecting different types of digitally manipulated face images, specifically face swapping, morphing, and retouching. In addition to analysing appropriate performance measures, a possible metric of detectability is explored. Experimental data of 306 probands indicate that detection performance is widely distributed across the population and detection of certain types of face image manipulations is much more challenging than others.
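
In psychophysics, a common detectability measure for yes/no discrimination tasks is the sensitivity index d', computed from the hit rate and the false-alarm rate. The abstract does not say which metric the authors explore, so the sketch below is only an illustration of that standard index under the usual equal-variance Gaussian assumption; the function name is hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)
```

Rates of exactly 0 or 1 are outside the domain of `inv_cdf` and would need a correction (for example, a small offset) before calling this function.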

翻訳日:2022-01-31 15:46:11 公開日:2022-01-28

# 点クラウド登録のための近傍対応幾何エンコーディングネットワーク

Neighborhood-aware Geometric Encoding Network for Point Cloud Registration ( http://arxiv.org/abs/2201.12094v1 )

ライセンス: Link先を確認

Lifa Zhu, Haining Guan, Changwei Lin, Renmin Han

(参考訳) 幾何的特徴の区別は、点雲登録の成功を決定する。しかし、ほとんどの点の雲は部分的に重なり、ノイズによって破損し、識別不能な表面で構成されているため、識別的特徴を抽出することは困難である。本稿では,正確なポイントクラウド登録のためのNighborhood-aware Geometric Encoding Network (NgeNet)を提案する。 NgeNetは幾何学的特徴を考慮に入れた幾何学的ガイド付き符号化モジュール、異なるスケールで意味的にリッチな領域にフォーカスするマルチスケールアーキテクチャ、および適切な近傍サイズを持つ特徴を選択し、特異な特徴を拒絶する一貫した投票戦略を利用する。適応的な近傍点の認識は、投票を伴うマルチスケールアーキテクチャを通して得られる。具体的には、NgeNetの手法はモデルに依存しないため、他のネットワークに容易に移行できる。屋内、屋外、およびオブジェクト中心の合成データセットに関する総合的な実験は、NgeNetが公開された最先端の手法をすべて超越していることを示している。コードはhttps://github.com/zhulf0804/NgeNetで入手できる。

The distinguishing geometric features determine the success of point cloud registration. However, most point clouds are partially overlapping, corrupted by noise, and comprised of indistinguishable surfaces, which makes it a challenge to extract discriminative features. Here, we propose the Neighborhood-aware Geometric Encoding Network (NgeNet) for accurate point cloud registration. NgeNet utilizes a geometric guided encoding module to take geometric characteristics into consideration, a multi-scale architecture to focus on the semantically rich regions in different scales, and a consistent voting strategy to select features with proper neighborhood size and reject the specious features. The awareness of adaptive neighborhood points is obtained through the multi-scale architecture accompanied by voting. Specifically, the proposed techniques in NgeNet are model-agnostic, which could be easily migrated to other networks. Comprehensive experiments on indoor, outdoor and object-centric synthetic datasets demonstrate that NgeNet surpasses all of the published state-of-the-art methods. The code will be available at https://github.com/zhulf0804/NgeNet.

翻訳日:2022-01-31 15:45:48 公開日:2022-01-28

# HSADML:脳腫瘍分類のための超球面角深度に基づく学習

HSADML: Hyper-Sphere Angular Deep Metric based Learning for Brain Tumor Classification ( http://arxiv.org/abs/2201.12269v1 )

ライセンス: Link先を確認

Aman Verma and Vibhav Prakash Singh

(参考訳) 脳腫瘍は脳の領域を貫通する集合細胞の異常な質量である。タイムリーな識別と分類は、医師が適切な治療を行うのに役立つ。しかし、脳腫瘍の分類は、高いイントラクラス類似性と低インタークラス変動のため、かなり複雑である。様々なMRIクラスにおける形態的類似性のため、課題はさらに深まる。この全ては分類モデルの一般化を妨げている。そこで本稿では,SphereFace Loss を用いた深度メトリック学習(DML)を実現する新しいフレームワークであるHSADMLを提案する。 sphereface lossは、機能を超球面マニフォールドに組み込み、埋め込みにマージンを課し、クラス間の差別化性を高める。 sphereface lossベースのディープメトリック学習を利用することで、クラスからのサンプルがクラスタ化され、異なるサンプルがプッシュされる。 k-nn (k=1) を用いた98.69%の検証 accu-racy を達成し,98.47%の検証精度が得られたもののクラス間分離性とクラス内近接性が制限された通常のソフトマックス損失トレーニングよりも有意に高い。様々な分類器と損失関数セットによる実験的解析は、アプローチの可能性を示唆している。

Brain tumors are abnormal masses of clustered cells penetrating regions of the brain. Their timely identification and classification help doctors to provide appropriate treatment. However, classification of brain tumors is quite intricate because of high intra-class similarity and low inter-class variability. Morphological similarity among MRI slices of different classes deepens the challenge further. All of this hampers the generalizability of classification models. To this end, this paper proposes HSADML, a novel framework which enables deep metric learning (DML) using SphereFace loss. SphereFace loss embeds the features into a hyperspheric manifold and then imposes a margin on the embeddings to enhance differentiability between the classes. Utilizing SphereFace-loss-based deep metric learning ensures that samples from the same class are clustered together while different ones are pushed apart. Results reflect the prominence of the approach: the proposed framework achieved state-of-the-art 98.69% validation accuracy using k-NN (k=1), which is significantly higher than normal SoftMax loss training, which obtains 98.47% validation accuracy but with limited inter-class separability and intra-class closeness. Experimental analysis over various classifiers and loss function settings suggests potential in the approach.
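
Once embeddings live on a hypersphere, a k-NN classifier with k=1 reduces to returning the label of the training embedding with the smallest angle to the query. The sketch below assumes cosine similarity as the comparison, which the abstract does not confirm, and uses made-up two-dimensional embeddings and label names purely for illustration.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def predict_1nn(query, embeddings, labels):
    """Return the label of the embedding with the smallest angle to query."""
    best = max(range(len(embeddings)),
               key=lambda i: cosine_similarity(query, embeddings[i]))
    return labels[best]
```

Because only the angle matters, rescaling a query embedding does not change its predicted class, which fits the angular-margin view of SphereFace-style training.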

翻訳日:2022-01-31 15:44:35 公開日:2022-01-28

# DAB-DETR: DETRのための動的アンカーボックス

DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR ( http://arxiv.org/abs/2201.12329v1 )

ライセンス: Link先を確認

Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi, Hang Su, Jun Zhu, Lei Zhang

(参考訳) 本稿では,DETR(DEtection TRansformer)のための動的アンカーボックスを用いた新しいクエリ定式化を提示し,DETRにおけるクエリの役割についてより深い理解を与える。この新たな定式化は、Transformerデコーダのクエリとしてボックス座標を直接使用し、層ごとに動的に更新する。ボックス座標を用いることで,明示的な位置の事前情報を利用してクエリと特徴の類似性を向上させ,DETRの遅いトレーニング収束問題を解消できるだけでなく,ボックスの幅と高さの情報を用いて位置注意マップを変調することも可能になる。このような設計により、DETRにおけるクエリは、カスケード方式で層ごとにソフトROIプーリングを行うものとして実装できることが明らかになる。その結果、同じ設定下のDETR系検出モデルの中でMS-COCOベンチマークにおける最高の性能が得られた。例えば、ResNet50-DC5をバックボーンとして50エポックで学習した場合、APは45.7%である。また,分析を確認し本手法の有効性を検証するため,広範な実験を行った。コードは https://github.com/SlongLiu/DAB-DETR で入手できる。

We present in this paper a novel query formulation using dynamic anchor boxes for DETR (DEtection TRansformer) and offer a deeper understanding of the role of queries in DETR. This new formulation directly uses box coordinates as queries in Transformer decoders and dynamically updates them layer-by-layer. Using box coordinates not only helps exploit explicit positional priors to improve the query-to-feature similarity and eliminate the slow training convergence issue in DETR, but also allows us to modulate the positional attention map using the box width and height information. Such a design makes it clear that queries in DETR can be implemented as performing soft ROI pooling layer-by-layer in a cascade manner. As a result, it leads to the best performance on the MS-COCO benchmark among the DETR-like detection models under the same setting, e.g., AP 45.7% using ResNet50-DC5 as backbone trained in 50 epochs. We also conducted extensive experiments to confirm our analysis and verify the effectiveness of our methods. Code is available at https://github.com/SlongLiu/DAB-DETR.

翻訳日:2022-01-31 15:44:10 公開日:2022-01-28

# 不完全対称力学に対する近似同変ネットワーク

Approximately Equivariant Networks for Imperfectly Symmetric Dynamics ( http://arxiv.org/abs/2201.11969v1 )

ライセンス: Link先を確認

Rui Wang, Robin Walters, Rose Yu

(参考訳) ニューラルネットワークアーキテクチャに帰納バイアスとして対称性を組み込むことで、ダイナミクスモデリングにおける汎化、データ効率、物理的一貫性が向上してきた。CNNや同変ニューラルネットワークのような手法では、シフト不変性や回転同変性といった対称性を強制するために重み共有を用いる。しかし、物理法則が多くの対称性に従うにもかかわらず、実世界の力学データは、ノイズや不完全なデータ、あるいは基礎となる力学系における対称性の破れのために、厳密な数学的対称性にほとんど従わない。本研究では、対称性の保存に偏ってはいるが厳密には制約されない、近似同変ネットワークを探究する。同変性の制約を緩和することにより、我々のモデルは、シミュレーションされた乱流領域と実世界のマルチストリームジェット流の両方において、対称性バイアスのないベースラインと過度に厳密な対称性を持つベースラインの両方を上回りうることが分かった。

Incorporating symmetry as an inductive bias into neural network architecture has led to improvements in generalization, data efficiency, and physical consistency in dynamics modeling. Methods such as CNN or equivariant neural networks use weight tying to enforce symmetries such as shift invariance or rotational equivariance. However, despite the fact that physical laws obey many symmetries, real-world dynamical data rarely conforms to strict mathematical symmetry either due to noisy or incomplete data or to symmetry breaking features in the underlying dynamical system. We explore approximately equivariant networks which are biased towards preserving symmetry but are not strictly constrained to do so. By relaxing equivariance constraints, we find that our models can outperform both baselines with no symmetry bias and baselines with overly strict symmetry in both simulated turbulence domains and real-world multi-stream jet flow.
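
The difference between strict and relaxed equivariance can be made concrete with a small numerical check. The sketch below is an illustration under assumptions, not the architecture from the paper: a circular 1D convolution stands in for a strictly shift-equivariant layer, and the hypothetical `relaxed_conv` adds a small position-dependent term (weight `eps`) that breaks the symmetry slightly.

```python
import numpy as np

def circular_conv(x, kernel):
    """Circular 1D convolution with shared weights (strictly shift-equivariant)."""
    n = len(x)
    return np.array([
        sum(kernel[j] * x[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ])

def relaxed_conv(x, kernel, gain, eps):
    """Shared-weight convolution plus a small position-dependent term (assumed form)."""
    return circular_conv(x, kernel) + eps * gain * x

def equivariance_error(f, x, shift=1):
    """Largest deviation between f(shift(x)) and shift(f(x))."""
    return float(np.max(np.abs(f(np.roll(x, shift)) - np.roll(f(x), shift))))

rng = np.random.default_rng(0)
x = rng.normal(size=16)
kernel = np.array([0.25, 0.5, 0.25])
gain = np.linspace(0.0, 1.0, 16)   # position-dependent weights, not shared
eps = 0.1

strict_err = equivariance_error(lambda v: circular_conv(v, kernel), x)
relaxed_err = equivariance_error(lambda v: relaxed_conv(v, kernel, gain, eps), x)
```

Here `strict_err` vanishes while `relaxed_err` is small but nonzero and scales with `eps`, which is the sense in which such a layer is biased towards the symmetry without being constrained by it.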

翻訳日:2022-01-31 15:41:28 公開日:2022-01-28

# グラフニューラルネットワークによる未知の物理系シミュレーションの学習

Learning to Simulate Unseen Physical Systems with Graph Neural Networks ( http://arxiv.org/abs/2201.11976v1 )

ライセンス: Link先を確認

Ce Yang, Weihao Gao, Di Wu, Chong Wang

(参考訳) 物理システムのダイナミクスのシミュレーションは、科学と工学の両方の発展に不可欠である。近年,ニューラルネットワークを用いた物理システムのダイナミクスをシミュレートする学習への関心が高まっている。しかし、既存のアプローチでは、粘度の異なる液体や弾性の異なるエラストマーなど、トレーニングセットにない物質に一般化することができない。本稿では,多種多様なシナリオにおいて異なる物質の物理力学を効率的にモデル化するために,物理量と物質パラメータを組み込んだ機械学習手法であるgraph-based physics engine(gpe)を提案する。 GPEはトレーニングセットにない異なる特性を持つ材料に一般化でき、シングルステップ予測からマルチステップロールアウトシミュレーションまでよく機能することを示した。さらに、モデルに運動量保存の法則を導入することで、学習の効率と安定性が大幅に向上し、トレーニングステップの少ないより良いモデルへの収束が可能になる。

Simulation of the dynamics of physical systems is essential to the development of both science and engineering. Recently, there has been increasing interest in learning to simulate the dynamics of physical systems using neural networks. However, existing approaches fail to generalize to physical substances not in the training set, such as liquids with different viscosities or elastomers with different elasticities. Here we present a machine learning method embedded with physical priors and material parameters, which we term the "Graph-based Physics Engine" (GPE), to efficiently model the physical dynamics of different substances in a wide variety of scenarios. We demonstrate that GPE can generalize to materials with different properties not seen in the training set and perform well from single-step predictions to multi-step roll-out simulations. In addition, introducing the law of momentum conservation in the model significantly improves the efficiency and stability of learning, allowing convergence to better models with fewer training steps.
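
One common way to build momentum conservation into a pairwise-interaction model is to antisymmetrise the predicted forces so that every action has an equal and opposite reaction. The abstract does not say how GPE does this, so the sketch below is only an assumed illustration; `raw_messages` stands in for unconstrained network outputs and all names are hypothetical.

```python
import numpy as np

def antisymmetric_forces(raw_messages):
    """Turn unconstrained messages M[i, j] into forces with F[i, j] = -F[j, i]."""
    return raw_messages - np.swapaxes(raw_messages, 0, 1)

def step(positions, velocities, masses, raw_messages, dt):
    """One explicit update using the net antisymmetrised force on each particle."""
    forces = antisymmetric_forces(raw_messages).sum(axis=1)
    new_velocities = velocities + dt * forces / masses[:, None]
    new_positions = positions + dt * new_velocities
    return new_positions, new_velocities

rng = np.random.default_rng(1)
n, d = 5, 3
positions = rng.normal(size=(n, d))
velocities = rng.normal(size=(n, d))
masses = rng.uniform(0.5, 2.0, size=n)
raw_messages = rng.normal(size=(n, n, d))   # stand-in for learned edge outputs

momentum_before = (masses[:, None] * velocities).sum(axis=0)
_, new_velocities = step(positions, velocities, masses, raw_messages, dt=0.01)
momentum_after = (masses[:, None] * new_velocities).sum(axis=0)
```

Total momentum is unchanged by the update regardless of what the messages contain, so the constraint holds by construction rather than having to be learned.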

翻訳日:2022-01-31 15:41:09 公開日:2022-01-28

# グラフ上の拡張永続ホモロジーの神経近似

Neural Approximation of Extended Persistent Homology on Graphs ( http://arxiv.org/abs/2201.12032v1 )

ライセンス: Link先を確認

Zuoyu Yan, Tengfei Ma, Liangcai Gao, Zhi Tang, Yusu Wang, Chao Chen

(参考訳) 永続ホモロジーは、位相データ解析において広く用いられる理論である。グラフ学習の文脈では、永続的ホモロジーに基づくトポロジ的特徴は、既存のグラフニューラルネットワーク手法を拡張するために、潜在的に高次構造情報をキャプチャするために使われてきた。しかし、特に学習アプリケーションでは、この計算を何度も行わなければならないため、拡張された永続的ホモロジー要約は、大きくて密度の高いグラフでは遅いままである。近年のニューラルアルゴリズム推論の成功に触発されて,グラフ上の拡張永続化図を計算するための新しい学習法を提案する。提案するニューラルネットワークは,特定のアルゴリズムをシミュレートすることを目的として,新しいグラフに対する拡張永続化図の効率的な計算方法を学ぶ。拡張永続化図と下流グラフ表現学習タスクの近似実験により,本手法の有効性が示された。大規模かつ高密度なグラフでは、計算を100倍近く高速化する。

Persistent homology is a widely used theory in topological data analysis. In the context of graph learning, topological features based on persistent homology have been used to capture potentially high-order structural information so as to augment existing graph neural network methods. However, computing extended persistent homology summaries remains slow for large and dense graphs, especially since in learning applications one has to carry out this computation potentially many times. Inspired by recent success in neural algorithmic reasoning, we propose a novel learning method to compute extended persistence diagrams on graphs. The proposed neural network aims to simulate a specific algorithm and learns to compute extended persistence diagrams for new graphs efficiently. Experiments on approximating extended persistence diagrams and several downstream graph representation learning tasks demonstrate the effectiveness of our method. Our method is also efficient; on large and dense graphs, we accelerate the computation by nearly 100 times.

翻訳日:2022-01-31 15:40:54 公開日:2022-01-28

# 脳波を用いた感情分類のためのAsMapの自動特徴抽出

Automated Feature Extraction on AsMap for Emotion Classification using EEG ( http://arxiv.org/abs/2201.12055v1 )

ライセンス: Link先を確認

Md. Zaved Iqubal Ahmed (1), Nidul Sinha (2) and Souvik Phadikar (2) ((1) Department of Computer Science & Engineering, National Institute of Technology, Silchar, India, (2) Department of Electrical Engineering, National Institute of Technology, Silchar, India)

(参考訳) 脳波を用いた感情認識は感情コンピューティングに関連する課題に対処するために広く研究されている。脳波信号に対する手動特徴抽出法を用いることで,学習モデルによる準最適性能が得られる。自動機能エンジニアリングのためのツールとしてのディープラーニングの進歩に伴い、本研究では、手作業と自動機能抽出のハイブリッドが提案されている。異なる脳領域における非対称性は、脳波信号の微分エントロピー(de)特徴からasmapと呼ばれる2次元ベクトルに捕捉される。これらのasmapは、畳み込みニューラルネットワーク(cnn)モデルを使用して自動的に特徴を抽出するために使用される。提案手法は, RASM, DASM, DCAU などの DE に基づく特徴抽出手法と比較した。クラス数に基づく分類問題に対して,DEAPおよびSEEDデータセットを用いて実験を行った。その結果,提案手法はdeに基づく特徴抽出法よりも高い分類精度が得られることがわかった。 SEEDデータセットを用いた3クラス分類問題において,最高分類精度97.10%を達成した。さらに,本研究では,窓サイズが分類精度に与える影響についても検討した。

Emotion recognition using EEG has been widely studied to address the challenges associated with affective computing. Using manual feature extraction methods on EEG signals results in sub-optimal performance by the learning models. With the advancements in deep learning as a tool for automated feature engineering, this work proposes a hybrid of manual and automatic feature extraction methods. The asymmetry in the different brain regions is captured in a 2-D vector, termed AsMap, from the differential entropy (DE) features of EEG signals. These AsMaps are then used to extract features automatically using a Convolutional Neural Network (CNN) model. The proposed feature extraction method has been compared with DE and other DE-based feature extraction methods such as RASM, DASM and DCAU. Experiments are conducted using the DEAP and SEED datasets on different classification problems based on the number of classes. The results indicate that the proposed method of feature extraction yields higher classification accuracy, outperforming the DE-based feature extraction methods. The highest classification accuracy of 97.10% is achieved on the 3-class classification problem using the SEED dataset. Further, the impact of window size on classification accuracy has also been assessed in this work.
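
To give a feel for how differential entropy features can be turned into a 2-D asymmetry map, here is a rough sketch. It assumes the usual Gaussian closed form for DE and builds the map from pairwise channel differences; the exact AsMap construction in the paper may differ, and the function names are hypothetical.

```python
import numpy as np

def differential_entropy(signal):
    """DE under a Gaussian assumption: 0.5 * ln(2 * pi * e * variance)."""
    return 0.5 * np.log(2.0 * np.pi * np.e * np.var(signal))

def asymmetry_map(de_per_channel):
    """2-D map whose entry (i, j) is DE_i - DE_j (an assumed construction)."""
    de = np.asarray(de_per_channel, dtype=float)
    return de[:, None] - de[None, :]

# Toy "channels" with known variances 1, 4 and 9 (assumed data).
base = np.tile([-1.0, 1.0], 64)
channels = [base, 2.0 * base, 3.0 * base]
de_values = [differential_entropy(c) for c in channels]
amap = asymmetry_map(de_values)
```

The resulting square map can be fed to a CNN like a small image; a differential feature such as DASM corresponds to selected entries of this kind of difference matrix for left/right electrode pairs.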

翻訳日:2022-01-31 15:37:53 公開日:2022-01-28

# 強化学習のためのマスクベース潜在性再構成

Mask-based Latent Reconstruction for Reinforcement Learning ( http://arxiv.org/abs/2201.12096v1 )

ライセンス: Link先を確認

Tao Yu, Zhizheng Zhang, Cuiling Lan, Zhibo Chen, Yan Lu

(参考訳) 画素からの深部強化学習(RL)では,高い性能を達成するために有効な状態表現の学習が不可欠である。しかし実際には、限られた経験と高次元入力が効果的な表現学習を妨げる。これを解決するために、他の研究分野におけるマスクモデリングの成功を動機として、RLにおける状態表現学習を促進するためにマスクベースの再構築を導入する。具体的には,空間的および時空間的にマスクされた画素を用いた観測から潜在空間の完全な状態表現を予測するための,単純かつ効果的な自己教師あり法であるマスクベース潜時再構成(mlr)を提案する。 MLRは、状態表現を学習する際の文脈情報のより良い利用を可能にし、それらをより情報的にし、RLエージェントの訓練を容易にする。総合的な実験により,MLRはRLの試料効率を大幅に向上し,複数の連続ベンチマーク環境において最先端の試料効率RL法より優れていた。

For deep reinforcement learning (RL) from pixels, learning effective state representations is crucial for achieving high performance. However, in practice, limited experience and high-dimensional input prevent effective representation learning. To address this, motivated by the success of masked modeling in other research fields, we introduce mask-based reconstruction to promote state representation learning in RL. Specifically, we propose a simple yet effective self-supervised method, Mask-based Latent Reconstruction (MLR), to predict the complete state representations in the latent space from the observations with spatially and temporally masked pixels. MLR enables the better use of context information when learning state representations to make them more informative, which facilitates RL agent training. Extensive experiments show that our MLR significantly improves the sample efficiency in RL and outperforms the state-of-the-art sample-efficient RL methods on multiple continuous benchmark environments.
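
The spatial and temporal masking of observations can be sketched as zeroing randomly chosen space-time cubes in a stack of frames. This is a minimal illustration under assumptions: the cube size, the mask ratio, and the name `cube_mask` are all hypothetical, and the latent reconstruction loss itself is not shown.

```python
import numpy as np

def cube_mask(frames, cube=(2, 4, 4), ratio=0.5, seed=0):
    """Zero out randomly chosen space-time cubes of a (T, H, W) observation stack."""
    t, h, w = frames.shape
    ct, ch, cw = cube
    if t % ct or h % ch or w % cw:
        raise ValueError("frame stack must be divisible by the cube size")
    grid = (t // ct, h // ch, w // cw)
    n_cubes = grid[0] * grid[1] * grid[2]
    n_masked = int(round(ratio * n_cubes))
    rng = np.random.default_rng(seed)
    flat = np.zeros(n_cubes, dtype=bool)
    flat[rng.choice(n_cubes, size=n_masked, replace=False)] = True
    # Expand the coarse cube-level mask back to pixel resolution.
    mask = flat.reshape(grid).repeat(ct, axis=0).repeat(ch, axis=1).repeat(cw, axis=2)
    return np.where(mask, 0.0, frames), mask

frames = np.ones((4, 8, 8))
masked_frames, mask = cube_mask(frames, ratio=0.5)
```

In a training loop, the encoder would see `masked_frames` while the prediction target is the latent representation computed from the unmasked `frames`.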

翻訳日:2022-01-31 15:37:37 公開日:2022-01-28

# 定次元潜在空間をもつグラフオートエンコーダ

Graph autoencoder with constant dimensional latent space ( http://arxiv.org/abs/2201.12165v1 )

ライセンス: Link先を確認

Adam Małkowski, Jakub Grzechociński, Paweł Wawrzyński

(参考訳) 大きいグラフの不変次元ベクトル(埋め込み)への可逆変換は依然として挑戦である。本稿では、再帰的ニューラルネットワーク(エンコーダとデコーダ)でそれに対処する。エンコーダネットワークは、サブグラフの埋め込みをより大きなサブグラフの埋め込みに変換し、最終的に入力グラフの埋め込みに変換する。デコーダは逆を行う。埋め込みの次元は (sub) グラフのサイズに関係なく一定である。本稿では,提案するグラフオートエンコーダが何千もの頂点を持つグラフを処理できることをシミュレーション実験により確認する。

Invertible transformation of large graphs into constant dimensional vectors (embeddings) remains a challenge. In this paper we address it with recursive neural networks: the encoder and the decoder. The encoder network transforms embeddings of subgraphs into embeddings of larger subgraphs, and eventually into the embedding of the input graph. The decoder does the opposite. The dimension of the embeddings is constant regardless of the size of the (sub)graphs. Simulation experiments presented in this paper confirm that our proposed graph autoencoder can handle graphs with even thousands of vertices.

翻訳日:2022-01-31 15:37:21 公開日:2022-01-28

# 構成性を考慮したGraph2Seq学習

Compositionality-Aware Graph2Seq Learning ( http://arxiv.org/abs/2201.12178v1 )

ライセンス: Link先を確認

Takeshi D. Itoh and Takatomi Kubo and Kazushi Ikeda

(参考訳) グラフは非常に表現力のあるデータ構造であるが、複雑なグラフからパターンを見つけることはしばしば困難である。したがって、グラフから人間の解釈可能なシーケンスを生成することは、Graph2seq Learningと呼ばれる関心を集めている。グラフにおける構成性は、多くのグラフ2seqタスクの出力シーケンスにおける構成性に関連付けられることが期待される。したがって、構成性に配慮したGNNアーキテクチャを適用することで、モデルの性能が向上する。本研究では,複数レベルの情報局所性からグラフ表現を集約するマルチレベルアテンションプーリング(MLAP)アーキテクチャを採用する。実世界の例として、極端にソースコードの要約タスクを取り上げ、モデルがそのソースコードからプログラム関数の名前を推定する。 MLAPアーキテクチャを持つモデルは,従来の最先端モデルよりも7倍以上少ないパラメータで性能を向上することを示した。

Graphs are a highly expressive data structure, but it is often difficult for humans to find patterns in a complex graph. Hence, generating human-interpretable sequences from graphs, called graph2seq learning, has gained interest. It is expected that the compositionality in a graph can be associated with the compositionality in the output sequence in many graph2seq tasks. Therefore, applying a compositionality-aware GNN architecture would improve the model performance. In this study, we adopt the multi-level attention pooling (MLAP) architecture, which can aggregate graph representations from multiple levels of information localities. As a real-world example, we take up the extreme source code summarization task, where a model estimates the name of a program function from its source code. We demonstrate that the model with the MLAP architecture outperforms the previous state-of-the-art model while using more than seven times fewer parameters.

翻訳日:2022-01-31 15:37:12 公開日:2022-01-28

# ニューロカオス学習を用いた因果効果保存と分類

Cause-Effect Preservation and Classification using Neurochaos Learning ( http://arxiv.org/abs/2201.12181v1 )

ライセンス: Link先を確認

Harikrishnan N B, Aditi Kathpalia, Nithin Nagaraj

(参考訳) 観測データからの因果関係の発見は、科学と工学において重要だが困難な問題である。本研究では、シミュレーションデータから原因と結果を分類するために、最近提案された脳に着想を得た学習アルゴリズムであるNeurochaos Learning (NL) を用いる。使用するデータインスタンスは、結合ARプロセス、結合1次元カオススキューテントマップ、結合1次元カオスロジスティックマップ、および実世界の被食者・捕食者系から生成される。提案手法は、$0.1$から$0.7$までの結合係数値に対して、5層ディープニューラルネットワークアーキテクチャを一貫して上回る。さらに、結合ARプロセスにはGranger Causality (GC) を、結合カオス系と実世界の被食者・捕食者データセットにはCompression-Complexity Causality (CCC) を用いて、NLの特徴抽出空間における因果関係の保存について検討する。カオス変換の下で因果関係を保存し、原因と結果の時系列を(転移学習のシナリオを含めて)うまく分類できるというNLの能力は、因果的機械学習の応用において非常に望ましい。

Discovering cause-effect from observational data is an important but challenging problem in science and engineering. In this work, a recently proposed brain-inspired learning algorithm, namely Neurochaos Learning (NL), is used for the classification of cause-effect from simulated data. The data instances used are generated from coupled AR processes, coupled 1D chaotic skew tent maps, coupled 1D chaotic logistic maps and a real-world prey-predator system. The proposed method consistently outperforms a five-layer Deep Neural Network architecture for coupling coefficient values ranging from $0.1$ to $0.7$. Further, we investigate the preservation of causality in the feature extracted space of NL using Granger Causality (GC) for coupled AR processes and Compression-Complexity Causality (CCC) for coupled chaotic systems and the real-world prey-predator dataset. This ability of NL to preserve causality under a chaotic transformation and successfully classify cause and effect time series (including a transfer learning scenario) is highly desirable in causal machine learning applications.
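
Cause and effect time series of the kind described here can be simulated with a pair of unidirectionally coupled skew tent maps. The coupling form, the skew parameters, and the initial values below are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def skew_tent(x, b):
    """Skew tent map on [0, 1] with skew parameter b in (0, 1)."""
    return x / b if x < b else (1.0 - x) / (1.0 - b)

def coupled_skew_tent(n, coupling, b_cause=0.65, b_effect=0.47, x0=0.3, y0=0.6):
    """Generate a driving (cause) series and a driven (effect) series of length n."""
    cause = np.empty(n)
    effect = np.empty(n)
    cause[0], effect[0] = x0, y0
    for t in range(1, n):
        cause[t] = skew_tent(cause[t - 1], b_cause)
        # The effect mixes its own dynamics with the previous value of the cause.
        effect[t] = (1.0 - coupling) * skew_tent(effect[t - 1], b_effect) \
            + coupling * cause[t - 1]
    return cause, effect

cause, effect = coupled_skew_tent(200, coupling=0.4)
```

A classifier is then trained to tell which of the two series is the driver; sweeping `coupling` over the range 0.1 to 0.7 would mirror the range of coupling coefficients quoted in the abstract.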

翻訳日:2022-01-31 15:36:59 公開日:2022-01-28

# データからfuncta: データポイントは関数であり、それを関数として扱うべきです

From data to functa: Your data point is a function and you should treat it like one ( http://arxiv.org/abs/2201.12204v1 )

ライセンス: Link先を確認

Emilien Dupont, Hyunjik Kim, S. M. Ali Eslami, Danilo Rezende, Dan Rosenbaum

(参考訳) ディープラーニングでは、例えばピクセルの2dグリッドのように、離散格子上の世界の測定を表すのが一般的である。しかし、これらの測定で表される基盤となる信号はしばしば連続的であり、例えば画像に表されるシーンなどである。次に、強力な連続的な代替手段として、暗黙の神経表現(入力空間位置の適切な測定値を出力するように訓練された神経関数)を用いてこれらの測定を表現することが挙げられる。この論文では、このアイデアを次のレベルに上げている。これらの関数をデータとして扱う代わりに、ディープラーニングを実行するのに何が必要か? この文脈では、データを functa と呼び、 functa の深層学習のためのフレームワークを提案する。この見解は、データからfunctaへの効率的な変換、functaのコンパクト表現、functaのダウンストリームタスクの効果的解決に関する多くの課題を示している。本稿では,これらの課題を克服するためのレシピを概説し,画像,3次元形状,ニューラル放射場(NeRF),多様体上のデータなど,幅広いデータモダリティに適用する。提案手法は,データモダリティ,特に生成モデル,データ計算,新しいビュー合成,分類の標準的タスクにおいて,様々な魅力的な特性を有することを示す。

It is common practice in deep learning to represent a measurement of the world on a discrete grid, e.g. a 2D grid of pixels. However, the underlying signal represented by these measurements is often continuous, e.g. the scene depicted in an image. A powerful continuous alternative is then to represent these measurements using an implicit neural representation, a neural function trained to output the appropriate measurement value for any input spatial location. In this paper, we take this idea to its next level: what would it take to perform deep learning on these functions instead, treating them as data? In this context we refer to the data as functa, and propose a framework for deep learning on functa. This view presents a number of challenges around efficient conversion from data to functa, compact representation of functa, and effectively solving downstream tasks on functa. We outline a recipe to overcome these challenges and apply it to a wide range of data modalities including images, 3D shapes, neural radiance fields (NeRF) and data on manifolds. We demonstrate that this approach has various compelling properties across data modalities, in particular on the canonical tasks of generative modeling, data imputation, novel view synthesis and classification.
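
The idea of replacing a sampled grid with a function of the coordinates can be illustrated without any neural network at all. In the sketch below a linear model on fixed sinusoidal features stands in for the trained neural function, and its weight vector plays the role of the compact per-datum representation; this is a simplification for illustration, not the method of the paper, and all names are hypothetical.

```python
import numpy as np

def features(x, n_freq=4):
    """Fixed sinusoidal features of a 1D coordinate (stand-in for a neural function)."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x)]
    for k in range(1, n_freq + 1):
        cols.append(np.sin(2.0 * np.pi * k * x))
        cols.append(np.cos(2.0 * np.pi * k * x))
    return np.stack(cols, axis=1)

def fit_function(x, y, n_freq=4):
    """Fit the coordinate-to-value function to grid samples by least squares."""
    weights, *_ = np.linalg.lstsq(features(x, n_freq), y, rcond=None)
    return weights

def evaluate(weights, x):
    """Query the fitted function at arbitrary coordinates."""
    n_freq = (len(weights) - 1) // 2
    return features(x, n_freq) @ weights

def signal(x):
    return np.sin(2.0 * np.pi * x) + 0.5 * np.cos(4.0 * np.pi * x)

grid = np.arange(32) / 32.0                 # the discrete measurement grid
weights = fit_function(grid, signal(grid))  # 9 numbers now represent the datum
fine = np.linspace(0.0, 1.0, 257)           # coordinates never seen during fitting
max_err = np.max(np.abs(evaluate(weights, fine) - signal(fine)))
```

Downstream models would then operate on `weights` rather than on the 32 grid values, which is the shift in viewpoint the abstract describes.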

翻訳日:2022-01-31 15:36:39 公開日:2022-01-28

# 神経の最適輸送

Neural Optimal Transport ( http://arxiv.org/abs/2201.12220v1 )

ライセンス: Link先を確認

Alexander Korotin, Daniil Selikhanovych, Evgeny Burnaev

(参考訳) 本稿では,強い輸送コストおよび弱い輸送コストに対する最適輸送写像と最適輸送計画を計算する、ニューラルネットワークに基づく新しいアルゴリズムを提案する。ニューラルネットワークの利用を正当化するために、ニューラルネットワークが確率分布間の輸送計画の普遍近似器であることを証明する。トイ例と、対応付けのない(unpaired)画像から画像へのスタイル変換タスクにおいて、本最適輸送アルゴリズムの性能を評価する。

We present a novel neural-networks-based algorithm to compute optimal transport maps and plans for strong and weak transport costs. To justify the usage of neural networks, we prove that they are universal approximators of transport plans between probability distributions. We evaluate the performance of our optimal transport algorithm on toy examples and on the unpaired image-to-image style translation task.

翻訳日:2022-01-31 15:36:16 公開日:2022-01-28

# (参考訳) アンタングルバイシミュレーションによる制御法における意味的類似性の効率的な埋め込み

Efficient Embedding of Semantic Similarity in Control Policies via Entangled Bisimulation ( http://arxiv.org/abs/2201.12300v1 )

ライセンス: CC BY 4.0

Martin Bertran, Walter Talbott, Nitish Srivastava, Joshua Susskind

(参考訳) 視覚的な妨害要素が存在する状況で視覚入力から汎化可能な方策を学ぶことは、強化学習における難しい問題である。近年、この問題に対処するツールとしてバイシミュレーション計量への関心が再び高まっている。これらの計量は、状態間の振る舞いの類似性を測ることで、原理的には無関係な妨害要素に不変な表現を学習するために使用できる。しかし、連続的な状態・行動のシナリオでは、これらの計量を正確に、偏りなく、スケーラブルに推定することは困難であった。本稿では、状態間の距離関数を指定でき、連続的な状態・行動空間においてバイアスなしに推定できるバイシミュレーション計量である、絡み合ったバイシミュレーション(entangled bisimulation)を提案する。データ拡張技術に追加した場合でも、絡み合ったバイシミュレーションがDistracting Control Suite (DCS) において従来手法を有意に改善できることを示す。

Learning generalizable policies from visual input in the presence of visual distractions is a challenging problem in reinforcement learning. Recently, there has been renewed interest in bisimulation metrics as a tool to address this issue; these metrics can be used to learn representations that are, in principle, invariant to irrelevant distractions by measuring behavioural similarity between states. An accurate, unbiased, and scalable estimation of these metrics has proved elusive in continuous state and action scenarios. We propose entangled bisimulation, a bisimulation metric that allows the specification of the distance function between states, and can be estimated without bias in continuous state and action spaces. We show how entangled bisimulation can meaningfully improve over previous methods on the Distracting Control Suite (DCS), even when added on top of data augmentation techniques.

翻訳日:2022-01-31 15:34:47 公開日:2022-01-28

# 星時区分:部分ラベルデータを用いた系列分類

Star Temporal Classification: Sequence Classification with Partially Labeled Data ( http://arxiv.org/abs/2201.12208v1 )

ライセンス: Link先を確認

Vineel Pratap, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

(参考訳) 部分的にラベル付けされ、セグメント化されていない系列データから学習できるアルゴリズムを開発した。Connectionist Temporal Classification (CTC) のようなほとんどの系列損失関数は、多くのラベルが欠落していると破綻する。我々はこの問題に、Star Temporal Classification (STC) で対処する。STCは特別なスタートークンを用いて、トークンが欠落している可能性のある箇所すべてで、あらゆるトークンを含むアライメントを許容する。STCを重み付き有限状態トランスデューサ(WFST)の合成として表現し、GTN(WFSTによる自動微分のためのフレームワーク)を用いて勾配を計算する。自動音声認識に関する広範な実験を行い、最大70%のラベルが欠落している場合でも、STCが教師ありベースラインの性能の大部分を回復できることを示す。また、手書き認識の実験を行い、本手法が他の系列分類タスクにも容易に適用できることを示す。

We develop an algorithm which can learn from partially labeled and unsegmented sequential data. Most sequential loss functions, such as Connectionist Temporal Classification (CTC), break down when many labels are missing. We address this problem with Star Temporal Classification (STC), which uses a special star token to allow alignments which include all possible tokens whenever a token could be missing. We express STC as the composition of weighted finite-state transducers (WFSTs) and use GTN (a framework for automatic differentiation with WFSTs) to compute gradients. We perform extensive experiments on automatic speech recognition. These experiments show that STC can recover most of the performance of the supervised baseline when up to 70% of the labels are missing. We also perform experiments in handwriting recognition to show that our method easily applies to other sequence classification tasks.

翻訳日:2022-01-31 15:02:34 公開日:2022-01-28

# x線異物検出のための深層学習を可能にするトモグラフィーワークフロー

A tomographic workflow to enable deep learning for X-ray based foreign object detection ( http://arxiv.org/abs/2201.12184v1 )

ライセンス: Link先を確認

Mathé T. Zeegers, Tristan van Leeuwen, Daniël M. Pelt, Sophia Bethany Coban, Robert van Liere, Kees Joost Batenburg

(参考訳) 製品内の望ましくない('foreign')オブジェクトの検出は、生産品質を維持するために多くの業界で一般的な手順である。 X線イメージングは、異物検出のための高速で非侵襲的で広く適用可能な方法である。深層学習は最近、x線画像のパターンを認識する強力なアプローチとして登場し、x線ベースの外部物体の自動検出を可能にしている。しかし、これらの方法には多数のトレーニング例が必要であり、手動アノテーションは主観的かつ手間のかかる作業である。本研究では,作業量を最小限に抑えながら,異物検出の教師あり学習のための訓練データを生成するためのct法を提案する。提案手法では,CTをスキャンして3Dで再構成する。 CTスキャンデータの一部として取得されたラジオグラフは、機械学習方法の入力として機能する。外部オブジェクトの高品質な地上真実位置は、正確な3次元再構成とセグメンテーションによって得られる。これらのセグメンテーションボリュームを用いて、仮想投影を作成することで対応する2次元セグメンテーションが得られる。本稿では,従来のラジオグラフィアノテーションと比較して,客観的かつ再現的にトレーニングデータを生成する利点を概説する。また,CT再建に使用する物体数によって精度がどう変わるかを示す。その結果, 産業環境での適切な検出性能を実現するためには, 比較的少数の代表対象(すなわち10未満)しか必要とされないことがわかった。さらに、実際の実験データでは、標準のラジオグラフアノテーションよりも外部物体検出精度が高いことが示されている。

Detection of unwanted ('foreign') objects within products is a common procedure in many branches of industry for maintaining production quality. X-ray imaging is a fast, non-invasive and widely applicable method for foreign object detection. Deep learning has recently emerged as a powerful approach for recognizing patterns in radiographs (i.e., X-ray images), enabling automated X-ray based foreign object detection. However, these methods require a large number of training examples and manual annotation of these examples is a subjective and laborious task. In this work, we propose a Computed Tomography (CT) based method for producing training data for supervised learning of foreign object detection, with minimal labour requirements. In our approach, a few representative objects are CT scanned and reconstructed in 3D. The radiographs that have been acquired as part of the CT-scan data serve as input for the machine learning method. High-quality ground truth locations of the foreign objects are obtained through accurate 3D reconstructions and segmentations. Using these segmented volumes, corresponding 2D segmentations are obtained by creating virtual projections. We outline the benefits of objectively and reproducibly generating training data in this way compared to conventional radiograph annotation. In addition, we show how the accuracy depends on the number of objects used for the CT reconstructions. The results show that in this workflow generally only a relatively small number of representative objects (i.e., fewer than 10) are needed to achieve adequate detection performance in an industrial setting. Moreover, for real experimental data we show that the workflow leads to higher foreign object detection accuracies than with standard radiograph annotation.
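
The step from a segmented 3D volume to a 2D training label can be sketched as a virtual projection along the beam direction. The example below assumes an idealised parallel-beam geometry aligned with one array axis, which is a simplification of a real scanner geometry; the function names and the toy volume are hypothetical.

```python
import numpy as np

def virtual_projection(segmentation, axis=0):
    """Project a binary 3D segmentation to a 2D mask (parallel beam along `axis`)."""
    return np.any(segmentation, axis=axis)

def path_length(segmentation, axis=0, voxel_size=1.0):
    """Length of foreign-object material crossed by each ray, in voxel units."""
    return segmentation.sum(axis=axis) * voxel_size

# Toy volume (assumed data): a 3 x 3 x 3 foreign object inside an 8^3 product volume.
volume = np.zeros((8, 8, 8), dtype=bool)
volume[2:5, 3:6, 1:4] = True

label_mask = virtual_projection(volume, axis=0)
thickness = path_length(volume, axis=0)
```

Each radiograph taken at a known angle can be paired with the mask projected at the same angle, so one segmented volume yields many labelled 2D training examples without manual annotation.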

翻訳日:2022-01-31 15:01:59 公開日:2022-01-28

# 球面CNNのためのMöbius畳み込み

Möbius Convolutions for Spherical CNNs ( http://arxiv.org/abs/2201.12212v1 )

ライセンス: Link先を確認

Thomas W. Mitchel, Noam Aigerman, Vladimir G. Kim, Michael Kazhdan

(参考訳) Möbius変換は、幾何学と球面画像処理の両方で重要な役割を果たす。Möbius変換は2次元曲面の共形自己同型の群であり、ホモグラフィの球面上の対応物である。本稿では、Möbius畳み込みと呼ぶ新しいMöbius同変な球面畳み込み演算子を提示し、それを用いてMöbius同変な球面CNNの基礎を構築する。我々のアプローチは単純な観察に基づいている。同変性を達成するには、近傍点のフレームから見た点の位置を変換する低次元部分群のみを考えればよい。Möbius畳み込みを大規模に効率よく計算するために、球面フィルタへの変換の作用の近似を導出し、高速な球面調和変換を用いてスペクトル領域で畳み込みを計算できるようにする。得られたフレームワークは柔軟かつ記述力が高く、形状分類と画像セグメンテーションの両タスクで有望な結果を達成することでその有用性を実証する。

Möbius transformations play an important role in both geometry and spherical image processing -- they are the group of conformal automorphisms of 2D surfaces and the spherical equivalent of homographies. Here we present a novel, Möbius-equivariant spherical convolution operator which we call Möbius convolution, and with it, develop the foundations for Möbius-equivariant spherical CNNs. Our approach is based on a simple observation: to achieve equivariance, we only need to consider the lower-dimensional subgroup which transforms the positions of points as seen in the frames of their neighbors. To efficiently compute Möbius convolutions at scale we derive an approximation of the action of the transformations on spherical filters, allowing us to compute our convolutions in the spectral domain with the fast Spherical Harmonic Transform. The resulting framework is both flexible and descriptive, and we demonstrate its utility by achieving promising results in both shape classification and image segmentation tasks.
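
For readers unfamiliar with the transformations themselves, the group structure can be checked numerically. A transformation acts on a complex coordinate z (for example a stereographic coordinate on the sphere) as z -> (az + b) / (cz + d), and composing two transformations corresponds to multiplying their 2 x 2 coefficient matrices. The matrices and the point below are arbitrary illustrative choices.

```python
import numpy as np

def mobius(matrix, z):
    """Apply z -> (a z + b) / (c z + d) for the 2 x 2 coefficient matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = matrix
    return (a * z + b) / (c * z + d)

A = np.array([[1 + 1j, 2], [0.5, 1 - 1j]])
B = np.array([[2, -1j], [1j, 1]])
z = 0.3 - 0.7j

composed = mobius(A @ B, z)              # one transformation from the matrix product
sequential = mobius(A, mobius(B, z))     # B first, then A
rescaled = mobius(3.0 * A, z)            # scaling the matrix leaves the map unchanged
```

The agreement of `composed` and `sequential` is what makes these transformations a group, which is the property an equivariant convolution has to respect.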

翻訳日:2022-01-31 15:01:37 公開日:2022-01-28

# Skip DEQと無限時間Neural ODE(連続DEQ)による暗黙的・明示的深層学習の混合

Mixing Implicit and Explicit Deep Learning with Skip DEQs and Infinite Time Neural ODEs (Continuous DEQs) ( http://arxiv.org/abs/2201.12240v1 )

ライセンス: Link先を確認

Avik Pal, Alan Edelman, Christopher Rackauckas

(参考訳) Neural ODEやDeep Equilibrium Model (DEQ) のような暗黙的な深層学習アーキテクチャは、層の定義をその求解過程の記述から分離する。暗黙的な層では、深さなどの特徴が新しいシナリオや入力に自動的に適応できるが、この適応性のために計算コストの予測が難しくなる。多くの著者が、暗黙的な層の手法は明示的な層の手法よりも計算負荷が高くなりうると指摘している。本稿では、暗黙的な層の頑健性を達成しつつ、明示的な層の低い計算コストを同時に実現する方法はあるか、という問いに取り組む。そのために、明示的な予測とそれに続く暗黙的な補正を同時に学習する暗黙的・明示的(IMEX)層であるSkip DEQを開発する。この明示的な層の学習には追加コストがかからず、むしろ学習時間を2.5倍、予測時間を3.4倍短縮することを示す。さらに、この手法を無限時間Neural ODEとして再定義することでDEQの「暗黙性」を一層高める。これにより、時間方向の逆伝播が不要になり、逆説的にも標準的なNeural ODEより学習コストが下がる。得られたContinuous Skip DEQアーキテクチャが、元のDEQよりも頑健に学習でき、かつ学習時間と予測時間も短くなることを実証する。全体として本稿は、暗黙的な深層学習と明示的な深層学習の二分法を橋渡しすることで、両手法の利点を組み合わせられることを示す。

Implicit deep learning architectures, like Neural ODEs and Deep Equilibrium Models (DEQs), separate the definition of a layer from the description of its solution process. While implicit layers allow features such as depth to adapt to new scenarios and inputs automatically, this adaptivity makes its computational expense challenging to predict. Numerous authors have noted that implicit layer techniques can be more computationally intensive than explicit layer methods. In this manuscript, we address the question: is there a way to simultaneously achieve the robustness of implicit layers while allowing the reduced computational expense of an explicit layer? To solve this we develop Skip DEQ, an implicit-explicit (IMEX) layer that simultaneously trains an explicit prediction followed by an implicit correction. We show that training this explicit layer is free and even decreases the training time by 2.5x and prediction time by 3.4x. We then further increase the "implicitness" of the DEQ by redefining the method in terms of an infinite time neural ODE which paradoxically decreases the training cost over a standard neural ODE by not requiring backpropagation through time. We demonstrate how the resulting Continuous Skip DEQ architecture trains more robustly than the original DEQ while achieving faster training and prediction times. Together, this manuscript shows how bridging the dichotomy of implicit and explicit deep learning can combine the advantages of both techniques.
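
The benefit of pairing an explicit prediction with an implicit correction can be seen in a toy fixed-point problem. In the sketch below the implicit layer is z = tanh(Wz + x), solved by plain iteration, and a linearised solve stands in for the learned explicit predictor; the weight matrix, the input, and all function names are assumptions for illustration rather than the authors' architecture.

```python
import numpy as np

W = np.array([[0.3, -0.2], [0.1, 0.4]])   # small weights keep the layer contractive

def layer(z, x):
    """Implicit layer: the output is the fixed point z = tanh(W z + x)."""
    return np.tanh(W @ z + x)

def solve_fixed_point(x, z0, tol=1e-10, max_iter=500):
    """Plain fixed-point iteration; returns the solution and the iteration count."""
    z = z0
    for n in range(1, max_iter + 1):
        z_next = layer(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next, n
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")

def explicit_prediction(x):
    """Cheap explicit guess (stand-in for a learned network): solve the linearised layer."""
    return np.linalg.solve(np.eye(2) - W, x)

x = np.array([0.05, -0.02])
z_plain, n_plain = solve_fixed_point(x, np.zeros(2))
z_skip, n_skip = solve_fixed_point(x, explicit_prediction(x))
```

Both runs reach the same fixed point, but the run started from the explicit guess needs fewer solver iterations, which is the mechanism by which a good predictor can reduce the cost of an implicit layer.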

翻訳日:2022-01-31 15:01:15 公開日:2022-01-28

# 胎児超音波画像解析のための深層学習アルゴリズムの検討

A Review on Deep-Learning Algorithms for Fetal Ultrasound-Image Analysis ( http://arxiv.org/abs/2201.12260v1 )

ライセンス: Link先を確認

Maria Chiara Fiorentino and Francesca Pia Villani and Mariachiara Di Cosmo and Emanuele Frontoni and Sara Moccia

(参考訳) 深層学習(DL)アルゴリズムは、胎児の超音波(US)画像処理の標準になりつつある。この分野にはすでに多くのサーベイ論文が存在するが、その多くは医用画像解析のより広い領域に焦点を当てているか、胎児USへのDL応用をすべては網羅していない。本稿は、2017年以降に発表された合計145本の研究論文を対象に、この分野の最新の研究を概観する。各論文は、方法論と応用の両方の観点から分析され、論評される。論文は (i) 胎児の標準断面検出、(ii) 解剖学的構造の解析、(iii) 生体計測パラメータの推定、に分類した。各カテゴリについて、主な限界と未解決の課題を示す。異なるアプローチの比較を容易にするため、要約表を掲載する。アルゴリズムの性能評価に一般的に用いられる公開データセットと性能指標も要約する。最後に、胎児US画像解析のためのDLアルゴリズムの現状を批判的に総括し、研究上の方法論を実際の臨床現場に移すために、この分野の研究者が取り組むべき現在の課題について論じる。

Deep-learning (DL) algorithms are becoming the standard for processing ultrasound (US) fetal images. Although a large number of survey papers already exist in this field, most of them focus on the broader area of medical-image analysis or do not cover all fetal US DL applications. This paper surveys the most recent work in the field, with a total of 145 research papers published after 2017. Each paper is analyzed and commented on from both the methodology and application perspective. We categorized the papers into (i) fetal standard-plane detection, (ii) anatomical-structure analysis, and (iii) biometry parameter estimation. For each category, main limitations and open issues are presented. Summary tables are included to facilitate the comparison among the different approaches. Publicly available datasets and performance metrics commonly used to assess algorithm performance are summarized, too. This paper ends with a critical summary of the current state of the art on DL algorithms for fetal US image analysis and a discussion on current challenges that have to be tackled by researchers working in the field to translate the research methodology into actual clinical practice.

翻訳日:2022-01-31 15:00:49 公開日:2022-01-28

# グローバルコンテキスト埋め込みによるtwitterストリームのエンティティ参照検出の促進

Boosting Entity Mention Detection for Targetted Twitter Streams with Global Contextual Embeddings ( http://arxiv.org/abs/2201.11885v1 )

ライセンス: Link先を確認

Satadisha Saha Bhowmick, Eduard C. Dragut and Weiyi Meng

(参考訳) Twitterのようなマイクロブログサイトは、ユビキタスな情報ソースとして登場した。マイクロブログにおける情報の自動抽出と分析に関する2つの重要なタスクは、エンティティ参照検出(emd)とエンティティ検出(ed)である。最先端のemdシステムは、オフラインの静的データセットでトレーニングすることで、マイクロブログテキストの非文字的性質をモデル化することを目的としている。彼らは、ノイズの多いテキストモデリングとエンティティ抽出のために個々のメッセージから、表面レベルの特徴(正書法、語彙、意味)の組み合わせを抽出する。しかし、マイクロブログストリームの絶え間なく進化する性質を考えると、短いメッセージのさまざまな限られたコンテキストから、すべてのエンティティへの言及を検出することは難しい問題である。そこで本稿では,マイクロブログストリーム上でのEMD学習者実行に適したフレームワークであるEMD Globalizerを提案する。既存のemdシステムによる分離されたマイクロブログメッセージの処理から逸脱し、メッセージの即時コンテキストからの学習知識がエンティティの提案に使用される。 EMDシステムによるエンティティ候補の最初の抽出の後、提案手法は発生源のマイニングを利用して、この最初の検出で見落とされた追加の候補言及を見つける。これらの言及の局所的な文脈表現を集約すると、ストリーム内のエンティティ候補の集団的コンテキストからグローバル埋め込みが引き出される。グローバルな埋め込みは、候補内のエンティティを偽陽性から分離するために使用される。ストリームからの当該エンティティに関するすべての言及は、フレームワークの最終出力で生成される。実験の結果、emd globalizerは、テストしたすべての既存のemdシステム(平均25.61%)の有効性を、計算オーバーヘッドを小さく向上できることがわかった。

Microblogging sites, like Twitter, have emerged as ubiquitous sources of information. Two important tasks related to the automatic extraction and analysis of information in Microblogs are Entity Mention Detection (EMD) and Entity Detection (ED). The state-of-the-art EMD systems aim to model the non-literary nature of microblog text by training upon offline static datasets. They extract a combination of surface-level features -- orthographic, lexical, and semantic -- from individual messages for noisy text modeling and entity extraction. But given the constantly evolving nature of microblog streams, detecting all entity mentions from such varying yet limited context of short messages remains a difficult problem. To this end, we propose a framework named EMD Globalizer, better suited for the execution of EMD learners on microblog streams. It deviates from the processing of isolated microblog messages by existing EMD systems, where learned knowledge from the immediate context of a message is used to suggest entities. After an initial extraction of entity candidates by an EMD system, the proposed framework leverages occurrence mining to find additional candidate mentions that are missed during this first detection. Aggregating the local contextual representations of these mentions, a global embedding is drawn from the collective context of an entity candidate within a stream. The global embeddings are then utilized to separate entities within the candidates from false positives. All mentions of said entities from the stream are produced in the framework's final outputs. Our experiments show that EMD Globalizer can enhance the effectiveness of all existing EMD systems that we tested (on average by 25.61%) with a small additional computational overhead.

翻訳日:2022-01-31 14:59:35 公開日:2022-01-28

# DeepSpeed と Megatron を用いた大規模生成言語モデル NLG 530B の訓練

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model ( http://arxiv.org/abs/2201.11990v1 )

ライセンス: Link先を確認

Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zhang, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, and Bryan Catanzaro

(参考訳) 事前訓練された汎用言語モデルは、ゼロショット、少数ショット、微調整技術を用いて下流タスクに適応することで、様々な自然言語処理領域における最先端の精度を達成することができる。その成功により、これらのモデルのサイズは急速に増加し、そのような大規模モデルのトレーニングを可能にするために高性能なハードウェア、ソフトウェア、アルゴリズム技術が必要となった。 MicrosoftとNVIDIAの共同作業の結果、我々は最大のモノリシックトランスフォーマーベースの言語モデルであるMegatron-Turing NLG 530B(MT-NLG)のトレーニングの詳細を5300億のパラメータで提示した。本稿では,まず,このモデルをdeepspeedとmegatronを用いてトレーニングするための3次元並列化手法とともに,インフラストラクチャに焦点をあてる。次に、トレーニングプロセス、トレーニングコーパスの設計、データキュレーション技術について詳述する。最後に,MT-NLGによる様々な評価結果と,他の興味深い観測結果と新たな特性について考察する。 MT-NLGは、いくつかのNLPベンチマークにおいて、優れたゼロ、ワンショット、少数ショットの学習精度を実現し、新しい最先端結果を確立することを実証する。私たちの貢献は、大規模トレーニングインフラストラクチャ、大規模言語モデル、および自然言語世代の発展に役立ちます。

Pretrained general-purpose language models can achieve state-of-the-art accuracies in various natural language processing domains by adapting to downstream tasks via zero-shot, few-shot and fine-tuning techniques. Because of their success, the size of these models has increased rapidly, requiring high-performance hardware, software, and algorithmic techniques to enable training such large models. As the result of a joint effort between Microsoft and NVIDIA, we present details on the training of the largest monolithic transformer-based language model, Megatron-Turing NLG 530B (MT-NLG), with 530 billion parameters. In this paper, we first focus on the infrastructure as well as the 3D parallelism methodology used to train this model using DeepSpeed and Megatron. Next, we detail the training process, the design of our training corpus, and our data curation techniques, which we believe are key ingredients in the success of the model. Finally, we discuss various evaluation results, as well as other interesting observations and new properties exhibited by MT-NLG. We demonstrate that MT-NLG achieves superior zero-, one-, and few-shot learning accuracies on several NLP benchmarks and establishes new state-of-the-art results. We believe that our contributions will help further the development of large-scale training infrastructures, large-scale language models, and natural language generation.

翻訳日:2022-01-31 14:59:07 公開日:2022-01-28

# PCL:教師なし文埋め込みのための多言語拡張によるピアコントラスト学習

PCL: Peer-Contrastive Learning with Diverse Augmentations for Unsupervised Sentence Embeddings ( http://arxiv.org/abs/2201.12093v1 )

ライセンス: Link先を確認

Qiyu Wu, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Daxin Jiang

(参考訳) 教師なしの方法で文を埋め込む学習は自然言語処理において基本である。最近の一般的な実践は、教師なしのコントラスト学習と事前訓練された言語モデルを組み合わせることである。それにもかかわらず、既存のアプローチは、通常単調な戦略に依存しているため、バイアスの増大に対する学習の近道が引き起こされ、したがって文埋め込みの品質が損なわれる。率直な解決策は、多元的戦略からより多様なポジティブスに頼ることであるが、オープンな疑問は、様々なポジティブスから教師なしで学ぶ方法でありながら、テキストフィールドの質を均等に増やすことである。 1つの答えとして,多種多様な拡張を伴うペアコントラスト学習(PCL)を提案する。 pclは、教師なし文埋め込みの群レベルでの多様な対比的正と負を構成する。 pclはピアポジティブなコントラストやピアネットワークの協調を行うことができ、独自のアンチバイアス能力と多様な拡張から学ぶ効果的な方法を提供する。 stsベンチマーク実験は,教師なし文埋め込みにおける競合相手に対するpclの有効性を検証する。

Learning sentence embeddings in an unsupervised manner is fundamental in natural language processing. Recent common practice is to couple pre-trained language models with unsupervised contrastive learning, whose success relies on augmenting a sentence with a semantically-close positive instance to construct contrastive pairs. Nonetheless, existing approaches usually depend on a mono-augmenting strategy, which causes learning shortcuts towards the augmenting biases and thus corrupts the quality of sentence embeddings. A straightforward solution is resorting to more diverse positives from a multi-augmenting strategy, while an open question remains about how to unsupervisedly learn from the diverse positives but with uneven augmenting qualities in the text field. As one answer, we propose a novel Peer-Contrastive Learning (PCL) with diverse augmentations. PCL constructs diverse contrastive positives and negatives at the group level for unsupervised sentence embeddings. PCL can perform peer-positive contrast as well as peer-network cooperation, which offers an inherent anti-bias ability and an effective way to learn from diverse augmentations. Experiments on STS benchmarks verify the effectiveness of our PCL against its competitors in unsupervised sentence embeddings.
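The idea of contrasting one sentence against a group of diverse positives can be sketched with a generic multi-positive, InfoNCE-style objective. This is a minimal illustration and not the loss defined in the paper: the function names, the cosine similarity, and the temperature value are all assumptions introduced here.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_positive_loss(
    anchor: Sequence[float],
    positives: List[Sequence[float]],
    negatives: List[Sequence[float]],
    temperature: float = 0.05,  # hypothetical value, chosen for illustration
) -> float:
    """Average negative log-probability of each positive within the whole group.

    Every augmentation of the anchor sentence counts as a positive, so no
    single augmentation strategy dominates the training signal.
    """
    pos_logits = [cosine(anchor, p) / temperature for p in positives]
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]
    all_logits = pos_logits + neg_logits
    # Numerically stable log-sum-exp over the group of positives and negatives.
    top = max(all_logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in all_logits))
    return -sum(l - log_denominator for l in pos_logits) / len(pos_logits)
```

The loss is small when all positives are closer to the anchor than the negatives, and it grows when any positive in the group is pushed away, which is the behaviour a group-level contrast is meant to encourage.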

翻訳日:2022-01-31 14:58:43 公開日:2022-01-28

# Protum: "[MASK]"に基づくプロンプトチューニングの新しい方法

Protum: A New Method For Prompt Tuning Based on "[MASK]" ( http://arxiv.org/abs/2201.12109v1 )

ライセンス: Link先を確認

Pan He and Yuxi Chen and Yan Wang and Yanru Zhang

(参考訳) 近年, 先行学習言語モデル (PLM) のパラメータを凍結することにより, 下流タスクにおける顕著な性能を得ることにより, 単語の表現にのみ依存する, NLP の新たなパラダイムとなっている。 Masked Language Model (MLM) \cite{devlin2018bert} タスクの事前トレーニングプロセスにおける一貫性を維持し、微調整中に発生する可能性のある問題を回避する。当然、"[mask]"トークンは他のトークンよりも有用な情報を持っていると考えます。現在のプロンプトチューニング手法では,複数の単語を予測した場合の解答トークンのランダムな構成に深刻な問題があるため,ヘルプ弁解器を用いてラベルにトークンをマッピングする必要がある。そこで,本稿では,[[mask]"トークンの隠れた層によって保持される情報を通じて分類タスクを構築し,応答トークンではなくラベルを直接予測する手法である,<textbf{m}ask] (\textbf{protum}) 法に基づく新しい \textbf{pro}mpt \textbf{tu}ning を提案する。同時に、"[MASK]"の下に隠された異なる層が、多くの異なるデータセットの分類モデルにどのように影響するかを調査する。最後に、私たちの \textbf{protum} は、時間消費の少ない継続的事前トレーニングの後、微調整よりもずっと優れたパフォーマンスを達成できることがわかりました。我々のモデルは,NLPにおける大規模モデルの実用化を促進する。

Recently, prompt tuning \cite{lester2021power} has gradually become a new paradigm for NLP, which only depends on the representation of the words by freezing the parameters of pre-trained language models (PLMs) to obtain remarkable performance on downstream tasks. It maintains the consistency of the Masked Language Model (MLM) \cite{devlin2018bert} task in the process of pre-training, and avoids some issues that may happen during fine-tuning. Naturally, we consider that the "[MASK]" tokens carry more useful information than other tokens because the model combines with context to predict the masked tokens. Among the current prompt tuning methods, there is a serious problem of random composition of the answer tokens when they predict multiple words, so they have to map tokens to labels with the help of a verbalizer. In response to the above issue, we propose a new Prompt Tuning based on "[MASK]" (Protum) method in this paper, which constructs a classification task through the information carried by the hidden layer of "[MASK]" tokens and then predicts the labels directly rather than the answer tokens. At the same time, we explore how different hidden layers under "[MASK]" affect our classification model on many different data sets. Finally, we find that our Protum can achieve much better performance than fine-tuning after continuous pre-training with less time consumption. Our model facilitates the practical application of large models in NLP.

翻訳日:2022-01-31 14:58:23 公開日:2022-01-28

# エンティティリソースの広範囲化に向けて:多言語に対するデータ効率なアプローチ

Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages ( http://arxiv.org/abs/2201.12219v1 )

ライセンス: Link先を確認

Silvia Severini, Ayyoob Imani, Philipp Dufter, Hinrich Sch\"utze

(参考訳) 並列コーパスは、MNE(multilingual named entity)リソース、すなわち複数の言語に翻訳された名前のデータセットを抽出するのに理想的である。並列コーパスからMNEデータセットを抽出する以前の作業では、大きなモノリンガルコーパスや単語調整器のようなリソースが必要だった。我々は、mneリソースを作成する新しい手法であるclc-bnを提案し、1000以上の言語からなるコーパスである並列聖書コーパスに適用する。 CLC-BNは、他のバイリンガルリソース、単語調整器、シードデータを必要としない、並列コーパス統計から神経翻訳モデルを学ぶ。実験の結果,CLC-BNは従来より明らかに優れていた。我々は1340言語用のMNEリソースをリリースし、知識グラフ増強とバイリンガル語彙誘導という2つの下流タスクでその効果を示す。

Parallel corpora are ideal for extracting a multilingual named entity (MNE) resource, i.e., a dataset of names translated into multiple languages. Prior work on extracting MNE datasets from parallel corpora required resources such as large monolingual corpora or word aligners that are unavailable or perform poorly for underresourced languages. We present CLC-BN, a new method for creating an MNE resource, and apply it to the Parallel Bible Corpus, a corpus of more than 1000 languages. CLC-BN learns a neural transliteration model from parallel-corpus statistics, without requiring any other bilingual resources, word aligners, or seed data. Experimental results show that CLC-BN clearly outperforms prior work. We release an MNE resource for 1340 languages and demonstrate its effectiveness in two downstream tasks: knowledge graph augmentation and bilingual lexicon induction.

翻訳日:2022-01-31 14:57:51 公開日:2022-01-28

# (参考訳) REET:計算病理のロバスト性評価・強化ツールボックス

REET: Robustness Evaluation and Enhancement Toolbox for Computational Pathology ( http://arxiv.org/abs/2201.12311v1 )

ライセンス: CC BY 4.0

Alex Foote, Amina Asif, Nasir Rajpoot and Fayyaz Minhas

(参考訳) モチベーション: デジタルスライドスキャナーによる病理学研究室のデジタル化と、客観的組織学的評価のための深層学習アプローチの進歩により、コンピュータ病理学(CPath)の分野で急速に進歩し、医学・薬学研究や臨床ワークフローにも幅広く応用されている。しかし、入力画像の変動に対するCPathモデルのロバスト性の推定は、これらのアプローチの下流での実用性、展開、受容性に大きな影響を及ぼす、オープンな問題である。さらに、このようなモデルの堅牢性を高めるためのドメイン特化戦略の開発も重要である。実装と可用性: 本研究では, 計算病理学応用のための最初のドメイン固有ロバスト性評価および拡張ツールボックス(reet)を提案する。ステンドリング、圧縮、フォーカス、ぼかし、空間分解能の変化、輝度の変化、幾何学的変化、ピクセルレベルの逆摂動といった特殊な画像変換に関して、予測モデルのロバスト性評価を可能にするアルゴリズム戦略のスイートを提供する。さらにreetは、計算病理学におけるディープラーニングパイプラインの効率的で堅牢なトレーニングも可能にする。 REETはPythonで実装されており、以下のURLで利用できる。連絡先: Fayyaz.minhas@warwick.ac.uk

Motivation: Digitization of pathology laboratories through digital slide scanners and advances in deep learning approaches for objective histological assessment have resulted in rapid progress in the field of computational pathology (CPath) with wide-ranging applications in medical and pharmaceutical research as well as clinical workflows. However, the estimation of robustness of CPath models to variations in input images is an open problem with a significant impact on the down-stream practical applicability, deployment and acceptability of these approaches. Furthermore, development of domain-specific strategies for enhancement of robustness of such models is of prime importance as well. Implementation and Availability: In this work, we propose the first domain-specific Robustness Evaluation and Enhancement Toolbox (REET) for computational pathology applications. It provides a suite of algorithmic strategies for enabling robustness assessment of predictive models with respect to specialized image transformations such as staining, compression, focusing, blurring, changes in spatial resolution, brightness variations, geometric changes as well as pixel-level adversarial perturbations. Furthermore, REET also enables efficient and robust training of deep learning pipelines in computational pathology. REET is implemented in Python and is available at the following URL: https://github.com/alexjfoote/reetoolbox. Contact: Fayyaz.minhas@warwick.ac.uk

翻訳日:2022-01-31 14:56:20 公開日:2022-01-28

# O-ViT:直交型視覚変換器

O-ViT: Orthogonal Vision Transformer ( http://arxiv.org/abs/2201.12133v1 )

ライセンス: Link先を確認

Yanhong Fei, Yingjie Liu, Xian Wei, Mingsong Chen

(参考訳) ViT(Vision Transformer)は、自然言語処理における自己認識機構の素晴らしい成功に触発され、画像パッチシーケンスに創造的に適用し、素晴らしいパフォーマンスを実現します。しかし、ViTのスケールされたドット積自己アテンションは、元の特徴空間の構造にスケールの曖昧さをもたらす。この問題に対処するために、幾何学的視点からViTを最適化するOrthogonal Vision Transformer (O-ViT) という新しい手法を提案する。 O-ViT は自己アテンションブロックのパラメータをノルム維持直交多様体上に制限し、特徴空間の幾何学を維持できる。さらに、O-ViTは直交群とリー代数間の全射写像を採用することで、直交制約と安価な最適化オーバーヘッドの両方を実現し、O-ViTの有効性を実証するために画像認識タスクの比較実験を行い、O-ViTが最大3.6%向上することを示した。

Inspired by the tremendous success of the self-attention mechanism in natural language processing, the Vision Transformer (ViT) creatively applies it to image patch sequences and achieves incredible performance. However, the scaled dot-product self-attention of ViT brings about scale ambiguity to the structure of the original feature space. To address this problem, we propose a novel method named Orthogonal Vision Transformer (O-ViT), to optimize ViT from the geometric perspective. O-ViT limits parameters of self-attention blocks to be on the norm-keeping orthogonal manifold, which can keep the geometry of the feature space. Moreover, O-ViT achieves both orthogonal constraints and cheap optimization overhead by adopting a surjective mapping between the orthogonal group and its Lie algebra. We have conducted comparative experiments on image recognition tasks to demonstrate O-ViT's validity and experiments show that O-ViT can boost the performance of ViT by up to 3.6%.
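A mapping from a Lie algebra to an orthogonal matrix can be illustrated with the matrix exponential of a skew-symmetric matrix, which always yields a (special) orthogonal matrix. The abstract does not state which mapping O-ViT uses, so the exponential map, the truncated Taylor series, and all helper names below are assumptions made for this sketch.

```python
from typing import List

Matrix = List[List[float]]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def transpose(a: Matrix) -> Matrix:
    return [list(row) for row in zip(*a)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def skew(params: Matrix) -> Matrix:
    """Map unconstrained square parameters A to the skew-symmetric matrix A - A^T."""
    n = len(params)
    return [[params[i][j] - params[j][i] for j in range(n)] for i in range(n)]


def expm(a: Matrix, terms: int = 30) -> Matrix:
    """Matrix exponential via a truncated Taylor series (adequate for small norms)."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term becomes a^k / k! by multiplying the previous term by a / k.
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def orthogonal_from_params(params: Matrix) -> Matrix:
    """Unconstrained parameters -> orthogonal weight matrix Q with Q^T Q = I."""
    return expm(skew(params))
```

With this kind of parameterization an ordinary gradient-based optimizer can update the unconstrained parameters freely, while the weight actually used in the layer stays orthogonal and therefore preserves vector norms.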

翻訳日:2022-01-31 14:52:36 公開日:2022-01-28

# 知覚再構成を用いた教師なし単発深度推定

Unsupervised Single-shot Depth Estimation using Perceptual Reconstruction ( http://arxiv.org/abs/2201.12170v1 )

ライセンス: Link先を確認

Christoph Angermann, Matthias Schwab, Markus Haltmeier, Christian Laubichler and Steinbj\"orn J\'onsson

(参考訳) 実物体深度の実時間推定は,3次元再構成,シーン理解,機械部品の状態評価など,様々な自律システムタスクの実行に不可欠なモジュールである。機械学習の過去10年間、コンピュータビジョンタスクへのディープラーニング手法の広範な展開は、単純なRGBモダリティから現実的な深度合成を実現するためのアプローチを生み出してきた。これらのモデルのほとんどは、対の深度データやビデオシーケンスやステレオ画像の可用性に基づいているが、完全な教師なし設定での単視点深度合成の手法はほとんど検討されていない。この研究は、生成ニューラルネットワークの分野における最新の進歩を示し、それらを活用して完全に教師なしの単発深度合成を行う。 RGB-to-deepthとdeep-to-RGB転送用の2つのジェネレータを実装し,Wasserstein-1距離と新しい知覚再構成項を用いて同時に最適化した。提案手法が検証可能であることを確認するため, 工業用表面深度データと, 体深を記録するテキサス3次元顔認識データベースとSURREALデータセットを用いて, モデルを総合的に評価した。この研究で得られた成功は、実世界のアプリケーションにおける教師なし単発深度推定の可能性を示唆している。

Real-time estimation of actual object depth is a module that is essential to performing various autonomous system tasks such as 3D reconstruction, scene understanding and condition assessment of machinery parts. During the last decade of machine learning, extensive deployment of deep learning methods to computer vision tasks has yielded approaches that succeed in achieving realistic depth synthesis out of a simple RGB modality. While most of these models are based on paired depth data or availability of video sequences and stereo images, methods for single-view depth synthesis in a fully unsupervised setting have hardly been explored. This study presents the most recent advances in the field of generative neural networks, leveraging them to perform fully unsupervised single-shot depth synthesis. Two generators for RGB-to-depth and depth-to-RGB transfer are implemented and simultaneously optimized using the Wasserstein-1 distance and a novel perceptual reconstruction term. To ensure that the proposed method is plausible, we comprehensively evaluate the models using industrial surface depth data as well as the Texas 3D Face Recognition Database and the SURREAL dataset that records body depth. The success observed in this study suggests the great potential for unsupervised single-shot depth estimation in real-world applications.

翻訳日:2022-01-31 14:52:15 公開日:2022-01-28

# ラベルが欠落した歴史的文書におけるテキスト行検出を改善するための自己ペース学習

Self-paced learning to improve text row detection in historical documents with missing labels ( http://arxiv.org/abs/2201.12216v1 )

ライセンス: Link先を確認

Mihaela Gaman, Lida Ghadamiyan, Radu Tudor Ionescu, Marius Popescu

(参考訳) 光文字認識システムの重要な予備ステップは、テキスト列の検出である。この課題にラベルを欠いた履歴データを用いて対処するために,行検出性能を向上させる自己評価学習アルゴリズムを提案する。より地味なバウンディングボックスを持つページはアノテーションを欠く可能性が低いと推測する。この仮説に基づいて, 基礎トラス境界ボックスの数に関して, 下位順のトレーニング例をソートし, それらをkバッチに整理する。自己ペース学習法を用いて,k個の反復に対して列検出器を訓練し,基底アノテーションの少ないバッチを徐々に追加する。各イテレーションにおいて、ゼロトラス境界ボックスと擬似バウンディングボックス(モデル自身によって予測されるバウンディングボックス)を非最大抑圧を用いて組み合わせ、次のトレーニングイテレーションで得られたアノテーションを含める。我々の自己ペース学習戦略は、2つの歴史的文書のデータセットで大きなパフォーマンス向上をもたらし、yolov4の平均精度を1つのデータセットで12%以上、もう一方で39%向上させることを実証した。

An important preliminary step of optical character recognition systems is the detection of text rows. To address this task in the context of historical data with missing labels, we propose a self-paced learning algorithm capable of improving the row detection performance. We conjecture that pages with more ground-truth bounding boxes are less likely to have missing annotations. Based on this hypothesis, we sort the training examples in descending order with respect to the number of ground-truth bounding boxes, and organize them into k batches. Using our self-paced learning method, we train a row detector over k iterations, progressively adding batches with fewer ground-truth annotations. At each iteration, we combine the ground-truth bounding boxes with pseudo-bounding boxes (bounding boxes predicted by the model itself) using non-maximum suppression, and we include the resulting annotations at the next training iteration. We demonstrate that our self-paced learning strategy brings significant performance gains on two data sets of historical documents, improving the average precision of YOLOv4 by more than 12% on one data set and 39% on the other.
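The two data-handling steps described above, ordering pages into k batches and merging ground-truth boxes with pseudo-boxes through non-maximum suppression, can be sketched as follows. The function names, the box format, the IoU threshold, and the choice to give ground-truth boxes the top score of 1.0 are assumptions made for illustration; the detector training itself is omitted.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def make_batches(pages: Dict[str, List[Box]], k: int) -> List[List[str]]:
    """Sort page ids by descending ground-truth box count, then split into k batches."""
    ordered = sorted(pages, key=lambda page: len(pages[page]), reverse=True)
    size = max(1, -(-len(ordered) // k))  # ceiling division, at least 1
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def merge_with_nms(
    ground_truth: List[Box],
    pseudo: List[Tuple[Box, float]],  # (box, detector confidence)
    iou_threshold: float = 0.5,       # hypothetical threshold
) -> List[Box]:
    """Combine ground-truth and pseudo-boxes, dropping overlapping lower-scored boxes.

    Assumption: ground-truth boxes receive score 1.0 and are listed first, so
    they are never suppressed by a pseudo-box.
    """
    scored = [(box, 1.0) for box in ground_truth] + list(pseudo)
    scored.sort(key=lambda item: item[1], reverse=True)  # stable sort
    kept: List[Box] = []
    for box, _score in scored:
        if all(iou(box, other) < iou_threshold for other in kept):
            kept.append(box)
    return kept
```

In a training loop, iteration i would train on the first i batches, run the detector on those pages, and call `merge_with_nms` to produce the annotations used at iteration i + 1.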

翻訳日:2022-01-31 14:51:54 公開日:2022-01-28

# 画像検索:ブラックボックス学習をグレーに変える

Indicative Image Retrieval: Turning Blackbox Learning into Grey ( http://arxiv.org/abs/2201.11898v1 )

ライセンス: Link先を確認

Xulu Zhang (1), Zhenqun Yang (2), Hao Tian (1), Qing Li (3), Xiaoyong Wei (1 and 3) ((1) Sichuan University, (2) Chinese University of Hong Kong, (3) Hong Kong Polytechnic University)

(参考訳) ディープラーニングは、導入後すぐに画像検索のためのゲームチェンジャーとなった。画像検索のコアとして特徴抽出(表現学習による)を促進し、関連性/マッチング評価を単純な類似度メトリクスに分解する。多くのアプリケーションでは、ランク付けされたリスト(例えば、医療画像中の標的タンパク質/細胞/レシオンの位置)を持つのではなく、一致する証拠を示す必要がある。一致した単語を検索エンジンでハイライトする必要があるように思える。しかし、明示的な適合/マッチングモデリングなしでは、これは実装が容易ではない。深層表現学習モデルはブラックボックスの性質のため実現不可能である。本稿では,深層学習における関連・マッチングモデリングの重要性を再考する。本研究は,表現学習を省略し,一致した証拠を直接モデル化することは可能であることを示す。事前学習されたモデルへの依存を取り除くことで、多くの関連する問題(例えば、分類と検索の間のドメインギャップ、畳み込みによる詳細拡散など)を避けてきた。さらに重要なことに、この研究は一致した証拠を生成するために、マッチングを明示的にモデル化し、後でバックトラックすることができることを示している。深い推論の説明可能性を向上させることができる。本手法はオックスフォード5kとパリ6kの両文献で最高の性能を示し,オックスフォード5kでは97.77%(パリ6kでは97.81%)の新記録を,深い特徴を抽出することなく達成した。

Deep learning became the game changer for image retrieval soon after it was introduced. It promotes the feature extraction (by representation learning) as the core of image retrieval, with the relevance/matching evaluation being degenerated into simple similarity metrics. In many applications, we need the matching evidence to be indicated rather than just have the ranked list (e.g., the locations of the target proteins/cells/lesions in medical images). It is like the matched words need to be highlighted in search engines. However, this is not easy to implement without explicit relevance/matching modeling. The deep representation learning models are not feasible because of their blackbox nature. In this paper, we revisit the importance of relevance/matching modeling in the deep learning era with an indicative retrieval setting. The study shows that it is possible to skip the representation learning and model the matching evidence directly. By removing the dependency on the pre-trained models, it has avoided a lot of related issues (e.g., the domain gap between classification and retrieval, the detail-diffusion caused by convolution, and so on). More importantly, the study demonstrates that the matching can be explicitly modeled and backtracked later for generating the matching evidence indications. It can improve the explainability of deep inference. Our method obtains the best performance in the literature on both Oxford-5k and Paris-6k, and sets a new record of 97.77% on Oxford-5k (97.81% on Paris-6k) without extracting any deep features.

翻訳日:2022-01-31 14:51:33 公開日:2022-01-28

# 感情応答検出のためのCAREデータセット

The CARE Dataset for Affective Response Detection ( http://arxiv.org/abs/2201.11895v1 )

ライセンス: Link先を確認

Jane A. Yu and Alon Y. Halevy

(参考訳) ソーシャルメディアは、友人や家族とのコミュニケーション、情報とエンターテイメントの消費において、ますます大きな役割を果たしている。したがって、ソーシャルメディア上で投稿の効果的なランキング関数を設計するには、投稿に対する感情的な反応を予測するのが有用である(例えば、ユーザーがユーモア、インスピレーション、怒り、インフォメーションを受けやすいかどうかなど)。感情認識の研究(記事の出版者の影響に焦点を当てている)と同様に、感情的反応を認識する伝統的なアプローチは、トレーニングデータの人間のアノテーションに投資するコストがかかる。そこで我々は,CARE(Common Affective Response Expression, CARE)法を用いて,7つの感情反応に基づいて注釈付き230kのソーシャルメディア投稿のデータセットであるCARE$_{db}$を紹介した。 CARE法は、投稿に反応して投稿されたコメントに存在している信号を活用する手段であり、人間のアノテーションなしで投稿に対する読者の感情反応に関する高精度な証拠を提供する。ヒューマンアノテーションとは異なり、ここで記述したアノテーションプロセスは、特に新しい感情的な反応のために、メソッドのカバレッジを拡大するために繰り返します。本稿では,CAREアノテーションがクラウドソースアノテーションと良好に比較できることを示す実験について述べる。最後に、CARE$_{db}$を使用して、競合するBERTベースのモデルをトレーニングし、感情検出だけでなく、関連するタスクに対するデータセットの有用性を実証する。

Social media plays an increasing role in our communication with friends and family, and our consumption of information and entertainment. Hence, to design effective ranking functions for posts on social media, it would be useful to predict the affective response to a post (e.g., whether the user is likely to be humored, inspired, angered, informed). Similar to work on emotion recognition (which focuses on the affect of the publisher of the post), the traditional approach to recognizing affective response would involve an expensive investment in human annotation of training data. We introduce CARE$_{db}$, a dataset of 230k social media posts annotated according to 7 affective responses using the Common Affective Response Expression (CARE) method. The CARE method is a means of leveraging the signal that is present in comments that are posted in response to a post, providing high-precision evidence about the affective response of the readers to the post without human annotation. Unlike human annotation, the annotation process we describe here can be iterated upon to expand the coverage of the method, particularly for new affective responses. We present experiments that demonstrate that the CARE annotations compare favorably with crowd-sourced annotations. Finally, we use CARE$_{db}$ to train competitive BERT-based models for predicting affective response as well as emotion detection, demonstrating the utility of the dataset for related tasks.
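Labeling a post from the comments written in response to it can be sketched with simple keyword patterns and a vote threshold. This is only an illustration of the general idea: the abstract does not specify how CARE extracts the signal, and the patterns, label names, and `min_votes` threshold below are hypothetical.

```python
import re
from collections import Counter
from typing import List, Optional

# Hypothetical patterns for three illustrative affective responses.
PATTERNS = {
    "amused": re.compile(r"\b(so funny|hilarious|lol)\b", re.IGNORECASE),
    "angered": re.compile(r"\b(makes me (so )?angry|outrageous)\b", re.IGNORECASE),
    "inspired": re.compile(r"\b(so inspiring|inspirational)\b", re.IGNORECASE),
}


def label_post(comments: List[str], min_votes: int = 2) -> Optional[str]:
    """Return the most frequent affective response among comments, or None.

    Each comment contributes at most one vote per label. Requiring several
    agreeing comments trades coverage for precision.
    """
    votes: Counter = Counter()
    for comment in comments:
        for label, pattern in PATTERNS.items():
            if pattern.search(comment):
                votes[label] += 1
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count >= min_votes else None
```

Because the labels come from reader comments rather than from annotators, extending coverage to a new affective response in a scheme like this amounts to adding a new entry to the pattern table and re-running the labeling.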

翻訳日:2022-01-31 14:51:10 公開日:2022-01-28

# NLPのためのセキュアで効率的なフェデレート学習フレームワーク

A Secure and Efficient Federated Learning Framework for NLP ( http://arxiv.org/abs/2201.11934v1 )

ライセンス: Link先を確認

Jieren Deng, Chenghong Wang, Xianrui Meng, Yijue Wang, Ji Li, Sheng Lin, Shuo Han, Fei Miao, Sanguthevar Rajasekaran, Caiwen Ding

(参考訳) 本稿では,安全かつ効率的な連合学習(fl)フレームワークの設計について考察する。既存のソリューションは、信頼できるアグリゲータを含むか、重厚な暗号プリミティブを必要とする。さらに、既存のセキュアなFL設計の多くは、トレーニングプロトコルからクライアントを排除できないという制限的な仮定の下でのみ機能します。これらの問題に対処するために,(1)信頼エンティティの必要性をなくすセキュアで効率的なFLフレームワークSEFLを提案し,(2)既存のFL設計と類似したモデル精度を達成し,(3)クライアントのドロップアウトに対して耐性を持つ。自然言語処理(NLP)タスクに関する広範な実験的研究を通じて,SEFLが既存のFLソリューションと同等の精度を実現し,提案手法により実行時の性能を最大13.7倍に向上させることができることを示した。

In this work, we consider the problem of designing secure and efficient federated learning (FL) frameworks. Existing solutions either involve a trusted aggregator or require heavyweight cryptographic primitives, which degrades performance significantly. Moreover, many existing secure FL designs work only under the restrictive assumption that none of the clients can drop out of the training protocol. To tackle these problems, we propose SEFL, a secure and efficient FL framework that (1) eliminates the need for trusted entities; (2) achieves similar and even better model accuracy compared with existing FL designs; (3) is resilient to client dropouts. Through extensive experimental studies on natural language processing (NLP) tasks, we demonstrate that SEFL achieves accuracy comparable to existing FL solutions, and the proposed pruning technique can improve runtime performance up to 13.7x.

翻訳日:2022-01-31 14:50:45 公開日:2022-01-28

# 音声言語理解におけるセット予測のためのエンドツーエンドモデルの改善

Improving End-to-End Models for Set Prediction in Spoken Language Understanding ( http://arxiv.org/abs/2201.12105v1 )

ライセンス: Link先を確認

Hong-Kwang J. Kuo, Zoltan Tuske, Samuel Thomas, Brian Kingsbury, George Saon

(参考訳) 音声言語理解システム(SLU)の目標は,入力音声信号の意味を決定することである。エンド・ツー・エンド(E2E)音声モデリングの進歩により、動詞の転写よりもはるかに安価に収集できるセマンティック・エンティティのみを訓練できるようになった。我々は、エンティティの順序が未定であるこのセットの予測問題に焦点を当てる。 RNNトランスデューサとアテンションベースエンコーダ-デコーダの2種類のE2Eモデルを用いて,トレーニングエンティティシーケンスを音声順に並べた場合,これらのモデルが最もよく動作することを示す。エンティティ音声の順序が不明な場合、E2E SLUモデルを改善するために、暗黙の注意に基づくアライメント手法とともに、新しいデータ拡張手法を提案する。 F1スコアは、RNN-Tで11%以上増加し、アテンションベースのエンコーダデコーダSLUモデルで約2%増加した。

The goal of spoken language understanding (SLU) systems is to determine the meaning of the input speech signal, unlike speech recognition which aims to produce verbatim transcripts. Advances in end-to-end (E2E) speech modeling have made it possible to train solely on semantic entities, which are far cheaper to collect than verbatim transcripts. We focus on this set prediction problem, where entity order is unspecified. Using two classes of E2E models, RNN transducers and attention based encoder-decoders, we show that these models work best when the training entity sequence is arranged in spoken order. To improve E2E SLU models when entity spoken order is unknown, we propose a novel data augmentation technique along with an implicit attention based alignment method to infer the spoken order. F1 scores significantly increased by more than 11% for RNN-T and about 2% for attention based encoder-decoder SLU models, outperforming previously reported results.

翻訳日:2022-01-31 14:50:30 公開日:2022-01-28

# 安全強化学習のための制約付き変分政策最適化

Constrained Variational Policy Optimization for Safe Reinforcement Learning ( http://arxiv.org/abs/2201.11927v1 )

ライセンス: Link先を確認

Zuxin Liu, Zhepeng Cen, Vladislav Isenbaev, Wei Liu, Zhiwei Steven Wu, Bo Li, Ding Zhao

(参考訳) 安全強化学習(RL)は、安全クリティカルなアプリケーションにデプロイする前に、一定の制約を満たすポリシーを学ぶことを目的としている。一般的な制約付き最適化フレームワークであるprimal-dualは不安定な問題に苦しんでおり、最適性の保証が欠けている。本稿では,新しい確率的推論の観点から問題を克服し,安全政策を学習するための期待最大化方式を提案する。安全なRL問題は分解可能であることを示す。 1)非パラメトリック変分分布をもつ凸最適化位相と 2)教師付き学習段階。最適性と政策改善の安定性を証明し,制約付き変動政策最適化の独特な利点を示す。連続ロボットタスクに関する幅広い実験により,本手法は本手法よりも制約満足度とサンプル効率の点で有意に優れた性能が得られることが示された。

Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying to safety-critical applications. Primal-dual, a prevalent constrained optimization framework, suffers from instability issues and lacks optimality guarantees. This paper overcomes the issues from a novel probabilistic inference perspective and proposes an Expectation-Maximization style approach to learn a safe policy. We show that the safe RL problem can be decomposed into 1) a convex optimization phase with a non-parametric variational distribution and 2) a supervised learning phase. We show the unique advantages of constrained variational policy optimization by proving its optimality and policy improvement stability. A wide range of experiments on continuous robotic tasks show that the proposed method achieves significantly better performance in terms of constraint satisfaction and sample efficiency than primal-dual baselines.

翻訳日:2022-01-31 14:49:14 公開日:2022-01-28

# FCMNet: Full Communication Memory Net for Team-Level Cooperation in Multi-Agent Systems

FCMNet: Full Communication Memory Net for Team-Level Cooperation in Multi-Agent Systems ( http://arxiv.org/abs/2201.11994v1 )

ライセンス: Link先を確認

Yutong Wang and Guillaume Sartoretti

(参考訳) 部分観測可能なマルチエージェントシステムにおける分散協調は、エージェント間の効果的な通信を必要とする。この取り組みをサポートするため、本研究は、グローバルコミュニケーションが利用可能だが信頼性に欠ける可能性のある問題のクラスに焦点を当てている。エージェントが同時に学習できる強化学習ベースのアプローチであるFCMNetを導入する。 a) 効果的なマルチホップ通信プロトコル及び b)チームレベルの意思決定を可能にする共通の分散型政策。具体的には,エージェント間の通信メッセージとして,複数方向リカレントニューラルネットワークの隠れ状態を利用する。単純なマルチホップトポロジーを用いて,各エージェントに,各エージェントがシーケンシャルにエンコードした情報を各時間ステップ毎に受信する能力を与え,グローバルな協調性を改善する。 FCMNetは、共有報酬を伴うStarCraft IIマイクロマネジメントタスクの挑戦的なセットと、個別報酬を伴う協調的なマルチエージェントパスフィンディングタスクを実証する。そこで本研究では,FCMNetがStarCraft IIマイクロマネジメントタスクにおいて,最先端のコミュニケーションに基づく強化学習手法と,特定のタスクにおける価値分解手法より優れていることを示す。さらに,ランダムなメッセージ損失や2元化メッセージ(非微分可能通信チャネル)といった現実的通信障害下でのfcmnetのロバスト性について検討し,様々な実環境下でのロボットタスクへのfmcnetの適用可能性を示す。

Decentralized cooperation in partially-observable multi-agent systems requires effective communications among agents. To support this effort, this work focuses on the class of problems where global communications are available but may be unreliable, thus precluding differentiable communication learning methods. We introduce FCMNet, a reinforcement learning based approach that allows agents to simultaneously learn a) an effective multi-hop communications protocol and b) a common, decentralized policy that enables team-level decision-making. Specifically, our proposed method utilizes the hidden states of multiple directional recurrent neural networks as communication messages among agents. Using a simple multi-hop topology, we endow each agent with the ability to receive information sequentially encoded by every other agent at each time step, leading to improved global cooperation. We demonstrate FCMNet on a challenging set of StarCraft II micromanagement tasks with shared rewards, as well as a collaborative multi-agent pathfinding task with individual rewards. There, our comparison results show that FCMNet outperforms state-of-the-art communication-based reinforcement learning methods in all StarCraft II micromanagement tasks, and value decomposition methods in certain tasks. We further investigate the robustness of FCMNet under realistic communication disturbances, such as random message loss or binarized messages (i.e., non-differentiable communication channels), to showcase FCMNet's potential applicability to robotic tasks under a variety of real-world conditions.

翻訳日:2022-01-31 14:49:04 公開日:2022-01-28

# バイオインスパイアされたCortexベースの高速コードブック生成

Bioinspired Cortex-based Fast Codebook Generation ( http://arxiv.org/abs/2201.12322v1 )

ライセンス: Link先を確認

Meric Yucel, Serdar Bagis, Ahmet Sertbas, Mehmet Sarikaya, Burak Berk Ustundag

(参考訳) 人工知能の主な原型は、一般化性能を高めながら時間効率と正確性を促進するアルゴリズムの開発である。機械学習の最近の発展にもかかわらず、重要な制限は初期データから非効率な特徴抽出であり、これは性能最適化に不可欠である。本稿では,脳内の知覚皮質ネットワークに触発された特徴抽出手法を提案する。バイオインスパイアされた皮質と呼ばれるこのアルゴリズムは、圧縮された形式でデータを処理しながら、優れた計算効率でストリーミング信号からの直交的特徴への収束を提供する。本稿では,Birch,GMM,K-meansなどの一般的なクラスタリングアルゴリズムと比較し,人工的な複雑なデータを用いた新しいアルゴリズムの性能を示す。データ処理時間は大幅に短縮されるが、数秒対時間では符号化歪みは、より一般化の基盤となる新しいアルゴリズムで本質的に同じである。ここでは、クラスタリングとベクトル量子化における大脳皮質モデルの優れた性能を示すが、推論、異常検出、大範囲アプリケーションでの分類、例えば金融、サイバーセキュリティ、医療といった機械学習の基本コンポーネントに強力な実装機会を提供する。

A major archetype of artificial intelligence is developing algorithms facilitating temporal efficiency and accuracy while boosting the generalization performance. Even with the latest developments in machine learning, a key limitation has been the inefficient feature extraction from the initial data, which is essential in performance optimization. Here, we introduce a feature extraction method inspired by sensory cortical networks in the brain. Dubbed as bioinspired cortex, the algorithm provides convergence to orthogonal features from streaming signals with superior computational efficiency while processing data in compressed form. We demonstrate the performance of the new algorithm using artificially created complex data by comparing it with the commonly used traditional clustering algorithms, such as Birch, GMM, and K-means. While the data processing time is significantly reduced, seconds versus hours, encoding distortions remain essentially the same in the new algorithm providing a basis for better generalization. Although we show herein the superior performance of the cortex model in clustering and vector quantization, it also provides potent implementation opportunities for machine learning fundamental components, such as reasoning, anomaly detection and classification in large scope applications, e.g., finance, cybersecurity, and healthcare.

翻訳日:2022-01-31 14:48:36 公開日:2022-01-28

# 連続行動空間における政策鏡の隠れバイアスについて

On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces ( http://arxiv.org/abs/2201.12332v1 )

ライセンス: Link先を確認

Amrit Singh Bedi, Souradip Chakraborty, Anjaly Parayil, Brian Sadler, Pratap Tokekar, Alec Koppel

We focus on parameterized policy search for reinforcement learning over continuous action spaces. Typically, one assumes the score function associated with a policy is bounded, which fails to hold even for Gaussian policies. To properly address this issue, one must introduce an exploration tolerance parameter to quantify the region in which it is bounded. Doing so incurs a persistent bias that appears in the attenuation rate of the expected policy gradient norm, which is inversely proportional to the radius of the action space. To mitigate this hidden bias, heavy-tailed policy parameterizations may be used, which exhibit a bounded score function, but doing so can cause instability in algorithmic updates. To address these issues, we study the convergence of policy gradient algorithms under heavy-tailed parameterizations, which we propose to stabilize with a combination of mirror ascent-type updates and gradient tracking. Our main theoretical contribution is to establish that this scheme converges with constant step and batch sizes, whereas prior works require these parameters to shrink to zero or grow to infinity, respectively. Experimentally, this scheme under a heavy-tailed policy parameterization yields improved reward accumulation across a variety of settings as compared with standard benchmarks.
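The contrast between unbounded and bounded score functions can be illustrated with a minimal sketch. It assumes a one-dimensional action and takes the score with respect to the location parameter only; the function names are hypothetical, and the Cauchy distribution is used here simply as one familiar heavy-tailed family, not necessarily the parameterization studied in the paper.

```python
def gaussian_score_mean(action: float, mean: float, std: float) -> float:
    """Derivative of log N(action | mean, std^2) w.r.t. mean; grows linearly in |action - mean|."""
    return (action - mean) / (std ** 2)


def cauchy_score_location(action: float, loc: float, scale: float) -> float:
    """Derivative of the log Cauchy density w.r.t. loc; its magnitude never exceeds 1 / scale."""
    diff = action - loc
    return 2.0 * diff / (scale ** 2 + diff ** 2)


# The Gaussian score keeps growing with the action, the Cauchy score does not.
for a in (1.0, 10.0, 1000.0):
    print(a, gaussian_score_mean(a, 0.0, 1.0), cauchy_score_location(a, 0.0, 1.0))
```

On an unbounded action space the Gaussian score is therefore unbounded, which is why a bounded-score assumption cannot hold without restricting the region of actions considered.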

Translated: 2022-01-31 14:48:17 Published: 2022-01-28

# Consistent Collaborative Filtering via Tensor Decomposition ( http://arxiv.org/abs/2201.11936v1 )

License: see the linked page

Shiwen Zhao, Charles Crissman, Guillermo R Sapiro

Collaborative filtering is the de facto standard for analyzing users' activities and building recommendation systems for items. In this work we develop Sliced Anti-symmetric Decomposition (SAD), a new model for collaborative filtering based on implicit feedback. In contrast to traditional techniques, where latent representations of users (user vectors) and items (item vectors) are estimated, SAD introduces one additional latent vector for each item, using a novel three-way tensor view of user-item interactions. This new vector extends user-item preferences calculated by standard dot products to general inner products, producing interactions between items when evaluating their relative preferences. SAD reduces to state-of-the-art (SOTA) collaborative filtering models when the vector collapses to one, while in this paper we allow its value to be estimated from data. The proposed SAD model is simple, resulting in an efficient group stochastic gradient descent (SGD) algorithm. We demonstrate the efficiency of SAD on both simulated and real-world datasets containing over 1M user-item interactions. By comparing SAD with seven alternative SOTA collaborative filtering models, we show that SAD is able to estimate personalized preferences more consistently.

Translated: 2022-01-31 14:47:57 Published: 2022-01-28

# BCDAG: An R package for Bayesian structure and Causal learning of Gaussian DAGs ( http://arxiv.org/abs/2201.12003v1 )

License: see the linked page

Federico Castelletti and Alessandro Mascaro

Directed Acyclic Graphs (DAGs) provide a powerful framework to model causal relationships among variables in multivariate settings; in addition, through the do-calculus theory, they allow for the identification and estimation of causal effects between variables even from purely observational data. In this setting, the process of inferring the DAG structure from the data is referred to as causal structure learning or causal discovery. We introduce BCDAG, an R package for Bayesian causal discovery and causal effect estimation from Gaussian observational data, implementing the Markov chain Monte Carlo (MCMC) scheme proposed by Castelletti & Mascaro (2021). Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, with the number of variables in the dataset. The package also provides functions for convergence diagnostics and for visualizing and summarizing posterior inference. In this paper, we present the key features of the underlying methodology along with its implementation in BCDAG. We then illustrate the main functions and algorithms on both real and simulated datasets.

Translated: 2022-01-31 14:47:33 Published: 2022-01-28

# Provably Improving Expert Predictions with Conformal Prediction ( http://arxiv.org/abs/2201.12006v1 )

License: see the linked page

Eleni Straitouri and Lequn Wang and Nastaran Okati and Manuel Gomez Rodriguez

Automated decision support systems promise to help human experts solve tasks more efficiently and accurately. However, existing systems typically require experts to understand when to cede agency to the system and when to exercise their own agency. Moreover, if the experts develop a misplaced trust in the system, their performance may worsen. In this work, we lift the above requirement and develop automated decision support systems that, by design, do not require experts to understand when to trust them in order to provably improve their performance. To this end, we focus on multiclass classification tasks and consider automated decision support systems that, for each data sample, use a classifier to recommend a subset of labels to a human expert. We first show that, by looking at the design of such systems from the perspective of conformal prediction, we can ensure that the probability that the recommended subset of labels contains the true label matches a target probability value almost exactly. Then, we identify the set of target probability values under which the human expert is provably better off predicting a label among those in the recommended subset, and we develop an efficient practical method to find a near-optimal target probability value. Experiments on synthetic and real data demonstrate that our system can help the experts make more accurate predictions and is robust to the accuracy of the classifier it relies on.
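How a classifier can recommend a label subset with a target coverage can be sketched with standard split conformal prediction. This is a generic illustration, not the authors' system: the nonconformity score (one minus the probability assigned to a label), the calibration values, and the function names are all assumptions made for the example.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n + 1)(1 - alpha))-th smallest calibration score (inf if out of range)."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank into the sorted scores
    if rank > n:
        return float("inf")  # too few calibration points: every label must be included
    return sorted(cal_scores)[rank - 1]


def prediction_set(probs, threshold):
    """Labels whose nonconformity score 1 - p(label) does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]


# Hypothetical calibration scores: 1 - probability the classifier gave to the true label.
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.70]
q = conformal_threshold(cal_scores, alpha=0.25)

# Hypothetical class probabilities for a new sample; the expert would choose within `subset`.
subset = prediction_set([0.50, 0.46, 0.03, 0.01], q)
print(q, subset)
```

Under exchangeability of calibration and test samples, a threshold chosen this way yields subsets that contain the true label with probability at least 1 - alpha; choosing which alpha actually helps the expert is the question the abstract addresses.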

Translated: 2022-01-31 14:47:14 Published: 2022-01-28

# Interplay between depth of neural networks and locality of target functions ( http://arxiv.org/abs/2201.12082v1 )

License: see the linked page

Takashi Mori, Masahito Ueda

It has been recognized that heavily overparameterized deep neural networks (DNNs) exhibit surprisingly good generalization performance in various machine-learning tasks. Although the benefits of depth have been investigated from different perspectives, such as approximation theory and statistical learning theory, existing theories do not adequately explain the empirical success of overparameterized DNNs. In this work, we report a remarkable interplay between depth and the locality of a target function. We introduce $k$-local and $k$-global functions, and find that depth is beneficial for learning local functions but detrimental to learning global functions. This interplay is not properly captured by the neural tangent kernel, which describes an infinitely wide neural network within the lazy learning regime.

Translated: 2022-01-31 14:46:51 Published: 2022-01-28

# Sampling Theorems for Learning from Incomplete Measurements ( http://arxiv.org/abs/2201.12151v1 )

License: see the linked page

Julián Tachella, Dongdong Chen and Mike Davies

In many real-world settings, only incomplete measurement data are available, which can pose a problem for learning. Unsupervised learning of the signal model using a fixed incomplete measurement process is impossible in general, as there is no information in the nullspace of the measurement operator. This limitation can be overcome by using measurements from multiple operators. While this idea has been successfully applied in various applications, a precise characterization of the conditions for learning is still lacking. In this paper, we fill this gap by presenting necessary and sufficient conditions for learning the signal model, which indicate the interplay between the number of distinct measurement operators $G$, the number of measurements per operator $m$, the dimension of the model $k$, and the dimension of the signals $n$. In particular, we show that unsupervised learning is generically possible if each operator obtains at least $m>k+n/G$ measurements. Our results are agnostic to the learning algorithm and have implications for a wide range of practical algorithms, from low-rank matrix recovery to deep neural networks.
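The bound $m>k+n/G$ quoted above is easy to evaluate for concrete sizes. The sketch below only does that arithmetic; the helper name and the example dimensions are hypothetical, and integer arithmetic is used so that the strict inequality is handled exactly.

```python
def min_measurements_per_operator(k: int, n: int, num_operators: int) -> int:
    """Smallest integer m satisfying m > k + n / G, computed without floating point."""
    # m > k + n / G  is equivalent to  m * G > k * G + n
    return (k * num_operators + n) // num_operators + 1


# Hypothetical sizes: model dimension k = 10, signal dimension n = 100.
for num_operators in (1, 3, 4, 10):
    print(num_operators, min_measurements_per_operator(10, 100, num_operators))
```

With a single operator the bound exceeds the signal dimension itself (111 > 100 in this example), so no incomplete measurement process can meet it, while adding operators brings the required number of measurements per operator down toward $k$.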

Translated: 2022-01-31 14:46:38 Published: 2022-01-28

# Optimal Transport Tools (OTT): A JAX Toolbox for all things Wasserstein ( http://arxiv.org/abs/2201.12324v1 )

License: CC BY 4.0

Marco Cuturi, Laetitia Meng-Papaxanthos, Yingtao Tian, Charlotte Bunne, Geoff Davis, Olivier Teboul

Optimal transport tools (OTT-JAX) is a Python toolbox that can solve optimal transport problems between point clouds and histograms. The toolbox builds on various JAX features, such as automatic and custom reverse-mode differentiation, vectorization, just-in-time compilation, and accelerator support. The toolbox covers elementary computations, such as the resolution of the regularized OT problem, and more advanced extensions, such as barycenters, Gromov-Wasserstein, low-rank solvers, estimation of convex maps, differentiable generalizations of quantiles and ranks, and approximate OT between Gaussian mixtures. The toolbox code is available at https://github.com/ott-jax/ott.
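The "regularized OT problem" mentioned above is commonly solved with Sinkhorn iterations. The sketch below is a plain-Python illustration of that idea and deliberately does not use or imitate the OTT-JAX API; the function name, the toy point clouds, and the regularization strength `epsilon` are assumptions chosen for the example.

```python
import math


def sinkhorn(a, b, cost, epsilon=1.0, num_iters=200):
    """Entropy-regularized OT between histograms a and b; returns the coupling matrix."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(num_iters):
        # Alternately rescale rows and columns so the marginals match a and b.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Two small 1D point clouds with weights, and a squared-distance cost matrix.
x = [0.0, 1.0]
y = [0.0, 1.0, 2.0]
a = [0.5, 0.5]
b = [0.25, 0.5, 0.25]
cost = [[(xi - yj) ** 2 for yj in y] for xi in x]
plan = sinkhorn(a, b, cost)
for row in plan:
    print([round(p, 4) for p in row])
```

A library implementation would typically work in the log domain for numerical stability and run on accelerators; this version only shows the alternating marginal scaling.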

Translated: 2022-01-31 14:45:37 Published: 2022-01-28

# 3D-FlowNet: Event-based optical flow estimation with 3D representation ( http://arxiv.org/abs/2201.12265v1 )

License: see the linked page

Haixin Sun, Minh-Quan Dao, Vincent Fremont

Event-based cameras can overcome the limitations of frame-based cameras for important tasks such as high-speed motion detection during self-driving car navigation in low-illumination conditions. The high temporal resolution and high dynamic range of event cameras allow them to work in fast-motion and extreme-light scenarios. However, conventional computer vision methods, such as deep neural networks, are not well adapted to event data, which are asynchronous and discrete. Moreover, the traditional 2D-encoding representation methods for event data sacrifice time resolution. In this paper, we first improve the 2D-encoding representation by expanding it into three dimensions to better preserve the temporal distribution of the events. We then propose 3D-FlowNet, a novel network architecture that can process the 3D input representation and output optical flow estimations according to the new encoding methods. A self-supervised training strategy is adopted to compensate for the lack of labeled datasets for event-based cameras. Finally, the proposed network is trained and evaluated with the Multi-Vehicle Stereo Event Camera (MVSEC) dataset. The results show that our 3D-FlowNet outperforms state-of-the-art approaches with fewer training epochs (30 compared to 100 for Spike-FlowNet).
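The idea of adding a temporal axis to an event encoding can be illustrated with a simple voxel grid. The abstract does not specify the paper's actual encoding, so the sketch below is only a generic temporal-binning example: the event tuple layout `(x, y, t, polarity)`, the function name, and the toy events are assumptions.

```python
def events_to_voxel_grid(events, height, width, num_bins):
    """Accumulate event polarities into a [num_bins][height][width] grid.

    events: list of (x, y, t, polarity) tuples with polarity in {-1, +1}.
    """
    grid = [[[0 for _ in range(width)] for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return grid
    t_min = min(e[2] for e in events)
    t_max = max(e[2] for e in events)
    span = (t_max - t_min) or 1.0  # avoid division by zero when all timestamps coincide
    for x, y, t, polarity in events:
        # Map the timestamp to a temporal bin index in [0, num_bins - 1].
        b = min(int((t - t_min) / span * num_bins), num_bins - 1)
        grid[b][y][x] += polarity
    return grid


# Four toy events on a 2x2 sensor, split into two temporal bins.
events = [(0, 0, 0.00, +1), (1, 0, 0.01, -1), (0, 0, 0.06, +1), (1, 1, 0.10, +1)]
grid = events_to_voxel_grid(events, height=2, width=2, num_bins=2)
print(grid)
```

A purely 2D accumulation would merge the two events at pixel (0, 0) into a single count, whereas the binned grid keeps them in separate temporal slices, which is the kind of timing information a 3D representation is meant to preserve.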

Translated: 2022-01-31 14:38:04 Published: 2022-01-28

# Benchmarking Conventional Vision Models on Neuromorphic Fall Detection and Action Recognition Dataset ( http://arxiv.org/abs/2201.12285v1 )

License: see the linked page

Karthik Sivarama Krishnan and Koushik Sivarama Krishnan

Neuromorphic vision-based sensors have been gaining popularity in recent years owing to their ability to capture spatio-temporal events with low-power sensing. Unlike traditional cameras, these sensors record events or spikes, which helps preserve the privacy of the subject being recorded. These events are captured as per-pixel brightness changes, and the output data stream is encoded with time, location, and pixel intensity change information. This paper proposes and benchmarks the performance of fine-tuned conventional vision models on neuromorphic human action recognition and fall detection datasets. The spatio-temporal event streams from the Dynamic Vision Sensing cameras are encoded into a standard sequence of image frames. These video frames are used for benchmarking conventional deep learning-based architectures. In the proposed approach, we fine-tune state-of-the-art vision models for this Dynamic Vision Sensing (DVS) application and name these models DVS-R2+1D, DVS-CSN, DVS-C2D, DVS-SlowFast, DVS-X3D, and DVS-MViT. Comparing the performance of these models, we see that the current state-of-the-art MViT-based architecture, DVS-MViT, outperforms all the other models with an accuracy of 0.958 and an F1 score of 0.958. The second best is DVS-C2D, with an accuracy of 0.916 and an F1 score of 0.916. Third and fourth are DVS-R2+1D and DVS-SlowFast, with accuracies of 0.875 and 0.833 and F1 scores of 0.875 and 0.861, respectively. DVS-CSN and DVS-X3D were the lowest-performing models, with accuracies of 0.708 and 0.625 and F1 scores of 0.722 and 0.625, respectively.

Translated: 2022-01-31 14:37:43 Published: 2022-01-28

# Multiple-Source Domain Adaptation via Coordinated Domain Encoders and Paired Classifiers ( http://arxiv.org/abs/2201.11870v1 )

License: see the linked page

Payam Karisani

We present a novel multiple-source unsupervised model for text classification under domain shift. Our model exploits the update rates in document representations to dynamically integrate domain encoders. It also employs a probabilistic heuristic to infer the error rate in the target domain in order to pair source classifiers. Our heuristic exploits the data transformation cost and the classifier accuracy in the target feature space. We use real-world domain adaptation scenarios to evaluate the efficacy of our algorithm. We also use pretrained multi-layer transformers as the document encoder in the experiments to examine whether the improvement achieved by domain adaptation models can be delivered by out-of-the-box language model pretraining. The experiments show that our model is the top-performing approach in this setting.

Translated: 2022-01-31 14:37:02 Published: 2022-01-28

# Generative Cooperative Networks for Natural Language Generation ( http://arxiv.org/abs/2201.12320v1 )

License: see the linked page

Sylvain Lamprier and Thomas Scialom and Antoine Chaffin and Vincent Claveau and Ewa Kijak and Jacopo Staiano and Benjamin Piwowarski

Generative Adversarial Networks (GANs) have achieved tremendous success on many continuous generation tasks, especially in the field of image generation. However, for discrete outputs such as language, optimizing GANs remains an open problem with many instabilities, as no gradient can be properly back-propagated from the discriminator output to the generator parameters. An alternative is to learn the generator network via reinforcement learning, using the discriminator signal as a reward, but such a technique suffers from moving rewards and vanishing gradient problems. Moreover, it often falls short compared to direct maximum-likelihood approaches. In this paper, we introduce Generative Cooperative Networks, in which the discriminator architecture is used cooperatively with the generation policy to output samples of realistic text for the task at hand. We give theoretical guarantees of convergence for our approach, and study various efficient decoding schemes to empirically achieve state-of-the-art results in two main NLG tasks.

Translated: 2022-01-31 14:36:51 Published: 2022-01-28

# Gradient Masked Averaging for Federated Learning ( http://arxiv.org/abs/2201.11986v1 )

License: see the linked page

Irene Tenison, Sai Aravind Sreeramadas, Vaikkunth Mugunthan, Edouard Oyallon, Eugene Belilovsky, Irina Rish

Federated learning is an emerging paradigm that permits a large number of clients with heterogeneous data to coordinate learning of a unified global model without the need to share data with each other. Standard federated learning algorithms involve averaging model parameters or gradient updates to approximate the global model at the server. However, in heterogeneous settings averaging can result in information loss and lead to poor generalization due to the bias induced by dominant clients. We hypothesize that to generalize better across non-i.i.d. datasets, as in FL settings, the algorithms should focus on learning the invariant mechanism that is constant while ignoring spurious mechanisms that differ across clients. Inspired by recent work in the out-of-distribution literature, we propose a gradient masked averaging approach for federated learning as an alternative to the standard averaging of client updates. This client update aggregation technique can be adopted as a drop-in replacement in most existing federated algorithms. We perform extensive experiments with the gradient masked approach on multiple FL algorithms with in-distribution, real-world, and out-of-distribution (as the worst-case scenario) test datasets and show that it provides consistent improvements, particularly in the case of heterogeneous clients.
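One way to picture masking during aggregation is to keep only the coordinates on which client updates agree in sign. The sketch below is an assumption-laden illustration rather than the authors' exact rule: the hard 0/1 mask, the agreement score, the `threshold` value, and the function names are all choices made for the example.

```python
def sign(value):
    """Return -1, 0, or +1 according to the sign of value."""
    return (value > 0) - (value < 0)


def gradient_masked_average(client_updates, threshold=0.8):
    """Average client updates, zeroing coordinates whose signs disagree across clients."""
    num_clients = len(client_updates)
    dim = len(client_updates[0])
    result = []
    for j in range(dim):
        column = [update[j] for update in client_updates]
        mean = sum(column) / num_clients
        # Agreement lies in [0, 1]; it is 1 when all clients push this coordinate the same way.
        agreement = abs(sum(sign(g) for g in column)) / num_clients
        mask = 1.0 if agreement >= threshold else 0.0
        result.append(mask * mean)
    return result


# Four hypothetical clients, three parameters; the second coordinate has conflicting signs.
updates = [[0.5, -0.2, 0.1], [0.25, 0.4, 0.3], [0.75, -0.6, 0.2], [0.5, 0.2, 0.2]]
masked = gradient_masked_average(updates)
print(masked)
```

Because the function takes a list of client updates and returns a single aggregated update, it has the same interface as plain averaging, which is what makes a drop-in replacement of the aggregation step plausible.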

Translated: 2022-01-31 14:35:05 Published: 2022-01-28

# Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning ( http://arxiv.org/abs/2201.12143v1 )

License: see the linked page

Amit Dhurandhar, Karthikeyan Ramamurthy, Kartik Ahuja and Vijay Arya

The locally interpretable model-agnostic explanations (LIME) method is one of the most popular methods used to explain black-box models at a per-example level. Although many variants have been proposed, few provide a simple way to produce high-fidelity explanations that are also stable and intuitive. In this work, we provide a novel perspective by proposing a model-agnostic local explanation method inspired by the invariant risk minimization (IRM) principle, originally proposed for (global) out-of-distribution generalization, to provide such high-fidelity explanations that are also stable and unidirectional across nearby examples. Our method is based on a game-theoretic formulation, where we theoretically show that our approach has a strong tendency to eliminate features for which the gradient of the black-box function abruptly changes sign in the locality of the example we want to explain, while in other cases it is more careful and chooses a more conservative (feature) attribution, a behavior which can be highly desirable for recourse. Empirically, we show on tabular, image, and text data that the quality of our explanations with neighborhoods formed using random perturbations is much better than that of LIME and in some cases even comparable to other methods that use realistic neighbors sampled from the data manifold. This is desirable given that learning a manifold, either to create realistic neighbors or to project explanations, is typically expensive or may even be impossible. Moreover, our algorithm is simple and efficient to train, and can ascertain stable input features for local decisions of a black box without access to side information such as a (partial) causal graph, as has been seen in some recent works.

Translated: 2022-01-31 14:34:41 Published: 2022-01-28

# Safe Policy Improvement Approaches on Discrete Markov Decision Processes ( http://arxiv.org/abs/2201.12175v1 )

License: see the linked page

Philipp Scholl, Felix Dietrich, Clemens Otte, Steffen Udluft

Safe Policy Improvement (SPI) aims at provable guarantees that a learned policy is at least approximately as good as a given baseline policy. Building on SPI with Soft Baseline Bootstrapping (Soft-SPIBB) by Nadjahi et al., we identify theoretical issues in their approach, provide a corrected theory, and derive a new algorithm that is provably safe on finite Markov Decision Processes (MDPs). Additionally, we provide a heuristic algorithm that exhibits the best performance among many state-of-the-art SPI algorithms on two different benchmarks. Furthermore, we introduce a taxonomy of SPI algorithms and empirically show an interesting property of two classes of SPI algorithms: while the mean performance of algorithms that incorporate the uncertainty as a penalty on the action-value is higher, actively restricting the set of policies more consistently produces good policies and is thus safer.

Translated: 2022-01-31 14:34:09 Published: 2022-01-28

# A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data ( http://arxiv.org/abs/2201.12020v1 )

License: see the linked page

Florian Mouret, Alexandre Hippert-Ferrer, Frédéric Pascal, Jean-Yves Tourneret

This paper tackles the problem of missing data imputation for noisy and non-Gaussian data. A classical imputation method, the Expectation Maximization (EM) algorithm for Gaussian mixture models, has shown interesting properties when compared to other popular approaches such as those based on k-nearest neighbors or on multiple imputation by chained equations. However, Gaussian mixture models are known not to be robust to heterogeneous data, which can lead to poor estimation performance when the data are contaminated by outliers or come from a non-Gaussian distribution. To overcome this issue, a new expectation maximization algorithm is investigated for mixtures of elliptical distributions, with the useful property of handling potential missing data. The complete-data likelihood associated with mixtures of elliptical distributions is well adapted to the EM framework thanks to its conditional distribution, which is shown to be a Student distribution. Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data. Furthermore, experiments conducted on real-world datasets show that this algorithm is very competitive when compared to other classical imputation methods.

Translated: 2022-01-31 14:33:50 Published: 2022-01-28

# Multiscale Graph Comparison via the Embedded Laplacian Distance ( http://arxiv.org/abs/2201.12064v1 )

License: see the linked page

Edric Tam, David Dunson

We introduce a simple and fast method for comparing graphs of different sizes. Existing approaches are often either limited to comparing graphs with the same number of vertices or are computationally unscalable. We propose the Embedded Laplacian Distance (ELD) for comparing graphs of potentially vastly different sizes. Our approach first projects the graphs onto a common, low-dimensional Laplacian embedding space that respects graphical structure. This reduces the problem to that of comparing point clouds in a Euclidean space. A distance can then be computed efficiently via a natural sliced Wasserstein approach. We show that the ELD is a pseudo-metric and is invariant under graph isomorphism. We provide intuitive interpretations of the ELD using tools from spectral graph theory. We test the efficacy of the ELD approach extensively on both simulated and real data. The results obtained are excellent.

Translated: 2022-01-31 14:33:31 Published: 2022-01-28

# Approximate Bayesian Computation with Domain Expert in the Loop ( http://arxiv.org/abs/2201.12090v1 )

License: see the linked page

Ayush Bharti, Louis Filstroff, Samuel Kaski

Approximate Bayesian computation (ABC) is a popular likelihood-free inference method for models with intractable likelihood functions. As ABC methods usually rely on comparing summary statistics of observed and simulated data, the choice of the statistics is crucial. This choice involves a trade-off between loss of information and dimensionality reduction, and is often determined based on domain knowledge. However, handcrafting and selecting suitable statistics is a laborious task involving multiple trial-and-error steps. In this work, we introduce an active learning method for ABC statistics selection which reduces the domain expert's work considerably. By involving the experts, we are able to handle misspecified models, unlike the existing dimension reduction methods. Moreover, empirical results show better posterior estimates than with existing methods when the simulation budget is limited.
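The role of the summary statistic in ABC can be seen in a minimal rejection-ABC sketch. This is the textbook baseline, not the active learning method of the paper; the Gaussian simulator, the uniform prior, the tolerance, and the function name are assumptions made for the example.

```python
import random
import statistics


def rejection_abc(observed_summary, num_draws=5000, tolerance=0.2, sample_size=20, seed=0):
    """Return prior draws of the mean whose simulated summary lands near the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(num_draws):
        theta = rng.uniform(-5.0, 5.0)  # prior over the unknown mean
        simulated = [rng.gauss(theta, 1.0) for _ in range(sample_size)]
        summary = statistics.fmean(simulated)  # summary statistic: the sample mean
        if abs(summary - observed_summary) < tolerance:
            accepted.append(theta)
    return accepted


posterior_draws = rejection_abc(observed_summary=2.0)
print(len(posterior_draws), statistics.fmean(posterior_draws))
```

The likelihood is never evaluated; only the distance between summaries is. Here the sample mean is an obvious choice, but for realistic simulators picking such statistics is exactly the laborious step the abstract describes.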

Translated: 2022-01-31 14:31:53 Published: 2022-01-28

# Biases in In Silico Evaluation of Molecular Optimization Methods and Bias-Reduced Evaluation Methodology ( http://arxiv.org/abs/2201.12163v1 )

License: see the linked page

Hiroshi Kajino, Kohei Miyaguchi, Takayuki Osogami

We are interested in in silico evaluation methodology for molecular optimization methods. Given a sample of molecules and their properties of interest, we wish not only to train an agent that can find molecules optimized with respect to the target property but also to evaluate its performance. A common practice is to train a predictor of the target property on the sample and use it for both training and evaluating the agent. We show that this evaluator potentially suffers from two biases: one is due to misspecification of the predictor, and the other is due to reusing the same sample for training and evaluation. We comprehensively discuss bias reduction methods for each of the biases and empirically investigate their effectiveness.

Translated: 2022-01-31 14:31:42, Published: 2022-01-28

# Wasserstein Iterative Networks for Barycenter Estimation

http://arxiv.org/abs/2201.12245v1

License: see the linked page

Alexander Korotin, Vage Egiazarian, Lingxiao Li, Evgeny Burnaev

Wasserstein barycenters have become popular due to their ability to represent the average of probability measures in a geometrically meaningful way. In this paper, we present an algorithm to approximate the Wasserstein-2 barycenters of continuous measures via a generative model. Previous approaches rely on regularization (entropic/quadratic), which introduces bias, or on input convex neural networks, which are not expressive enough for large-scale tasks. In contrast, our algorithm does not introduce bias and allows using arbitrary neural networks. In addition, based on the celebrity faces dataset, we construct the Ave, celeba! dataset, which can be used for quantitative evaluation of barycenter algorithms using standard metrics of generative models such as FID.
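For intuition about what a Wasserstein-2 barycenter is, the one-dimensional case has a classical closed form: the barycenter's quantile function is the weighted average of the input quantile functions. The sketch below applies this to equal-size empirical samples. It is not the neural algorithm of the paper, and the function name `barycenter_1d` is hypothetical.

```python
def barycenter_1d(samples, weights):
    """Wasserstein-2 barycenter of 1D empirical measures with equal sample sizes.

    In 1D the optimal coupling matches order statistics, so the barycenter's
    i-th point is the weighted average of the i-th smallest point of each input.
    """
    sorted_samples = [sorted(s) for s in samples]
    n = len(sorted_samples[0])
    if any(len(s) != n for s in sorted_samples):
        raise ValueError("all samples must have the same size")
    return [
        sum(w * s[i] for w, s in zip(weights, sorted_samples))
        for i in range(n)
    ]


a = [0.0, 2.0, 1.0]
b = [10.0, 12.0, 11.0]
bary = barycenter_1d([a, b], weights=[0.5, 0.5])
```

With two samples that differ only by a shift, the barycenter is the same shape moved halfway between them, rather than the two-cluster mixture that a plain average of the measures would give.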

Translated: 2022-01-31 14:31:29, Published: 2022-01-28

# Shuffle Augmentation of Features from Unlabeled Data for Unsupervised Domain Adaptation

http://arxiv.org/abs/2201.11963v1

License: see the linked page

Changwei Xu, Jianfei Yang, Haoran Tang, Han Zou, Cheng Lu, Tianshuo Zhang

Unsupervised Domain Adaptation (UDA), a branch of transfer learning where labels for target samples are unavailable, has been widely researched and developed in recent years with the help of adversarially trained models. Although existing UDA algorithms are able to guide neural networks to extract transferable and discriminative features, classifiers are trained only under the supervision of labeled source data. Given the inevitable discrepancy between source and target domains, the classifiers can hardly be aware of the target classification boundaries. In this paper, Shuffle Augmentation of Features (SAF), a novel UDA framework, is proposed to address the problem by providing the classifier with supervisory signals from target feature representations. SAF learns from the target samples, adaptively distills class-aware target features, and implicitly guides the classifier to find comprehensive class borders. Extensive experiments demonstrate that the SAF module can be integrated into any existing adversarial UDA model to achieve performance improvements.

Translated: 2022-01-31 14:30:59, Published: 2022-01-28

# You Only Cut Once: Boosting Data Augmentation with a Single Cut

http://arxiv.org/abs/2201.12078v1

License: see the linked page

Junlin Han, Pengfei Fang, Weihao Li, Jie Hong, Mohammad Ali Armin, Ian Reid, Lars Petersson, Hongdong Li

We present You Only Cut Once (YOCO) for performing data augmentations. YOCO cuts one image into two pieces and performs data augmentations individually within each piece. Applying YOCO improves the diversity of the augmentation per sample and encourages neural networks to recognize objects from partial information. YOCO is parameter-free, easy to use, and boosts almost all augmentations at no extra cost. Thorough experiments are conducted to evaluate its effectiveness. We first demonstrate that YOCO can be seamlessly applied to various data augmentations and neural network architectures, and brings performance gains on CIFAR and ImageNet classification tasks, sometimes surpassing conventional image-level augmentation by large margins. Moreover, we show that YOCO benefits contrastive pre-training toward a more powerful representation that can be better transferred to multiple downstream tasks. Finally, we study a number of variants of YOCO and empirically analyze the performance for respective settings. Code is available at GitHub.
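The cut-then-augment idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' released code: images are plain nested lists, the cut is placed at the midpoint with a randomly chosen orientation, and a random horizontal flip stands in for an arbitrary shape-preserving augmentation. The names `yoco` and `random_hflip` are hypothetical.

```python
import random


def random_hflip(img, rng):
    """Stand-in augmentation (assumption): flip left-right with probability 0.5."""
    return [row[::-1] for row in img] if rng.random() < 0.5 else img


def yoco(img, augment, rng):
    """Cut img (a list of pixel rows) once, augment each piece, and re-join.

    The augmentation must preserve the shape of each piece.
    """
    h, w = len(img), len(img[0])
    if rng.random() < 0.5:
        # Cut along the width: left and right halves.
        left = [row[: w // 2] for row in img]
        right = [row[w // 2:] for row in img]
        left, right = augment(left, rng), augment(right, rng)
        return [l + r for l, r in zip(left, right)]
    # Cut along the height: top and bottom halves.
    top, bottom = img[: h // 2], img[h // 2:]
    return augment(top, rng) + augment(bottom, rng)


image = [[r * 4 + c for c in range(4)] for r in range(4)]
augmented = yoco(image, random_hflip, random.Random(0))
```

Because each half draws its own augmentation parameters, a single image can yield combinations (for example, one half flipped and the other not) that image-level augmentation never produces.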

Translated: 2022-01-31 14:30:42, Published: 2022-01-28

# Summarizing Differences between Text Distributions with Natural Language

http://arxiv.org/abs/2201.12323v1

License: CC BY 4.0

Ruiqi Zhong, Charlie Snell, Dan Klein, Jacob Steinhardt

How do two distributions of texts differ? Humans are slow at answering this, since discovering patterns might require tediously reading through hundreds of samples. We propose to automatically summarize the differences by "learning a natural language hypothesis": given two distributions $D_{0}$ and $D_{1}$, we search for a description that is more often true for $D_{1}$, e.g., "is military-related." To tackle this problem, we fine-tune GPT-3 to propose descriptions with the prompt: "[samples of $D_{0}$] + [samples of $D_{1}$] + the difference between them is _____". We then re-rank the descriptions by checking how often they hold on a larger set of samples with a learned verifier. On a benchmark of 54 real-world binary classification tasks, while GPT-3 Curie (13B) only generates a description similar to human annotation 7% of the time, the performance reaches 61% with fine-tuning and re-ranking, and our best system using GPT-3 Davinci (175B) reaches 76%. We apply our system to describe distribution shifts, debug dataset shortcuts, summarize unknown tasks, and label text clusters, and present analyses based on automatically generated descriptions.
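The propose-then-re-rank pipeline described above can be sketched as follows. This is an illustration only: the exact prompt layout, the scoring rule (how often a description holds on $D_{1}$ minus how often it holds on $D_{0}$), and the keyword-based verifier standing in for the learned one are all assumptions, and no language model is called.

```python
def build_prompt(samples_d0, samples_d1):
    """Assemble a proposer prompt in the spirit of the abstract's template."""
    lines = ["Samples from D0:"]
    lines += [f"- {s}" for s in samples_d0]
    lines.append("Samples from D1:")
    lines += [f"- {s}" for s in samples_d1]
    lines.append("The difference between them is")
    return "\n".join(lines)


def rerank(descriptions, d0, d1, verifier):
    """Sort descriptions by how much more often they hold on D1 than on D0."""
    def score(desc):
        rate_d1 = sum(verifier(desc, s) for s in d1) / len(d1)
        rate_d0 = sum(verifier(desc, s) for s in d0) / len(d0)
        return rate_d1 - rate_d0
    return sorted(descriptions, key=score, reverse=True)


# Hypothetical stand-in for the learned verifier: simple keyword matching.
KEYWORDS = {
    "is military-related": ("army", "troops", "navy"),
    "mentions food": ("pizza", "soup"),
}


def keyword_verifier(description, sample):
    return any(k in sample.lower() for k in KEYWORDS[description])


d0 = ["I had soup for lunch.", "The weather is nice today.",
      "The army band played at the fair."]
d1 = ["Troops moved north overnight.", "The navy launched a new ship.",
      "Pizza was served to the army."]

prompt = build_prompt(d0[:2], d1[:2])
ranked = rerank(list(KEYWORDS), d0, d1, keyword_verifier)
```

In this toy data "is military-related" holds on all of `d1` and one third of `d0`, while "mentions food" holds equally often on both, so the military description ranks first.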

Translated: 2022-01-31 14:28:41, Published: 2022-01-28

# Can Wikipedia Help Offline Reinforcement Learning?

http://arxiv.org/abs/2201.12122v1

License: see the linked page

Machel Reid, Yutaro Yamada, Shixiang Shane Gu

Fine-tuning reinforcement learning (RL) models has been challenging because of a lack of large-scale off-the-shelf datasets as well as high variance in transferability among different environments. Recent work has looked at tackling offline RL from the perspective of sequence modeling, with improved results as a result of the introduction of the Transformer architecture. However, when the model is trained from scratch, it suffers from slow convergence speeds. In this paper, we look to take advantage of this formulation of reinforcement learning as sequence modeling and investigate the transferability of pre-trained sequence models on other domains (vision, language) when fine-tuned on offline RL tasks (control, games). To this end, we also propose techniques to improve transfer between these domains. Results show consistent performance gains in terms of both convergence speed and reward on a variety of environments, accelerating training by 3-6x and achieving state-of-the-art performance in a variety of tasks using Wikipedia-pretrained and GPT2 language models. We hope that this work not only sheds light on the potential of leveraging generic sequence modeling techniques and pre-trained models for RL, but also inspires future work on sharing knowledge between generative modeling tasks of completely different domains.

Translated: 2022-01-31 13:58:00, Published: 2022-01-28

# Feature Visualization within an Automated Design Assessment leveraging Explainable Artificial Intelligence Methods

http://arxiv.org/abs/2201.12107v1

License: see the linked page

Raoul Schönhof, Artem Werner, Jannes Elstner, Boldizsar Zopcsak, Ramez Awad, Marco Huber

Not only the automation of manufacturing processes but also the automation of automation procedures themselves is becoming increasingly relevant to automation research. In this context, automated capability assessment, mainly leveraged by deep learning systems driven by 3D CAD data, has been presented. Current assessment systems may be able to assess CAD data with regard to abstract features, e.g. the ability to automatically separate components from bulk goods, or the presence of gripping surfaces. Nevertheless, they suffer from being black box systems, where an assessment can be learned and generated easily, but without any geometrical indicator of the reasons for the system's decision. By utilizing explainable AI (xAI) methods, we attempt to open up the black box. Explainable AI methods have been used to assess whether a neural network has successfully learned a given task or to analyze which features of an input might lead to an adversarial attack. These methods aim to derive additional insights into a neural network by analyzing patterns from a given input and its impact on the network output. Within the NeuroCAD Project, xAI methods are used to identify geometrical features which are associated with a certain abstract feature. Within this work, a sensitivity analysis (SA), layer-wise relevance propagation (LRP), the Gradient-weighted Class Activation Mapping (Grad-CAM) method, and Local Interpretable Model-Agnostic Explanations (LIME) have been implemented in the NeuroCAD environment, making it possible not only to assess CAD models but also to identify features which have been relevant for the network decision. In the medium term, this might make it possible to identify regions of interest, helping product designers optimize their models with regard to assembly processes.

Translated: 2022-01-31 13:57:42, Published: 2022-01-28

# Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks

http://arxiv.org/abs/2201.12179v1

License: see the linked page

Lukas Struppek, Dominik Hintersdorf, Antonio De Almeida Correia, Antonia Adler, Kristian Kersting

Model inversion attacks (MIAs) aim to create synthetic images that reflect the class-wise characteristics from a target classifier's training data by exploiting the model's learned knowledge. Previous research has developed generative MIAs using generative adversarial networks (GANs) as image priors that are tailored to a specific target model. This makes the attacks time- and resource-consuming, inflexible, and susceptible to distributional shifts between datasets. To overcome these drawbacks, we present Plug & Play Attacks that loosen the dependency between the target model and image prior and enable the use of a single trained GAN to attack a broad range of targets with only minor attack adjustments needed. Moreover, we show that powerful MIAs are possible even with publicly available pre-trained GANs and under strong distributional shifts, whereas previous approaches fail to produce meaningful results. Our extensive evaluation confirms the improved robustness and flexibility of Plug & Play Attacks and their ability to create high-quality images revealing sensitive class characteristics.

Translated: 2022-01-31 13:57:11, Published: 2022-01-28

# Benchmarking Robustness of 3D Point Cloud Recognition Against Common Corruptions

http://arxiv.org/abs/2201.12296v1

License: see the linked page

Jiachen Sun, Qingzhao Zhang, Bhavya Kailkhura, Zhiding Yu, Chaowei Xiao, and Z. Morley Mao

Deep neural networks on 3D point cloud data have been widely used in the real world, especially in safety-critical applications. However, their robustness against corruptions is less studied. In this paper, we present ModelNet40-C, the first comprehensive benchmark on 3D point cloud corruption robustness, consisting of 15 common and realistic corruptions. Our evaluation shows a significant gap between the performances on ModelNet40 and ModelNet40-C for state-of-the-art (SOTA) models. To reduce the gap, we propose a simple but effective method by combining PointCutMix-R and TENT after evaluating a wide range of augmentation and test-time adaptation strategies. We identify a number of critical insights for future studies on corruption robustness in point cloud recognition. For instance, we unveil that Transformer-based architectures with proper training recipes achieve the strongest robustness. We hope our in-depth analysis will motivate the development of robust training strategies or architecture designs in the 3D point cloud domain. Our codebase and dataset are available at https://github.com/jiachens/ModelNet40-C

Translated: 2022-01-31 13:56:55, Published: 2022-01-28

# Class-Aware Generative Adversarial Transformers for Medical Image Segmentation

http://arxiv.org/abs/2201.10737v2

License: CC BY 4.0

Chenyu You, Ruihan Zhao, Fenglin Liu, Sandeep Chinchali, Ufuk Topcu, Lawrence Staib, James S. Duncan

Transformers have made remarkable progress towards modeling long-range dependencies within the medical image analysis domain. However, current transformer-based models suffer from several disadvantages: (1) existing methods fail to capture the important features of the images due to the naive tokenization scheme; (2) the models suffer from information loss because they only consider single-scale feature representations; and (3) the segmentation label maps generated by the models are not accurate enough without considering rich semantic contexts and anatomical textures. In this work, we present CA-GANformer, a novel type of generative adversarial transformer, for medical image segmentation. First, we take advantage of the pyramid structure to construct multi-scale representations and handle multi-scale variations. We then design a novel class-aware transformer module to better learn the discriminative regions of objects with semantic structures. Lastly, we utilize an adversarial training strategy that boosts segmentation accuracy and correspondingly allows a transformer-based discriminator to capture high-level semantically correlated contents and low-level anatomical features. Our experiments demonstrate that CA-GANformer dramatically outperforms previous state-of-the-art transformer-based approaches on three benchmarks, obtaining 2.54%-5.88% absolute improvements in Dice over previous models. Further qualitative experiments provide a more detailed picture of the model's inner workings, shed light on the challenges in improved transparency, and demonstrate that transfer learning can greatly improve performance and reduce the size of medical image datasets in training, making CA-GANformer a strong starting point for downstream medical image analysis tasks. Code and models will be available to the public.

Translated: 2022-01-31 13:24:30, Published: 2022-01-28

# Vision Checklist: Towards Testable Error Analysis of Image Models to Help System Designers Interrogate Model Capabilities

http://arxiv.org/abs/2201.11674v2

License: CC BY 4.0

Xin Du, Benedicte Legastelois, Bhargavi Ganesh, Ajitha Rajan, Hana Chockler, Vaishak Belle, Stuart Anderson, Subramanian Ramamoorthy

Using large pre-trained models for image recognition tasks is becoming increasingly common owing to the well-acknowledged success of recent models like vision transformers and other CNN-based models like VGG and Resnet. The high accuracy of these models on benchmark tasks has translated into their practical use across many domains including safety-critical applications like autonomous driving and medical diagnostics. Despite their widespread use, image models have been shown to be fragile to changes in the operating environment, bringing their robustness into question. There is an urgent need for methods that systematically characterise and quantify the capabilities of these models to help designers understand and provide guarantees about their safety and robustness. In this paper, we propose Vision Checklist, a framework aimed at interrogating the capabilities of a model in order to produce a report that can be used by a system designer for robustness evaluations. This framework proposes a set of perturbation operations that can be applied on the underlying data to generate test samples of different types. The perturbations reflect potential changes in operating environments, and interrogate various properties ranging from the strictly quantitative to more qualitative. Our framework is evaluated on multiple datasets like Tinyimagenet, CIFAR10, CIFAR100 and Camelyon17 and for models like ViT and Resnet. Our Vision Checklist proposes a specific set of evaluations that can be integrated into the previously proposed concept of a model card. Robustness evaluations like our checklist will be crucial in future safety evaluations of visual perception modules, and be useful for a wide range of stakeholders including designers, deployers, and regulators involved in the certification of these systems. The source code of Vision Checklist will be open for public use.
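A perturbation-based check of this kind might look like the sketch below. It does not reproduce the Vision Checklist API: the brightness perturbation, the toy mean-intensity classifier, and the `invariance_rate` report metric are all hypothetical stand-ins chosen for illustration.

```python
def adjust_brightness(img, delta):
    """Perturbation (assumption): shift every pixel by delta, clipped to [0, 255]."""
    return [[min(255, max(0, p + delta)) for p in row] for row in img]


def mean_intensity_classifier(img):
    """Toy model (assumption): class 1 if the mean pixel value is at least 128."""
    pixels = [p for row in img for p in row]
    return int(sum(pixels) / len(pixels) >= 128)


def invariance_rate(model, images, perturb):
    """Fraction of images whose prediction is unchanged by the perturbation."""
    unchanged = sum(model(img) == model(perturb(img)) for img in images)
    return unchanged / len(images)


images = [
    [[10, 20], [30, 40]],        # clearly dark
    [[120, 125], [130, 126]],    # close to the decision threshold
    [[200, 210], [220, 230]],    # clearly bright
]

rate = invariance_rate(
    mean_intensity_classifier,
    images,
    lambda img: adjust_brightness(img, 20),
)
```

Only the near-threshold image changes class under the brightness shift, so the reported rate is two out of three. A real report would repeat this over many perturbation types and strengths.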

Translated: 2022-01-31 12:58:08, Published: 2022-01-28

# Model Agnostic Interpretability for Multiple Instance Learning

http://arxiv.org/abs/2201.11701v2

License: CC BY 4.0

Joseph Early, Christine Evers and Sarvapali Ramchurn

In Multiple Instance Learning (MIL), models are trained using bags of instances, where only a single label is provided for each bag. A bag label is often only determined by a handful of key instances within a bag, making it difficult to interpret what information a classifier is using to make decisions. In this work, we establish the key requirements for interpreting MIL models. We then go on to develop several model-agnostic approaches that meet these requirements. Our methods are compared against existing inherently interpretable MIL models on several datasets, and achieve an increase in interpretability accuracy of up to 30%. We also examine the ability of the methods to identify interactions between instances and scale to larger datasets, improving their applicability to real-world problems.
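One simple model-agnostic way to find key instances is to remove each instance in turn and record how much the bag prediction drops. The sketch below shows this occlusion-style idea. It is a generic illustration, not necessarily one of the paper's methods, and the max-pooling `bag_model` is a toy assumption.

```python
def bag_model(bag):
    """Toy bag classifier (assumption): the bag score is its largest instance value."""
    return max(bag) if bag else 0.0


def removal_attribution(model, bag):
    """Score each instance by the drop in bag prediction when it is removed.

    Only calls to model(bag) are needed, so any bag-level model can be plugged in.
    """
    base = model(bag)
    return [base - model(bag[:i] + bag[i + 1:]) for i in range(len(bag))]


bag = [0.1, 0.9, 0.2, 0.3]
scores = removal_attribution(bag_model, bag)
key_instance = scores.index(max(scores))
```

Removing one instance at a time cannot reveal interactions between instances (for example, two instances that each suffice for a positive label), which is one reason the abstract treats interaction detection as a separate question.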

Translated: 2022-01-31 12:43:20, Published: 2022-01-28

# DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence

http://arxiv.org/abs/2201.11176v2

License: see the linked page

Wei Zhao, Michael Strube, Steffen Eger

Recently, there has been a growing interest in designing text generation systems from a discourse coherence perspective, e.g., modeling the interdependence between sentences. Still, recent BERT-based evaluation metrics cannot recognize coherence and fail to punish incoherent elements in system outputs. In this work, we introduce DiscoScore, a parametrized discourse metric, which uses BERT to model discourse coherence from different perspectives, driven by Centering theory. Our experiments encompass 16 non-discourse and discourse metrics, including DiscoScore and popular coherence models, evaluated on summarization and document-level machine translation (MT). We find that (i) the majority of BERT-based metrics correlate much worse with human-rated coherence than early discourse metrics, invented a decade ago; (ii) the recent state-of-the-art BARTScore is weak when operated at system level -- which is particularly problematic as systems are typically compared in this manner. DiscoScore, in contrast, achieves strong system-level correlation with human ratings, not only in coherence but also in factual consistency and other aspects, and surpasses BARTScore by over 10 correlation points on average. Further, aiming to understand DiscoScore, we justify the importance of discourse coherence for evaluation metrics, and explain the superiority of one variant over another. Our code is available at https://github.com/AIPHES/DiscoScore.

Translated: 2022-01-31 12:17:30, Published: 2022-01-28

# Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition

http://arxiv.org/abs/2201.11207v2

License: see the linked page

Piotr Żelasko, Siyuan Feng, Laureano Moro Velazquez, Ali Abavisani, Saurabhchand Bhati, Odette Scharenborg, Mark Hasegawa-Johnson, Najim Dehak

The high cost of data acquisition makes Automatic Speech Recognition (ASR) model training problematic for most existing languages, including languages that do not even have a written script, or for which the phone inventories remain unknown. Past works explored multilingual training, transfer learning, as well as zero-shot learning in order to build ASR systems for these low-resource languages. While it has been shown that the pooling of resources from multiple languages is helpful, we have not yet seen a successful application of an ASR model to a language unseen during training. A crucial step in the adaptation of ASR from seen to unseen languages is the creation of the phone inventory of the unseen language. The ultimate goal of our work is to build the phone inventory of a language unseen during training in an unsupervised way without any knowledge about the language. In this paper, we 1) investigate the influence of different factors (i.e., model architecture, phonotactic model, type of speech representation) on phone recognition in an unknown language; 2) provide an analysis of which phones transfer well across languages and which do not, in order to understand the limitations of and areas for further improvement for automatic phone inventory creation; and 3) present different methods to build a phone inventory of an unseen language in an unsupervised way. To that end, we conducted mono-, multi-, and crosslingual experiments on a set of 13 phonetically diverse languages and several in-depth analyses. We found a number of universal phone tokens (IPA symbols) that are well-recognized cross-linguistically. Through a detailed analysis of results, we conclude that unique sounds, similar sounds, and tone languages remain a major challenge for phonetic inventory discovery.

Translated: 2022-01-31 12:17:02, Published: 2022-01-28

# Sparsity Regularization For Cold-Start Recommendation

http://arxiv.org/abs/2201.10711v3

License: see the linked page

Aksheshkumar Ajaykumar Shah and Hemanth Venkateswara

Recently, Generative Adversarial Networks (GANs) have been applied to the problem of Cold-Start Recommendation, but the training performance of these models is hampered by the extreme sparsity in warm user purchase behavior. In this paper, we introduce a novel representation for user-vectors by combining user demographics and user preferences, making the model a hybrid system which uses Collaborative Filtering and Content Based Recommendation. Our system models user purchase behavior using weighted user-product preferences (explicit feedback) rather than binary user-product interactions (implicit feedback). Using this, we develop a novel sparse adversarial model, SRLGAN, for Cold-Start Recommendation, leveraging the sparse user-purchase behavior, which ensures training stability and avoids over-fitting on warm users. We evaluate the SRLGAN on two popular datasets and demonstrate state-of-the-art results.

Translated: 2022-01-31 12:16:37, Published: 2022-01-28

PDF registration status (publication date: 20220128)