Fugu-MT: arXiv Paper Translations

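This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not CC-licensed, and for papers that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.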

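The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).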

This page lists papers whose publication date is 20201219.

Title	Authors	Abstract	Publication date (公開日) / Translation date (翻訳日)
# アジャイルで汎用的な量子コミュニケーション:署名と秘密 Agile and versatile quantum communication: signatures and secrets ( http://arxiv.org/abs/2001.10089v3 ) ライセンス: Link先を確認	Stefan Richter, Matthew Thornton, Imran Khan, Hannah Scott, Kevin Jaksch, Ulrich Vogl, Birgit Stiller, Gerd Leuchs, Christoph Marquardt, Natalia Korolkova	(参考訳) アジャイル暗号は、基盤となる古典暗号アルゴリズムのセキュリティが損なわれている場合に、リソース効率のよい暗号コアのスワップを可能にする。逆に、多用途暗号では、ユーザは内部動作に関する知識を必要とせずに暗号タスクを切り替えることができる。本稿では,量子デジタルシグネチャ(qds)と量子シークレット共有(qss)という2つの量子暗号プロトコルを,同じハードウェア送信機と受信機プラットフォーム上で明示的に示すことにより,これらの原理を量子暗号の分野に適用する方法を提案する。重要なことは、プロトコルは古典的な後処理でのみ異なる。また、量子鍵分布(QKD)にも適しており、標準的な2次位相シフトキー(QPSK)エンコーディングとヘテロダイン検出を使用するため、配置された通信インフラと高い互換性を持つ。初めてQDSプロトコルが変更され、受信機でのポストセレクションが可能となり、プロトコルのパフォーマンスが向上した。暗号プリミティブQDSとQSSは本質的にマルチパーティトであり、タスクの内部のプレイヤーが不正であるだけでなく、量子チャネル上の(外部の)盗聴が許されている場合にも安全であることを示す。アジャイルで汎用的な量子通信システムの最初の実証実験では、量子状態はGHz速度で分散された。これにより、QDSプロトコルを使用して、2kmのファイバーリンクで0.05ms未満、20kmのファイバーリンクで0.2~s未満で1ビットメッセージにセキュアに署名できる。我々の知る限り、これは連続可変直接QSSプロトコルの最初の実演でもある。 Agile cryptography allows for a resource-efficient swap of a cryptographic core in case the security of an underlying classical cryptographic algorithm becomes compromised. Conversely, versatile cryptography allows the user to switch the cryptographic task without requiring any knowledge of its inner workings. In this paper, we suggest how these related principles can be applied to the field of quantum cryptography by explicitly demonstrating two quantum cryptographic protocols, quantum digital signatures (QDS) and quantum secret sharing (QSS), on the same hardware sender and receiver platform. Crucially, the protocols differ only in their classical post-processing. The system is also suitable for quantum key distribution (QKD) and is highly compatible with deployed telecommunication infrastructures, since it uses standard quadrature phase shift keying (QPSK) encoding and heterodyne detection. 
For the first time, QDS protocols are modified to allow for postselection at the receiver, enhancing protocol performance. The cryptographic primitives QDS and QSS are inherently multipartite and we prove that they are secure not only when a player internal to the task is dishonest, but also when (external) eavesdropping on the quantum channel is allowed. In our first proof-of-principle demonstration of an agile and versatile quantum communication system, the quantum states were distributed at GHz rates. This allows for a one-bit message to be securely signed using our QDS protocols in less than 0.05 ms over a 2 km fiber link and in less than 0.2 s over a 20 km fiber link. To our knowledge, this also marks the first demonstration of a continuous-variable direct QSS protocol.	翻訳日:2023-06-05 11:32:31 公開日:2020-12-19
# ニューラルネットワークを用いたSU(N)フェルミオンの熱力学研究のためのヒューリスティック機械 Heuristic machinery for thermodynamic studies of SU(N) fermions with neural networks ( http://arxiv.org/abs/2006.14142v2 ) ライセンス: Link先を確認	Entong Zhao, Jeongwon Lee, Chengdong He, Zejian Ren, Elnur Hajiyev, Junwei Liu, and Gyu-Boong Jo	(参考訳) 機械学習(ML)のパワーは、前例のない感度で実験的な測定を分析する可能性を提供する。しかし、物理観測量に直接関係する微妙な効果を探索し、MLを用いて通常の実験データからその背後にある物理を理解することは依然として困難である。本稿では,機械学習解析を用いたヒューリスティック機械を提案する。量子シミュレータで作製したSU($N$)スピン対称性内で相互作用する超低温フェルミオンの密度分布における熱力学的研究の導出に機械を用いる。このようなスピン対称性は多体波動関数に現れなければならないが、最も通常の測定であるフェルミオンの運動量分布がスピン対称性の影響をどのように示すかは明らかでない。スピン多重度の検出に$\sim$94$\%$という極めて高い精度を持つ、完全に訓練された畳み込みニューラルネットワーク(NN)を用いて、その精度が、フィルタされた実験画像における様々な目立たない効果にどのように依存するかを調べる。機械に導かれて、単一の画像内の密度ゆらぎから熱力学的圧縮率を直接測定する。我々の機械学習フレームワークは、SU($N$)のフェルミ液体の理論的記述を検証し、最小の事前理解しかない非常に複雑な量子物質であっても、目立たない効果を識別する可能性を示している。 The power of machine learning (ML) provides the possibility of analyzing experimental measurements with an unprecedented sensitivity. However, it still remains challenging to probe the subtle effects directly related to physical observables and to understand the underlying physics from ordinary experimental data using ML. Here, we introduce a heuristic machinery by using machine learning analysis. We use our machinery to guide the thermodynamic studies in the density profile of ultracold fermions interacting within SU($N$) spin symmetry prepared in a quantum simulator. Although such spin symmetry should manifest itself in a many-body wavefunction, it is elusive how the momentum distribution of fermions, the most ordinary measurement, reveals the effect of spin symmetry. Using a fully trained convolutional neural network (NN) with a remarkably high accuracy of $\sim$94$\%$ for detection of the spin multiplicity, we investigate how the accuracy depends on various less-pronounced effects with filtered experimental images. Guided by our machinery, we directly measure a thermodynamic compressibility from density fluctuations within the single image. 
Our machine learning framework shows a potential to validate theoretical descriptions of SU($N$) Fermi liquids, and to identify less-pronounced effects even for highly complex quantum matter with minimal prior understanding.	翻訳日:2023-05-12 20:13:38 公開日:2020-12-19
# 動的システムの制御と自律性のための量子テレポーテーション Quantum Teleportation for Control of Dynamic Systems and Autonomy ( http://arxiv.org/abs/2007.15249v2 ) ライセンス: Link先を確認	Farbod Khoshnoud, Lucas Lamata, Clarence W. de Silva, Marco B. Quadrelli	(参考訳) 本稿では,量子テレポーテーションの古典力学系の制御と自律性への応用について述べる。量子テレポーテーションは本質的に量子現象であり、1993年にアインシュタイン-ポドルスキー-ローゼンの二重チャネルを介して未知の量子状態のテレポーテーションによって初めて導入された。本稿では,本研究で初めて,自律移動型古典的プラットフォームに量子技術を適用する可能性について考察する。まず、量子エンタングルメントと量子暗号を、制御や自律的応用のためのマクロ力学系にどのように統合するか、また量子テレポーテーションの概念を古典領域にどのように適用するかを概観する。量子テレポーテーション(quantum teleportation)では、その分極に相関する一対の光子が生成され、Alice RobotとBob Robotと呼ばれる2つの自律プラットフォームに送られる。アリスは、絡み合った光子を受け取ることに加えて、未知の状態で調製された光子と呼ばれる量子系を与えられた。アリスは、絡み合った光子と未知の状態の状態を共同で測定し、古典的チャンネルを通じてボブに情報を送る。アリス元の未知の状態は(量子非閉化現象により)絡み合った光子の状態を測定する過程で崩壊するが、ボブはユニタリ作用素を適用することでアリス状態の正確なレプリカを構築することができる。本稿では,動的システムの制御におけるハイブリッド古典量子能力の応用について,特に自律型古典システムの自律性と制御において,量子能力の導入と古典領域への優位性を促進することを目的としている。 The application of Quantum Teleportation for control of classical dynamic systems and autonomy is proposed in this paper. Quantum teleportation is an intrinsically quantum phenomenon, and was first introduced by teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels in 1993. In this paper, we consider the possibility of applying this quantum technique to autonomous mobile classical platforms for control and autonomy purposes for the first time in this research. First, a review of how Quantum Entanglement and Quantum Cryptography can be integrated into macroscopic mechanical systems for controls and autonomy applications is presented, as well as how quantum teleportation concepts may be applied to the classical domain. In quantum teleportation, an entangled pair of photons which are correlated in their polarizations are generated and sent to two autonomous platforms, which we call the Alice Robot and the Bob Robot. Alice has been given a quantum system, i.e. a photon, prepared in an unknown state, in addition to receiving an entangled photon. 
Alice measures the state of her entangled photon and her unknown state jointly and sends the information through a classical channel to Bob. Although Alice's original unknown state is collapsed in the process of measuring the state of the entangled photon (due to the quantum no-cloning theorem), Bob can construct an accurate replica of Alice's state by applying a unitary operator. This paper, and the previous investigations of the applications of hybrid classical-quantum capabilities in control of dynamical systems, aim to promote the adoption of quantum capabilities and their advantages in the classical domain, particularly for autonomy and control of autonomous classical systems.	翻訳日:2023-05-07 18:37:47 公開日:2020-12-19
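The teleportation steps described in the abstract above (shared entangled pair, Alice's joint measurement, two classical bits sent to Bob, Bob's unitary correction) can be sketched as a small state-vector simulation. This is only an illustrative sketch, not the authors' implementation: the function name `teleport`, the qubit ordering, and the use of NumPy are assumptions made here, and ideal noiseless qubits stand in for the photons.

```python
import numpy as np

# Single-qubit gates (standard definitions).
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)


def teleport(psi, rng):
    """Teleport the single-qubit state `psi` from Alice to Bob.

    Hypothetical layout: qubit 0 holds the unknown state, qubit 1 is
    Alice's half of the entangled pair, qubit 2 is Bob's half.
    Basis index = 4*q0 + 2*q1 + q2.
    """
    # Entangled pair (|00> + |11>)/sqrt(2) shared between qubits 1 and 2.
    bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    state = np.kron(psi, bell)

    # CNOT with control qubit 0 and target qubit 1, built as a permutation.
    cnot01 = np.zeros((8, 8), dtype=complex)
    for i in range(8):
        q0, q1, q2 = (i >> 2) & 1, (i >> 1) & 1, i & 1
        j = (q0 << 2) | ((q1 ^ q0) << 1) | q2
        cnot01[j, i] = 1
    state = cnot01 @ state

    # Hadamard on qubit 0 completes Alice's Bell-basis measurement circuit.
    state = np.kron(np.kron(H, I2), I2) @ state

    # Alice measures qubits 0 and 1; rows index her outcome, columns Bob's qubit.
    amps = state.reshape(4, 2)
    probs = np.sum(np.abs(amps) ** 2, axis=1)
    probs = probs / probs.sum()
    m = rng.choice(4, p=probs)
    bob = amps[m] / np.linalg.norm(amps[m])
    m0, m1 = m >> 1, m & 1  # the two classical bits sent to Bob

    # Bob's correction, conditioned on the classical bits: X^m1, then Z^m0.
    if m1:
        bob = X @ bob
    if m0:
        bob = Z @ bob
    return (m0, m1), bob
```

A possible usage: call `teleport(np.array([0.6, 0.8j]), np.random.default_rng(0))` and compare the returned state with the input via `abs(np.vdot(psi, bob)) ** 2`. Alice's copy is consumed by the measurement, so only one instance of the state exists at the end.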
# 圧縮二重層構造の散乱データと境界状態 Scattering data and bound states of a squeezed double-layer structure ( http://arxiv.org/abs/2011.11437v2 ) ライセンス: Link先を確認	Alexander V. Zolotaryuk and Yaroslav Zolotaryuk	(参考訳) 2つの平行な均質な層からなるヘテロ構造を、層の幅 $l_1$ と $l_2$ および層間距離 $r$ が同時に 0 に縮小する極限で研究する。この問題は一次元で研究され、Schrödinger方程式のスクイーズポテンシャルは層厚に依存する強さ$V_1$と$V_2$によって与えられる。関数のクラス全体の$V_1(l_1)$と$V_2(l_2)$は、$l_1$と$l_2$が0に近づくときの特定の極限特性によって指定される。有限系に対して導出される散乱データ $a(k)$ および $b(k)$ のスクイーズ限界は、系のパラメータ $V_j$, $l_j$, $j=1,2$, $r$ の条件が成立する場合にのみ存在する。これらの条件は、発散の適切な相殺の結果として現れる。この相殺の2つの方法を実行し、システムパラメータ空間内の対応する2つの共鳴集合を導出する。これらの集合の1つにおいて、非自明な境界状態の存在は、ディラックのデルタ函数の微分の形で絞られたポテンシャルの特定の例を含む、スクイージング極限において証明される。これは$\delta'$型の系には境界状態が存在しないという広く受け入れられた見解に反する。有限系内の有限個の境界状態から圧縮された系で1つの境界状態が存続するシナリオを詳細に記述する。 A heterostructure composed of two parallel homogeneous layers is studied in the limit as their widths $l_1$ and $l_2$, and the distance between them $r$ shrink to zero simultaneously. The problem is investigated in one dimension and the squeezing potential in the Schrödinger equation is given by the strengths $V_1$ and $V_2$ depending on the layer thickness. A whole class of functions $V_1(l_1)$ and $V_2(l_2)$ is specified by certain limit characteristics as $l_1$ and $l_2$ tend to zero. The squeezing limit of the scattering data $a(k)$ and $b(k)$ derived for the finite system is shown to exist only if some conditions on the system parameters $V_j$, $l_j$, $j=1,2$, and $r$ take place. These conditions appear as a result of an appropriate cancellation of divergences. Two ways of this cancellation are carried out and the corresponding two resonance sets in the system parameter space are derived. On one of these sets, the existence of non-trivial bound states is proven in the squeezing limit, including the particular example of the squeezed potential in the form of the derivative of Dirac's delta function, contrary to the widespread opinion on the non-existence of bound states in $\delta'$-like systems. The scenario in which a single bound state survives in the squeezed system, out of a finite number of bound states in the finite system, is described in detail.	翻訳日:2023-04-23 09:09:21 公開日:2020-12-19
# ADBIS、TPDL、EDA 2020合同会議参加者からのフィードバック Feedback from the participants of the ADBIS, TPDL and EDA 2020 joint conferences ( http://arxiv.org/abs/2012.01184v2 ) ライセンス: Link先を確認	Pegdwendé Sawadogo, Jérôme Darmont, Fabien Duchateau	(参考訳) 本稿では,ADBIS,TPDL,EDA 2020の合同会議をオンラインで開催する方法と,その後の参加者調査の結果について述べる。参加者のフィードバックから学んだ教訓を紹介する。 This paper presents the way the joint ADBIS, TPDL and EDA 2020 conferences were organized online and the results of the participant survey conducted thereafter. We present the lessons learned from the participants' feedback.	翻訳日:2023-04-22 20:21:20 公開日:2020-12-19
# フォトニックリザーバコンピュータにおける高性能アナログ読み出し層のオンライントレーニング Online training for high-performance analogue readout layers in photonic reservoir computers ( http://arxiv.org/abs/2012.10613v1 ) ライセンス: Link先を確認	Piotr Antonik, Marc Haelterman, Serge Massar	(参考訳) はじめに。貯水池コンピューティングは、時間に依存した信号を処理するためのバイオインスパイアされたコンピューティングパラダイムである。ハードウェア実装の性能は、一連のベンチマークタスクにおける最先端のデジタルアルゴリズムに匹敵する。これらの実装の最大のボトルネックは、低速なオフライン後処理に基づく読み出し層である。アナログソリューションはいくつか提案されているが、いずれもセットアップの複雑さが増すため、パフォーマンスが著しく低下していた。メソッド。本稿では,これらの問題を解決するためのオンライントレーニングを提案する。本手法の適用性について,アナログ読み出し層を有する実験可能な貯水池コンピュータの数値シミュレーションを用いて検討した。また,従来の手法では訓練が困難である非線形出力層も検討した。結果。オンライン学習により,アナログ層の複雑さの増大を回避し,ディジタル層と同じレベルのパフォーマンスが得られることを示す。結論。この研究は、出力層をオンライントレーニングすることで、高性能な完全アナログ貯水池コンピュータへの道を開いた。 Introduction. Reservoir Computing is a bio-inspired computing paradigm for processing time-dependent signals. The performance of its hardware implementation is comparable to state-of-the-art digital algorithms on a series of benchmark tasks. The major bottleneck of these implementations is the readout layer, based on slow offline post-processing. A few analogue solutions have been proposed, but all suffered from a noticeable decrease in performance due to the added complexity of the setup. Methods. Here we propose the use of online training to solve these issues. We study the applicability of this method using numerical simulations of an experimentally feasible reservoir computer with an analogue readout layer. We also consider a nonlinear output layer, which would be very difficult to train with traditional methods. Results. We show numerically that online learning makes it possible to circumvent the added complexity of the analogue layer and obtain the same level of performance as with a digital layer. Conclusion. This work paves the way to high-performance fully-analogue reservoir computers through the use of online training of the output layers.	翻訳日:2023-04-20 04:21:00 公開日:2020-12-19
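The idea of training a reservoir computer's readout online, one sample at a time instead of by offline post-processing, can be sketched as follows. This is a minimal sketch under stated assumptions and not the authors' algorithm: the update rule is a normalised least-mean-squares (NLMS) step chosen here for illustration, and the names `run_reservoir` and `train_readout_online`, the tanh nonlinearity, and all parameter values are hypothetical.

```python
import numpy as np


def run_reservoir(u, n_nodes=20, seed=0):
    """Drive a small random tanh reservoir with input sequence `u`.

    Hypothetical software stand-in for a physical reservoir; returns the
    matrix of reservoir states, one row per time step.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_nodes, n_nodes))
    # Rescale to spectral radius 0.9 so the dynamics fade over time.
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
    w_in = rng.uniform(-1, 1, size=n_nodes)
    x = np.zeros(n_nodes)
    states = np.empty((len(u), n_nodes))
    for t, u_t in enumerate(u):
        x = np.tanh(W @ x + w_in * u_t)
        states[t] = x
    return states


def train_readout_online(states, targets, mu=0.5, eps=1e-8):
    """Update linear readout weights after every sample (NLMS rule).

    Assumption: a plain linear readout y = w . x; the step size `mu`
    and regulariser `eps` are illustrative values.
    """
    w = np.zeros(states.shape[1])
    errors = np.empty(len(targets))
    for t, (x, d) in enumerate(zip(states, targets)):
        e = d - w @ x                     # error before the update
        w += mu * e * x / (eps + x @ x)   # normalised gradient step
        errors[t] = e
    return w, errors
```

A possible usage is to generate a random input `u`, compute `states = run_reservoir(u)`, and train the readout to reproduce a delayed copy of the input with `train_readout_online(states[1:], u[:-1])`. Because each update touches only the current sample, no state history has to be stored, which is the property that makes online schemes attractive for analogue readout layers.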
# 不確実性の算術は量子形式と相対論的時空を統一する The arithmetic of uncertainty unifies quantum formalism and relativistic spacetime ( http://arxiv.org/abs/2104.05395v1 ) ライセンス: Link先を確認	John Skilling and Kevin H. Knuth	(参考訳) 量子力学と相対性理論は、現代物理学の時代における宇宙の理解を劇的に変えた。量子論は物体を小さなスケールで確率的に扱うが、相対性理論は古典的に空間と時間の運動を扱う。ここでは、量子論と相対性理論の数学的構造が、標準算術と確率の基盤となる同じ基本的「合成とシーケンシング」対称性によって定義され、一意に制約された純粋思考から従うことを示す。鍵となるのは不確実性であり、それは必然的に量の観測に付随し、数対の使用を強制する。この対称性は、複素「$\surd\mathord-1$」算術、量子力学の標準計算、相対論的時空のローレンツ変換に直接導かれる。したがって、時間の1次元と空間の3次元は物理学の深遠で避けられない枠組みとして導出される。 The theories of quantum mechanics and relativity dramatically altered our understanding of the universe ushering in the era of modern physics. Quantum theory deals with objects probabilistically at small scales, whereas relativity deals classically with motion in space and time. We show here that the mathematical structures of quantum theory and of relativity follow together from pure thought, defined and uniquely constrained by the same elementary "combining and sequencing" symmetries that underlie standard arithmetic and probability. The key is uncertainty, which inevitably accompanies observation of quantity and imposes the use of pairs of numbers. The symmetries then lead directly to the use of complex "$\surd\mathord-1$" arithmetic, the standard calculus of quantum mechanics, and the Lorentz transformations of relativistic spacetime. One dimension of time and three dimensions of space are thus derived as the profound and inevitable framework of physics.	翻訳日:2023-04-20 04:17:37 公開日:2020-12-19
# 「オープンシステムの量子フィッシャー情報フローと非マルコフ過程」へのコメント Comment on "Quantum Fisher information flow and non-Markovian processes of open systems" ( http://arxiv.org/abs/2012.10767v1 ) ライセンス: Link先を確認	Mihaela Vatasescu	(参考訳) 著者らは[Phys. Rev. A 82, 042103 (2010)]において、「時間局所形式における非マルコフマスター方程式のクラス」について、量子フィッシャー情報(QFI)フローは異なる散逸チャネルに対応する付加的サブフローに分解できることを示した。しかし、この論文は、QFIフローの解析的分解が有効である非マルコフ時間局所マスター方程式のクラスを規定していない。ここでは、同論文の中心的な結果に到達するためには、いくつかの仮定が必要であることを示す。この結果は、密度作用素 $\rho (\theta;t)$ と量子フィッシャー情報 $\mathcal{F}(\theta;t)$ の狭いクラスに対して、かつ時間局所マスター方程式に対する厳密な条件下でのみ有効であるように思われる。より正確には、同論文で得られたQFIフローの分解は、論文中で言及されていない2つの条件の下で有効である。 (i) $\frac{d}{dt} \left( \frac{\partial \rho}{\partial \theta} \right) = \frac{\partial}{\partial \theta} \left( \frac{d \rho}{dt} \right)$ (ii) $\frac{\partial H}{\partial \theta}=0$, $\frac{\partial \gamma_i}{\partial \theta}=0$, $\frac{\partial A_i}{\partial \theta}=0$, すなわち、非マルコフ時間局所マスター方程式に現れるハミルトニアン$H(t)$、崩壊率$\gamma_i(t)$、リンドブラッド作用素$A_i(t)$は、量子フィッシャー情報が定義されるパラメータ$\theta$に依存してはならない。 In [Phys. Rev. A 82, 042103 (2010)], the authors showed that "for a class of the non-Markovian master equations in time-local forms", the quantum Fisher information (QFI) flow can be decomposed into additive subflows corresponding to different dissipative channels. However, the paper does not specify the class of non-Markovian time-local master equations for which their analytic decomposition of the QFI flow is valid. Here we show that several suppositions have to be made in order to reach the central result of Ref. [Phys. Rev. A 82, 042103 (2010)], which appears to be valid for a narrow class of density operators $\rho (\theta;t)$ and quantum Fisher information $\mathcal{F}(\theta;t)$, and under strict conditions on the time-local master equation. More precisely, the decomposition of the QFI flow obtained in Ref. [Phys. Rev. A 82, 042103 (2010)] is valid under two conditions not mentioned in the paper: (i) $\frac{d}{dt} \left( \frac{\partial \rho}{\partial \theta} \right) = \frac{\partial}{\partial \theta} \left( \frac{d \rho}{dt} \right)$; (ii) $\frac{\partial H}{\partial \theta}=0$, $\frac{\partial \gamma_i}{\partial \theta}=0$, $\frac{\partial A_i}{\partial \theta}=0$, meaning that the Hamiltonian $H(t)$, the decay rates $\gamma_i(t)$, and the Lindblad operators $A_i(t)$ appearing in the non-Markovian time-local master equation must not depend on the parameter $\theta$ with respect to which the quantum Fisher information is defined.	翻訳日:2023-04-20 04:17:17 公開日:2020-12-19
# 建築設計のためのパラメトリックシステムの構成要素としてのビズーロモーティブ複雑度 Visuo-Locomotive Complexity as a Component of Parametric Systems for Architecture Design ( http://arxiv.org/abs/2012.10710v1 ) ライセンス: Link先を確認	Vasiliki Kondyli and Mehul Bhatt and Evgenia Spyridonos	(参考訳) 大規模ビルトアップ空間を設計するための人々中心のアプローチは、ナビゲーション、ウェイフィンディング、ユーザビリティといった側面に関連する人間と環境の相互作用要因の観点から、ユーザの具体化されたビズー・ロケーティブな体験を体系的に予測する必要がある。この文脈において、我々は、ビルトアップ空間における認知性能 vis-a-vis 内部ナビゲーションの重要相関として機能する行動に基づくビゾ移動型複雑性モデルを開発する。また,提案した空間的複雑性モデルのパラメータに従って,ナビゲーション経路に沿った構造形態の同定と操作を行うパラメトリックツールとして,モデルの実装と応用を実証する。本稿では,2つの医療施設における実証研究に基づいて,動的かつインタラクティブなパラメトリック(複合性)モデルが設計プロセス全体を通して行動に基づく意思決定を促進する方法を示し,ナビゲーションやウェイフィンディング体験の一部として,所望のvisospatial complexityのレベルを維持する。 A people-centred approach for designing large-scale built-up spaces necessitates systematic anticipation of user's embodied visuo-locomotive experience from the viewpoint of human-environment interaction factors pertaining to aspects such as navigation, wayfinding, usability. In this context, we develop a behaviour-based visuo-locomotive complexity model that functions as a key correlate of cognitive performance vis-a-vis internal navigation in built-up spaces. We also demonstrate the model's implementation and application as a parametric tool for the identification and manipulation of the architectural morphology along a navigation path as per the parameters of the proposed visuospatial complexity model. We present examples based on an empirical study in two healthcare buildings, and showcase the manner in which a dynamic and interactive parametric (complexity) model can promote behaviour-based decision-making throughout the design process to maintain desired levels of visuospatial complexity as part of a navigation or wayfinding experience.	翻訳日:2023-04-20 04:16:12 公開日:2020-12-19
# 完全量子化規則法による中間子スペクトルの解析 Analytical Investigation of Meson Spectrum via Exact Quantization Rule Approach ( http://arxiv.org/abs/2012.10639v1 ) ライセンス: Link先を確認	Etido P. Inyang, Ephraim P. Inyang, Eddy S. William, Etebong E. Ibekwe and Ita O. Akpan	(参考訳) 我々は、Exact Quantization Ruleアプローチを用いて動径シュレーディンガー方程式を解析的に解き、拡張コーネルポテンシャル(ECP)を用いたエネルギー固有値を得た。本結果は、チャーモニウム($c\bar{c}$)やボトモニウム($b\bar{b}$)などの重中間子、およびボトムチャーム($b\bar{c}$)やチャームストレンジ($c\bar{s}$)などの重軽中間子の、異なる量子状態に対する質量スペクトルの計算に応用した。ポテンシャルパラメータのいくつかがゼロに設定されると、クーロンポテンシャルとコーネルポテンシャルの2つの特別なケースが検討された。現在のポテンシャルは、最大誤差0.0065GeVの実験データと、他の研究者の作業と比較して優れた結果をもたらす。 We solved the radial Schrödinger equation analytically using the Exact Quantization Rule approach to obtain the energy eigenvalues with the Extended Cornell potential (ECP). The present results are applied for calculating the mass spectra of heavy mesons such as charmonium ($c\bar{c}$) and bottomonium ($b\bar{b}$), and heavy-light mesons such as bottom-charm ($b\bar{c}$) and charm-strange ($c\bar{s}$), for different quantum states. Two special cases were considered in which some of the potential parameters were set to zero, resulting in the Coulomb potential and the Cornell potential, respectively. The present potential provides excellent results in comparison with experimental data, with a maximum error of 0.0065 GeV, and with the work of other researchers.	翻訳日:2023-04-20 04:15:30 公開日:2020-12-19
# 質量変形SYKモデルの量子カオスと熱力学における鋭い遷移 A sharp transition in quantum chaos and thermodynamics of mass deformed SYK model ( http://arxiv.org/abs/2012.10628v1 ) ライセンス: Link先を確認	Tomoki Nosaka	(参考訳) 我々は,我々の最近の研究 [arxiv:2009.10759] を概観し,2つの結合sachdev-ye-kitaev系のカオス性について考察した。局所場定式化を用いて, 大規模N限界における時間外相関器の計算により, このモデルのカオス指数は相転移温度で不連続な降下を示すことがわかった。したがって、このモデルでは、ホーキング・ページのような遷移は、ブラックホール幾何学と双対場理論におけるカオス的挙動の関係から予想されるカオス性の遷移と相関する。 We review our recent work [arXiv:2009.10759] where we studied the chaotic property of the two coupled Sachdev-Ye-Kitaev systems exhibiting a Hawking-Page like phase transition. By computing the out-of-time-ordered correlator in the large N limit by using the bilocal field formalism, we found that the chaos exponent of this model shows a discontinuous fall-off at the phase transition temperature. Hence in this model the Hawking-Page like transition is correlated with a transition in chaoticity, as expected from the relation between a black hole geometry and the chaotic behavior in the dual field theory.	翻訳日:2023-04-20 04:15:17 公開日:2020-12-19
# 出力フィードバックを持つフォトニック貯水池コンピュータを用いたランダムパターンと周波数生成 Random pattern and frequency generation using a photonic reservoir computer with output feedback ( http://arxiv.org/abs/2012.10615v1 ) ライセンス: Link先を確認	Piotr Antonik, Michiel Hermans, Marc Haelterman, Serge Massar	(参考訳) 貯水池コンピューティングは、時間に依存する信号を処理するためのバイオインスパイアされたコンピューティングパラダイムである。アナログ実装の性能は、一連のベンチマークタスクで他のデジタルアルゴリズムと一致している。それらのポテンシャルは、出力信号を貯水池に戻すことでさらに増大し、このアルゴリズムを時系列生成に適用することができる。これは原則として、リアルタイムの出力計算に十分な高速な読み出し層を実装する必要がある。ここではFPGAチップによって駆動されるデジタル出力層を用いてこれを実現する。出力フィードバックを持つ最初の光電子貯水池コンピュータを実演し、時系列生成タスクの2つの例(周波数とランダムパターン生成)でテストする。理想的な数値シミュレーションと同様、最初のタスクで非常に良い結果が得られる。しかし、後者のパフォーマンスは、実験的なノイズに悩まされている。本稿では,出力フィードバックを用いた物理貯留層コンピュータの性能に及ぼすノイズの影響について詳細に検討した。そこで,本研究はアナログ貯水池計算の新たな応用を開拓し,ノイズが出力フィードバックに与える影響について新たな知見をもたらす。 Reservoir computing is a bio-inspired computing paradigm for processing time-dependent signals. The performance of its analogue implementations matches other digital algorithms on a series of benchmark tasks. Their potential can be further increased by feeding the output signal back into the reservoir, which would allow the algorithm to be applied to time series generation. This requires, in principle, implementing a sufficiently fast readout layer for real-time output computation. Here we achieve this with a digital output layer driven by an FPGA chip. We demonstrate the first opto-electronic reservoir computer with output feedback and test it on two examples of time series generation tasks: frequency and random pattern generation. We obtain very good results on the first task, similar to idealised numerical simulations. The performance on the second one, however, suffers from the experimental noise. We illustrate this point with a detailed investigation of the consequences of noise on the performance of a physical reservoir computer with output feedback. Our work thus opens new possible applications for analogue reservoir computing and brings new insights on the impact of noise on the output feedback.	翻訳日:2023-04-20 04:14:39 公開日:2020-12-19
# InSAR位相ノイズ除去:現在の技術のレビューと今後の方向性 InSAR Phase Denoising: A Review of Current Technologies and Future Directions ( http://arxiv.org/abs/2001.00769v2 ) ライセンス: Link先を確認	Gang Xu, Yandong Gao, Jinwei Li and Mengdao Xing	(参考訳) 近年,干渉合成開口レーダ(InSAR)は情報取得の強化によってリモートセンシングにおいて強力なツールとなっている。 InSAR処理中、干渉図(インターフェログラム)の位相ノイズ除去は、トポグラフィマッピングと変形モニタリングの必須ステップである。過去30年にわたって、この話題に取り組むために多くの効果的なアルゴリズムが開発されてきた。本稿では,InSAR位相ノイズ除去法を包括的に概説し,確立されたアルゴリズムと新興アルゴリズムを4つの主要なカテゴリに分類する。最初の2つの部分は、それぞれ伝統的な局所フィルタと変換領域フィルタのカテゴリを扱う。第3部は、その優れた性能を考慮して、非局所(NL)フィルタのカテゴリーに着目する。その後、信号処理の新しい概念に基づく先進的な手法も、この分野でのポテンシャルを示すために紹介する。さらに,シミュレーションデータと測定データの両方を用いて数値実験を行い,いくつかの一般的な位相ノイズ除去法を比較した。本研究の目的は、InSAR信号処理のアーキテクチャ開発を促進することで、関係研究者に必要なガイドラインとインスピレーションを提供することである。 Nowadays, interferometric synthetic aperture radar (InSAR) has become a powerful tool in remote sensing by enhancing the information acquisition. During the InSAR processing, phase denoising of the interferogram is a mandatory step for topography mapping and deformation monitoring. Over the last three decades, a large number of effective algorithms have been developed to address this topic. In this paper, we give a comprehensive overview of InSAR phase denoising methods, classifying the established and emerging algorithms into four main categories. The first two parts refer to the categories of traditional local filters and transformed-domain filters, respectively. The third part focuses on the category of nonlocal (NL) filters, considering their outstanding performance. Later, some advanced methods based on new concepts of signal processing are also introduced to show their potential in this field. Moreover, several popular phase denoising methods are illustrated and compared by performing numerical experiments using both simulated and measured data. This paper is intended to provide necessary guidelines and inspiration to related researchers by promoting the architecture development of InSAR signal processing.	翻訳日:2023-01-14 18:04:55 公開日:2020-12-19
# パッチによるホログラフィックセンシング Patch-Based Holographic Image Sensing ( http://arxiv.org/abs/2002.03314v3 ) ライセンス: Link先を確認	Alfred Marcel Bruckstein, Martianus Frederic Ezerman, Adamas Aqsa Fahreza, and San Ling	(参考訳) データのホログラフィック表現は、データの格納されたパケットが任意の順序で利用可能になったときに、漸進的に洗練された分散ストレージを可能にする。本稿では,画像データのホログラフィックセンシングを行うパッチベース変換方式を提案する。提案手法は,データのランダムな検索順序下での進行回復に最適化されている。画像パッチのコーディングは、各検索段階で$\ell_2$ノルムの観点から、最適な画像復元を保証する分散プロジェクションの設計に依存している。パフォーマンスは、これまで検索されたデータパケットの数にのみ依存する。データパケットのサイズや数を変えながら、回復の質を高めるためのいくつかの選択肢が議論され、テストされる。これにより,いくつかの興味深いビット配置とレート歪みのトレードオフを検証し,推定された統計特性を持つ自然画像の集合について強調する。 Holographic representations of data enable distributed storage with progressive refinement when the stored packets of data are made available in any arbitrary order. In this paper, we propose and test patch-based transform coding holographic sensing of image data. Our proposal is optimized for progressive recovery under random order of retrieval of the stored data. The coding of the image patches relies on the design of distributed projections ensuring best image recovery, in terms of the $\ell_2$ norm, at each retrieval stage. The performance depends only on the number of data packets that has been retrieved thus far. Several possible options to enhance the quality of the recovery while changing the size and number of data packets are discussed and tested. This leads us to examine several interesting bit-allocation and rate-distortion trade offs, highlighted for a set of natural images with ensemble estimated statistical properties.	翻訳日:2023-01-02 15:03:25 公開日:2020-12-19
# ResiliNet: 分散ニューラルネットワークにおける障害耐性推論 ResiliNet: Failure-Resilient Inference in Distributed Neural Networks ( http://arxiv.org/abs/2002.07386v4 ) ライセンス: Link先を確認	Ashkan Yousefpour, Brian Q. Nguyen, Siddartha Devic, Guanhua Wang, Aboudy Kreidieh, Hans Lobel, Alexandre M. Bayen, Jason P. Jue	(参考訳) Federated Learningの目的は、生データを集中型サーバと共有することなく、分散ディープモデルをトレーニングすることだ。同様に、ニューラルネットワークの分散推論では、ネットワークを分割して複数の物理ノードに分散することで、アクティベーションと勾配が生データではなく物理ノード間で交換される。それでも、ニューラルネットワークを物理的ノードに分割して分散する場合、物理的ノードの障害は、それらのノードに置かれる神経ユニットの障害を引き起こし、結果としてパフォーマンスが大幅に低下する。現在のアプローチは、分散ニューラルネットワークにおけるトレーニングのレジリエンスに重点を置いている。しかし、分散ニューラルネットワークにおける推論のレジリエンスは調査されていない。 ResiliNetは、分散ニューラルネットワークにおいて物理ノード障害に耐性を持たせるためのスキームである。 ResiliNetは2つの概念を組み合わせてレジリエンスを提供する: ハイパーコネクションをスキップする、分散ニューラルネットワークのノードをスキップする、resnetの接続をスキップするのと同様のコンセプト、フェールアウトと呼ばれる新しいテクニック。 Failoutは、ドロップアウトを使用したトレーニング中の物理ノード障害条件をシミュレートし、分散ニューラルネットワークのレジリエンスを改善するように設計されている。 3つのデータセットを用いた実験およびアブレーション研究の結果、分散ニューラルネットワークに対する推論レジリエンスを提供するResiliNetの能力が確認された。 Federated Learning aims to train distributed deep models without sharing the raw data with the centralized server. Similarly, in distributed inference of neural networks, by partitioning the network and distributing it across several physical nodes, activations and gradients are exchanged between physical nodes, rather than raw data. Nevertheless, when a neural network is partitioned and distributed among physical nodes, failure of physical nodes causes the failure of the neural units that are placed on those nodes, which results in a significant performance drop. Current approaches focus on resiliency of training in distributed neural networks. However, resiliency of inference in distributed neural networks is less explored. We introduce ResiliNet, a scheme for making inference in distributed neural networks resilient to physical node failures. 
ResiliNet combines two concepts to provide resiliency: skip hyperconnection, a concept for skipping nodes in distributed neural networks similar to skip connection in resnets, and a novel technique called failout, which is introduced in this paper. Failout simulates physical node failure conditions during training using dropout, and is specifically designed to improve the resiliency of distributed neural networks. The results of the experiments and ablation studies using three datasets confirm the ability of ResiliNet to provide inference resiliency for distributed neural networks.	翻訳日:2022-12-30 19:25:47 公開日:2020-12-19
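The two ideas named in the ResiliNet abstract above, skip hyperconnections and failout, can be sketched on a toy chain of physical nodes. This is a minimal sketch under stated assumptions, not the paper's implementation: summing the skip input with the regular input, treating a failed node's output as all zeros, and the names `forward_with_failures` and `sample_failout_mask` are illustrative choices made here.

```python
import numpy as np


def forward_with_failures(x, node_fns, alive):
    """Run input `x` through a chain of physical nodes.

    Each node receives the previous node's output plus a skip
    hyperconnection from the node two steps upstream (assumed here to
    be summed). A failed node contributes an all-zero output, so the
    next node still receives data through the skip hyperconnection.
    """
    prev2 = np.zeros_like(x)  # output from two nodes back (none at the start)
    prev1 = x                 # output from the previous node (the raw input)
    for f, ok in zip(node_fns, alive):
        inp = prev1 + prev2
        out = f(inp) if ok else np.zeros_like(inp)
        prev2, prev1 = prev1, out
    return prev1


def sample_failout_mask(survival_probs, rng):
    """Failout: during training, randomly mark whole physical nodes as failed.

    Unlike ordinary dropout, which drops individual units, all units
    placed on a node fail together.
    """
    probs = np.asarray(survival_probs, dtype=float)
    return rng.random(len(probs)) < probs
```

In a training loop one would draw a fresh mask with `sample_failout_mask` for each batch and pass it as `alive`, so the downstream nodes learn to work from the skip input alone. At inference time `alive` reflects which nodes actually responded.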
# 半自律遠隔操作における優先支援戦略の学習と伝達知識 Learn and Transfer Knowledge of Preferred Assistance Strategies in Semi-autonomous Telemanipulation ( http://arxiv.org/abs/2003.03516v2 ) ライセンス: Link先を確認	Lingfeng Tao, Michael Bowman, Xu Zhou, Jiucai Zhang, Xiaoli Zhang	(参考訳) ロボットを効果的に支援するための支援を行うには、ロボットの補助動作が必ずしも人間の操作者にとって直感的ではなく、人間の行動や嗜好がロボットの解釈に不明瞭であることから、操作者の指示による遠隔操作の操作は極めて困難である。様々な最適化の観点から制御品質を改善するための様々な支援手法が開発されているが、テレマニピュレーションタスクの微妙な動作制約とオペレータの好みを満たす適切なアプローチを決定することは依然として課題である。これらの問題に対処するため、我々は新しい選好支援知識学習アプローチを開発した。補助選好モデルは、人間が好む援助を学習し、段階的モデル更新方法は、人間の選好データのあいまいさに対処しながら、学習安定性を確保する。このような嗜好認識支援知識により、遠隔操作ロボットハンドは、操作の成功に対してより活発で望ましい支援を提供することができる。また,ロボット固有の学習を避けるために,異なるロボットハンド構造にまたがる嗜好知識を伝達する知識伝達手法を開発した。 3フィンガーハンドと2フィンガーハンドをテレマニピュレートして、使用、移動、カップ上のハンドを遠隔操作する実験が行われている。その結果,ロボットはロボットの選好知識を効果的に学習し,学習労力の少ないロボット間での知識伝達を可能にした。 Enabling robots to provide effective assistance yet still accommodating the operator's commands for telemanipulation of an object is very challenging because robot's assistive action is not always intuitive for human operators and human behaviors and preferences are sometimes ambiguous for the robot to interpret. Although various assistance approaches are being developed to improve the control quality from different optimization perspectives, the problem still remains in determining the appropriate approach that satisfies the fine motion constraints for the telemanipulation task and preference of the operator. To address these problems, we developed a novel preference-aware assistance knowledge learning approach. An assistance preference model learns what assistance is preferred by a human, and a stagewise model updating method ensures the learning stability while dealing with the ambiguity of human preference data. Such a preference-aware assistance knowledge enables a teleoperated robot hand to provide more active yet preferred assistance toward manipulation success. 
We also developed knowledge transfer methods to transfer the preference knowledge across different robot hand structures to avoid extensive robot-specific training. Experiments to telemanipulate a 3-finger hand and 2-finger hand, respectively, to use, move, and hand over a cup have been conducted. Results demonstrated that the methods enabled the robots to effectively learn the preference knowledge and allowed knowledge transfer between robots with less training effort.	翻訳日:2022-12-25 20:07:39 公開日:2020-12-19
# FLOSSバージョンリリースイベントをテキストメッセージから検出することは可能か? stack overflow のケーススタディ Is it feasible to detect FLOSS version release events from textual messages? A case study on Stack Overflow ( http://arxiv.org/abs/2003.14257v3 ) ライセンス: Link先を確認	A. Sokolovsky, T. Gross, J. Bacardit	(参考訳) トピック検出と追跡(TDT)はテキストマイニング領域における非常に活発な研究課題であり、一般的にトピックやイベントを検出するニュースフィードやTwitterデータセットに適用される。イベント"の概念は広いが、通常は単一のポストやメッセージから検出できる事象に適用される。マイクロイベント(micro-events)と呼ばれるもので、その性質上、単一のテキスト情報からは検出できない。この研究は、Stack Overflow Q&AプラットフォームのメッセージのサンプルとLibraries.ioデータセットのFree/Libre Open Source Software(FLOSS)バージョンリリースを使用して、テキストデータ上でのマイクロイベント検出の実現可能性を検討する。格子探索手法を用いてパラメータを最適化した3つの異なる推定器を用いてマイクロイベントを検出するパイプラインを構築する。我々は、感情分析を伴うLDAトピックモデリングと、感情分析を伴うhSBMトピックの2つの特徴空間を考える。特徴空間は、クロスバリデーション(RFECV)戦略による再帰的特徴除去を用いて最適化される。本研究では,マイクロイベント発生前後のトピック分布や感情特性に特徴的な変化があるかどうかを考察し,マイクロイベント検出のための分析パイプラインの各バリエーションのキャパシティを徹底的に評価する。さらに, 影響事例, 分散インフレーション係数, 線形性仮定の検証, 擬似R2乗測度, 無情報率など, モデルに関する詳細な統計分析を行った。最後に,マイクロイベント検出の限界を研究するために,実世界のデータに類似した特性を持つマイクロイベント合成データセットを生成する手法を設計し,評価された各分類器のマイクロイベント検出可能性閾値を同定する。 Topic Detection and Tracking (TDT) is a very active research question within the area of text mining, generally applied to news feeds and Twitter datasets, where topics and events are detected. The notion of "event" is broad, but typically it applies to occurrences that can be detected from a single post or a message. Little attention has been drawn to what we call "micro-events", which, due to their nature, cannot be detected from a single piece of textual information. The study investigates the feasibility of micro-event detection on textual data using a sample of messages from the Stack Overflow Q&A platform and Free/Libre Open Source Software (FLOSS) version releases from Libraries.io dataset. We build pipelines for detection of micro-events using three different estimators whose parameters are optimized using a grid search approach. 
We consider two feature spaces: LDA topic modeling with sentiment analysis, and hSBM topics with sentiment analysis. The feature spaces are optimized using the recursive feature elimination with cross validation (RFECV) strategy. In our experiments we investigate whether there is a characteristic change in the topics distribution or sentiment features before or after micro-events take place and we thoroughly evaluate the capacity of each variant of our analysis pipeline to detect micro-events. Additionally, we perform a detailed statistical analysis of the models, including influential cases, variance inflation factors, validation of the linearity assumption, pseudo R squared measures and no-information rate. Finally, in order to study limits of micro-event detection, we design a method for generating micro-event synthetic datasets with similar properties to the real-world data, and use them to identify the micro-event detectability threshold for each of the evaluated classifiers.	翻訳日:2022-12-18 07:18:01 公開日:2020-12-19
# o(n)$接続は十分表現力がある:スパーストランスフォーマーの普遍近似可能性 $O(n)$ Connections are Expressive Enough: Universal Approximability of Sparse Transformers ( http://arxiv.org/abs/2006.04862v2 ) ライセンス: Link先を確認	Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar	(参考訳) 近年,多くのNLPタスクにおいてトランスフォーマーネットワークが技術状況を再定義している。しかし、これらのモデルは、各層でペアワイズ注意を計算するために入力シーケンス長$n$の2次計算コストに苦しむ。このことは、注意層内の接続を分散させるスパーストランスフォーマーの最近の研究を引き起こしている。長い列に対して経験的に有望な一方で、基本的な疑問は解決されていない。 sparsityパターンとsparsityレベルはパフォーマンスにどのように影響しますか? 本稿では,これらの問題に対処し,既存のスパースアテンションモデルをキャプチャする統一フレームワークを提供する。スパース注意モデルが任意の列列列関数を普遍的に近似できることを示す十分条件を提案する。驚くべきことに、o(n)$の接続しか持たないスパーストランスフォーマは、n^2$の接続を持つ密接なモデルと同じ関数クラスに近似できることがわかった。最後に、標準NLPタスクにおいて、異なるパターンや疎度を比較検討する。 Recently, Transformer networks have redefined the state of the art in many NLP tasks. However, these models suffer from quadratic computational cost in the input sequence length $n$ to compute pairwise attention in each layer. This has prompted recent research into sparse Transformers that sparsify the connections in the attention layers. While empirically promising for long sequences, fundamental questions remain unanswered: Can sparse Transformers approximate any arbitrary sequence-to-sequence function, similar to their dense counterparts? How does the sparsity pattern and the sparsity level affect their performance? In this paper, we address these questions and provide a unifying framework that captures existing sparse attention models. We propose sufficient conditions under which we prove that a sparse attention model can universally approximate any sequence-to-sequence function. Surprisingly, our results show that sparse Transformers with only $O(n)$ connections per attention layer can approximate the same function class as the dense model with $n^2$ connections. Lastly, we present experiments comparing different patterns/levels of sparsity on standard NLP tasks.	翻訳日:2022-11-24 00:58:50 公開日:2020-12-19
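The gap between $O(n)$ and $n^2$ connections mentioned in the abstract above can be made concrete with a minimal sketch. The pattern used here (a sliding window plus one global token) and the names `sliding_window_pattern` and `count_connections` are assumptions chosen for illustration. The sketch is not the paper's framework and does not check its sufficient conditions for universal approximation.

```python
def sliding_window_pattern(n, window=2, num_global=1):
    """Return, for each query position, the set of key positions it attends to.

    Hypothetical pattern: every token sees neighbours within `window`
    positions plus the first `num_global` tokens; those global tokens
    attend to every position.
    """
    pattern = []
    for i in range(n):
        if i < num_global:
            keys = set(range(n))  # a global token attends to all positions
        else:
            lo, hi = max(0, i - window), min(n, i + window + 1)
            keys = set(range(lo, hi)) | set(range(num_global))
        pattern.append(keys)
    return pattern


def count_connections(pattern):
    """Total number of (query, key) pairs in the attention pattern."""
    return sum(len(keys) for keys in pattern)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        sparse = count_connections(sliding_window_pattern(n))
        print(n, "sparse:", sparse, "dense:", n * n)
```

With a fixed window and a fixed number of global tokens, the sparse count grows linearly in `n`, while the dense count grows quadratically.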
# ランダムマトリクスによるランダムフーリエ特徴の解析:ガウス核の向こう側、精密相転移、および対応する2重輝線 A Random Matrix Analysis of Random Fourier Features: Beyond the Gaussian Kernel, a Precise Phase Transition, and the Corresponding Double Descent ( http://arxiv.org/abs/2006.05013v2 ) ライセンス: Link先を確認	Zhenyu Liao, Romain Couillet, Michael W. Mahoney	(参考訳) この記事では、ランダムフーリエ特徴量(rff)回帰の正確な漸近性を特徴づける。データサンプル数(n$)、次元(p$)、特徴空間の次元(n$)がすべて大きく、比較できる現実的な設定である。この状態において、ランダムな RFF 文法行列は、($N \to \infty$ 単独で行うような)よく知られた極限ガウスの核行列に収束しないが、それでも我々の解析によって捉えられる引き込み可能な振る舞いを持つ。この分析はまた、大きな$n,p,N$のトレーニングとテスト回帰エラーの正確な推定も提供する。これらの推定に基づいて、それらの間の相転移を含む2つの定性的に異なる学習相の正確なキャラクタリゼーションを提供し、この相転移挙動から対応する二重降下試験誤差曲線を導出する。これらの結果はデータ分布の強い仮定には依存せず、実世界のデータセットでの経験的な結果と完全に一致する。 This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of feature space $N$ are all large and comparable. In this regime, the random RFF Gram matrix no longer converges to the well-known limiting Gaussian kernel matrix (as it does when $N \to \infty$ alone), but it still has a tractable behavior that is captured by our analysis. This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$. Based on these estimates, a precise characterization of two qualitatively different phases of learning, including the phase transition between them, is provided; and the corresponding double descent test error curve is derived from this phase transition behavior. These results do not depend on strong assumptions on the data distribution, and they perfectly match empirical results on real-world data sets.	翻訳日:2022-11-23 13:32:37 公開日:2020-12-19
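For readers unfamiliar with random Fourier features, the following minimal sketch shows the classical regime that the abstract above contrasts with: for a fixed pair of points and many features, the RFF inner product approaches the Gaussian kernel value. The function names, the unit bandwidth, and the sample points are assumptions for illustration. The sketch does not reproduce the paper's large $n, p, N$ analysis.

```python
import math
import random


def sample_frequencies(num_features, dim, seed=0):
    """Draw frequency vectors from a standard normal distribution."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]


def rff_features(x, omegas):
    """Map x to [cos(w.x), sin(w.x)] / sqrt(N) over all sampled frequencies w."""
    scale = 1.0 / math.sqrt(len(omegas))
    proj = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in omegas]
    return [scale * math.cos(p) for p in proj] + [scale * math.sin(p) for p in proj]


def gaussian_kernel(x, y):
    """Gaussian kernel exp(-||x - y||^2 / 2), i.e. unit bandwidth."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))


def rff_kernel(x, y, omegas):
    """Inner product of the two feature maps, an estimate of the kernel."""
    fx, fy = rff_features(x, omegas), rff_features(y, omegas)
    return sum(a * b for a, b in zip(fx, fy))


if __name__ == "__main__":
    x, y = [0.3, -0.2, 0.5], [0.1, 0.4, -0.1]
    omegas = sample_frequencies(5000, dim=3)
    print("exact :", gaussian_kernel(x, y))
    print("approx:", rff_kernel(x, y, omegas))
```

The estimate improves as the number of features grows with the data held fixed; the abstract's point is that this convergence no longer holds when $n$, $p$, and $N$ are all large and comparable.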
# おそらくロバストなメトリクス学習 Provably Robust Metric Learning ( http://arxiv.org/abs/2006.07024v2 ) ライセンス: Link先を確認	Lu Wang, Xuanqing Liu, Jinfeng Yi, Yuan Jiang, Cho-Jui Hsieh	(参考訳) メトリック学習は分類と類似性探索のための重要なアルゴリズム群であるが、小さな逆摂動に対する学習指標の頑健性は研究されていない。本稿では,クリーンな精度を高めることに焦点を当てた既存のメトリック学習アルゴリズムが,ユークリッド距離よりも頑健なメトリクスを生成することができることを示す。この問題を解決するために, 対角的摂動に対して頑健なマハラノビス距離を求めるための新しい距離学習アルゴリズムを提案する。提案手法は,検証されたロバストエラーと経験的ロバストエラー(逆攻撃によるエラー)の両方を改善した。さらに、クリーンでロバストなエラーのトレードオフに直面するニューラルネットワークの防御とは異なり、従来のメトリック学習方法に比べてクリーンなエラーを犠牲にしない。私たちのコードはhttps://github.com/wangwllu/provably_robust_metric_learningで利用可能です。 Metric learning is an important family of algorithms for classification and similarity search, but the robustness of learned metrics against small adversarial perturbations is less studied. In this paper, we show that existing metric learning algorithms, which focus on boosting the clean accuracy, can result in metrics that are less robust than the Euclidean distance. To overcome this problem, we propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations, and the robustness of the resulting model is certifiable. Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors (errors under adversarial attacks). Furthermore, unlike neural network defenses which usually encounter a trade-off between clean and robust errors, our method does not sacrifice clean errors compared with previous metric learning methods. Our code is available at https://github.com/wangwllu/provably_robust_metric_learning.	翻訳日:2022-11-22 02:41:07 公開日:2020-12-19
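As a reminder of what a Mahalanobis distance is, here is a minimal sketch of the formula $d_M(x, y) = \sqrt{(x - y)^\top M (x - y)}$ for a positive semidefinite matrix $M$. The function name and the example matrices are assumptions for illustration. The robust learning algorithm and the certification procedure from the paper are not shown.

```python
import math


def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite M."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return math.sqrt(sum(d[i] * Md[i] for i in range(len(d))))


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]   # recovers the Euclidean distance
    stretched = [[4.0, 0.0], [0.0, 1.0]]  # weights the first coordinate more
    print(mahalanobis([3.0, 4.0], [0.0, 0.0], identity))
    print(mahalanobis([1.0, 0.0], [0.0, 0.0], stretched))
```

A metric that weights one direction heavily makes small perturbations along that direction count for more, which is one intuition for why a learned metric can be less robust than the Euclidean one.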
# 時空間クリギングのためのインダクティブグラフニューラルネットワーク Inductive Graph Neural Networks for Spatiotemporal Kriging ( http://arxiv.org/abs/2006.07527v2 ) ライセンス: Link先を確認	Yuankai Wu, Dingyi Zhuang, Aurelie Labbe and Lijun Sun	(参考訳) 時系列予測と時空間クリグは時空間データ解析において最も重要な2つのタスクである。グラフニューラルネットワークに関する最近の研究は、時系列予測にかなりの進歩をもたらしているが、サンプリングされていない場所やセンサーの信号を回復するkriging問題にはほとんど注意が払われていない。既存のスケーラブルなkrigingメソッド(例えば、マトリックス/テンソル補完)は、トランスダクティブであり、補間するための新しいセンサーがある場合、完全なリトレーニングが必要である。本稿では,インダクティブグラフニューラルネットワーク(IGNNK, Inductive Graph Neural Network Kriging)モデルを構築し,ネットワーク/グラフ構造上のアンサンプリングセンサのデータを復元する。距離と到達性の影響を一般化するため,サンプルとしてランダムな部分グラフを生成し,各サンプルに対して対応する隣接行列を再構成する。各サンプルサブグラフ上のすべての信号を再構成することにより、IGNNKは空間メッセージパッシング機構を効果的に学習することができる。実世界の時空間データセットにおける実験結果から,モデルの有効性が示された。さらに、学習したモデルは、目に見えないデータセット上の同じタイプのクリグタスクにうまく転送できることがわかった。結果はこう示しています 1)GNNは空間クリギングの効率的かつ効果的なツールである。 2)誘導性GNNは動的隣接行列を用いて訓練することができる。 3) トレーニングされたモデルを新しいグラフ構造に転送し、 4)IGNNKは仮想センサの生成に使用できる。 Time series forecasting and spatiotemporal kriging are the two most important tasks in spatiotemporal data analysis. Recent research on graph neural networks has made substantial progress in time series forecasting, while little attention has been paid to the kriging problem -- recovering signals for unsampled locations/sensors. Most existing scalable kriging methods (e.g., matrix/tensor completion) are transductive, and thus full retraining is required when we have a new sensor to interpolate. In this paper, we develop an Inductive Graph Neural Network Kriging (IGNNK) model to recover data for unsampled sensors on a network/graph structure. To generalize the effect of distance and reachability, we generate random subgraphs as samples and reconstruct the corresponding adjacency matrix for each sample. By reconstructing all signals on each sample subgraph, IGNNK can effectively learn the spatial message passing mechanism. Empirical results on several real-world spatiotemporal datasets demonstrate the effectiveness of our model. 
In addition, we find that the learned model can be successfully transferred to the same type of kriging task on an unseen dataset. Our results show that: 1) GNNs are an efficient and effective tool for spatial kriging; 2) inductive GNNs can be trained using dynamic adjacency matrices; 3) a trained model can be transferred to new graph structures; and 4) IGNNK can be used to generate virtual sensors.	翻訳日:2022-11-21 20:33:39 公開日:2020-12-19
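The sampling step described in the IGNNK abstract (draw a random subgraph, rebuild its adjacency matrix, and reconstruct all signals on it) can be sketched roughly as follows. The function name, the use of one scalar signal per node, and the zero-masking of hidden nodes are assumptions for illustration, not the authors' implementation.

```python
import random


def sample_training_subgraph(adjacency, signals, num_nodes, num_masked, rng):
    """Draw one training sample from a sensor graph.

    adjacency : square list-of-lists weight matrix of the full graph
    signals   : one value per node (a single time step, for simplicity)
    Returns the sub-adjacency matrix, the masked inputs, the full targets,
    and the set of masked positions that play the role of unsampled sensors.
    """
    nodes = rng.sample(range(len(adjacency)), num_nodes)
    # Rebuild the adjacency matrix restricted to the sampled nodes.
    sub_adj = [[adjacency[i][j] for j in nodes] for i in nodes]
    target = [signals[i] for i in nodes]
    masked = set(rng.sample(range(num_nodes), num_masked))
    # Hidden nodes are zeroed in the input; the model must recover them.
    inputs = [0.0 if k in masked else target[k] for k in range(num_nodes)]
    return sub_adj, inputs, target, masked


if __name__ == "__main__":
    rng = random.Random(0)
    n = 6
    adjacency = [[0.0 if i == j else 1.0 / (1 + abs(i - j)) for j in range(n)] for i in range(n)]
    signals = [10.0 + i for i in range(n)]
    sub_adj, inputs, target, masked = sample_training_subgraph(adjacency, signals, 4, 1, rng)
    print(sub_adj, inputs, target, masked)
```

Because every sample has a different node set and adjacency matrix, a model trained this way is not tied to one fixed graph, which is the inductive property the abstract emphasizes.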
# 科学的プロットのための物体検出ネットワークの系統的評価 A Systematic Evaluation of Object Detection Networks for Scientific Plots ( http://arxiv.org/abs/2007.02240v2 ) ライセンス: Link先を確認	Pritha Ganguly, Nitesh Methani, Mitesh M. Khapra and Pratyush Kumar	(参考訳) 既存の物体検出法は、自然画像に見られる物体と明らかに異なる科学的プロットのテキストや視覚要素を検出するのに適切か? この質問に答えるために、PlotQAデータセット上の様々なSOTAオブジェクト検出ネットワークの精度をトレーニングし比較する。 0.5の標準IOU設定では、ほとんどのネットワークはプロット内の比較的単純な物体を検出する場合、mAPスコアが80%以上である。しかし、パフォーマンスは0.9のより厳格なIOUで評価されると大幅に低下し、最高のモデルでmAPは35.70%となった。このような厳密な評価は、小さな局所化誤差でさえ下流の数値推論において大きな誤差をもたらす科学的なプロットを扱う際に必要である。この性能が劣ると、異なるオブジェクト検出ネットワークのアイデアを組み合わせることで、既存のモデルに小さな修正を加えることを提案する。これはパフォーマンスを大幅に改善するが、依然として2つの大きな問題がある。 (i)推論に欠かせないテキストオブジェクトのパフォーマンスは、非常に貧弱である。 (ii)プロットの単純さを考えると、推論時間は明らかに大きい。この未解決の問題を解決するために一連の貢献をします (a)ラプラシアンエッジ検出器に基づく効率的な領域提案法 (b)隣接情報を含む地域提案の特徴表示 (c)より長いテキストオブジェクトを検出するための複数の領域提案に結合するリンクコンポーネント、 (d)スムーズなL1ロスとIOUベースのロスを組み合わせたカスタムロス関数。これらのアイデアを組み合わせることで、最終モデルは、93.44%@0.9 IOUのmAPを達成する極端なIOU値において非常に正確である。同時に、我々のモデルは1段検出器を含む現在のモデルよりも16倍少ない推論時間で非常に効率的である。これらの貢献により、プロットの自動推論のさらなる探索が可能になる。 Are existing object detection methods adequate for detecting text and visual elements in scientific plots which are arguably different than the objects found in natural images? To answer this question, we train and compare the accuracy of various SOTA object detection networks on the PlotQA dataset. At the standard IOU setting of 0.5, most networks perform well with mAP scores greater than 80% in detecting the relatively simple objects in plots. However, the performance drops drastically when evaluated at a stricter IOU of 0.9 with the best model giving a mAP of 35.70%. Note that such a stricter evaluation is essential when dealing with scientific plots where even minor localisation errors can lead to large errors in downstream numerical inferences. Given this poor performance, we propose minor modifications to existing models by combining ideas from different object detection networks. 
While this significantly improves the performance, two main issues remain: (i) performance on text objects, which are essential for reasoning, is very poor, and (ii) inference time is unacceptably large considering the simplicity of plots. To solve this open problem, we make a series of contributions: (a) an efficient region proposal method based on Laplacian edge detectors, (b) a feature representation of region proposals that includes neighbouring information, (c) a linking component to join multiple region proposals for detecting longer textual objects, and (d) a custom loss function that combines a smooth L1-loss with an IOU-based loss. Combining these ideas, our final model is very accurate at extreme IOU values, achieving a mAP of 93.44%@0.9 IOU. Simultaneously, our model is very efficient, with an inference time 16x lower than that of current models, including one-stage detectors. With these contributions, we enable further exploration of the automated reasoning of plots.	翻訳日:2022-11-13 08:23:15 公開日:2020-12-19
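To illustrate why an IOU threshold of 0.9 is so much stricter than 0.5, and what a loss combining a smooth L1 term with an IOU-based term might look like, here is a minimal sketch. The `(x1, y1, x2, y2)` box format, the weighting `iou_weight`, and the additive combination are assumptions for illustration; the abstract does not specify the paper's exact loss.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss summed over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total


def combined_loss(pred, target, iou_weight=1.0):
    """Hypothetical combination: smooth L1 plus a weighted (1 - IOU) term."""
    return smooth_l1(pred, target) + iou_weight * (1.0 - iou(pred, target))


if __name__ == "__main__":
    truth = (0.0, 0.0, 100.0, 20.0)    # a wide, thin box such as an axis label
    shifted = (0.0, 2.0, 100.0, 22.0)  # the same box, 2 pixels too low
    print("IOU:", iou(truth, shifted))
    print("loss:", combined_loss(shifted, truth))
```

In this example a 2-pixel vertical shift on a 20-pixel-tall box still passes the 0.5 threshold but fails the 0.9 one, which matches the abstract's point about minor localisation errors on plot elements.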
# プロジェクション・プールを用いた野生の総合的多視点ビルディング解析 Holistic Multi-View Building Analysis in the Wild with Projection Pooling ( http://arxiv.org/abs/2008.10041v3 ) ライセンス: Link先を確認	Zbigniew Wojna, Krzysztof Maziarz, {\L}ukasz Jocz, Robert Pa{\l}uba, Robert Kozikowski, Iasonas Kokkinos	(参考訳) 建設タイプ, 床数, 屋根のピッチと形状, ファサード材, 占有階級といった, きめ細かい建築特性に関する6つの異なる分類課題に対処する。このようなリモートビル分析問題に取り組むことは、都市シーンの大規模データセットの成長によって最近初めて可能になった。この目的のために,9674棟の49426画像(トップビューとストリートビュー)からなる新しいベンチマークデータセットを導入する。これらの写真は幾何学的メタデータと共にさらに組み立てられる。データセットには、オクルージョン、ぼやけ、部分的に見える物体、広い範囲の建物など、さまざまな現実世界の課題が展示されている。本研究では,高次元空間におけるトップビューとサイドビューの統一的なトップビュー表現を作成する,新しい投影プーリング層を提案する。これにより、ビルドとイメージメタデータをシームレスに利用することができます。このレイヤの導入により、高度に調整されたベースラインモデルと比較して、分類精度が向上する。 We address six different classification tasks related to fine-grained building attributes: construction type, number of floors, pitch and geometry of the roof, facade material, and occupancy class. Tackling such a remote building analysis problem became possible only recently due to growing large-scale datasets of urban scenes. To this end, we introduce a new benchmarking dataset, consisting of 49426 images (top-view and street-view) of 9674 buildings. These photos are further assembled, together with the geometric metadata. The dataset showcases various real-world challenges, such as occlusions, blur, partially visible objects, and a broad spectrum of buildings. We propose a new projection pooling layer, creating a unified, top-view representation of the top-view and the side views in a high-dimensional space. It allows us to utilize the building and imagery metadata seamlessly. Introducing this layer improves classification accuracy -- compared to highly tuned baseline models -- indicating its suitability for building analysis.	翻訳日:2022-10-26 02:45:13 公開日:2020-12-19
# (参考訳) authnet: 時間的顔特徴運動を用いた深層学習に基づく認証機構 AuthNet: A Deep Learning based Authentication Mechanism using Temporal Facial Feature Movements ( http://arxiv.org/abs/2012.02515v2 ) ライセンス: CC BY 4.0	Mohit Raghavendra, Pravan Omprakash, B R Mukesh, Sowmya Kamath	(参考訳) 機械学習とディープラーニングに基づくバイオメトリックシステムは、スマートフォンや他の小さなコンピューティングデバイスのようなリソースに制約のある環境で認証メカニズムとして広く使われている。これらのAIを利用した顔認識メカニズムは、透明で接触のない非侵襲的な性質のため、近年大きな人気を集めている。効果は大きいが、写真やマスク、メガネなどを使って、許可されていないアクセスを得る方法もある。本稿では,顔認識と,その顔の特異な動作の両方を用いて,パスワードを発話する認証機構,すなわち時間的顔特徴動作を提案する。提案モデルは、ユーザが任意の言語でパスワードを設定できるため、言語障壁によって阻害されない。標準のMIRACL-VC1データセットで評価すると、提案モデルは98.1%の精度を達成し、有効で堅牢なシステムとしての有効性を実証した。提案手法は, 正の映像サンプル10点をトレーニングしても良好な結果が得られたため, データ効率も高い。また, ネットワークの学習能力は, 様々な複合顔認証とリップ読解モデルに対して, 提案方式のベンチマークによって実証される。 Biometric systems based on Machine learning and Deep learning are being extensively used as authentication mechanisms in resource-constrained environments like smartphones and other small computing devices. These AI-powered facial recognition mechanisms have gained enormous popularity in recent years due to their transparent, contact-less and non-invasive nature. While they are effective to a large extent, there are ways to gain unauthorized access using photographs, masks, glasses, etc. In this paper, we propose an alternative authentication mechanism that uses both facial recognition and the unique movements of that particular face while uttering a password, that is, the temporal facial feature movements. The proposed model is not inhibited by language barriers because a user can set a password in any language. When evaluated on the standard MIRACL-VC1 dataset, the proposed model achieved an accuracy of 98.1%, underscoring its effectiveness as an effective and robust system. The proposed method is also data-efficient since the model gave good results even when trained with only 10 positive video samples. 
The quality of the network's training is also demonstrated by benchmarking the proposed system against various compounded facial recognition and lip reading models.	翻訳日:2021-05-23 07:04:08 公開日:2020-12-19
# MLS:音声研究のための大規模多言語データセット MLS: A Large-Scale Multilingual Dataset for Speech Research ( http://arxiv.org/abs/2012.03411v2 ) ライセンス: Link先を確認	Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert	(参考訳) 本稿では,音声研究に適した多言語コーパスであるMLSデータセットを提案する。データセットは、LibriVoxの読み上げオーディオブックから派生したもので、英語の約44.5K時間、他の言語で約6K時間を含む8言語で構成されている。さらに,データセットのすべての言語に対して言語モデル(LM)とベースライン自動音声認識(ASR)モデルを提供する。このような大きな転写されたデータセットは、ASR と Text-To-Speech (TTS) 研究に新たな道を開くと信じている。データセットはhttp://www.openslr.orgで誰でも自由に利用できるようになる。 This paper introduces the Multilingual LibriSpeech (MLS) dataset, a large multilingual corpus suitable for speech research. The dataset is derived from read audiobooks from LibriVox and consists of 8 languages, including about 44.5K hours of English and a total of about 6K hours for other languages. Additionally, we provide Language Models (LM) and baseline Automatic Speech Recognition (ASR) models for all the languages in our dataset. We believe such a large transcribed dataset will open new avenues in ASR and Text-To-Speech (TTS) research. The dataset will be made freely available for anyone at http://www.openslr.org.	翻訳日:2021-05-16 21:02:42 公開日:2020-12-19
# ニューラルネットワーク分類器の最終層および最後層における四面体対称性の出現について On the emergence of tetrahedral symmetry in the final and penultimate layers of neural network classifiers ( http://arxiv.org/abs/2012.05420v2 ) ライセンス: Link先を確認	Weinan E and Stephan Wojtowytsch	(参考訳) 最近の数値的な研究により、ニューラルネットワーク分類器は有極層において大きな対称性を持つことがわかった。すなわち、$h(x) = af(x) +b$ ここで$a$が線型写像であり、$f$がネットワークのペナルティマイト層の出力である場合(活性化後)、すべてのデータポイント $x_{i, 1}, \dots, x_{i, n_i}$ はクラス $c_i$ で 1 つの点 $y_i$ にマッピングされ、ポイント $y_i$ は高次元ユークリッド空間における通常の $k-1$ 次元四面体の頂点に位置する。本研究は,高表現性深層ニューラルネットワークの玩具モデルで解析的に説明する。補完的な例では、$h$ が浅いネットワークであれば $c_i$ クラスからのデータサンプルよりも、$h$ の最終的な出力が均一でないことを厳密に示します(あるいは、より深い層がデータサンプルを便利な幾何学的構成にしない場合)。 A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) +b$ where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i, 1}, \dots, x_{i, N_i}$ in a class $C_i$ are mapped to a single point $y_i$ by $f$ and the points $y_i$ are located at the vertices of a regular $k-1$-dimensional tetrahedron in a high-dimensional Euclidean space. We explain this observation analytically in toy models for highly expressive deep neural networks. In complementary examples, we demonstrate rigorously that even the final output of the classifier $h$ is not uniform over data samples from a class $C_i$ if $h$ is a shallow network (or if the deeper layers do not bring the data samples into a convenient geometric configuration).	翻訳日:2021-05-15 06:35:44 公開日:2020-12-19
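The geometric configuration described above, class representatives sitting at the vertices of a regular simplex (the "tetrahedron" of the title), can be checked numerically with a minimal sketch. The construction used here (standard basis vectors with their mean subtracted) and the function names are assumptions for illustration; this is one standard way to build such vertices, not the paper's derivation.

```python
import math


def simplex_vertices(k):
    """Vertices of a regular simplex with k vertices, centred at the origin.

    Vertex i is the i-th standard basis vector of R^k minus the mean of all
    basis vectors, so the k points span a (k-1)-dimensional subspace.
    """
    return [[(1.0 if i == j else 0.0) - 1.0 / k for j in range(k)] for i in range(k)]


def distance(u, v):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    vertices = simplex_vertices(4)  # four classes: an ordinary tetrahedron
    for i in range(4):
        for j in range(i + 1, 4):
            print(i, j, distance(vertices[i], vertices[j]))
```

All pairwise distances are equal, and the vertices sum to zero, which is the symmetry the abstract refers to.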
# 可視領域のセグメンテーションと形状を考慮したアモーダルセグメンテーション Amodal Segmentation Based on Visible Region Segmentation and Shape Prior ( http://arxiv.org/abs/2012.05598v2 ) ライセンス: Link先を確認	Yuting Xiao, Yanyu Xu, Ziming Zhong, Weixin Luo, Jiawei Li, Shenghua Gao	(参考訳) 既存のアモダルセグメンテーション手法のほとんど全ては、画像全体に対応する特徴を用いてオクルード領域の推論を行う。これは人間のアモーダル知覚に反し、人間の目に見える部分と、対象の事前の知識を使って、隠された領域を推測する。人間の振る舞いを模倣し,学習の曖昧さを解決するために,まず,粗い目に見えるマスクと粗いアモーダルマスクを推定する枠組みを提案する。そして、粗い予測に基づいて、我々のモデルは、可視領域に集中し、メモリに先行する形状を利用してアモーダルマスクを推定する。これにより、アモーダルマスク推定において、背景と閉塞に対応する特徴を抑えることができる。その結果、アモダルマスクは、オクルージョンが同じ可視領域に与えられるものの影響を受けない。以前の形状の活用により、アモーダルマスクの推定はより堅牢で合理的になる。提案モデルは3つのデータセットで評価される。実験の結果,提案手法は既存手法よりも優れていた。形状の可視化は、コードブックのカテゴリ固有の特徴がある程度解釈可能であることを示している。 Almost all existing amodal segmentation methods infer occluded regions by using features corresponding to the whole image. This runs counter to human amodal perception, in which a person uses the visible part and prior knowledge of the target's shape to infer the occluded region. To mimic human behavior and resolve the ambiguity in learning, we propose a framework that first estimates a coarse visible mask and a coarse amodal mask. Then, based on the coarse prediction, our model infers the amodal mask by concentrating on the visible region and utilizing the shape prior in the memory. In this way, features corresponding to background and occlusion can be suppressed for amodal mask estimation. Consequently, given the same visible region, the amodal mask is not affected by what the occluder is. Leveraging the shape prior makes the amodal mask estimation more robust and reasonable. Our proposed model is evaluated on three datasets. Experiments show that our proposed model outperforms existing state-of-the-art methods. The visualization of the shape prior indicates that the category-specific features in the codebook have a certain interpretability.	翻訳日:2021-05-15 06:22:56 公開日:2020-12-19
# OpenHoldem: 大規模不完全な情報ゲーム研究のためのオープンツールキット OpenHoldem: An Open Toolkit for Large-Scale Imperfect-Information Game Research ( http://arxiv.org/abs/2012.06168v2 ) ライセンス: Link先を確認	Kai Li, Hang Xu, Meng Zhang, Enmin Zhao, Zhe Wu, Junliang Xing, Kaiqi Huang	(参考訳) 少数の研究所による未許可の努力に則って、大規模不完全情報ゲーム研究の主要な試験場であるNo-Limit Texas Hold'em (NLTH)における超人的AIの設計において、近年大きな進歩が見られた。しかし、既存の手法と比較するための標準ベンチマークがないため、新しい研究者がこの問題を研究することは依然として困難であり、この研究領域のさらなる発展を著しく妨げている。本研究では,NLTHを用いた大規模不完全情報ゲーム研究用統合ツールキットOpenHoldemを提案する。 1)異なるnlth aisを徹底的に評価するための標準化された評価プロトコル、2)nlth aiの3つの公に利用可能な強力なベースライン、3)公開nlth ai評価のための使いやすいapiを備えたオンラインテスティングプラットフォーム。我々はopenholdemをhttp://holdem.ia.ac.cn/でリリースし、この分野における未解決の理論的および計算的問題に関するさらなる研究を促進し、敵モデリング、大規模平衡探索、人間-コンピュータ対話学習といった重要な研究課題を育むことを願っている。 Owning to the unremitting efforts by a few institutes, significant progress has recently been made in designing superhuman AIs in No-limit Texas Hold'em (NLTH), the primary testbed for large-scale imperfect-information game research. However, it remains challenging for new researchers to study this problem since there are no standard benchmarks for comparing with existing methods, which seriously hinders further developments in this research area. In this work, we present OpenHoldem, an integrated toolkit for large-scale imperfect-information game research using NLTH. OpenHoldem makes three main contributions to this research direction: 1) a standardized evaluation protocol for thoroughly evaluating different NLTH AIs, 2) three publicly available strong baselines for NLTH AI, and 3) an online testing platform with easy-to-use APIs for public NLTH AI evaluation. We have released OpenHoldem at http://holdem.ia.ac.cn/, hoping it facilitates further studies on the unsolved theoretical and computational issues in this area and cultivate crucial research problems like opponent modeling, large-scale equilibrium-finding, and human-computer interactive learning.	翻訳日:2021-05-11 03:07:24 公開日:2020-12-19
# (参考訳) on-device full neural end-to-end automatic speech recognition algorithmのレビュー A review of on-device fully neural end-to-end automatic speech recognition algorithms ( http://arxiv.org/abs/2012.07974v2 ) ライセンス: CC BY 4.0	Chanwoo Kim, Dhananjaya Gowda, Dongsoo Lee, Jiyeon Kim, Ankur Kumar, Sungsoo Kim, Abhinav Garg, and Changwoo Han	(参考訳) 本稿では,デバイス上での音声認識アルゴリズムとその最適化手法について述べる。従来の音声認識システムは、音響モデル、言語モデル、発音モデル、テキスト正規化器、逆テキスト正規化器、重み付き有限状態変換器(WFST)に基づくデコーダなど、多数の独立したコンポーネントで構成されている。従来の音声認識システムで十分高い音声認識精度を得るには、通常、非常に大きな言語モデル(最大100GB)が必要である。したがって、対応するWFSTサイズは巨大になり、デバイス上での実装が禁止される。近年,完全ニューラルネットワークのエンドツーエンド音声認識アルゴリズムが提案されている。例えば、コネクショニスト時間分類(CTC)に基づく音声認識システム、リカレントニューラルネットワークトランスデューサ(RNN-T)、アテンションベースエンコーダ-デコーダモデル(AED)、モノトニックチャンク-ワイドアテンション(MoChA)、トランスフォーマーベース音声認識システムなどである。これらのニューラルネットワークベースのシステムでは、従来のアルゴリズムに比べてメモリフットプリントがはるかに小さいため、デバイス上での実装が実現可能になっている。本稿では,このようなエンドツーエンド音声認識モデルについてレビューする。従来のアルゴリズムと比較して,それらの構造,性能,利点を広く論じる。 In this paper, we review various end-to-end automatic speech recognition algorithms and their optimization techniques for on-device applications. Conventional speech recognition systems comprise a large number of discrete components such as an acoustic model, a language model, a pronunciation model, a text-normalizer, an inverse-text normalizer, a decoder based on a Weighted Finite State Transducer (WFST), and so on. To obtain sufficiently high speech recognition accuracy with such conventional speech recognition systems, a very large language model (up to 100 GB) is usually needed. Hence, the corresponding WFST size becomes enormous, which prohibits their on-device implementation. Recently, fully neural network end-to-end speech recognition algorithms have been proposed. Examples include speech recognition systems based on Connectionist Temporal Classification (CTC), Recurrent Neural Network Transducer (RNN-T), Attention-based Encoder-Decoder models (AED), Monotonic Chunk-wise Attention (MoChA), transformer-based speech recognition systems, and so on. 
These fully neural network-based systems have much smaller memory footprints than conventional algorithms, so their on-device implementation has become feasible. In this paper, we review such end-to-end speech recognition models. We extensively discuss their structures, performance, and advantages compared to conventional algorithms.	翻訳日:2021-05-08 17:20:34 公開日:2020-12-19
# (参考訳) LSTMによる高効率建築エネルギー管理に向けた空間占有予測 LSTM-based Space Occupancy Prediction towards Efficient Building Energy Management ( http://arxiv.org/abs/2012.08114v2 ) ライセンス: CC BY 4.0	Juye Kim	(参考訳) 建物で消費されるエネルギーは、総エネルギー使用量のかなりの部分を占める。大量の建築エネルギーは、暖房、冷却、換気、空調(HVAC)に使用される。しかし、その重要性に比較して、近年のエネルギー管理システムの構築は、単純なルールベース制御(RBC)技術に基づくHVACの制御に限られている。空調を効率的に管理できるシステムを設計する能力は、エネルギー使用量と温室効果ガス排出量を減らすことができる。本稿では,LSTMを用いた占領パターンの時系列予測モデルを提案する。 HVACの動作には、次の時間帯(例えば、次の30分)における将来の部屋占有状況の予測信号を直接使用することができる。例えば、予知と冷却または加熱の時間を考慮すると、部屋が使用される前にHVACをオンにすることができる(例えば、10分前にオンにする)。また、次の部屋の空いた予測タイミングに基づき、HVACを早期にオフにすることができ、快適さを損なわずにHVACの効率を高めるのに役立つ。大学ビルの複数の部屋から収集した実世界のエネルギーデータを用いて,本手法の能力を示す。 LSTMの部屋占有予測に基づくHVAC制御は,従来のRBC制御と比較してエネルギー使用量を50%削減できることを示した。 Energy consumed in buildings takes significant portions of the total global energy usage. A large amount of building energy is used for heating, cooling, ventilation, and air-conditioning (HVAC). However, compared to its importance, building energy management systems nowadays are limited in controlling HVAC based on simple rule-based control (RBC) technologies. The ability to design systems that can efficiently manage HVAC can reduce energy usage and greenhouse gas emissions, and, all in all, it can help us to mitigate climate change. This paper proposes predictive time-series models of occupancy patterns using LSTM. Prediction signal for future room occupancy status on the next time span (e.g., next 30 minutes) can be directly used to operate HVAC. For example, based on the prediction and considering the time for cooling or heating, HVAC can be turned on before the room is being used (e.g., turn on 10 minutes earlier). Also, based on the next room empty prediction timing, HVAC can be turned off earlier, and it can help us increase the efficiency of HVAC while not decreasing comfort. We demonstrate our approach's capabilities using real-world energy data collected from multiple rooms of a university building. 
We show that HVAC control based on LSTM room-occupancy prediction could reduce energy usage by 50% compared to conventional RBC-based control.	翻訳日:2021-05-08 11:45:00 公開日:2020-12-19
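The control idea in the abstract above (switch HVAC on shortly before predicted occupancy and off shortly before the room is predicted to empty) can be sketched as a simple scheduling rule. The function name, the slot-based time grid, and the parameters `lead_slots` and `early_off_slots` are assumptions for illustration; the occupancy predictions would come from a model such as the LSTM, which is not shown here.

```python
def hvac_schedule(predicted_occupancy, lead_slots=1, early_off_slots=0):
    """Turn predicted occupancy per time slot into an HVAC on/off plan.

    HVAC is on in slot t if the room is predicted to be occupied in any slot
    from t + early_off_slots to t + lead_slots (inclusive). A positive
    lead_slots pre-conditions the room; a positive early_off_slots lets the
    system switch off before the room is predicted to empty.
    """
    n = len(predicted_occupancy)
    schedule = []
    for t in range(n):
        start = min(n, t + early_off_slots)
        stop = min(n, t + lead_slots + 1)
        schedule.append(any(predicted_occupancy[start:stop]))
    return schedule


if __name__ == "__main__":
    # One entry per slot (for example 10 minutes each); 1 means occupied.
    occupancy = [0, 0, 1, 1, 0, 0]
    print(hvac_schedule(occupancy, lead_slots=1))
    print(hvac_schedule(occupancy, lead_slots=1, early_off_slots=1))
```

A rule-based controller that follows a fixed timetable would keep HVAC running regardless of the prediction; the saving reported in the abstract comes from replacing that timetable with predicted occupancy.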
# 限定的なコミュニケーション下でのマルチエージェントコラボレーションによる分散オンラインメタラーニングの高速化 Accelerating Distributed Online Meta-Learning via Multi-Agent Collaboration under Limited Communication ( http://arxiv.org/abs/2012.08660v2 ) ライセンス: Link先を確認	Sen Lin, Mehmet Dedeoglu and Junshan Zhang	(参考訳) IoTエコシステムにおけるエッジインテリジェンスの実現を可能にする技術として,オンラインメタ学習が登場している。それでも、タスク内高速適応のための優れたメタモデルを学ぶには、単一のエージェントだけで多くのタスクを学習する必要がある。マルチエージェントネットワークにおいて、異なるエージェント間の学習タスクがモデル類似性を共有することが多いことを観察するため、我々は、以下の根本的な疑問に答える:「限られたコミュニケーションと、どの程度の利益が達成できるかどうかによって、エージェント間のオンラインメタラーニングを加速することは可能か? そこで本研究では,マルチエージェントオンラインメタラーニングフレームワークを提案し,それと同等の2レベルネスト型オンライン凸最適化(oco)問題として位置づける。エージェントタスク平均的後悔の上限を特徴づけることで、マルチエージェントオンラインメタ学習の性能は、限られた通信によるメタモデル更新において、エージェントが分散ネットワークレベルのOCOからどれだけ恩恵を受けられるかに大きく依存することを示したが、よく理解されていない。この課題に取り組むために、我々は分散オンライン勾配降下アルゴリズムを考案し、各エージェントが1イテレーションあたり1回の通信ステップだけを使用してグローバル勾配を追跡し、その結果、エージェントあたりの平均後悔額$o(\sqrt{t/n})$が、最適なシングルエージェントの後悔額$o(\sqrt{t})$が、t$イテレーションの後に$n$がエージェント数であることを示す。この急激な性能向上を基盤として,マルチエージェントのオンラインメタ学習アルゴリズムを開発し,単一エージェントのオンラインメタ学習と比較して,O(1/\sqrt{NT})$の速さで最適なタスク平均後悔を達成可能であることを示す。広範な実験は理論結果を裏付ける。 Online meta-learning is emerging as an enabling technique for achieving edge intelligence in the IoT ecosystem. Nevertheless, to learn a good meta-model for within-task fast adaptation, a single agent alone has to learn over many tasks, and this is the so-called 'cold-start' problem. Observing that in a multi-agent network the learning tasks across different agents often share some model similarity, we ask the following fundamental question: "Is it possible to accelerate the online meta-learning across agents via limited communication and if yes how much benefit can be achieved? " To answer this question, we propose a multi-agent online meta-learning framework and cast it as an equivalent two-level nested online convex optimization (OCO) problem. 
By characterizing the upper bound of the agent-task-averaged regret, we show that the performance of multi-agent online meta-learning depends heavily on how much an agent can benefit from the distributed network-level OCO for meta-model updates via limited communication, which, however, is not well understood. To tackle this challenge, we devise a distributed online gradient descent algorithm with gradient tracking where each agent tracks the global gradient using only one communication step with its neighbors per iteration, which results in an average regret of $O(\sqrt{T/N})$ per agent, indicating a speedup by a factor of $\sqrt{1/N}$ over the optimal single-agent regret $O(\sqrt{T})$ after $T$ iterations, where $N$ is the number of agents. Building on this sharp performance speedup, we next develop a multi-agent online meta-learning algorithm and show that it can achieve the optimal task-average regret at a faster rate of $O(1/\sqrt{NT})$ via limited communication, compared to single-agent online meta-learning. Extensive experiments corroborate the theoretical results.	翻訳日:2021-05-07 05:05:02 公開日:2020-12-19
# 高速かつ連続的なエッジ学習のための非接触ADMMに基づくフェデレーションメタラーニング Inexact-ADMM Based Federated Meta-Learning for Fast and Continual Edge Learning ( http://arxiv.org/abs/2012.08677v2 ) ライセンス: Link先を確認	Sheng Yue, Ju Ren, Jiang Xin, Sen Lin, Junshan Zhang	(参考訳) 多くのIoTアプリケーションのパフォーマンス、安全性、レイテンシの要件を満たすために、インテリジェントな決定をここでネットワークエッジで行う必要があります。しかし、制約のあるリソースと限られたローカルデータ量は、エッジAIの開発に重大な課題をもたらす。これらの課題を克服するため,我々は,先行課題からの知識伝達を活用できる連続エッジ学習について検討する。高速かつ連続的なエッジ学習の実現を目的として,エッジノードが協調してメタモデルを学習する,プラットフォーム支援型統合メタ学習アーキテクチャを提案する。エッジ学習問題を正規化最適化問題として、従来のタスクから得られた貴重な知識を正規化として抽出する。次に,admmベースのフェデレーションメタラーニングアルゴリズムであるadmm-fedmetaを考案する。admmは元の問題を多数のサブ問題に分解する自然なメカニズムを提供し,エッジノードとプラットフォーム間で並列に解くことができる。さらに、線形近似とヘッセン推定によって部分問題が解かれるような inexact-admm 法の変種を用い、1ラウンドあたりの計算コストを$\mathcal{o}(n)$ に削減する。一般の非凸の場合において,ADMM-FedMetaの収束特性,迅速な適応性能,事前知識伝達の忘れ効果を総合的に分析する。大規模な実験ではADMM-FedMetaの有効性と効率が示され、既存のベースラインを大きく上回っている。 In order to meet the requirements for performance, safety, and latency in many IoT applications, intelligent decisions must be made right here right now at the network edge. However, the constrained resources and limited local data amount pose significant challenges to the development of edge AI. To overcome these challenges, we explore continual edge learning capable of leveraging the knowledge transfer from previous tasks. Aiming to achieve fast and continual edge learning, we propose a platform-aided federated meta-learning architecture where edge nodes collaboratively learn a meta-model, aided by the knowledge transfer from prior tasks. The edge learning problem is cast as a regularized optimization problem, where the valuable knowledge learned from previous tasks is extracted as regularization. Then, we devise an ADMM based federated meta-learning algorithm, namely ADMM-FedMeta, where ADMM offers a natural mechanism to decompose the original problem into many subproblems which can be solved in parallel across edge nodes and the platform. 
Further, a variant of inexact-ADMM method is employed where the subproblems are `solved' via linear approximation as well as Hessian estimation to reduce the computational cost per round to $\mathcal{O}(n)$. We provide a comprehensive analysis of ADMM-FedMeta, in terms of the convergence properties, the rapid adaptation performance, and the forgetting effect of prior knowledge transfer, for the general non-convex case. Extensive experimental studies demonstrate the effectiveness and efficiency of ADMM-FedMeta, and showcase that it substantially outperforms the existing baselines.	翻訳日:2021-05-03 02:48:57 公開日:2020-12-19
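The way ADMM splits one problem into subproblems solved in parallel on edge nodes, with a platform step that aggregates them, can be illustrated with a minimal consensus-ADMM sketch on a toy problem. This is plain (exact) consensus ADMM on scalar quadratics, not the inexact ADMM-FedMeta algorithm of the paper: there is no meta-learning, no linear approximation, and no Hessian estimation. The function name, the toy objective, and the parameter `rho` are assumptions for illustration.

```python
def consensus_admm(local_targets, rho=1.0, iterations=100):
    """Minimise sum_i 0.5 * (x - a_i)^2 by consensus ADMM.

    Each node i keeps a local copy x_i and a scaled dual variable u_i;
    z is the shared (platform-side) variable all copies must agree with.
    The minimiser of the toy objective is the mean of the targets a_i.
    """
    n = len(local_targets)
    x = [0.0] * n
    u = [0.0] * n
    z = 0.0
    for _ in range(iterations):
        # Node step: each node solves its own small subproblem in closed form.
        x = [(a + rho * (z - u_i)) / (1.0 + rho) for a, u_i in zip(local_targets, u)]
        # Platform step: average the local results to update the shared variable.
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n
        # Dual step: each node records its remaining disagreement with z.
        u = [u_i + x_i - z for x_i, u_i in zip(x, u)]
    return z, x


if __name__ == "__main__":
    z, x = consensus_admm([1.0, 2.0, 6.0])
    print("consensus value:", z)
    print("local copies   :", x)
```

The node step uses only local data and the current shared variable, which is why it can run in parallel across nodes; the paper's inexact variant replaces the exact node solve with a cheaper approximation.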
# (参考訳) 臨床領域適応による胸部X線写真におけるコンピュータ支援異常検出 Computer-aided abnormality detection in chest radiographs in a clinical setting via domain-adaptation ( http://arxiv.org/abs/2012.10564v1 ) ライセンス: CC BY 4.0	Abhishek K Dubey, Michael T Young, Christopher Stanley, Dalton Lunga, Jacob Hinkle	(参考訳) 深層学習(DL)モデルは、放射線医が胸部X線写真から肺疾患の診断を助けるために医療センターに配備されている。このようなモデルは、しばしば多くの公開ラベル付きラジオグラフィーで訓練される。これらの訓練済みDLモデルが臨床現場で一般化する能力は、公開と非公開のラジオグラフィー間のデータ分布の変化のため、貧弱である。胸部X線写真では、分布の不均一性はX線装置の様々な条件と画像の生成に使用される構成から生じる。機械学習のコミュニティでは、データ生成ソースの多様性によって生じる課題はドメインシフトと呼ばれ、これは生成モデルのモードシフトである。本研究では,ドメインシフト検出と除去手法を導入し,この問題を克服する。臨床における胸部x線画像の異常検出のための事前訓練したdlモデルの導入における提案手法の有効性について検討した。 Deep learning (DL) models are being deployed at medical centers to aid radiologists for diagnosis of lung conditions from chest radiographs. Such models are often trained on a large volume of publicly available labeled radiographs. These pre-trained DL models' ability to generalize in clinical settings is poor because of the changes in data distributions between publicly available and privately held radiographs. In chest radiographs, the heterogeneity in distributions arises from the diverse conditions in X-ray equipment and their configurations used for generating the images. In the machine learning community, the challenges posed by the heterogeneity in the data generation source is known as domain shift, which is a mode shift in the generative model. In this work, we introduce a domain-shift detection and removal method to overcome this problem. Our experimental results show the proposed method's effectiveness in deploying a pre-trained DL model for abnormality detection in chest radiographs in a clinical setting.	翻訳日:2021-05-01 17:09:08 公開日:2020-12-19
# (参考訳) 多要素ベイズニューラルネットワーク:アルゴリズムとその応用 Multi-fidelity Bayesian Neural Networks: Algorithms and Applications ( http://arxiv.org/abs/2012.13294v1 ) ライセンス: CC BY 4.0	Xuhui Meng, Hessam Babaee, and George Em Karniadakis	(参考訳) 本稿では,可変忠実性のノイズデータを用いて学習可能なベイズ型ニューラルネットワーク(bnns)の新たなクラスを提案し,関数近似の学習や偏微分方程式(pdes)に基づく逆問題を解く。 BNNは3つのニューラルネットワークで構成されている: 1つは完全連結ニューラルネットワークで、これは低忠実度データに適合する最大アプテリ確率(MAP)法に従って訓練され、2つ目は低忠実度データと高忠実度データの間の不確実性定量化による相互相関を捉えるために使用されるベイズニューラルネットワーク、そしてもう1つは物理情報処理で記述された物理法則を符号化するニューラルネットワークである。最後の2つのニューラルネットワークのトレーニングのために、ハミルトニアンモンテカルロ法を用いて、対応するハイパーパラメータの後方分布を正確に推定する。本稿では合成データと実測値を用いて,本手法の精度を示す。具体的には、まず1次元と4次元の関数を近似し、1次元と2次元の拡散反応系の反応速度を推定する。さらに,マサチューセッツ州およびケープコッド湾の海面温度 (sst) を衛星画像とその場測定を用いて推定した。その結果,本手法は低次・高次データ間の線形および非線形の相関関係を適応的に捉え,未知パラメータをPDEで同定し,ノイズの多い高忠実度データから予測の不確かさを定量化できることを示した。最後に,特定の一次元関数近似と逆PDE問題を用いて,不確かさを効果的かつ効率的に低減し,能動的学習手法による予測精度を向上させることを実証した。 We propose a new class of Bayesian neural networks (BNNs) that can be trained using noisy data of variable fidelity, and we apply them to learn function approximations as well as to solve inverse problems based on partial differential equations (PDEs). These multi-fidelity BNNs consist of three neural networks: The first is a fully connected neural network, which is trained following the maximum a posteriori probability (MAP) method to fit the low-fidelity data; the second is a Bayesian neural network employed to capture the cross-correlation with uncertainty quantification between the low- and high-fidelity data; and the last one is the physics-informed neural network, which encodes the physical laws described by PDEs. For the training of the last two neural networks, we use the Hamiltonian Monte Carlo method to estimate accurately the posterior distributions for the corresponding hyperparameters. We demonstrate the accuracy of the present method using synthetic data as well as real measurements. 
Specifically, we first approximate a one- and four-dimensional function, and then infer the reaction rates in one- and two-dimensional diffusion-reaction systems. Moreover, we infer the sea surface temperature (SST) in the Massachusetts and Cape Cod Bays using satellite images and in-situ measurements. Taken together, our results demonstrate that the present method can capture both linear and nonlinear correlation between the low- and high-fidelity data adaptively, identify unknown parameters in PDEs, and quantify uncertainties in predictions, given a few scattered noisy high-fidelity data. Finally, we demonstrate that we can effectively and efficiently reduce the uncertainties and hence enhance the prediction accuracy with an active learning approach, using as examples a specific one-dimensional function approximation and an inverse PDE problem.	翻訳日:2021-05-01 16:56:07 公開日:2020-12-19
# (参考訳) T-GAP: 時間的知識グラフ補完のための歩行学習 T-GAP: Learning to Walk across Time for Temporal Knowledge Graph Completion ( http://arxiv.org/abs/2012.10595v1 ) ライセンス: CC BY 4.0	Jaehun Jung, Jinhong Jung, U Kang	(参考訳) 時間的知識グラフ(TKG)は、静的知識グラフとは対照的に、本質的に現実世界の知識の過渡的な性質を反映している。自然に、自動tkg補完はリレーショナル推論のより現実的なモデリングのために多くの研究の関心を集めている。しかし、既存のTKGコンプリート用モジュールのほとんどは、TKG構造を完全に活用しない静的KG埋め込みを拡張しており、1)クエリの局所近傍にすでに存在する時間的関連イベントのアカウント化、2)マルチホップ推論とより良い解釈性を促進するパスベースの推論を欠いている。本稿では,そのエンコーダとデコーダにおける時間情報とグラフ構造の両方を最大限に活用するTKG補完の新しいモデルであるT-GAPを提案する。 T-GAPは、各イベントとクエリタイムスタンプ間の時間的変位に着目して、TKGのクエリ固有のサブ構造を符号化し、グラフを通して注意を伝播することでパスベースの推論を行う。我々の実証実験は、T-GAPが最先端のベースラインに対して優れた性能を発揮するだけでなく、目に見えないタイムスタンプを持つクエリにも有能に一般化できることを示した。また, T-GAPは透明な解釈性から, その推論過程において人間の直感に従うことが示唆された。 Temporal knowledge graphs (TKGs) inherently reflect the transient nature of real-world knowledge, as opposed to static knowledge graphs. Naturally, automatic TKG completion has drawn much research interest for a more realistic modeling of relational reasoning. However, most of the existing models for TKG completion extend static KG embeddings that do not fully exploit TKG structure, thus lacking in 1) accounting for temporally relevant events already residing in the local neighborhood of a query, and 2) path-based inference that facilitates multi-hop reasoning and better interpretability. In this paper, we propose T-GAP, a novel model for TKG completion that maximally utilizes both temporal information and graph structure in its encoder and decoder. T-GAP encodes query-specific substructure of TKG by focusing on the temporal displacement between each event and the query timestamp, and performs path-based inference by propagating attention through the graph. Our empirical experiments demonstrate that T-GAP not only achieves superior performance against state-of-the-art baselines, but also competently generalizes to queries with unseen timestamps. 
Through extensive qualitative analyses, we also show that T-GAP enjoys transparent interpretability and follows human intuition in its reasoning process.	翻訳日:2021-05-01 16:54:57 公開日:2020-12-19
# (参考訳) マルチデコーダアテンションモデルによる車両経路問題の可視化 Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle Routing Problems ( http://arxiv.org/abs/2012.10638v1 ) ライセンス: CC BY 4.0	Liang Xin, Wen Song, Zhiguang Cao, Jie Zhang	(参考訳) 車両経路問題に対する建設ヒューリスティックスを学習するための新しい強化学習手法を提案する。具体的には,多種多様なポリシーを学習するためのMDAM(Multi-Decoder Attention Model)を提案する。 MDAMの多様性を完全に活用するために、カスタマイズされたビームサーチ戦略が設計されている。また,提案手法では,mdamにおける再帰的構造に基づく埋め込みの可視化層を提案し,より情報的な埋め込みを提供することで,各ポリシーの質を向上させることができる。 6種類の経路問題に対する広範囲な実験により,本手法が最先端のディープラーニングモデルを大きく上回っていることが示された。 We present a novel deep reinforcement learning method to learn construction heuristics for vehicle routing problems. In specific, we propose a Multi-Decoder Attention Model (MDAM) to train multiple diverse policies, which effectively increases the chance of finding good solutions compared with existing methods that train only one policy. A customized beam search strategy is designed to fully exploit the diversity of MDAM. In addition, we propose an Embedding Glimpse layer in MDAM based on the recursive nature of construction, which can improve the quality of each policy by providing more informative embeddings. Extensive experiments on six different routing problems show that our method significantly outperforms the state-of-the-art deep learning based models.	翻訳日:2021-05-01 16:06:12 公開日:2020-12-19
# (参考訳) 行動認識のためのSMARTフレーム選択 SMART Frame Selection for Action Recognition ( http://arxiv.org/abs/2012.10671v1 ) ライセンス: CC BY 4.0	Shreyank N Gowda, Marcus Rohrbach, Laura Sevilla-Lara	(参考訳) 動作認識は計算コストが高い。本稿では,アクション認識の精度を向上させるために,フレーム選択の問題に対処する。特に,優れたフレームの選択は,トリミングされたビデオ領域においても行動認識性能に寄与することを示す。最近の研究は、多くのコンテンツが関係なく、廃棄が容易な長いビデオに対して、フレーム選択の活用に成功している。しかし、本研究では、より標準的でトリミングされた行動認識問題に焦点を当てる。優れたフレーム選択は、行動認識の計算コストを削減できるだけでなく、分類が難しいフレームを除去することで精度を向上させることができると論じる。従来の研究とは対照的に,フレームの選択を一度に考えるのではなく,共同で考える手法を提案する。これにより、ストーリーを語るスナップショットなど、優れたフレームがビデオ上でより効果的に分散する、より効率的な選択が可能になる。提案したフレーム選択SMARTを,異なるバックボーンアーキテクチャと複数のベンチマーク(Kinetics, Something-something, UCF101)で組み合わせてテストする。 SMARTフレーム選択は,計算コストを4倍から10倍に削減しつつ,他のフレーム選択方法と比較して常に精度を向上することを示す。さらに,認識性能を第一の目標とする場合には,近年の最先端モデルや各種ベンチマーク(UCF101, HMDB51, FCVID, ActivityNet)のフレーム選択戦略よりも優れた選択戦略を実現できることを示す。 Action recognition is computationally expensive. In this paper, we address the problem of frame selection to improve the accuracy of action recognition. In particular, we show that selecting good frames helps in action recognition performance even in the trimmed videos domain. Recent work has successfully leveraged frame selection for long, untrimmed videos, where much of the content is not relevant, and easy to discard. In this work, however, we focus on the more standard short, trimmed action recognition problem. We argue that good frame selection can not only reduce the computational cost of action recognition but also increase the accuracy by getting rid of frames that are hard to classify. In contrast to previous work, we propose a method that instead of selecting frames by considering one at a time, considers them jointly. This results in a more efficient selection, where good frames are more effectively distributed over the video, like snapshots that tell a story. We call the proposed frame selection SMART and we test it in combination with different backbone architectures and on multiple benchmarks (Kinetics, Something-something, UCF101). 
We show that the SMART frame selection consistently improves the accuracy compared to other frame selection strategies while reducing the computational cost by a factor of 4 to 10 times. Additionally, we show that when the primary goal is recognition performance, our selection strategy can improve over recent state-of-the-art models and frame selection strategies on various benchmarks (UCF101, HMDB51, FCVID, and ActivityNet).	翻訳日:2021-05-01 15:21:52 公開日:2020-12-19
# (参考訳) コンフューズド・モデュロ・プロジェクションに基づくホモモルフィック暗号化 -セキュアスマートシティにおける暗号システム, ライブラリおよび応用- Confused Modulo Projection based Somewhat Homomorphic Encryption -- Cryptosystem, Library and Applications on Secure Smart Cities ( http://arxiv.org/abs/2012.10692v1 ) ライセンス: CC BY 4.0	Xin Jin, Hongyu Zhang, Xiaodong Li, Haoyang Yu, Beisheng Liu, Shujiang Xie, Amit Kumar Singh and Yujie Li	(参考訳) クラウドコンピューティングの発展に伴い、大規模なビジュアルメディアデータのストレージと処理は徐々にクラウドサーバに移されていった。例えば、インテリジェントなビデオ監視システムが大量のデータをローカルで処理できない場合、データはクラウドにアップロードされる。そのため、元のデータを露呈することなくクラウドでデータを処理する方法が重要な研究テーマとなっている。そこで我々は,CMP-SWHEという混同したモジュラープロジェクション定理に基づく暗号システムの単一サーババージョンを提案し,サーバがユーザデータの有効情報を「seeing」することなく,ブラインドデータ処理を完了できるようにする。クライアント側では、元のデータは増幅、ランダム化、紛らわしい冗長性の設定によって暗号化される。サーバ側で暗号化されたデータを操作することは、元のデータ上での操作と同等である。拡張として,バッチ処理技術に基づく高速化バージョンによるブラインドコンピューティング方式を設計,実装し,効率を向上した。このアルゴリズムを使いやすくするために、cmp-swheに基づく効率的な汎用ブラインドコンピューティングライブラリを設計し実装した。我々は,このライブラリを,スマートシティ構築に有用な,フォアグラウンド抽出,オプティカルフロー追跡,オブジェクト検出に応用した。アルゴリズムをディープラーニングアプリケーションに拡張する方法についても論じる。他の同型暗号システムやライブラリと比較すると,本手法は計算効率において明らかに有利である。我々のアルゴリズムは、データが大きすぎると小さなエラー($10^{-6}$)があるが、非常に効率的で実用的であり、特にブラインド画像やビデオ処理に適している。 With the development of cloud computing, the storage and processing of massive visual media data has gradually transferred to the cloud server. For example, if the intelligent video monitoring system cannot process a large amount of data locally, the data will be uploaded to the cloud. Therefore, how to process data in the cloud without exposing the original data has become an important research topic. We propose a single-server version of somewhat homomorphic encryption cryptosystem based on confused modulo projection theorem named CMP-SWHE, which allows the server to complete blind data processing without seeing the effective information of user data. On the client side, the original data is encrypted by amplification, randomization, and setting confusing redundancy. 
Operating on the encrypted data on the server side is equivalent to operating on the original data. As an extension, we designed and implemented a blind computing scheme of accelerated version based on batch processing technology to improve efficiency. To make this algorithm easy to use, we also designed and implemented an efficient general blind computing library based on CMP-SWHE. We have applied this library to foreground extraction, optical flow tracking and object detection with satisfactory results, which are helpful for building smart cities. We also discuss how to extend the algorithm to deep learning applications. Compared with other homomorphic encryption cryptosystems and libraries, the results show that our method has obvious advantages in computing efficiency. Although our algorithm has some tiny errors ($10^{-6}$) when the data is too large, it is very efficient and practical, especially suitable for blind image and video processing.	翻訳日:2021-05-01 15:07:03 公開日:2020-12-19
# (参考訳) ファジィ認知マップの進化的アルゴリズム Evolutionary Algorithms for Fuzzy Cognitive Maps ( http://arxiv.org/abs/2102.01012v1 ) ライセンス: CC BY 4.0	Stefanos Tsimenidis	(参考訳) ファジィ認知マップ(fcms)は複雑なシステムモデリング手法であり、その特異な利点のために最近人気が高まっている。それらは、モデル化されるシステムのパラメータ間の因果関係を表すグラフに基づいており、解釈可能性と柔軟性のために際立っている。近年のFCMの普及に伴い、モデルの開発と最適化のための研究が数多く行われている。 FCMの最も重要な要素の1つは、彼らが使用する学習アルゴリズムであり、その有効性は、主にそれによって決定される。学習アルゴリズムは、所望の行動に収束することを目的として、fcmのノード重みを学習する。本研究は、FCMの学習に使用される遺伝的アルゴリズムを概説するとともに、FCM学習アルゴリズムの概要を概説し、進化的コンピューティングをより広い文脈に導入する。 Fuzzy Cognitive Maps (FCMs) is a complex systems modeling technique which, due to its unique advantages, has lately risen in popularity. They are based on graphs that represent the causal relationships among the parameters of the system to be modeled, and they stand out for their interpretability and flexibility. With the late popularity of FCMs, a plethora of research efforts have taken place to develop and optimize the model. One of the most important elements of FCMs is the learning algorithm they use, and their effectiveness is largely determined by it. The learning algorithms learn the node weights of an FCM, with the goal of converging towards the desired behavior. The present study reviews the genetic algorithms used for training FCMs, as well as gives a general overview of the FCM learning algorithms, putting evolutionary computing into the wider context.	翻訳日:2021-05-01 14:26:27 公開日:2020-12-19
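The abstract above says that FCM learning algorithms adjust the weights so that the map converges towards a desired behavior. A minimal sketch of what such a learner might evaluate is given below. It assumes one common FCM formulation (a sigmoid applied to the weighted sum of incoming concept activations) and a squared-error objective; neither is taken from the paper, the names `fcm_step`, `simulate` and `trajectory_error` are hypothetical, and the genetic search loop itself is omitted.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Squash an activation into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fcm_step(state: Sequence[float], weights: Sequence[Sequence[float]]) -> List[float]:
    """Advance the map by one step.

    Assumption: weights[j][i] is the causal influence of concept j on concept i,
    and the new activation of i is sigmoid(sum_j weights[j][i] * state[j]).
    """
    n = len(state)
    return [
        sigmoid(sum(weights[j][i] * state[j] for j in range(n)))
        for i in range(n)
    ]


def simulate(initial: Sequence[float], weights: Sequence[Sequence[float]], steps: int) -> List[List[float]]:
    """Return the trajectory [initial, state_1, ..., state_steps]."""
    trajectory = [list(initial)]
    for _ in range(steps):
        trajectory.append(fcm_step(trajectory[-1], weights))
    return trajectory


def trajectory_error(weights: Sequence[Sequence[float]], observed: Sequence[Sequence[float]]) -> float:
    """Sum of squared differences between a simulated and an observed trajectory.

    An evolutionary learner could use this value (lower is better) as the
    fitness of a candidate weight matrix.
    """
    simulated = simulate(observed[0], weights, len(observed) - 1)
    return sum(
        (s - o) ** 2
        for sim_row, obs_row in zip(simulated, observed)
        for s, o in zip(sim_row, obs_row)
    )


# Hypothetical two-concept map in which concept 0 excites concept 1.
example_weights = [[0.0, 1.0], [0.0, 0.0]]
example_next = fcm_step([1.0, 0.0], example_weights)
```

A candidate weight matrix that reproduces the observed trajectory exactly scores zero, so a search procedure would prefer candidates with smaller `trajectory_error`.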
# (参考訳) 各種駆動サイクルにおけるリチウムイオン電池の電荷推定のためのNARXNNの解析 Analysis of NARXNN for State of Charge Estimation for Li-ion Batteries on various Drive Cycles ( http://arxiv.org/abs/2012.10725v1 ) ライセンス: CC BY 4.0	Aniruddh Herle, Janamejaya Channegowda, Kali Naraharisetti	(参考訳) 電気自動車(EV)は環境に優しいため、急速に普及している。リチウムイオン電池はEV技術の中心であり、EVの重量とコストの大部分に貢献している。充電状態(soc)はevの範囲を予測するのに役立つ非常に重要な指標である。車両の利用可能な範囲が決定できるように、バッテリパックで利用可能なバッテリ容量を正確に推定する必要がある。 SOCを推定する技術は様々である。本稿では,データ駆動アプローチを選択し,外部入出力ニューラルネットワーク(narxnn)を用いた非線形自己回帰ネットワークを用いてsocを正確に推定する。 NARXNNは、文献で利用可能な従来の機械学習技術よりも優れていることが示されている。 NARXNNモデルは、LA92、US06、UDDS、HWFETといった様々なEVドライブサイクル上で開発、テストされ、実世界のシナリオでそのパフォーマンスをテストする。このモデルは,従来の統計的機械学習手法より優れ,平均正方形誤差(MSE)を1e-5の範囲で達成する。 Electric Vehicles (EVs) are rapidly increasing in popularity as they are environment friendly. Lithium Ion batteries are at the heart of EV technology and contribute to most of the weight and cost of an EV. State of Charge (SOC) is a very important metric which helps to predict the range of an EV. There is a need to accurately estimate available battery capacity in a battery pack such that the available range in a vehicle can be determined. There are various techniques available to estimate SOC. In this paper, a data driven approach is selected and a Nonlinear Autoregressive Network with Exogenous Inputs Neural Network (NARXNN) is explored to accurately estimate SOC. NARXNN has been shown to be superior to conventional Machine Learning techniques available in the literature. The NARXNN model is developed and tested on various EV Drive Cycles like LA92, US06, UDDS and HWFET to test its performance on real world scenarios. The model is shown to outperform conventional statistical machine learning methods and achieve a Mean Squared Error (MSE) in the 1e-5 range.	翻訳日:2021-05-01 13:41:53 公開日:2020-12-19
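To make the NARX idea in the abstract above concrete, the sketch below builds the lagged inputs that a nonlinear autoregressive model with exogenous inputs consumes: each target value is predicted from past outputs and past exogenous inputs. This is a generic illustration, not the authors' pipeline. The function names, the lag counts and the choice of exogenous signal (for instance a measured current) are assumptions; the abstract does not specify them.

```python
from typing import List, Sequence, Tuple


def narx_regressors(
    u: Sequence[float],
    y: Sequence[float],
    input_lags: int,
    output_lags: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, targets) for a NARX-style model.

    Each feature row is [y[t-1], ..., y[t-output_lags], u[t-1], ..., u[t-input_lags]]
    and the matching target is y[t]. Here u is a hypothetical exogenous signal
    and y is the quantity to estimate (state of charge in the abstract).
    """
    if len(u) != len(y):
        raise ValueError("u and y must have the same length")
    start = max(input_lags, output_lags)
    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(start, len(y)):
        past_y = [y[t - k] for k in range(1, output_lags + 1)]
        past_u = [u[t - k] for k in range(1, input_lags + 1)]
        features.append(past_y + past_u)
        targets.append(y[t])
    return features, targets


def mean_squared_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average squared difference, the metric named in the abstract."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Made-up toy series for illustration only.
example_u = [1.0, 2.0, 3.0, 4.0]
example_y = [0.9, 0.8, 0.7, 0.6]
example_X, example_targets = narx_regressors(example_u, example_y, input_lags=2, output_lags=1)
```

The rows of `example_X` would then be fed to whatever nonlinear regressor plays the role of the network, and `mean_squared_error` compares its predictions with `example_targets`.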
# (参考訳) 外観テキスト融合による政治ポスターの識別 Political Posters Identification with Appearance-Text Fusion ( http://arxiv.org/abs/2012.10728v1 ) ライセンス: CC BY 4.0	Xuan Qin, Meizhu Liu, Yifan Hu, Christina Moo, Christian M. Riblet, Changwei Hu, Kevin Yen and Haibin Ling	(参考訳) 本稿では,外観特徴とテキストベクトルを効率的に活用し,政治ポスターを他の類似の政治イメージから正確に分類する手法を提案する。この作品の大半は、特定の政治イベントのプロモーションとして設計された政治ポスターに焦点が当てられ、その自動識別によって詳細な統計が生成され、様々な分野での判断ニーズを満たすことができる。政治家や政治イベントの包括的なキーワードリストから始めて、運動やキャンペーンを明示的に支援する3K政治ポスターを含む13K人の政治的イメージを含む、効果的で実用的な政治ポスターデータセットをキュレートする。第二に、このデータセットの徹底的なケーススタディを行い、政治ポスターの一般的なパターンや傾向を分析します。最後に, 外観情報とテキスト情報の両方を組み合わせることによって, 政治的ポスターを高い精度で分類するモデルを提案する。 In this paper, we propose a method that efficiently utilizes appearance features and text vectors to accurately classify political posters from other similar political images. The majority of this work focuses on political posters that are designed to serve as a promotion of a certain political event, and the automated identification of which can lead to the generation of detailed statistics and meets the judgment needs in a variety of areas. Starting with a comprehensive keyword list for politicians and political events, we curate for the first time an effective and practical political poster dataset containing 13K human-labeled political images, including 3K political posters that explicitly support a movement or a campaign. Second, we make a thorough case study for this dataset and analyze common patterns and outliers of political posters. Finally, we propose a model that combines the power of both appearance and text information to classify political posters with significantly high accuracy.	翻訳日:2021-05-01 13:35:20 公開日:2020-12-19
# (参考訳) (決定と回帰)回帰と分類のための木のアンサンブルに基づくカーネル (Decision and regression) tree ensemble based kernels for regression and classification ( http://arxiv.org/abs/2012.10737v1 ) ライセンス: CC BY 4.0	Dai Feng and Richard Baumgartner	(参考訳) Breiman's random forest (RF) や Gradient Boosted Trees (GBT) のような木に基づくアンサンブルは暗黙のカーネルジェネレータと解釈できる。 RFのカーネル・パースペクティブは、その統計的性質を理論的に研究するための原則的な枠組みの開発に使用されている。近年、カーネルの解釈はGBTなど他の木に基づくアンサンブルにも当てはまることが示されている。しかしながら、カーネルとツリーアンサンブル間のリンクの実用性は広く研究されておらず、体系的に評価されていない。本研究の焦点は, RFやGBTを含む木に基づくアンサンブルとカーネルメソッドの相互作用を調べることである。 RFおよびGBTをベースとしたカーネルの性能と特性を連続的および二元的ターゲットからなる総合シミュレーション研究で解明する。その結果,rf/gbtカーネルは,高次元のシナリオにおいて,特にノイズが多い場合において,それぞれのアンサンブルと競合することがわかった。バイナリターゲットでは、RF/GBTカーネルとそのアンサンブルは同等のパフォーマンスを示す。回帰と分類のための実際のデータセットの結果を提供し、これらの洞察が実際にどのように活用されるかを示します。全体として、私たちの結果は、実践者のツールボックスに価値ある追加として、ツリーアンサンブルベースのカーネルをサポートします。最後に,サバイバルターゲット,解釈可能なプロトタイプ,ランドマーク分類と回帰のためのツリーアンサンブルベースのカーネルの拡張について述べる。我々は, 頻度論的ツリーアンサンブルのベイズ版によるカーネルの研究の今後の展開について概説する。 Tree based ensembles such as Breiman's random forest (RF) and Gradient Boosted Trees (GBT) can be interpreted as implicit kernel generators, where the ensuing proximity matrix represents the data-driven tree ensemble kernel. The kernel perspective on the RF has been used to develop a principled framework for theoretical investigation of its statistical properties. Recently, it has been shown that the kernel interpretation is germane to other tree-based ensembles, e.g., GBTs. However, the practical utility of the links between kernels and the tree ensembles has not been widely explored and systematically evaluated. The focus of our work is the investigation of the interplay between kernel methods and the tree based ensembles including the RF and GBT. We elucidate the performance and properties of the RF and GBT based kernels in a comprehensive simulation study comprising continuous and binary targets. 
We show that for continuous targets, the RF/GBT kernels are competitive with their respective ensembles in higher-dimensional scenarios, particularly in cases with a larger number of noisy features. For the binary target, the RF/GBT kernels and their respective ensembles exhibit comparable performance. We provide the results from real-life data sets for regression and classification to show how these insights may be leveraged in practice. Overall, our results support the tree ensemble based kernels as a valuable addition to the practitioner's toolbox. Finally, we discuss extensions of the tree ensemble based kernels for survival targets, interpretable prototype and landmarking classification and regression. We outline future lines of research for kernels furnished by Bayesian counterparts of the frequentist tree ensembles.	翻訳日:2021-05-01 13:26:31 公開日:2020-12-19
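The abstract above describes the proximity matrix of a tree ensemble as a data-driven kernel. One widely used definition, sketched below, sets the kernel value of two samples to the fraction of trees in which they fall into the same leaf. This is an illustrative assumption rather than the exact construction used in the paper. The function name `proximity_kernel` is hypothetical, and the leaf indices are assumed to be available already (for example, scikit-learn forests expose them through their `apply` method).

```python
from typing import List, Sequence


def proximity_kernel(leaves: Sequence[Sequence[int]]) -> List[List[float]]:
    """Return the n x n proximity matrix for n samples.

    leaves[i][t] is the index of the leaf that sample i reaches in tree t.
    Entry (i, j) is the fraction of trees in which samples i and j share a leaf.
    """
    n = len(leaves)
    if n == 0:
        return []
    n_trees = len(leaves[0])
    if n_trees == 0:
        raise ValueError("at least one tree is required")
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            value = shared / n_trees
            kernel[i][j] = value  # the matrix is symmetric by construction
            kernel[j][i] = value
    return kernel


# Hypothetical leaf assignments: 3 samples, 4 trees.
example_leaves = [
    [1, 4, 2, 7],
    [1, 4, 3, 7],
    [5, 6, 3, 0],
]
K = proximity_kernel(example_leaves)
```

The resulting matrix is symmetric with ones on the diagonal, so it can be passed to any method that accepts a precomputed kernel.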
# (参考訳) GlocalNet: クラスを意識した長期人間の動作合成 GlocalNet: Class-aware Long-term Human Motion Synthesis ( http://arxiv.org/abs/2012.10744v1 ) ライセンス: CC BY 4.0	Neeraj Battan, Yudhik Agrawal, Veeravalli Saisooryarao, Aman Goel and Avinash Sharma	(参考訳) Augmented Reality, 3Dキャラクタアニメーション, 歩行者軌道予測などに適用可能な, 人間中心のビデオ生成を支援するためには, 長期人間の骨格配列の合成が不可欠である。ポーズ間の長期的時間的依存関係、ポーズ間の周期的反復、ポーズ間の双方向およびマルチスケールの依存関係、行動の変動速度、および人間の活動の複数のクラス/タイプにまたがる時間的ポーズ変動の空間が部分的に重なり合うため、長期的人間の動作合成は難しい課題である。本稿では,多種多様な活動クラス (>50) において,長期的(6000ms以下)人間の運動軌跡を合成する課題を解決することを目的とする。本稿では,この目標を達成するための2段階のアクティビティ生成手法を提案する。第1段階は,活動系列の長期的グローバルなポーズ依存性を学習し,スパース動作軌跡を合成し,第2段階は第1段階の出力を取り入れた濃密な動き軌跡の生成に対処する。公開されているデータセットの様々な定量的評価指標を用いて,SOTA法よりも提案手法の方が優れていることを示す。 Synthesis of long-term human motion skeleton sequences is essential to aid human-centric video generation with potential applications in Augmented Reality, 3D character animations, pedestrian trajectory prediction, etc. Long-term human motion synthesis is a challenging task due to multiple factors like, long-term temporal dependencies among poses, cyclic repetition across poses, bi-directional and multi-scale dependencies among poses, variable speed of actions, and a large as well as partially overlapping space of temporal pose variations across multiple class/types of human activities. This paper aims to address these challenges to synthesize a long-term (> 6000 ms) human motion trajectory across a large variety of human activity classes (>50). We propose a two-stage activity generation method to achieve this goal, where the first stage deals with learning the long-term global pose dependencies in activity sequences by learning to synthesize a sparse motion trajectory while the second stage addresses the generation of dense motion trajectories taking the output of the first stage. We demonstrate the superiority of the proposed method over SOTA methods using various quantitative evaluation metrics on publicly available datasets.	翻訳日:2021-05-01 13:08:52 公開日:2020-12-19
# (参考訳) 風環境における汚染センサの最適配置 Optimising Placement of Pollution Sensors in Windy Environments ( http://arxiv.org/abs/2012.10770v1 ) ライセンス: CC BY 4.0	Sigrid Passano Hellan, Christopher G. Lucas and Nigel H. Goddard	(参考訳) 大気汚染は世界で最も重要な死亡原因の1つである。大気汚染のモニタリングは、健康と汚染物質との関係についてより深く学び、介入すべき領域を特定するのに有用である。このようなモニタリングは高価であるため、センサーをできるだけ効率的に設置することが重要である。ベイズ最適化はセンサの位置を選択するのに有用であることが証明されているが、一般的には大気汚染の統計構造を無視したカーネル機能に依存している。本稿では,2つの新しい風化カーネルについて述べるとともに,ベイズ最適化による最大汚染箇所の学習を積極的に行うことの利点について考察する。 Air pollution is one of the most important causes of mortality in the world. Monitoring air pollution is useful to learn more about the link between health and pollutants, and to identify areas for intervention. Such monitoring is expensive, so it is important to place sensors as efficiently as possible. Bayesian optimisation has proven useful in choosing sensor locations, but typically relies on kernel functions that neglect the statistical structure of air pollution, such as the tendency of pollution to propagate in the prevailing wind direction. We describe two new wind-informed kernels and investigate their advantage for the task of actively learning locations of maximum pollution using Bayesian optimisation.	翻訳日:2021-05-01 12:57:40 公開日:2020-12-19
# (参考訳) データマイニング変数による信頼性のある因果推論の実現:測定誤差問題に対するランダムフォレストアプローチ Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem ( http://arxiv.org/abs/2012.10790v1 ) ライセンス: CC BY 4.0	Mochen Yang, Edward McFowland III, Gordon Burtch and Gediminas Adomavicius	(参考訳) 機械学習と計量分析の組み合わせは、研究と実践の両方でますます普及している。一般的な実証的戦略は、利用可能なデータから興味のある変数を「マイニング」するために予測モデリング技術を適用し、その後、因果効果を推定する目的で、それらの変数を計量的フレームワークに含めることである。最近の研究は、機械学習モデルからの予測は必然的に不完全であるため、予測変数に基づく計量分析は測定誤差によるバイアスに悩まされる可能性が高いことを強調している。本稿では,ランダム森林として知られるアンサンブル学習手法を利用して,これらのバイアスを軽減する新しい手法を提案する。予測に埋め込まれた測定誤差に対処するために,予測だけでなく,機器変数の生成にもランダムフォレストを用いることを提案する。ランダムフォレストアルゴリズムは、予測において個別に正確でありながら「異なる」誤り、すなわち弱い相関の予測誤差を生じさせる一連の木からなる場合に最もよく機能する。鍵となる観察は、これらの性質が有効な機器変数の関連性と排除要件に密接に関連していることである。ランダムな森林から個々の樹木のタプルを選抜するデータ駆動手法を考案し,1つの木が内生的共変量体として,もう1つの木がその道具として機能する。シミュレーション実験により, 推定バイアスの軽減における提案手法の有効性と, バイアス補正のための3つの代替手法よりも優れた性能を示す。 Combining machine learning with econometric analysis is becoming increasingly prevalent in both research and practice. A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest from available data, followed by the inclusion of those variables into an econometric framework, with the objective of estimating causal effects. Recent work highlights that, because the predictions from machine learning models are inevitably imperfect, econometric analyses based on the predicted variables are likely to suffer from bias due to measurement error. We propose a novel approach to mitigate these biases, leveraging the ensemble learning technique known as the random forest. We propose employing random forest not just for prediction, but also for generating instrumental variables to address the measurement error embedded in the prediction. 
The random forest algorithm performs best when comprised of a set of trees that are individually accurate in their predictions, yet which also make 'different' mistakes, i.e., have weakly correlated prediction errors. A key observation is that these properties are closely related to the relevance and exclusion requirements of valid instrumental variables. We design a data-driven procedure to select tuples of individual trees from a random forest, in which one tree serves as the endogenous covariate and the other trees serve as its instruments. Simulation experiments demonstrate the efficacy of the proposed approach in mitigating estimation biases and its superior performance over three alternative methods for bias correction.	翻訳日:2021-05-01 12:38:52 公開日:2020-12-19
# (参考訳) 分離データに基づく逆ロバスト線形分類のサンプル複雑性 Sample Complexity of Adversarially Robust Linear Classification on Separated Data ( http://arxiv.org/abs/2012.10794v1 ) ライセンス: CC BY 4.0	Robi Bhattacharjee, Somesh Jha, Kamalika Chaudhuri	(参考訳) 対向的堅牢性を伴う学習の複雑さについて考察する。この問題の最も初期の理論的な結果は、データの異なるクラスが近接したり重なり合うような設定を考えることである。現実の応用に動機づけられ、対照的に、完全な正確性と堅牢性を持った分類器が存在する、十分に分離されたケースを検討し、サンプル複雑性が全く異なるストーリーを成すことを示す。具体的には、線形分類器に対して、任意のアルゴリズムの期待ロバストな損失が少なくとも$\omega(\frac{d}{n})$であるようなよく分離された分布のクラスを示し、一方max marginアルゴリズムは標準損失$o(\frac{1}{n})$を期待する。これは、従来の技術では得られない標準と堅牢な損失のギャップを示している。さらに,ロバスト性半径がクラス間のギャップよりはるかに小さい場合において,ロバスト損失が期待される解が$o(\frac{1}{n})$となるようなアルゴリズムを提案する。これは、非常によく分離されたデータの場合、$o(\frac{1}{n})$の収束率は達成可能であることを示している。我々の結果は、$p > 1$ ($p = \infty$を含む) の任意の$\ell_p$ノルムで測定されたロバスト性に適用できる。 We consider the sample complexity of learning with adversarial robustness. Most prior theoretical results for this problem have considered a setting where different classes in the data are close together or overlapping. Motivated by some real applications, we consider, in contrast, the well-separated case where there exists a classifier with perfect accuracy and robustness, and show that the sample complexity narrates an entirely different story. Specifically, for linear classifiers, we show a large class of well-separated distributions where the expected robust loss of any algorithm is at least $\Omega(\frac{d}{n})$, whereas the max margin algorithm has expected standard loss $O(\frac{1}{n})$. This shows a gap in the standard and robust losses that cannot be obtained via prior techniques. Additionally, we present an algorithm that, given an instance where the robustness radius is much smaller than the gap between the classes, gives a solution with expected robust loss is $O(\frac{1}{n})$. This shows that for very well-separated data, convergence rates of $O(\frac{1}{n})$ are achievable, which is not the case otherwise. 
Our results apply to robustness measured in any $\ell_p$ norm with $p > 1$ (including $p = \infty$).	翻訳日:2021-05-01 12:01:01 公開日:2020-12-19
# (参考訳) 確率的依存グラフ Probabilistic Dependency Graphs ( http://arxiv.org/abs/2012.10800v1 ) ライセンス: CC BY 4.0	Oliver Richardson, Joseph Y Halpern	(参考訳) 我々は,有向グラフィカルモデルの新しいクラスである確率依存グラフ(pdgs)を導入する。 pdgは自然な方法で一貫性のない信念を捉えることができ、ベイジアンネットワーク(bns)よりもモジュラーであり、新しい情報を取り入れ、表現を再構成しやすくする。 PDGが特に自然なモデリングツールであることを示す。 PDGに対する3つのセマンティクスを提供し、それぞれが、PDGとの不整合性を表すものとみなすことのできるスコアリング関数(ネットワーク上の変数の共役分布)から導出することができる。 BNに対応するPDGに対して、この関数はBNが表す分布によって一意に最小化され、PDG意味論がBN意味論を拡張することを示す。さらに,因子グラフとその指数関数族は pdg として忠実に表現できるが,因子グラフを用いた pdg のモデル化には大きな障壁がある。 We introduce Probabilistic Dependency Graphs (PDGs), a new class of directed graphical models. PDGs can capture inconsistent beliefs in a natural way and are more modular than Bayesian Networks (BNs), in that they make it easier to incorporate new information and restructure the representation. We show by example how PDGs are an especially natural modeling tool. We provide three semantics for PDGs, each of which can be derived from a scoring function (on joint distributions over the variables in the network) that can be viewed as representing a distribution's incompatibility with the PDG. For the PDG corresponding to a BN, this function is uniquely minimized by the distribution the BN represents, showing that PDG semantics extend BN semantics. We show further that factor graphs and their exponential families can also be faithfully represented as PDGs, while there are significant barriers to modeling a PDG with a factor graph.	翻訳日:2021-05-01 11:58:23 公開日:2020-12-19
# 修正による学習:弱い監督で数学の単語問題を解決する Learning by Fixing: Solving Math Word Problems with Weak Supervision ( http://arxiv.org/abs/2012.10582v1 ) ライセンス: Link先を確認	Yining Hong, Qing Li, Daniel Ciao, Siyuan Haung, Song-Chun Zhu	(参考訳) 数学用語問題(mwps)の従来のニューラルネットワークソルバは、完全な監視によって学習され、多様なソリューションを生み出すことができない。本稿では,MWPを学習するための weakly-supervised パラダイムを導入することでこの問題に対処する。この手法は最終回答のアノテーションのみを必要とし、単一の問題に対して様々な解決策を生成できる。弱い教師付き学習を促進するために,シンボリック推論によるニューラルネットワークの誤認識を補正する新しい learning-by-fixing (lbf) フレームワークを提案する。具体的には、ニューラルネットワークによって生成された誤った解木に対して、fixing メカニズムは、ルートノードから葉ノードへのエラーを伝搬し、最も確率の高い修正を推測して、所望の回答を得る。より多様なソリューションを生成するために、ソリューション空間の効率的な縮小と探索を導くために tree regularization が適用され、各問題で発見された様々な修正を追跡し保存する memory buffer が設計されている。 Math23Kデータセットによる実験結果から,提案したLBFフレームワークは,弱教師付き学習における強化学習ベースラインを著しく上回ることがわかった。さらに、完全な教師付き手法よりも優れたトップ1とトップ3/5の回答精度を実現し、多様なソリューションを生み出す上での強みを示している。 Previous neural solvers of math word problems (MWPs) are learned with full supervision and fail to generate diverse solutions. In this paper, we address this issue by introducing a weakly-supervised paradigm for learning MWPs. Our method only requires the annotations of the final answers and can generate various solutions for a single problem. To boost weakly-supervised learning, we propose a novel learning-by-fixing (LBF) framework, which corrects the misperceptions of the neural network via symbolic reasoning. Specifically, for an incorrect solution tree generated by the neural network, the fixing mechanism propagates the error from the root node to the leaf nodes and infers the most probable fix that can be executed to get the desired answer. To generate more diverse solutions, tree regularization is applied to guide the efficient shrinkage and exploration of the solution space, and a memory buffer is designed to track and save the discovered various fixes for each problem. 
Experimental results on the Math23K dataset show the proposed LBF framework significantly outperforms reinforcement learning baselines in weakly-supervised learning. Furthermore, it achieves comparable top-1 and much better top-3/5 answer accuracies than fully-supervised methods, demonstrating its strength in producing diverse solutions.	翻訳日:2021-05-01 11:18:35 公開日:2020-12-19
# ストレートスルーガムベル・ソフトマックス推定器を用いた視覚参照ゲームにおける体系的一般化と構成性について On (Emergent) Systematic Generalisation and Compositionality in Visual Referential Games with Straight-Through Gumbel-Softmax Estimator ( http://arxiv.org/abs/2012.10776v1 ) ライセンス: Link先を確認	Kevin Denamgana\"i and James Alfred Walker	(参考訳) 2つの(またはそれ以上の)エージェントが非視覚的な参照ゲームを行うときに現れる人工言語における構成性のドライバは、強化アルゴリズムと(神経)反復学習モデルに基づくアプローチを用いて以前に研究されてきた。より最近の textit{Straight-Through Gumbel-Softmax} (ST-GS) アプローチの導入に続いて,本研究では,ST-GS の文脈において,これまでフィールドで認識されていた構成性の要因がどの程度適用され,また,視覚的参照ゲームにおいて,それらが体系的一般化能力(創発的)にどの程度変換されるかを検討する。地形類似性とゼロショット合成テストを用いて,創発言語の構成性と一般化能力を評価する。第一に,テストトレイン分割戦略が視覚刺激の処理においてゼロショット構成テストに大きく影響することを示す一方で,シンボル刺激の処理では影響しないことを示す。第2に,st-gsアプローチをバッチサイズとオーバーコンプリート通信チャネルで使用すると,新興言語のコンポジション性が向上することを示す実証的証拠がある。それにもかかわらず、視覚的な刺激を扱う場合、バッチサイズの影響はそれほど明確ではない。また,全通信チャネルが等しく作成されるわけではないことを示した。実際、最大文長の増大は、合成能力と一般化能力の両方に有益であるが、語彙サイズの増加は有害である。最後に,視覚刺激を伴う識別的参照ゲームにおいて,学習時の言語構成性とエージェントの一般化能力の相関性の欠如が観察された。これは、シンボリック刺激を伴う生成変異体を用いたフィールドでの以前の観測と似ている。 The drivers of compositionality in artificial languages that emerge when two (or more) agents play a non-visual referential game has been previously investigated using approaches based on the REINFORCE algorithm and the (Neural) Iterated Learning Model. Following the more recent introduction of the \textit{Straight-Through Gumbel-Softmax} (ST-GS) approach, this paper investigates to what extent the drivers of compositionality identified so far in the field apply in the ST-GS context and to what extent do they translate into (emergent) systematic generalisation abilities, when playing a visual referential game. Compositionality and the generalisation abilities of the emergent languages are assessed using topographic similarity and zero-shot compositional tests. Firstly, we provide evidence that the test-train split strategy significantly impacts the zero-shot compositional tests when dealing with visual stimuli, whilst it does not when dealing with symbolic ones. 
Secondly, empirical evidence shows that using the ST-GS approach with small batch sizes and an overcomplete communication channel improves compositionality in the emerging languages. Nevertheless, while shown robust with symbolic stimuli, the effect of the batch size is not so clear-cut when dealing with visual stimuli. Our results also show that not all overcomplete communication channels are created equal. Indeed, while increasing the maximum sentence length is found to be beneficial to further both compositionality and generalisation abilities, increasing the vocabulary size is found detrimental. Finally, a lack of correlation between the language compositionality at training-time and the agents' generalisation abilities is observed in the context of discriminative referential games with visual stimuli. This is similar to previous observations in the field using the generative variant with symbolic stimuli.	翻訳日:2021-05-01 11:18:10 公開日:2020-12-19
# 不変表現学習における基本限界とトレードオフ Fundamental Limits and Tradeoffs in Invariant Representation Learning ( http://arxiv.org/abs/2012.10713v1 ) ライセンス: Link先を確認	Han Zhao, Chen Dan, Bryon Aragam, Tommi S. Jaakkola, Geoffrey J. Gordon, Pradeep Ravikumar	(参考訳) 多くの機械学習アプリケーションは、2つの競合する目標を達成する学習表現を含んでいる: 機能のサブセット(例えば、予測のために)に関する情報や精度を最大化し、同時に別の、潜在的に重複している、機能のサブセット(例えば、公正性、プライバシーなど)に関して不変または独立性を最大化する。典型的な例としては、プライバシー保護学習、ドメイン適応、アルゴリズムフェアネスなどがある。実際、上記の問題はすべて、その平衡が精度と不変性の基本的なトレードオフを表す、共通のミニマックスゲーム理論の定式化を受け入れている。上記の領域における豊富な応用にもかかわらず、不変表現の極限とトレードオフに関する理論的理解は著しく不足している。本稿では,分類と回帰設定の両方において,この一般的かつ重要な問題を情報論的に解析する。いずれの場合においても、情報平面における実現可能領域の幾何学的特徴付けを提供し、この実現可能領域の幾何学的性質とトレードオフ問題の基本的な制限を結びつけることで、精度と不変性の固有のトレードオフを分析する。回帰設定では、精度と不変性の間のトレードオフを定量化するラグランジアン目的の厳密な下限も導出する。この低い境界は、関節分布のスペクトル特性を通じてトレードオフをよりよく理解する。いずれの場合も,正確性と不変性の間の相互作用に関する洞察を提供することで,この根本的な問題に新たな光を当てた。これらの結果は、この根本的な問題の理解を深め、対向表現学習アルゴリズムの設計を導くのに役立つかもしれない。 Many machine learning applications involve learning representations that achieve two competing goals: To maximize information or accuracy with respect to a subset of features (e.g.\ for prediction) while simultaneously maximizing invariance or independence with respect to another, potentially overlapping, subset of features (e.g.\ for fairness, privacy, etc). Typical examples include privacy-preserving learning, domain adaptation, and algorithmic fairness, just to name a few. In fact, all of the above problems admit a common minimax game-theoretic formulation, whose equilibrium represents a fundamental tradeoff between accuracy and invariance. Despite its abundant applications in the aforementioned domains, theoretical understanding on the limits and tradeoffs of invariant representations is severely lacking. In this paper, we provide an information-theoretic analysis of this general and important problem under both classification and regression settings. 
In both cases, we analyze the inherent tradeoffs between accuracy and invariance by providing a geometric characterization of the feasible region in the information plane, where we connect the geometric properties of this feasible region to the fundamental limitations of the tradeoff problem. In the regression setting, we also derive a tight lower bound on the Lagrangian objective that quantifies the tradeoff between accuracy and invariance. This lower bound leads to a better understanding of the tradeoff via the spectral properties of the joint distribution. In both cases, our results shed new light on this fundamental problem by providing insights on the interplay between accuracy and invariance. These results deepen our understanding of this fundamental problem and may be useful in guiding the design of adversarial representation learning algorithms.	翻訳日:2021-05-01 11:17:41 公開日:2020-12-19
# 不確実性を考慮した政策最適化:ロバストで適応的な信頼領域アプローチ Uncertainty-Aware Policy Optimization: A Robust, Adaptive Trust Region Approach ( http://arxiv.org/abs/2012.10791v1 ) ライセンス: Link先を確認	James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras	(参考訳) 強化学習技術が実世界の意思決定プロセスで有用になるためには、限られたデータから堅牢なパフォーマンスを生み出す必要がある。深いポリシー最適化手法は複雑なタスクで素晴らしい結果を得ていますが、実際の採用は、成功するためにかなりの量のデータを必要とするため、限られています。小さなサンプルサイズと組み合わせると、これらの手法は高次元のサンプルベース推定に依存するため不安定な学習をもたらす。本研究では,これらの推定値がもたらす不確実性を制御する手法を開発する。我々は,これらの手法を活用して,データが不足しても安定したパフォーマンスを実現するように設計された,深いポリシー最適化手法を提案する。得られたアルゴリズムである不確実性認識地域政策最適化は、学習プロセスを通じて存在する不確実性レベルに適応する堅牢なポリシー更新を生成する。 In order for reinforcement learning techniques to be useful in real-world decision making processes, they must be able to produce robust performance from limited data. Deep policy optimization methods have achieved impressive results on complex tasks, but their real-world adoption remains limited because they often require significant amounts of data to succeed. When combined with small sample sizes, these methods can result in unstable learning due to their reliance on high-dimensional sample-based estimates. In this work, we develop techniques to control the uncertainty introduced by these estimates. We leverage these techniques to propose a deep policy optimization approach designed to produce stable performance even when data is scarce. The resulting algorithm, Uncertainty-Aware Trust Region Policy Optimization, generates robust policy updates that adapt to the level of uncertainty present throughout the learning process.	翻訳日:2021-05-01 11:17:19 公開日:2020-12-19
# コミュニケーションを意識した協調学習 Communication-Aware Collaborative Learning ( http://arxiv.org/abs/2012.10569v1 ) ライセンス: Link先を確認	Avrim Blum, Shelby Heinecke, Lev Reyzin	(参考訳) ノイズレス協調pac学習のアルゴリズムは,近年,サンプル複雑性に関して解析・最適化されている。本稿では,通信コストの削減を目標とし,サンプル複雑性に対して実質的にペナルティを伴わない協調的pac学習について検討する。分散ブースティングを用いた通信効率の高い協調pac学習アルゴリズムを開発した。次に,分類ノイズの存在下での協調学習のコミュニケーションコストを検討する。中間段階として、協調的なPAC学習アルゴリズムが分類ノイズにどのように適応できるかを示す。そこで本研究では,ノイズ分類に頑健な協調pac学習のための通信効率の高いアルゴリズムを開発した。 Algorithms for noiseless collaborative PAC learning have been analyzed and optimized in recent years with respect to sample complexity. In this paper, we study collaborative PAC learning with the goal of reducing communication cost at essentially no penalty to the sample complexity. We develop communication efficient collaborative PAC learning algorithms using distributed boosting. We then consider the communication cost of collaborative learning in the presence of classification noise. As an intermediate step, we show how collaborative PAC learning algorithms can be adapted to handle classification noise. With this insight, we develop communication efficient algorithms for collaborative PAC learning robust to classification noise.	翻訳日:2021-05-01 11:16:18 公開日:2020-12-19
# 粗大かつ微細なマルチグラフ学習を目指して Towards Coarse and Fine-grained Multi-Graph Multi-Label Learning ( http://arxiv.org/abs/2012.10650v1 ) ライセンス: Link先を確認	Yejiang Wang and Yuhai Zhao and Zhengkui Wang and Chengqi Zhang	(参考訳) Multi-graph Multi-label Learning (\textsc{Mgml})は、複数のグラフを含むラベル付きバッグの集合からマルチラベル分類器を学習することを目的とした教師付き学習フレームワークである。従来のテクニックは、グラフをインスタンスに変換することに基づいて開発され、バッグレベルでのみ未知のラベルを学習することに集中していた。本稿では,グラフ上に学習モデルを直接構築し,\textit{coarse}(バッグ)レベルと \textit{fine-grained}(各バッグ内のグラフ)レベルの両方でラベル予測を可能にする,\textit{coarse} かつ \textit{fine-grained} な Multi-graph Multi-label (cfMGML) 学習フレームワークを提案する。特に,ラベル付きマルチグラフバッグの集合を考えると,グラフレベルとバッグレベルのスコアリング関数を設計し,グラフカーネルを用いてラベルとデータの関連性をモデル化する。一方,グラフとバッグのラベルをランク付けし,ハミングロスを1ステップで同時に最小化するためのしきい値ランク付け目的関数を提案し,従来のランク付けアルゴリズムの誤り蓄積問題に対処することを目的とした。非凸最適化問題に取り組むため,我々はcfMGMLで必要とされる高次元空間計算を扱うための効果的な劣勾配降下アルゴリズムを更に開発する。様々な実世界のデータセットに対する実験は、cfMGMLが最先端のアルゴリズムよりも優れたパフォーマンスを達成することを示した。 Multi-graph multi-label learning (\textsc{Mgml}) is a supervised learning framework, which aims to learn a multi-label classifier from a set of labeled bags each containing a number of graphs. Prior techniques for \textsc{Mgml} are developed based on transferring graphs into instances and focus on learning the unseen labels only at the bag level. In this paper, we propose a \textit{coarse} and \textit{fine-grained} Multi-graph Multi-label (cfMGML) learning framework which directly builds the learning model over the graphs and empowers the label prediction at both the \textit{coarse} (a.k.a. bag) level and the \textit{fine-grained} (a.k.a. graph in each bag) level. In particular, given a set of labeled multi-graph bags, we design the scoring functions at both graph and bag levels to model the relevance between the label and data using specific graph kernels. 
Meanwhile, we propose a thresholding rank-loss objective function to rank the labels for the graphs and bags and minimize the Hamming loss simultaneously in one step, which aims to address the error accumulation issue in traditional rank-loss algorithms. To tackle the non-convex optimization problem, we further develop an effective sub-gradient descent algorithm to handle the high-dimensional space computation required in cfMGML. Experiments over various real-world datasets demonstrate that cfMGML achieves superior performance to state-of-the-art algorithms.	翻訳日:2021-05-01 11:16:10 公開日:2020-12-19
# Top-$k$ Ranking Bayesian Optimization Top-$k$ Ranking Bayesian Optimization ( http://arxiv.org/abs/2012.10688v1 ) ライセンス: Link先を確認	Quoc Phong Nguyen, Sebastian Tay, Bryan Kian Hsiang Low, Patrick Jaillet	(参考訳) 本稿では、top-$k$ ランキングとタイ/無差別(indifference)の観測を扱うための、選好BOの実用的で重要な一般化である top-$k$ ランキングベイズ最適化(top-$k$ ranking BO)への新しいアプローチを提案する。まず、上記の観測に対処できるだけでなく、古典的なランダム効用モデルにも裏付けられたサロゲートモデルを設計する。もう一つの同様に重要な貢献は、選好観測を伴うBOにおける最初の情報理論的獲得関数である多項予測エントロピー探索 (MPES) の導入であり、これらの観測を柔軟に扱い、クエリの全ての入力を同時に最適化する。 MPESは、クエリの入力を一度に1つずつ貪欲に選択する既存の獲得関数と比較して、優れた性能を有する。いくつかの合成ベンチマーク関数、CIFAR-$10$データセット、SUSHI選好データセットを用いてMPESの性能を実証的に評価した。 This paper presents a novel approach to top-$k$ ranking Bayesian optimization (top-$k$ ranking BO) which is a practical and significant generalization of preferential BO to handle top-$k$ ranking and tie/indifference observations. We first design a surrogate model that is not only capable of catering to the above observations, but is also supported by a classic random utility model. Another equally important contribution is the introduction of the first information-theoretic acquisition function in BO with preferential observation called multinomial predictive entropy search (MPES) which is flexible in handling these observations and optimized for all inputs of a query jointly. MPES possesses superior performance compared with existing acquisition functions that select the inputs of a query one at a time greedily. We empirically evaluate the performance of MPES using several synthetic benchmark functions, CIFAR-$10$ dataset, and SUSHI preference dataset.	翻訳日:2021-05-01 11:15:44 公開日:2020-12-19
# アクティブラーニング問題の統一のための情報理論フレームワーク An Information-Theoretic Framework for Unifying Active Learning Problems ( http://arxiv.org/abs/2012.10695v1 ) ライセンス: Link先を確認	Quoc Phong Nguyen, Bryan Kian Hsiang Low, Patrick Jaillet	(参考訳) 本稿では、レベルセット推定(LSE)、ベイズ最適化(BO)、およびそれらの一般化変種を統合化するための情報理論フレームワークを提案する。まず,既存のLSEアルゴリズムを仮定し,連続入力領域を用いたLSE問題における最先端性能を実現する,新しい能動学習基準を提案する。そして,LSEとBOの関係を利用して,高い信頼度と最大値エントロピー探索(MES)に興味深いつながりを持つBOの競合情報理論獲得関数を設計する。後者の接続は、MESだけでなく、他のMESベースの取得関数にも重要な意味を持つMESの欠点を明らかにしている。最後に、我々の統合情報理論フレームワークを用いて、複数のレベルセットをデータ効率よく含むLSEとBOの一般化問題を解くことができる。提案アルゴリズムの性能を,実世界のデータセット,機械学習モデルのハイパーパラメータチューニングなどを用いて実証的に評価した。 This paper presents an information-theoretic framework for unifying active learning problems: level set estimation (LSE), Bayesian optimization (BO), and their generalized variant. We first introduce a novel active learning criterion that subsumes an existing LSE algorithm and achieves state-of-the-art performance in LSE problems with a continuous input domain. Then, by exploiting the relationship between LSE and BO, we design a competitive information-theoretic acquisition function for BO that has interesting connections to upper confidence bound and max-value entropy search (MES). The latter connection reveals a drawback of MES which has important implications on not only MES but also on other MES-based acquisition functions. Finally, our unifying information-theoretic framework can be applied to solve a generalized problem of LSE and BO involving multiple level sets in a data-efficient manner. We empirically evaluate the performance of our proposed algorithms using synthetic benchmark functions, a real-world dataset, and in hyperparameter tuning of machine learning models.	翻訳日:2021-05-01 11:15:27 公開日:2020-12-19
# 教師なしスケール不変マルチスペクトル形状マッチング Unsupervised Scale-Invariant Multispectral Shape Matching ( http://arxiv.org/abs/2012.10685v1 ) ライセンス: Link先を確認	Idan Pazi, Dvir Ginzburg, Dan Raviv	(参考訳) 非剛性伸縮構造間のアライメントはコンピュータビジョンにおいて最も難しいタスクの1つであり、不変性は一方では定義が困難であり、他方では実際のデータセットにはラベル付きデータが存在しない。本稿では,スケール不変幾何のスペクトルに基づく教師なしニューラルネットワークアーキテクチャを提案する。関数型マップアーキテクチャの上に構築するが、これまで行われてきた局所的な特徴の学習は、等尺的仮定が破れれば十分ではなく、この問題はスケール不変幾何を用いて解けることを示す。本手法は局所的なスケール変形に依存せず,既存の最先端のスペクトル手法と比較して、異なる領域の形状のマッチングにおいて優れた性能を示す。 Alignment between non-rigid stretchable structures is one of the hardest tasks in computer vision since, on the one hand, the invariant properties are hard to define and, on the other hand, no labelled data exists for real datasets. We present an unsupervised neural network architecture based upon the spectrum of scale-invariant geometry. We build on top of the functional maps architecture, but show that learning local features, as done until now, is not enough once the isometric assumption breaks; this can be resolved using scale-invariant geometry. Our method is agnostic to local-scale deformations and shows superior performance for matching shapes from different domains when compared to existing spectral state-of-the-art solutions.	翻訳日:2021-05-01 11:15:11 公開日:2020-12-19
# ロバストディープフェイク検出のための不変テクスチャ違反の同定 Identifying Invariant Texture Violation for Robust Deepfake Detection ( http://arxiv.org/abs/2012.10580v1 ) ライセンス: Link先を確認	Xinwei Sun, Botong Wu, Wei Chen	(参考訳) 既存のdeepfake検出手法では、公開済みの大規模データセットにアクセスすることで、配信結果の有望性が報告されている。しかし、非滑らか合成法により、このデータセットの偽のサンプルは、上記のフレームレベルの検出方法のほとんどに大きく依存していた明らかな人工物(例えば、スターク視覚コントラスト、非滑らか境界)を明らかにする可能性がある。これらのアーティファクトは、現実のメディア偽造には現れないので、現実に近い偽画像に適用すると、上記の方法は大きく劣化する可能性がある。高実在性偽データに対するロバスト性を改善するために、低視品質のデータセットのみにアクセスする不変テクスチャ学習(InTeLe)フレームワークを提案する。本手法は,対象人物から移入されたテクスチャによって,原顔の微視的な顔のテクスチャが必然的に侵害されることから,すべての偽画像間で共有される不変特性と見なすことができる。このようなディープフェイク検出の不変性を学習するために、我々は、プリスタンとフェイクイメージのための異なるデコーダを持つ自動エンコーダフレームワークを導入しました。このような分離により、エンコーダによる抽出された埋め込みは、フェイク画像のテクスチャ違反をキャプチャし、次いで最終プリズム/フェイク予測のための分類器を付加することができる。理論的保証として,このような非分散テクスチャ違反の同定可能性,すなわち観測データから正確に推測できることを示す。本手法の有効性と有用性は,低品質画像から明らかなアーティファクト,高リアリズムの偽画像への一般化を約束することによって実証された。 Existing deepfake detection methods have reported promising in-distribution results, by accessing published large-scale dataset. However, due to the non-smooth synthesis method, the fake samples in this dataset may expose obvious artifacts (e.g., stark visual contrast, non-smooth boundary), which were heavily relied on by most of the frame-level detection methods above. As these artifacts do not come up in real media forgeries, the above methods can suffer from a large degradation when applied to fake images that close to reality. To improve the robustness for high-realism fake data, we propose the Invariant Texture Learning (InTeLe) framework, which only accesses the published dataset with low visual quality. Our method is based on the prior that the microscopic facial texture of the source face is inevitably violated by the texture transferred from the target person, which can hence be regarded as the invariant characterization shared among all fake images. 
To learn such an invariance for deepfake detection, our InTeLe introduces an auto-encoder framework with different decoders for pristine and fake images, which are further appended with a shallow classifier in order to separate out the obvious artifact-effect. Equipped with such a separation, the extracted embedding by encoder can capture the texture violation in fake images, followed by the classifier for the final pristine/fake prediction. As a theoretical guarantee, we prove the identifiability of such an invariance texture violation, i.e., to be precisely inferred from observational data. The effectiveness and utility of our method are demonstrated by promising generalization ability from low-quality images with obvious artifacts to fake images with high realism.	翻訳日:2021-05-01 11:14:43 公開日:2020-12-19
# UAV撮像画像における物体検出のための高密度マルチスケールフュージョンピラミッドネットワーク Dense Multiscale Feature Fusion Pyramid Networks for Object Detection in UAV-Captured Images ( http://arxiv.org/abs/2012.10643v1 ) ライセンス: Link先を確認	Yingjie Liu	(参考訳) 深層学習による物体検出の研究分野では大きな進歩が見られたが、uavで撮影された画像で顕著に発音される小型物体には依然として課題がある。これらの問題に対処するためには、小さなオブジェクトの十分な特徴情報を抽出できる特徴抽出法を探求する必要がある。本稿では,情報伝達と再利用を改良し,よりリッチな特徴を可能な限り得ることを目的とした,高密度多スケール特徴融合ピラミッドネットワーク(dmffpn)と呼ばれる新しい手法を提案する。具体的には、密結合は、異なる畳み込み層からの表現を完全に活用するように設計されている。さらに、第2段階でカスケードアーキテクチャを適用して、ローカライゼーション能力を向上させる。 VisDrone-DETと名付けられたドローンベースのデータセットの実験から,本手法の競合性能が示唆された。 Although much significant progress has been made in the research field of object detection with deep learning, there still exists a challenging task for the objects with small size, which is notably pronounced in UAV-captured images. Addressing these issues, it is a critical need to explore the feature extraction methods that can extract more sufficient feature information of small objects. In this paper, we propose a novel method called Dense Multiscale Feature Fusion Pyramid Networks(DMFFPN), which is aimed at obtaining rich features as much as possible, improving the information propagation and reuse. Specifically, the dense connection is designed to fully utilize the representation from the different convolutional layers. Furthermore, cascade architecture is applied in the second stage to enhance the localization capability. Experiments on the drone-based datasets named VisDrone-DET suggest a competitive performance of our method.	翻訳日:2021-05-01 11:14:12 公開日:2020-12-19
# ネットワーク内の拡張 Augmentation Inside the Network ( http://arxiv.org/abs/2012.10769v1 ) ライセンス: Link先を確認	Maciej Sypetkowski, Jakub Jasiulewicz, Zbigniew Wojna	(参考訳) 本稿では,畳み込みニューラルネットワークの中間機能に対するコンピュータビジョン問題に対するデータ拡張手法をシミュレートする手法である,ネットワーク内部の拡張について述べる。これらの変換を行い、ネットワーク内のデータフローを変更し、可能であれば共通の計算を共有します。提案手法は,TTA法よりもスムーズな速度-精度トレードオフ調整を実現し,良好な結果が得られる。さらに,テスト時間拡張と組み合わせることで,モデル性能をさらに向上させることができる。本手法をimagenet-2012およびcifar-100データセットで検証した。そこで本研究では,フリップテスト時拡張よりも30%高速で,CIFAR-100と同じ結果が得られる修正を提案する。 In this paper, we present augmentation inside the network, a method that simulates data augmentation techniques for computer vision problems on intermediate features of a convolutional neural network. We perform these transformations, changing the data flow through the network, and sharing common computations when it is possible. Our method allows us to obtain smoother speed-accuracy trade-off adjustment and achieves better results than using standard test-time augmentation (TTA) techniques. Additionally, our approach can improve model performance even further when coupled with test-time augmentation. We validate our method on the ImageNet-2012 and CIFAR-100 datasets for image classification. We propose a modification that is 30% faster than the flip test-time augmentation and achieves the same results for CIFAR-100.	翻訳日:2021-05-01 11:13:59 公開日:2020-12-19
# シーケンスラベリングのための不確実性認識ラベルの精密化 Uncertainty-Aware Label Refinement for Sequence Labeling ( http://arxiv.org/abs/2012.10608v1 ) ライセンス: Link先を確認	Tao Gui, Jiacheng Ye, Qi Zhang, Zhengyan Li, Zichu Fei, Yeyun Gong and Xuanjing Huang	(参考訳) ラベルデコードのための条件付きランダムフィールド(CRF)は、シーケンスラベリングタスクにおいてユビキタスになっている。しかし、ローカルラベルの依存関係と非効率なビタビ復号化は常に解決すべき問題であった。本稿では,長期間のラベル依存をモデル化する新しい2段階のラベル復号フレームワークを提案する。ベースモデルは、まずドラフトラベルを予測し、次に、新しい2ストリームの自己アテンションモデルにより、長距離ラベル依存に基づいてこれらのドラフトラベルの予測を洗練し、高速な予測のために並列デコードを実現する。さらに、誤ったドラフトラベルの副作用を軽減するために、ベイズニューラルネットワークは、誤りの確率が高いラベルを示すために使用され、エラー伝播の防止に大いに役立つ。 3つのシークエンスラベリングベンチマーク実験の結果,提案手法はCRF法に勝るだけでなく,推論プロセスを大幅に高速化した。 Conditional random fields (CRF) for label decoding has become ubiquitous in sequence labeling tasks. However, the local label dependencies and inefficient Viterbi decoding have always been a problem to be solved. In this work, we introduce a novel two-stage label decoding framework to model long-term label dependencies, while being much more computationally efficient. A base model first predicts draft labels, and then a novel two-stream self-attention model makes refinements on these draft predictions based on long-range label dependencies, which can achieve parallel decoding for a faster prediction. In addition, in order to mitigate the side effects of incorrect draft labels, Bayesian neural networks are used to indicate the labels with a high probability of being wrong, which can greatly assist in preventing error propagation. The experimental results on three sequence labeling benchmarks demonstrated that the proposed method not only outperformed the CRF-based methods but also greatly accelerated the inference process.	翻訳日:2021-05-01 11:13:48 公開日:2020-12-19
# FraCaS: 時間分析 FraCaS: Temporal Analysis ( http://arxiv.org/abs/2012.10668v1 ) ライセンス: Link先を確認	Jean-Philippe Bernardy, Stergios Chatzikyriakidis	(参考訳) 本稿では,推論問題に適した時間意味論の実装を提案する。この実装は構文木を論理式に変換し、Coq証明アシスタントの消費に適している。我々は、時間参照、時間副詞、アスペクトクラス、プログレッシブなど、いくつかの現象をサポートしている。これらの意味論を完全なFraCaSテストスイートに適用する。全体で81パーセント、時間参照に関連すると明示された問題では73パーセントの精度が得られる。 In this paper, we propose an implementation of temporal semantics which is suitable for inference problems. This implementation translates syntax trees to logical formulas, suitable for consumption by the Coq proof assistant. We support several phenomena including: temporal references, temporal adverbs, aspectual classes and progressives. We apply these semantics to the complete FraCaS test suite. We obtain an accuracy of 81 percent overall and 73 percent for problems explicitly marked as related to temporal reference.	翻訳日:2021-05-01 11:13:32 公開日:2020-12-19
# コモンセンス知識抽出とインジェクションによる語彙制約付きテキスト生成 Lexically-constrained Text Generation through Commonsense Knowledge Extraction and Injection ( http://arxiv.org/abs/2012.10813v1 ) ライセンス: Link先を確認	Yikang Li, Pulkit Goel, Varsha Kuppur Rajendra, Har Simrat Singh, Jonathan Francis, Kaixin Ma, Eric Nyberg, Alessandro Oltramari	(参考訳) 条件付きテキスト生成は、最先端のモデルから人間レベルのパフォーマンスをまだ見ていない難題である。本研究では,特定の入力概念のセットに対して妥当な文を生成することを目的として,commongenベンチマークに注目した。他の作業の進歩にもかかわらず、このデータセットで微調整された大きな事前学習された言語モデルは、構文的に正しいが、人間の常識の理解から定性的に逸脱する文を生成することが多い。さらに、生成されたシーケンスは、パート・オブ・音声と完全な概念カバレッジとを一致させるような語彙要求を満たすことができない。本稿では,コモンセンス推論と語彙制約付きデコードに関して,コモンセンス知識グラフがモデル性能をどのように向上させるかを検討する。本稿では,コンセプションネットからコモンセンス関係を抽出し,注意機構を通じて統一言語モデル(Unified Language Model,UniLM)にこれらの関係を注入し,上記の語彙要求を出力制約により強制する手法を提案する。複数のアブレーションを行うことで、コモンセンスインジェクションは、語彙的要件に準拠しながら、人間の理解とより一致した文を生成することができる。 Conditional text generation has been a challenging task that is yet to see human-level performance from state-of-the-art models. In this work, we specifically focus on the Commongen benchmark, wherein the aim is to generate a plausible sentence for a given set of input concepts. Despite advances in other tasks, large pre-trained language models that are fine-tuned on this dataset often produce sentences that are syntactically correct but qualitatively deviate from a human understanding of common sense. Furthermore, generated sequences are unable to fulfill such lexical requirements as matching part-of-speech and full concept coverage. In this paper, we explore how commonsense knowledge graphs can enhance model performance, with respect to commonsense reasoning and lexically-constrained decoding. We propose strategies for enhancing the semantic correctness of the generated text, which we accomplish through: extracting commonsense relations from Conceptnet, injecting these relations into the Unified Language Model (UniLM) through attention mechanisms, and enforcing the aforementioned lexical requirements through output constraints. 
By performing several ablations, we find that commonsense injection enables the generation of sentences that are more aligned with human understanding, while remaining compliant with lexical requirements.	翻訳日:2021-05-01 11:13:28 公開日:2020-12-19
# Minimaxが復活 Minimax Strikes Back ( http://arxiv.org/abs/2012.10700v1 ) ライセンス: Link先を確認	Quentin Cohen-Solal and Tristan Cazenave	(参考訳) 深層強化学習(DRL)は多くの完全情報ゲームにおいて超人的なレベルに達する。 DRLと組み合わせて使用される最先端の探索アルゴリズムはモンテカルロ木探索 (MCTS) である。我々は、MCTSの代わりにMinimaxアルゴリズムを用いてDRLに別のアプローチを採り、ポリシーではなく状態の評価のみを学習する。複数のゲームにおいて,学習性能と対戦成績の両面で最先端のDRLと競合することを示す。 Deep Reinforcement Learning (DRL) reaches a superhuman level of play in many complete information games. The state-of-the-art search algorithm used in combination with DRL is Monte Carlo Tree Search (MCTS). We take another approach to DRL using a Minimax algorithm instead of MCTS and learning only the evaluation of states, not the policy. We show that for multiple games it is competitive with state-of-the-art DRL in both learning performance and head-to-head confrontations.	翻訳日:2021-05-01 11:12:37 公開日:2020-12-19
# 複数シーンからの3次元形状復元システムにおけるシルエット最適化の重要性 The importance of silhouette optimization in 3D shape reconstruction system from multiple object scenes ( http://arxiv.org/abs/2012.10660v1 ) ライセンス: Link先を確認	Waqqas-ur-Rehman Butt and Martin Servin	(参考訳) 本稿では, shape-from-silhouette (SFS) 法におけるシルエットの不整合を考慮した多段立体形状再構成システムを提案する。これらの矛盾は、異なるビューにおける物体の遮蔽、セグメンテーション、影、あるいは物体や光の方向による反射によって、複数のビューイメージに共通している。これらの要因は、すべてのシルエットに一貫して投影される体積の部分だけを再構成し、残りの部分を再構成しない既存のアプローチを用いて3次元形状を構築しようとする際に大きな課題を引き起こす。結果として、最終的な形状は、複数のビューオブジェクトの閉塞と影のために堅牢ではない。本稿では,複数の画像を解析し,シルエットを最適化するための事前処理を行うことにより,再建に影響を及ぼす要因について考察する。最後に、ボリュームアプローチSFSを用いて3次元形状を再構成する。理論および実験結果から, 修正アルゴリズムの性能は効率よく向上し, 復元された形状の精度が向上し, シルエットや体積の誤差に対して堅牢で、計算コストも低いことが示唆された。 This paper presents a multi-stage 3D shape reconstruction system for multiple-object scenes that accounts for silhouette inconsistencies in the shape-from-silhouette (SFS) method. These inconsistencies are common in multi-view images due to object occlusions in different views, segmentation, and shadows or reflections caused by objects or light directions. These factors raise huge challenges when attempting to construct the 3D shape using existing approaches, which reconstruct only that part of the volume which projects consistently in all the silhouettes, leaving the rest unreconstructed. As a result, the final shapes are not robust to multi-view object occlusions and shadows. In this regard, we consider the primary factors affecting reconstruction by analyzing the multiple images, and perform pre-processing steps to optimize the silhouettes. Finally, the 3D shape is reconstructed using the volumetric SFS approach. Theory and experimental results show that the modified algorithm performs more efficiently, improves the accuracy of the reconstructed shape, and is robust to errors in the silhouettes and volume while remaining computationally inexpensive.	翻訳日:2021-05-01 11:12:04 公開日:2020-12-19
# 斜めUAVビデオからの自己教師付き単眼深度推定 Self-supervised monocular depth estimation from oblique UAV videos ( http://arxiv.org/abs/2012.10704v1 ) ライセンス: Link先を確認	Logambal Madhuanand, Francesco Nex, Michael Ying Yang	(参考訳) UAVは安価で使いやすく、汎用性が高いため、重要な測光装置となっている。 UAVから撮影した空中画像は、小型で大規模なテクスチャマッピング、3Dモデリング、オブジェクト検出タスク、DTMおよびDSM生成などに適用できる。光グラム技術は、同じシーンの複数の画像を取得するUAV画像からの3次元再構成に日常的に使用される。コンピュータビジョンとディープラーニング技術の発展により、SIDE(Single Image Depth Estimation)は強力な研究分野となった。 UAV画像におけるSIDE技術を用いることで、3次元再構成のための複数の画像の必要性を克服することができる。本稿では, 深度学習を用いて, 一つのUAV空中画像から深度を推定することを目的とする。我々は,自己教師付き学習手法である自己教師付き単眼深度推定 (smde) について述べる。深度を学習し、2つの異なるネットワークを介して協調して情報をポーズする深層学習モデルのトレーニングには、単眼ビデオフレームが使用される。予測深度とポーズを用いて、映像からの時間情報を利用した別の画像から1つの画像を再構成する。本稿では,2次元CNNエンコーダと3次元CNNデコーダを用いて,時系列フレームから情報を抽出する新しいアーキテクチャを提案する。画像生成の品質を向上させるために、対比的損失項を導入する。公開UAVidビデオデータセットを用いて実験を行った。実験の結果,本モデルは最先端手法よりも奥行き推定に優れていることがわかった。 UAVs have become an essential photogrammetric measurement as they are affordable, easily accessible and versatile. Aerial images captured from UAVs have applications in small and large scale texture mapping, 3D modelling, object detection tasks, DTM and DSM generation etc. Photogrammetric techniques are routinely used for 3D reconstruction from UAV images where multiple images of the same scene are acquired. Developments in computer vision and deep learning techniques have made Single Image Depth Estimation (SIDE) a field of intense research. Using SIDE techniques on UAV images can overcome the need for multiple images for 3D reconstruction. This paper aims to estimate depth from a single UAV aerial image using deep learning. We follow a self-supervised learning approach, Self-Supervised Monocular Depth Estimation (SMDE), which does not need ground truth depth or any extra information other than images for learning to estimate depth. Monocular video frames are used for training the deep learning model which learns depth and pose information jointly through two different networks, one each for depth and pose. 
The predicted depth and pose are used to reconstruct one image from the viewpoint of another image utilising the temporal information from videos. We propose a novel architecture with two 2D CNN encoders and a 3D CNN decoder for extracting information from consecutive temporal frames. A contrastive loss term is introduced for improving the quality of image generation. Our experiments are carried out on the public UAVid video dataset. The experimental results demonstrate that our model outperforms the state-of-the-art methods in estimating the depths.	翻訳日:2021-05-01 11:11:26 公開日:2020-12-19
# 雑音フィルタリングによる2つの前景差に基づくビデオの静的物体検出とセグメンテーション Static object detection and segmentation in videos based on dual foregrounds difference with noise filtering ( http://arxiv.org/abs/2012.10708v1 ) ライセンス: Link先を確認	Waqqas-ur-Rehman Butt and Martin Servin	(参考訳) 本稿では,映像中の静止物体検出とセグメンテーション手法について述べる。多くの監視アプリケーションに移動オブジェクトが存在するため、ロバストな静的オブジェクト検出は依然として難しい課題である。難易度は、元の背景を確立せず、異なるタイミングでビデオに現れる静的なオブジェクトとして識別されるオブジェクトのラベル付け方法に大きく影響されます。この文脈では、静的オブジェクトの識別にフレーム差分の概念に基づくバックグラウンドサブトラクション手法を適用する。まず、静的参照フレームに対する各フレームの差を計算することにより、前景マスク画像のフレーム差を推定する。移動粒子を検出するためにガウスMOG法の混合法を適用し, 前景マスクのフレーム差分から結果フォアグラウンドマスクを減算する。低コントラスト処理や空気中の散乱材料のノイズ低減のために,前処理法,照明等化法,消光法を適用した。水滴と塵の粒子。最後に、物体を分割しノイズを抑制するために、数学的形態的操作と最大の連結成分分析を適用する。提案手法は, 岩盤ブレーカー局に適用し, 実データ, 合成データおよび2つの公開データを用いて有効に検証した。その結果,提案手法は静的オブジェクトを事前に追跡情報を持たずにロバストに検出し,セグメンテーションできることが実証された。 This paper presents static object detection and segmentation method in videos from cluttered scenes. Robust static object detection is still challenging task due to presence of moving objects in many surveillance applications. The level of difficulty is extremely influenced by on how you label the object to be identified as static that do not establish the original background but appeared in the video at different time. In this context, background subtraction technique based on the frame difference concept is applied to the identification of static objects. Firstly, we estimate a frame differencing foreground mask image by computing the difference of each frame with respect to a static reference frame. The Mixture of Gaussian MOG method is applied to detect the moving particles and then outcome foreground mask is subtracted from frame differencing foreground mask. Pre-processing techniques, illumination equalization and de-hazing methods are applied to handle low contrast and to reduce the noise from scattered materials in the air e.g. water droplets and dust particles. 
Finally, a set of mathematical morphological operations and largest-connected-component analysis is applied to segment the object and suppress the noise. The proposed method was built for a rock breaker station application and effectively validated with real, synthetic and two public data sets. The results demonstrate that the proposed approach can robustly detect and segment static objects without any prior tracking information.	翻訳日:2021-05-01 11:10:43 公開日:2020-12-19
# 量子光学畳み込みニューラルネットワーク : 量子コンピューティングのための新しい画像認識フレームワーク Quantum Optical Convolutional Neural Network: A Novel Image Recognition Framework for Quantum Computing ( http://arxiv.org/abs/2012.10812v1 ) ライセンス: Link先を確認	Rishab Parthasarathy and Rohan Bhowmik	(参考訳) 畳み込みニューラルネットワーク(cnns)に基づく大規模機械学習モデルでは、大量のデータをトレーニングしたパラメータが急速に増加しており、自動運転車から医療画像まで、幅広いコンピュータビジョンタスクに展開されている。これらのモデルをトレーニングするために必要なコンピューティングリソースの要求は、古典的なコンピューティングハードウェアの進歩を急速に上回り、光ニューラルネットワーク(ONN)や量子コンピューティングといった新しいフレームワークが将来の代替手段として検討されている。本稿では,量子コンピューティングに基づく新しいディープラーニングモデルである量子光畳み込みニューラルネットワーク(QOCNN)について報告する。人気のMNISTデータセットを使用して、この新アーキテクチャをセミナルLeNetモデルに基づく従来のCNNと比較した。我々はまた、以前に報告されたONN(GridNetとComplexNet)と、コンプレックスネットと量子ベースの正弦波非線形性を組み合わせた量子光学ニューラルネットワーク(QONN)を比較した。本質的に、我々の研究は量子畳み込みとそれ以前のプール層を追加することで、QONNに関する以前の研究を拡張している。我々はそれらの精度、混乱行列、受信器動作特性(ROC)曲線、マシューズ相関係数を判定し、全てのモデルを評価する。モデルの性能は全体として類似しており、ROC曲線は新しいQOCNNモデルが堅牢であることを示している。最後に,この新しいフレームワークを量子コンピュータ上で実行することにより,計算効率の向上を推定した。ディープラーニングへの量子コンピューティングベースのアプローチへの切り替えは、古典的なモデルに匹敵する精度をもたらすが、計算性能は前例のない向上と消費電力の大幅な削減を実現している。 Large machine learning models based on Convolutional Neural Networks (CNNs) with rapidly increasing number of parameters, trained with massive amounts of data, are being deployed in a wide array of computer vision tasks from self-driving cars to medical imaging. The insatiable demand for computing resources required to train these models is fast outpacing the advancement of classical computing hardware, and new frameworks including Optical Neural Networks (ONNs) and quantum computing are being explored as future alternatives. In this work, we report a novel quantum computing based deep learning model, the Quantum Optical Convolutional Neural Network (QOCNN), to alleviate the computational bottleneck in future computer vision applications. Using the popular MNIST dataset, we have benchmarked this new architecture against a traditional CNN based on the seminal LeNet model. 
We have also compared the performance with previously reported ONNs, namely the GridNet and ComplexNet, as well as a Quantum Optical Neural Network (QONN) that we built by combining the ComplexNet with quantum based sinusoidal nonlinearities. In essence, our work extends the prior research on QONN by adding quantum convolution and pooling layers preceding it. We have evaluated all the models by determining their accuracies, confusion matrices, Receiver Operating Characteristic (ROC) curves, and Matthews Correlation Coefficients. The performance of the models was similar overall, and the ROC curves indicated that the new QOCNN model is robust. Finally, we estimated the gains in computational efficiency from executing this novel framework on a quantum computer. We conclude that switching to a quantum computing based approach to deep learning may result in comparable accuracies to classical models, while achieving unprecedented boosts in computational performance and drastic reductions in power consumption.	翻訳日:2021-05-01 11:09:59 公開日:2020-12-19
# ノード分類タスクにおけるグラフニューラルネットワークの公正比較のためのパイプライン A pipeline for fair comparison of graph neural networks in node classification tasks ( http://arxiv.org/abs/2012.10619v1 ) ライセンス: Link先を確認	Wentao Zhao, Dalin Zhou, Xinguo Qiu and Wei Jiang	(参考訳) グラフニューラルネットワーク (GNN) は, グラフデータを用いた複数の分野に適用可能性について検討されている。しかし、異なるモデルアーキテクチャやデータ拡張技術を含む新しい手法と公正な比較を保証するための標準的なトレーニング設定は存在しない。ノード分類に同じトレーニング設定を適用可能な,標準的な再現可能なベンチマークを導入する。このベンチマークでは、異なるフィールドからの小規模および中規模のデータセットと7つの異なるモデルを含む9つのデータセットを構築した。我々は、小規模データセットのためのk-foldモデル評価戦略と、全データセットの標準モデルトレーニング手順を設計し、gnnの標準実験パイプラインにより、公正なモデルアーキテクチャの比較を可能にする。 node2vecとLaplacian固有ベクトルを用いてデータ拡張を行い、入力機能がモデルの性能に与える影響を調べる。トポロジ的情報はノード分類タスクにおいて重要である。モデルレイヤの数を増やすことは、グラフが接続されていないPATTERNとCLUSTERデータセットを除いて、パフォーマンスを向上しない。データ拡張は非常に有用であり、特にnode2vecをベースラインで使用すると、パフォーマンスが大幅に向上する。 Graph neural networks (GNNs) have been investigated for potential applicability in multiple fields that employ graph data. However, there are no standard training settings to ensure fair comparisons among new methods, including different model architectures and data augmentation techniques. We introduce a standard, reproducible benchmark to which the same training settings can be applied for node classification. For this benchmark, we constructed 9 datasets, including both small- and medium-scale datasets from different fields, and 7 different models. We design a k-fold model assessment strategy for small datasets and a standard set of model training procedures for all datasets, enabling a standard experimental pipeline for GNNs to help ensure fair model architecture comparisons. We use node2vec and Laplacian eigenvectors to perform data augmentation to investigate how input features affect the performance of the models. We find topological information is important for node classification tasks. Increasing the number of model layers does not improve the performance except on the PATTERN and CLUSTER datasets, in which the graphs are not connected. 
Data augmentation is highly useful, especially using node2vec in the baseline, resulting in a substantial baseline performance improvement.	翻訳日:2021-05-01 11:09:31 公開日:2020-12-19
# Ekya: エッジコンピューティングサーバ上のビデオ分析モデルの継続的学習 Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers ( http://arxiv.org/abs/2012.10557v1 ) ライセンス: Link先を確認	Romil Bhardwaj, Zhengxu Xia, Ganesh Ananthanarayanan, Junchen Jiang, Nikolaos Karianakis, Yuanchao Shu, Kevin Hsieh, Victor Bahl, Ion Stoica	(参考訳) ビデオ分析アプリケーションは(帯域幅とプライバシーのために)ビデオの分析にedge compute serverを使用する。推論のためにエッジサーバにデプロイされる圧縮モデルでは、ライブビデオデータがトレーニングデータから逸脱するデータドリフトが発生している。継続的学習は、新しいデータ上で定期的にモデルをトレーニングすることで、データのドリフトを処理する。本研究は,エッジサーバ上でのタスクの推論とリトレーニングを共同で支援する課題に対処し,リトレーニングされたモデルの精度と推論精度の基本的なトレードオフをナビゲートする。当社のソリューションであるekyaでは、このトレードオフを複数のモデルでバランスさせ、micro-profilerを使用して、再トレーニングによって最もメリットのあるモデルを特定しています。 Ekyaの精度はベースラインスケジューラよりも29%高く、ベースラインはEkyaと同じ精度を達成するために4倍のGPUリソースを必要とする。 Video analytics applications use edge compute servers for the analytics of the videos (for bandwidth and privacy). Compressed models that are deployed on the edge servers for inference suffer from data drift, where the live video data diverges from the training data. Continuous learning handles data drift by periodically retraining the models on new data. Our work addresses the challenge of jointly supporting inference and retraining tasks on edge servers, which requires navigating the fundamental tradeoff between the retrained model's accuracy and the inference accuracy. Our solution Ekya balances this tradeoff across multiple models and uses a micro-profiler to identify the models that will benefit the most by retraining. Ekya's accuracy gain compared to a baseline scheduler is 29% higher, and the baseline requires 4x more GPU resources to achieve the same accuracy as Ekya.	翻訳日:2021-05-01 11:09:12 公開日:2020-12-19
# 影は残らない:近似照明と幾何学による物体とその影の除去 No Shadow Left Behind: Removing Objects and their Shadows using Approximate Lighting and Geometry ( http://arxiv.org/abs/2012.10565v1 ) ライセンス: Link先を確認	Edward Zhang, Ricardo Martin-Brualla, Janne Kontkanen, Brian Curless	(参考訳) 画像からオブジェクトを取り除くことは、混合現実を含む多くのアプリケーションにとって重要な課題である。信頼できる結果を得るためには、オブジェクトがキャストする影も取り除かなければならない。現在のインペインティングベースのメソッドでは、オブジェクト自体を削除したり、影を置き去りにしたり、少なくとも、インペイントするシャドウ領域を指定する必要がある。我々は,キャスターとともに影を取り除くための深層学習パイプラインを導入する。さまざまなテクスチャを持つ表面から、さまざまな影(硬く、柔らかく、暗く、微妙に、大きく、薄い)を除去するために、粗いシーンモデルを活用する。合成されたデータに基づいてパイプラインをトレーニングし、合成シーンと実シーンの両方で質的で定量的な結果を示す。 Removing objects from images is a challenging problem that is important for many applications, including mixed reality. For believable results, the shadows that the object casts should also be removed. Current inpainting-based methods only remove the object itself, leaving shadows behind, or at best require specifying shadow regions to inpaint. We introduce a deep learning pipeline for removing a shadow along with its caster. We leverage rough scene models in order to remove a wide variety of shadows (hard or soft, dark or subtle, large or thin) from surfaces with a wide variety of textures. We train our pipeline on synthetically rendered data, and show qualitative and quantitative results on both synthetic and real scenes.	翻訳日:2021-05-01 11:08:37 公開日:2020-12-19
# 高ダイナミックレンジ画像品質のための統合データセットとメトリクス Consolidated Dataset and Metrics for High-Dynamic-Range Image Quality ( http://arxiv.org/abs/2012.10758v1 ) ライセンス: Link先を確認	Aliaksei Mikhailiuk, Maria Perez-Ortiz, Dingcheng Yue, Wilson Suen, Rafal K. Mantiuk	(参考訳) 高ダイナミックレンジ(hdr)画像やビデオコンテンツの人気が高まると、輝度やダイナミックレンジの異なるディスプレイで見られる画像障害の重症度を予測する指標が必要となる。このようなメトリクスは、十分に大きな主観的画像品質データセット上でトレーニングされ、検証され、堅牢なパフォーマンスを保証する必要がある。既存のHDR品質データセットのサイズが制限されているため、既存のHDRと標準ダイナミックレンジ(SDR)データセットを統合、マージすることで、4000以上の画像を含む統一された測光画像品質データセット(UPIQ)を作成しました。リアライメントされた品質スコアは、すべてのデータセットで同じ統一品質スケールを共有します。このような認識は、追加のデータセットの品質比較を収集し、サイコメトリックスケーリング手法でデータを再スケーリングすることで達成された。提案したデータセットの画像は、ディスプレイから放射される光に対応する絶対光度および色度単位で表現される。新しいデータセットを使用して、既存のHDRメトリクスを再トレーニングし、データセットが深層アーキテクチャのトレーニングに十分な大きさであることを示す。輝度認識画像圧縮におけるデータセットの有用性を示す。 Increasing popularity of high-dynamic-range (HDR) image and video content brings the need for metrics that could predict the severity of image impairments as seen on displays of different brightness levels and dynamic range. Such metrics should be trained and validated on a sufficiently large subjective image quality dataset to ensure robust performance. As the existing HDR quality datasets are limited in size, we created a Unified Photometric Image Quality dataset (UPIQ) with over 4,000 images by realigning and merging existing HDR and standard-dynamic-range (SDR) datasets. The realigned quality scores share the same unified quality scale across all datasets. Such realignment was achieved by collecting additional cross-dataset quality comparisons and re-scaling data with a psychometric scaling method. Images in the proposed dataset are represented in absolute photometric and colorimetric units, corresponding to light emitted from a display. We use the new dataset to retrain existing HDR metrics and show that the dataset is sufficiently large for training deep architectures. We show the utility of the dataset on brightness aware image compression.	翻訳日:2021-05-01 11:08:25 公開日:2020-12-19
# 不確かさ下でのシリコンフォトニックニューラルネットワークのモデリング Modeling Silicon-Photonic Neural Networks under Uncertainties ( http://arxiv.org/abs/2012.10594v1 ) ライセンス: Link先を確認	Sanmitra Banerjee, Mahdi Nikdast, and Krishnendu Chakrabarty	(参考訳) シリコンフォトニクスニューラルネットワーク(spnn)は、デジタル電子回路に比べて計算速度とエネルギー効率が大幅に向上する。しかし,SPNNのエネルギー効率と精度は,製造過程や温度変化から生じる不確実性の影響が大きい。本稿では,mzi(mach-zehnder interferometer)ベースのspnの分類精度に対する不確かさの影響について,初めて包括的かつ階層的検討を行った。このような影響は、非イデアルシリコンフォトニックデバイスの位置と特性(例えば、調整された位相角)によって異なることが示される。シミュレーションの結果, 2つの層と1374個の調整可能な熱水相シフト器を持つSPNNでは, 成熟した製造プロセスにおいてもランダムな不確かさが破滅的な70%の精度損失をもたらすことが示された。 Silicon-photonic neural networks (SPNNs) offer substantial improvements in computing speed and energy efficiency compared to their digital electronic counterparts. However, the energy efficiency and accuracy of SPNNs are highly impacted by uncertainties that arise from fabrication-process and thermal variations. In this paper, we present the first comprehensive and hierarchical study on the impact of random uncertainties on the classification accuracy of a Mach-Zehnder Interferometer (MZI)-based SPNN. We show that such impact can vary based on both the location and characteristics (e.g., tuned phase angles) of a non-ideal silicon-photonic device. Simulation results show that in an SPNN with two hidden layers and 1374 tunable-thermal-phase shifters, random uncertainties even in mature fabrication processes can lead to a catastrophic 70% accuracy loss.	翻訳日:2021-05-01 11:07:29 公開日:2020-12-19
# セルネットワークにおける結合スペクトルとパワーアロケーションの深部強化学習 Deep Reinforcement Learning for Joint Spectrum and Power Allocation in Cellular Networks ( http://arxiv.org/abs/2012.10682v1 ) ライセンス: Link先を確認	Yasar Sinan Nasir and Dongning Guo	(参考訳) 無線ネットワークオペレータは通常、保有する電波スペクトルを複数のサブバンドに分割する。細胞ネットワークでは、これらのサブバンドは多くの細胞で再利用される。共チャネル干渉を緩和するために、結合スペクトルと電力配分問題をしばしば定式化し、和レートの目的を最大化する。このような問題を解決する最もよく知られたアルゴリズムは、即時のグローバルチャネル状態情報と集中型オプティマイザを必要とする。実際、これらのアルゴリズムは時変サブバンドを持つ大規模ネットワークでは実装されていない。深層強化学習アルゴリズムは、複雑なリソース管理問題を解決する有望なツールである。ここでの大きな課題は、スペクトル割り当ては離散サブバンド選択を伴うのに対し、パワーアロケーションは連続変数を含むことである。本稿では,離散決定変数と連続決定変数の両方を最適化するための学習フレームワークを提案する。具体的には、2つの異なる深層強化学習アルゴリズムを同時に実行し、訓練することで、共同目標を最大化する。シミュレーションの結果,提案手法は最先端分数型プログラミングアルゴリズムと,深層強化学習に基づく先行手法の両方に勝ることがわかった。 A wireless network operator typically divides the radio spectrum it possesses into a number of subbands. In a cellular network those subbands are then reused in many cells. To mitigate co-channel interference, a joint spectrum and power allocation problem is often formulated to maximize a sum-rate objective. The best known algorithms for solving such problems generally require instantaneous global channel state information and a centralized optimizer. In fact those algorithms have not been implemented in practice in large networks with time-varying subbands. Deep reinforcement learning algorithms are promising tools for solving complex resource management problems. A major challenge here is that spectrum allocation involves discrete subband selection, whereas power allocation involves continuous variables. In this paper, a learning framework is proposed to optimize both discrete and continuous decision variables. Specifically, two separate deep reinforcement learning algorithms are designed to be executed and trained simultaneously to maximize a joint objective. Simulation results show that the proposed scheme outperforms both the state-of-the-art fractional programming algorithm and a previous solution based on deep reinforcement learning.	
翻訳日:2021-05-01 11:06:55 公開日:2020-12-19
# ベイズ型非教師なし学習による濃縮電解質の隠れ構造 Bayesian unsupervised learning reveals hidden structure in concentrated electrolytes ( http://arxiv.org/abs/2012.10694v1 ) ライセンス: Link先を確認	Penelope Jones, Fabian Coupette, Andreas H\"artel, Alpha A. Lee	(参考訳) 電解質は、エネルギー貯蔵から生体材料まで幅広い応用において重要な役割を果たす。それにもかかわらず、濃縮電解質の構造は謎めいたままである。多くの理論的アプローチは、イオン対のアイデアを導入して濃縮電解質をモデル化しようと試み、イオンは対イオンで密に「対」されるか、電荷を遮蔽するために「自由」になる。本研究では,この問題を計算統計学の言語に再編成し,全てのイオンが同じ局所環境を共有するというヌル仮説をテストする。この枠組みを分子動力学シミュレーションに適用すると、このヌル仮説はデータによって支持されないことが分かる。我々の統計的手法は、異なる局所イオン環境の存在を示唆している;驚くべきことに、これらの差は電荷のアトラクションと違い、電荷の相関のように生じる。非凝集環境における粒子の分画は、異なる背景誘電率とイオン濃度にまたがる普遍的なスケーリング挙動を示す。 Electrolytes play an important role in a plethora of applications ranging from energy storage to biomaterials. Notwithstanding this, the structure of concentrated electrolytes remains enigmatic. Many theoretical approaches attempt to model the concentrated electrolytes by introducing the idea of ion pairs, with ions either being tightly `paired' with a counter-ion, or `free' to screen charge. In this study we reframe the problem into the language of computational statistics, and test the null hypothesis that all ions share the same local environment. Applying the framework to molecular dynamics simulations, we show that this null hypothesis is not supported by data. Our statistical technique suggests the presence of distinct local ionic environments; surprisingly, these differences arise in like charge correlations rather than unlike charge attraction. The resulting fraction of particles in non-aggregated environments shows a universal scaling behaviour across different background dielectric constants and ionic concentrations.	翻訳日:2021-05-01 11:06:40 公開日:2020-12-19

# アジャイルで汎用的な量子コミュニケーション:署名と秘密

Agile and versatile quantum communication: signatures and secrets ( http://arxiv.org/abs/2001.10089v3 )

ライセンス: Link先を確認

Stefan Richter, Matthew Thornton, Imran Khan, Hannah Scott, Kevin Jaksch, Ulrich Vogl, Birgit Stiller, Gerd Leuchs, Christoph Marquardt, Natalia Korolkova

(参考訳) アジャイル暗号は、基盤となる古典暗号アルゴリズムのセキュリティが損なわれている場合に、リソース効率のよい暗号コアのスワップを可能にする。逆に、多用途暗号では、ユーザは内部動作に関する知識を必要とせずに暗号タスクを切り替えることができる。本稿では,量子デジタルシグネチャ(qds)と量子シークレット共有(qss)という2つの量子暗号プロトコルを,同じハードウェア送信機と受信機プラットフォーム上で明示的に示すことにより,これらの原理を量子暗号の分野に適用する方法を提案する。重要なことは、プロトコルは古典的な後処理でのみ異なる。また、量子鍵分布(QKD)にも適しており、標準的な2次位相シフトキー(QPSK)エンコーディングとヘテロダイン検出を使用するため、配置された通信インフラと高い互換性を持つ。初めてQDSプロトコルが変更され、受信機でのポストセレクションが可能となり、プロトコルのパフォーマンスが向上した。暗号プリミティブQDSとQSSは本質的にマルチパーティトであり、タスクの内部のプレイヤーが不正であるだけでなく、量子チャネル上の(外部の)盗聴が許されている場合にも安全であることを示す。アジャイルで汎用的な量子通信システムの最初の実証実験では、量子状態はGHz速度で分散された。これにより、QDSプロトコルを使用して、2kmのファイバーリンクで0.05ms未満、20kmのファイバーリンクで0.2~s未満で1ビットメッセージにセキュアに署名できる。我々の知る限り、これは連続可変直接QSSプロトコルの最初の実演でもある。

Agile cryptography allows for a resource-efficient swap of a cryptographic core in case the security of an underlying classical cryptographic algorithm becomes compromised. Conversely, versatile cryptography allows the user to switch the cryptographic task without requiring any knowledge of its inner workings. In this paper, we suggest how these related principles can be applied to the field of quantum cryptography by explicitly demonstrating two quantum cryptographic protocols, quantum digital signatures (QDS) and quantum secret sharing (QSS), on the same hardware sender and receiver platform. Crucially, the protocols differ only in their classical post-processing. The system is also suitable for quantum key distribution (QKD) and is highly compatible with deployed telecommunication infrastructures, since it uses standard quadrature phase shift keying (QPSK) encoding and heterodyne detection. For the first time, QDS protocols are modified to allow for postselection at the receiver, enhancing protocol performance. The cryptographic primitives QDS and QSS are inherently multipartite and we prove that they are secure not only when a player internal to the task is dishonest, but also when (external) eavesdropping on the quantum channel is allowed. In our first proof-of-principle demonstration of an agile and versatile quantum communication system, the quantum states were distributed at GHz rates. This allows for a one-bit message to be securely signed using our QDS protocols in less than 0.05 ms over a 2 km fiber link and in less than 0.2 s over a 20 km fiber link. To our knowledge, this also marks the first demonstration of a continuous-variable direct QSS protocol.

翻訳日:2023-06-05 11:32:31 公開日:2020-12-19

# ニューラルネットワークを用いたSU(N)フェルミオンの熱力学研究のためのヒューリスティック機械

Heuristic machinery for thermodynamic studies of SU(N) fermions with neural networks ( http://arxiv.org/abs/2006.14142v2 )

ライセンス: Link先を確認

Entong Zhao, Jeongwon Lee, Chengdong He, Zejian Ren, Elnur Hajiyev, Junwei Liu, and Gyu-Boong Jo

(参考訳) 機械学習(ML)のパワーは、前例のない感度で実験的な測定を分析する可能性を提供する。しかし、物理観測物に直接関係する微妙な効果を探索し、MLを用いた通常の実験データから物理を理解することは依然として困難である。本稿では,機械学習解析を用いたヒューリスティック機械を提案する。量子シミュレータで作製したSU($N$)スピン対称性内で相互作用する超低温フェルミオンの密度分布における熱力学的研究の導出に機械を用いる。このようなスピン対称性は多体導波路に現れなければならないが、フェルミオンの運動量分布がスピン対称性の影響を最も通常の測定値として示すことは明らかである。スピン乗数の検出に$\sim$94$\%$という極めて高い精度で、完全に訓練された畳み込みニューラルネットワーク(NN)を用いて、その精度が、フィルタされた実験画像による様々な低輝度効果に依存するかを調べる。機械によって導かれる熱力学的圧縮性は, 単一の画像内の密度変動から直接測定する。我々の機械学習フレームワークは、SU($N$)のフェルミ液体の理論的記述を検証し、最小の事前理解を持つ非常に複雑な量子物質であっても、より顕著な効果を識別する可能性を示している。

The power of machine learning (ML) provides the possibility of analyzing experimental measurements with an unprecedented sensitivity. However, it still remains challenging to probe the subtle effects directly related to physical observables and to understand the underlying physics from ordinary experimental data using ML. Here, we introduce a heuristic machinery based on machine learning analysis. We use our machinery to guide the thermodynamic studies in the density profile of ultracold fermions interacting within SU($N$) spin symmetry prepared in a quantum simulator. Although such spin symmetry should manifest itself in a many-body wavefunction, it remains elusive how the momentum distribution of fermions, the most ordinary measurement, reveals the effect of spin symmetry. Using a fully trained convolutional neural network (NN) with a remarkably high accuracy of $\sim$94$\%$ for detection of the spin multiplicity, we investigate how the accuracy depends on various less-pronounced effects with filtered experimental images. Guided by our machinery, we directly measure a thermodynamic compressibility from density fluctuations within a single image. Our machine learning framework shows a potential to validate theoretical descriptions of SU($N$) Fermi liquids, and to identify less-pronounced effects even for highly complex quantum matter with minimal prior understanding.

翻訳日:2023-05-12 20:13:38 公開日:2020-12-19

# 動的システムの制御と自律性のための量子テレポーテーション

Quantum Teleportation for Control of Dynamic Systems and Autonomy ( http://arxiv.org/abs/2007.15249v2 )

ライセンス: Link先を確認

Farbod Khoshnoud, Lucas Lamata, Clarence W. de Silva, Marco B. Quadrelli

(参考訳) 本稿では,量子テレポーテーションの古典力学系の制御と自律性への応用について述べる。量子テレポーテーションは本質的に量子現象であり、1993年にアインシュタイン-ポドルスキー-ローゼンの二重チャネルを介して未知の量子状態のテレポーテーションによって初めて導入された。本稿では,本研究で初めて,自律移動型古典的プラットフォームに量子技術を適用する可能性について考察する。まず、量子エンタングルメントと量子暗号を、制御や自律的応用のためのマクロ力学系にどのように統合するか、また量子テレポーテーションの概念を古典領域にどのように適用するかを概観する。量子テレポーテーション(quantum teleportation)では、その分極に相関する一対の光子が生成され、Alice RobotとBob Robotと呼ばれる2つの自律プラットフォームに送られる。アリスは、絡み合った光子を受け取ることに加えて、未知の状態で調製された光子と呼ばれる量子系を与えられた。アリスは、絡み合った光子と未知の状態の状態を共同で測定し、古典的チャンネルを通じてボブに情報を送る。アリス元の未知の状態は(量子非閉化現象により)絡み合った光子の状態を測定する過程で崩壊するが、ボブはユニタリ作用素を適用することでアリス状態の正確なレプリカを構築することができる。本稿では,動的システムの制御におけるハイブリッド古典量子能力の応用について,特に自律型古典システムの自律性と制御において,量子能力の導入と古典領域への優位性を促進することを目的としている。

The application of Quantum Teleportation for control of classical dynamic systems and autonomy is proposed in this paper. Quantum teleportation is an intrinsically quantum phenomenon, and was first introduced by teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels in 1993. In this paper, we consider, for the first time, the possibility of applying this quantum technique to autonomous mobile classical platforms for control and autonomy purposes. First, a review of how Quantum Entanglement and Quantum Cryptography can be integrated into macroscopic mechanical systems for controls and autonomy applications is presented, as well as how quantum teleportation concepts may be applied to the classical domain. In quantum teleportation, an entangled pair of photons which are correlated in their polarizations is generated and sent to two autonomous platforms, which we call the Alice Robot and the Bob Robot. Alice has been given a quantum system, i.e. a photon, prepared in an unknown state, in addition to receiving an entangled photon. Alice measures the state of her entangled photon and her unknown state jointly and sends the information through a classical channel to Bob. Although Alice's original unknown state is collapsed in the process of measuring the state of the entangled photon (due to the quantum no-cloning phenomenon), Bob can construct an accurate replica of Alice's state by applying a unitary operator. This paper, and the previous investigations of the applications of hybrid classical-quantum capabilities in control of dynamical systems, aim to promote the adoption of quantum capabilities and their advantages in the classical domain, particularly for autonomy and control of autonomous classical systems.

翻訳日:2023-05-07 18:37:47 公開日:2020-12-19
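
The protocol steps in the abstract above (shared entangled pair, Alice's joint measurement, two classical bits, Bob's unitary correction) can be sketched numerically. This is only an illustrative state-vector simulation with NumPy, assuming ideal abstract qubits rather than polarized photons; the function name `teleport` and the qubit ordering (qubit 0 is Alice's unknown state, qubit 1 her half of the pair, qubit 2 Bob's half) are assumptions made here, not taken from the paper.

```python
import numpy as np

# Single-qubit operators.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
P0 = np.array([[1, 0], [0, 0]], dtype=complex)  # projector |0><0|
P1 = np.array([[0, 0], [0, 1]], dtype=complex)  # projector |1><1|


def kron(*ops):
    """Tensor product of several operators, leftmost factor = qubit 0."""
    out = np.array([[1]], dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out


# CNOT on three qubits with qubit 0 as control and qubit 1 as target.
CNOT_01 = kron(P0, I2, I2) + kron(P1, X, I2)


def teleport(psi, rng):
    """Teleport the single-qubit state psi from qubit 0 to qubit 2.

    Returns Alice's two classical bits and Bob's corrected state.
    """
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)

    # Entangled pair (|00> + |11>)/sqrt(2) shared between qubits 1 and 2.
    bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    state = np.kron(psi, bell)

    # Alice's joint (Bell-basis) measurement: CNOT, Hadamard, then
    # a computational-basis measurement of qubits 0 and 1.
    state = kron(H, I2, I2) @ (CNOT_01 @ state)
    amps = state.reshape(2, 2, 2)  # indices: [qubit 0, qubit 1, qubit 2]
    probs = np.sum(np.abs(amps) ** 2, axis=2).reshape(4)
    probs = probs / probs.sum()
    outcome = int(rng.choice(4, p=probs))
    m0, m1 = divmod(outcome, 2)

    # Bob's qubit after the measurement collapses Alice's qubits.
    bob = amps[m0, m1, :] / np.linalg.norm(amps[m0, m1, :])

    # Bob's unitary correction, chosen from the two classical bits.
    if m1:
        bob = X @ bob
    if m0:
        bob = Z @ bob
    return (m0, m1), bob


if __name__ == "__main__":
    rng = np.random.default_rng(seed=1)
    unknown = np.array([0.6, 0.8j])
    bits, received = teleport(unknown, rng)
    print("classical bits sent to Bob:", bits)
    print("state recovered by Bob:", received)
```

Whatever the measurement outcome, Bob's corrected state equals the input state, while Alice's copy is destroyed by the measurement, consistent with the no-cloning remark in the abstract.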

# 圧縮二重層構造の散乱データと境界状態

Scattering data and bound states of a squeezed double-layer structure ( http://arxiv.org/abs/2011.11437v2 )

ライセンス: Link先を確認

Alexander V. Zolotaryuk and Yaroslav Zolotaryuk

(参考訳) 2つの平行な均質な層からなるヘテロ構造は、その幅が$l_1$ と $l_2$ であり、それらの間の距離が同時に 0 に縮小するので、極限で研究される。この問題は一次元で研究され、schr\"{o}dinger方程式のスクイーズポテンシャルは層厚に応じて$v_1$と$v_2$によって与えられる。関数のクラス全体の$V_1(l_1)$と$V_2(l_2)$は特定の極限特性によって指定される。有限系に対して導出される散乱データ $a(k)$ および $b(k)$ のスクイーズ限界は、系のパラメータ $V_j$, $l_j$, $j=1,2$, $r$ の条件が成立する場合にのみ存在する。これらの条件は、適切な発散の結果として現れる。このキャンセルの2つの方法を実行し、システムパラメータ空間内の対応する2つの共振セットを導出する。これらの集合の1つにおいて、非自明な境界状態の存在は、ディラックのデルタ函数の微分の形で絞られたポテンシャルの特定の例を含む、スクイージング極限において証明される。有限系内の有限個の有界状態から圧縮された系で1つの有界状態が存続するシナリオを詳細に記述する。

A heterostructure composed of two parallel homogeneous layers is studied in the limit as their widths $l_1$ and $l_2$, and the distance between them $r$, shrink to zero simultaneously. The problem is investigated in one dimension and the squeezing potential in the Schr\"{o}dinger equation is given by the strengths $V_1$ and $V_2$ depending on the layer thickness. A whole class of functions $V_1(l_1)$ and $V_2(l_2)$ is specified by certain limit characteristics as $l_1$ and $l_2$ tend to zero. The squeezing limit of the scattering data $a(k)$ and $b(k)$ derived for the finite system is shown to exist only if some conditions on the system parameters $V_j$, $l_j$, $j=1,2$, and $r$ hold. These conditions appear as a result of an appropriate cancellation of divergences. Two ways of this cancellation are carried out and the corresponding two resonance sets in the system parameter space are derived. On one of these sets, the existence of non-trivial bound states is proven in the squeezing limit, including the particular example of the squeezed potential in the form of the derivative of Dirac's delta function, contrary to the widespread opinion on the non-existence of bound states in $\delta'$-like systems. The scenario in which a single bound state survives in the squeezed system from a finite number of bound states in the finite system is described in detail.

翻訳日:2023-04-23 09:09:21 公開日:2020-12-19

# ADBIS、TPDL、EDA 2020合同会議参加者からのフィードバック

Feedback from the participants of the ADBIS, TPDL and EDA 2020 joint conferences ( http://arxiv.org/abs/2012.01184v2 )

ライセンス: Link先を確認

Pegdwend\'e Sawadogo, J\'er\^ome Darmont, Fabien Duchateau

(参考訳) 本稿では,ADBIS,TPDL,EDA 2020の合同会議をオンラインで開催する方法と,その後の参加者調査の結果について述べる。参加者のフィードバックから学んだ教訓を紹介する。

This paper presents the way the joint ADBIS, TPDL and EDA 2020 conferences were organized online and the results of the participant survey conducted thereafter. We present the lessons learned from the participants' feedback.

翻訳日:2023-04-22 20:21:20 公開日:2020-12-19

# フォトニックリザーバコンピュータにおける高性能アナログ読み出し層のオンライントレーニング

Online training for high-performance analogue readout layers in photonic reservoir computers ( http://arxiv.org/abs/2012.10613v1 )

ライセンス: Link先を確認

Piotr Antonik, Marc Haelterman, Serge Massar

(参考訳) はじめに。貯水池コンピューティングは、時間に依存した信号を処理するためのバイオインスパイアされたコンピューティングパラダイムである。ハードウェア実装の性能は、一連のベンチマークタスクにおける最先端のデジタルアルゴリズムに匹敵する。これらの実装の最大のボトルネックは、オフライン後処理の遅い読み込み層である。アナログソリューションはほとんど提案されていないが、セットアップの複雑さが増すため、パフォーマンスが著しく低下していることに気付きました。メソッド。本稿では,これらの問題を解決するためのオンライントレーニングを提案する。本手法の適用性について,アナログ読み出し層を有する実験可能な貯水池コンピュータの数値シミュレーションを用いて検討した。また,従来の手法では訓練が困難である非線形出力層も検討した。結果だオンライン学習により,アナログ層の複雑さの増大を回避し,ディジタル層と同じレベルのパフォーマンスが得られることを示す。結論だこの研究は、出力層をオンライントレーニングすることで、高性能な完全アナログ貯水池コンピュータへの道を開いた。

Introduction. Reservoir Computing is a bio-inspired computing paradigm for processing time-dependent signals. The performance of its hardware implementations is comparable to state-of-the-art digital algorithms on a series of benchmark tasks. The major bottleneck of these implementations is the readout layer, based on slow offline post-processing. A few analogue solutions have been proposed, but all suffered from a noticeable decrease in performance due to the added complexity of the setup. Methods. Here we propose the use of online training to solve these issues. We study the applicability of this method using numerical simulations of an experimentally feasible reservoir computer with an analogue readout layer. We also consider a nonlinear output layer, which would be very difficult to train with traditional methods. Results. We show numerically that online learning makes it possible to circumvent the added complexity of the analogue layer and obtain the same level of performance as with a digital layer. Conclusion. This work paves the way to high-performance fully-analogue reservoir computers through the use of online training of the output layers.

翻訳日:2023-04-20 04:21:00 公開日:2020-12-19

# 不確実性の算術は量子形式と相対論的時空を統一する

The arithmetic of uncertainty unifies quantum formalism and relativistic spacetime ( http://arxiv.org/abs/2104.05395v1 )

ライセンス: Link先を確認

John Skilling and Kevin H. Knuth

(参考訳) 量子力学と相対性理論は、現代物理学の時代における宇宙の理解を劇的に変えた。量子論は物体を小さなスケールで確率的に扱うが、相対性理論は古典的に空間と時間の運動を扱う。ここでは、量子論と相対性理論の数学的構造が、標準算術と確率の基盤となる同じ基本的「合成とシーケンシング」対称性によって定義され、一意に制約された純粋思考から従うことを示す。鍵となるのは不確実性であり、それは必然的に量の観測に付随し、数対の使用を強制する。この対称性は、複素「$\surd\mathord-1$」算術、量子力学の標準計算、相対論的時空のローレンツ変換に直接導かれる。したがって、時間の1次元と空間の3次元は物理学の深遠で避けられない枠組みとして導出される。

The theories of quantum mechanics and relativity dramatically altered our understanding of the universe ushering in the era of modern physics. Quantum theory deals with objects probabilistically at small scales, whereas relativity deals classically with motion in space and time. We show here that the mathematical structures of quantum theory and of relativity follow together from pure thought, defined and uniquely constrained by the same elementary "combining and sequencing" symmetries that underlie standard arithmetic and probability. The key is uncertainty, which inevitably accompanies observation of quantity and imposes the use of pairs of numbers. The symmetries then lead directly to the use of complex "$\surd\mathord-1$" arithmetic, the standard calculus of quantum mechanics, and the Lorentz transformations of relativistic spacetime. One dimension of time and three dimensions of space are thus derived as the profound and inevitable framework of physics.

翻訳日:2023-04-20 04:17:37 公開日:2020-12-19

# 「オープンシステムの量子フィッシャー情報フローと非マルコフ過程」へのコメント

Comment on "Quantum Fisher information flow and non-Markovian processes of open systems" ( http://arxiv.org/abs/2012.10767v1 )

ライセンス: Link先を確認

Mihaela Vatasescu

(参考訳) 著者らは[phys. rev. a 82, 042103 (2010)]において、「時間局所形式における非マルコフマスター方程式のクラス」について、量子フィッシャー情報(qfi)フローは異なる散逸チャネルに対応する付加的サブフローに分解できることを示した。しかし、この論文は、QFIフローの解析的分解が有効である非マルコフ時間局所マスター方程式のクラスを規定していない。ここでは、Refの中心的な結果に到達するためには、いくつかの仮定が必要であることを示す。密度作用素の狭いクラスである $\rho (\theta;t)$ と量子フィッシャー情報 $\mathcal{f}(\theta;t)$ と、時間-局所マスター方程式の厳密な条件下で有効であるように思われる \cite{luwsun10} 。より正確には、Refで得られたQFIフローの分解である。 \cite{luwsun10} は2つの条件で有効である。 (i) $\frac{d}{dt} \left( \frac {\partial \rho}{\partial \theta} \right)=$$\frac {\partial}{\partial \theta} \left( \frac{d \rho}{dt} \right)$ (ii) $\frac{\partial H}{\partial \theta}=0$, $\frac{\partial \gamma_i}{\partial \theta}=0$, $\frac{\partial A_i}{\partial \theta}=0$, すなわち、ハミルトニアン$H(t)$、崩壊率$\gamma_i(t)$、リンドブラッド作用素$A_i(t)$は、非マルコフ時間局所マスター方程式に現れるが、量子フィッシャー情報が定義されるパラメータ$\theta$に依存してはならない。

In [Phys. Rev. A 82, 042103 (2010)], the authors showed that "for a class of the non-Markovian master equations in time-local forms", the quantum Fisher information (QFI) flow can be decomposed into additive subflows corresponding to different dissipative channels. However, the paper does not specify the class of non-Markovian time-local master equations for which their analytic decomposition of the QFI flow is valid. Here we show that several suppositions have to be made in order to reach the central result of that paper, which appears to be valid for a narrow class of density operators $\rho (\theta;t)$ and quantum Fisher information $\mathcal{F}(\theta;t)$, and under strict conditions on the time-local master equation. More precisely, the decomposition of the QFI flow obtained there is valid under two conditions not mentioned in the paper: (i) $\frac{d}{dt} \left( \frac{\partial \rho}{\partial \theta} \right)=$ $\frac{\partial}{\partial \theta} \left( \frac{d \rho}{dt} \right)$; (ii) $\frac{\partial H}{\partial \theta}=0$, $\frac{\partial \gamma_i}{\partial \theta}=0$, $\frac{\partial A_i}{\partial \theta}=0$, meaning that the Hamiltonian $H(t)$, the decay rates $\gamma_i(t)$, and the Lindblad operators $A_i(t)$ appearing in the non-Markovian time-local master equation must not depend on the parameter $\theta$ about which the quantum Fisher information is defined.

翻訳日:2023-04-20 04:17:17 公開日:2020-12-19

# 建築設計のためのパラメトリックシステムの構成要素としてのビズーロモーティブ複雑度

Visuo-Locomotive Complexity as a Component of Parametric Systems for Architecture Design ( http://arxiv.org/abs/2012.10710v1 )

ライセンス: Link先を確認

Vasiliki Kondyli and Mehul Bhatt and Evgenia Spyridonos

(参考訳) 大規模ビルトアップ空間を設計するための人々中心のアプローチは、ナビゲーション、ウェイフィンディング、ユーザビリティといった側面に関連する人間と環境の相互作用要因の観点から、ユーザの具体化されたビズー・ロケーティブな体験を体系的に予測する必要がある。この文脈において、我々は、ビルトアップ空間における認知性能 vis-a-vis 内部ナビゲーションの重要相関として機能する行動に基づくビゾ移動型複雑性モデルを開発する。また,提案した空間的複雑性モデルのパラメータに従って,ナビゲーション経路に沿った構造形態の同定と操作を行うパラメトリックツールとして,モデルの実装と応用を実証する。本稿では,2つの医療施設における実証研究に基づいて,動的かつインタラクティブなパラメトリック(複合性)モデルが設計プロセス全体を通して行動に基づく意思決定を促進する方法を示し,ナビゲーションやウェイフィンディング体験の一部として,所望のvisospatial complexityのレベルを維持する。

A people-centred approach for designing large-scale built-up spaces necessitates systematic anticipation of user's embodied visuo-locomotive experience from the viewpoint of human-environment interaction factors pertaining to aspects such as navigation, wayfinding, usability. In this context, we develop a behaviour-based visuo-locomotive complexity model that functions as a key correlate of cognitive performance vis-a-vis internal navigation in built-up spaces. We also demonstrate the model's implementation and application as a parametric tool for the identification and manipulation of the architectural morphology along a navigation path as per the parameters of the proposed visuospatial complexity model. We present examples based on an empirical study in two healthcare buildings, and showcase the manner in which a dynamic and interactive parametric (complexity) model can promote behaviour-based decision-making throughout the design process to maintain desired levels of visuospatial complexity as part of a navigation or wayfinding experience.

翻訳日:2023-04-20 04:16:12 公開日:2020-12-19

# 完全量子化規則法による中間子スペクトルの解析

Analytical Investigation of Meson Spectrum via Exact Quantization Rule Approach ( http://arxiv.org/abs/2012.10639v1 )

ライセンス: Link先を確認

Etido P. Inyang, Ephraim P. Inyang, Eddy S. William, Etebong E. Ibekwe and Ita O. Akpan

(参考訳) 我々は、Exact Quantization Ruleアプローチを用いて動径シュレーディンガー方程式を解析的に解き、拡張コーネルポテンシャル(ECP)に対するエネルギー固有値を得た。得られた結果を、チャーモニウムccやボトモニウムbbなどの重中間子、およびボトムチャームbcやチャームストレンジcsなどの重軽中間子の、異なる量子状態に対する質量スペクトルの計算に応用した。ポテンシャルパラメータの一部をゼロに設定することで、それぞれクーロンポテンシャルとコーネルポテンシャルに帰着する2つの特別なケースを検討した。本ポテンシャルは、実験データとの比較で最大誤差0.0065GeV、また他の研究者の結果との比較でも優れた結果を与える。

We solved the radial Schrödinger equation analytically using the Exact Quantization Rule approach to obtain the energy eigenvalues with the Extended Cornell potential (ECP). The present results are applied to calculate the mass spectra of heavy mesons such as charmonium cc and bottomonium bb, and heavy-light mesons such as bottom-charm bc and charm-strange cs, for different quantum states. Two special cases were considered in which some of the potential parameters were set to zero, resulting in the Coulomb potential and the Cornell potential, respectively. The present potential provides excellent results in comparison with experimental data, with a maximum error of 0.0065 GeV, and with the work of other researchers.

翻訳日:2023-04-20 04:15:30 公開日:2020-12-19

# 質量変形SYKモデルの量子カオスと熱力学における鋭い遷移

A sharp transition in quantum chaos and thermodynamics of mass deformed SYK model ( http://arxiv.org/abs/2012.10628v1 )

ライセンス: Link先を確認

Tomoki Nosaka

(参考訳) 我々は,我々の最近の研究 [arxiv:2009.10759] を概観し,2つの結合sachdev-ye-kitaev系のカオス性について考察した。局所場定式化を用いて, 大規模N限界における時間外相関器の計算により, このモデルのカオス指数は相転移温度で不連続な降下を示すことがわかった。したがって、このモデルでは、ホーキング・ページのような遷移は、ブラックホール幾何学と双対場理論におけるカオス的挙動の関係から予想されるカオス性の遷移と相関する。

We review our recent work [arXiv:2009.10759] where we studied the chaotic property of the two coupled Sachdev-Ye-Kitaev systems exhibiting a Hawking-Page like phase transition. By computing the out-of-time-ordered correlator in the large N limit by using the bilocal field formalism, we found that the chaos exponent of this model shows a discontinuous fall-off at the phase transition temperature. Hence in this model the Hawking-Page like transition is correlated with a transition in chaoticity, as expected from the relation between a black hole geometry and the chaotic behavior in the dual field theory.

翻訳日:2023-04-20 04:15:17 公開日:2020-12-19

# 出力フィードバックを持つフォトニック貯水池コンピュータを用いたランダムパターンと周波数生成

Random pattern and frequency generation using a photonic reservoir computer with output feedback ( http://arxiv.org/abs/2012.10615v1 )

ライセンス: Link先を確認

Piotr Antonik, Michiel Hermans, Marc Haelterman, Serge Massar

(参考訳) 貯水池コンピューティングは、時間に依存する信号を処理するためのバイオインスパイアされたコンピューティングパラダイムである。アナログ実装の性能は、一連のベンチマークタスクで他のデジタルアルゴリズムと一致している。それらのポテンシャルは、出力信号を貯水池に戻すことでさらに増大し、このアルゴリズムを時系列生成に適用することができる。これは原則として、リアルタイムの出力計算に十分な高速な読み出し層を実装する必要がある。ここではFPGAチップによって駆動されるデジタル出力層を用いてこれを実現する。出力フィードバックを持つ最初の光電子貯水池コンピュータを実演し、時系列生成タスクの2つの例(周波数とランダムパターン生成)でテストする。理想的な数値シミュレーションと同様、最初のタスクで非常に良い結果が得られる。しかし、後者のパフォーマンスは、実験的なノイズに悩まされている。本稿では,出力フィードバックを用いた物理貯留層コンピュータの性能に及ぼすノイズの影響について詳細に検討した。そこで,本研究はアナログ貯水池計算の新たな応用を開拓し,ノイズが出力フィードバックに与える影響について新たな知見をもたらす。

Reservoir computing is a bio-inspired computing paradigm for processing time-dependent signals. The performance of its analogue implementations matches that of other digital algorithms on a series of benchmark tasks. Their potential can be further increased by feeding the output signal back into the reservoir, which would allow the algorithm to be applied to time series generation. This requires, in principle, implementing a sufficiently fast readout layer for real-time output computation. Here we achieve this with a digital output layer driven by an FPGA chip. We demonstrate the first opto-electronic reservoir computer with output feedback and test it on two examples of time series generation tasks: frequency and random pattern generation. We obtain very good results on the first task, similar to idealised numerical simulations. The performance on the second one, however, suffers from the experimental noise. We illustrate this point with a detailed investigation of the consequences of noise on the performance of a physical reservoir computer with output feedback. Our work thus opens new possible applications for analogue reservoir computing and brings new insights on the impact of noise on the output feedback.

翻訳日:2023-04-20 04:14:39 公開日:2020-12-19

# InSAR位相デノイジング:現在の技術のレビューと今後の方向性

InSAR Phase Denoising: A Review of Current Technologies and Future Directions ( http://arxiv.org/abs/2001.00769v2 )

ライセンス: Link先を確認

Gang Xu, Yandong Gao, Jinwei Li and Mengdao Xing

(参考訳) 近年,干渉合成開口レーダ(InSAR)は情報取得能力を高めることで,リモートセンシングにおける強力なツールとなっている。InSAR処理において,インターフェログラムの位相デノイジングは,地形マッピングと変動モニタリングに必須のステップである。過去30年にわたって,この課題に取り組むために多くの効果的なアルゴリズムが開発されてきた。本稿では,InSAR位相デノイジング手法を包括的に概観し,確立されたアルゴリズムと新興のアルゴリズムを4つの主要なカテゴリに分類する。最初の2つの部分は,それぞれ従来の局所フィルタと変換領域フィルタのカテゴリを扱う。第3部は,その優れた性能を考慮して,非局所(NL)フィルタのカテゴリに焦点を当てる。最後に,信号処理の新しい概念に基づく先進的な手法も,この分野での可能性を示すために紹介する。さらに,シミュレーションデータと実測データの両方を用いて数値実験を行い,いくつかの代表的な位相デノイジング手法を例示・比較する。本稿の目的は,InSAR信号処理のアーキテクチャ開発を促進することで,関連研究者に必要な指針と着想を提供することである。

Interferometric synthetic aperture radar (InSAR) has become a powerful tool in remote sensing by enhancing information acquisition. During InSAR processing, phase denoising of the interferogram is a mandatory step for topography mapping and deformation monitoring. Over the last three decades, a large number of effective algorithms have been developed to address this topic. In this paper, we give a comprehensive overview of InSAR phase denoising methods, classifying the established and emerging algorithms into four main categories. The first two parts cover the categories of traditional local filters and transformed-domain filters, respectively. The third part focuses on the category of nonlocal (NL) filters, considering their outstanding performance. Lastly, some advanced methods based on new concepts of signal processing are also introduced to show their potential in this field. Moreover, several popular phase denoising methods are illustrated and compared by performing numerical experiments using both simulated and measured data. The purpose of this paper is to provide necessary guidelines and inspiration to related researchers by promoting the architecture development of InSAR signal processing.
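
To make the "traditional local filter" category concrete, here is a minimal sketch of a boxcar filter on wrapped phase. It is a generic textbook-style illustration, not a method taken from the paper; the function name `boxcar_phase_filter` and the one-dimensional setting are assumptions chosen for brevity.

```python
import cmath
import math


def boxcar_phase_filter(phase, half_window=1):
    """Smooth a 1-D list of wrapped phases (radians) with a sliding window.

    Each phase is mapped to the unit phasor exp(j*phase), the phasors in the
    window are averaged, and the angle of the average is returned. Averaging
    phasors rather than raw angles keeps the +pi / -pi wrap from biasing
    the result.
    """
    n = len(phase)
    filtered = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        mean_phasor = sum(cmath.exp(1j * p) for p in phase[lo:hi]) / (hi - lo)
        filtered.append(cmath.phase(mean_phasor))
    return filtered


# Samples that sit on either side of the wrap point at +/- pi.
noisy = [3.1, -3.1, 3.1, -3.1]
smoothed = boxcar_phase_filter(noisy, half_window=1)
```

A naive arithmetic mean of these samples would land near 0, the opposite side of the circle, whereas the phasor average stays near $\pm\pi$.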

翻訳日:2023-01-14 18:04:55 公開日:2020-12-19

# パッチによるホログラフィックセンシング

Patch-Based Holographic Image Sensing ( http://arxiv.org/abs/2002.03314v3 )

ライセンス: Link先を確認

Alfred Marcel Bruckstein, Martianus Frederic Ezerman, Adamas Aqsa Fahreza, and San Ling

(参考訳) データのホログラフィック表現は、データの格納されたパケットが任意の順序で利用可能になったときに、漸進的に洗練された分散ストレージを可能にする。本稿では,画像データのホログラフィックセンシングを行うパッチベース変換方式を提案する。提案手法は,データのランダムな検索順序下での進行回復に最適化されている。画像パッチのコーディングは、各検索段階で$\ell_2$ノルムの観点から、最適な画像復元を保証する分散プロジェクションの設計に依存している。パフォーマンスは、これまで検索されたデータパケットの数にのみ依存する。データパケットのサイズや数を変えながら、回復の質を高めるためのいくつかの選択肢が議論され、テストされる。これにより,いくつかの興味深いビット配置とレート歪みのトレードオフを検証し,推定された統計特性を持つ自然画像の集合について強調する。

Holographic representations of data enable distributed storage with progressive refinement when the stored packets of data are made available in any arbitrary order. In this paper, we propose and test patch-based transform coding holographic sensing of image data. Our proposal is optimized for progressive recovery under random order of retrieval of the stored data. The coding of the image patches relies on the design of distributed projections ensuring best image recovery, in terms of the $\ell_2$ norm, at each retrieval stage. The performance depends only on the number of data packets that have been retrieved thus far. Several possible options to enhance the quality of the recovery while changing the size and number of data packets are discussed and tested. This leads us to examine several interesting bit-allocation and rate-distortion trade-offs, highlighted for a set of natural images with ensemble estimated statistical properties.

翻訳日:2023-01-02 15:03:25 公開日:2020-12-19

# ResiliNet: 分散ニューラルネットワークにおける障害耐性推論

ResiliNet: Failure-Resilient Inference in Distributed Neural Networks ( http://arxiv.org/abs/2002.07386v4 )

ライセンス: Link先を確認

Ashkan Yousefpour, Brian Q. Nguyen, Siddartha Devic, Guanhua Wang, Aboudy Kreidieh, Hans Lobel, Alexandre M. Bayen, Jason P. Jue

(参考訳) Federated Learningの目的は、生データを集中型サーバと共有することなく、分散ディープモデルをトレーニングすることだ。同様に、ニューラルネットワークの分散推論では、ネットワークを分割して複数の物理ノードに分散することで、アクティベーションと勾配が生データではなく物理ノード間で交換される。それでも、ニューラルネットワークを物理的ノードに分割して分散する場合、物理的ノードの障害は、それらのノードに置かれる神経ユニットの障害を引き起こし、結果としてパフォーマンスが大幅に低下する。現在のアプローチは、分散ニューラルネットワークにおけるトレーニングのレジリエンスに重点を置いている。しかし、分散ニューラルネットワークにおける推論のレジリエンスは調査されていない。 ResiliNetは、分散ニューラルネットワークにおいて物理ノード障害に耐性を持たせるためのスキームである。 ResiliNetは2つの概念を組み合わせてレジリエンスを提供する: ハイパーコネクションをスキップする、分散ニューラルネットワークのノードをスキップする、resnetの接続をスキップするのと同様のコンセプト、フェールアウトと呼ばれる新しいテクニック。 Failoutは、ドロップアウトを使用したトレーニング中の物理ノード障害条件をシミュレートし、分散ニューラルネットワークのレジリエンスを改善するように設計されている。 3つのデータセットを用いた実験およびアブレーション研究の結果、分散ニューラルネットワークに対する推論レジリエンスを提供するResiliNetの能力が確認された。

Federated Learning aims to train distributed deep models without sharing the raw data with the centralized server. Similarly, in distributed inference of neural networks, by partitioning the network and distributing it across several physical nodes, activations and gradients are exchanged between physical nodes, rather than raw data. Nevertheless, when a neural network is partitioned and distributed among physical nodes, failure of physical nodes causes the failure of the neural units that are placed on those nodes, which results in a significant performance drop. Current approaches focus on resiliency of training in distributed neural networks. However, resiliency of inference in distributed neural networks is less explored. We introduce ResiliNet, a scheme for making inference in distributed neural networks resilient to physical node failures. ResiliNet combines two concepts to provide resiliency: skip hyperconnection, a concept for skipping nodes in distributed neural networks similar to skip connection in resnets, and a novel technique called failout, which is introduced in this paper. Failout simulates physical node failure conditions during training using dropout, and is specifically designed to improve the resiliency of distributed neural networks. The results of the experiments and ablation studies using three datasets confirm the ability of ResiliNet to provide inference resiliency for distributed neural networks.
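
The two ingredients named in the abstract can be illustrated with a toy chain of scalar "nodes". This is a sketch under stated assumptions, not the ResiliNet implementation: the names `sample_failures` and `run_chain`, the scalar node function, and the way the skip input is combined are all hypothetical.

```python
import random


def sample_failures(survival_probs, rng):
    """Failout-style sampling: node k stays alive with probability survival_probs[k]."""
    return [rng.random() < p for p in survival_probs]


def run_chain(x, alive):
    """Run a chain of toy nodes; node k sees node k-1 plus a skip from node k-2.

    alive[k-1] is False when physical node k has failed, in which case its
    output is zero, mimicking a node dropped during training.
    """
    outputs = [x]  # outputs[0] is the raw input
    for k, is_alive in enumerate(alive, start=1):
        prev = outputs[k - 1]
        skip = outputs[k - 2] if k >= 2 else 0.0  # skip hyperconnection
        value = (prev + skip) * 0.5 + 1.0 if is_alive else 0.0
        outputs.append(value)
    return outputs[-1]


rng = random.Random(0)
alive = sample_failures([1.0, 0.0, 1.0], rng)  # the middle node always fails
result = run_chain(2.0, alive)
```

Even with the middle node down, the last node still receives information about the input through the skip path, so different inputs give different outputs.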

翻訳日:2022-12-30 19:25:47 公開日:2020-12-19

# 半自律遠隔操作における優先支援戦略の学習と伝達知識

Learn and Transfer Knowledge of Preferred Assistance Strategies in Semi-autonomous Telemanipulation ( http://arxiv.org/abs/2003.03516v2 )

ライセンス: Link先を確認

Lingfeng Tao, Michael Bowman, Xu Zhou, Jiucai Zhang, Xiaoli Zhang

(参考訳) ロボットを効果的に支援するための支援を行うには、ロボットの補助動作が必ずしも人間の操作者にとって直感的ではなく、人間の行動や嗜好がロボットの解釈に不明瞭であることから、操作者の指示による遠隔操作の操作は極めて困難である。様々な最適化の観点から制御品質を改善するための様々な支援手法が開発されているが、テレマニピュレーションタスクの微妙な動作制約とオペレータの好みを満たす適切なアプローチを決定することは依然として課題である。これらの問題に対処するため、我々は新しい選好支援知識学習アプローチを開発した。補助選好モデルは、人間が好む援助を学習し、段階的モデル更新方法は、人間の選好データのあいまいさに対処しながら、学習安定性を確保する。このような嗜好認識支援知識により、遠隔操作ロボットハンドは、操作の成功に対してより活発で望ましい支援を提供することができる。また,ロボット固有の学習を避けるために,異なるロボットハンド構造にまたがる嗜好知識を伝達する知識伝達手法を開発した。 3フィンガーハンドと2フィンガーハンドをテレマニピュレートして、使用、移動、カップ上のハンドを遠隔操作する実験が行われている。その結果,ロボットはロボットの選好知識を効果的に学習し,学習労力の少ないロボット間での知識伝達を可能にした。

Enabling robots to provide effective assistance yet still accommodating the operator's commands for telemanipulation of an object is very challenging because robot's assistive action is not always intuitive for human operators and human behaviors and preferences are sometimes ambiguous for the robot to interpret. Although various assistance approaches are being developed to improve the control quality from different optimization perspectives, the problem still remains in determining the appropriate approach that satisfies the fine motion constraints for the telemanipulation task and preference of the operator. To address these problems, we developed a novel preference-aware assistance knowledge learning approach. An assistance preference model learns what assistance is preferred by a human, and a stagewise model updating method ensures the learning stability while dealing with the ambiguity of human preference data. Such a preference-aware assistance knowledge enables a teleoperated robot hand to provide more active yet preferred assistance toward manipulation success. We also developed knowledge transfer methods to transfer the preference knowledge across different robot hand structures to avoid extensive robot-specific training. Experiments to telemanipulate a 3-finger hand and 2-finger hand, respectively, to use, move, and hand over a cup have been conducted. Results demonstrated that the methods enabled the robots to effectively learn the preference knowledge and allowed knowledge transfer between robots with less training effort.

翻訳日:2022-12-25 20:07:39 公開日:2020-12-19

# FLOSSバージョンリリースイベントをテキストメッセージから検出することは可能か? stack overflow のケーススタディ

Is it feasible to detect FLOSS version release events from textual messages? A case study on Stack Overflow ( http://arxiv.org/abs/2003.14257v3 )

ライセンス: Link先を確認

A. Sokolovsky, T. Gross, J. Bacardit

(参考訳) トピック検出と追跡(TDT)はテキストマイニング領域における非常に活発な研究課題であり、一般的にトピックやイベントを検出するニュースフィードやTwitterデータセットに適用される。イベント"の概念は広いが、通常は単一のポストやメッセージから検出できる事象に適用される。マイクロイベント(micro-events)と呼ばれるもので、その性質上、単一のテキスト情報からは検出できない。この研究は、Stack Overflow Q&AプラットフォームのメッセージのサンプルとLibraries.ioデータセットのFree/Libre Open Source Software(FLOSS)バージョンリリースを使用して、テキストデータ上でのマイクロイベント検出の実現可能性を検討する。格子探索手法を用いてパラメータを最適化した3つの異なる推定器を用いてマイクロイベントを検出するパイプラインを構築する。我々は、感情分析を伴うLDAトピックモデリングと、感情分析を伴うhSBMトピックの2つの特徴空間を考える。特徴空間は、クロスバリデーション(RFECV)戦略による再帰的特徴除去を用いて最適化される。本研究では,マイクロイベント発生前後のトピック分布や感情特性に特徴的な変化があるかどうかを考察し,マイクロイベント検出のための分析パイプラインの各バリエーションのキャパシティを徹底的に評価する。さらに, 影響事例, 分散インフレーション係数, 線形性仮定の検証, 擬似R2乗測度, 無情報率など, モデルに関する詳細な統計分析を行った。最後に,マイクロイベント検出の限界を研究するために,実世界のデータに類似した特性を持つマイクロイベント合成データセットを生成する手法を設計し,評価された各分類器のマイクロイベント検出可能性閾値を同定する。

Topic Detection and Tracking (TDT) is a very active research question within the area of text mining, generally applied to news feeds and Twitter datasets, where topics and events are detected. The notion of "event" is broad, but typically it applies to occurrences that can be detected from a single post or a message. Little attention has been drawn to what we call "micro-events", which, due to their nature, cannot be detected from a single piece of textual information. The study investigates the feasibility of micro-event detection on textual data using a sample of messages from the Stack Overflow Q&A platform and Free/Libre Open Source Software (FLOSS) version releases from Libraries.io dataset. We build pipelines for detection of micro-events using three different estimators whose parameters are optimized using a grid search approach. We consider two feature spaces: LDA topic modeling with sentiment analysis, and hSBM topics with sentiment analysis. The feature spaces are optimized using the recursive feature elimination with cross validation (RFECV) strategy. In our experiments we investigate whether there is a characteristic change in the topics distribution or sentiment features before or after micro-events take place and we thoroughly evaluate the capacity of each variant of our analysis pipeline to detect micro-events. Additionally, we perform a detailed statistical analysis of the models, including influential cases, variance inflation factors, validation of the linearity assumption, pseudo R squared measures and no-information rate. Finally, in order to study limits of micro-event detection, we design a method for generating micro-event synthetic datasets with similar properties to the real-world data, and use them to identify the micro-event detectability threshold for each of the evaluated classifiers.

翻訳日:2022-12-18 07:18:01 公開日:2020-12-19

# $O(n)$接続は十分表現力がある:スパーストランスフォーマーの普遍近似可能性

$O(n)$ Connections are Expressive Enough: Universal Approximability of Sparse Transformers ( http://arxiv.org/abs/2006.04862v2 )

ライセンス: Link先を確認

Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar

(参考訳) 近年,多くのNLPタスクにおいてTransformerネットワークが最先端技術を塗り替えている。しかし,これらのモデルは,各層でペアワイズ注意を計算するために,入力シーケンス長$n$に対して2次の計算コストがかかる。このことは,注意層内の接続を疎にするスパースTransformerの最近の研究を促している。長い系列に対して経験的に有望である一方,基本的な疑問は未解決のままである。スパースTransformerは,密なTransformerと同様に,任意のsequence-to-sequence関数を近似できるのか? スパース性のパターンとレベルは性能にどのように影響するのか? 本稿では,これらの問いに取り組み,既存のスパース注意モデルを包含する統一的な枠組みを提供する。スパース注意モデルが任意のsequence-to-sequence関数を普遍的に近似できることを証明するための十分条件を提案する。驚くべきことに,注意層あたり$O(n)$の接続しか持たないスパースTransformerが,$n^2$の接続を持つ密なモデルと同じ関数クラスを近似できることが示された。最後に,標準的なNLPタスクにおいて,異なるスパース性のパターンやレベルを比較する実験を示す。

Recently, Transformer networks have redefined the state of the art in many NLP tasks. However, these models suffer from quadratic computational cost in the input sequence length $n$ to compute pairwise attention in each layer. This has prompted recent research into sparse Transformers that sparsify the connections in the attention layers. While empirically promising for long sequences, fundamental questions remain unanswered: Can sparse Transformers approximate any arbitrary sequence-to-sequence function, similar to their dense counterparts? How do the sparsity pattern and the sparsity level affect their performance? In this paper, we address these questions and provide a unifying framework that captures existing sparse attention models. We propose sufficient conditions under which we prove that a sparse attention model can universally approximate any sequence-to-sequence function. Surprisingly, our results show that sparse Transformers with only $O(n)$ connections per attention layer can approximate the same function class as the dense model with $n^2$ connections. Lastly, we present experiments comparing different patterns/levels of sparsity on standard NLP tasks.
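
The contrast between $O(n)$ and $n^2$ connections can be checked by counting. The sketch below builds one illustrative sparsity pattern (a sliding window plus global tokens) and counts its query-key connections; the pattern and the names `sparse_attention_pattern` and `count_connections` are assumptions for illustration, and nothing here verifies the paper's sufficient conditions.

```python
def sparse_attention_pattern(n, window=1, num_global=1):
    """Return, for each query position, the sorted key positions it attends to."""
    pattern = []
    for i in range(n):
        # Local band: positions within `window` of the query.
        keys = set(range(max(0, i - window), min(n, i + window + 1)))
        # Every token attends to the global tokens at the start.
        keys.update(range(min(num_global, n)))
        # Global tokens attend to every position.
        if i < num_global:
            keys.update(range(n))
        pattern.append(sorted(keys))
    return pattern


def count_connections(pattern):
    """Total number of (query, key) pairs in the pattern."""
    return sum(len(keys) for keys in pattern)


sparse_100 = count_connections(sparse_attention_pattern(100))
sparse_200 = count_connections(sparse_attention_pattern(200))
dense_200 = 200 * 200
```

With `window=1` and one global token the count is $5n - 6$ for $n \ge 3$, so doubling the sequence length roughly doubles the connections instead of quadrupling them.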

翻訳日:2022-11-24 00:58:50 公開日:2020-12-19

# ランダムフーリエ特徴のランダム行列解析:ガウスカーネルを超えて、精密な相転移、および対応する二重降下

A Random Matrix Analysis of Random Fourier Features: Beyond the Gaussian Kernel, a Precise Phase Transition, and the Corresponding Double Descent ( http://arxiv.org/abs/2006.05013v2 )

ライセンス: Link先を確認

Zhenyu Liao, Romain Couillet, Michael W. Mahoney

(参考訳) 本稿では、データサンプル数$n$、その次元$p$、特徴空間の次元$N$がすべて大きく、かつ同程度であるという現実的な設定において、ランダムフーリエ特徴(RFF)回帰の正確な漸近挙動を特徴づける。この領域では、ランダムなRFFグラム行列は、($N \to \infty$のみの場合のように)よく知られた極限のガウスカーネル行列にはもはや収束しないが、それでも我々の解析によって捉えられる扱いやすい振る舞いを持つ。この解析はまた、大きな$n,p,N$に対する訓練およびテスト回帰誤差の正確な推定も与える。これらの推定に基づいて、定性的に異なる2つの学習相と、それらの間の相転移の正確な特徴づけを与え、この相転移挙動から対応する二重降下テスト誤差曲線を導出する。これらの結果はデータ分布に関する強い仮定に依存せず、実世界のデータセットでの経験的な結果と完全に一致する。

This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of feature space $N$ are all large and comparable. In this regime, the random RFF Gram matrix no longer converges to the well-known limiting Gaussian kernel matrix (as it does when $N \to \infty$ alone), but it still has a tractable behavior that is captured by our analysis. This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$. Based on these estimates, a precise characterization of two qualitatively different phases of learning, including the phase transition between them, is provided; and the corresponding double descent test error curve is derived from this phase transition behavior. These results do not depend on strong assumptions on the data distribution, and they perfectly match empirical results on real-world data sets.
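
As background for the $N \to \infty$ limit mentioned above, the following sketch builds a cosine/sine random Fourier feature map and compares one inner product with the Gaussian kernel. It is a plain-Python illustration under assumptions (the bandwidth `sigma`, the helper names, and the sample points are all made up here), and it does not reproduce the paper's analysis of the regime where $n$, $p$ and $N$ are comparable.

```python
import math
import random


def sample_frequencies(num_features, dim, sigma, rng):
    """Draw num_features frequency vectors with i.i.d. N(0, 1/sigma^2) entries."""
    return [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
            for _ in range(num_features)]


def rff_map(x, frequencies):
    """Map x to [cos(w.x), ..., sin(w.x), ...] scaled by 1/sqrt(N)."""
    scale = 1.0 / math.sqrt(len(frequencies))
    proj = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in frequencies]
    return ([scale * math.cos(t) for t in proj]
            + [scale * math.sin(t) for t in proj])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gaussian_kernel(x, y, sigma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


rng = random.Random(0)
freqs = sample_frequencies(num_features=2000, dim=3, sigma=2.0, rng=rng)
x, y = [0.5, -1.0, 0.25], [0.0, 0.5, 1.0]
approx = dot(rff_map(x, freqs), rff_map(y, freqs))
exact = gaussian_kernel(x, y, sigma=2.0)
```

Here the number of features is large relative to the dimension, so `approx` is close to `exact`; the abstract's point is that this entrywise picture is not enough once $N$ is only comparable to $n$ and $p$.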

翻訳日:2022-11-23 13:32:37 公開日:2020-12-19

# 証明可能にロバストな計量学習

Provably Robust Metric Learning ( http://arxiv.org/abs/2006.07024v2 )

ライセンス: Link先を確認

Lu Wang, Xuanqing Liu, Jinfeng Yi, Yuan Jiang, Cho-Jui Hsieh

(参考訳) 計量学習は分類と類似性探索のための重要なアルゴリズム群であるが、小さな敵対的摂動に対する学習済み計量の頑健性はあまり研究されていない。本稿では、クリーンな精度の向上に焦点を当てた既存の計量学習アルゴリズムが、ユークリッド距離よりも頑健でない計量を生み出しうることを示す。この問題を解決するために、敵対的摂動に対して頑健なマハラノビス距離を求める新しい計量学習アルゴリズムを提案する。得られるモデルの頑健性は認証可能である。実験の結果、提案手法は認証されたロバスト誤差と経験的ロバスト誤差(敵対的攻撃下での誤差)の両方を改善することが示された。さらに、クリーン誤差とロバスト誤差のトレードオフに通常直面するニューラルネットワークの防御手法とは異なり、本手法は従来の計量学習手法と比べてクリーン誤差を犠牲にしない。私たちのコードはhttps://github.com/wangwllu/provably_robust_metric_learningで利用可能です。

Metric learning is an important family of algorithms for classification and similarity search, but the robustness of learned metrics against small adversarial perturbations is less studied. In this paper, we show that existing metric learning algorithms, which focus on boosting the clean accuracy, can result in metrics that are less robust than the Euclidean distance. To overcome this problem, we propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations, and the robustness of the resulting model is certifiable. Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors (errors under adversarial attacks). Furthermore, unlike neural network defenses which usually encounter a trade-off between clean and robust errors, our method does not sacrifice clean errors compared with previous metric learning methods. Our code is available at https://github.com/wangwllu/provably_robust_metric_learning.
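
For readers unfamiliar with the object being learned, here is a minimal sketch of a squared Mahalanobis distance and a 1-nearest-neighbour rule built on it. The matrix `M` below is hand-picked for illustration, not learned, and the function names are hypothetical; the authors' actual code is at the repository linked above.

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y).

    M is assumed to be symmetric positive semidefinite.
    """
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return sum(d[i] * Md[i] for i in range(len(d)))


def nearest_neighbor_label(query, points, labels, M):
    """Label of the training point closest to query under the metric M."""
    best = min(range(len(points)),
               key=lambda i: mahalanobis_sq(query, points[i], M))
    return labels[best]


points = [[2.0, 0.0], [0.0, 1.5]]
labels = ["a", "b"]
identity = [[1.0, 0.0], [0.0, 1.0]]   # reduces to squared Euclidean distance
stretched = [[0.25, 0.0], [0.0, 4.0]]  # down-weights axis 0, up-weights axis 1

euclidean_label = nearest_neighbor_label([0.0, 0.0], points, labels, identity)
learned_label = nearest_neighbor_label([0.0, 0.0], points, labels, stretched)
```

Changing `M` changes which neighbour is nearest, which is why the choice of metric affects both clean accuracy and how far a point must be perturbed to flip its prediction.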

翻訳日:2022-11-22 02:41:07 公開日:2020-12-19

# 時空間クリギングのためのインダクティブグラフニューラルネットワーク

Inductive Graph Neural Networks for Spatiotemporal Kriging ( http://arxiv.org/abs/2006.07527v2 )

ライセンス: Link先を確認

Yuankai Wu, Dingyi Zhuang, Aurelie Labbe and Lijun Sun

(参考訳) 時系列予測と時空間クリグは時空間データ解析において最も重要な2つのタスクである。グラフニューラルネットワークに関する最近の研究は、時系列予測にかなりの進歩をもたらしているが、サンプリングされていない場所やセンサーの信号を回復するkriging問題にはほとんど注意が払われていない。既存のスケーラブルなkrigingメソッド(例えば、マトリックス/テンソル補完)は、トランスダクティブであり、補間するための新しいセンサーがある場合、完全なリトレーニングが必要である。本稿では,インダクティブグラフニューラルネットワーク(IGNNK, Inductive Graph Neural Network Kriging)モデルを構築し,ネットワーク/グラフ構造上のアンサンプリングセンサのデータを復元する。距離と到達性の影響を一般化するため,サンプルとしてランダムな部分グラフを生成し,各サンプルに対して対応する隣接行列を再構成する。各サンプルサブグラフ上のすべての信号を再構成することにより、IGNNKは空間メッセージパッシング機構を効果的に学習することができる。実世界の時空間データセットにおける実験結果から,モデルの有効性が示された。さらに、学習したモデルは、目に見えないデータセット上の同じタイプのクリグタスクにうまく転送できることがわかった。結果はこう示しています 1)GNNは空間クリギングの効率的かつ効果的なツールである。 2)誘導性GNNは動的隣接行列を用いて訓練することができる。 3) トレーニングされたモデルを新しいグラフ構造に転送し、 4)IGNNKは仮想センサの生成に使用できる。

Time series forecasting and spatiotemporal kriging are the two most important tasks in spatiotemporal data analysis. Recent research on graph neural networks has made substantial progress in time series forecasting, while little attention has been paid to the kriging problem -- recovering signals for unsampled locations/sensors. Most existing scalable kriging methods (e.g., matrix/tensor completion) are transductive, and thus full retraining is required when we have a new sensor to interpolate. In this paper, we develop an Inductive Graph Neural Network Kriging (IGNNK) model to recover data for unsampled sensors on a network/graph structure. To generalize the effect of distance and reachability, we generate random subgraphs as samples and reconstruct the corresponding adjacency matrix for each sample. By reconstructing all signals on each sample subgraph, IGNNK can effectively learn the spatial message passing mechanism. Empirical results on several real-world spatiotemporal datasets demonstrate the effectiveness of our model. In addition, we also find that the learned model can be successfully transferred to the same type of kriging tasks on an unseen dataset. Our results show that: 1) GNN is an efficient and effective tool for spatial kriging; 2) inductive GNNs can be trained using dynamic adjacency matrices; 3) a trained model can be transferred to new graph structures and 4) IGNNK can be used to generate virtual sensors.
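
The training-sample construction described above (random subgraph, matching adjacency submatrix, reconstruction of all signals) can be sketched as follows. This is an assumption-laden illustration rather than the IGNNK code: hiding nodes by zeroing their signals, and the name `sample_training_subgraph`, are choices made here.

```python
import random


def sample_training_subgraph(adjacency, signals, num_nodes, num_masked, rng):
    """Sample a random subgraph and hide the signals of some of its nodes.

    adjacency: full n x n matrix as a list of lists
    signals:   signals[i] is the time series observed at node i
    Returns (sub_adjacency, masked_inputs, targets, masked_positions).
    """
    nodes = rng.sample(range(len(adjacency)), num_nodes)
    # Adjacency restricted to the sampled nodes, in sampled order.
    sub_adj = [[adjacency[i][j] for j in nodes] for i in nodes]
    masked = set(rng.sample(range(num_nodes), num_masked))
    # The model is asked to reconstruct every signal on the subgraph.
    targets = [list(signals[i]) for i in nodes]
    inputs = [[0.0] * len(t) if k in masked else list(t)
              for k, t in enumerate(targets)]
    return sub_adj, inputs, targets, sorted(masked)


ring = [[0, 1, 0, 0, 1],
        [1, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [0, 0, 1, 0, 1],
        [1, 0, 0, 1, 0]]
series = [[float(i), float(i) + 0.5] for i in range(1, 6)]
sub_adj, inputs, targets, masked = sample_training_subgraph(
    ring, series, num_nodes=4, num_masked=2, rng=random.Random(0))
```

Because each sample has its own node set and adjacency matrix, a model trained this way never relies on a fixed graph, which is what makes the approach inductive.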

翻訳日:2022-11-21 20:33:39 公開日:2020-12-19

# 科学的プロットのための物体検出ネットワークの系統的評価

A Systematic Evaluation of Object Detection Networks for Scientific Plots ( http://arxiv.org/abs/2007.02240v2 )

ライセンス: Link先を確認

Pritha Ganguly, Nitesh Methani, Mitesh M. Khapra and Pratyush Kumar

(参考訳) 既存の物体検出法は、自然画像に見られる物体と明らかに異なる科学的プロットのテキストや視覚要素を検出するのに適切か? この質問に答えるために、PlotQAデータセット上の様々なSOTAオブジェクト検出ネットワークの精度をトレーニングし比較する。 0.5の標準IOU設定では、ほとんどのネットワークはプロット内の比較的単純な物体を検出する場合、mAPスコアが80%以上である。しかし、パフォーマンスは0.9のより厳格なIOUで評価されると大幅に低下し、最高のモデルでmAPは35.70%となった。このような厳密な評価は、小さな局所化誤差でさえ下流の数値推論において大きな誤差をもたらす科学的なプロットを扱う際に必要である。この性能が劣ると、異なるオブジェクト検出ネットワークのアイデアを組み合わせることで、既存のモデルに小さな修正を加えることを提案する。これはパフォーマンスを大幅に改善するが、依然として2つの大きな問題がある。 (i)推論に欠かせないテキストオブジェクトのパフォーマンスは、非常に貧弱である。 (ii)プロットの単純さを考えると、推論時間は明らかに大きい。この未解決の問題を解決するために一連の貢献をします (a)ラプラシアンエッジ検出器に基づく効率的な領域提案法 (b)隣接情報を含む地域提案の特徴表示 (c)より長いテキストオブジェクトを検出するための複数の領域提案に結合するリンクコンポーネント、 (d)スムーズなL1ロスとIOUベースのロスを組み合わせたカスタムロス関数。これらのアイデアを組み合わせることで、最終モデルは、93.44%@0.9 IOUのmAPを達成する極端なIOU値において非常に正確である。同時に、我々のモデルは1段検出器を含む現在のモデルよりも16倍少ない推論時間で非常に効率的である。これらの貢献により、プロットの自動推論のさらなる探索が可能になる。

Are existing object detection methods adequate for detecting text and visual elements in scientific plots, which are arguably different from the objects found in natural images? To answer this question, we train and compare the accuracy of various SOTA object detection networks on the PlotQA dataset. At the standard IOU setting of 0.5, most networks perform well, with mAP scores greater than 80% in detecting the relatively simple objects in plots. However, the performance drops drastically when evaluated at a stricter IOU of 0.9, with the best model giving a mAP of 35.70%. Note that such a stricter evaluation is essential when dealing with scientific plots, where even minor localisation errors can lead to large errors in downstream numerical inferences. Given this poor performance, we propose minor modifications to existing models by combining ideas from different object detection networks. While this significantly improves the performance, there are still 2 main issues: (i) performance on text objects, which are essential for reasoning, is very poor, and (ii) inference time is unacceptably large considering the simplicity of plots. To solve this open problem, we make a series of contributions: (a) an efficient region proposal method based on Laplacian edge detectors, (b) a feature representation of region proposals that includes neighbouring information, (c) a linking component to join multiple region proposals for detecting longer textual objects, and (d) a custom loss function that combines a smooth L1-loss with an IOU-based loss. Combining these ideas, our final model is very accurate at extreme IOU values, achieving a mAP of 93.44%@0.9 IOU. Simultaneously, our model is very efficient, with an inference time 16x shorter than that of the current models, including one-stage detectors. With these contributions, we enable further exploration on the automated reasoning of plots.
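
The gap between the 0.5 and 0.9 IOU thresholds is easy to see on a single box. The sketch below computes IOU for axis-aligned boxes and one possible way of adding a smooth L1 term to an IOU-based term; the abstract does not give the loss formula, so `combined_loss`, its weight, and the `beta` parameter are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def smooth_l1(pred, target, beta=1.0):
    """Sum of smooth L1 penalties over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total


def combined_loss(pred, target, iou_weight=1.0):
    """Hypothetical combination: smooth L1 plus a weighted (1 - IOU) term."""
    return smooth_l1(pred, target) + iou_weight * (1.0 - iou(pred, target))


truth = (0.0, 0.0, 10.0, 10.0)
shifted = (1.0, 0.0, 11.0, 10.0)  # a one-unit horizontal localisation error
score = iou(truth, shifted)
```

A shift of one unit on a ten-unit box gives an IOU of about 0.82, which counts as a hit at the 0.5 threshold and a miss at 0.9.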

翻訳日:2022-11-13 08:23:15 公開日:2020-12-19

# プロジェクション・プールを用いた野生の総合的多視点ビルディング解析

Holistic Multi-View Building Analysis in the Wild with Projection Pooling ( http://arxiv.org/abs/2008.10041v3 )

ライセンス: Link先を確認

Zbigniew Wojna, Krzysztof Maziarz, {\L}ukasz Jocz, Robert Pa{\l}uba, Robert Kozikowski, Iasonas Kokkinos

(参考訳) 建設タイプ, 床数, 屋根のピッチと形状, ファサード材, 占有階級といった, きめ細かい建築特性に関する6つの異なる分類課題に対処する。このようなリモートビル分析問題に取り組むことは、都市シーンの大規模データセットの成長によって最近初めて可能になった。この目的のために,9674棟の49426画像(トップビューとストリートビュー)からなる新しいベンチマークデータセットを導入する。これらの写真は幾何学的メタデータと共にさらに組み立てられる。データセットには、オクルージョン、ぼやけ、部分的に見える物体、広い範囲の建物など、さまざまな現実世界の課題が展示されている。本研究では,高次元空間におけるトップビューとサイドビューの統一的なトップビュー表現を作成する,新しい投影プーリング層を提案する。これにより、ビルドとイメージメタデータをシームレスに利用することができます。このレイヤの導入により、高度に調整されたベースラインモデルと比較して、分類精度が向上する。

We address six different classification tasks related to fine-grained building attributes: construction type, number of floors, pitch and geometry of the roof, facade material, and occupancy class. Tackling such a remote building analysis problem became possible only recently due to growing large-scale datasets of urban scenes. To this end, we introduce a new benchmarking dataset, consisting of 49426 images (top-view and street-view) of 9674 buildings. These photos are further assembled, together with the geometric metadata. The dataset showcases various real-world challenges, such as occlusions, blur, partially visible objects, and a broad spectrum of buildings. We propose a new projection pooling layer, creating a unified, top-view representation of the top-view and the side views in a high-dimensional space. It allows us to utilize the building and imagery metadata seamlessly. Introducing this layer improves classification accuracy -- compared to highly tuned baseline models -- indicating its suitability for building analysis.

翻訳日:2022-10-26 02:45:13 公開日:2020-12-19

# AuthNet: 時間的顔特徴運動を用いた深層学習に基づく認証機構

AuthNet: A Deep Learning based Authentication Mechanism using Temporal Facial Feature Movements ( http://arxiv.org/abs/2012.02515v2 )

ライセンス: CC BY 4.0

Mohit Raghavendra, Pravan Omprakash, B R Mukesh, Sowmya Kamath

(参考訳) 機械学習とディープラーニングに基づくバイオメトリックシステムは、スマートフォンや他の小さなコンピューティングデバイスのようなリソースに制約のある環境で認証メカニズムとして広く使われている。これらのAIを利用した顔認識メカニズムは、透明で接触のない非侵襲的な性質のため、近年大きな人気を集めている。効果は大きいが、写真やマスク、メガネなどを使って、許可されていないアクセスを得る方法もある。本稿では,顔認識と,その顔の特異な動作の両方を用いて,パスワードを発話する認証機構,すなわち時間的顔特徴動作を提案する。提案モデルは、ユーザが任意の言語でパスワードを設定できるため、言語障壁によって阻害されない。標準のMIRACL-VC1データセットで評価すると、提案モデルは98.1%の精度を達成し、有効で堅牢なシステムとしての有効性を実証した。提案手法は, 正の映像サンプル10点をトレーニングしても良好な結果が得られたため, データ効率も高い。また, ネットワークの学習能力は, 様々な複合顔認証とリップ読解モデルに対して, 提案方式のベンチマークによって実証される。

Biometric systems based on Machine learning and Deep learning are being extensively used as authentication mechanisms in resource-constrained environments like smartphones and other small computing devices. These AI-powered facial recognition mechanisms have gained enormous popularity in recent years due to their transparent, contact-less and non-invasive nature. While they are effective to a large extent, there are ways to gain unauthorized access using photographs, masks, glasses, etc. In this paper, we propose an alternative authentication mechanism that uses both facial recognition and the unique movements of that particular face while uttering a password, that is, the temporal facial feature movements. The proposed model is not inhibited by language barriers because a user can set a password in any language. When evaluated on the standard MIRACL-VC1 dataset, the proposed model achieved an accuracy of 98.1%, underscoring its effectiveness as an effective and robust system. The proposed method is also data-efficient since the model gave good results even when trained with only 10 positive video samples. The competence of the training of the network is also demonstrated by benchmarking the proposed system against various compounded Facial recognition and Lip reading models.

翻訳日:2021-05-23 07:04:08 公開日:2020-12-19

# MLS:音声研究のための大規模多言語データセット

MLS: A Large-Scale Multilingual Dataset for Speech Research ( http://arxiv.org/abs/2012.03411v2 )

ライセンス: Link先を確認

Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert

(参考訳) 本稿では,音声研究に適した多言語コーパスであるMLSデータセットを提案する。データセットは、LibriVoxの読み上げオーディオブックから派生したもので、英語の約44.5K時間、他の言語で約6K時間を含む8言語で構成されている。さらに,言語モデル(LM)とベースライン自動音声認識(ASR)モデル,およびデータセットのすべての言語に対して提供する。このような大きな転写されたデータセットは、ASR と Text-To-Speech (TTS) 研究に新たな道を開くと信じている。データセットはhttp://www.openslr.orgの誰でも自由に利用できる。

This paper introduces the Multilingual LibriSpeech (MLS) dataset, a large multilingual corpus suitable for speech research. The dataset is derived from read audiobooks from LibriVox and consists of 8 languages, including about 44.5K hours of English and a total of about 6K hours for other languages. Additionally, we provide Language Models (LM) and baseline Automatic Speech Recognition (ASR) models for all the languages in our dataset. We believe such a large transcribed dataset will open new avenues in ASR and Text-To-Speech (TTS) research. The dataset will be made freely available for anyone at http://www.openslr.org.

翻訳日:2021-05-16 21:02:42 公開日:2020-12-19

# ニューラルネットワーク分類器の最終層および最後層における四面体対称性の出現について

On the emergence of tetrahedral symmetry in the final and penultimate layers of neural network classifiers ( http://arxiv.org/abs/2012.05420v2 )

ライセンス: Link先を確認

Weinan E and Stephan Wojtowytsch

(参考訳) 最近の数値的な研究により、ニューラルネットワーク分類器は有極層において大きな対称性を持つことがわかった。すなわち、$h(x) = af(x) +b$ ここで$a$が線型写像であり、$f$がネットワークのペナルティマイト層の出力である場合(活性化後)、すべてのデータポイント $x_{i, 1}, \dots, x_{i, n_i}$ はクラス $c_i$ で 1 つの点 $y_i$ にマッピングされ、ポイント $y_i$ は高次元ユークリッド空間における通常の $k-1$ 次元四面体の頂点に位置する。本研究は,高表現性深層ニューラルネットワークの玩具モデルで解析的に説明する。補完的な例では、$h$ が浅いネットワークであれば $c_i$ クラスからのデータサンプルよりも、$h$ の最終的な出力が均一でないことを厳密に示します(あるいは、より深い層がデータサンプルを便利な幾何学的構成にしない場合)。

A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) +b$ where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i, 1}, \dots, x_{i, N_i}$ in a class $C_i$ are mapped to a single point $y_i$ by $f$ and the points $y_i$ are located at the vertices of a regular $k-1$-dimensional tetrahedron in a high-dimensional Euclidean space. We explain this observation analytically in toy models for highly expressive deep neural networks. In complementary examples, we demonstrate rigorously that even the final output of the classifier $h$ is not uniform over data samples from a class $C_i$ if $h$ is a shallow network (or if the deeper layers do not bring the data samples into a convenient geometric configuration).
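To make the geometric claim concrete, the following minimal sketch builds the vertices of a regular $(k-1)$-dimensional simplex and exposes its symmetry: all pairwise distances are equal and every pair of vertices has the same cosine, $-1/(k-1)$. The construction (centered standard basis vectors) and the function names are illustrative assumptions, not taken from the paper.

```python
import itertools
import math

def simplex_vertices(k):
    """Return k points in R^k forming a regular (k-1)-dimensional simplex.

    Illustrative construction (an assumption, not the paper's): the standard
    basis vectors, shifted so that their centroid is the origin.
    """
    return [[(1.0 if i == j else 0.0) - 1.0 / k for j in range(k)]
            for i in range(k)]

def pairwise_distances(points):
    """Euclidean distance between every unordered pair of points."""
    return [math.dist(p, q) for p, q in itertools.combinations(points, 2)]

def cosine(p, q):
    """Cosine of the angle between two vectors, measured from the origin."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.hypot(*p) * math.hypot(*q))

# Example with k = 4 classes: an ordinary tetrahedron.
vertices = simplex_vertices(4)
distances = pairwise_distances(vertices)
```

For `k = 4` the six pairwise distances all equal $\sqrt{2}$ and each pair of vertices has cosine $-1/3$, which is the configuration the abstract describes for the class points $y_i$.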

翻訳日:2021-05-15 06:35:44 公開日:2020-12-19

# 可視領域のセグメンテーションと形状を考慮したアモーダルセグメンテーション

Amodal Segmentation Based on Visible Region Segmentation and Shape Prior ( http://arxiv.org/abs/2012.05598v2 )

ライセンス: Link先を確認

Yuting Xiao, Yanyu Xu, Ziming Zhong, Weixin Luo, Jiawei Li, Shenghua Gao

(参考訳) 既存のアモダルセグメンテーション手法のほとんど全ては、画像全体に対応する特徴を用いてオクルード領域の推論を行う。これは人間のアモーダル知覚に反し、人間の目に見える部分と、対象の事前の知識を使って、隠された領域を推測する。人間の振る舞いを模倣し,学習の曖昧さを解決するために,まず,粗い目に見えるマスクと粗いアモーダルマスクを推定する枠組みを提案する。そして、粗い予測に基づいて、我々のモデルは、可視領域に集中し、メモリに先行する形状を利用してアモーダルマスクを推定する。これにより、アモーダルマスク推定において、背景と閉塞に対応する特徴を抑えることができる。その結果、アモダルマスクは、オクルージョンが同じ可視領域に与えられるものの影響を受けない。以前の形状の活用により、アモーダルマスクの推定はより堅牢で合理的になる。提案モデルは3つのデータセットで評価される。実験の結果,提案手法は既存手法よりも優れていた。形状の可視化は、コードブックのカテゴリ固有の特徴がある程度解釈可能であることを示している。

Almost all existing amodal segmentation methods infer occluded regions by using features corresponding to the whole image. This is contrary to human amodal perception, where humans use the visible part and prior knowledge of the target's shape to infer the occluded region. To mimic human behavior and resolve the ambiguity in learning, we propose a framework that first estimates a coarse visible mask and a coarse amodal mask. Then, based on the coarse prediction, our model infers the amodal mask by concentrating on the visible region and utilizing the shape prior in the memory. In this way, features corresponding to background and occlusion can be suppressed for amodal mask estimation. Consequently, given the same visible regions, the amodal mask is not affected by what the occlusion is. Leveraging the shape prior makes the amodal mask estimation more robust and reasonable. Our proposed model is evaluated on three datasets. Experiments show that our proposed model outperforms existing state-of-the-art methods. The visualization of the shape prior indicates that the category-specific features in the codebook are interpretable to some extent.

翻訳日:2021-05-15 06:22:56 公開日:2020-12-19

# OpenHoldem: 大規模不完全な情報ゲーム研究のためのオープンツールキット

OpenHoldem: An Open Toolkit for Large-Scale Imperfect-Information Game Research ( http://arxiv.org/abs/2012.06168v2 )

ライセンス: Link先を確認

Kai Li, Hang Xu, Meng Zhang, Enmin Zhao, Zhe Wu, Junliang Xing, Kaiqi Huang

(参考訳) 少数の研究所による未許可の努力に則って、大規模不完全情報ゲーム研究の主要な試験場であるNo-Limit Texas Hold'em (NLTH)における超人的AIの設計において、近年大きな進歩が見られた。しかし、既存の手法と比較するための標準ベンチマークがないため、新しい研究者がこの問題を研究することは依然として困難であり、この研究領域のさらなる発展を著しく妨げている。本研究では,NLTHを用いた大規模不完全情報ゲーム研究用統合ツールキットOpenHoldemを提案する。 1)異なるnlth aisを徹底的に評価するための標準化された評価プロトコル、2)nlth aiの3つの公に利用可能な強力なベースライン、3)公開nlth ai評価のための使いやすいapiを備えたオンラインテスティングプラットフォーム。我々はopenholdemをhttp://holdem.ia.ac.cn/でリリースし、この分野における未解決の理論的および計算的問題に関するさらなる研究を促進し、敵モデリング、大規模平衡探索、人間-コンピュータ対話学習といった重要な研究課題を育むことを願っている。

Owing to the unremitting efforts of a few institutes, significant progress has recently been made in designing superhuman AIs in No-limit Texas Hold'em (NLTH), the primary testbed for large-scale imperfect-information game research. However, it remains challenging for new researchers to study this problem since there are no standard benchmarks for comparing with existing methods, which seriously hinders further developments in this research area. In this work, we present OpenHoldem, an integrated toolkit for large-scale imperfect-information game research using NLTH. OpenHoldem makes three main contributions to this research direction: 1) a standardized evaluation protocol for thoroughly evaluating different NLTH AIs, 2) three publicly available strong baselines for NLTH AI, and 3) an online testing platform with easy-to-use APIs for public NLTH AI evaluation. We have released OpenHoldem at http://holdem.ia.ac.cn/, hoping it facilitates further studies on the unsolved theoretical and computational issues in this area and cultivates crucial research problems like opponent modeling, large-scale equilibrium-finding, and human-computer interactive learning.

翻訳日:2021-05-11 03:07:24 公開日:2020-12-19

# (参考訳) on-device full neural end-to-end automatic speech recognition algorithmのレビュー

A review of on-device fully neural end-to-end automatic speech recognition algorithms ( http://arxiv.org/abs/2012.07974v2 )

ライセンス: CC BY 4.0

Chanwoo Kim, Dhananjaya Gowda, Dongsoo Lee, Jiyeon Kim, Ankur Kumar, Sungsoo Kim, Abhinav Garg, and Changwoo Han

(参考訳) 本稿では,デバイス上での音声認識アルゴリズムとその最適化手法について述べる。従来の音声認識システムは、音響モデル、言語モデル、発音モデル、テキスト正規化器、逆テキスト正規化器、重み付き有限状態変換器(WFST)に基づくデコーダなど、多数の独立したコンポーネントで構成されている。従来の音声認識システムで十分高い音声認識精度を得るには、通常、非常に大きな言語モデル(最大100GB)が必要である。したがって、対応するWFSTサイズは巨大になり、デバイス上での実装が禁止される。近年,完全ニューラルネットワークのエンドツーエンド音声認識アルゴリズムが提案されている。例えば、コネクショニスト時間分類(CTC)に基づく音声認識システム、リカレントニューラルネットワークトランスデューサ(RNN-T)、アテンションベースエンコーダ-デコーダモデル(AED)、モノトニックチャンク-ワイドアテンション(MoChA)、トランスフォーマーベース音声認識システムなどである。これらのニューラルネットワークベースのシステムでは、従来のアルゴリズムに比べてメモリフットプリントがはるかに小さいため、デバイス上での実装が実現可能になっている。本稿では,このようなエンドツーエンド音声認識モデルについてレビューする。従来のアルゴリズムと比較して,それらの構造,性能,利点を広く論じる。

In this paper, we review various end-to-end automatic speech recognition algorithms and their optimization techniques for on-device applications. Conventional speech recognition systems comprise a large number of discrete components such as an acoustic model, a language model, a pronunciation model, a text-normalizer, an inverse-text normalizer, a decoder based on a Weighted Finite State Transducer (WFST), and so on. To obtain sufficiently high speech recognition accuracy with such conventional speech recognition systems, a very large language model (up to 100 GB) is usually needed. Hence, the corresponding WFST size becomes enormous, which prohibits their on-device implementation. Recently, fully neural network end-to-end speech recognition algorithms have been proposed. Examples include speech recognition systems based on Connectionist Temporal Classification (CTC), Recurrent Neural Network Transducer (RNN-T), Attention-based Encoder-Decoder models (AED), Monotonic Chunk-wise Attention (MoChA), transformer-based speech recognition systems, and so on. These fully neural network-based systems require much smaller memory footprints compared to conventional algorithms; therefore, their on-device implementation has become feasible. In this paper, we review such end-to-end speech recognition models. We extensively discuss their structures, performance, and advantages compared to conventional algorithms.

翻訳日:2021-05-08 17:20:34 公開日:2020-12-19

# (参考訳) LSTMによる高効率建築エネルギー管理に向けた空間占有予測

LSTM-based Space Occupancy Prediction towards Efficient Building Energy Management ( http://arxiv.org/abs/2012.08114v2 )

ライセンス: CC BY 4.0

Juye Kim

(参考訳) 建物で消費されるエネルギーは、総エネルギー使用量のかなりの部分を占める。大量の建築エネルギーは、暖房、冷却、換気、空調(HVAC)に使用される。しかし、その重要性に比較して、近年のエネルギー管理システムの構築は、単純なルールベース制御(RBC)技術に基づくHVACの制御に限られている。空調を効率的に管理できるシステムを設計する能力は、エネルギー使用量と温室効果ガス排出量を減らすことができる。本稿では,LSTMを用いた占領パターンの時系列予測モデルを提案する。 HVACの動作には、次の時間帯(例えば、次の30分)における将来の部屋占有状況の予測信号を直接使用することができる。例えば、予知と冷却または加熱の時間を考慮すると、部屋が使用される前にHVACをオンにすることができる(例えば、10分前にオンにする)。また、次の部屋の空いた予測タイミングに基づき、HVACを早期にオフにすることができ、快適さを損なわずにHVACの効率を高めるのに役立つ。大学ビルの複数の部屋から収集した実世界のエネルギーデータを用いて,本手法の能力を示す。 LSTMの部屋占有予測に基づくHVAC制御は,従来のRBC制御と比較してエネルギー使用量を50%削減できることを示した。

Energy consumed in buildings accounts for a significant portion of total global energy usage. A large amount of building energy is used for heating, cooling, ventilation, and air-conditioning (HVAC). However, despite its importance, building energy management systems nowadays are limited to controlling HVAC based on simple rule-based control (RBC) technologies. The ability to design systems that can efficiently manage HVAC can reduce energy usage and greenhouse gas emissions, and, all in all, it can help us to mitigate climate change. This paper proposes predictive time-series models of occupancy patterns using LSTM. The prediction signal for future room occupancy status in the next time span (e.g., next 30 minutes) can be directly used to operate HVAC. For example, based on the prediction and considering the time for cooling or heating, HVAC can be turned on before the room is used (e.g., turned on 10 minutes earlier). Also, based on the predicted time at which the room will next be empty, HVAC can be turned off earlier, which can help increase the efficiency of HVAC without decreasing comfort. We demonstrate our approach's capabilities using real-world energy data collected from multiple rooms of a university building. We show that HVAC control based on LSTM room occupancy prediction could reduce energy usage by 50% compared to conventional RBC-based control.
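The pre-start idea in this abstract can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the occupancy forecast is given as one boolean per fixed-length time slot, and the function name `hvac_schedule` and the parameter `lead_slots` are hypothetical. It models only the "turn on earlier" rule, not the early shut-off or the LSTM predictor itself.

```python
def hvac_schedule(predicted_occupied, lead_slots=1):
    """Turn HVAC on `lead_slots` slots before each predicted-occupied slot.

    predicted_occupied: list of booleans, one per time slot (hypothetical
    output of an occupancy predictor such as an LSTM).
    Returns a list of booleans; True means HVAC is on during that slot.
    """
    n = len(predicted_occupied)
    on = [False] * n
    for t, occupied in enumerate(predicted_occupied):
        if occupied:
            # Cover the occupied slot and the pre-conditioning slots before it.
            for s in range(max(0, t - lead_slots), t + 1):
                on[s] = True
    return on

# Six slots (e.g. 10 minutes each, an assumed slot length); the room is
# predicted to be occupied in slots 2 and 3.
prediction = [False, False, True, True, False, False]
schedule = hvac_schedule(prediction, lead_slots=1)
```

With one lead slot, HVAC switches on in slot 1, one slot ahead of the predicted occupancy, and stays on through slot 3.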

翻訳日:2021-05-08 11:45:00 公開日:2020-12-19

# 限定的なコミュニケーション下でのマルチエージェントコラボレーションによる分散オンラインメタラーニングの高速化

Accelerating Distributed Online Meta-Learning via Multi-Agent Collaboration under Limited Communication ( http://arxiv.org/abs/2012.08660v2 )

ライセンス: Link先を確認

Sen Lin, Mehmet Dedeoglu and Junshan Zhang

(参考訳) IoTエコシステムにおけるエッジインテリジェンスの実現を可能にする技術として,オンラインメタ学習が登場している。それでも、タスク内高速適応のための優れたメタモデルを学ぶには、単一のエージェントだけで多くのタスクを学習する必要がある。マルチエージェントネットワークにおいて、異なるエージェント間の学習タスクがモデル類似性を共有することが多いことを観察するため、我々は、以下の根本的な疑問に答える:「限られたコミュニケーションと、どの程度の利益が達成できるかどうかによって、エージェント間のオンラインメタラーニングを加速することは可能か? そこで本研究では,マルチエージェントオンラインメタラーニングフレームワークを提案し,それと同等の2レベルネスト型オンライン凸最適化(oco)問題として位置づける。エージェントタスク平均的後悔の上限を特徴づけることで、マルチエージェントオンラインメタ学習の性能は、限られた通信によるメタモデル更新において、エージェントが分散ネットワークレベルのOCOからどれだけ恩恵を受けられるかに大きく依存することを示したが、よく理解されていない。この課題に取り組むために、我々は分散オンライン勾配降下アルゴリズムを考案し、各エージェントが1イテレーションあたり1回の通信ステップだけを使用してグローバル勾配を追跡し、その結果、エージェントあたりの平均後悔額$o(\sqrt{t/n})$が、最適なシングルエージェントの後悔額$o(\sqrt{t})$が、t$イテレーションの後に$n$がエージェント数であることを示す。この急激な性能向上を基盤として,マルチエージェントのオンラインメタ学習アルゴリズムを開発し,単一エージェントのオンラインメタ学習と比較して,O(1/\sqrt{NT})$の速さで最適なタスク平均後悔を達成可能であることを示す。広範な実験は理論結果を裏付ける。

Online meta-learning is emerging as an enabling technique for achieving edge intelligence in the IoT ecosystem. Nevertheless, to learn a good meta-model for within-task fast adaptation, a single agent alone has to learn over many tasks, and this is the so-called 'cold-start' problem. Observing that in a multi-agent network the learning tasks across different agents often share some model similarity, we ask the following fundamental question: "Is it possible to accelerate the online meta-learning across agents via limited communication and if yes how much benefit can be achieved?" To answer this question, we propose a multi-agent online meta-learning framework and cast it as an equivalent two-level nested online convex optimization (OCO) problem. By characterizing the upper bound of the agent-task-averaged regret, we show that the performance of multi-agent online meta-learning depends heavily on how much an agent can benefit from the distributed network-level OCO for meta-model updates via limited communication, which however is not well understood. To tackle this challenge, we devise a distributed online gradient descent algorithm with gradient tracking where each agent tracks the global gradient using only one communication step with its neighbors per iteration, and it results in an average regret $O(\sqrt{T/N})$ per agent, indicating a factor of $\sqrt{1/N}$ speedup over the optimal single-agent regret $O(\sqrt{T})$ after $T$ iterations, where $N$ is the number of agents. Building on this sharp performance speedup, we next develop a multi-agent online meta-learning algorithm and show that it can achieve the optimal task-average regret at a faster rate of $O(1/\sqrt{NT})$ via limited communication, compared to single-agent online meta-learning. Extensive experiments corroborate the theoretic results.
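As a rough worked example of the stated speedup, read informally (constants and lower-order terms are ignored, and the value $N = 16$ is chosen purely for illustration):

$$
\frac{O(\sqrt{T/N})}{O(\sqrt{T})} = \sqrt{\frac{1}{N}}, \qquad N = 16 \;\Rightarrow\; \sqrt{\frac{1}{16}} = \frac{1}{4}.
$$

Under this reading, 16 cooperating agents would each incur roughly a quarter of the single-agent regret bound after the same number of iterations $T$.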

翻訳日:2021-05-07 05:05:02 公開日:2020-12-19

# 高速かつ連続的なエッジ学習のための非接触ADMMに基づくフェデレーションメタラーニング

Inexact-ADMM Based Federated Meta-Learning for Fast and Continual Edge Learning ( http://arxiv.org/abs/2012.08677v2 )

ライセンス: Link先を確認

Sheng Yue, Ju Ren, Jiang Xin, Sen Lin, Junshan Zhang

(参考訳) 多くのIoTアプリケーションのパフォーマンス、安全性、レイテンシの要件を満たすために、インテリジェントな決定をここでネットワークエッジで行う必要があります。しかし、制約のあるリソースと限られたローカルデータ量は、エッジAIの開発に重大な課題をもたらす。これらの課題を克服するため,我々は,先行課題からの知識伝達を活用できる連続エッジ学習について検討する。高速かつ連続的なエッジ学習の実現を目的として,エッジノードが協調してメタモデルを学習する,プラットフォーム支援型統合メタ学習アーキテクチャを提案する。エッジ学習問題を正規化最適化問題として、従来のタスクから得られた貴重な知識を正規化として抽出する。次に,admmベースのフェデレーションメタラーニングアルゴリズムであるadmm-fedmetaを考案する。admmは元の問題を多数のサブ問題に分解する自然なメカニズムを提供し,エッジノードとプラットフォーム間で並列に解くことができる。さらに、線形近似とヘッセン推定によって部分問題が解かれるような inexact-admm 法の変種を用い、1ラウンドあたりの計算コストを$\mathcal{o}(n)$ に削減する。一般の非凸の場合において,ADMM-FedMetaの収束特性,迅速な適応性能,事前知識伝達の忘れ効果を総合的に分析する。大規模な実験ではADMM-FedMetaの有効性と効率が示され、既存のベースラインを大きく上回っている。

In order to meet the requirements for performance, safety, and latency in many IoT applications, intelligent decisions must be made right here, right now at the network edge. However, the constrained resources and limited amount of local data pose significant challenges to the development of edge AI. To overcome these challenges, we explore continual edge learning capable of leveraging the knowledge transfer from previous tasks. Aiming to achieve fast and continual edge learning, we propose a platform-aided federated meta-learning architecture where edge nodes collaboratively learn a meta-model, aided by the knowledge transfer from prior tasks. The edge learning problem is cast as a regularized optimization problem, where the valuable knowledge learned from previous tasks is extracted as regularization. Then, we devise an ADMM based federated meta-learning algorithm, namely ADMM-FedMeta, where ADMM offers a natural mechanism to decompose the original problem into many subproblems which can be solved in parallel across edge nodes and the platform. Further, a variant of the inexact-ADMM method is employed where the subproblems are 'solved' via linear approximation as well as Hessian estimation to reduce the computational cost per round to $\mathcal{O}(n)$. We provide a comprehensive analysis of ADMM-FedMeta, in terms of the convergence properties, the rapid adaptation performance, and the forgetting effect of prior knowledge transfer, for the general non-convex case. Extensive experimental studies demonstrate the effectiveness and efficiency of ADMM-FedMeta, and showcase that it substantially outperforms the existing baselines.

翻訳日:2021-05-03 02:48:57 公開日:2020-12-19

# (参考訳) 臨床領域適応による胸部X線写真におけるコンピュータ支援異常検出

Computer-aided abnormality detection in chest radiographs in a clinical setting via domain-adaptation ( http://arxiv.org/abs/2012.10564v1 )

ライセンス: CC BY 4.0

Abhishek K Dubey, Michael T Young, Christopher Stanley, Dalton Lunga, Jacob Hinkle

(参考訳) 深層学習(DL)モデルは、放射線医が胸部X線写真から肺疾患の診断を助けるために医療センターに配備されている。このようなモデルは、しばしば多くの公開ラベル付きラジオグラフィーで訓練される。これらの訓練済みDLモデルが臨床現場で一般化する能力は、公開と非公開のラジオグラフィー間のデータ分布の変化のため、貧弱である。胸部X線写真では、分布の不均一性はX線装置の様々な条件と画像の生成に使用される構成から生じる。機械学習のコミュニティでは、データ生成ソースの多様性によって生じる課題はドメインシフトと呼ばれ、これは生成モデルのモードシフトである。本研究では,ドメインシフト検出と除去手法を導入し,この問題を克服する。臨床における胸部x線画像の異常検出のための事前訓練したdlモデルの導入における提案手法の有効性について検討した。

Deep learning (DL) models are being deployed at medical centers to aid radiologists in the diagnosis of lung conditions from chest radiographs. Such models are often trained on a large volume of publicly available labeled radiographs. These pre-trained DL models' ability to generalize in clinical settings is poor because of the changes in data distributions between publicly available and privately held radiographs. In chest radiographs, the heterogeneity in distributions arises from the diverse conditions in X-ray equipment and their configurations used for generating the images. In the machine learning community, the challenges posed by the heterogeneity in the data generation source are known as domain shift, which is a mode shift in the generative model. In this work, we introduce a domain-shift detection and removal method to overcome this problem. Our experimental results show the proposed method's effectiveness in deploying a pre-trained DL model for abnormality detection in chest radiographs in a clinical setting.

翻訳日:2021-05-01 17:09:08 公開日:2020-12-19

# (参考訳) 多要素ベイズニューラルネットワーク:アルゴリズムとその応用

Multi-fidelity Bayesian Neural Networks: Algorithms and Applications ( http://arxiv.org/abs/2012.13294v1 )

ライセンス: CC BY 4.0

Xuhui Meng, Hessam Babaee, and George Em Karniadakis

(参考訳) 本稿では,可変忠実性のノイズデータを用いて学習可能なベイズ型ニューラルネットワーク(bnns)の新たなクラスを提案し,関数近似の学習や偏微分方程式(pdes)に基づく逆問題を解く。 BNNは3つのニューラルネットワークで構成されている: 1つは完全連結ニューラルネットワークで、これは低忠実度データに適合する最大アプテリ確率(MAP)法に従って訓練され、2つ目は低忠実度データと高忠実度データの間の不確実性定量化による相互相関を捉えるために使用されるベイズニューラルネットワーク、そしてもう1つは物理情報処理で記述された物理法則を符号化するニューラルネットワークである。最後の2つのニューラルネットワークのトレーニングのために、ハミルトニアンモンテカルロ法を用いて、対応するハイパーパラメータの後方分布を正確に推定する。本稿では合成データと実測値を用いて,本手法の精度を示す。具体的には、まず1次元と4次元の関数を近似し、1次元と2次元の拡散反応系の反応速度を推定する。さらに,マサチューセッツ州およびケープコッド湾の海面温度 (sst) を衛星画像とその場測定を用いて推定した。その結果,本手法は低次・高次データ間の線形および非線形の相関関係を適応的に捉え,未知パラメータをPDEで同定し,ノイズの多い高忠実度データから予測の不確かさを定量化できることを示した。最後に,特定の一次元関数近似と逆PDE問題を用いて,不確かさを効果的かつ効率的に低減し,能動的学習手法による予測精度を向上させることを実証した。

We propose a new class of Bayesian neural networks (BNNs) that can be trained using noisy data of variable fidelity, and we apply them to learn function approximations as well as to solve inverse problems based on partial differential equations (PDEs). These multi-fidelity BNNs consist of three neural networks: The first is a fully connected neural network, which is trained following the maximum a posteriori probability (MAP) method to fit the low-fidelity data; the second is a Bayesian neural network employed to capture the cross-correlation with uncertainty quantification between the low- and high-fidelity data; and the last one is the physics-informed neural network, which encodes the physical laws described by PDEs. For the training of the last two neural networks, we use the Hamiltonian Monte Carlo method to estimate accurately the posterior distributions for the corresponding hyperparameters. We demonstrate the accuracy of the present method using synthetic data as well as real measurements. Specifically, we first approximate a one- and four-dimensional function, and then infer the reaction rates in one- and two-dimensional diffusion-reaction systems. Moreover, we infer the sea surface temperature (SST) in the Massachusetts and Cape Cod Bays using satellite images and in-situ measurements. Taken together, our results demonstrate that the present method can capture both linear and nonlinear correlation between the low- and high-fidelity data adaptively, identify unknown parameters in PDEs, and quantify uncertainties in predictions, given a few scattered noisy high-fidelity data. Finally, we demonstrate that we can effectively and efficiently reduce the uncertainties and hence enhance the prediction accuracy with an active learning approach, using as examples a specific one-dimensional function approximation and an inverse PDE problem.

翻訳日:2021-05-01 16:56:07 公開日:2020-12-19

# (参考訳) T-GAP: 時間的知識グラフ補完のための歩行学習

T-GAP: Learning to Walk across Time for Temporal Knowledge Graph Completion ( http://arxiv.org/abs/2012.10595v1 )

ライセンス: CC BY 4.0

Jaehun Jung, Jinhong Jung, U Kang

(参考訳) 時間的知識グラフ(TKG)は、静的知識グラフとは対照的に、本質的に現実世界の知識の過渡的な性質を反映している。自然に、自動tkg補完はリレーショナル推論のより現実的なモデリングのために多くの研究の関心を集めている。しかし、既存のTKGコンプリート用モジュールのほとんどは、TKG構造を完全に活用しない静的KG埋め込みを拡張しており、1)クエリのlo-cal地区にすでに存在する時間的関連イベントのアカウント化、2)マルチホップ推論とより良い解釈性を促進するパスベースの推論を欠いている。本稿では,そのエンコーダとデコーダにおける時間情報とグラフ構造の両方を最大限に活用するTKG補完の新しいモデルであるT-GAPを提案する。 T-GAPは、各イベントとクエリタイムスタンプ間の時間的変位に着目して、TKGのクエリ固有のサブ構造を符号化し、グラフを通して注意を伝播することでパスベースの推論を行う。我々の実証実験は、T-GAPが最先端のベースラインに対して優れた性能を発揮するだけでなく、目に見えないタイムスタンプを持つクエリにも有能に一般化できることを示した。また, T-GAPは透明な解釈性から, その推論過程において人間の直感に従うことが示唆された。

Temporal knowledge graphs (TKGs) inherently reflect the transient nature of real-world knowledge, as opposed to static knowledge graphs. Naturally, automatic TKG completion has drawn much research interest for a more realistic modeling of relational reasoning. However, most of the existing models for TKG completion extend static KG embeddings that do not fully exploit TKG structure, thus lacking in 1) accounting for temporally relevant events already residing in the local neighborhood of a query, and 2) path-based inference that facilitates multi-hop reasoning and better interpretability. In this paper, we propose T-GAP, a novel model for TKG completion that maximally utilizes both temporal information and graph structure in its encoder and decoder. T-GAP encodes query-specific substructure of TKG by focusing on the temporal displacement between each event and the query timestamp, and performs path-based inference by propagating attention through the graph. Our empirical experiments demonstrate that T-GAP not only achieves superior performance against state-of-the-art baselines, but also competently generalizes to queries with unseen timestamps. Through extensive qualitative analyses, we also show that T-GAP enjoys transparent interpretability, and follows human intuition in its reasoning process.

翻訳日:2021-05-01 16:54:57 公開日:2020-12-19

# (参考訳) マルチデコーダアテンションモデルによる車両経路問題の可視化

Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle Routing Problems ( http://arxiv.org/abs/2012.10638v1 )

ライセンス: CC BY 4.0

Liang Xin, Wen Song, Zhiguang Cao, Jie Zhang

(参考訳) 車両経路問題に対する建設ヒューリスティックスを学習するための新しい強化学習手法を提案する。具体的には,多種多様なポリシーを学習するためのMDAM(Multi-Decoder Attention Model)を提案する。 MDAMの多様性を完全に活用するために、カスタマイズされたビームサーチ戦略が設計されている。また,提案手法では,mdamにおける再帰的構造に基づく埋め込みの可視化層を提案し,より情報的な埋め込みを提供することで,各ポリシーの質を向上させることができる。 6種類の経路問題に対する広範囲な実験により,本手法が最先端のディープラーニングモデルを大きく上回っていることが示された。

We present a novel deep reinforcement learning method to learn construction heuristics for vehicle routing problems. Specifically, we propose a Multi-Decoder Attention Model (MDAM) to train multiple diverse policies, which effectively increases the chance of finding good solutions compared with existing methods that train only one policy. A customized beam search strategy is designed to fully exploit the diversity of MDAM. In addition, we propose an Embedding Glimpse layer in MDAM based on the recursive nature of construction, which can improve the quality of each policy by providing more informative embeddings. Extensive experiments on six different routing problems show that our method significantly outperforms the state-of-the-art deep learning based models.

翻訳日:2021-05-01 16:06:12 公開日:2020-12-19

# (参考訳) 行動認識のためのSMARTフレーム選択

SMART Frame Selection for Action Recognition ( http://arxiv.org/abs/2012.10671v1 )

ライセンス: CC BY 4.0

Shreyank N Gowda, Marcus Rohrbach, Laura Sevilla-Lara

(参考訳) 動作認識は計算コストが高い。本稿では,アクション認識の精度を向上させるために,フレーム選択の問題に対処する。特に,優れたフレームの選択は,トリミングされたビデオ領域においても行動認識性能に寄与することを示す。最近の研究は、多くのコンテンツが関係なく、廃棄が容易な長いビデオに対して、フレーム選択の活用に成功している。しかし、本研究では、より標準的でトリミングされた行動認識問題に焦点を当てる。優れたフレーム選択は、行動認識の計算コストを削減できるだけでなく、分類が難しいフレームを除去することで精度を向上させることができると論じる。従来の研究とは対照的に,フレームの選択を一度に考えるのではなく,共同で考える手法を提案する。これにより、ストーリーを語るスナップショットなど、優れたフレームがビデオ上でより効果的に分散する、より効率的な選択が可能になる。提案したフレーム選択SMARTを,異なるバックボーンアーキテクチャと複数のベンチマーク(Kinetics, Something-something, UCF101)で組み合わせてテストする。 SMARTフレーム選択は,計算コストを4倍から10倍に削減しつつ,他のフレーム選択方法と比較して常に精度を向上することを示す。さらに,認識性能を第一の目標とする場合には,近年の最先端モデルや各種ベンチマーク(UCF101, HMDB51, FCVID, ActivityNet)のフレーム選択戦略よりも優れた選択戦略を実現できることを示す。

Action recognition is computationally expensive. In this paper, we address the problem of frame selection to improve the accuracy of action recognition. In particular, we show that selecting good frames helps in action recognition performance even in the trimmed videos domain. Recent work has successfully leveraged frame selection for long, untrimmed videos, where much of the content is not relevant, and easy to discard. In this work, however, we focus on the more standard short, trimmed action recognition problem. We argue that good frame selection can not only reduce the computational cost of action recognition but also increase the accuracy by getting rid of frames that are hard to classify. In contrast to previous work, we propose a method that, instead of selecting frames by considering one at a time, considers them jointly. This results in a more efficient selection, where good frames are more effectively distributed over the video, like snapshots that tell a story. We call the proposed frame selection SMART and we test it in combination with different backbone architectures and on multiple benchmarks (Kinetics, Something-something, UCF101). We show that the SMART frame selection consistently improves the accuracy compared to other frame selection strategies while reducing the computational cost by a factor of 4 to 10. Additionally, we show that when the primary goal is recognition performance, our selection strategy can improve over recent state-of-the-art models and frame selection strategies on various benchmarks (UCF101, HMDB51, FCVID, and ActivityNet).

翻訳日:2021-05-01 15:21:52 公開日:2020-12-19

# (参考訳) コンフューズド・モデュロ・プロジェクションに基づくホモモルフィック暗号化 -セキュアスマートシティにおける暗号システム, ライブラリおよび応用-

Confused Modulo Projection based Somewhat Homomorphic Encryption -- Cryptosystem, Library and Applications on Secure Smart Cities ( http://arxiv.org/abs/2012.10692v1 )

ライセンス: CC BY 4.0

Xin Jin, Hongyu Zhang, Xiaodong Li, Haoyang Yu, Beisheng Liu, Shujiang Xie, Amit Kumar Singh and Yujie Li

(参考訳) クラウドコンピューティングの発展に伴い、大規模なビジュアルメディアデータのストレージと処理は徐々にクラウドサーバに移されていった。例えば、インテリジェントなビデオ監視システムが大量のデータをローカルで処理できない場合、データはクラウドにアップロードされる。そのため、元のデータを露呈することなくクラウドでデータを処理する方法が重要な研究テーマとなっている。そこで我々は,CMP-SWHEという混同したモジュラープロジェクション定理に基づく暗号システムの単一サーババージョンを提案し,サーバがユーザデータの有効情報を「emph{seeing}」することなく,ブラインドデータ処理を完了できるようにする。クライアント側では、元のデータは増幅、ランダム化、紛らわしい冗長性の設定によって暗号化される。サーバ側で暗号化されたデータを操作することは、元のデータ上での操作と同等である。拡張として,バッチ処理技術に基づく高速化バージョンによるブラインドコンピューティング方式を設計,実装し,効率を向上した。このアルゴリズムを使いやすくするために、cmp-swheに基づく効率的な汎用ブラインドコンピューティングライブラリを設計し実装した。我々は,このライブラリを,スマートシティ構築に有用な,フォアグラウンド抽出,オプティカルフロー追跡,オブジェクト検出に応用した。アルゴリズムをディープラーニングアプリケーションに拡張する方法についても論じる。他の同型暗号システムやライブラリと比較すると,本手法は計算効率において明らかに有利である。我々のアルゴリズムは、データが大きすぎると小さなエラー(10^{-6}$)があるが、非常に効率的で実用的であり、特にブラインド画像やビデオ処理に適している。

With the development of cloud computing, the storage and processing of massive visual media data has gradually moved to the cloud server. For example, if the intelligent video monitoring system cannot process a large amount of data locally, the data will be uploaded to the cloud. Therefore, how to process data in the cloud without exposing the original data has become an important research topic. We propose a single-server version of a somewhat homomorphic encryption cryptosystem based on the confused modulo projection theorem, named CMP-SWHE, which allows the server to complete blind data processing without "seeing" the effective information of user data. On the client side, the original data is encrypted by amplification, randomization, and setting confusing redundancy. Operating on the encrypted data on the server side is equivalent to operating on the original data. As an extension, we designed and implemented an accelerated version of the blind computing scheme based on batch processing technology to improve efficiency. To make this algorithm easy to use, we also designed and implemented an efficient general blind computing library based on CMP-SWHE. We have applied this library to foreground extraction, optical flow tracking and object detection with satisfactory results, which are helpful for building smart cities. We also discuss how to extend the algorithm to deep learning applications. Compared with other homomorphic encryption cryptosystems and libraries, the results show that our method has obvious advantages in computing efficiency. Although our algorithm has some tiny errors ($10^{-6}$) when the data is too large, it is very efficient and practical, especially suitable for blind image and video processing.

翻訳日:2021-05-01 15:07:03 公開日:2020-12-19

# (参考訳) ファジィ認知マップの進化的アルゴリズム

Evolutionary Algorithms for Fuzzy Cognitive Maps ( http://arxiv.org/abs/2102.01012v1 )

ライセンス: CC BY 4.0

Stefanos Tsimenidis

(参考訳) ファジィ認知マップ(fcms)は複雑なシステムモデリング手法であり、その特異な利点のために最近人気が高まっている。それらは、モデル化されるシステムのパラメータ間の因果関係を表すグラフに基づいており、解釈可能性と柔軟性のために際立っている。近年のFCMの普及に伴い、モデルの開発と最適化のための研究が数多く行われている。 FCMの最も重要な要素の1つは、彼らが使用する学習アルゴリズムであり、その有効性は、主にそれによって決定される。学習アルゴリズムは、所望の行動に収束することを目的として、fcmのノード重みを学習する。本研究は、FCMの学習に使用される遺伝的アルゴリズムを概説するとともに、FCM学習アルゴリズムの概要を概説し、進化的コンピューティングをより広い文脈に導入する。

Fuzzy Cognitive Maps (FCMs) are a complex systems modeling technique which, due to its unique advantages, has lately risen in popularity. They are based on graphs that represent the causal relationships among the parameters of the system to be modeled, and they stand out for their interpretability and flexibility. With the recent popularity of FCMs, a plethora of research efforts have taken place to develop and optimize the model. One of the most important elements of FCMs is the learning algorithm they use, and their effectiveness is largely determined by it. The learning algorithms learn the node weights of an FCM, with the goal of converging towards the desired behavior. The present study reviews the genetic algorithms used for training FCMs and gives a general overview of the FCM learning algorithms, putting evolutionary computing into the wider context.
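For readers unfamiliar with FCMs, the sketch below shows the forward model that a learning algorithm (genetic or otherwise) would evaluate when scoring a candidate weight matrix. It assumes one commonly used update rule, a sigmoid applied to the weighted sum of the other concepts' activations; the abstract does not specify a rule, and the weights and function names here are made up for illustration.

```python
import math

def sigmoid(x, steepness=1.0):
    """Squash a weighted sum into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * x))

def fcm_step(state, weights):
    """One synchronous FCM update.

    state[i]      : activation of concept i, in [0, 1]
    weights[j][i] : causal influence of concept j on concept i
    """
    n = len(state)
    return [
        sigmoid(sum(weights[j][i] * state[j] for j in range(n) if j != i))
        for i in range(n)
    ]

def fcm_run(state, weights, steps=50, tol=1e-6):
    """Iterate until the state stops changing (or `steps` is reached)."""
    for _ in range(steps):
        new_state = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state
        state = new_state
    return state

# Hypothetical 3-concept map: 0 promotes 1, 1 inhibits 2, 2 promotes 0.
weights = [
    [0.0, 0.7, 0.0],
    [0.0, 0.0, -0.5],
    [0.4, 0.0, 0.0],
]
final_state = fcm_run([0.5, 0.5, 0.5], weights)
```

A learning algorithm would adjust the entries of `weights` so that the states produced by `fcm_run` match the desired behavior; that search step is deliberately left out of this sketch.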

翻訳日:2021-05-01 14:26:27 公開日:2020-12-19

# (参考訳) 各種駆動サイクルにおけるリチウムイオン電池の電荷推定のためのNARXNNの解析

Analysis of NARXNN for State of Charge Estimation for Li-ion Batteries on various Drive Cycles ( http://arxiv.org/abs/2012.10725v1 )

ライセンス: CC BY 4.0

Aniruddh Herle, Janamejaya Channegowda, Kali Naraharisetti

(参考訳) 電気自動車(EV)は環境に優しいため、急速に普及している。リチウムイオン電池はEV技術の中心であり、EVの重量とコストの大部分に貢献している。充電状態(soc)はevの範囲を予測するのに役立つ非常に重要な指標である。車両の利用可能な範囲が決定できるように、バッテリパックで利用可能なバッテリ容量を正確に推定する必要がある。 SOCを推定する技術は様々である。本稿では,データ駆動アプローチを選択し,外部入出力ニューラルネットワーク(narxnn)を用いた非線形自己回帰ネットワークを用いてsocを正確に推定する。 NARXNNは、文献で利用可能な従来の機械学習技術よりも優れていることが示されている。 NARXNNモデルは、LA92、US06、UDDS、HWFETといった様々なEVドライブサイクル上で開発、テストされ、実世界のシナリオでそのパフォーマンスをテストする。このモデルは,従来の統計的機械学習手法より優れ,平均正方形誤差(MSE)を1e-5の範囲で達成する。

Electric Vehicles (EVs) are rapidly increasing in popularity as they are environmentally friendly. Lithium-ion batteries are at the heart of EV technology and contribute most of the weight and cost of an EV. State of Charge (SOC) is a very important metric that helps to predict the range of an EV. The available battery capacity in a battery pack must be estimated accurately so that the available range of a vehicle can be determined. There are various techniques available to estimate SOC. In this paper, a data-driven approach is selected and a Nonlinear Autoregressive Network with Exogenous Inputs Neural Network (NARXNN) is explored to accurately estimate SOC. NARXNN has been shown to be superior to conventional Machine Learning techniques available in the literature. The NARXNN model is developed and tested on various EV Drive Cycles such as LA92, US06, UDDS and HWFET to test its performance in real-world scenarios. The model is shown to outperform conventional statistical machine learning methods and achieve a Mean Squared Error (MSE) in the 1e-5 range.
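The abstract does not give the network's inputs or lag orders, so the following is only a minimal sketch of the data layout a NARX-style model typically uses: each target sample is paired with lagged values of the output (here SOC) and of one exogenous signal (here called `inputs`, for example measured current). The function names, the single exogenous signal, and the lag orders are all illustrative assumptions, not details from the paper.

```python
from typing import List, Sequence, Tuple


def narx_regressors(
    inputs: Sequence[float],
    outputs: Sequence[float],
    input_lags: int,
    output_lags: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build one-step-ahead NARX training pairs (hypothetical helper).

    Each feature row is [y(t-1), ..., y(t-output_lags),
                         u(t-1), ..., u(t-input_lags)]
    and the matching target is y(t).
    """
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    start = max(input_lags, output_lags)
    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(start, len(outputs)):
        past_outputs = [outputs[t - k] for k in range(1, output_lags + 1)]
        past_inputs = [inputs[t - k] for k in range(1, input_lags + 1)]
        features.append(past_outputs + past_inputs)
        targets.append(outputs[t])
    return features, targets


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Squared Error, the metric quoted in the abstract."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
```

Any regressor (a neural network in the NARXNN case) can then be fitted on `features` to predict `targets`; the lag orders control how much history the model sees.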

翻訳日:2021-05-01 13:41:53 公開日:2020-12-19

# (参考訳) 外観テキスト融合による政治ポスターの識別

Political Posters Identification with Appearance-Text Fusion ( http://arxiv.org/abs/2012.10728v1 )

ライセンス: CC BY 4.0

Xuan Qin, Meizhu Liu, Yifan Hu, Christina Moo, Christian M. Riblet, Changwei Hu, Kevin Yen and Haibin Ling

(参考訳) 本稿では,外観特徴とテキストベクトルを効率的に活用し,政治ポスターを他の類似の政治イメージから正確に分類する手法を提案する。この作品の大半は、特定の政治イベントのプロモーションとして設計された政治ポスターに焦点が当てられ、その自動識別によって詳細な統計が生成され、様々な分野での判断ニーズを満たすことができる。政治家や政治イベントの包括的なキーワードリストから始めて、運動やキャンペーンを明示的に支援する3K政治ポスターを含む13K人の政治的イメージを含む、効果的で実用的な政治ポスターデータセットをキュレートする。第二に、このデータセットの徹底的なケーススタディを行い、政治ポスターの一般的なパターンや傾向を分析します。最後に, 外観情報とテキスト情報の両方を組み合わせることによって, 政治的ポスターを高い精度で分類するモデルを提案する。

In this paper, we propose a method that efficiently utilizes appearance features and text vectors to accurately distinguish political posters from other similar political images. The majority of this work focuses on political posters that are designed to promote a certain political event, whose automated identification can yield detailed statistics and meet the judgment needs in a variety of areas. Starting with a comprehensive keyword list for politicians and political events, we curate for the first time an effective and practical political poster dataset containing 13K human-labeled political images, including 3K political posters that explicitly support a movement or a campaign. Second, we make a thorough case study for this dataset and analyze common patterns and outliers of political posters. Finally, we propose a model that combines the power of both appearance and text information to classify political posters with high accuracy.

翻訳日:2021-05-01 13:35:20 公開日:2020-12-19

# (参考訳) (決定と回帰)回帰と分類のための木のアンサンブルに基づくカーネル

(Decision and regression) tree ensemble based kernels for regression and classification ( http://arxiv.org/abs/2012.10737v1 )

ライセンス: CC BY 4.0

Dai Feng and Richard Baumgartner

(参考訳) Breiman's random forest (RF) や Gradient Boosted Trees (GBT) のような木に基づくアンサンブルは暗黙のカーネルジェネレータと解釈でき、得られる近接行列がデータ駆動の木アンサンブルカーネルを表す。 RFのカーネル・パースペクティブは、その統計的性質を理論的に研究するための原則的な枠組みの開発に使用されている。近年、カーネルの解釈はGBTなど他の木に基づくアンサンブルにも当てはまることが示されている。しかしながら、カーネルとツリーアンサンブル間のリンクの実用性は広く研究されておらず、体系的に評価されていない。本研究の焦点は, RFやGBTを含む木に基づくアンサンブルとカーネルメソッドの相互作用を調べることである。 RFおよびGBTをベースとしたカーネルの性能と特性を連続的および二元的ターゲットからなる総合シミュレーション研究で解明する。その結果,連続的ターゲットでは,RF/GBTカーネルは,高次元のシナリオにおいて,特にノイズの多い特徴量が多い場合において,それぞれのアンサンブルと競合することがわかった。バイナリターゲットでは、RF/GBTカーネルとそのアンサンブルは同等のパフォーマンスを示す。回帰と分類のための実際のデータセットの結果を提供し、これらの洞察が実際にどのように活用されるかを示します。全体として、私たちの結果は、実践者のツールボックスに価値ある追加として、ツリーアンサンブルベースのカーネルをサポートします。最後に,サバイバルターゲット,解釈可能なプロトタイプ,ランドマーク分類と回帰のためのツリーアンサンブルベースのカーネルの拡張について述べる。我々は, 頻度論的ツリーアンサンブルのベイズ版によるカーネルの研究の今後の展開について概説する。

Tree-based ensembles such as Breiman's random forest (RF) and Gradient Boosted Trees (GBT) can be interpreted as implicit kernel generators, where the ensuing proximity matrix represents the data-driven tree ensemble kernel. The kernel perspective on the RF has been used to develop a principled framework for theoretical investigation of its statistical properties. Recently, it has been shown that the kernel interpretation is germane to other tree-based ensembles, e.g., GBTs. However, the practical utility of the links between kernels and tree ensembles has not been widely explored or systematically evaluated. The focus of our work is the investigation of the interplay between kernel methods and tree-based ensembles, including the RF and GBT. We elucidate the performance and properties of the RF- and GBT-based kernels in a comprehensive simulation study comprising continuous and binary targets. We show that for continuous targets, the RF/GBT kernels are competitive with their respective ensembles in higher-dimensional scenarios, particularly in cases with a larger number of noisy features. For the binary target, the RF/GBT kernels and their respective ensembles exhibit comparable performance. We provide results from real-life data sets for regression and classification to show how these insights may be leveraged in practice. Overall, our results support tree ensemble based kernels as a valuable addition to the practitioner's toolbox. Finally, we discuss extensions of the tree ensemble based kernels for survival targets, interpretable prototype and landmarking classification and regression. We outline a future line of research for kernels furnished by Bayesian counterparts of the frequentist tree ensembles.
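The proximity matrix mentioned above can be illustrated with a small sketch. It assumes the common definition of tree-ensemble proximity: the kernel value for two samples is the fraction of trees in which they fall into the same leaf. The input layout (`leaf_ids[i][t]` is the leaf reached by sample `i` in tree `t`) and the function name are assumptions made for illustration; the paper may use a different or weighted variant.

```python
from typing import List, Sequence


def proximity_kernel(leaf_ids: Sequence[Sequence[int]]) -> List[List[float]]:
    """Tree-ensemble proximity kernel (illustrative sketch).

    leaf_ids[i][t] is the index of the leaf that sample i reaches in tree t.
    K[i][j] is the fraction of trees in which samples i and j share a leaf.
    """
    n_samples = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    kernel = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(n_samples):
            shared = sum(1 for a, b in zip(leaf_ids[i], leaf_ids[j]) if a == b)
            kernel[i][j] = shared / n_trees
    return kernel


# Toy example: 3 samples, 3 trees (leaf indices are made up).
TOY_LEAVES = [
    [0, 1, 2],
    [0, 1, 3],
    [4, 5, 3],
]
TOY_KERNEL = proximity_kernel(TOY_LEAVES)
```

With scikit-learn, a matrix of leaf indices of this shape can be obtained from a fitted forest via its `apply(X)` method; the resulting kernel can then be passed to any kernel method that accepts a precomputed Gram matrix.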

翻訳日:2021-05-01 13:26:31 公開日:2020-12-19

# (参考訳) GlocalNet: クラスを意識した長期人間の動作合成

GlocalNet: Class-aware Long-term Human Motion Synthesis ( http://arxiv.org/abs/2012.10744v1 )

ライセンス: CC BY 4.0

Neeraj Battan, Yudhik Agrawal, Veeravalli Saisooryarao, Aman Goel and Avinash Sharma

(参考訳) Augmented Reality, 3Dキャラクタアニメーション, 歩行者軌道予測などに適用可能な, 人間中心のビデオ生成を支援するためには, 長期人間の骨格配列の合成が不可欠である。ポーズ間の長期的時間的依存関係、ポーズ間の周期的反復、ポーズ間の双方向およびマルチスケールの依存関係、行動の変動速度、および人間の活動の複数のクラス/タイプにまたがる時間的ポーズ変動の空間が部分的に重なり合うため、長期的人間の動作合成は難しい課題である。本稿では,多種多様な活動クラス (>50) において,長期的(6000ms超)人間の運動軌跡を合成する課題を解決することを目的とする。本稿では,この目標を達成するための2段階のアクティビティ生成手法を提案する。第1段階は,活動系列の長期的グローバルなポーズ依存性を学習し,スパース動作軌跡を合成し,第2段階は第1段階の出力を取り入れた濃密な動き軌跡の生成に対処する。公開されているデータセットの様々な定量的評価指標を用いて,SOTA法よりも提案手法の方が優れていることを示す。

Synthesis of long-term human motion skeleton sequences is essential to aid human-centric video generation, with potential applications in Augmented Reality, 3D character animations, pedestrian trajectory prediction, etc. Long-term human motion synthesis is a challenging task due to multiple factors, such as long-term temporal dependencies among poses, cyclic repetition across poses, bi-directional and multi-scale dependencies among poses, variable speed of actions, and a large as well as partially overlapping space of temporal pose variations across multiple classes/types of human activities. This paper aims to address these challenges to synthesize a long-term (> 6000 ms) human motion trajectory across a large variety of human activity classes (>50). We propose a two-stage activity generation method to achieve this goal, where the first stage deals with learning the long-term global pose dependencies in activity sequences by learning to synthesize a sparse motion trajectory, while the second stage addresses the generation of dense motion trajectories taking the output of the first stage. We demonstrate the superiority of the proposed method over SOTA methods using various quantitative evaluation metrics on publicly available datasets.

翻訳日:2021-05-01 13:08:52 公開日:2020-12-19

# (参考訳) 風環境における汚染センサの最適配置

Optimising Placement of Pollution Sensors in Windy Environments ( http://arxiv.org/abs/2012.10770v1 )

ライセンス: CC BY 4.0

Sigrid Passano Hellan, Christopher G. Lucas and Nigel H. Goddard

(参考訳) 大気汚染は世界で最も重要な死亡原因の1つである。大気汚染のモニタリングは、健康と汚染物質との関係についてより深く学び、介入すべき領域を特定するのに有用である。このようなモニタリングは高価であるため、センサーをできるだけ効率的に設置することが重要である。ベイズ最適化はセンサの位置を選択するのに有用であることが証明されているが、一般的には大気汚染の統計構造を無視したカーネル機能に依存している。本稿では,2つの新しい風化カーネルについて述べるとともに,ベイズ最適化による最大汚染箇所の学習を積極的に行うことの利点について考察する。

Air pollution is one of the most important causes of mortality in the world. Monitoring air pollution is useful to learn more about the link between health and pollutants, and to identify areas for intervention. Such monitoring is expensive, so it is important to place sensors as efficiently as possible. Bayesian optimisation has proven useful in choosing sensor locations, but typically relies on kernel functions that neglect the statistical structure of air pollution, such as the tendency of pollution to propagate in the prevailing wind direction. We describe two new wind-informed kernels and investigate their advantage for the task of actively learning locations of maximum pollution using Bayesian optimisation.

翻訳日:2021-05-01 12:57:40 公開日:2020-12-19

# (参考訳) データマイニング変数による信頼性のある因果推論の実現:測定誤差問題に対するランダムフォレストアプローチ

Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem ( http://arxiv.org/abs/2012.10790v1 )

ライセンス: CC BY 4.0

Mochen Yang, Edward McFowland III, Gordon Burtch and Gediminas Adomavicius

(参考訳) 機械学習と計量分析の組み合わせは、研究と実践の両方でますます普及している。一般的な実証的戦略は、利用可能なデータから興味のある変数を「マイニング」するために予測モデリング技術を適用し、その後、因果効果を推定する目的で、それらの変数を計量的フレームワークに含めることである。最近の研究は、機械学習モデルからの予測は必然的に不完全であるため、予測変数に基づく計量分析は測定誤差によるバイアスに悩まされる可能性が高いことを強調している。本稿では,ランダム森林として知られるアンサンブル学習手法を利用して,これらのバイアスを軽減する新しい手法を提案する。予測に埋め込まれた測定誤差に対処するために,予測だけでなく,機器変数の生成にもランダムフォレストを用いることを提案する。ランダムフォレストアルゴリズムは、予測において個別に正確でありながら「異なる」誤り、すなわち弱い相関の予測誤差を生じさせる一連の木からなる場合に最もよく機能する。鍵となる観察は、これらの性質が有効な機器変数の関連性と排除要件に密接に関連していることである。ランダムな森林から個々の樹木のタプルを選抜するデータ駆動手法を考案し,1つの木が内生的共変量体として,もう1つの木がその道具として機能する。シミュレーション実験により, 推定バイアスの軽減における提案手法の有効性と, バイアス補正のための3つの代替手法よりも優れた性能を示す。

Combining machine learning with econometric analysis is becoming increasingly prevalent in both research and practice. A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest from available data, followed by the inclusion of those variables into an econometric framework, with the objective of estimating causal effects. Recent work highlights that, because the predictions from machine learning models are inevitably imperfect, econometric analyses based on the predicted variables are likely to suffer from bias due to measurement error. We propose a novel approach to mitigate these biases, leveraging the ensemble learning technique known as the random forest. We propose employing random forest not just for prediction, but also for generating instrumental variables to address the measurement error embedded in the prediction. The random forest algorithm performs best when comprised of a set of trees that are individually accurate in their predictions, yet which also make 'different' mistakes, i.e., have weakly correlated prediction errors. A key observation is that these properties are closely related to the relevance and exclusion requirements of valid instrumental variables. We design a data-driven procedure to select tuples of individual trees from a random forest, in which one tree serves as the endogenous covariate and the other trees serve as its instruments. Simulation experiments demonstrate the efficacy of the proposed approach in mitigating estimation biases and its superior performance over three alternative methods for bias correction.
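To make the instrumental-variable idea concrete, here is a minimal sketch under strong simplifying assumptions: a single noisy covariate `x` (standing in for one tree's prediction), a single instrument `z` (standing in for another tree's prediction), and errors that are uncorrelated with each other and with the true variable. The helper names and the toy numbers are hypothetical; the paper's procedure for selecting tree tuples is not reproduced here.

```python
from statistics import fmean
from typing import Sequence


def covariance(a: Sequence[float], b: Sequence[float]) -> float:
    """Population covariance of two equally long sequences."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)


def ols_slope(y: Sequence[float], x: Sequence[float]) -> float:
    """Ordinary least squares slope of y on x (biased if x is mismeasured)."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(y: Sequence[float], x: Sequence[float], z: Sequence[float]) -> float:
    """Simple IV (Wald) estimator: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


# Toy data: the true effect of the latent variable on y is 3.
truth = [2, 2, 2, 2, -2, -2, -2, -2]
error_a = [1, 1, -1, -1, 1, 1, -1, -1]   # error in the covariate
error_b = [1, -1, 1, -1, 1, -1, 1, -1]   # independent error in the instrument
y = [3 * t for t in truth]
x = [t + e for t, e in zip(truth, error_a)]  # mismeasured covariate
z = [t + e for t, e in zip(truth, error_b)]  # instrument

print(ols_slope(y, x))    # attenuated towards zero
print(iv_slope(y, x, z))  # recovers the true slope in this toy setup
```

The example is constructed so that the two error vectors are exactly orthogonal; with real forests, the errors of different trees are only weakly correlated, which is why the choice of tree tuples matters.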

翻訳日:2021-05-01 12:38:52 公開日:2020-12-19

# (参考訳) 分離データに基づく逆ロバスト線形分類のサンプル複雑性

Sample Complexity of Adversarially Robust Linear Classification on Separated Data ( http://arxiv.org/abs/2012.10794v1 )

ライセンス: CC BY 4.0

Robi Bhattacharjee, Somesh Jha, Kamalika Chaudhuri

(参考訳) 対向的堅牢性を伴う学習のサンプル複雑性について考察する。この問題に関するこれまでの理論的結果の多くは、データの異なるクラスが近接または重なり合う設定を考えてきた。現実の応用に動機づけられ、対照的に、完全な正確性と堅牢性を持った分類器が存在する、十分に分離されたケースを検討し、サンプル複雑性が全く異なるストーリーを成すことを示す。具体的には、線形分類器に対して、任意のアルゴリズムの期待ロバスト損失が少なくとも$\Omega(\frac{d}{n})$であるようなよく分離された分布の大きなクラスを示し、一方max marginアルゴリズムの期待標準損失は$O(\frac{1}{n})$である。これは、従来の技術では得られない標準と堅牢な損失のギャップを示している。さらに,ロバスト性半径がクラス間のギャップよりはるかに小さい場合において,期待ロバスト損失が$O(\frac{1}{n})$となる解を与えるアルゴリズムを提案する。これは、非常によく分離されたデータの場合、$O(\frac{1}{n})$の収束率は達成可能であり、そうでない場合には達成できないことを示している。我々の結果は、$p > 1$ ($p = \infty$を含む) の任意の$\ell_p$ノルムで測定されたロバスト性に適用できる。

We consider the sample complexity of learning with adversarial robustness. Most prior theoretical results for this problem have considered a setting where different classes in the data are close together or overlapping. Motivated by some real applications, we consider, in contrast, the well-separated case where there exists a classifier with perfect accuracy and robustness, and show that the sample complexity tells an entirely different story. Specifically, for linear classifiers, we show a large class of well-separated distributions where the expected robust loss of any algorithm is at least $\Omega(\frac{d}{n})$, whereas the max margin algorithm has expected standard loss $O(\frac{1}{n})$. This shows a gap between the standard and robust losses that cannot be obtained via prior techniques. Additionally, we present an algorithm that, given an instance where the robustness radius is much smaller than the gap between the classes, gives a solution with expected robust loss $O(\frac{1}{n})$. This shows that for very well-separated data, convergence rates of $O(\frac{1}{n})$ are achievable, which is not the case otherwise. Our results apply to robustness measured in any $\ell_p$ norm with $p > 1$ (including $p = \infty$).

翻訳日:2021-05-01 12:01:01 公開日:2020-12-19

# (参考訳) 確率的依存グラフ

Probabilistic Dependency Graphs ( http://arxiv.org/abs/2012.10800v1 )

ライセンス: CC BY 4.0

Oliver Richardson, Joseph Y Halpern

(参考訳) 我々は,有向グラフィカルモデルの新しいクラスである確率依存グラフ(pdgs)を導入する。 pdgは自然な方法で一貫性のない信念を捉えることができ、ベイジアンネットワーク(bns)よりもモジュラーであり、新しい情報を取り入れ、表現を再構成しやすくする。 PDGが特に自然なモデリングツールであることを示す。 PDGに対する3つのセマンティクスを提供し、それぞれが、PDGとの不整合性を表すものとみなすことのできるスコアリング関数(ネットワーク上の変数の共役分布)から導出することができる。 BNに対応するPDGに対して、この関数はBNが表す分布によって一意に最小化され、PDG意味論がBN意味論を拡張することを示す。さらに,因子グラフとその指数関数族は pdg として忠実に表現できるが,因子グラフを用いた pdg のモデル化には大きな障壁がある。

We introduce Probabilistic Dependency Graphs (PDGs), a new class of directed graphical models. PDGs can capture inconsistent beliefs in a natural way and are more modular than Bayesian Networks (BNs), in that they make it easier to incorporate new information and restructure the representation. We show by example how PDGs are an especially natural modeling tool. We provide three semantics for PDGs, each of which can be derived from a scoring function (on joint distributions over the variables in the network) that can be viewed as representing a distribution's incompatibility with the PDG. For the PDG corresponding to a BN, this function is uniquely minimized by the distribution the BN represents, showing that PDG semantics extend BN semantics. We show further that factor graphs and their exponential families can also be faithfully represented as PDGs, while there are significant barriers to modeling a PDG with a factor graph.

翻訳日:2021-05-01 11:58:23 公開日:2020-12-19

# 修正による学習:弱い監督で数学の単語問題を解決する

Learning by Fixing: Solving Math Word Problems with Weak Supervision ( http://arxiv.org/abs/2012.10582v1 )

ライセンス: Link先を確認

Yining Hong, Qing Li, Daniel Ciao, Siyuan Huang, Song-Chun Zhu

(参考訳) 数学文章題(MWP)の従来のニューラルネットワークソルバは、完全な監視によって学習され、多様なソリューションを生み出すことができない。本稿では,MWPを学習するための弱教師あり(weakly-supervised)パラダイムを導入することでこの問題に対処する。この手法は最終回答のアノテーションのみを必要とし、単一の問題に対して様々な解決策を生成できる。弱い教師付き学習を促進するために,シンボリック推論によるニューラルネットワークの誤認識を補正する新しい learning-by-fixing (LBF) フレームワークを提案する。具体的には、ニューラルネットワークによって生成された誤った解木に対して、fixing メカニズムは、ルートノードから葉ノードへのエラーを伝搬し、最も確率の高い修正を推測して、所望の回答を得る。より多様なソリューションを生成するために、ソリューション空間の効率的な縮小と探索を導くために tree regularization が適用され、各問題で発見された様々な修正を追跡し保存する memory buffer が設計されている。 Math23Kデータセットによる実験結果から,提案したLBFフレームワークは,弱教師付き学習における強化学習ベースラインを著しく上回ることがわかった。さらに、完全教師あり手法と同等のトップ1精度と、はるかに優れたトップ3/5の回答精度を実現し、多様なソリューションを生み出す上での強みを示している。

Previous neural solvers of math word problems (MWPs) are learned with full supervision and fail to generate diverse solutions. In this paper, we address this issue by introducing a weakly-supervised paradigm for learning MWPs. Our method only requires the annotations of the final answers and can generate various solutions for a single problem. To boost weakly-supervised learning, we propose a novel learning-by-fixing (LBF) framework, which corrects the misperceptions of the neural network via symbolic reasoning. Specifically, for an incorrect solution tree generated by the neural network, the fixing mechanism propagates the error from the root node to the leaf nodes and infers the most probable fix that can be executed to get the desired answer. To generate more diverse solutions, tree regularization is applied to guide the efficient shrinkage and exploration of the solution space, and a memory buffer is designed to track and save the various fixes discovered for each problem. Experimental results on the Math23K dataset show that the proposed LBF framework significantly outperforms reinforcement learning baselines in weakly-supervised learning. Furthermore, it achieves comparable top-1 and much better top-3/5 answer accuracies than fully-supervised methods, demonstrating its strength in producing diverse solutions.

翻訳日:2021-05-01 11:18:35 公開日:2020-12-19

# ストレートスルーガムベル・ソフトマックス推定器を用いた視覚参照ゲームにおける体系的一般化と構成性について

On (Emergent) Systematic Generalisation and Compositionality in Visual Referential Games with Straight-Through Gumbel-Softmax Estimator ( http://arxiv.org/abs/2012.10776v1 )

ライセンス: Link先を確認

Kevin Denamganaï and James Alfred Walker

(参考訳) 2つの(またはそれ以上の)エージェントが非視覚的な参照ゲームを行うときに現れる人工言語における構成性のドライバは、REINFORCEアルゴリズムと(神経)反復学習モデルに基づくアプローチを用いて以前に研究されてきた。より最近の Straight-Through Gumbel-Softmax (ST-GS) アプローチの導入に続いて,本研究では,ST-GS の文脈において,これまでフィールドで認識されていた構成性の要因がどの程度適用され,また,視覚的参照ゲームにおいて,それらが体系的一般化能力(創発的)にどの程度変換されるかを検討する。地形類似性とゼロショット合成テストを用いて,創発言語の構成性と一般化能力を評価する。第一に,テストトレイン分割戦略が視覚刺激の処理においてゼロショット構成テストに大きく影響することを示す一方で,シンボル刺激の処理では影響しないことを示す。第2に,st-gsアプローチを小さなバッチサイズとオーバーコンプリート通信チャネルで使用すると,新興言語のコンポジション性が向上することを示す実証的証拠がある。それにもかかわらず、視覚的な刺激を扱う場合、バッチサイズの影響はそれほど明確ではない。また,全通信チャネルが等しく作成されるわけではないことを示した。実際、最大文長の増大は、合成能力と一般化能力の両方に有益であるが、語彙サイズの増加は有害である。最後に,視覚刺激を伴う識別的参照ゲームにおいて,学習時の言語構成性とエージェントの一般化能力の相関性の欠如が観察された。これは、シンボリック刺激を伴う生成変異体を用いたフィールドでの以前の観測と似ている。

The drivers of compositionality in artificial languages that emerge when two (or more) agents play a non-visual referential game have been previously investigated using approaches based on the REINFORCE algorithm and the (Neural) Iterated Learning Model. Following the more recent introduction of the Straight-Through Gumbel-Softmax (ST-GS) approach, this paper investigates to what extent the drivers of compositionality identified so far in the field apply in the ST-GS context and to what extent they translate into (emergent) systematic generalisation abilities when playing a visual referential game. Compositionality and the generalisation abilities of the emergent languages are assessed using topographic similarity and zero-shot compositional tests. Firstly, we provide evidence that the test-train split strategy significantly impacts the zero-shot compositional tests when dealing with visual stimuli, whilst it does not when dealing with symbolic ones. Secondly, empirical evidence shows that using the ST-GS approach with small batch sizes and an overcomplete communication channel improves compositionality in the emerging languages. Nevertheless, while shown to be robust with symbolic stimuli, the effect of the batch size is not so clear-cut when dealing with visual stimuli. Our results also show that not all overcomplete communication channels are created equal. Indeed, while increasing the maximum sentence length is found to benefit both compositionality and generalisation abilities, increasing the vocabulary size is found to be detrimental. Finally, a lack of correlation between the language compositionality at training time and the agents' generalisation abilities is observed in the context of discriminative referential games with visual stimuli. This is similar to previous observations in the field using the generative variant with symbolic stimuli.
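For readers unfamiliar with the Straight-Through Gumbel-Softmax estimator, the forward pass can be sketched in plain Python. This is a generic illustration of the technique, not the authors' implementation: the function name, the temperature value, and the use of the standard library random generator are assumptions. Plain Python has no automatic differentiation, so the straight-through part (using the one-hot sample in the forward pass while taking gradients through the soft sample) is described only in comments.

```python
import math
import random
from typing import List, Sequence, Tuple


def gumbel_softmax_sample(
    logits: Sequence[float], temperature: float, rng: random.Random
) -> Tuple[List[float], List[float]]:
    """Return (soft_sample, hard_one_hot) for one categorical variable."""
    # Perturb each logit with Gumbel(0, 1) noise: g = -log(-log(u)).
    perturbed = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit + gumbel_noise) / temperature)

    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(perturbed)
    exps = [math.exp(v - peak) for v in perturbed]
    total = sum(exps)
    soft = [e / total for e in exps]

    # Discrete symbol actually sent: one-hot at the arg max of the soft sample.
    winner = soft.index(max(soft))
    hard = [1.0 if i == winner else 0.0 for i in range(len(soft))]

    # Straight-through trick (needs an autodiff framework): the forward pass
    # uses `hard`, while the backward pass treats the output as `soft`,
    # typically written as hard + soft - stop_gradient(soft).
    return soft, hard


soft_sample, hard_sample = gumbel_softmax_sample([1.0, 2.0, 0.5], 1.0, random.Random(0))
```

In PyTorch, `torch.nn.functional.gumbel_softmax(logits, tau, hard=True)` provides this behaviour including the gradient handling; lower temperatures make the soft sample closer to one-hot.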

翻訳日:2021-05-01 11:18:10 公開日:2020-12-19

# 不変表現学習における基本限界とトレードオフ

Fundamental Limits and Tradeoffs in Invariant Representation Learning ( http://arxiv.org/abs/2012.10713v1 )

ライセンス: Link先を確認

Han Zhao, Chen Dan, Bryon Aragam, Tommi S. Jaakkola, Geoffrey J. Gordon, Pradeep Ravikumar

(参考訳) 多くの機械学習アプリケーションは、2つの競合する目標を達成する学習表現を含んでいる: 機能のサブセット(例えば、予測のために)に関する情報や精度を最大化し、同時に別の、潜在的に重複している、機能のサブセット(例えば、公正性、プライバシーなど)に関して不変または独立性を最大化する。典型的な例としては、プライバシー保護学習、ドメイン適応、アルゴリズムフェアネスなどがある。実際、上記の問題はすべて、その平衡が精度と不変性の基本的なトレードオフを表す、共通のミニマックスゲーム理論の定式化を受け入れている。上記の領域における豊富な応用にもかかわらず、不変表現の極限とトレードオフに関する理論的理解は著しく不足している。本稿では,分類と回帰設定の両方において,この一般的かつ重要な問題を情報論的に解析する。いずれの場合においても、情報平面における実現可能領域の幾何学的特徴付けを提供し、この実現可能領域の幾何学的性質とトレードオフ問題の基本的な制限を結びつけることで、精度と不変性の固有のトレードオフを分析する。回帰設定では、精度と不変性の間のトレードオフを定量化するラグランジアン目的の厳密な下限も導出する。この低い境界は、関節分布のスペクトル特性を通じてトレードオフをよりよく理解する。いずれの場合も,正確性と不変性の間の相互作用に関する洞察を提供することで,この根本的な問題に新たな光を当てた。これらの結果は、この根本的な問題の理解を深め、対向表現学習アルゴリズムの設計を導くのに役立つかもしれない。

Many machine learning applications involve learning representations that achieve two competing goals: to maximize information or accuracy with respect to a subset of features (e.g., for prediction) while simultaneously maximizing invariance or independence with respect to another, potentially overlapping, subset of features (e.g., for fairness, privacy, etc.). Typical examples include privacy-preserving learning, domain adaptation, and algorithmic fairness, just to name a few. In fact, all of the above problems admit a common minimax game-theoretic formulation, whose equilibrium represents a fundamental tradeoff between accuracy and invariance. Despite its abundant applications in the aforementioned domains, theoretical understanding of the limits and tradeoffs of invariant representations is severely lacking. In this paper, we provide an information-theoretic analysis of this general and important problem under both classification and regression settings. In both cases, we analyze the inherent tradeoffs between accuracy and invariance by providing a geometric characterization of the feasible region in the information plane, where we connect the geometric properties of this feasible region to the fundamental limitations of the tradeoff problem. In the regression setting, we also derive a tight lower bound on the Lagrangian objective that quantifies the tradeoff between accuracy and invariance. This lower bound leads to a better understanding of the tradeoff via the spectral properties of the joint distribution. In both cases, our results shed new light on this fundamental problem by providing insights on the interplay between accuracy and invariance. These results deepen our understanding of this fundamental problem and may be useful in guiding the design of adversarial representation learning algorithms.

翻訳日:2021-05-01 11:17:41 公開日:2020-12-19

# 不確実性を考慮した政策最適化:ロバストで適応的な信頼領域アプローチ

Uncertainty-Aware Policy Optimization: A Robust, Adaptive Trust Region Approach ( http://arxiv.org/abs/2012.10791v1 )

ライセンス: Link先を確認

James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras

(参考訳) 強化学習技術が実世界の意思決定プロセスで有用になるためには、限られたデータから堅牢なパフォーマンスを生み出す必要がある。深いポリシー最適化手法は複雑なタスクで素晴らしい結果を得ていますが、実際の採用は、成功するためにかなりの量のデータを必要とするため、限られています。小さなサンプルサイズと組み合わせると、これらの手法は高次元のサンプルベース推定に依存するため不安定な学習をもたらす。本研究では,これらの推定値がもたらす不確実性を制御する手法を開発する。我々は,これらの手法を活用して,データが不足しても安定したパフォーマンスを実現するように設計された,深いポリシー最適化手法を提案する。得られたアルゴリズムである不確実性認識地域政策最適化は、学習プロセスを通じて存在する不確実性レベルに適応する堅牢なポリシー更新を生成する。

In order for reinforcement learning techniques to be useful in real-world decision making processes, they must be able to produce robust performance from limited data. Deep policy optimization methods have achieved impressive results on complex tasks, but their real-world adoption remains limited because they often require significant amounts of data to succeed. When combined with small sample sizes, these methods can result in unstable learning due to their reliance on high-dimensional sample-based estimates. In this work, we develop techniques to control the uncertainty introduced by these estimates. We leverage these techniques to propose a deep policy optimization approach designed to produce stable performance even when data is scarce. The resulting algorithm, Uncertainty-Aware Trust Region Policy Optimization, generates robust policy updates that adapt to the level of uncertainty present throughout the learning process.

翻訳日:2021-05-01 11:17:19 公開日:2020-12-19

# コミュニケーションを意識した協調学習

Communication-Aware Collaborative Learning ( http://arxiv.org/abs/2012.10569v1 )

ライセンス: Link先を確認

Avrim Blum, Shelby Heinecke, Lev Reyzin

(参考訳) ノイズレス協調pac学習のアルゴリズムは,近年,サンプル複雑性に関して解析・最適化されている。本稿では,通信コストの削減を目標とし,サンプル複雑性に対して実質的にペナルティを伴わない協調的pac学習について検討する。分散ブースティングを用いた通信効率の高い協調pac学習アルゴリズムを開発した。次に,分類ノイズの存在下での協調学習のコミュニケーションコストを検討する。中間段階として、協調的なPAC学習アルゴリズムが分類ノイズにどのように適応できるかを示す。そこで本研究では,ノイズ分類に頑健な協調pac学習のための通信効率の高いアルゴリズムを開発した。

Algorithms for noiseless collaborative PAC learning have been analyzed and optimized in recent years with respect to sample complexity. In this paper, we study collaborative PAC learning with the goal of reducing communication cost at essentially no penalty to the sample complexity. We develop communication-efficient collaborative PAC learning algorithms using distributed boosting. We then consider the communication cost of collaborative learning in the presence of classification noise. As an intermediate step, we show how collaborative PAC learning algorithms can be adapted to handle classification noise. With this insight, we develop communication-efficient algorithms for collaborative PAC learning robust to classification noise.

翻訳日:2021-05-01 11:16:18 公開日:2020-12-19

# 粗大かつ微細なマルチグラフ学習を目指して

Towards Coarse and Fine-grained Multi-Graph Multi-Label Learning ( http://arxiv.org/abs/2012.10650v1 )

ライセンス: Link先を確認

Yejiang Wang and Yuhai Zhao and Zhengkui Wang and Chengqi Zhang

(参考訳) Multi-graph Multi-label Learning (MGML) は、複数のグラフを含むラベル付きバッグの集合からマルチラベル分類器を学習することを目的とした教師付き学習フレームワークである。従来のテクニックは、グラフをインスタンスに変換することに基づいて開発され、バッグレベルでのみ未知のラベルを学習することに集中している。本稿では,グラフ上に学習モデルを直接構築し,粗粒度(coarse,すなわちバッグ)レベルと細粒度(fine-grained,すなわち各バッグ内のグラフ)レベルの両方でラベル予測を可能にする,粗粒度・細粒度マルチグラフ・マルチラベル(cfMGML)学習フレームワークを提案する。特に,ラベル付きマルチグラフバッグの集合を考えると,グラフレベルとバッグレベルのスコアリング関数を設計し,グラフカーネルを用いてラベルとデータの関連性をモデル化する。一方,グラフとバッグのラベルをランク付けし,ハミングロスを1ステップで同時に最小化するためのしきい値ランク付け目的関数を提案し,従来のランク付けアルゴリズムの誤り蓄積問題に対処することを目的とした。非凸最適化問題に取り組むため,我々はcfMGMLで必要とされる高次元空間計算を扱うための効果的な劣勾配降下アルゴリズムを更に開発する。様々な実世界のデータセットに対する実験は、cfMGMLが最先端のアルゴリズムよりも優れたパフォーマンスを達成することを示した。

Multi-graph multi-label learning (MGML) is a supervised learning framework that aims to learn a multi-label classifier from a set of labeled bags, each containing a number of graphs. Prior techniques for MGML are based on transferring graphs into instances and focus on learning the unseen labels only at the bag level. In this paper, we propose a coarse and fine-grained Multi-graph Multi-label (cfMGML) learning framework, which builds the learning model directly over the graphs and enables label prediction at both the coarse (i.e., bag) level and the fine-grained (i.e., graph in each bag) level. In particular, given a set of labeled multi-graph bags, we design scoring functions at both the graph and bag levels to model the relevance between the label and data using specific graph kernels. Meanwhile, we propose a thresholding rank-loss objective function to rank the labels for the graphs and bags and minimize the Hamming loss simultaneously in one step, which aims to address the error accumulation issue in traditional rank-loss algorithms. To tackle the non-convex optimization problem, we further develop an effective sub-gradient descent algorithm to handle the high-dimensional space computation required in cfMGML. Experiments over various real-world datasets demonstrate that cfMGML outperforms state-of-the-art algorithms.
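The two quantities the objective above combines, rank loss and Hamming loss, can be written down for a single example as follows. This is a sketch of the standard multi-label metrics only; the paper's thresholding objective, graph kernels, and optimization procedure are not reproduced, and the function names and the fixed threshold are illustrative assumptions.

```python
from typing import List, Sequence


def hamming_loss(true_labels: Sequence[int], predicted_labels: Sequence[int]) -> float:
    """Fraction of label positions where the prediction disagrees with the truth."""
    wrong = sum(1 for t, p in zip(true_labels, predicted_labels) if t != p)
    return wrong / len(true_labels)


def rank_loss(true_labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (relevant, irrelevant) label pairs that are ordered wrongly,
    i.e. the irrelevant label scores at least as high as the relevant one."""
    relevant = [s for t, s in zip(true_labels, scores) if t == 1]
    irrelevant = [s for t, s in zip(true_labels, scores) if t == 0]
    if not relevant or not irrelevant:
        return 0.0  # no pairs to compare
    misordered = sum(1 for r in relevant for i in irrelevant if i >= r)
    return misordered / (len(relevant) * len(irrelevant))


def threshold_scores(scores: Sequence[float], threshold: float) -> List[int]:
    """Turn label scores into a 0/1 label vector with a fixed threshold."""
    return [1 if s >= threshold else 0 for s in scores]


example_truth = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.3, 0.1]
example_prediction = threshold_scores(example_scores, 0.5)
```

A ranking can be good while the thresholded labels are still wrong, which is one motivation for optimizing both quantities together rather than ranking first and thresholding afterwards.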

翻訳日:2021-05-01 11:16:10 公開日:2020-12-19

# Top-$k$ Ranking Bayesian Optimization

Top-$k$ Ranking Bayesian Optimization ( http://arxiv.org/abs/2012.10688v1 )

ライセンス: Link先を確認

Quoc Phong Nguyen, Sebastian Tay, Bryan Kian Hsiang Low, Patrick Jaillet

(参考訳) 本稿では、top-$k$ランキングとタイ/無差別(indifference)観測を扱うための選好BOの実用的で重要な一般化である、top-$k$ランキング・ベイズ最適化(top-$k$ランキングBO)に対する新しいアプローチを提案する。まず、上記の観測に対処できるだけでなく、古典的なランダム効用モデルによっても裏付けられるサロゲートモデルを設計する。もう一つの同様に重要な貢献は、選好観測を伴うBOにおける最初の情報理論獲得関数である多項予測エントロピー探索 (MPES) の導入であり、これらの観測を柔軟に扱い、クエリの全ての入力に対して同時に最適化される。 MPESは、クエリの入力を一度に1つずつ貪欲に選択する既存の獲得関数と比較して、優れた性能を有する。複数の合成ベンチマーク関数、CIFAR-10データセット、SUSHI選好データセットを用いてMPESの性能を実証的に評価した。

This paper presents a novel approach to top-$k$ ranking Bayesian optimization (top-$k$ ranking BO), which is a practical and significant generalization of preferential BO to handle top-$k$ ranking and tie/indifference observations. We first design a surrogate model that is not only capable of catering to the above observations, but is also supported by a classic random utility model. Another equally important contribution is the introduction of the first information-theoretic acquisition function in BO with preferential observation, called multinomial predictive entropy search (MPES), which is flexible in handling these observations and is optimized for all inputs of a query jointly. MPES possesses superior performance compared with existing acquisition functions that select the inputs of a query one at a time greedily. We empirically evaluate the performance of MPES using several synthetic benchmark functions, the CIFAR-10 dataset, and the SUSHI preference dataset.
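As an illustration of how a random utility model assigns a probability to a top-$k$ ranking, the sketch below uses the Plackett-Luce model, a classic random utility model in which items are picked one at a time with probability proportional to the exponential of their utility. Whether this matches the paper's surrogate model is an assumption; the function name and the utility values are hypothetical, and tie/indifference observations are not handled.

```python
import math
from typing import Sequence


def top_k_ranking_probability(utilities: Sequence[float], ranking: Sequence[int]) -> float:
    """Plackett-Luce probability of observing `ranking` as the ordered top-k.

    utilities[i] is the latent utility of item i.
    ranking lists the indices of the k best items, best first; the
    remaining items are left unordered.
    """
    remaining = set(range(len(utilities)))
    probability = 1.0
    for item in ranking:
        # Softmax choice among the items not yet ranked.
        normaliser = sum(math.exp(utilities[j]) for j in remaining)
        probability *= math.exp(utilities[item]) / normaliser
        remaining.remove(item)
    return probability


# With equal utilities every ordered top-2 out of 4 items is equally likely.
uniform_probability = top_k_ranking_probability([0.0, 0.0, 0.0, 0.0], [2, 0])
```

Setting `k = 1` with two items reduces this to the pairwise comparison likelihood used in ordinary preferential BO, which is the sense in which top-$k$ ranking observations generalize pairwise preferences.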

翻訳日:2021-05-01 11:15:44 公開日:2020-12-19

# アクティブラーニング問題の統一のための情報理論フレームワーク

An Information-Theoretic Framework for Unifying Active Learning Problems ( http://arxiv.org/abs/2012.10695v1 )

ライセンス: Link先を確認

Quoc Phong Nguyen, Bryan Kian Hsiang Low, Patrick Jaillet

(参考訳) 本稿では、レベルセット推定(LSE)、ベイズ最適化(BO)、およびそれらの一般化変種を統合化するための情報理論フレームワークを提案する。まず,既存のLSEアルゴリズムを仮定し,連続入力領域を用いたLSE問題における最先端性能を実現する,新しい能動学習基準を提案する。そして,LSEとBOの関係を利用して,高い信頼度と最大値エントロピー探索(MES)に興味深いつながりを持つBOの競合情報理論獲得関数を設計する。後者の接続は、MESだけでなく、他のMESベースの取得関数にも重要な意味を持つMESの欠点を明らかにしている。最後に、我々の統合情報理論フレームワークを用いて、複数のレベルセットをデータ効率よく含むLSEとBOの一般化問題を解くことができる。提案アルゴリズムの性能を,実世界のデータセット,機械学習モデルのハイパーパラメータチューニングなどを用いて実証的に評価した。

This paper presents an information-theoretic framework for unifying active learning problems: level set estimation (LSE), Bayesian optimization (BO), and their generalized variant. We first introduce a novel active learning criterion that subsumes an existing LSE algorithm and achieves state-of-the-art performance in LSE problems with a continuous input domain. Then, by exploiting the relationship between LSE and BO, we design a competitive information-theoretic acquisition function for BO that has interesting connections to upper confidence bound and max-value entropy search (MES). The latter connection reveals a drawback of MES which has important implications on not only MES but also on other MES-based acquisition functions. Finally, our unifying information-theoretic framework can be applied to solve a generalized problem of LSE and BO involving multiple level sets in a data-efficient manner. We empirically evaluate the performance of our proposed algorithms using synthetic benchmark functions, a real-world dataset, and in hyperparameter tuning of machine learning models.

翻訳日:2021-05-01 11:15:27 公開日:2020-12-19

# 教師なしスケール不変マルチスペクトル形状マッチング

Unsupervised Scale-Invariant Multispectral Shape Matching ( http://arxiv.org/abs/2012.10685v1 )

ライセンス: Link先を確認

Idan Pazi, Dvir Ginzburg, Dan Raviv

(参考訳) 非剛性伸縮構造間のアライメントはコンピュータビジョンにおいて最も難しいタスクの1つであり、不変性は一方では定義が困難であり、他方では実際のデータセットにはラベル付きデータが存在しない。本稿では,スケール不変幾何のスペクトルに基づく教師なしニューラルネットワークアーキテクチャを提案する。関数型マップアーキテクチャの上に構築するが、局所的な特徴の学習は、等尺的仮定が破れれば十分ではないが、スケール不変幾何を用いて解けることを示す。本手法は局所的な変形によらず,既存のスペクトル状態の解と比較して異なる領域の形状をマッチングする優れた性能を示す。

Alignment between non-rigid stretchable structures is one of the hardest tasks in computer vision, as the invariant properties are hard to define on the one hand, and on the other hand no labelled data exists for real datasets. We present an unsupervised neural network architecture based upon the spectrum of scale-invariant geometry. We build on top of the functional maps architecture, but show that learning local features, as done until now, is not enough once the isometric assumption breaks, a problem that can be solved using scale-invariant geometry. Our method is agnostic to local-scale deformations and shows superior performance for matching shapes from different domains when compared to existing spectral state-of-the-art solutions.

翻訳日:2021-05-01 11:15:11 公開日:2020-12-19

# ロバストディープフェイク検出のための不変テクスチャ違反の同定

Identifying Invariant Texture Violation for Robust Deepfake Detection ( http://arxiv.org/abs/2012.10580v1 )

ライセンス: Link先を確認

Xinwei Sun, Botong Wu, Wei Chen

(参考訳) 既存のdeepfake検出手法では、公開済みの大規模データセットにアクセスすることで、配信結果の有望性が報告されている。しかし、非滑らか合成法により、このデータセットの偽のサンプルは、上記のフレームレベルの検出方法のほとんどに大きく依存していた明らかな人工物(例えば、スターク視覚コントラスト、非滑らか境界)を明らかにする可能性がある。これらのアーティファクトは、現実のメディア偽造には現れないので、現実に近い偽画像に適用すると、上記の方法は大きく劣化する可能性がある。高実在性偽データに対するロバスト性を改善するために、低視品質のデータセットのみにアクセスする不変テクスチャ学習(InTeLe)フレームワークを提案する。本手法は,対象人物から移入されたテクスチャによって,原顔の微視的な顔のテクスチャが必然的に侵害されることから,すべての偽画像間で共有される不変特性と見なすことができる。このようなディープフェイク検出の不変性を学習するために、我々は、プリスタンとフェイクイメージのための異なるデコーダを持つ自動エンコーダフレームワークを導入しました。このような分離により、エンコーダによる抽出された埋め込みは、フェイク画像のテクスチャ違反をキャプチャし、次いで最終プリズム/フェイク予測のための分類器を付加することができる。理論的保証として,このような非分散テクスチャ違反の同定可能性,すなわち観測データから正確に推測できることを示す。本手法の有効性と有用性は,低品質画像から明らかなアーティファクト,高リアリズムの偽画像への一般化を約束することによって実証された。

Existing deepfake detection methods have reported promising in-distribution results by accessing a published large-scale dataset. However, due to the non-smooth synthesis method, the fake samples in this dataset may expose obvious artifacts (e.g., stark visual contrast, non-smooth boundary), which were heavily relied on by most of the frame-level detection methods above. As these artifacts do not come up in real media forgeries, the above methods can suffer from a large degradation when applied to fake images that are close to reality. To improve the robustness for high-realism fake data, we propose the Invariant Texture Learning (InTeLe) framework, which only accesses the published dataset with low visual quality. Our method is based on the prior that the microscopic facial texture of the source face is inevitably violated by the texture transferred from the target person, which can hence be regarded as the invariant characterization shared among all fake images. To learn such an invariance for deepfake detection, our InTeLe introduces an auto-encoder framework with different decoders for pristine and fake images, which are further appended with a shallow classifier in order to separate out the obvious artifact-effect. Equipped with such a separation, the embedding extracted by the encoder can capture the texture violation in fake images, followed by the classifier for the final pristine/fake prediction. As a theoretical guarantee, we prove the identifiability of such an invariant texture violation, i.e., that it can be precisely inferred from observational data. The effectiveness and utility of our method are demonstrated by promising generalization ability from low-quality images with obvious artifacts to fake images with high realism.

翻訳日:2021-05-01 11:14:43 公開日:2020-12-19

# UAV撮像画像における物体検出のための高密度マルチスケールフュージョンピラミッドネットワーク

Dense Multiscale Feature Fusion Pyramid Networks for Object Detection in UAV-Captured Images ( http://arxiv.org/abs/2012.10643v1 )

ライセンス: Link先を確認

Yingjie Liu

(参考訳) 深層学習による物体検出の研究分野では大きな進歩が見られたが、uavで撮影された画像で顕著に発音される小型物体には依然として課題がある。これらの問題に対処するためには、小さなオブジェクトの十分な特徴情報を抽出できる特徴抽出法を探求する必要がある。本稿では,情報伝達と再利用を改良し,よりリッチな特徴を可能な限り得ることを目的とした,高密度多スケール特徴融合ピラミッドネットワーク(dmffpn)と呼ばれる新しい手法を提案する。具体的には、密結合は、異なる畳み込み層からの表現を完全に活用するように設計されている。さらに、第2段階でカスケードアーキテクチャを適用して、ローカライゼーション能力を向上させる。 VisDrone-DETと名付けられたドローンベースのデータセットの実験から,本手法の競合性能が示唆された。

Although significant progress has been made in the research field of object detection with deep learning, detecting small objects remains a challenging task, which is notably pronounced in UAV-captured images. To address these issues, it is critical to explore feature extraction methods that can extract sufficient feature information of small objects. In this paper, we propose a novel method called Dense Multiscale Feature Fusion Pyramid Networks (DMFFPN), which is aimed at obtaining features that are as rich as possible by improving information propagation and reuse. Specifically, the dense connection is designed to fully utilize the representation from the different convolutional layers. Furthermore, a cascade architecture is applied in the second stage to enhance the localization capability. Experiments on the drone-based dataset named VisDrone-DET suggest a competitive performance of our method.

翻訳日:2021-05-01 11:14:12 公開日:2020-12-19

# ネットワーク内の拡張

Augmentation Inside the Network ( http://arxiv.org/abs/2012.10769v1 )

ライセンス: Link先を確認

Maciej Sypetkowski, Jakub Jasiulewicz, Zbigniew Wojna

(参考訳) 本稿では,畳み込みニューラルネットワークの中間機能に対するコンピュータビジョン問題に対するデータ拡張手法をシミュレートする手法である,ネットワーク内部の拡張について述べる。これらの変換を行い、ネットワーク内のデータフローを変更し、可能であれば共通の計算を共有します。提案手法は,TTA法よりもスムーズな速度-精度トレードオフ調整を実現し,良好な結果が得られる。さらに,テスト時間拡張と組み合わせることで,モデル性能をさらに向上させることができる。本手法をimagenet-2012およびcifar-100データセットで検証した。そこで本研究では,フリップテスト時拡張よりも30%高速で,CIFAR-100と同じ結果が得られる修正を提案する。

In this paper, we present augmentation inside the network, a method that simulates data augmentation techniques for computer vision problems on intermediate features of a convolutional neural network. We perform these transformations, changing the data flow through the network, and sharing common computations when it is possible. Our method allows us to obtain smoother speed-accuracy trade-off adjustment and achieves better results than using standard test-time augmentation (TTA) techniques. Additionally, our approach can improve model performance even further when coupled with test-time augmentation. We validate our method on the ImageNet-2012 and CIFAR-100 datasets for image classification. We propose a modification that is 30% faster than the flip test-time augmentation and achieves the same results for CIFAR-100.

翻訳日:2021-05-01 11:13:59 公開日:2020-12-19
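
For readers unfamiliar with the baseline mentioned in the abstract above, the following is a minimal sketch of flip test-time augmentation: the model is run on an image and on its horizontal mirror, and the two predictions are averaged. It does not reproduce the paper's in-network method. The `toy_model` function and the nested-list image format are assumptions made purely for illustration.

```python
from typing import Callable, List

Image = List[List[float]]  # assumed format: rows of pixel intensities


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_tta(model: Callable[[Image], List[float]], img: Image) -> List[float]:
    """Average the model's predictions on the image and its horizontal flip."""
    p_orig = model(img)
    p_flip = model(hflip(img))
    return [(a + b) / 2 for a, b in zip(p_orig, p_flip)]


def toy_model(img: Image) -> List[float]:
    """Hypothetical 2-class 'model': share of brightness in the left vs right half."""
    w = len(img[0])
    left = sum(sum(row[: w // 2]) for row in img)
    right = sum(sum(row[w // 2:]) for row in img)
    total = (left + right) or 1.0
    return [left / total, right / total]
```

Note that this baseline needs two full forward passes per image, which is the cost the abstract says the in-network variant reduces by sharing common computations.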

# シーケンスラベリングのための不確実性認識ラベルの精密化

Uncertainty-Aware Label Refinement for Sequence Labeling ( http://arxiv.org/abs/2012.10608v1 )

ライセンス: Link先を確認

Tao Gui, Jiacheng Ye, Qi Zhang, Zhengyan Li, Zichu Fei, Yeyun Gong and Xuanjing Huang

(参考訳) ラベルデコードのための条件付きランダムフィールド(CRF)は、シーケンスラベリングタスクにおいてユビキタスになっている。しかし、ローカルラベルの依存関係と非効率なビタビ復号化は常に解決すべき問題であった。本稿では,長期間のラベル依存をモデル化する新しい2段階のラベル復号フレームワークを提案する。ベースモデルは、まずドラフトラベルを予測し、次に、新しい2ストリームの自己アテンションモデルにより、長距離ラベル依存に基づいてこれらのドラフトラベルの予測を洗練し、高速な予測のために並列デコードを実現する。さらに、誤ったドラフトラベルの副作用を軽減するために、ベイズニューラルネットワークは、誤りの確率が高いラベルを示すために使用され、エラー伝播の防止に大いに役立つ。 3つのシークエンスラベリングベンチマーク実験の結果,提案手法はCRF法に勝るだけでなく,推論プロセスを大幅に高速化した。

Conditional random fields (CRF) for label decoding have become ubiquitous in sequence labeling tasks. However, the local label dependencies and inefficient Viterbi decoding have always been a problem to be solved. In this work, we introduce a novel two-stage label decoding framework to model long-term label dependencies, while being much more computationally efficient. A base model first predicts draft labels, and then a novel two-stream self-attention model makes refinements on these draft predictions based on long-range label dependencies, which can achieve parallel decoding for a faster prediction. In addition, in order to mitigate the side effects of incorrect draft labels, Bayesian neural networks are used to indicate the labels with a high probability of being wrong, which can greatly assist in preventing error propagation. The experimental results on three sequence labeling benchmarks demonstrated that the proposed method not only outperformed the CRF-based methods but also greatly accelerated the inference process.

翻訳日:2021-05-01 11:13:48 公開日:2020-12-19

# FraCaS: 時間分析

FraCaS: Temporal Analysis ( http://arxiv.org/abs/2012.10668v1 )

ライセンス: Link先を確認

Jean-Philippe Bernardy, Stergios Chatzikyriakidis

(参考訳) 本稿では,推論問題に適した時間意味論の実装を提案する。この実装は構文木を論理式に変換し、Coq証明アシスタントの消費に適している。我々は、時間参照、時間副詞、アスペクトクラス、プログレッシブなど、いくつかの現象をサポートしている。これらの意味論を完全なFraCaSテストスーツに適用する。時間的基準に関連する問題に対して,全体の81パーセントと73%の精度が得られる。

In this paper, we propose an implementation of temporal semantics which is suitable for inference problems. This implementation translates syntax trees to logical formulas, suitable for consumption by the Coq proof assistant. We support several phenomena, including temporal references, temporal adverbs, aspectual classes and progressives. We apply these semantics to the complete FraCaS test suite. We obtain an accuracy of 81 percent overall and 73 percent for problems explicitly marked as related to temporal reference.

翻訳日:2021-05-01 11:13:32 公開日:2020-12-19

# コモンセンス知識抽出とインジェクションによる語彙制約付きテキスト生成

Lexically-constrained Text Generation through Commonsense Knowledge Extraction and Injection ( http://arxiv.org/abs/2012.10813v1 )

ライセンス: Link先を確認

Yikang Li, Pulkit Goel, Varsha Kuppur Rajendra, Har Simrat Singh, Jonathan Francis, Kaixin Ma, Eric Nyberg, Alessandro Oltramari

(参考訳) 条件付きテキスト生成は、最先端のモデルから人間レベルのパフォーマンスをまだ見ていない難題である。本研究では,特定の入力概念のセットに対して妥当な文を生成することを目的として,commongenベンチマークに注目した。他の作業の進歩にもかかわらず、このデータセットで微調整された大きな事前学習された言語モデルは、構文的に正しいが、人間の常識の理解から定性的に逸脱する文を生成することが多い。さらに、生成されたシーケンスは、パート・オブ・音声と完全な概念カバレッジとを一致させるような語彙要求を満たすことができない。本稿では,コモンセンス推論と語彙制約付きデコードに関して,コモンセンス知識グラフがモデル性能をどのように向上させるかを検討する。本稿では,コンセプションネットからコモンセンス関係を抽出し,注意機構を通じて統一言語モデル(Unified Language Model,UniLM)にこれらの関係を注入し,上記の語彙要求を出力制約により強制する手法を提案する。複数のアブレーションを行うことで、コモンセンスインジェクションは、語彙的要件に準拠しながら、人間の理解とより一致した文を生成することができる。

Conditional text generation has been a challenging task that is yet to see human-level performance from state-of-the-art models. In this work, we specifically focus on the CommonGen benchmark, wherein the aim is to generate a plausible sentence for a given set of input concepts. Despite advances in other tasks, large pre-trained language models that are fine-tuned on this dataset often produce sentences that are syntactically correct but qualitatively deviate from a human understanding of common sense. Furthermore, generated sequences are unable to fulfill such lexical requirements as matching part-of-speech and full concept coverage. In this paper, we explore how commonsense knowledge graphs can enhance model performance, with respect to commonsense reasoning and lexically-constrained decoding. We propose strategies for enhancing the semantic correctness of the generated text, which we accomplish through: extracting commonsense relations from ConceptNet, injecting these relations into the Unified Language Model (UniLM) through attention mechanisms, and enforcing the aforementioned lexical requirements through output constraints. By performing several ablations, we find that commonsense injection enables the generation of sentences that are more aligned with human understanding, while remaining compliant with lexical requirements.

翻訳日:2021-05-01 11:13:28 公開日:2020-12-19

# Minimaxが復活

Minimax Strikes Back ( http://arxiv.org/abs/2012.10700v1 )

ライセンス: Link先を確認

Quentin Cohen-Solal and Tristan Cazenave

(参考訳) 深層強化学習(DRL)は多くの完全情報ゲームにおいて超人的なレベルに達する。 drlと組み合わせて使用されるアート探索アルゴリズムの状況はモンテカルロ木探索 (mcts) である。我々は、MCTSの代わりにMinimaxアルゴリズムを用いてDRLに別のアプローチを採り、ポリシーではなく状態の評価のみを学習する。複数のゲームにおいて,学習パフォーマンスや対決に対して,アートDRLの状況と競合することを示す。

Deep Reinforcement Learning (DRL) reaches a superhuman level of play in many complete information games. The state-of-the-art search algorithm used in combination with DRL is Monte Carlo Tree Search (MCTS). We take another approach to DRL, using a Minimax algorithm instead of MCTS and learning only the evaluation of states, not the policy. We show that for multiple games it is competitive with state-of-the-art DRL in both learning performance and head-to-head confrontations.

翻訳日:2021-05-01 11:12:37 公開日:2020-12-19
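
As a rough illustration of the search side of the idea in the abstract above, the sketch below runs depth-limited minimax (in negamax form) on a toy subtraction game and scores non-terminal leaves with a state evaluation function. The game, the `evaluate` heuristic, and all function names are assumptions for illustration only. In the paper the evaluation is learned, and neither that training procedure nor the authors' exact search variant is reproduced here.

```python
from typing import List


def legal_moves(stones: int) -> List[int]:
    """Toy game (assumed): remove 1, 2 or 3 stones; taking the last stone wins."""
    return [m for m in (1, 2, 3) if m <= stones]


def evaluate(stones: int) -> float:
    """Stand-in for a learned state evaluation, seen from the player to move.
    Hypothetical hand-written heuristic, not a trained network."""
    return -1.0 if stones % 4 == 0 else 1.0


def minimax(stones: int, depth: int) -> float:
    """Depth-limited minimax in negamax form: value for the player to move."""
    if stones == 0:
        return -1.0  # the opponent took the last stone, so the mover has lost
    if depth == 0:
        return evaluate(stones)  # cut off the search and trust the evaluation
    return max(-minimax(stones - m, depth - 1) for m in legal_moves(stones))


def best_move(stones: int, depth: int) -> int:
    """Pick the move whose resulting state is worst for the opponent."""
    return max(legal_moves(stones), key=lambda m: -minimax(stones - m, depth - 1))
```

Only state values are needed here: no move-probability (policy) output is consulted, which mirrors the abstract's point about learning the evaluation of states rather than the policy.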

# 複数シーンからの3次元形状復元システムにおけるシルエット最適化の重要性

The importance of silhouette optimization in 3D shape reconstruction system from multiple object scenes ( http://arxiv.org/abs/2012.10660v1 )

ライセンス: Link先を確認

Waqqas-ur-Rehman Butt and Martin Servin

(参考訳) 本稿では, シルエットSFS法におけるシルエットの不整合を考慮した多段立体形状再構成システムを提案する。これらの矛盾は、異なるビュー、セグメンテーションと影、あるいは物体や光の方向による反射によって、複数のビューイメージに共通している。これらの要因は、すべてのシルエットに連続的に投影される体積のその部分だけを再構成し、残りの部分を再構成せずに、既存のアプローチを用いて3次元形状を構築しようとする際に大きな課題を引き起こす。結果として、最終的な形状は、複数のビューオブジェクトの閉塞と影のために堅牢ではない。本稿では,複数の画像を解析し,シルエットを最適化するための事前処理を行うことにより,再建に影響を及ぼす要因について考察する。最後に、ボリュームアプローチSFSを用いて3次元形状を再構成する。理論および実験結果から, 修正アルゴリズムの性能は効率よく向上し, 復元された形状の精度が向上し, シルエット, 体積, 計算コストの誤差に対して堅牢であることが示唆された。

This paper presents a multi-stage 3D shape reconstruction system for multiple-object scenes that accounts for silhouette inconsistencies in the shape-from-silhouette (SFS) method. These inconsistencies are common in multiple-view images due to object occlusions in different views, segmentation errors, and shadows or reflections caused by objects or light directions. These factors raise huge challenges when attempting to construct the 3D shape with existing approaches, which reconstruct only that part of the volume which projects consistently in all the silhouettes, leaving the rest unreconstructed. As a result, the final shape is not robust to multi-view object occlusions and shadows. In this regard, we consider the primary factors affecting reconstruction by analyzing the multiple images, and perform pre-processing steps to optimize the silhouettes. Finally, the 3D shape is reconstructed using the volumetric SFS approach. Theory and experimental results show that the modified algorithm improves the accuracy of the reconstructed shape, is robust to errors in the silhouettes and volume, and is computationally inexpensive.

翻訳日:2021-05-01 11:12:04 公開日:2020-12-19
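
The sensitivity to silhouette errors described in the abstract above can be seen in a small sketch of volumetric shape-from-silhouette. A voxel is kept only if it projects inside every silhouette, so a single wrong silhouette pixel carves away valid volume. The three axis-aligned orthographic views, the boolean-grid silhouette format, and the function name are simplifying assumptions; real systems use calibrated perspective cameras.

```python
from typing import List, Set, Tuple

Silhouette = List[List[bool]]  # assumed format: True where the object is seen
Voxel = Tuple[int, int, int]


def carve_visual_hull(
    sil_xy: Silhouette, sil_xz: Silhouette, sil_yz: Silhouette, n: int
) -> Set[Voxel]:
    """Keep voxel (x, y, z) of an n*n*n grid only if its orthographic
    projection lies inside all three axis-aligned silhouettes."""
    return {
        (x, y, z)
        for x in range(n)
        for y in range(n)
        for z in range(n)
        if sil_xy[x][y] and sil_xz[x][z] and sil_yz[y][z]
    }
```

With three fully filled 2x2 silhouettes all 8 voxels survive. Clearing one pixel in a single view removes every voxel along that viewing ray, which is why the abstract argues for optimizing the silhouettes before carving.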

# 斜めUAVビデオからの自己教師付き単眼深度推定

Self-supervised monocular depth estimation from oblique UAV videos ( http://arxiv.org/abs/2012.10704v1 )

ライセンス: Link先を確認

Logambal Madhuanand, Francesco Nex, Michael Ying Yang

(参考訳) UAVは安価で使いやすく、汎用性が高いため、重要な測光装置となっている。 UAVから撮影した空中画像は、小型で大規模なテクスチャマッピング、3Dモデリング、オブジェクト検出タスク、DTMおよびDSM生成などに適用できる。光グラム技術は、同じシーンの複数の画像を取得するUAV画像からの3次元再構成に日常的に使用される。コンピュータビジョンとディープラーニング技術の発展により、SIDE(Single Image Depth Estimation)は強力な研究分野となった。 UAV画像におけるSIDE技術を用いることで、3次元再構成のための複数の画像の必要性を克服することができる。本稿では, 深度学習を用いて, 一つのUAV空中画像から深度を推定することを目的とする。我々は,自己教師付き学習手法である自己教師付き単眼深度推定 (smde) について述べる。深度を学習し、2つの異なるネットワークを介して協調して情報をポーズする深層学習モデルのトレーニングには、単眼ビデオフレームが使用される。予測深度とポーズを用いて、映像からの時間情報を利用した別の画像から1つの画像を再構成する。本稿では,2次元CNNエンコーダと3次元CNNデコーダを用いて,時系列フレームから情報を抽出する新しいアーキテクチャを提案する。画像生成の品質を向上させるために、対比的損失項を導入する。公開UAVidビデオデータセットを用いて実験を行った。実験の結果,本モデルは最先端手法よりも奥行き推定に優れていることがわかった。

UAVs have become an essential photogrammetric measurement tool as they are affordable, easily accessible and versatile. Aerial images captured from UAVs have applications in small and large scale texture mapping, 3D modelling, object detection tasks, DTM and DSM generation etc. Photogrammetric techniques are routinely used for 3D reconstruction from UAV images where multiple images of the same scene are acquired. Developments in computer vision and deep learning techniques have made Single Image Depth Estimation (SIDE) a field of intense research. Using SIDE techniques on UAV images can overcome the need for multiple images for 3D reconstruction. This paper aims to estimate depth from a single UAV aerial image using deep learning. We follow a self-supervised learning approach, Self-Supervised Monocular Depth Estimation (SMDE), which does not need ground truth depth or any extra information other than images for learning to estimate depth. Monocular video frames are used for training the deep learning model which learns depth and pose information jointly through two different networks, one each for depth and pose. The predicted depth and pose are used to reconstruct one image from the viewpoint of another image utilising the temporal information from videos. We propose a novel architecture with two 2D CNN encoders and a 3D CNN decoder for extracting information from consecutive temporal frames. A contrastive loss term is introduced for improving the quality of image generation. Our experiments are carried out on the public UAVid video dataset. The experimental results demonstrate that our model outperforms the state-of-the-art methods in estimating the depths.

翻訳日:2021-05-01 11:11:26 公開日:2020-12-19

# 雑音フィルタリングによる2つの前景差に基づくビデオの静的物体検出とセグメンテーション

Static object detection and segmentation in videos based on dual foregrounds difference with noise filtering ( http://arxiv.org/abs/2012.10708v1 )

ライセンス: Link先を確認

Waqqas-ur-Rehman Butt and Martin Servin

(参考訳) 本稿では,映像中の静止物体検出とセグメンテーション手法について述べる。多くの監視アプリケーションに移動オブジェクトが存在するため、ロバストな静的オブジェクト検出は依然として難しい課題である。難易度は、元の背景を確立せず、異なるタイミングでビデオに現れる静的なオブジェクトとして識別されるオブジェクトのラベル付け方法に大きく影響されます。この文脈では、静的オブジェクトの識別にフレーム差分の概念に基づくバックグラウンドサブトラクション手法を適用する。まず、静的参照フレームに対する各フレームの差を計算することにより、前景マスク画像のフレーム差を推定する。移動粒子を検出するためにガウスMOG法の混合法を適用し, 前景マスクのフレーム差分から結果フォアグラウンドマスクを減算する。低コントラスト処理や空気中の散乱材料のノイズ低減のために,前処理法,照明等化法,消光法を適用した。水滴と塵の粒子。最後に、物体を分割しノイズを抑制するために、数学的形態的操作と最大の連結成分分析を適用する。提案手法は, 岩盤ブレーカー局に適用し, 実データ, 合成データおよび2つの公開データを用いて有効に検証した。その結果,提案手法は静的オブジェクトを事前に追跡情報を持たずにロバストに検出し,セグメンテーションできることが実証された。

This paper presents a static object detection and segmentation method for videos of cluttered scenes. Robust static object detection is still a challenging task due to the presence of moving objects in many surveillance applications. The level of difficulty is strongly influenced by how one labels the objects to be identified as static, namely objects that do not belong to the original background but appear in the video at different times. In this context, a background subtraction technique based on the frame difference concept is applied to the identification of static objects. Firstly, we estimate a frame-differencing foreground mask image by computing the difference of each frame with respect to a static reference frame. The Mixture of Gaussians (MOG) method is applied to detect the moving particles, and the resulting foreground mask is then subtracted from the frame-differencing foreground mask. Pre-processing techniques, illumination equalization and de-hazing methods are applied to handle low contrast and to reduce the noise from scattered materials in the air, e.g. water droplets and dust particles. Finally, a set of mathematical morphological operations and largest connected-component analysis are applied to segment the object and suppress the noise. The proposed method was built for a rock breaker station application and validated with real, synthetic and two public data sets. The results demonstrate that the proposed approach can robustly detect and segment static objects without any prior tracking information.

翻訳日:2021-05-01 11:10:43 公開日:2020-12-19
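
The dual-foreground subtraction step in the abstract above can be sketched as follows. One mask marks everything that differs from a static reference frame, a second mask marks what is currently moving, and their difference leaves the static objects. As a loud simplification, the moving mask here is a plain consecutive-frame difference standing in for the Mixture of Gaussians model. The threshold value, the grayscale nested-list frame format, and all names are assumptions, and the pre-processing and morphological clean-up stages are omitted.

```python
from typing import List

Frame = List[List[int]]   # assumed format: grayscale intensities
Mask = List[List[bool]]


def diff_mask(a: Frame, b: Frame, thresh: int = 20) -> Mask:
    """True where two frames differ by more than the threshold."""
    return [[abs(x - y) > thresh for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def static_object_mask(
    reference: Frame, previous: Frame, current: Frame, thresh: int = 20
) -> Mask:
    """Foreground relative to the reference frame, minus moving pixels."""
    foreground = diff_mask(current, reference, thresh)  # not part of the background
    moving = diff_mask(current, previous, thresh)       # stand-in for the MOG mask
    return [
        [f and not m for f, m in zip(rf, rm)]
        for rf, rm in zip(foreground, moving)
    ]
```

In a one-row example with an empty reference, an object present in both the previous and current frame is kept, while an object that appears only in the current frame is treated as moving and dropped.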

# 量子光学畳み込みニューラルネットワーク : 量子コンピューティングのための新しい画像認識フレームワーク

Quantum Optical Convolutional Neural Network: A Novel Image Recognition Framework for Quantum Computing ( http://arxiv.org/abs/2012.10812v1 )

ライセンス: Link先を確認

Rishab Parthasarathy and Rohan Bhowmik

(参考訳) 畳み込みニューラルネットワーク(cnns)に基づく大規模機械学習モデルでは、大量のデータをトレーニングしたパラメータが急速に増加しており、自動運転車から医療画像まで、幅広いコンピュータビジョンタスクに展開されている。これらのモデルをトレーニングするために必要なコンピューティングリソースの要求は、古典的なコンピューティングハードウェアの進歩を急速に上回り、光ニューラルネットワーク(ONN)や量子コンピューティングといった新しいフレームワークが将来の代替手段として検討されている。本稿では,量子コンピューティングに基づく新しいディープラーニングモデルである量子光畳み込みニューラルネットワーク(QOCNN)について報告する。人気のMNISTデータセットを使用して、この新アーキテクチャをセミナルLeNetモデルに基づく従来のCNNと比較した。我々はまた、以前に報告されたONN(GridNetとComplexNet)と、コンプレックスネットと量子ベースの正弦波非線形性を組み合わせた量子光学ニューラルネットワーク(QONN)を比較した。本質的に、我々の研究は量子畳み込みとそれ以前のプール層を追加することで、QONNに関する以前の研究を拡張している。我々はそれらの精度、混乱行列、受信器動作特性(ROC)曲線、マシューズ相関係数を判定し、全てのモデルを評価する。モデルの性能は全体として類似しており、ROC曲線は新しいQOCNNモデルが堅牢であることを示している。最後に,この新しいフレームワークを量子コンピュータ上で実行することにより,計算効率の向上を推定した。ディープラーニングへの量子コンピューティングベースのアプローチへの切り替えは、古典的なモデルに匹敵する精度をもたらすが、計算性能は前例のない向上と消費電力の大幅な削減を実現している。

Large machine learning models based on Convolutional Neural Networks (CNNs) with a rapidly increasing number of parameters, trained with massive amounts of data, are being deployed in a wide array of computer vision tasks from self-driving cars to medical imaging. The insatiable demand for computing resources required to train these models is fast outpacing the advancement of classical computing hardware, and new frameworks including Optical Neural Networks (ONNs) and quantum computing are being explored as future alternatives. In this work, we report a novel quantum computing based deep learning model, the Quantum Optical Convolutional Neural Network (QOCNN), to alleviate the computational bottleneck in future computer vision applications. Using the popular MNIST dataset, we have benchmarked this new architecture against a traditional CNN based on the seminal LeNet model. We have also compared the performance with previously reported ONNs, namely the GridNet and ComplexNet, as well as a Quantum Optical Neural Network (QONN) that we built by combining the ComplexNet with quantum based sinusoidal nonlinearities. In essence, our work extends the prior research on QONN by adding quantum convolution and pooling layers preceding it. We have evaluated all the models by determining their accuracies, confusion matrices, Receiver Operating Characteristic (ROC) curves, and Matthews Correlation Coefficients. The performance of the models was similar overall, and the ROC curves indicated that the new QOCNN model is robust. Finally, we estimated the gains in computational efficiency from executing this novel framework on a quantum computer. We conclude that switching to a quantum computing based approach to deep learning may result in comparable accuracies to classical models, while achieving unprecedented boosts in computational performance and a drastic reduction in power consumption.

翻訳日:2021-05-01 11:09:59 公開日:2020-12-19
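
Of the evaluation metrics listed in the abstract above, the Matthews correlation coefficient is probably the least familiar, so a minimal sketch of its binary form is given here. It is computed directly from the four confusion-matrix counts. The function name is an assumption, and a ten-class task such as MNIST would call for the multi-class generalization, which is not shown.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary MCC from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom
```

Unlike plain accuracy, this score stays near zero for a classifier that always predicts the majority class, which makes it a useful complement to the accuracies and ROC curves mentioned in the abstract.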

# ノード分類タスクにおけるグラフニューラルネットワークの公正比較のためのパイプライン

A pipeline for fair comparison of graph neural networks in node classification tasks ( http://arxiv.org/abs/2012.10619v1 )

ライセンス: Link先を確認

Wentao Zhao, Dalin Zhou, Xinguo Qiu and Wei Jiang

(参考訳) グラフニューラルネットワーク (GNN) は, グラフデータを用いた複数の分野に適用可能性について検討されている。しかし、異なるモデルアーキテクチャやデータ拡張技術を含む新しい手法と公正な比較を保証するための標準的なトレーニング設定は存在しない。ノード分類に同じトレーニング設定を適用可能な,標準的な再現可能なベンチマークを導入する。このベンチマークでは、異なるフィールドからの小規模および中規模のデータセットと7つの異なるモデルを含む9つのデータセットを構築した。我々は、小規模データセットのためのk-foldモデル評価戦略と、全データセットの標準モデルトレーニング手順を設計し、gnnの標準実験パイプラインにより、公正なモデルアーキテクチャの比較を可能にする。 node2vecとLaplacian固有ベクトルを用いてデータ拡張を行い、入力機能がモデルの性能に与える影響を調べる。トポロジ的情報はノード分類タスクにおいて重要である。モデルレイヤの数を増やすことは、グラフが接続されていないPATTERNとCLUSTERデータセットを除いて、パフォーマンスを向上しない。データ拡張は非常に有用であり、特にnode2vecをベースラインで使用すると、パフォーマンスが大幅に向上する。

Graph neural networks (GNNs) have been investigated for potential applicability in multiple fields that employ graph data. However, there are no standard training settings to ensure fair comparisons among new methods, including different model architectures and data augmentation techniques. We introduce a standard, reproducible benchmark to which the same training settings can be applied for node classification. For this benchmark, we constructed 9 datasets, including both small- and medium-scale datasets from different fields, and 7 different models. We design a k-fold model assessment strategy for small datasets and a standard set of model training procedures for all datasets, enabling a standard experimental pipeline for GNNs to help ensure fair model architecture comparisons. We use node2vec and Laplacian eigenvectors to perform data augmentation to investigate how input features affect the performance of the models. We find topological information is important for node classification tasks. Increasing the number of model layers does not improve the performance except on the PATTERN and CLUSTER datasets, in which the graphs are not connected. Data augmentation is highly useful, especially using node2vec in the baseline, resulting in a substantial baseline performance improvement.

翻訳日:2021-05-01 11:09:31 公開日:2020-12-19
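
The k-fold assessment strategy mentioned in the abstract above can be illustrated with a short sketch that partitions node indices into k folds and yields one train/test split per fold. This is a generic k-fold split under assumed names and a fixed seed, not the authors' pipeline. Details such as stratification by class or a separate validation split are not stated in the abstract and are left out.

```python
import random
from typing import List, Tuple


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k folds; return (train, test) index lists per fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)      # fixed seed keeps the splits reproducible
    folds = [idx[i::k] for i in range(k)]  # round-robin assignment to folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits
```

Every node is used for testing exactly once across the k splits, so averaging the k test scores gives a less split-dependent estimate on small datasets than a single fixed split would.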

# Ekya: エッジコンピューティングサーバ上のビデオ分析モデルの継続的学習

Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers ( http://arxiv.org/abs/2012.10557v1 )

ライセンス: Link先を確認

Romil Bhardwaj, Zhengxu Xia, Ganesh Ananthanarayanan, Junchen Jiang, Nikolaos Karianakis, Yuanchao Shu, Kevin Hsieh, Victor Bahl, Ion Stoica

(参考訳) ビデオ分析アプリケーションは(帯域幅とプライバシーのために)ビデオの分析にedge compute serverを使用する。推論のためにエッジサーバにデプロイされる圧縮モデルでは、ライブビデオデータがトレーニングデータから逸脱するデータドリフトが発生している。継続的学習は、新しいデータ上で定期的にモデルをトレーニングすることで、データのドリフトを処理する。本研究は,エッジサーバ上でのタスクの推論とリトレーニングを共同で支援する課題に対処し,リトレーニングされたモデルの精度と推論精度の基本的なトレードオフをナビゲートする。当社のソリューションであるekyaでは、このトレードオフを複数のモデルでバランスさせ、micro-profilerを使用して、再トレーニングによって最もメリットのあるモデルを特定しています。 Ekyaの精度はベースラインスケジューラよりも29%高く、ベースラインはEkyaと同じ精度を達成するために4倍のGPUリソースを必要とする。

Video analytics applications use edge compute servers for the analytics of the videos (for bandwidth and privacy). Compressed models that are deployed on the edge servers for inference suffer from data drift, where the live video data diverges from the training data. Continuous learning handles data drift by periodically retraining the models on new data. Our work addresses the challenge of jointly supporting inference and retraining tasks on edge servers, which requires navigating the fundamental tradeoff between the retrained model's accuracy and the inference accuracy. Our solution Ekya balances this tradeoff across multiple models and uses a micro-profiler to identify the models that will benefit the most by retraining. Ekya's accuracy gain compared to a baseline scheduler is 29% higher, and the baseline requires 4x more GPU resources to achieve the same accuracy as Ekya.

翻訳日:2021-05-01 11:09:12 公開日:2020-12-19

# 影は残らない:近似照明と幾何学による物体とその影の除去

No Shadow Left Behind: Removing Objects and their Shadows using Approximate Lighting and Geometry ( http://arxiv.org/abs/2012.10565v1 )

ライセンス: Link先を確認

Edward Zhang, Ricardo Martin-Brualla, Janne Kontkanen, Brian Curless

(参考訳) 画像からオブジェクトを取り除くことは、混合現実を含む多くのアプリケーションにとって重要な課題である。信頼できる結果を得るためには、オブジェクトがキャストする影も取り除かなければならない。現在のインペインティングベースのメソッドでは、オブジェクト自体を削除したり、影を置き去りにしたり、少なくとも、インペイントするシャドウ領域を指定する必要がある。我々は,キャスターとともに影を取り除くための深層学習パイプラインを導入する。さまざまなテクスチャを持つ表面から、さまざまな影(硬く、柔らかく、暗く、微妙に、大きく、薄い)を除去するために、粗いシーンモデルを活用する。合成されたデータに基づいてパイプラインをトレーニングし、合成シーンと実シーンの両方で質的で定量的な結果を示す。

Removing objects from images is a challenging problem that is important for many applications, including mixed reality. For believable results, the shadows that the object casts should also be removed. Current inpainting-based methods only remove the object itself, leaving shadows behind, or at best require specifying shadow regions to inpaint. We introduce a deep learning pipeline for removing a shadow along with its caster. We leverage rough scene models in order to remove a wide variety of shadows (hard or soft, dark or subtle, large or thin) from surfaces with a wide variety of textures. We train our pipeline on synthetically rendered data, and show qualitative and quantitative results on both synthetic and real scenes.

翻訳日:2021-05-01 11:08:37 公開日:2020-12-19

# 高ダイナミックレンジ画像品質のための統合データセットとメトリクス

Consolidated Dataset and Metrics for High-Dynamic-Range Image Quality ( http://arxiv.org/abs/2012.10758v1 )

ライセンス: Link先を確認

Aliaksei Mikhailiuk, Maria Perez-Ortiz, Dingcheng Yue, Wilson Suen, Rafal K. Mantiuk

(参考訳) 高ダイナミックレンジ(hdr)画像やビデオコンテンツの人気が高まると、輝度やダイナミックレンジの異なるディスプレイで見られる画像障害の重症度を予測する指標が必要となる。このようなメトリクスは、十分に大きな主観的画像品質データセット上でトレーニングされ、検証され、堅牢なパフォーマンスを保証する必要がある。既存のHDR品質データセットのサイズが制限されているため、既存のHDRと標準ダイナミックレンジ(SDR)データセットを統合、マージすることで、4000以上の画像を含む統一された測光画像品質データセット(UPIQ)を作成しました。リアライメントされた品質スコアは、すべてのデータセットで同じ統一品質スケールを共有します。このような認識は、追加のデータセットの品質比較を収集し、サイコメトリックスケーリング手法でデータを再スケーリングすることで達成された。提案したデータセットの画像は、ディスプレイから放射される光に対応する絶対光度および色度単位で表現される。新しいデータセットを使用して、既存のHDRメトリクスを再トレーニングし、データセットが深層アーキテクチャのトレーニングに十分な大きさであることを示す。輝度認識画像圧縮におけるデータセットの有用性を示す。

The increasing popularity of high-dynamic-range (HDR) image and video content brings the need for metrics that can predict the severity of image impairments as seen on displays of different brightness levels and dynamic ranges. Such metrics should be trained and validated on a sufficiently large subjective image quality dataset to ensure robust performance. As the existing HDR quality datasets are limited in size, we created a Unified Photometric Image Quality dataset (UPIQ) with over 4,000 images by realigning and merging existing HDR and standard-dynamic-range (SDR) datasets. The realigned quality scores share the same unified quality scale across all datasets. Such realignment was achieved by collecting additional cross-dataset quality comparisons and re-scaling data with a psychometric scaling method. Images in the proposed dataset are represented in absolute photometric and colorimetric units, corresponding to light emitted from a display. We use the new dataset to retrain existing HDR metrics and show that the dataset is sufficiently large for training deep architectures. We show the utility of the dataset on brightness-aware image compression.

翻訳日:2021-05-01 11:08:25 公開日:2020-12-19

# 不確かさ下でのシリコンフォトニックニューラルネットワークのモデリング

Modeling Silicon-Photonic Neural Networks under Uncertainties ( http://arxiv.org/abs/2012.10594v1 )

ライセンス: Link先を確認

Sanmitra Banerjee, Mahdi Nikdast, and Krishnendu Chakrabarty

(参考訳) シリコンフォトニクスニューラルネットワーク(spnn)は、デジタル電子回路に比べて計算速度とエネルギー効率が大幅に向上する。しかし,SPNNのエネルギー効率と精度は,製造過程や温度変化から生じる不確実性の影響が大きい。本稿では,mzi(mach-zehnder interferometer)ベースのspnの分類精度に対する不確かさの影響について,初めて包括的かつ階層的検討を行った。このような影響は、非イデアルシリコンフォトニックデバイスの位置と特性(例えば、調整された位相角)によって異なることが示される。シミュレーションの結果, 2つの層と1374個の調整可能な熱水相シフト器を持つSPNNでは, 成熟した製造プロセスにおいてもランダムな不確かさが破滅的な70%の精度損失をもたらすことが示された。

Silicon-photonic neural networks (SPNNs) offer substantial improvements in computing speed and energy efficiency compared to their digital electronic counterparts. However, the energy efficiency and accuracy of SPNNs are highly impacted by uncertainties that arise from fabrication-process and thermal variations. In this paper, we present the first comprehensive and hierarchical study on the impact of random uncertainties on the classification accuracy of a Mach-Zehnder Interferometer (MZI)-based SPNN. We show that such impact can vary based on both the location and characteristics (e.g., tuned phase angles) of a non-ideal silicon-photonic device. Simulation results show that in an SPNN with two hidden layers and 1374 tunable-thermal-phase shifters, random uncertainties even in mature fabrication processes can lead to a catastrophic 70% accuracy loss.

翻訳日:2021-05-01 11:07:29 公開日:2020-12-19

# セルネットワークにおける結合スペクトルとパワーアロケーションの深部強化学習

Deep Reinforcement Learning for Joint Spectrum and Power Allocation in Cellular Networks ( http://arxiv.org/abs/2012.10682v1 )

ライセンス: Link先を確認

Yasar Sinan Nasir and Dongning Guo

(参考訳) 無線ネットワークオペレータは通常、保有する電波スペクトルを複数のサブバンドに分割する。細胞ネットワークでは、これらのサブバンドは多くの細胞で再利用される。共チャネル干渉を緩和するために、結合スペクトルと電力配分問題をしばしば定式化し、和レートの目的を最大化する。このような問題を解決する最もよく知られたアルゴリズムは、即時のグローバルチャネル状態情報と集中型オプティマイザを必要とする。実際、これらのアルゴリズムは時変サブバンドを持つ大規模ネットワークでは実装されていない。深層強化学習アルゴリズムは、複雑なリソース管理問題を解決する有望なツールである。ここでの大きな課題は、スペクトル割り当ては離散サブバンド選択を伴うのに対し、パワーアロケーションは連続変数を含むことである。本稿では,離散決定変数と連続決定変数の両方を最適化するための学習フレームワークを提案する。具体的には、2つの異なる深層強化学習アルゴリズムを同時に実行し、訓練することで、共同目標を最大化する。シミュレーションの結果,提案手法は最先端分数型プログラミングアルゴリズムと,深層強化学習に基づく先行手法の両方に勝ることがわかった。

A wireless network operator typically divides the radio spectrum it possesses into a number of subbands. In a cellular network those subbands are then reused in many cells. To mitigate co-channel interference, a joint spectrum and power allocation problem is often formulated to maximize a sum-rate objective. The best known algorithms for solving such problems generally require instantaneous global channel state information and a centralized optimizer. In fact those algorithms have not been implemented in practice in large networks with time-varying subbands. Deep reinforcement learning algorithms are promising tools for solving complex resource management problems. A major challenge here is that spectrum allocation involves discrete subband selection, whereas power allocation involves continuous variables. In this paper, a learning framework is proposed to optimize both discrete and continuous decision variables. Specifically, two separate deep reinforcement learning algorithms are designed to be executed and trained simultaneously to maximize a joint objective. Simulation results show that the proposed scheme outperforms both the state-of-the-art fractional programming algorithm and a previous solution based on deep reinforcement learning.

翻訳日:2021-05-01 11:06:55 公開日:2020-12-19

# ベイズ型非教師なし学習による濃縮電解質の隠れ構造

Bayesian unsupervised learning reveals hidden structure in concentrated electrolytes ( http://arxiv.org/abs/2012.10694v1 )

ライセンス: Link先を確認

Penelope Jones, Fabian Coupette, Andreas Härtel, Alpha A. Lee

(参考訳) 電解質は、エネルギー貯蔵から生体材料まで幅広い応用において重要な役割を果たす。それにもかかわらず、濃縮電解質の構造は謎めいたままである。多くの理論的アプローチは、イオン対のアイデアを導入して濃縮電解質をモデル化しようと試み、イオンは対イオンで密に「対」されるか、電荷を遮蔽するために「自由」になる。本研究では,この問題を計算統計学の言語に再編成し,全てのイオンが同じ局所環境を共有するというヌル仮説をテストする。この枠組みを分子動力学シミュレーションに適用すると、このヌル仮説はデータによって支持されないことが分かる。我々の統計的手法は、異なる局所イオン環境の存在を示唆している;驚くべきことに、これらの差は電荷のアトラクションと違い、電荷の相関のように生じる。非凝集環境における粒子の分画は、異なる背景誘電率とイオン濃度にまたがる普遍的なスケーリング挙動を示す。

Electrolytes play an important role in a plethora of applications ranging from energy storage to biomaterials. Notwithstanding this, the structure of concentrated electrolytes remains enigmatic. Many theoretical approaches attempt to model concentrated electrolytes by introducing the idea of ion pairs, with ions either being tightly 'paired' with a counter-ion, or 'free' to screen charge. In this study we reframe the problem into the language of computational statistics, and test the null hypothesis that all ions share the same local environment. Applying the framework to molecular dynamics simulations, we show that this null hypothesis is not supported by data. Our statistical technique suggests the presence of distinct local ionic environments; surprisingly, these differences arise in like-charge correlations rather than unlike-charge attraction. The resulting fraction of particles in non-aggregated environments shows a universal scaling behaviour across different background dielectric constants and ionic concentrations.

翻訳日:2021-05-01 11:06:40 公開日:2020-12-19

PDF登録状況（公開日: 20201219）